Spectral Clustering

Spectrum clustering for MS/MS-based proteomics data analysis.

Introduction

The "Spectral Cluster" nodes integrate the spectra-cluster algorithm (Griss et al., 2016, Nat. Meth.) into Proteome Discoverer. They are a wrapper around spectra-cluster-cli, the command-line version of the spectra-cluster algorithm (https://spectra-cluster.github.io).

Proteome Discoverer Nodes

The "Spectra Cluster" node allows the clustering of MS/MS spectra before or after the identification step. The "Spectral Cluster Search" node can subsequently be used to transfer identification data from identified to unidentified spectra based on the clustering results. The additionally inferred identifications are visible throughout all other Proteome Discoverer result tables and can, for example, be used to improve label-free quantitation.
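The idea of transferring identifications within a cluster can be illustrated with a minimal Python sketch. This is not the node's actual implementation: the data layout, the function name transfer_identifications and the min_ratio threshold are all assumptions made for illustration. The sketch assigns the most frequent sequence of a cluster to its unidentified spectra, but only if that sequence accounts for a sufficient share of the identified spectra.

```python
from collections import Counter


def transfer_identifications(clusters, min_ratio=0.7):
    """Infer sequences for unidentified spectra from their cluster.

    clusters maps a cluster id to a list of (spectrum_id, sequence) pairs,
    where sequence is None for an unidentified spectrum.
    min_ratio is a hypothetical cut-off: the share of identified spectra
    that must agree on the most frequent sequence.
    """
    inferred = {}
    for members in clusters.values():
        sequences = [seq for _, seq in members if seq is not None]
        if not sequences:
            continue  # nothing to transfer in a fully unidentified cluster
        top_sequence, count = Counter(sequences).most_common(1)[0]
        if count / len(sequences) < min_ratio:
            continue  # identifications in this cluster disagree too much
        for spectrum_id, seq in members:
            if seq is None:
                inferred[spectrum_id] = top_sequence
    return inferred


# Made-up example data, not taken from a real result file.
clusters = {
    "c1": [("s1", "PEPTIDEK"), ("s2", "PEPTIDEK"), ("s3", None)],
    "c2": [("s4", "AAAK"), ("s5", "CCCK"), ("s6", None)],
    "c3": [("s7", None)],
}
print(transfer_identifications(clusters))  # {'s3': 'PEPTIDEK'}
```

Only spectrum s3 receives an identification here: cluster c2 contains conflicting sequences and cluster c3 contains no identified spectrum at all.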

Clustering results are available through the "Spectra Cluster Visualization" consensus node which adds a "Clusters" table to the Proteome Discoverer result. This table contains one entry per cluster showing basic statistics such as the number of spectra, the number of identified spectra, and the frequency of sequences within a cluster, as well as the number of spectra per sample. As with other Proteome Discoverer tables, this table can easily be exported to other formats for further analysis using, for example, R.
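As a rough sketch of such downstream analysis, the following Python example lists clusters without any identified spectrum from an exported table. It assumes a tab-delimited export; the column names "Cluster Id", "Spectra" and "Identified Spectra" are hypothetical and must be adapted to the headers of the actual export.

```python
import csv
import io

# Hypothetical excerpt of an exported "Clusters" table (tab-delimited).
# The column names are assumptions, not the node's documented headers.
exported = (
    "Cluster Id\tSpectra\tIdentified Spectra\n"
    "c1\t10\t8\n"
    "c2\t5\t0\n"
    "c3\t4\t1\n"
)


def unidentified_clusters(handle):
    """Return the ids of clusters that contain no identified spectrum."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row["Cluster Id"]
        for row in reader
        if int(row["Identified Spectra"]) == 0
    ]


print(unidentified_clusters(io.StringIO(exported)))  # ['c2']
```

For a real export, open the file instead of the in-memory string, for example with open(path, newline="") as handle.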

Tutorial

You can find a tutorial on how to use the spectra-cluster node here:
https://spectra-cluster.github.io/tutorials/proteome_discoverer.html

Citation

  • Griss et al., Recognizing millions of consistently unidentified spectra across hundreds of shotgun proteomics datasets. Nat. Meth. 2016 Aug;13(8):651-656.

Download
