Poster with graphics


DeltaProt: Molecular comparison of proteins based on sequence alignments.
(© University of Tromsø 2005, version 2.1 Oct 2010.)

A Matlab© companion Toolbox.

Steinar Thorvaldsen [1], Tor Flå [1] and Nils P. Willassen [1,2]
[1] University of Tromsø and [2] Norwegian Structural Biology Centre
9037 Tromsø - Norway.


Download the Matlab code of the DeltaProt Toolbox here.
This software is free for academic, non-commercial use.
Note: some functions use the Matlab Statistics Toolbox.


Abstract. In bioinformatics there is a growing demand for statistical methods in data analysis. DeltaProt is a software toolbox that facilitates importing, analyzing and visualizing data from multiple alignments of proteins. We present statistical methods and trend tests that are useful when the aligned sequences can be divided into two or more subgroups based on known phenotypic traits such as preferred temperature, pH, salt concentration or pressure. The algorithms are valuable for discovering differences between phenotypic groups at the molecular level, and the approach has been applied successfully in comparative research on extremophile organisms. It is especially relevant for membrane proteins, an important group for which structural models are very difficult to obtain, and it may be used for high-throughput sequence analysis. The toolbox also contains procedures for comparative plots of alignments and substitution matrices.
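As an illustration of the two-group comparison described in the abstract, the following Python sketch computes the difference in pooled amino acid composition between two groups of aligned sequences. It is not the toolbox's Matlab code: the function names `aa_frequencies` and `delta_frequencies`, the toy sequences and the group labels are all hypothetical, and pooling residues over each whole group is a simplifying assumption.

```python
from collections import Counter

# The 20 standard amino acids; gaps ("-") and other symbols are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_frequencies(sequences):
    """Relative frequency of each standard amino acid, pooled over all sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def delta_frequencies(group_a, group_b):
    """Compositional change per amino acid: frequency in group_b minus group_a."""
    freq_a = aa_frequencies(group_a)
    freq_b = aa_frequencies(group_b)
    return {aa: freq_b[aa] - freq_a[aa] for aa in AMINO_ACIDS}


# Hypothetical toy alignment split into two phenotypic groups.
cold_adapted = ["MKS-AG", "MKT-AG"]
warm_adapted = ["MRSPAG", "MRSPAA"]
delta = delta_frequencies(cold_adapted, warm_adapted)
```

A positive value in `delta` marks a residue that is more frequent in the second group, a negative value one that is less frequent. Whether such a difference is statistically meaningful is a separate question that needs a test on the underlying counts.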
 


Edited by: Steinar Thorvaldsen
Latest update: October 25, 2010.

 

File name               Brief description

manual.pdf              Program's user guide.

DeltaProt1.m            Main program scripts
DeltaProt.m

readfasta.m             Reads text files (alignments) in FASTA format
aa2int.m                Transforms symbols of amino acids
dssp2int.m              Converts secondary structure to numbers
wasa2int.m              Converts 3D predictions to numbers

findSeqSimilarity.m     Finds overall similarity between input sequences
findSeqSiteVar.m        Finds variations and conserved sites (Var=1)

aaCount.m               Calculates counts of amino acids
aaAnova1Pvalues.m       Runs one-way unbalanced ANOVA test
aaAnova2Pvalues.m       Runs two-way unbalanced ANOVA test
aaFreq.m                Calculates frequencies from the previously extracted counts
aaFreqPlot.m            Plots resulting frequencies
aaDeltaFreq.m           Calculates compositional changes between sequences
aaDeltaFreqPlot.m       Plots compositional changes

substPairsCount.m       Calculates substitutions between aligned pairs
substPairsPvalues.m     Runs the appropriate statistical test
substPairs2bin.m        Reduces the full substitution matrix to fewer categories
FisherExtest.m*         Fisher's exact test with mid-P-values
chi2Tests.m*            Chi-square tests (Cressie-Read, Pearson or LogL)
MantelHaenTest.m*       Stratified Mantel-Haenszel test
substPairsPlot.m        Plots the substitution matrix results
substCountSort.m        Sorts the substitutions by count
substPvalueSort         Sorts the substitutions by P-values

filterAlignedseq.m      Reduces sequence data to property data
propAnovaPvalues.m      Runs one-way ANOVA test. Optional box plots
propNormalityTest.m*    Data-adaptive goodness of fit to normality
propPairedPvalues.m     Runs paired trend tests (t-test or Wilcoxon). Optional plots
propRegress.m           Detects trends in ONE protein by parametric regression
propRegressKendall.m    Detects trends by non-parametric regression
cumKendallTest.m*       Cumulative Mann-Kendall trend test
FDR.m                   False Discovery Rate for multiple test correction

aa_prop60.xls           Excel file with the physicochemical properties

align_vibrio.tgz        Compressed archive with Python scripts. The scripts will
                        not run on another computer without some adjustment
                        of the code.

* Statistical test implemented in DeltaProt.
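The mid-P idea behind the Fisher's exact test entry in the file list can be sketched in a few lines of Python. This is only an illustration, not the code in FisherExtest.m: it assumes the one-sided (upper-tail) Lancaster mid-P definition for a 2x2 table, whereas the page does not say which variant the toolbox uses, and the names `hypergeom_pmf` and `fisher_mid_p` as well as the example counts are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, row1, col1, n):
    """P(X = k) for the top-left cell of a 2x2 table with fixed margins."""
    return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)


def fisher_mid_p(a, b, c, d):
    """One-sided (upper-tail) mid-P value for the 2x2 table [[a, b], [c, d]].

    mid-P = P(X > a) + 0.5 * P(X = a) under the hypergeometric null,
    i.e. the observed table contributes only half of its probability.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    k_max = min(row1, col1)  # largest value the top-left cell can take
    upper_tail = sum(
        hypergeom_pmf(k, row1, col1, n) for k in range(a + 1, k_max + 1)
    )
    return upper_tail + 0.5 * hypergeom_pmf(a, row1, col1, n)


# Hypothetical counts: rows are two groups, columns are two outcomes.
p = fisher_mid_p(3, 1, 1, 3)  # 1/70 + 0.5 * 16/70 = 9/70, about 0.129
```

Counting only half of the observed table's probability makes the mid-P value less conservative than the ordinary Fisher exact P-value, which is the usual motivation for it when counts are small.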


References
Please cite the following references:

S. Thorvaldsen and E. Ytterstad: Environmental adaptation of proteins: Regression models with simple physicochemical properties. Computational Biology and Chemistry. Vol. 33 (5) 2009, pp. 351-356.

S. Thorvaldsen, E. Ytterstad and T. Flå: Property-dependent analysis of aligned proteins from two or more populations. Proceedings of the 4th Asia-Pacific Bioinformatics Conference (Eds.: T. Jiang et al.). Imperial College Press 2006, pp. 169-178.

S. Thorvaldsen, T. Flå and N. P. Willassen: Extracting molecular diversity between populations through sequence alignments. Lecture Notes in Bioinformatics, Vol. 3745, Springer-Verlag 2005, pp. 317-328.

DeltaProt has been presented as a software demo at CompLife and ECCB.