Hello,

Sorry if this comes a bit late; we had to sort out some email issues. Thanks again to Andreas for fixing them.

This is part of the email exchange I had with Christine Hoogland and Gregoire Rossier a few years ago regarding the algorithm used by "Compute pI/Mw" on the ExPASy server. The code I was given is included at the end of this email; I used it to update bj1.

Good luck to all GSoC candidates,

George


On Tue, May 22, 2007 at 9:26 AM, Christine Hoogland via RT <[email protected]> wrote:

    Dear George,

    Please find enclosed the algorithm we are using on ExPASy.

    I hope this helps.

    Best regards
    Christine

    >
    > The pK values used for "Compute pI/Mw" can be found in
    >
    > # Bjellqvist, B., Hughes, G.J., Pasquali, Ch., Paquet, N., Ravier, F.,
    > Sanchez, J.-Ch., Frutiger, S. & Hochstrasser, D.F. The focusing
    > positions of polypeptides in immobilized pH gradients can be predicted
    > from their amino acid sequences. Electrophoresis 1993, 14, 1023-1031.
    >
    > MEDLINE: 8125050
    >
    > # Bjellqvist, B., Basse, B., Olsen, E. and Celis, J.E. Reference points
    > for comparisons of two-dimensional maps of proteins from different human
    > cell types defined in a pH scale where isoelectric points correlate with
    > polypeptide compositions. Electrophoresis 1994, 15, 529-539.
    >
    > MEDLINE: 8055880
    >
    > The pK values were defined by examining polypeptide migration between
    > pH 4.5 and 7.3 in an immobilised pH gradient gel environment with 9.2 M
    > and 9.8 M urea at 15 °C or 25 °C. Prediction of protein pI for highly
    > basic proteins is yet to be studied, and it is possible that current
    > Compute pI/Mw predictions may not be adequate for this purpose.
    >
    > I hope this helps.
    >
    >
    > Best regards
    > Gregoire Rossier
    >
    >

    --------------------------------------------------------
    Christine Hoogland
    Swiss Institute of Bioinformatics
    CMU - 1, rue Michel Servet      Tel. (+41 22) 379 58 28
    CH - 1211 Geneva 4 Switzerland  Fax  (+41 22) 379 58 58
    [email protected]   http://www.expasy.org/
    --------------------------------------------------------

    //  VERSION      :   1.6
    //  DATE         :   1/25/95
    //  Copyright 1993 by Swiss Institute of Bioinformatics. All rights reserved.

    //
    // Table of pk values :
    //  Note: the current algorithm does not use the last two columns. Each
    //  row corresponds to an amino acid, starting with Ala. J, O and U do
    //  not exist as residues; they are included only to complete the alphabet.
    //
    //     Ct    Nt   Sm     Sc     Sn
    //

    static double cPk[26][5] = {
    3.55, 7.59, 0.   , 0.   , 0.    , // A
    3.55, 7.50, 0.   , 0.   , 0.    , // B
    3.55, 7.50, 9.00 , 9.00 , 9.00  , // C
    4.55, 7.50, 4.05 , 4.05 , 4.05  , // D
    4.75, 7.70, 4.45 , 4.45 , 4.45  , // E
    3.55, 7.50, 0.   , 0.   , 0.    , // F
    3.55, 7.50, 0.   , 0.   , 0.    , // G
    3.55, 7.50, 5.98 , 5.98 , 5.98  , // H
    3.55, 7.50, 0.   , 0.   , 0.    , // I
    0.00, 0.00, 0.   , 0.   , 0.    , // J
    3.55, 7.50, 10.00, 10.00, 10.00 , // K
    3.55, 7.50, 0.   , 0.   , 0.    , // L
    3.55, 7.00, 0.   , 0.   , 0.    , // M
    3.55, 7.50, 0.   , 0.   , 0.    , // N
    0.00, 0.00, 0.   , 0.   , 0.    , // O
    3.55, 8.36, 0.   , 0.   , 0.    , // P
    3.55, 7.50, 0.   , 0.   , 0.    , // Q
    3.55, 7.50, 12.0 , 12.0 , 12.0  , // R
    3.55, 6.93, 0.   , 0.   , 0.    , // S
    3.55, 6.82, 0.   , 0.   , 0.    , // T
    0.00, 0.00, 0.   , 0.   , 0.    , // U
    3.55, 7.44, 0.   , 0.   , 0.    , // V
    3.55, 7.50, 0.   , 0.   , 0.    , // W
    3.55, 7.50, 0.   , 0.   , 0.    , // X
    3.55, 7.50, 10.00, 10.00, 10.00 , // Y
    3.55, 7.50, 0.   , 0.   , 0.    }; // Z

    #define PH_MIN 0 /* minimum pH value */
    #define PH_MAX 14 /* maximum pH value */
    #define MAXLOOP 2000 /* maximum number of iterations */
    #define EPSI 0.0001 /* desired precision */

      //
      // Compute the amino-acid composition.
      //
      for (i = 0; i < sequenceLength; i++)
        comp[sequence[i] - 'A']++;

      //
      // Look up N-terminal and C-terminal residue.
      //
      nTermResidue = sequence[0] - 'A';
      cTermResidue = sequence[sequenceLength - 1] - 'A';

      phMin = PH_MIN;
      phMax = PH_MAX;

      for (i = 0, charge = 1.0; i < MAXLOOP && (phMax - phMin) > EPSI; i++)
        {
          phMid = phMin + (phMax - phMin) / 2;

          cter = exp10(-cPk[cTermResidue][0]) /
                 (exp10(-cPk[cTermResidue][0]) + exp10(-phMid));
          nter = exp10(-phMid) /
                 (exp10(-cPk[nTermResidue][1]) + exp10(-phMid));

          carg = comp[R] * exp10(-phMid) /
                 (exp10(-cPk[R][2]) + exp10(-phMid));
          chis = comp[H] * exp10(-phMid) /
                 (exp10(-cPk[H][2]) + exp10(-phMid));
          clys = comp[K] * exp10(-phMid) /
                 (exp10(-cPk[K][2]) + exp10(-phMid));

          casp = comp[D] * exp10(-cPk[D][2]) /
                 (exp10(-cPk[D][2]) + exp10(-phMid));
          cglu = comp[E] * exp10(-cPk[E][2]) /
                 (exp10(-cPk[E][2]) + exp10(-phMid));

          ccys = comp[C] * exp10(-cPk[C][2]) /
                 (exp10(-cPk[C][2]) + exp10(-phMid));
          ctyr = comp[Y] * exp10(-cPk[Y][2]) /
                 (exp10(-cPk[Y][2]) + exp10(-phMid));

          charge = carg + clys + chis + nter -
                   (casp + cglu + ctyr + ccys + cter);

          if (charge > 0.0)
             phMin = phMid;
          else
             phMax = phMid;
        }
      }


_______________________________________________
Biojava-l mailing list  -  [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
