Re: [ccp4bb] AW: [ccp4bb] modified amino acids in the PDB
Unfortunately, it is never quite that easy. If we were to follow Mark's rules, glycoproteins would end up losing their asparagines (and other residues, depending on the type of glycosylation) to numerous underdefined new residue types. And consider the confusion caused by the different versions of N-linked glycosylation that dominate in yeast vs. insects vs. mammals, which would lead to different residue names. It does not even end there: because glycosylation is heterogeneous, you would end up with different "modified residue names" for, say, 6' fucosylated vs. 3' and 6' fucosylated, arthropod-type N-linked glycosylated asparagine. All of this when we can barely model the basic sugars half-right, let alone identify them.

For covalent inhibitors, what Herman mentions (regular amino acid names with links) would also clearly be the more practical solution.

My gripe is, as Peter mentioned, having X's in protein sequences. Why can't we have MSE as M? Isn't this obvious? But as always, it does not end there. For example, pyroglutamic acid, a naturally occurring form of *both* Glu (E) and Gln (Q), ends up being labeled in the sequence as E (or it used to be), even when every sequence database has the residue as a Q. A sensible solution here would be to follow the depositor-submitted sequence, or the obvious encoded amino acid.

Despite being a very ugly solution, the practical way does appear to be a combination of (1) very loosely following simple rules (such as a ten-atom rule), (2) following the established convention when there is one (such as for glycosylation), and (3) rational discussion between the depositors and annotators. The trick to making this work is being able to compromise and being rational. That is another reason why annotators should be trained biochemists and chemists.

Engin

P.S. Still recovering from when I was told by PDB staff that an HPUB structure could not yet be released, because the publication mentioned was just a Nature "Letter", not an "Article".
On 7/9/13 10:21 AM, herman.schreu...@sanofi.com wrote:

> Dear Mark (and BB),
>
> I guess that, as usual, in real life the obvious is less obvious than it seems to be. I, and I guess many of my colleagues trying to find new drugs, have quite a few protein-inhibitor complexes where the inhibitor formed a covalent link with e.g. the active site serine. In these cases, I am perfectly happy with having the inhibitor defined as a separate group, linked via a LINK record. For me, it does not make sense to treat these covalent inhibitors differently from noncovalent inhibitors.
>
> In the end, I guess, it will boil down to some arbitrary choice, either imposed upon us by the PDB, or individually taken by the crystallographer who produced the crystal structure.
>
> My 2 cts,
> Herman
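The "MSE as M" point can be illustrated with a small lookup. This is only a sketch: the `PARENT` table and the function names are hypothetical, the choice of Q for pyroglutamic acid (PCA) is an assumption, and the preference for a depositor-supplied letter simply mirrors the suggestion made in the message above.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
STANDARD = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

# Hypothetical, deliberately tiny table of modified residues and the
# residue they are assumed to be encoded as.
PARENT = {
    "MSE": "M",  # selenomethionine -> Met
    "PCA": "Q",  # pyroglutamic acid; can arise from Gln or Glu (Gln assumed here)
}


def one_letter(resname, deposited=None):
    """Return a one-letter code, preferring the depositor-submitted letter."""
    if deposited is not None:
        return deposited
    if resname in STANDARD:
        return STANDARD[resname]
    # Fall back to 'X' only when nothing better is known.
    return PARENT.get(resname, "X")


def to_sequence(resnames, deposited=None):
    """Build a sequence string from residue names and optional deposited letters."""
    deposited = deposited or [None] * len(resnames)
    return "".join(one_letter(r, d) for r, d in zip(resnames, deposited))


print(to_sequence(["MSE", "ALA", "PCA"]))                             # MAQ
print(to_sequence(["MSE", "ALA", "PCA"], deposited=["M", "A", "E"]))  # MAE
```

With the deposited sequence taking priority, the Glu/Gln ambiguity of pyroglutamic acid is resolved by the depositor rather than by a fixed table.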
Re: [ccp4bb] modified amino acids in the PDB
Dear BB,

For my one cent's worth, I think it is simpler if the residue name in the coordinate set matches that in the sequence deposition for the protein (and DNA), i.e. before any post-translational modification is carried out. Whether a particular modification is actually present or absent may depend on the organism (or even the cell line) in which the protein was expressed, which is seldom the natural one.

Best wishes,
Pete

On 9 Jul 2013, at 16:21, herman.schreu...@sanofi.com wrote:

> Dear Mark (and BB),
>
> I guess that, as usual, in real life the obvious is less obvious than it seems to be. I, and I guess many of my colleagues trying to find new drugs, have quite a few protein-inhibitor complexes where the inhibitor formed a covalent link with e.g. the active site serine. In these cases, I am perfectly happy with having the inhibitor defined as a separate group, linked via a LINK record. For me, it does not make sense to treat these covalent inhibitors differently from noncovalent inhibitors.
>
> In the end, I guess, it will boil down to some arbitrary choice, either imposed upon us by the PDB, or individually taken by the crystallographer who produced the crystal structure.
>
> My 2 cts,
> Herman
[ccp4bb] AW: [ccp4bb] modified amino acids in the PDB
Dear Mark (and BB),

I guess that, as usual, in real life the obvious is less obvious than it seems to be. I, and I guess many of my colleagues trying to find new drugs, have quite a few protein-inhibitor complexes where the inhibitor formed a covalent link with e.g. the active site serine. In these cases, I am perfectly happy with having the inhibitor defined as a separate group, linked via a LINK record. For me, it does not make sense to treat these covalent inhibitors differently from noncovalent inhibitors.

In the end, I guess, it will boil down to some arbitrary choice, either imposed upon us by the PDB, or individually taken by the crystallographer who produced the crystal structure.

My 2 cts,
Herman

-----Original Message-----
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Mark J van Raaij
Sent: Tuesday, 9 July 2013 16:23
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] modified amino acids in the PDB

- Really, the only complicated case would be where a group is covalently linked to more than one amino acid, wouldn't it? Any case where only one covalent link with an amino acid is present could (should?) be treated as a special amino acid, i.e. like selenomethionine.
- Groups without any covalent links to the protein are better kept separate, I would think (but I guess this is stating the obvious).

Mark J van Raaij
Re: [ccp4bb] modified amino acids in the PDB
Hiya Mark,

Once again I find myself asking: why not provide the authors' deposited file alongside any other 'cleaned-up' PDB files? The authors' file could be foo.pdb_0; after PDB heterogen annotation you would get foo.pdb_1; and if the heterogens are subsequently handled differently, you could have foo.pdb_2 for the 'remediated' file. This last case has certainly happened, as there appear to be 'orphan' heterogen definitions in the PDB that are not currently used by any entries. Most likely these were 'split' when atoms were taken out and associated with polymer entities or, in some cases I have noticed, 'lumped' when they were combined with neighbouring residues to give a new, larger heterogen.

PDB file versioning would also give transparency in cases such as occupancy remediation, when the PDB altered all occupancies that summed to >1.0 to enforce a total of 1.0 (presumably leaving holes in the authors' density), which may mystify the sharp-eyed user. Currently there are the REVDAT lines in the header, but these give only the titles of the changed cards, not any detailed explanation. Likewise, REVDAT only starts at foo.pdb_1 in my example, so the authors' original intent is still missing.

All the best,
Martyn

From: Mark J van Raaij
To: CCP4BB@JISCMAIL.AC.UK
Sent: Tuesday, 9 July 2013, 15:23
Subject: Re: [ccp4bb] modified amino acids in the PDB

> - Really, the only complicated case would be where a group is covalently linked to more than one amino acid, wouldn't it? Any case where only one covalent link with an amino acid is present could (should?) be treated as a special amino acid, i.e. like selenomethionine.
> - Groups without any covalent links to the protein are better kept separate, I would think (but I guess this is stating the obvious).
>
> Mark J van Raaij
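To make the occupancy remediation mentioned above concrete, here is a minimal sketch of one way a set of alternate-conformer occupancies summing to more than 1.0 could be forced to a total of 1.0. Proportional rescaling is an assumption made for illustration; the thread does not say how the PDB actually altered the values, and `rescale_occupancies` is a hypothetical name.

```python
def rescale_occupancies(occupancies, tol=1e-6):
    """Scale occupancies down proportionally if they sum to more than 1.0.

    Values that already sum to 1.0 or less are returned unchanged.
    """
    total = sum(occupancies)
    if total <= 1.0 + tol:
        return list(occupancies)
    return [occ / total for occ in occupancies]


# Two alternate conformers deposited with occupancies 0.7 and 0.6.
deposited = [0.7, 0.6]
remediated = rescale_occupancies(deposited)
print(deposited, "->", [round(o, 3) for o in remediated])
```

Keeping the deposited values (the foo.pdb_0 of the message above) next to the remediated ones is what would let a reader see that such a change was made.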
Re: [ccp4bb] modified amino acids in the PDB
- Really, the only complicated case would be where a group is covalently linked to more than one amino acid, wouldn't it? Any case where only one covalent link with an amino acid is present could (should?) be treated as a special amino acid, i.e. like selenomethionine.
- Groups without any covalent links to the protein are better kept separate, I would think (but I guess this is stating the obvious).

Mark J van Raaij
Lab 20B
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/~mjvanraaij

On 9 Jul 2013, at 12:49, Frances C. Bernstein wrote:

> In trying to formulate a suggested policy on het groups versus modified side chains, one needs to think about the various cases that have arisen.
>
> Perhaps the earliest one I can think of is a heme group. One could view it as a very large decoration on a side chain but, as everyone knows, one heme group makes four links to residues. In the early days of the PDB we decided that heme "obviously" had to be represented as a separate group.
>
> I would also point out that nobody would seriously suggest that selenomethionine should be represented as a methionine with a missing sulfur and a selenium het group bound to it.
>
> Unfortunately, all the cases that fall between selenomethionine and heme are more difficult. Perhaps the best that one can hope for is that whichever representation is chosen for a particular case, it be consistent across all entries.
>
> Frances
>
> P.S. One can also have similar discussions about the representation of microheterogeneity and of sugar chains, but we should leave those for another day.
Re: [ccp4bb] modified amino acids in the PDB
In trying to formulate a suggested policy on het groups versus modified side chains, one needs to think about the various cases that have arisen.

Perhaps the earliest one I can think of is a heme group. One could view it as a very large decoration on a side chain but, as everyone knows, one heme group makes four links to residues. In the early days of the PDB we decided that heme "obviously" had to be represented as a separate group.

I would also point out that nobody would seriously suggest that selenomethionine should be represented as a methionine with a missing sulfur and a selenium het group bound to it.

Unfortunately, all the cases that fall between selenomethionine and heme are more difficult. Perhaps the best that one can hope for is that whichever representation is chosen for a particular case, it be consistent across all entries.

Frances

P.S. One can also have similar discussions about the representation of microheterogeneity and of sugar chains, but we should leave those for another day.

Bernstein + Sons
Information Systems Consultants
5 Brewster Lane, Bellport, NY 11713-2803
Frances C. Bernstein
f...@bernstein-plus-sons.com
1-631-286-1339  FAX: 1-631-286-1999

On Tue, 9 Jul 2013, MARTYN SYMMONS wrote:

> Hi Clemens,
>
> I guess the reason you say 'arbitrary' is because there is no explanation of this rule decision? It would be nice if some rationalization were available alongside the values given, for instance a sentence along the lines of 'we set the number owing to the following considerations'.
>
> However, a further layer of variation is that the rule does not seem to be consistently applied. Just browsing CYS modifications: iodoacetamide treatment gives a CYS with only 4 additional atoms, but it is split off as ACM. However, some ligands much larger than 10 atoms have been kept with the cysteine (for example CY7 in 2jiv and NPH in 1a18).
>
> My betting is that it depends on whether something has previously been seen 'going solo' as a non-covalent ligand, so that it pops up as an atomic structural match with a pre-defined three-letter code. This would explain, for example, the ACM case, which you might expect to occur in a modified Cys. But it has also been observed as a non-polymer ligand in its own right, so it goes on as a separate modification? However, to be honest, I am not sure I have ever seen the rationale for this written down.
>
> 'Non-polymer' heterogens can turn up either linked or not. Once they are in the residues, they have to make a call on which kind of backbone they will feature in within the PDB. That is why there is 'D5M' for non-polymer deoxyAMP, also known as ' DA' when it is 'DNA-linking', but so far not fessing up to life under a third code as 'RNA-linking'.
>
> Now is perhaps the time to ask for explanations of these nomenclature features, before they become hard-wired in the new PDB deposition system (however there may be time - I refer you to my previous posting ;).
>
> Cheers,
> Martyn
Re: [ccp4bb] modified amino acids in the PDB
Hi Clemens,

I guess the reason you say 'arbitrary' is because there is no explanation of this rule decision? It would be nice if some rationalization were available alongside the values given, for instance a sentence along the lines of 'we set the number owing to the following considerations'.

However, a further layer of variation is that the rule does not seem to be consistently applied. Just browsing CYS modifications: iodoacetamide treatment gives a CYS with only 4 additional atoms, but it is split off as ACM. However, some ligands much larger than 10 atoms have been kept with the cysteine (for example CY7 in 2jiv and NPH in 1a18).

My betting is that it depends on whether something has previously been seen 'going solo' as a non-covalent ligand, so that it pops up as an atomic structural match with a pre-defined three-letter code. This would explain, for example, the ACM case, which you might expect to occur in a modified Cys. But it has also been observed as a non-polymer ligand in its own right, so it goes on as a separate modification? However, to be honest, I am not sure I have ever seen the rationale for this written down.

'Non-polymer' heterogens can turn up either linked or not. Once they are in the residues, they have to make a call on which kind of backbone they will feature in within the PDB. That is why there is 'D5M' for non-polymer deoxyAMP, also known as ' DA' when it is 'DNA-linking', but so far not fessing up to life under a third code as 'RNA-linking'.

Now is perhaps the time to ask for explanations of these nomenclature features, before they become hard-wired in the new PDB deposition system (however there may be time - I refer you to my previous posting ;).

Cheers,
Martyn

Dr Martyn Symmons
Cambridge

From: Michael Weyand
To: CCP4BB@JISCMAIL.AC.UK
Sent: Monday, 8 July 2013, 10:03
Subject: [ccp4bb] modified amino acids in the PDB

> Dear colleagues,
>
> We deposited protein structures with modified lysine side chains and were surprised that the PDB treats the modification as an independent molecule, with a “LINK” record indicating the covalent bond, instead of defining a modified residue (that’s what we had uploaded to the PDB). Apparently, anything attached to an amino acid is considered an independent molecule (and the lysine just called a regular lysine) if it comprises more than 10 atoms (see below for the PDB guidelines). I think that’s kind of arbitrary, and I would give all modified residues modified names as well, i.e. individual names for all modified lysines, as is done for acetyl- or methyl-lysines, for example. I wonder what other people’s opinions are?!
>
> Best regards,
> Clemens
>
> This is in accordance with the wwPDB annotation guidelines (http://www.wwpdb.org/procedure.html#toc_2):
>
> "*Modified amino acids and nucleotides* If an amino acid or nucleotide is modified by a chemical group greater than 10 atoms, the residue will be split into two groups: the amino acid/nucleotide group and the modification. A link record will be generated between the amino acid/nucleotide group and the modification. For modified amino acids and nucleotides that were not split will follow standard atom nomenclature."
[ccp4bb] modified amino acids in the PDB
Dear colleagues,

We deposited protein structures with modified lysine side chains and were surprised that the PDB treats the modification as an independent molecule, with a “LINK” record indicating the covalent bond, instead of defining a modified residue (that’s what we had uploaded to the PDB). Apparently, anything attached to an amino acid is considered an independent molecule (and the lysine just called a regular lysine) if it comprises more than 10 atoms (see below for the PDB guidelines). I think that’s kind of arbitrary, and I would give all modified residues modified names as well, i.e. individual names for all modified lysines, as is done for acetyl- or methyl-lysines, for example. I wonder what other people’s opinions are?!

Best regards,
Clemens

This is in accordance with the wwPDB annotation guidelines (http://www.wwpdb.org/procedure.html#toc_2):

"*Modified amino acids and nucleotides* If an amino acid or nucleotide is modified by a chemical group greater than 10 atoms, the residue will be split into two groups: the amino acid/nucleotide group and the modification. A link record will be generated between the amino acid/nucleotide group and the modification. For modified amino acids and nucleotides that were not split will follow standard atom nomenclature."
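Read literally, the quoted guideline is a simple threshold test, which can be sketched as follows. The function and field names are hypothetical, and the sketch counts whatever atoms it is given; whether hydrogens count towards the limit is not stated in the guideline, so that is left to the caller. As noted elsewhere in this thread, actual annotation does not always seem to follow the threshold.

```python
ATOM_LIMIT = 10  # threshold taken from the guideline text quoted above


def annotate(parent_residue, modification_atoms):
    """Apply the 'greater than 10 atoms' rule to a modified residue.

    Returns a dict describing whether the modification is split off as a
    separate group joined by a link record, or kept as a modified residue.
    """
    if len(modification_atoms) > ATOM_LIMIT:
        # Large modification: regular residue + separate group + link record.
        return {
            "residue": parent_residue,
            "separate_group": list(modification_atoms),
            "link_record": True,
        }
    # Small modification: kept together as one modified residue.
    return {
        "residue": "modified " + parent_residue,
        "separate_group": None,
        "link_record": False,
    }


# An acetyl group on lysine (three non-hydrogen atoms) stays with the residue.
print(annotate("LYS", ["CH", "OH", "CH3"])["link_record"])
# A hypothetical 11-atom modification is split off.
print(annotate("LYS", ["X%d" % i for i in range(11)])["link_record"])
```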