Re: [ccp4bb] [Fwd: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)]
Hi Ben

From discussions we have had with PDBe, they consider tautomers to be different compounds (just as stereoisomers would be considered to be different compounds), since they require different restraint dictionaries, so each tautomer that was observed would require a unique 3-letter code. Of course you still have to have evidence (e.g. from the H-bonding pattern) that what you are really seeing are different tautomers, but that's a different question.

Cheers

-- Ian

On 24 June 2015 at 12:50, Ben Bax benjamin.d@gsk.com wrote:

> Another major problem with the PDB is that it does not seem to believe in the existence of different tautomers or protonation states. For example the ATP analogue AMPPNP can have the nitrogen between the beta and gamma phosphates protonated (-P-NH-P) or unprotonated (P-N=P), and there are well-documented examples of both tautomers in the PDB (NH being a hydrogen bond donor and N a hydrogen bond acceptor). If you look in the CSD you can see that the protonation state of the nitrogen changes the geometry of the P-N-P bond. However, as I understand it, the PDB considers all tautomeric (and protonated) forms of AMPPNP the same. When I tried to deposit a specific AMPPNP tautomer in 2013, they would not accept it. The PDB also seems to believe, as I understand it, that the overall charge on AMPPNP is zero and that the phosphates do not carry negative charge.
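To make the tautomer point in this message concrete, here is a minimal sketch in Python. It assumes a toy bond-list representation, not any real PDB or CCD format; the helper names `skeleton_key` and `full_key` are hypothetical, and the atom names are only illustrative. It shows why a match on heavy-atom connectivity alone would treat the protonated (-P-NH-P-) and unprotonated (P-N=P) bridges of AMPPNP as the same compound, while a key that also carries hydrogens and bond orders would not.

```python
# Toy model: a bond is (atom_a, atom_b, bond_order).
# Assumption: hydrogens are recognised by a leading "H" in the atom name,
# which is good enough for this fragment but not for elements such as Hg.

def skeleton_key(bonds):
    """Heavy-atom connectivity only: hydrogens and bond orders are ignored."""
    return frozenset(
        frozenset((a, b))
        for a, b, _order in bonds
        if not a.startswith("H") and not b.startswith("H")
    )

def full_key(bonds):
    """Connectivity including hydrogens and bond orders."""
    return frozenset((frozenset((a, b)), order) for a, b, order in bonds)

# Bridging fragment with the nitrogen protonated (-P-NH-P-)
protonated = [("PB", "N3B", 1), ("N3B", "PG", 1), ("N3B", "HN", 1)]
# Bridging fragment with the nitrogen unprotonated (P-N=P)
unprotonated = [("PB", "N3B", 1), ("N3B", "PG", 2)]

same_skeleton = skeleton_key(protonated) == skeleton_key(unprotonated)
same_compound = full_key(protonated) == full_key(unprotonated)
print(same_skeleton, same_compound)  # True False
```

Which of the two keys a ligand dictionary should use is exactly the policy question under discussion here; the sketch only illustrates the difference between them.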
Re: [ccp4bb] [Fwd: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)]
Another major problem with the PDB is that it does not seem to believe in the existence of different tautomers or protonation states. For example the ATP analogue AMPPNP can have the nitrogen between the beta and gamma phosphates protonated (-P-NH-P) or unprotonated (P-N=P), and there are well-documented examples of both tautomers in the PDB (NH being a hydrogen bond donor and N a hydrogen bond acceptor). If you look in the CSD you can see that the protonation state of the nitrogen changes the geometry of the P-N-P bond. However, as I understand it, the PDB considers all tautomeric (and protonated) forms of AMPPNP the same. When I tried to deposit a specific AMPPNP tautomer in 2013, they would not accept it. The PDB also seems to believe, as I understand it, that the overall charge on AMPPNP is zero and that the phosphates do not carry negative charge.

Ben Bax
Senior Scientific Investigator
BioMolecular Sciences UK
RD Platform Technology Science
GSK
Medicines Research Centre, Gunnels Wood Road, Stevenage, SG1 2NY, UK

-----Original Message-----
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Martyn Symmons
Sent: 22 June 2015 23:39
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] [Fwd: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)]

Well the problem is there is a lot more to a ligand than PDB coordinates - little things like bond orders... In addition people can publish ligands with atoms for which they have no density - so zero-occupancy is allowed too. So who should get priority - the group who publishes a ligand first, or the ones who actually have density for all the atoms?

These sorts of complications mean we all benefit from peer-review of the structure - that is why we put things on hold. And authors should have a chance to change their ligand definition based on reviewers' comments - just as they are allowed to improve the PDB coordinates. So it is a worry for them that the PDB might 'publish' the ligand aspect of their work before they have completed the peer-review process.

Maybe you don't believe in peer-review - in reply to which I'd paraphrase what people say about democracy - it's pretty bad but better than the alternatives.

But to return to the point I made: what really is the problem with maintaining and modifying _separate_ definitions with authors' _separate_ deposited coordinates (and bond orders) while structures are on hold and being reviewed? Journals manage to keep all those submitted papers separate in their databases.

cheers
M.
Re: [ccp4bb] [Fwd: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)]
Good afternoon both,

there is also the issue of inconsistency of presentation. For example, lysine, that is L-lysine (LYS), is protonated on the side-chain nitrogen (NZ), whereas D-lysine (DLY) is not, i.e. you have NZ(HZ1, HZ2) for DLY and NZ(HZ1, HZ2, HZ3) for LYS.

Miri

On Wed, 2015-06-24 at 13:35 +0100, Ian Tickle wrote:

> Hi Ben
>
> From discussions we have had with PDBe, they consider tautomers to be different compounds (just as stereoisomers would be considered to be different compounds), since they require different restraint dictionaries, so each tautomer that was observed would require a unique 3-letter code. Of course you still have to have evidence (e.g. from the H-bonding pattern) that what you are really seeing are different tautomers, but that's a different question.
>
> Cheers
>
> -- Ian
Re: [ccp4bb] [Fwd: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)]
Well the problem is there is a lot more to a ligand than PDB coordinates - little things like bond orders... In addition people can publish ligands with atoms for which they have no density - so zero-occupancy is allowed too. So who should get priority - the group who publishes a ligand first, or the ones who actually have density for all the atoms?

These sorts of complications mean we all benefit from peer-review of the structure - that is why we put things on hold. And authors should have a chance to change their ligand definition based on reviewers' comments - just as they are allowed to improve the PDB coordinates. So it is a worry for them that the PDB might 'publish' the ligand aspect of their work before they have completed the peer-review process.

Maybe you don't believe in peer-review - in reply to which I'd paraphrase what people say about democracy - it's pretty bad but better than the alternatives.

But to return to the point I made: what really is the problem with maintaining and modifying _separate_ definitions with authors' _separate_ deposited coordinates (and bond orders) while structures are on hold and being reviewed? Journals manage to keep all those submitted papers separate in their databases.

cheers
M.

On Mon, Jun 22, 2015 at 3:12 AM, Edward A. Berry ber...@upstate.edu wrote:

> > I can't imagine a journal doing that can you? When I work on my supplementary material in a paper I don't expect that the journal will take a bit out and publish it separately to support the work of my competitors. Not out of spite that I was beaten - but because I don't want to take the responsibility for checking their science for them!
>
> I don't see the problem here. What about the dozens of authors who will benefit from using your ligand in their structure _after_ your structure comes out? You don't take responsibility for checking their science. Every author gets a copy of his final structure to check before it is released and each is responsible for his own. The only difference here is whether the competitor got to use it first (which might sting a bit) or only after you had already made it your own with the first structure.
>
> I guess the ligand database is the responsibility of the PDB, but they depend on first depositors to help set up each ligand, so it is not surprising if the type model has coordinates from the first depositor's structure (although it would be convenient if they were all moved to c.o.m. at 0,0,0). When another group publishes a structure with the ligand, they will not be publishing the first depositor's coordinates because the ligand will be moved to its position in their structure and refined against their data, probably with somewhat different restraints.
>
> If the ligand is a top secret novel drug lead that your company is developing I guess it would come as a shock to find someone else has already deposited it, and it might be good to hasten not the publication but the protection of the compound with a patent!
>
> Although Miriam says a new 3-letter code is generated when no match is found, I believe the depositor's code will be used if it is available, at least one of mine was last year, so there is some use for Nigel's utility if you want to stamp your new compound with a memorable name.
>
> eab
Re: [ccp4bb] [Fwd: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)]
Sun., June 21st 2015

Good evening,

adding several general points to the thread.

(1) Fundamentally the PDB, unlike other chemical databases, insists that all equal structures should have the same 3-letter code and the same atom names - obviously for amino acids and, say, ATP.

(1.1) Needless to say, there are endless examples in the PDB of two ligands that differ by, let's say, one hydroxyl group, where equivalent atoms in the two ligands have totally different names.

(2) When a structure is deposited with a ligand, the ligand is first compared against the PDB chem_comp database (CCD) and against the on-hold chem_comp (CCD) (naturally the latter is not publicly available), and only if no match can be found is a new three-letter code generated and assigned. If this does not happen, it is a mistake in annotation and should not occur.

(3) Exceptions to the above take several different flavours. These include:

(3.1) When the same ligand is described in the PDB both as a single 3-letter code and as a combination of two different 3-letter-code ligands. An example out of many is phosphoserine. The 3-letter code in the PDB CCD is SEP, which is used in 704 PDB entries (RCSB count, 21 June 2015). But in PDB entry 3uw2 the phosphoserine 109A is described as a combination of SER and the inorganic phosphate PO4 !!! (A side point: note the inorganic PO4 became organic upon this linkage - a PDB chemical conundrum!!)

(3.2) CCDC does not make any attempt to standardise atom names, nor to give identical structures identical atom names. Original author atom names are kept, so amino acids may have bizarre atom names, and where required, symmetry atom names are generated. This is rare in the PDB but not unknown, and the PDB is poor at completing atom/ligand names where symmetry is required; in fact this often is not completed in any chemically reasonable sense, as it would require changes in occupancy. The simplest case is in racemic PDB entries, where the symmetry-generated structure for, say, L-ALA should be the D-version DAL, but the PDB as it is has not coped with this, as it would require two sets of coordinates, each at, say, 1/2 occupancy (usually).

One of several examples in the PDB archive is PDB entry 3e7r, the X-ray structure of racemic plectasin. The entry consists of one protein chain, in space group P-1. In the manuscript http://onlinelibrary.wiley.com/doi/10.1002/pro.127/pdf Figure 3a, for example, shows "Crystal packing. (a) Centrosymmetric P-1 unit cell. The L-plectasin molecule is shown in blue and the D-plectasin molecule is in gold." But if you use the PDB entry and the symmetry operator of P-1 to generate the two symmetry-related mates in the unit cell, you will get a chain with L-named residues GLY-PHE-GLY-CYS-ASN-GLY-PRO- etc. representing D-amino acids. (GLY is a special case.)

(3.3) There is also the problem in assigning a 3-letter code where the submission has obviously assigned the wrong chirality. One example is where the sugar must be NAG but is assigned NGA in a glycopeptide where NGA is impossible - the PDB should have assigned NAG with a CAVEAT that the chirality is incorrect. Note, re-refinement by other software will require a bond breakage. NGA is used in 90 entries (RCSB count, 21 June 2015).

regards
Miri

From: Yong Wang wang_yon...@lilly.com
Reply-to: Yong Wang wang_yon...@lilly.com
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)
Date: Sat, 20 Jun 2015 18:36:34 +

Sharing a ligand name should only be limited to having the same compound, i.e. same 2D structure or connectivity. Each deposition should have its own 3D coordinates. If a different publication gets your ligand 3D coordinates (2Y59 actually embodies the atomic coordinates from 2Y2I), that looks to me like an oversight by the PDB. It is hard to believe that the PDB intended to use the 3D coordinates from one entry for the other, ligand or not. In fact, the restraints as described by the ligand dictionary should also be kept separate, as that reflects how the authors refine their ligand.

Yong

-----Original Message-----
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Martyn Symmons
Sent: Friday, June 19, 2015 8:39 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)

By oversimplifying the situation here the PDB does not answer my related point about competing crystallographers:

My scenario: Group A deposits a structure with a new drug and gets their three-letter code, for example ZA3; they then get to check the coordinates and chemical definition of this ligand. But suppose a little after that a competing group B deposits their structure with the same drug, which they think is novel - but no... they get assigned the now described ZA3, which has been checked by the other group. Then it is a race to see who gets to publish and release first. And if
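The matching step in Miri's point (2), together with the Group A / Group B scenario quoted above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PDB's actual procedure: `assign_code`, the string keys standing in for a ligand's chemical identity, and the "first unused code" fallback are all hypothetical. The preference for a depositor-suggested code follows a belief expressed elsewhere in this thread rather than any documented rule.

```python
import itertools
import string

def assign_code(ligand_key, released, on_hold, requested=None):
    """Return (code, is_new) for a deposited ligand.

    released / on_hold: dicts mapping a ligand identity key to a 3-letter code
    (stand-ins for the public CCD and the non-public on-hold CCD).
    requested: optional code suggested by the depositor (assumed behaviour).
    """
    # Step 1: reuse an existing code if the ligand matches either dictionary.
    for dictionary in (released, on_hold):
        if ligand_key in dictionary:
            return dictionary[ligand_key], False

    # Step 2: no match, so a new code is needed.
    taken = set(released.values()) | set(on_hold.values())
    if requested and requested not in taken:
        code = requested
    else:
        # Hypothetical fallback: first unused 3-character code.
        alphabet = string.ascii_uppercase + string.digits
        code = next(
            "".join(chars)
            for chars in itertools.product(alphabet, repeat=3)
            if "".join(chars) not in taken
        )
    on_hold[ligand_key] = code  # new definitions start life on hold
    return code, True

released = {"key-ATP": "ATP"}
on_hold = {}
# Group A deposits first and receives a new code.
code_a, new_a = assign_code("key-drug", released, on_hold, requested="ZA3")
# Group B deposits the same compound later and is matched to the on-hold entry.
code_b, new_b = assign_code("key-drug", released, on_hold)
print(code_a, new_a, code_b, new_b)  # ZA3 True ZA3 False
```

In this sketch Group B's match comes from the on-hold dictionary, which is precisely the step whose consequences are being debated in the thread.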
Re: [ccp4bb] [Fwd: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)]
> I can't imagine a journal doing that can you? When I work on my supplementary material in a paper I don't expect that the journal will take a bit out and publish it separately to support the work of my competitors. Not out of spite that I was beaten - but because I don't want to take the responsibility for checking their science for them!

I don't see the problem here. What about the dozens of authors who will benefit from using your ligand in their structure _after_ your structure comes out? You don't take responsibility for checking their science. Every author gets a copy of his final structure to check before it is released and each is responsible for his own. The only difference here is whether the competitor got to use it first (which might sting a bit) or only after you had already made it your own with the first structure.

I guess the ligand database is the responsibility of the PDB, but they depend on first depositors to help set up each ligand, so it is not surprising if the type model has coordinates from the first depositor's structure (although it would be convenient if they were all moved to c.o.m. at 0,0,0). When another group publishes a structure with the ligand, they will not be publishing the first depositor's coordinates because the ligand will be moved to its position in their structure and refined against their data, probably with somewhat different restraints.

If the ligand is a top secret novel drug lead that your company is developing I guess it would come as a shock to find someone else has already deposited it, and it might be good to hasten not the publication but the protection of the compound with a patent!

Although Miriam says a new 3-letter code is generated when no match is found, I believe the depositor's code will be used if it is available, at least one of mine was last year, so there is some use for Nigel's utility if you want to stamp your new compound with a memorable name.

eab

On 06/21/2015 06:33 PM, Martyn Symmons wrote:

> Miri raises important points about issues in the PDB Chemical Component Dictionary - I think part of the problem is that this is published completely separately from the actual PDB - so for example I don't think we have an archive of the CCD for comparison alongside the PDB snapshots? This makes it difficult to follow the convoluted track of particular ligands through the PDB's many, many changes to small molecule definitions.
>
> But following discussion with other contributors offline I want to make it clear what is my understanding of the ZA3 (2Y2I / 2Y59) case: I am clear there was no unethical behaviour by either group in the course of their work on these structures and the publication of them.
>
> The problem I am highlighting is that the PDB don't understand publishing ethics - what happened in ZA3 was that they published a little bit of one group's work to support the work of someone who was scooping them!
>
> I can't imagine a journal doing that can you? When I work on my supplementary material in a paper I don't expect that the journal will take a bit out and publish it separately to support the work of my competitors. Not out of spite that I was beaten - but because I don't want to take the responsibility for checking their science for them!
>
> All the best
> Martyn
> Cambridge
Re: [ccp4bb] [Fwd: Re: [ccp4bb] FW: New ligand 3-letter code (help-7071)]
Miri raises important points about issues in the PDB Chemical Component Dictionary - I think part of the problem is that this is published completely separately from the actual PDB - so for example I don't think we have an archive of the CCD for comparison alongside the PDB snapshots? This makes it difficult to follow the convoluted track of particular ligands through the PDB's many, many changes to small molecule definitions.

But following discussion with other contributors offline I want to make it clear what is my understanding of the ZA3 (2Y2I / 2Y59) case: I am clear there was no unethical behaviour by either group in the course of their work on these structures and the publication of them.

The problem I am highlighting is that the PDB don't understand publishing ethics - what happened in ZA3 was that they published a little bit of one group's work to support the work of someone who was scooping them!

I can't imagine a journal doing that can you? When I work on my supplementary material in a paper I don't expect that the journal will take a bit out and publish it separately to support the work of my competitors. Not out of spite that I was beaten - but because I don't want to take the responsibility for checking their science for them!

All the best
Martyn
Cambridge

On Sun, Jun 21, 2015 at 7:01 PM, Miri Hirshberg 02897e8e9f0f-dmarc-requ...@jiscmail.ac.uk wrote:

> Sun., June 21st 2015
>
> Good evening,
>
> adding several general points to the thread.
>
> (1) Fundamentally the PDB, unlike other chemical databases, insists that all equal structures should have the same 3-letter code and the same atom names - obviously for amino acids and, say, ATP.
>
> (1.1) Needless to say, there are endless examples in the PDB of two ligands that differ by, let's say, one hydroxyl group, where equivalent atoms in the two ligands have totally different names.