How difficult would it be to change the state script generator in Jmol to avoid using atom serial numbers?
I'm sure none of us ever envisioned the current state of affairs. We have who knows how many state scripts saved in Proteopedia, and now an unknown number of March-17-remediated PDB files have scrambled atom serial numbers. Further, there is no guarantee that such scrambling will not occur again in a future remediation.
A change in Jmol to avoid serial number dependency in saved state scripts will not avoid the need for repairs to Proteopedia now, but may avoid such calamities in the future.
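One way to picture serial-number independence, sketched here in Python rather than in Jmol's own code: identify each atom by chain ID, residue name, residue number, insertion code, atom name, and alternate location, then use that key to translate serial numbers recorded against an old file into the serial numbers of a remediated file. The names AtomKey, index_by_serial, and translate_serials are hypothetical, the column positions assume the fixed-width PDB ATOM/HETATM layout, and only the first model is read. This is a sketch under those assumptions, not a description of how Jmol writes or reads state scripts.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class AtomKey(NamedTuple):
    """Serial-independent atom identity (hypothetical helper type)."""
    chain: str
    res_name: str
    res_seq: int
    i_code: str
    atom_name: str
    alt_loc: str


def index_by_serial(pdb_lines: Iterable[str]) -> Dict[int, AtomKey]:
    """Map atom serial number -> AtomKey for the first model of a PDB file."""
    table: Dict[int, AtomKey] = {}
    for raw in pdb_lines:
        line = raw.rstrip("\n").ljust(27)  # pad so fixed columns always exist
        record = line[:6]
        if record == "ENDMDL":  # stop after the first model
            break
        if record not in ("ATOM  ", "HETATM"):
            continue
        table[int(line[6:11])] = AtomKey(
            chain=line[21].strip(),
            res_name=line[17:20].strip(),
            res_seq=int(line[22:26]),
            i_code=line[26].strip(),
            atom_name=line[12:16].strip(),
            alt_loc=line[16].strip(),
        )
    return table


def translate_serials(
    old_lines: Iterable[str],
    new_lines: Iterable[str],
    old_serials: Iterable[int],
) -> List[Optional[int]]:
    """For each old serial, return the matching serial in the new file, or None."""
    old_table = index_by_serial(old_lines)
    new_by_key = {key: serial for serial, key in index_by_serial(new_lines).items()}
    result: List[Optional[int]] = []
    for serial in old_serials:
        key = old_table.get(serial)
        result.append(new_by_key.get(key) if key is not None else None)
    return result
```

Atoms whose chain identifier itself changed in remediation (such as the ADP in 1e3m) would come back as None under this scheme and would still need manual attention.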
Regards, -Eric
Date: Mon, 30 Mar 2009 15:09:56 -0400
From: John Westbrook <jw...@rcsb.rutgers.edu>
Reply-To: jw...@rcsb.rutgers.edu
Organization: RCSB - Protein Data Bank
To: ema...@microbio.umass.edu
CC: Helen Berman <ber...@rcsb.rutgers.edu>, Kim Henrick <henr...@ebi.ac.uk>
Subject: Re: Chain order changes: a problem for Proteopedia
Dear Eric,
In producing the V3.2 wwPDB release, we have tried to preserve as much
as possible the PDB chain and residue nomenclature for polymer molecular components.
Version 3.2 introduces a more uniform assignment of
PDB chain identifiers for ligands and solvent, and this has resulted in
some nomenclature changes in V3.2 files.
Neither the V2.6 nor the V3.2 format specification suggests that the atom serial
number be used as the primary atom identifier. I am sure that you are aware that
there are now many PDB entries that have been split across multiple PDB data
files specifically because of the limited range of atom serial numbers.
Atom serial numbers are also replicated between models so they do not represent
unique atom identifiers for NMR or other multi-model entries.
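The point about multi-model entries can be checked directly. The sketch below counts how often each serial number appears on ATOM/HETATM records in a PDB-format file and reports those that occur more than once. The helper name repeated_serials is hypothetical, and the code assumes the standard fixed-width layout with the serial number in columns 7-11.

```python
from collections import Counter
from typing import Iterable, List


def repeated_serials(pdb_lines: Iterable[str]) -> List[int]:
    """Return, in ascending order, serial numbers used by more than one atom record."""
    counts = Counter(
        int(line[6:11])
        for line in pdb_lines
        if line[:6] in ("ATOM  ", "HETATM")
    )
    return sorted(serial for serial, n in counts.items() if n > 1)
```

A non-empty result for a multi-model file means a serial number alone cannot identify an atom in that file without also naming the model.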
If your software system has a specific dependency on prior atom serial numbers,
then you can always recover the particular version of the PDB entry
that you used from our ftp snapshot server (ftp://snapshots.rcsb.org).
I should point out that we provided Jaim with an advance copy of the
V3.2 files for testing on Dec 3, 2008. Atom ordering
was not raised as an issue at that time.
Regards,
John
>>>
>>> Begin forwarded message:
>>>
>>>> From: Eric Martz <ema...@microbio.umass.edu>
>>>> Date: March 29, 2009 7:27:12 PM PDT
>>>> To: "i...@rcsb.org" <i...@rcsb.org>, "pd...@rcsb.org" <pd...@rcsb.org>
>>>> Subject: pdb-l: Chain order changes: a problem for Proteopedia
>>>>
>>>> Dear wwPDB:
>>>>
>>>> The March 17, 2009 remediation of PDB data in the wwPDB (PDB format
>>>> 3.20) appears to me to have, in many cases, changed the order of
>>>> chains, and hence the atom serial numbers in the PDB files. This has
>>>> created a major problem in the wiki Proteopedia.Org, where many
>>>> molecular scenes that took hours or weeks to develop are now
>>>> nonfunctional.
>>>>
>>>> The problem arises because Jmol uses atom serial numbers for
>>>> selecting groups of atoms when it saves a molecular scene (in a
>>>> "state script"). Proteopedia's Scene Authoring Tool uses Jmol's state
>>>> scripts to capture molecular scenes and attach them to "green links".
>>>>
>>>> Questions:
>>>>
>>>> 1. Were the names of ATOM chains ever changed? I assume (and hope)
>>>> not, but I have not checked carefully. I see that the chain names
>>>> assigned to HETATMs were changed in some cases, e.g. 1e3m, where an
>>>> ADP single-residue "chain" originally named chain C (before the 2007
>>>> remediation) is now deemed to be part of chain A (and its position
>>>> was moved to the end of the file, after all ATOM records). Since I
>>>> have been unable to get pre-March-17 snapshot PDB files (the
>>>> snapshot.wwpdb.org server is unresponsive) I am not sure when each of
>>>> these changes was made.
>>>>
>>>> 2. Was the changing of chain orders in the March 17 remediation
>>>> intentional? If so, is the new order specified somewhere in the 3.20
>>>> documentation? I can see no pattern to the new chain orders (see
>>>> examples below).
>>>>
>>>> 3. Were chain orders ever changed in files that contain only protein
>>>> chains (no nucleic acids)?
>>>>
>>>> 4. Will the changes in chain order be retained permanently (requiring
>>>> substantial repairs to Proteopedia.Org)?
>>>>
>>>> Observations:
>>>>
>>>> We first noticed the broken molecular scenes in Proteopedia in cases
>>>> that involved DNA. Therefore I have so far limited my inspection of
>>>> PDB files to those containing both protein and DNA.
>>>>
>>>> Since the snapshot ftp server is unresponsive today, my comparisons
>>>> were all made between files I had saved before the 2007 remediation
>>>> (typically saved 2001-2004), and current files. We have reason to
>>>> suspect that changes in chain ordering occurred in the March 17, 2009
>>>> remediation, but I cannot verify this for the cases below.
>>>>
>>>> Some files have NO CHANGE in chain order:
>>>> 1d66: DE (DNA), AB (protein).
>>>> 1osl: (an NMR multiple model file) AB (protein), CD (DNA).
>>>> 1e3m: old AB (protein), C (single residue ADP HETATM "chain"), EF
>>>> (DNA); new AB, EF. (ADP now in chain A at the end, thus changing ATOM
>>>> serial numbers.)
>>>> Thus there appears to be no requirement for nucleic acid or
>>>> protein chains to come first.
>>>>
>>>> Some files that had protein first were rearranged to put DNA first:
>>>> 1aoi: old ABCDEFGH (protein), IJ (DNA); new IJ, ABCDEFGH.
>>>> 1fzp: old DB (protein), WK (DNA); new WK, DB.
>>>> 1hcr: old A (protein), BC (DNA); new BC, A.
>>>> Thus there appears to be no requirement that chains be in
>>>> alphabetic order.
>>>>
>>>> One file had an RNA chain moved to BETWEEN two DNA chains, leaving
>>>> protein before DNA:
>>>> 1qln: old A (protein), TN (DNA), R (RNA); new A (protein), N
>>>> (DNA), R (RNA), T (DNA).
>>>> The new order happens to be alphabetical by chain name, but
>>>> this is not true in other files (see above).
>>>>
>>>> I did not happen to come across a case where DNA chains preceded
>>>> protein in the old format, with protein being moved before DNA in the
>>>> new format.
>>>>
>>>> There also appears to be no requirement that chains be in the order
>>>> given in the COMPND records. Examples where the order differs in the
>>>> new files: 1flo, 1qln.
>>>>
>>>> Sincerely, -Eric
>>>>
>>>> /* - - - - - - - - - - - - - - - - - - - - - - - - - - -
>>>> Eric Martz, Professor Emeritus, Dept Microbiology
>>>> U Mass, Amherst -- http://Martz.MolviZ.Org
>>>>
>>>> Top Five 3D MolVis Technologies http://Top5.MolviZ.Org
>>>> 3D Wiki with Scene-Authoring Tools http://Proteopedia.Org
>>>> Biochem 3D Education Resources http://MolviZ.org
>>>> See 3D Molecules, Install Nothing! - http://firstglance.jmol.org
>>>> ConSurf - Find Conserved Patches in Proteins: http://consurf.tau.ac.il
>>>> Atlas of Macromolecules: http://atlas.molviz.org
>>>> Workshops: http://workshops.molviz.org
>>>> World Index of Molecular Visualization Resources:
>>>> http://molvisindex.org
>>>> PDB Lite Macromolecule Finder: http://pdblite.org
>>>> Molecular Visualization EMail List (molvis-list):
>>>> http://list.molviz.org
>>>> Protein Explorer - 3D Visualization: http://proteinexplorer.org
>>>> - - - - - - - - - - - - - - - - - - - - - - - - - - - */
>>>>
>>>
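The chain-order comparisons in the forwarded message (for example 1hcr: old A, BC; new BC, A) could be tabulated with a short script instead of by inspection. The sketch below lists chain IDs in order of first appearance among the ATOM records of the first model. The function name chain_order and the include_hetatm flag are hypothetical, and the chain ID is assumed to sit in column 22 of the fixed-width PDB layout.

```python
from typing import Iterable, List


def chain_order(pdb_lines: Iterable[str], include_hetatm: bool = False) -> List[str]:
    """Chain IDs in order of first appearance in the first model of a PDB file."""
    wanted = ("ATOM  ", "HETATM") if include_hetatm else ("ATOM  ",)
    order: List[str] = []
    for line in pdb_lines:
        record = line[:6]
        if record == "ENDMDL":  # only the first model is needed
            break
        if record in wanted and len(line) > 21:
            chain = line[21]
            if chain not in order:
                order.append(chain)
    return order
```

Comparing chain_order(old_lines) with chain_order(new_lines) for each entry would flag the files whose chains were reordered. HETATM records are skipped by default because the message reports that their chain assignments changed separately.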
--
******************************************************************
John Westbrook, Ph.D.
Rutgers, The State University of New Jersey
Department of Chemistry and Chemical Biology
610 Taylor Road
Piscataway, NJ 08854-8087
e-mail: jw...@rcsb.rutgers.edu
Ph: (732) 445-4290 Fax: (732) 445-4320
******************************************************************