Re: [Jmol-users] Fwd: Re: Chain order changes: a problem for Proteopedia

Robert Hanson Mon, 30 Mar 2009 15:40:58 -0700

Let's clear up the atom serial number business.

PDB files contain atom serial numbers, in the ATOM and HETATM records.
Behind the scenes, in the Jmol code, we call these "atom serial numbers",
as opposed to "atom index numbers", which start at 0 and run consecutively
through ALL models loaded. You can use atom serial numbers (as "atomno") in
select commands to pick out the atoms you want, but Jmol can't use them when
it writes the state, for several reasons:


1) They repeat with different structures in a multi-model file or when
multiple files are loaded.
2) They could be anything, including negative numbers.
3) They could be in any order and could have gaps.

Really it doesn't matter what the RCSB writes as the specification. This
atom serial number field has been used in various ways; Jmol just can't use
it as a true "unique identifier" for anything.
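
To make the distinction concrete, here is a minimal Python sketch (not
Jmol's actual code). It reads a made-up two-model PDB fragment and records,
for each ATOM/HETATM record, the serial number from columns 7-11 alongside a
running index that starts at 0 and continues across models. The sample
records and the function name read_atoms are assumptions for illustration.

```python
# Made-up two-model fragment; coordinates are placeholders.
PDB_TEXT = """\
MODEL        1
ATOM      1  N   GLY A   1       0.000   0.000   0.000
ATOM      2  CA  GLY A   1       1.450   0.000   0.000
ENDMDL
MODEL        2
ATOM      1  N   GLY A   1       0.100   0.000   0.000
ATOM      2  CA  GLY A   1       1.550   0.000   0.000
ENDMDL
"""


def read_atoms(pdb_text):
    """Return a list of (atom_index, model_number, serial) tuples."""
    atoms = []
    model = 1
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            model = int(line[6:].split()[0])
        elif record in ("ATOM", "HETATM"):
            serial = int(line[6:11])  # columns 7-11 hold the serial number
            atoms.append((len(atoms), model, serial))  # index runs over all models
    return atoms


atoms = read_atoms(PDB_TEXT)
serials = [serial for _, _, serial in atoms]
indexes = [index for index, _, _ in atoms]
print(serials)  # repeats in the second model
print(indexes)  # consecutive across both models
```

The serial numbers come out as 1, 2, 1, 2 while the indexes come out as
0, 1, 2, 3, so only the index picks out a single atom among everything
loaded.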

As for the state: Once a selection has been made, that original script is
history. What is selected is a set of atoms. Jmol must use its atom indexing
method, which is 100% reliable as long as the same file is read again, and
is 100% independent of the history.

I really think it comes down to this simple tenet: A script file describes
the state for the file that was loaded. That's all that can be asked of it.
You can't ask that the script be able to "autoupdate" if the file it refers
to is modified.
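
A toy Python sketch of why that is, under the assumption that a saved
selection is nothing more than a set of atom indexes (the atom lists and
names below are invented, and this is not how Jmol stores anything
internally). The same indexes are applied to the file as it was and to a
copy in which the DNA chain has been moved ahead of the protein chain.

```python
# Each atom is (chain, atom name); its position in the list is its index.
old_file = [("A", "N"), ("A", "CA"), ("I", "P"), ("I", "O5'")]  # protein, then DNA
new_file = [("I", "P"), ("I", "O5'"), ("A", "N"), ("A", "CA")]  # DNA moved first

# Indexes recorded when the scene was saved against old_file (the DNA atoms).
saved_selection = {2, 3}


def atoms_selected(atoms, indexes):
    """Return the atoms that a set of saved indexes points at."""
    return [atoms[i] for i in sorted(indexes)]


same_file = atoms_selected(old_file, saved_selection)
changed_file = atoms_selected(new_file, saved_selection)
print(same_file)     # the DNA atoms, as intended
print(changed_file)  # protein atoms: the file changed under the script
```

Against the original file the saved indexes still give the DNA atoms;
against the reordered file the very same indexes land on the protein.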

If you ask me, it's a design flaw of Proteopedia to be accessing live PDB
files from RCSB with Jmol state scripts. Of course that should be fine if
the files are never changed, but since they are subject to modification,
that's just not acceptable. Eric, you need to find the original files, copy
them, and access them locally. If there's a way to access all earlier files
before a certain date, then perhaps you could just make a blanket change in
all the Jmol script files (a simple bulk search and replace for the
directory and filename) to change all the pdb file directories to that older
directory and, perhaps, add ".gzip" or whatever to the filename. If not,
well, you just have to go find those files and copy them yourself.
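
For the bulk search and replace, something along these lines might do. It
is only a sketch: the ".spt" extension, the example prefixes, the suffix,
and the assumption that each script names its file as a prefix followed by
"<id>.pdb" are all hypothetical and would have to be checked against the
real script files before running it on anything you care about.

```python
import re
from pathlib import Path


def repoint_text(text, old_prefix, new_prefix, suffix=""):
    """Swap old_prefix for new_prefix in front of any <id>.pdb filename,
    optionally appending a suffix to the filename."""
    pattern = re.compile(re.escape(old_prefix) + r"(\w+\.pdb)")
    return pattern.sub(lambda m: new_prefix + m.group(1) + suffix, text)


def repoint_scripts(script_dir, old_prefix, new_prefix, suffix=""):
    """Apply repoint_text to every *.spt file in script_dir, in place.
    Returns the names of the files that were changed."""
    changed = []
    for path in sorted(Path(script_dir).glob("*.spt")):
        text = path.read_text()
        updated = repoint_text(text, old_prefix, new_prefix, suffix)
        if updated != text:
            path.write_text(updated)
            changed.append(path.name)
    return changed


# Hypothetical prefixes, for illustration only.
example = repoint_text(
    'load "http://example.org/live/1d66.pdb";',
    "http://example.org/live/",
    "/local/snapshot/",
    ".gz",
)
print(example)
```

Working on a copy of the script directory first would make it easy to
compare before and after.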

I'm sorry, but I really think that's the only appropriate solution. Not fun,
I'm sure.

Bob


On Mon, Mar 30, 2009 at 5:06 PM, Eran Hodis <[email protected]> wrote:

> Dear Eric,
> Your email mentioned atom serial numbers but Bob said in a previous email 
> "Eric,
> Jmol does not use atom serial numbers in state scripts. It uses atom
> indexes, but maybe that's what you were referring to."
>
> However, Bob, if atom indexes are the same as atom serial numbers, then I
> too would argue that Jmol state scripts should use a means other than atom
> indexes to save colorings, representations, and so forth, not because it
> would make life more simple for Proteopedia, but rather because John
> Westbrook at the PDB suggested in the email Eric sent us that "Neither the
> V2.6 nor V3.2 format specifications suggests that the atom serial number be
> used as the primary atom identifier."  In other words they feel completely
> comfortable changing the atom serial number, and so if any Jmol user has a
> saved state script but hasn't locally saved the PDB file it recalls, that
> state script would not display correctly.
>
> As Eric said, such a change in the state saving function would not help
> Proteopedia with the current problem, but would help with future releases of
> remediated PDB files.
>
> Best regards,
> Eran
>
>
> On Tue, Mar 31, 2009 at 12:16 AM, Eric Martz <[email protected]> wrote:
>
>>  Dear Bob,
>>
>> How difficult would it be to change the state script generator in Jmol to
>> avoid using atom serial numbers?
>>
>> I'm sure none of us ever envisioned the current state of affairs. We have
>> who knows how many state scripts saved in Proteopedia, and now an unknown
>> number of March-17-remediated PDB files have scrambled atom serial numbers.
>> Further, there is no guarantee that such scrambling will not occur again in
>> a future remediation.
>>
>> A change in Jmol to avoid serial number dependency in saved state scripts
>> will not avoid the need for repairs to Proteopedia now, but may avoid such
>> calamities in the future.
>>
>> Regards, -Eric
>>
>> Date: Mon, 30 Mar 2009 15:09:56 -0400
>> From: John Westbrook <[email protected]>
>> Reply-To: [email protected]
>> Organization: RCSB - Protein Data Bank
>> To: [email protected]
>> CC: Helen Berman <[email protected]>,
>>     Kim Henrick <[email protected]>
>> Subject: Re: Chain order changes: a problem for Proteopedia
>>
>> Dear Eric,
>>
>> In producing the V3.2 wwPDB release, we have tried to preserve as much
>> as possible the PDB chain and residue nomenclature for polymer molecular
>> components.
>> With version 3.2 comes the introduction of more uniform assignment of
>> PDB chain identifiers for ligands and solvent, so this has resulted in
>> some nomenclature changes in V3.2 files.
>>
>> Neither the V2.6 nor V3.2 format specifications suggests that the atom
>> serial number be used as the primary atom identifier. I am sure that you
>> are aware that there are now many PDB entries that have been split across
>> multiple PDB data files specifically because of the limitation in the
>> range of atom serial numbers. Atom serial numbers are also replicated
>> between models, so they do not represent unique atom identifiers for NMR
>> or other multi-model entries.
>>
>> If you have a specific dependency on prior atom serial numbers in your
>> software system, then you can always recover the particular version of the
>> PDB entry that you used from our ftp snapshot server
>> (ftp://snapshots.rcsb.org).
>>
>> I should point out that we provided Jaim with an advance copy of the
>> V3.2 files for testing on Dec 3, 2008. The issue of atom ordering
>> was not raised at that time.
>>
>> Regards,
>>
>> John
>>
>>
>>
>> >>>
>> >>> Begin forwarded message:
>> >>>
>> >>>> From: Eric Martz <[email protected]>
>> >>>> Date: March 29, 2009 7:27:12 PM PDT
>> >>>> To: "[email protected]" <[email protected]>,
>> >>>>     "[email protected]" <[email protected]>
>> >>>> Subject: pdb-l: Chain order changes: a problem for Proteopedia
>>
>> >>>>
>> >>>> Dear wwPDB:
>> >>>>
>> >>>> The March 17, 2009 remediation of PDB data in the wwPDB (PDB format
>> >>>> 3.20) appears to me to have, in many cases, changed the order of
>> >>>> chains, and hence the atom serial numbers in the PDB files. This has
>> >>>> created a major problem in the wiki Proteopedia.Org, where many
>> >>>> molecular scenes that took hours or weeks to develop are now
>> >>>> nonfunctional.
>> >>>>
>> >>>> The problem arises because Jmol uses atom serial numbers for
>> >>>> selecting groups of atoms when it saves a molecular scene (in a
>> >>>> "state script"). Proteopedia's Scene Authoring Tool uses Jmol's state
>> >>>> scripts to capture molecular scenes and attach them to "green links".
>> >>>>
>> >>>> Questions:
>> >>>>
>> >>>> 1. Were the names of ATOM chains ever changed? I assume (and hope)
>> >>>> not, but I have not checked carefully. I see that the chain names
>> >>>> assigned to HETATMs were changed in some cases, e.g. 1e3m, where an
>> >>>> ADP single-residue "chain" originally named chain C (before the 2007
>> >>>> remediation) is now deemed to be part of chain A (and its position
>> >>>> was moved to the end of the file, after all ATOM records). Since I
>> >>>> have been unable to get pre-March-17 snapshot PDB files (the
>> >>>> snapshot.wwpdb.org server is unresponsive) I am not sure when
>> >>>> each of these changes were made.
>> >>>>
>> >>>> 2. Was the changing of chain orders in the March 17 remediation
>> >>>> intentional? If so, is the new order specified somewhere in the 3.20
>> >>>> documentation? I can see no pattern to the new chain orders (see
>> >>>> examples below).
>> >>>>
>> >>>> 3. Were chain orders ever changed in files that contain only protein
>> >>>> chains (no nucleic acids)?
>> >>>>
>> >>>> 4. Will the changes in chain order be retained permanently (requiring
>> >>>> substantial repairs to Proteopedia.Org)?
>> >>>>
>> >>>> Observations:
>> >>>>
>> >>>> We first noticed the broken molecular scenes in Proteopedia in cases
>> >>>> that involved DNA. Therefore I have so far limited my inspection of
>> >>>> PDB files to those containing both protein and DNA.
>> >>>>
>> >>>> Since the snapshot ftp server is unresponsive today, my comparisons
>> >>>> were all made between files I had saved before the 2007 remediation
>> >>>> (typically saved 2001-2004), and current files. We have reason to
>> >>>> suspect that changes in chain ordering occurred in the March 17, 2009
>> >>>> remediation, but I cannot verify this for the cases below.
>> >>>>
>> >>>> Some files have NO CHANGE in chain order:
>> >>>> 1d66: DE (DNA), AB (protein).
>> >>>> 1osl: (an NMR multiple model file) AB (protein), CD (DNA).
>> >>>> 1e3m: old AB (protein), C (single residue ADP HETATM "chain"), EF
>> >>>> (DNA); new AB, EF. (ADP now in chain A at the end, thus changing ATOM
>> >>>> serial numbers.)
>> >>>>   Thus there appears to be no requirement for nucleic acid or
>> >>>> protein chains to come first.
>> >>>>
>> >>>> Some files that had protein first were rearranged to put DNA first:
>> >>>> 1aoi: old ABCDEFGH (protein), IJ (DNA); new IJ, ABCDEDFH.
>> >>>> 1fzp: old DB (protein), WK (DNA); new WK, DB.
>> >>>> 1hcr: old A (protein), BC (DNA); new BC, A.
>> >>>>   Thus there appears to be no requirement that chains be in
>> >>>> alphabetic order.
>> >>>>
>> >>>> One file had an RNA chain moved to BETWEEN two DNA chains, leaving
>> >>>> protein before DNA:
>> >>>> 1qln: old A (protein), TN (DNA), R (RNA); new A (protein), N
>> >>>> (DNA), R (RNA), T (DNA).
>> >>>>    The new order happens to be alphabetical by chain name, but
>> >>>> this is not true in other files (see above).
>> >>>>
>> >>>> I did not happen to come across a case where DNA chains preceded
>> >>>> protein in the old format, with protein being moved before DNA in the
>> >>>> new format.
>> >>>>
>> >>>> There also appears to be no requirement that chains be in the order
>> >>>> given in the COMPND records. Examples where the order differs in the
>> >>>> new files: 1flo, 1qln.
>> >>>>
>> >>>> Sincerely, -Eric
>> >>>>
>> >>>> /* - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> >>>> Eric Martz, Professor Emeritus, Dept Microbiology
>> >>>> U Mass, Amherst -- http://Martz.MolviZ.Org<http://martz.molviz.org/>
>> >>>>
>> >>>> Top Five 3D MolVis Technologies 
>> >>>> http://Top5.MolviZ.Org<http://top5.molviz.org/>
>> >>>> 3D Wiki with Scene-Authoring Tools 
>> >>>> http://Proteopedia.Org<http://proteopedia.org/>
>> >>>> Biochem 3D Education Resources http://MolviZ.org<http://molviz.org/>
>> >>>> See 3D Molecules, Install Nothing! - http://firstglance.jmol.org
>> >>>> ConSurf - Find Conserved Patches in Proteins:
>> >>>> http://consurf.tau.ac.il
>> >>>> Atlas of Macromolecules: http://atlas.molviz.org
>> >>>> Workshops: http://workshops.molviz.org
>> >>>> World Index of Molecular Visualization Resources:
>> >>>> http://molvisindex.org
>> >>>> PDB Lite Macromolecule Finder: http://pdblite.org
>> >>>> Molecular Visualization EMail List (molvis-list):
>> >>>> http://list.molviz.org
>> >>>> Protein Explorer - 3D Visualization: http://proteinexplorer.org
>> >>>> - - - - - - - - - - - - - - - - - - - - - - - - - - - */
>> >>>>
>> >>>>
>> >>>> TO UNSUBSCRIBE OR CHANGE YOUR SUBSCRIPTION OPTIONS, please see
>> >>>> https://lists.sdsc.edu/mailman/listinfo.cgi/pdb-l .
>> >>>
>>
>>
>> --
>> ******************************************************************
>>   John Westbrook, Ph.D.
>>   Rutgers, The State University of New Jersey
>>   Department of Chemistry and Chemical Biology
>>   610 Taylor Road
>>   Piscataway, NJ 08854-8087
>>   e-mail: [email protected]
>>   Ph:  (732) 445-4290  Fax: (732) 445-4320
>> ******************************************************************
>>
>>
>>
>> ------------------------------------------------------------------------------
>>
>> _______________________________________________
>> Jmol-users mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/jmol-users
>>
>>
>
>
>


-- 
Robert M. Hanson
Professor of Chemistry
St. Olaf College
1520 St. Olaf Ave.
Northfield, MN 55057
http://www.stolaf.edu/people/hansonr
phone: 507-786-3107


If nature does not answer first what we want,
it is better to take what answer we get.

-- Josiah Willard Gibbs, Lecture XXX, Monday, February 5, 1900
