I see that Coot exports to PDB with an invalid HETNAM field, putting a
5-letter code in place of the 3-letter code. This is unnecessary and
breaks the PDB format rules. I will give an example, along with my
suggestion for the correct way to handle the conversion.

First, the 3-letter residue name has never been a real limitation,
only a limitation in the simplistic way software has been written. The
HETNAM record was designed specifically to avoid global name
collisions, by declaring a local short name linked to the global long
name. Now that we are forced to abandon the simple 3-letter globals,
simply declaring PDB format obsolete is unwarranted. Its simplicity
for most uses will keep it alive for a very long time. Likewise, I
think the complexity of CIF is aimed at databases, and not
experimental structure determination.

Consider structure 9NQP.cif, which has the following data elements to
define the ligand:

pdbx_entity_nonpoly.comp_id = A1BXB
entity.pdbx_description = Pritelivir
chem_comp.pdbx_synonym =
'N-methyl-N-(4-methyl-5-sulfamoyl-1,3-thiazol-2-yl)-2-[4-(pyridin-2-yl)phenyl]acetamide'
pdbx_nonpoly_scheme.auth_mon_id = PTL

From this we can easily assess that a genuinely useful working 3-letter
code for the file-local scope is the one the author used (auth_mon_id,
a useful new data field from CIF). The primary name, a synonym, and the
comp_id global database key can then all be attached to that local code.

I propose that the PDB records be written as follows, which is fully
compatible with the existing PDB format:

HETNAM     PTL Pritelivir
HETSYN     PTL N-methyl-N-(4-methyl-5-sulfamoyl-1,3-thiazol-2-yl)-2-[4-(pyridin-
HETSYN   2     2-yl)phenyl]acetamide
HETSYN     PTL A1BXB

or optionally tag the comp_id as:
HETSYN     PTL comp_id:A1BXB
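
As a rough illustration of the export side, here is a minimal Python
sketch that writes these records from the four CIF items listed above.
The function names het_records and ligand_header are hypothetical, not
taken from Coot or any library. The column positions (continuation
number in columns 9-10, local code in 12-14, text in 16-70) are my
reading of the wwPDB format description and should be checked against
it. Under that reading the local code is repeated on continuation lines
and text wraps at column 70, so the wrapped synonym comes out slightly
different from the hand-wrapped example above.

```python
def het_records(tag, local_id, text, width=55):
    """Format one HETNAM/HETSYN entry as fixed-column PDB lines.

    Assumed layout: cols 1-6 record name, 9-10 continuation number,
    12-14 file-local residue name, 16-70 text (55 characters).
    """
    chunks = [text[i:i + width] for i in range(0, len(text), width)]
    lines = []
    for n, chunk in enumerate(chunks, start=1):
        cont = "" if n == 1 else str(n)  # first line has a blank continuation field
        lines.append(f"{tag:<6}  {cont:>2} {local_id:>3} {chunk}")
    return lines


def ligand_header(auth_mon_id, description, synonym, comp_id):
    """Build HETNAM/HETSYN lines keyed on the author's local code."""
    lines = het_records("HETNAM", auth_mon_id, description)
    lines += het_records("HETSYN", auth_mon_id, synonym)
    # Tag the global database key so a reader can find it without a lookup.
    lines += het_records("HETSYN", auth_mon_id, "comp_id:" + comp_id)
    return lines


if __name__ == "__main__":
    synonym = ("N-methyl-N-(4-methyl-5-sulfamoyl-1,3-thiazol-2-yl)"
               "-2-[4-(pyridin-2-yl)phenyl]acetamide")
    for line in ligand_header("PTL", "Pritelivir", synonym, "A1BXB"):
        print(line)
```

The plain fixed-width split is only for brevity; a real exporter would
probably prefer to break long names at a hyphen or bracket.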

The original specs for HETNAM and HETSYN don't include any sort of
label, so the records can simply be processed as "in the database of
ligands, use the first lookup string that matches". But since structural
biologists aren't running databases, we can avoid the slowdown of a full
database lookup by including the quick comp_id: label.
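
The reading side can be sketched the same way. The Python below is a
minimal sketch, not an existing API: parse_het and resolve_comp_id are
hypothetical names, known_ids is a stand-in for whatever ligand database
a program has available, and the column slices assume the standard
fixed-column layout. It prefers a comp_id: label when one is present and
otherwise falls back to "use the first lookup string that matches".
Continuation lines are joined by plain concatenation, and a continuation
line with a blank local code is attached to the record before it.

```python
def parse_het(lines):
    """Collect HETNAM/HETSYN text per file-local residue name."""
    table = {}
    last = None  # list that a continuation line should extend
    for line in lines:
        tag = line[:6].strip()
        if tag not in ("HETNAM", "HETSYN"):
            continue
        cont = line[8:10].strip()    # continuation number, blank on first line
        het = line[11:14].strip()    # file-local residue name
        text = line[15:].strip()
        if cont and last is not None:
            last[-1] += text         # join onto the previous record
            continue
        entry = table.setdefault(het, {"HETNAM": [], "HETSYN": []})
        last = entry[tag]
        last.append(text)
    return table


def resolve_comp_id(entry, known_ids=frozenset()):
    """Return the global comp_id for one local code, or None if unknown."""
    names = entry["HETNAM"] + entry["HETSYN"]
    for name in names:
        if name.startswith("comp_id:"):  # quick path: explicit label
            return name[len("comp_id:"):]
    for name in names:
        if name in known_ids:            # fallback: first string that matches
            return name
    return None


if __name__ == "__main__":
    records = [
        "HETNAM     PTL Pritelivir",
        "HETSYN     PTL comp_id:A1BXB",
    ]
    print(resolve_comp_id(parse_het(records)["PTL"]))
```

A program that finds neither a label nor a match gets None back and can
fall through to its existing handling of unknown ligands.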

This is a simple solution to what really is not a format problem, but
an implementation problem. The above method will work with all
existing software, with the caveat that software treating PTL as a
global comp_id will make an invalid assumption. In most cases that
would be handled the same way unknown ligands are already handled, and
it takes only a very quick software update to actually use the HETNAM
and HETSYN data.

Juno Krahn
