Dear Jose,

the question came up again because I did not receive an answer to it. The thread discussed the benefits and drawbacks of PDB vs. mmCIF, which was not my question. This time, Nat Echols gave a very reasonable answer (at least for phenix) on the phenixbb, i.e., that there are no plans to abandon the PDB format (as a working format), but that very likely a smooth transition will take place. I guess this will happen more slowly than the PDB's requirement to upload PDBx/mmCIF files for archiving. I agree that mmCIF is a reasonable format for archiving, but I guess less than 1% of all structures in the PDB hit the limits of the PDB format.

I greatly appreciate Nat's answer and I would appreciate an answer from those responsible for the other refinement programs.

Best,
Tim

On 10/05/2014 08:05 PM, Jose Manuel Duarte wrote:
> Thanks Frances for the explanation. Indeed mmCIF format is a lot more
> complicated and grep can be a dangerous tool to use with it. But for
> most cases it can do the job and thus it maintains some sort of
> backwards compatibility. I can't agree more that using specialised tools
> (for either PDB files or mmCIF files) that deal with the formats
> properly is the best solution (see for instance
> http://mmcif.wwpdb.org/docs/software-resources.html for some of the
> mmCIF readers).
>
> In any case I find it most surprising that this topic came up yet again
> on this BB, when it was thoroughly discussed last year in this thread:
>
> https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1308&L=ccp4bb&D=0&P=26939
>
> I'm not sure why these urban legends about the evilness of the mmCIF
> format keep coming back to the list...
>
> As explained there and elsewhere endless times, the PDB format is
> inadequate to represent the complexity of macromolecules and has been
> needing a replacement for a long time. The decision to move on to mmCIF
> has been made and in my opinion the sooner we move forward the better.
>
> Cheers
>
> Jose
>
>
>
> On 05.10.2014 15:52, Frances C. Bernstein wrote:
>> mmCIF is a very general format with tag-value pairs, and loops
>> so that tags do not need to be repeated endlessly. It was
>> designed so that there is the flexibility of defining new terms
>> easily and presenting the data in any order and with any kind
>> of spacing.
>>
>> I understand that there are 100000+ files in cyberspace prepared
>> by the PDB and that they all have the 'same' format.
>>
>> It is tempting to write software that treats these files as fixed
>> format and hope that all software packages that generate coordinate
>> files will use the same fixed format.
>> But that loses the generality
>> and flexibility of mmCIF, and software written that way will fail
>> when some field requires more characters or a new field is added.
>> There are software tools to allow one to read and extract data from
>> any mmCIF file; using these is more complicated than using grep but
>> using these assures that one's software will not fail when it encounters
>> a data file that is not exactly what the PDB is currently producing.
>>
>> Note that mmCIF was defined when the limitations of the fixed-format
>> PDB format became apparent with large structures. Let's not repeat
>> the mistakes of the past.
>>
>> Frances
>>
>> =====================================================
>> **** Bernstein + Sons
>> * * Information Systems Consultants
>> **** 5 Brewster Lane, Bellport, NY 11713-2803
>> * * ***
>> **** * Frances C. Bernstein
>> * *** f...@bernstein-plus-sons.com
>> *** *
>> * *** 1-631-286-1339 FAX: 1-631-286-1999
>> =====================================================
>>
>> On Sun, 5 Oct 2014, Tim Gruene wrote:
>>
>>> Hi Jose,
>>>
>>> I see. In the example on page
>>> http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Categories/atom_site.html,
>>>
>>> it is in field 12, though, and I would have thought that mmCIF allows
>>> line breaks.
>>>
>>> But as long as all developers writing PDBx/mmCIF with their programs
>>> follow the PDB constraints (``styling plans'' in their FAQ), everything
>>> is fine.
>>>
>>> Cheers,
>>> Tim
>>>
>>> On 10/05/2014 01:13 PM, Jose Manuel Duarte wrote:
>>>> Well, if you simply replace that "beauty" by this one:
>>>>
>>>> grep "^ATOM" filename.cif | awk '{print $15}' | awk '{s+=$1;} END {print s/NR;}'
>>>>
>>>> You will achieve exactly the same result (the b-factors are in the 15th
>>>> field of the _atom_site section in deposited mmCIF files).
>>>> I'm not an
>>>> expert in awk, but I'm sure that can be made even shorter ;)
>>>>
>>>> It is important to keep in mind that mmCIF files are designed to be
>>>> usable with grep-like tools, so I don't see any problems in moving
>>>> forward to that format. Whilst I see a lot of problems in staying with
>>>> the classic PDB format.
>>>>
>>>> Cheers
>>>>
>>>> Jose
>>>>
>>>>
>>>>
>>>> On 05.10.2014 11:18, Tim Gruene wrote:
>>>>> Hi all,
>>>>>
>>>>> reading this beauty I would like to ask a question to the respective
>>>>> developers:
>>>>> Will the PDB format remain the working format for the users and only
>>>>> upon deposition will it be converted to PDBml for archiving
>>>>> purposes, or
>>>>> are the refinement programs (et al.) going to abandon PDB, too?
>>>>>
>>>>> Best,
>>>>> Tim
>>>>>
>>>>> On 10/04/2014 10:32 PM, Ed Pozharski wrote:
>>>>>> grep "^ATOM " filename.pdb | cut -c 61-66 | awk '{s+=$1;} END {print s/NR;}'
>>>>>>
>>>>>> "Nobody likes a show off, Private"
>>>>>> Skipper
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>>
>>>>>> Sent on a Sprint Samsung Galaxy S® III
>>>>>>
>>>>>> -------- Original message --------
>>>>>> From: Chen Zhao <c.z...@yale.edu>
>>>>>> Date: 10/04/2014 4:03 PM (GMT-05:00)
>>>>>> To: PHENIX user mailing list <pheni...@phenix-online.org>
>>>>>> Subject: [phenixbb] Calculate average B-factor?
>>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> I am just wondering whether there is a command line tool in phenix
>>>>>> that calculates the average B-factor of a PDB file? Can it deal with
>>>>>> the ANISOU records (from TLS refinement or not) properly? I looked
>>>>>> into previous posts but the --show-adp-statistics option in
>>>>>> phenix.pdbtools seems to be no longer available in the version
>>>>>> (1.9-1678) I installed.
>>>>>>
>>>>>> Thank you so much,
>>>>>> Chen
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> phenixbb mailing list
>>>>>> pheni...@phenix-online.org
>>>>>> http://phenix-online.org/mailman/listinfo/phenixbb
>>>>>>
>>>>
>>>
>>> --
>>> Dr Tim Gruene
>>> Institut fuer anorganische Chemie
>>> Tammannstr. 4
>>> D-37077 Goettingen
>>>
>>> GPG Key ID = A46BEE1A
>>>
>>>
>
--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A
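
The thread leaves one practical question open: Jose's one-liner assumes the B-factor is field 15, Tim finds it in field 12 in the dictionary example, and Frances warns against treating mmCIF as fixed format. One way to reconcile these is to read the column position from the loop_ header instead of hard-coding it. The following is only a minimal sketch, not a replacement for a proper mmCIF reader. The function name mean_b_mmcif is hypothetical, and the sketch assumes that each atom_site row sits on a single line and that no value contains embedded whitespace (neither assumption is guaranteed by the format). Like the original one-liner, it counts ATOM rows only.

```shell
# mean_b_mmcif FILE  (hypothetical helper, not part of any package)
# Print the mean of _atom_site.B_iso_or_equiv over rows whose group_PDB is ATOM.
# Assumptions (not from the thread): every atom_site row sits on one line and
# no value in it contains embedded whitespace.
mean_b_mmcif() {
  awk '
    # A new loop_ or a "#" separator ends any previous table: reset the column map.
    /^loop_/ || /^#/ { ncol = 0; gcol = 0; bcol = 0; next }
    # Header lines of the atom_site loop: remember where the wanted tags sit.
    /^_atom_site\./ {
      ncol++
      if ($1 == "_atom_site.group_PDB")      gcol = ncol
      if ($1 == "_atom_site.B_iso_or_equiv") bcol = ncol
      next
    }
    # Data rows: use the positions found in the header, not a fixed field number.
    gcol && bcol && $gcol == "ATOM" { sum += $bcol; n++ }
    END { if (n) print sum / n }
  ' "$1"
}
```

Called as mean_b_mmcif filename.cif, this should agree with the field-15 one-liner on files laid out like the deposited ones, and it should keep working if a writer orders the atom_site columns differently. It would still break on rows split across lines or on quoted values containing spaces, which is where the specialised readers mentioned in the thread come in.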