Hi Mischa,

I think you are right about ligand structures: it would be very difficult, if not impossible, to distinguish between real measured data and faked data. You just need to run a docking program, dock the ligand, calculate new structure factors, add some noise, and combine that with your real data of the unliganded structure. I'm not an expert, but how would one be able to tell whether a molecule on the order of 300-600 Da within an average protein of perhaps 40 kDa comes from true data or from faked data plus noise?

In Germany we have to keep data (data meaning everything, from clones, scans of gels, and sizing profiles to X-ray diffraction images etc.) for 10 years. I'm not sure how this is handled in the US.

Juergen

Mischa Machius wrote:

I agree. However, I am personally not so much worried about entire protein structures being wrong or fabricated. I am much more worried about co-crystal structures. Capturing a binding partner, a reaction intermediate or a substrate in an active site is often as spectacular an achievement as determining a novel membrane protein structure. The threshold for over-interpreting densities for ligands is rather low, and wishful thinking can turn into model bias much more easily than for a protein structure alone, not to mention making honest mistakes. Just for plain and basic scientific purposes, it would be helpful every now and then to have access to the original images. As to the matter of fabricating ligand densities, I surmise that it is much easier than fabricating entire protein structures. The potential rewards (in terms of high-profile publications and obtaining grants) are just as high. There is enough incentive to apply lax scientific standards. If a simple means exists, beyond what is available today, that can help tremendously in identifying honest mistakes, and perhaps a rare fabrication, I think it should seriously be considered. Best - MM



On Sat, 18 Aug 2007, George M. Sheldrick wrote:
There are good reasons for preserving frames, but most of all for the crystals that appeared to diffract but did not lead to a successful structure solution, publication, and PDB deposition. Maybe in the future there will be improved data processing software (for example to integrate non-merohedral twins) that will enable good structures to be obtained from such data. At the moment most such data is thrown away. However, forcing everyone to deposit their frames each time they deposit a structure with the PDB would be a thorough nuisance and major logistic hassle. It is also a complete illusion to believe that the reviewers for Nature etc. would process or even look at frames, even if they could download them with the manuscript. For small molecules, many journals require an 'ORTEP plot' to be submitted with the paper. As older readers who have experienced Dick Harlow's 'ORTEP of the year' competition at ACA Meetings will remember, even a viewer with little experience of small-molecule crystallography can see from the ORTEP plot within seconds if something is seriously wrong, and many non-crystallographic referees for e.g. the journal Inorganic Chemistry can even make a good guess as to what is wrong (e.g. wrong element assigned to an atom). It would be nice if we could find something similar for macromolecules that the author would have to submit with the paper. One immediate bonus is that the authors would look at it carefully themselves before submitting, which could lead to an improvement of the quality of structures being submitted. My suggestion is that the wwPDB might provide, say, a one-page diagnostic summary when they allocate each PDB ID that could be used for this purpose. A good first pass at this would be the output that the MolProbity server http://molprobity.biochem.duke.edu/ sends when it is given a PDB file.
It starts with a few lines of summary in which bad things are marked red and the structure is assigned to a percentile: a percentile of 6% means that 93% of the structures in the PDB with a similar resolution are 'better' and 5% are 'worse'. This summary can be understood with very little crystallographic background and a similar summary can of course be produced for NMR structures. The summary is followed by diagnostics for each residue; normally, if the summary looks good, it would not be necessary for the editor or referee to look at the rest. Although this server was intended to help us to improve our structures rather than detect manipulated or fabricated data, I asked it for a report on 2HR0 to see what it would do (probably many other people were trying to do exactly the same; the server was slower than usual). Although the structure got poor marks on most tests, MolProbity generously assigned it overall to the 6th percentile; I suppose that this is about par for structures submitted to Nature (!). However, there was one feature that was unlike anything I have ever seen before, although I have fed the MolProbity server with some pretty ropey PDB files in the past: EVERY residue, including EVERY WATER molecule, either made at least one bad contact, or was a Ramachandran outlier, or was a rotamer outlier (or more than one of these). This surely would ring all the alarm bells! So I would suggest that the wwPDB could coordinate, with the help of the validation experts, software to produce a short summary report that would be automatically provided in the same email that allocates the PDB ID. This email could make the strong recommendation that the report file be submitted with the publication, and maybe in the fullness of time even the Editors of high profile journals would require this report for the referees (or even read it themselves!).
To gain acceptance for such a procedure the report would have to be short and comprehensible to non-crystallographers; the MolProbity summary is an excellent first pass in this respect, but (partially with a view to detecting manipulation of the data) a couple of tests could be added based on the data statistics as reported in the PDB file or, even better, on the reflection data (if submitted). Most of the necessary software already exists, much of it produced by regular readers of this bb; it just needs to be adapted so that the results can be digested by referees and editors with little or no crystallographic experience. And most important, a PDB ID should always be released only in combination with such a summary.

George

Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-2582


--------------------------------------------------------------------------------
Mischa Machius, PhD
Associate Professor
UT Southwestern Medical Center at Dallas
5323 Harry Hines Blvd.; ND10.214A
Dallas, TX 75390-8816; U.S.A.
Tel: +1 214 645 6381
Fax: +1 214 645 6353



--
Jürgen Bosch
University of Washington
Dept. of Biochemistry, K-426
1705 NE Pacific Street
Seattle, WA 98195
Box 357742
Phone:   +1-206-616-4510
FAX:     +1-206-685-7002
