Dear all,

With regard to the possible "fabrication" of the 2hr0 structure, why would the authors have deposited the structure factors if this is not required by the journal? Also, why would they have "fabricated" a structure with gaps along c if they could have done so without the gap?

A few years ago, I had to cope with two structures with gaps along c, PDB codes 1h6w and 1ocy. For those of you who are interested, structure factors are available from the PDB; I will look for the unmerged intensities/raw images and provide them if requested...

Without further evidence, I suspect their structure is real, though perhaps not optimally refined and treated. Then again, this seems commonplace in "Nature" structures, perhaps due to lack of time/experience and, in some cases, putting too much pressure on the PhD students/postdocs involved instead of mentoring and checking them. I hope the authors provide the raw diffraction images to dispel any doubts, and I would be curious to learn about the other structures of the same group. Does anyone have a comprehensive, annotated list of them?

Greetings,

Mark J. van Raaij
Unidad de Bioquímica Estructural
Dpto de Bioquímica, Facultad de Farmacia
and
Unidad de Rayos X, Edificio CACTUS
Universidad de Santiago
15782 Santiago de Compostela
Spain
http://web.usc.es/~vanraaij/


On 16 Aug 2007, at 15:22, Randy J. Read wrote:

On Aug 16 2007, Eleanor Dodson wrote:

The weighting in REFMAC is a function of SigmA (plotted in the log file). For this example it will be nearly 1 for all resolution ranges, so the weights are pretty constant. There is also a contribution from the "experimental" sigma, which in this case seems to be proportional to |F|.

Originally I expected that the publication of our Brief Communication in Nature would stimulate a lot of discussion on the bulletin board, but clearly it hasn't. One reason is probably that we couldn't be as forthright as we wished to be. For its own good reasons, Nature did not allow us to use the word "fabricated". Nor were we allowed to discuss other structures from the same group, if they weren't published in Nature.

Another reason is an understandable reluctance to make allegations in public, and the CCP4 bulletin board probably isn't the best place to do that.

But I think the case raises essential topics for the community to discuss, and this is a good forum for those discussions. We need to consider how to ensure the integrity of the structural databases and the associated publications.

So here are some questions to start a discussion, with some suggestions of partial answers.

1. How many structures in the PDB are fabricated?

I don't know, but I think (or at least hope) that the number is very small.

2. How easy is it to fabricate a structure?

It's very easy, if no-one will be examining it with a suspicious mind, but it's extremely difficult to do well. No matter how well a structure is fabricated, it will violate something that is known now or learned later about the properties of real macromolecules and their diffraction data. If you're clever enough to do this really well, then you should be clever enough to determine the real structure of an interesting protein.

3. How can we tell whether structures in the PDB are fabricated, or just poorly refined?

The current standard validation tools are aimed at detecting errors in structure determination or the effects of poor refinement practice. None of them are aimed at detecting specific signs of fabrication because we assume (almost always correctly) that others are acting in good faith.

The more information that is available, the easier it will be to detect fabrication (because it is harder to make up more information convincingly). For instance, if the diffraction data are deposited, we can check for consistency with the known properties of real macromolecular crystals, e.g. that they contain disordered solvent and not vacuum. As Tassos Perrakis has discovered, there are characteristic ways in which the standard deviations depend on the intensities and the resolution. If unmerged data are deposited, there will probably be evidence of radiation damage, weak effects from intrinsic anomalous scatterers, etc. Raw images are probably even harder to simulate convincingly.
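
As a rough illustration of the kind of consistency check described above, the following Python sketch measures how much the ratio sigma(F)/|F| varies across reflections. The function name and the toy numbers are hypothetical and are not taken from any program or data set discussed here. In measured data, weak reflections usually carry larger relative errors, so a ratio that is nearly constant would be unusual, though on its own it proves nothing.

```python
import statistics


def sigma_ratio_spread(amplitudes, sigmas):
    """Coefficient of variation of sigma/|F| over reflections with |F| > 0.

    Hypothetical helper for illustration only. A value near 0 means the
    sigmas are almost exactly proportional to the amplitudes.
    """
    # Skip reflections with zero or negative amplitude to avoid division by zero.
    ratios = [s / f for f, s in zip(amplitudes, sigmas) if f > 0]
    if len(ratios) < 2:
        raise ValueError("need at least two reflections with |F| > 0")
    mean = statistics.fmean(ratios)
    if mean == 0:
        raise ValueError("all sigmas are zero; the ratio spread is undefined")
    # Population standard deviation of the ratios, scaled by their mean.
    return statistics.pstdev(ratios) / mean


# Toy numbers (assumed, not real data): sigmas exactly proportional to |F|.
proportional = sigma_ratio_spread([10.0, 20.0, 40.0], [1.0, 2.0, 4.0])

# Toy numbers (assumed): the same sigma for every reflection, so the
# relative error is larger for the weaker reflections.
constant_sigma = sigma_ratio_spread([10.0, 20.0, 40.0], [2.0, 2.0, 2.0])
```

A small spread would only flag a data set for closer inspection; any sensible threshold would have to be calibrated against deposited data sets that are known to be genuine.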

If a structure is fabricated by making up a new crystal form, perhaps a complex of previously-known components, then the crystal packing interactions should look like the interactions seen in real crystals. If it's fabricated by homology modelling, then the internal packing is likely to be suboptimal. I'm told by David Baker (who knows a thing or two about this) that it is extremely difficult to make a homology model that both obeys what we know about torsion angle preferences and is packed as well as a real protein structure.

I'm very interested in hearing about new ideas along these lines. The wwPDB has agreed to sponsor a workshop next year where we will propose and test new validation criteria.

4. If new validation criteria are applied at the PDB, won't someone who wants to fabricate a structure just keep improving their fabricated model until it passes all the tests?

That's a possibility, but I think the deterrence effect of knowing that there are measures to detect fabrication will outweigh this. And it isn't enough for a fabricated structure to pass today's tests; it has to pass all the new tests devised for the rest of the person's life, or at least their career.

5. What should we do if tests suggest that a structure may be fabricated?

I think we need to be extremely careful. Conclusions should not be drawn on the basis of a few numbers. The tests can just point up which structures should be examined closely. Close examination would then involve less automated criteria, such as whether the structure agrees with all the biochemical data about the system. As in the process followed by Nature, you also have to start by giving the people who deposited the structure an opportunity to explain the anomalies.

Randy Read
