I think one aspect has not been fully appreciated, although it was already mentioned in this thread: a typical crystallographic model is based on both the diffraction data and prior information about what we expect a structure to look like (bond lengths, angles, etc.). If we based our models exclusively on diffraction data, we would not be able to come up with an accurate and precise model unless we had sub-Å resolution. Instead, we incorporate prior knowledge at all stages of the process. I would guess that almost all the models in the PDB are based to a large extent on information that does not come from the diffraction data. In fact, many atoms for which there is actually _good_ density could not be placed properly if we relied only on that density. Likewise, one can also rely to some extent on external information for entities for which there is _poor_ density (e.g., disordered side chain atoms). Thus, as already mentioned, coming up with an a priori probability distribution of side chain conformations, evaluating it in the context of a given position in a structure, and including the results in the final model sounds like a perfectly reasonable idea.

The notion that everything about a model needs to be represented by the diffraction data, as has been demanded repeatedly in this thread, is a very poor one. I am not even sure that such a notion has ever been seriously considered, since it has been clear all along that it would be impossible, except perhaps for the highest-resolution small-molecule structures. We have never refrained from including aspects in our models for which we don't have direct experimental observations, but for which we can come up with perfectly acceptable and useful predictions. For disordered side chains, if we know where C-beta is, we can describe fairly accurately where C-gamma can be found, etc. Omitting disordered side chain atoms is, IMO, a bigger distortion of reality than defining the space that is accessible to these atoms, even though we don't know exactly where they are. For that, at present, the best (and only) way seems to be to use x, y, z, and B.
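
To make the C-beta/C-gamma point concrete, here is a minimal sketch (not anyone's refinement code) of how prior geometry alone narrows down where C-gamma can be once N, C-alpha and C-beta are known. Everything numeric in it is an assumption for illustration: the coordinates are made up, the CB-CG bond length (about 1.52 Å) and CA-CB-CG angle (about 114 degrees) are approximate textbook values, and the three staggered chi1 values stand in for a real rotamer library. The helper names are hypothetical.

```python
import math

# Assumed, approximate ideal geometry (illustrative, not from a specific library).
CB_CG_BOND = 1.52                      # Angstrom
CA_CB_CG_ANGLE = 114.0                 # degrees
CHI1_ROTAMERS = (-60.0, 180.0, 60.0)   # staggered chi1 (N-CA-CB-CG), degrees


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(u):
    length = math.sqrt(dot(u, u))
    return tuple(a / length for a in u)


def place_atom(a, b, c, bond, angle_deg, torsion_deg):
    """Return d such that |c-d| = bond, angle b-c-d = angle_deg,
    and torsion a-b-c-d = torsion_deg (internal-to-Cartesian conversion)."""
    theta = math.radians(angle_deg)
    phi = math.radians(torsion_deg)
    bc = unit(sub(c, b))                 # axis of the torsion
    n = unit(cross(sub(b, a), bc))       # normal to the a-b-c plane
    m = cross(n, bc)                     # completes the local frame
    x = -bond * math.cos(theta)
    y = bond * math.sin(theta) * math.cos(phi)
    z = bond * math.sin(theta) * math.sin(phi)
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))


def dihedral(a, b, c, d):
    """Torsion angle a-b-c-d in degrees, in the range [-180, 180]."""
    b1 = unit(sub(c, b))
    b0 = sub(a, b)
    b2 = sub(d, c)
    v = sub(b0, tuple(dot(b0, b1) * t for t in b1))  # b0 projected off the axis
    w = sub(b2, tuple(dot(b2, b1) * t for t in b1))  # b2 projected off the axis
    return math.degrees(math.atan2(dot(cross(b1, v), w), dot(v, w)))


# Made-up backbone coordinates (Angstrom), for illustration only.
N = (1.458, 0.0, 0.0)
CA = (0.0, 0.0, 0.0)
CB = (-0.53, 1.43, 0.0)

# Each assumed rotamer gives one candidate C-gamma position.
for chi1 in CHI1_ROTAMERS:
    cg = place_atom(N, CA, CB, CB_CG_BOND, CA_CB_CG_ANGLE, chi1)
    print("chi1 = %6.1f  CG = (%.3f, %.3f, %.3f)" % ((chi1,) + cg))
```

The three candidate positions all lie on a circle around the CA-CB axis, which is the sense in which the space accessible to C-gamma is already fairly well defined without any density. Weighting the candidates by an a priori rotamer probability in the context of the local environment would be the next step, and is not attempted here.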

Indeed, there need to be better ways to include prior information in the model building/refinement process and to describe our models. Setting occupancies to zero, leaving out atoms, or letting the B factors reflect our intentions are all unsatisfactory solutions. The more that is known about structures, the "worse" this problem will get, because more and more external information will be incorporated into models.

Best - MM

--------------------------------------------------------------------------------
Mischa Machius, PhD
Associate Professor
UT Southwestern Medical Center at Dallas
5323 Harry Hines Blvd.; ND10.214A
Dallas, TX 75390-8816; U.S.A.
Tel: +1 214 645 6381
Fax: +1 214 645 6353

