I think one aspect has not been fully appreciated, although it was
already mentioned in this thread: a typical crystallographic model is
based on both the diffraction data and information about what we
expect a structure to look like (bond lengths, angles, etc.). If we
based our models exclusively on diffraction data, we wouldn't be
able to come up with an accurate and precise model unless we had
sub-Å resolution. Instead, we incorporate prior knowledge at all
stages of the process. I would guess that almost all the models in
the PDB are based to a large extent on information that does not come
from the diffraction data. In fact, many atoms for which there is
actually _good_ density could not be placed properly if we only
relied on that density. Likewise, one can also rely to some extent on
external information for some of the entities for which there is
_poor_ density (e.g. disordered side chain atoms). Thus, as already
mentioned, coming up with an a priori probability distribution of
side chain conformations, evaluating it in the context of a given
position in a structure, and including the results in the final model
sounds like a perfectly reasonable idea.
The notion that everything about a model needs to be represented by
the diffraction data, as has been demanded repeatedly in this thread,
is a very poor one. I am not even sure that such a notion has ever been
seriously considered, since it has always been clear that it would be
impossible, except perhaps for the highest-resolution small-molecule
structures. We have never refrained from including certain aspects in
our models for which we don't have direct experimental observations,
but for which we can come up with perfectly acceptable and useful
predictions. For disordered side chains, if we know where C-beta is,
we can describe fairly accurately where C-gamma can be found, etc.
Omitting disordered side chain atoms is, IMO, a bigger distortion of
reality than defining the space that is accessible to these atoms,
even though we don't know exactly where they are. For that, at present,
the best (and only) way seems to be to use x, y, z, and B.
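To make the C-beta/C-gamma point concrete, here is a minimal sketch of
how one might place C-gamma from N, C-alpha and C-beta using nothing but
prior geometry. Everything in it is an assumption for illustration: the
coordinates are made up rather than taken from a real structure, the
bond length (1.52 Å), bond angle (114°) and the three staggered chi1
values are approximate idealized numbers, and the function names are
hypothetical, not from any refinement package.

```python
import math

def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    norm = math.sqrt(dot(u, u))
    return tuple(x / norm for x in u)

def place_atom(a, b, c, bond, angle_deg, torsion_deg):
    """Place atom d so that |c-d| = bond, angle b-c-d = angle_deg,
    and torsion a-b-c-d = torsion_deg (internal -> Cartesian)."""
    theta = math.radians(angle_deg)
    chi = math.radians(torsion_deg)
    bc = unit(sub(c, b))               # axis along the b->c bond
    n = unit(cross(sub(b, a), bc))     # normal to the a-b-c plane
    m = cross(n, bc)                   # completes the local frame
    local = (-bond * math.cos(theta),
             bond * math.sin(theta) * math.cos(chi),
             bond * math.sin(theta) * math.sin(chi))
    return tuple(c[i] + local[0] * bc[i] + local[1] * m[i] + local[2] * n[i]
                 for i in range(3))

def dihedral(a, b, c, d):
    """Torsion angle a-b-c-d in degrees, used here only as a check."""
    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = dot(cross(n1, n2), unit(b2))
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates in Å (assumption, not from a real structure).
n_pos = (-0.53, 1.36, 0.00)
ca = (0.00, 0.00, 0.00)
cb = (1.53, 0.00, 0.00)

# Approximate idealized geometry (assumed values for illustration).
CB_CG_BOND = 1.52       # Å
CA_CB_CG_ANGLE = 114.0  # degrees
CHI1_ROTAMERS = (-60.0, 180.0, 60.0)

# One candidate C-gamma position per staggered chi1 rotamer.
candidates = {chi: place_atom(n_pos, ca, cb, CB_CG_BOND, CA_CB_CG_ANGLE, chi)
              for chi in CHI1_ROTAMERS}

for chi, pos in candidates.items():
    print(f"chi1 = {chi:7.1f}  CG = ({pos[0]:.3f}, {pos[1]:.3f}, {pos[2]:.3f})")
```

The three candidate positions are only the centres of the accessible
region. Weighting them by a rotamer probability and by the local
environment would be the next step, and that part is not sketched here.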
Indeed, there need to be better ways to include prior information in
the model building/refinement process and to describe our models.
Setting occupancies to zero, leaving out atoms, or letting the B
factors reflect our intentions are all unsatisfactory solutions. The
more we know about structures, the "worse" this problem will get,
because more and more external information will be incorporated into
models.
Best - MM
------------------------------------------------------------------------
Mischa Machius, PhD
Associate Professor
UT Southwestern Medical Center at Dallas
5323 Harry Hines Blvd.; ND10.214A
Dallas, TX 75390-8816; U.S.A.
Tel: +1 214 645 6381
Fax: +1 214 645 6353