Sounds like what you are trying to do is similar to an old jiffy script of mine:
http://bl831.als.lbl.gov/~jamesh/pickup/local_corr.com

My purpose was to compute the correlation coefficient (CC) for a bunch of different rotamers of a side chain, but I think the script will work for you "as is". What I learned is that you want to make a set of PDB files that contain only the "variable" atoms in the structure (in your case, just the ligands, no protein). Otherwise, the "signal" you are trying to measure is swamped by all the other atoms in the structure. Then you want to "select" map grid points that are near ANY of the atoms in your set of PDB files and score all your PDBs against that SAME set of grid points in the same map. If you don't do this, you will always find that bigger stuff matches better because bigger stuff simply intersects more density.
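A minimal sketch of the common-mask idea, in Python with NumPy rather than the CCP4 script linked above. It is not a transcription of local_corr.com. The function names, the toy Gaussian "density", the 1.5 A radius and the orthogonal, non-periodic grid starting at the origin are all assumptions made for illustration.

```python
import numpy as np

def grid_coords(shape, spacing):
    """Cartesian coordinates of every grid point (orthogonal grid, origin at 0)."""
    axes = [np.arange(n) * spacing for n in shape]
    return np.meshgrid(*axes, indexing="ij")

def union_mask(shape, spacing, models, radius):
    """True where a grid point lies within `radius` of ANY atom of ANY model."""
    gx, gy, gz = grid_coords(shape, spacing)
    mask = np.zeros(shape, dtype=bool)
    for atoms in models:
        for x, y, z in atoms:
            d2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
            mask |= d2 <= radius ** 2
    return mask

def masked_cc(map_a, map_b, mask):
    """Pearson correlation of two maps over the selected grid points only."""
    a = map_a[mask] - map_a[mask].mean()
    b = map_b[mask] - map_b[mask].mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def toy_density(shape, spacing, atoms, sigma=0.7):
    """Stand-in for a calculated map: one Gaussian blob per atom (illustration only)."""
    gx, gy, gz = grid_coords(shape, spacing)
    rho = np.zeros(shape)
    for x, y, z in atoms:
        d2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
        rho += np.exp(-d2 / (2.0 * sigma ** 2))
    return rho

shape, spacing = (16, 16, 16), 0.5
model_a = [(3.0, 3.0, 3.0), (4.5, 3.0, 3.0)]   # hypothetical "right" conformer
model_b = [(3.0, 3.0, 3.0), (3.0, 4.5, 3.0)]   # hypothetical alternative conformer

# One mask built from ALL candidates, then reused for every score.
mask = union_mask(shape, spacing, [model_a, model_b], radius=1.5)
observed = toy_density(shape, spacing, model_a)
cc_a = masked_cc(observed, toy_density(shape, spacing, model_a), mask)
cc_b = masked_cc(observed, toy_density(shape, spacing, model_b), mask)
```

Because both candidates are scored over the same grid points, the comparison between `cc_a` and `cc_b` does not depend on how many grid points each candidate happens to cover on its own.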

I also found it is much better to use the CC of the Laplacian of the electron density maps, rather than the CC of the raw electron density itself. By "better" I mean this: with the raw density, a "wrong" rotamer that happens to stick itself into the middle of a nearby helix or heavy metal will "correlate" very well, and at poor resolution it will actually correlate better than the "right" rotamer. This is because the CC (and the R factor) essentially "score" the overall density overlap, whereas the Laplacian seems to "score" how connected the density is. The Laplacian filter does have the unfortunate effect of amplifying the noise of high-angle spots, so applying a smoothing B factor after the Laplacian can make things behave better. Exactly which smoothing B factor is optimal is something I have yet to figure out.
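One way to sketch the Laplacian-plus-B-factor filter is in reciprocal space, where the Laplacian multiplies each Fourier coefficient by -4*pi^2*s^2 and a B factor multiplies it by exp(-B*s^2/4), with s = 1/d. The Python below is an illustration under stated assumptions, not the procedure used in the script above: it assumes an orthogonal cell, a map that covers exactly one full unit cell, and hypothetical function names. The B value of 20 is arbitrary.

```python
import numpy as np

def laplacian_map(rho, cell, b_smooth=0.0):
    """Laplacian of a periodic map on an orthogonal cell, optionally smoothed.

    rho      : 3D array covering one full unit cell
    cell     : (a, b, c) edge lengths in Angstrom
    b_smooth : B factor (A^2) applied along with the Laplacian; since both
               are multiplications in reciprocal space, the order is irrelevant
    """
    coeffs = np.fft.fftn(rho)
    # Reciprocal-space coordinate (1/A) along each axis: index / edge length.
    freqs = [np.fft.fftfreq(n, d=length / n) for n, length in zip(rho.shape, cell)]
    sx, sy, sz = np.meshgrid(*freqs, indexing="ij")
    s2 = sx ** 2 + sy ** 2 + sz ** 2            # s^2 = 1/d^2
    coeffs = coeffs * (-4.0 * np.pi ** 2 * s2) * np.exp(-b_smooth * s2 / 4.0)
    return np.fft.ifftn(coeffs).real

def cc(a, b):
    """Pearson correlation coefficient between two arrays of equal shape."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

# Sanity check on a map with a known Laplacian: a single cosine wave along x.
n, length = 16, 8.0
x = np.arange(n) * length / n
rho = np.cos(2.0 * np.pi * x / length)[:, None, None] * np.ones((n, n, n))
lap = laplacian_map(rho, (length, length, length))
expected = -(2.0 * np.pi / length) ** 2 * rho   # analytic Laplacian of the cosine
smoothed = laplacian_map(rho, (length, length, length), b_smooth=20.0)
```

To score a model this way, one would filter both the observed and the calculated map with the same `laplacian_map` settings and then take `cc` over the selected grid points, instead of over the raw densities.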

I think the reason comparing Laplacian-ized maps works better is that when we mortals look at maps, what we are looking at is an edge detection (we contour the map) next to another kind of edge detection (bonds between atoms). I'm told that comparing Laplacians instead of direct pixels is a fairly standard methodology in machine vision, but I don't have a reference for that.

As for the real-space R factor, I have always found this to be highly sensitive to the scale and offset of the maps, whereas the correlation coefficient is completely insensitive to scale factors. Since I can't think of anything that the real-space R would tell me that the CC wouldn't, I have always used the latter.
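The scale and offset sensitivity is easy to see with a few numbers. The sketch below uses one common definition of the real-space R factor, sum|obs - calc| / sum|obs + calc|, which is an assumption here since programs differ in the details. The density values and function names are made up for illustration.

```python
import math

def real_space_r(obs, calc):
    """Real-space R as sum|obs - calc| / sum|obs + calc| (one common definition)."""
    num = sum(abs(o - c) for o, c in zip(obs, calc))
    den = sum(abs(o + c) for o, c in zip(obs, calc))
    return num / den

def cc(obs, calc):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_o = sum(obs) / len(obs)
    mean_c = sum(calc) / len(calc)
    num = sum((o - mean_o) * (c - mean_c) for o, c in zip(obs, calc))
    den = math.sqrt(sum((o - mean_o) ** 2 for o in obs) *
                    sum((c - mean_c) ** 2 for c in calc))
    return num / den

# Made-up density values at five grid points.
obs = [0.1, 0.8, 1.5, 0.9, 0.2]
calc = list(obs)                          # a "perfect" model
rescaled = [2.0 * v + 0.5 for v in calc]  # same model, different scale and offset

r_perfect = real_space_r(obs, calc)
r_rescaled = real_space_r(obs, rescaled)
cc_perfect = cc(obs, calc)
cc_rescaled = cc(obs, rescaled)
```

Here `r_perfect` is 0 and `r_rescaled` is 6/13 (about 0.46) even though the model has not changed, while the CC is 1 in both cases.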

Oh, and if you are getting zero or negative CC for perfectly good models, you might want to check and be sure that SFALL is doing the map calculation properly. A while ago I noticed that if I were missing the CRYST1 line in the PDB file, then SFALL would happily give me a random map, even if I gave it the cell and SG in the input cards! This was probably fixed in the latest release, but I have not checked...

-James Holton
MAD Scientist

On 10/4/2011 1:55 PM, Brigitte Ziervogel wrote:
Hi,

I am using the program Overlapmap to calculate real-space R-factors and 
correlation coefficients in order to find ligand conformations that fit best 
within the density.

I'm confused by the Overlapmap output, which includes "Fobs" and "Fcalc" values
that are used to calculate the R-factors and corr coeff.  However, I'm not sure
what these F values are as they should not be structure factors since the
program seems to only deal with maps.  Additionally, in many cases the Fobs and
Fcalc values are either 0 or negative values, even for protein residues that
are well-defined in the density.

Has anyone used this program before or have an idea of what could be going on 
here?

I have been supplying the program with a refmac mtz file with ligand unmodeled 
as map 1 and a pdb file with both protein and ligand coordinates to calculate 
the map 2.

Any suggestions or ideas of better ways to score ligand fits are appreciated, 
thanks.

Brigitte
