Re: [ccp4bb] mtz map data

James Holton Fri, 29 Aug 2014 13:25:21 -0700

MTZ files are not maps. They can usually be converted into maps, but that depends on the type of information they contain. Coot lets you load an mtz and immediately view a map, but only because it internally picks the coefficients that you probably want to use and then internally executes an inverse Fourier transform. To get a map you can access voxel by voxel you will need the CCP4 program "FFT":

http://www.ccp4.ac.uk/html/fft.html

Once you have a map file, you can convert it to text using the CCP4 program MAPDUMP:

http://www.ccp4.ac.uk/html/mapdump.html
or perhaps MAPTONA4
http://www.ccp4.ac.uk/html/maptona4.html

However, both of these output only a few significant digits, which invariably introduces round-off error. Some may argue that the extra digits are not "significant" anyway, but round-off error is an insidious beast, and once introduced it never goes away. If you don't believe me, try rotating a PDB file of lysozyme through 100 full 360-degree rotations in 1 degree steps with PDBSET. You will find that you do not end up with the same structure you started with! It is RMS 0.1 A different, with some atoms as much as 1 A out of place. This is not a bug in PDBSET, it arises from the round-off error in the 3rd "significant digit" of the coordinates accumulating with each round-off event.
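The accumulation described above can be imitated without PDBSET. The following is only a minimal sketch, assuming a single made-up atom position and a rotation about the z axis; rounding to three decimal places after every step stands in for writing and re-reading a PDB file. The function names and the starting coordinates are hypothetical.

```python
import math

def rotate_z(point, degrees):
    """Rotate an (x, y, z) point about the z axis by the given angle."""
    x, y, z = point
    angle = math.radians(degrees)
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def spin(point, n_steps, step_degrees=1.0, decimals=None):
    """Apply n_steps rotations, optionally rounding after each one.

    decimals=3 imitates storing coordinates with three decimal places
    between steps; decimals=None keeps full double precision.
    """
    for _ in range(n_steps):
        point = rotate_z(point, step_degrees)
        if decimals is not None:
            point = tuple(round(v, decimals) for v in point)
    return point

def distance(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

start = (12.345, -6.789, 3.210)   # made-up atom position, in Angstroms
n_steps = 100 * 360               # 100 full turns in 1-degree steps

drift_rounded = distance(start, spin(start, n_steps, decimals=3))
drift_full = distance(start, spin(start, n_steps))

print(f"drift with 3-decimal rounding:  {drift_rounded:.4f} A")
print(f"drift at full double precision: {drift_full:.2e} A")
```

The size of the drift for a real structure will depend on the rotation axis and on how far each atom sits from it, so the numbers printed here are not a substitute for trying it on an actual PDB file.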

But I digress. The point is that in situations where you are going to go back-and-forth between map and mtz more than once, or if you simply don't know the magnitude of error that you are dealing with (a common situation), then I recommend keeping round-off error to a minimum.

Internally, CCP4 maps are stored as 4-byte floats, which have about seven significant base-10 digits. The full file format definition is here:

http://www.ccp4.ac.uk/html/maplib.html

Personally, I tend to read map values with the unix binary dump program "od", which can be made to read 4-byte floats. It is simply a matter of skipping the header, which is easily done by using MAPDUMP to tell you how many grid points are in the file and subtracting 4* that number of bytes from the file size. I wrote a little jiffy script for converting a CCP4 *.map file into a PDB file with the density printed into the B factor column here:

http://bl831.als.lbl.gov/~jamesh/map_noise/scripts/map2pdb.com

This script does introduce round-off error, but you can modify it as you see fit. You will find, however, that the PDB file produced this way is pretty huge.
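The same header-skipping idea (file size minus 4 bytes per grid point) can be sketched in Python instead of "od". This is not the map2pdb.com script, just a minimal illustration. The function name read_map_values, the little-endian default, and the toy file with a 1024-byte placeholder header are all assumptions made for the example; for a real map, take the number of grid points from MAPDUMP and check the byte order of your file.

```python
import os
import struct
import tempfile

def read_map_values(path, n_points, byte_order="<"):
    """Return the last n_points 4-byte floats of a file as a tuple.

    n_points is the number of grid points stored in the map. Everything
    before the final 4 * n_points bytes is treated as header and skipped.
    byte_order is an assumption: "<" is little-endian, ">" is big-endian.
    """
    data_bytes = 4 * n_points
    header_bytes = os.path.getsize(path) - data_bytes
    if header_bytes < 0:
        raise ValueError("file is smaller than the requested number of grid points")
    with open(path, "rb") as handle:
        handle.seek(header_bytes)          # skip the header
        raw = handle.read(data_bytes)      # read only the density values
    return struct.unpack(f"{byte_order}{n_points}f", raw)

# Toy file standing in for a map: placeholder header followed by six floats.
values = [0.125, -1.5, 2.75, 0.0, 3.5, -0.25]
with tempfile.NamedTemporaryFile(suffix=".map", delete=False) as tmp:
    tmp.write(b"\0" * 1024)
    tmp.write(struct.pack("<6f", *values))
    toy_path = tmp.name

recovered = read_map_values(toy_path, len(values))
os.remove(toy_path)
print(recovered)
```

Because the values are unpacked as binary floats and never pass through a formatted text file, no extra round-off is introduced at this step.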

I should warn you that you will find that the "nearest" map grid point is not a very good approximation of a given point of interest (such as the center of an atom). What you might want to do is interpolate between nearby map grid points, possibly even doing a cubic spline interpolation in three dimensions. This feature cannot be found in the CCP4 suite, but is available in the RAVE suite from Uppsala Software Factory as the program MAPMAN:

http://xray.bmc.uu.se/usf/mapman_man.html
Use the "peek" function in that program.
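To show what interpolating between nearby grid points means, here is a minimal sketch of trilinear interpolation, which is simpler than the cubic spline mentioned above and is not how MAPMAN is implemented. The function name trilinear, the nested-list grid layout, and the periodic wrap-around are assumptions of this example.

```python
import math

def trilinear(grid, x, y, z):
    """Interpolate grid[i][j][k] at fractional grid indices (x, y, z).

    The grid is treated as periodic, which suits a map covering a whole
    unit cell; that periodicity is an assumption of this sketch.
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    i0, j0, k0 = math.floor(x), math.floor(y), math.floor(z)
    fx, fy, fz = x - i0, y - j0, z - k0
    value = 0.0
    # Weighted sum over the eight grid points surrounding (x, y, z).
    for di, wx in ((0, 1.0 - fx), (1, fx)):
        for dj, wy in ((0, 1.0 - fy), (1, fy)):
            for dk, wz in ((0, 1.0 - fz), (1, fz)):
                neighbor = grid[(i0 + di) % nx][(j0 + dj) % ny][(k0 + dk) % nz]
                value += wx * wy * wz * neighbor
    return value

# Toy 4x4x4 "map" whose value at index (i, j, k) is i + 2*j + 3*k.
grid = [[[i + 2 * j + 3 * k for k in range(4)] for j in range(4)] for i in range(4)]

point = (1.25, 2.25, 0.25)
interpolated = trilinear(grid, *point)
nearest = grid[1][2][0]          # the closest grid point to `point`
print(interpolated, nearest)
```

In this toy grid the interpolated value and the nearest-grid-point value differ noticeably even though the point is only a quarter of a grid step away in each direction.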

All that said, I caution you that integrating a "volume" from map values is in general not a very good idea. More properly, the integral of electron density over space is a "charge", and such integrals are very sensitive to offsets. Offsets are dangerous because most maps are calculated to have the sum of all grid point values equal to zero. This is simply because the F000 structure factor, which represents this sum, was not measured. So, if you add "1" to all your map voxels, your integral immediately goes up by "n" (in units of the voxel volume), where "n" is the number of grid points. Yes, difference maps are supposed to be centered on zero, but to what precision? You need F000 to be sure. One way of estimating F000 was published recently:

http://www.pnas.org/content/111/1/237.long
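The sensitivity to offsets is easy to see with made-up numbers. The sketch below builds a zero-sum list of random "voxels" (a stand-in for a map calculated without F000), adds a small constant to every one, and sums both versions. The voxel count, the offset, and the function name integrate are arbitrary choices for illustration.

```python
import math
import random

random.seed(0)
n = 1000                                       # number of grid points (arbitrary)
voxels = [random.gauss(0.0, 1.0) for _ in range(n)]
mean = math.fsum(voxels) / n
voxels = [v - mean for v in voxels]            # force the sum to zero, as with no F000

def integrate(values, voxel_volume=1.0):
    """Sum of all voxel values times the volume of one voxel."""
    return math.fsum(values) * voxel_volume

offset = 0.01                                  # small constant added to every voxel
shifted = [v + offset for v in voxels]

print("integral of zero-sum map:", integrate(voxels))
print("integral after offset:   ", integrate(shifted))
print("n * offset:              ", n * offset)
```

An offset far smaller than the noise in any single voxel still changes the total by n times that offset, which is why the precision of the zero level matters so much.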

However, when integrating density you also have problems with how far away from your point of interest you should integrate. Too far and you pick up a lot of noise, but too close and you get systematic bias.

Personally, I think the best way to integrate a map is by using occupancy refinement. Essentially, if you have a noisy curve and you want to integrate it, fitting a smooth function to that curve and then integrating the fitted function is often a very powerful noise filter. The calculated form factor for an atom has a well-defined integral (the atomic number), it "automatically" defines the vacuum level of the map by being zero far from the atomic center, it "automatically" subtracts the density of neighboring atoms (if they are modeled in), and also down-weights more distant voxels by an appropriate amount. That is, the optimum noise filter is generally the same shape as the signal of interest (sometimes called a Wiener filter, see the Numerical Recipes book). So, if you fit a bunch of hydrogen atoms into your mystery density (perhaps keeping the rest of the model fixed if you can) then the sum of all those hydrogen occupancies is a very good estimate of the number of electrons in the feature. What is the error bar? Try jiggling the rest of the molecule and re-refining the occupancies. That is at worst a lower bound for the total error. Do this 5-10 times with different random number seeds and you will have a handle on the RMS variation in your "measurement" of the charge. The END_RAPID.com script available here:

http://bl831.als.lbl.gov/END/RAPID/end.rapid/Documentation/documentation.htm

can be useful for doing several parallel occupancy refinements with the benefit of also adding noise to the data to get a more realistic error bar.
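The fit-then-integrate argument can be tried on a one-dimensional toy problem. This is only a sketch under stated assumptions, not occupancy refinement and not what END_RAPID.com does: a unit-area Gaussian stands in for the atomic density profile, Gaussian noise is added to each sample, and the "occupancy" is the least-squares scale factor of the known shape. All names and parameter values are made up for illustration.

```python
import math
import random

def profile(x, sigma=1.0):
    """Unit-area Gaussian standing in for an atomic density profile."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def one_trial(seed, true_charge=1.0, noise=0.05, step=0.1, half_width=20.0):
    """Return (direct integral, fitted occupancy) for one noisy curve."""
    rng = random.Random(seed)
    n = int(round(2 * half_width / step)) + 1
    xs = [-half_width + i * step for i in range(n)]
    shape = [profile(x) for x in xs]
    noisy = [true_charge * g + rng.gauss(0.0, noise) for g in shape]
    # Direct integration: every sample counts equally, noise included.
    direct = sum(noisy) * step
    # Fit: least-squares scale of the known shape. Samples far from the
    # center get little weight because the shape is nearly zero there.
    fitted = sum(y * g for y, g in zip(noisy, shape)) / sum(g * g for g in shape)
    return direct, fitted

def rms_about(values, reference):
    """RMS deviation of a list of values from a reference value."""
    return math.sqrt(sum((v - reference) ** 2 for v in values) / len(values))

trials = [one_trial(seed) for seed in range(10)]   # 10 different random seeds
rms_direct = rms_about([d for d, _ in trials], 1.0)
rms_fitted = rms_about([f for _, f in trials], 1.0)

print(f"RMS error, direct summation: {rms_direct:.3f}")
print(f"RMS error, fitted occupancy: {rms_fitted:.3f}")
```

Since the profile has unit area, the fitted scale factor is itself the estimate of the charge, and the spread over seeds plays the role of the error bar obtained from repeated refinements.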


HTH

-James Holton
MAD Scientist


On 8/28/2014 7:49 AM, Alejandro Virrueta wrote:

Hello,
I am new to crystallography refinement, and I have a question (I'm sorry in advance if it is a stupid one):
Is it possible to extract all the density values (observed and model-computed) from an mtz file? I need this information in order to quantify the volumes of the difference regions (peaks and holes).
Thanks,
Alex
