Re: [ccp4bb] lossy compression of diffraction images

James Holton Fri, 07 May 2010 09:19:30 -0700

So far I have gotten several "votes" based on the lossless compression ratio of the images, but, before I reveal the "answer" to the CCP4BB, I remind everyone that the LOSSY compression ratio of the compressed images is 34-fold! So bzip2 and gzip are now incredibly inefficient methods of storage for the "compressed data set".

I am mainly curious if anyone can find some significant change in the data quality upon processing these images. At higher compression ratios than this, the visual appearance of the background does indeed become quite "jpegy", but the cool thing about video compression is that it is very good at preserving the "local average value" of a group of pixels, and thus the fit of the background around a spot to a plane that is done during data reduction still works, even at VERY high compression ratios (200 or more). But you do eventually end up sacrificing faint spots. This is the "judgment call" I'd like opinions on. Personally, I don't think the faint spots are all that important, but others might have some religion about them...
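The point about the "local average value" can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the 8x8 background box, the noise level, and above all the 2x2 block averaging, which is only a crude stand-in for a real video codec and is not the compression used for the test images. The helper names (fit_plane, block_average, make_background) are hypothetical.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(img):
    """Least-squares fit of z = a*x + b*y + c to all pixels of img[y][x]."""
    pts = [(x, y, z) for y, row in enumerate(img) for x, z in enumerate(row)]
    n = len(pts)
    sx = sum(x for x, _, _ in pts)
    sy = sum(y for _, y, _ in pts)
    sz = sum(z for _, _, z in pts)
    sxx = sum(x * x for x, _, _ in pts)
    syy = sum(y * y for _, y, _ in pts)
    sxy = sum(x * y for x, y, _ in pts)
    sxz = sum(x * z for x, _, z in pts)
    syz = sum(y * z for _, y, z in pts)
    # Normal equations, solved with Cramer's rule (fine for a 3x3 system).
    lhs = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(lhs)
    coeffs = []
    for col in range(3):
        m = [row[:] for row in lhs]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)  # (a, b, c)


def block_average(img, size=2):
    """Toy lossy step (assumption): replace each size x size block by its mean."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y0 in range(0, h, size):
        for x0 in range(0, w, size):
            ys = range(y0, min(y0 + size, h))
            xs = range(x0, min(x0 + size, w))
            mean = sum(img[y][x] for y in ys for x in xs) / (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


def make_background(n=8, seed=0):
    """Synthetic sloping background with Gaussian noise (made-up numbers)."""
    rng = random.Random(seed)
    return [[100.0 + 0.5 * x + 0.25 * y + rng.gauss(0.0, 3.0)
             for x in range(n)] for y in range(n)]


if __name__ == "__main__":
    original = make_background()
    lossy = block_average(original)
    print("plane fit, original:", fit_plane(original))
    print("plane fit, averaged:", fit_plane(lossy))
```

Because block averaging leaves the sum of the pixel values unchanged, the fitted plane evaluated at the centre of the box is the same before and after, while the slopes shift only slightly. A real codec will not preserve the local mean exactly, so this shows the idea rather than the actual behaviour.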


Thanks for the input!

-James Holton
MAD Scientist


H. Raaijmakers wrote:

James,

caseB was lossy compressed.
It is 10% smaller when compressed (gzip, bzip2), so it contains
significantly less information.

cheers,

Hans
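The compressibility check described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: it works on synthetic bytes rather than real image files, uses zlib (the same DEFLATE algorithm as gzip, without the gzip file header) in place of the gzip command, and mimics a lossy step by simply dropping low-order bits, which is not how the test images were compressed. The function names are hypothetical.

```python
import bz2
import random
import zlib


def compressed_sizes(data):
    """Return raw and losslessly compressed sizes (in bytes) of a byte string."""
    return {
        "raw": len(data),
        "deflate": len(zlib.compress(data, 9)),  # DEFLATE, as used by gzip
        "bzip2": len(bz2.compress(data, 9)),
    }


def drop_low_bits(data, bits=4):
    """Toy lossy step (assumption): zero the lowest `bits` bits of every byte."""
    mask = (0xFF << bits) & 0xFF
    return bytes(b & mask for b in data)


def noisy_bytes(n=20000, seed=0):
    """Synthetic stand-in for noisy pixel data."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


if __name__ == "__main__":
    original = noisy_bytes()
    lossy = drop_low_bits(original)
    print("original:", compressed_sizes(original))
    print("lossy:   ", compressed_sizes(lossy))
```

Data that has been through a lossy step usually carries less noise and therefore compresses better losslessly, which is the signature used above to pick caseB. With real images the headers would need to be handled separately from the pixel data.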

James Holton wrote:

Ian Tickle wrote:

I found an old e-mail from James Holton where he suggested lossy
compression for diffraction images (as long as it didn't change the
F's significantly!) - I'm not sure whether anything came of that!

Well, yes, something did come of this....  But I don't think Gerard
Bricogne is going to like it.

Details are here:
http://bl831.als.lbl.gov/~jamesh/lossy_compression/

Short version is that I found a way to compress a test lysozyme dataset
by a factor of ~33 with no apparent ill effects on the data.  In fact,
anomalous differences were completely unaffected, and Rfree dropped from
0.287 for the original data to 0.275 when refined against Fs from the
compressed images.  This is no doubt a fluke of the excess noise added
by compression, but I think it highlights how the errors in
crystallography are dominated by the inadequacies of the electron
density models we use, and not the quality of our data.

The page above lists two data sets: "A" and "B", and I am interested to
know if and how anyone can "tell" which one of these data sets was
compressed.  The first image of each data set can be found here:
http://bl831.als.lbl.gov/~jamesh/lossy_compression/firstimage.tar.bz2

-James Holton
MAD Scientist
