Re: [ccp4bb] Processing compressed diffraction images?

Fischmann, Thierry Thu, 06 May 2010 06:05:59 -0700

The results from compressing a diffraction image must vary quite a bit on a 
case by case basis.


I looked into it a long time ago using images from a few datasets from 2 
different projects. Compress was quite a bit faster than gzip or bzip2 in 
these tests, but it also delivered the least compression. gzip and bzip2 ran 
at about the same speed (or lack thereof), but while the difference in speed 
was marginal, bzip2 delivered a 20-30% size improvement over gzip.

The tests were with images of diffracting crystals, the diffraction extending 
to the edge of the detector.
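
A comparison of this kind can be repeated with a few lines of Python. The
sketch below is only an illustration, not the procedure used for the tests
above: it times gzip and bzip2 on a synthetic frame (low background plus a
few bright pixels), and the names synthetic_frame and benchmark are
hypothetical. UNIX compress (LZW) has no equivalent in the Python standard
library, so it is left out. For real numbers, replace the synthetic frame
with the bytes of an actual image file.

```python
import bz2
import gzip
import random
import time
from array import array


def synthetic_frame(n_pixels=512 * 512, seed=0):
    """Hypothetical stand-in for a detector frame (assumption, not real data):
    unsigned 16-bit pixels with a low noisy background and a few bright spots."""
    rng = random.Random(seed)
    pixels = array("H", (rng.randint(0, 20) for _ in range(n_pixels)))
    for _ in range(200):
        pixels[rng.randrange(n_pixels)] = rng.randint(1000, 60000)
    return pixels.tobytes()


def benchmark(data):
    """Return {name: (compressed size in bytes, seconds taken)} for gzip and bzip2."""
    results = {}
    for name, compress in (("gzip", gzip.compress), ("bzip2", bz2.compress)):
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = (len(packed), elapsed)
    return results


if __name__ == "__main__":
    frame = synthetic_frame()
    for name, (size, seconds) in benchmark(frame).items():
        ratio = len(frame) / size
        print(f"{name:6s} {size:8d} bytes  ratio {ratio:5.2f}  {seconds:.3f} s")
```

The outcome depends strongly on the image content, so numbers from a
synthetic frame should not be read as representative of real datasets.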

Regards,

Thierry

-----Original Message-----
From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of Ian Tickle
Sent: Thursday, May 06, 2010 08:28 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Processing compressed diffraction images?

Hi Tim thanks for that, sorry yes I missed that page.  But I'm still
not clear: is it uncompressing to disk or is it doing it in memory?  I
assume the latter: if the former then obviously nothing is gained.
You're right about the compression factor, it's more like a factor of
2 or 3, I should have looked at the image in question as the one I
picked had no spots!
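
To make the disk-versus-memory distinction concrete, here is a minimal
Python sketch of in-memory expansion: the compressed file is read from disk
and the uncompressed bytes exist only in RAM, so no uncompressed copy is
ever written. This is not how XDS or MOSFLM implement it (the thread does
not say); the function name read_image_bytes and the choice of opener by
file extension are assumptions made for illustration. The .Z format of UNIX
compress is not handled, since the Python standard library has no reader
for it.

```python
import bz2
import gzip
from pathlib import Path

# Map a compression suffix to the matching standard-library opener.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def read_image_bytes(path):
    """Return the uncompressed contents of an image file as bytes.

    Files ending in .gz or .bz2 are expanded in memory while being read;
    any other file is read as-is.
    """
    path = Path(path)
    opener = _OPENERS.get(path.suffix, open)
    with opener(path, "rb") as handle:
        return handle.read()
```

With this approach the only disk I/O is the read of the compressed file,
at the cost of CPU time for the expansion.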

Cheers

-- Ian

On Thu, May 6, 2010 at 12:54 PM, Tim Gruene <t...@shelx.uni-ac.gwdg.de> wrote:
> Entering "xds gzip" at www.ixquick.com came up with
> http://www.mpimf-heidelberg.mpg.de/~kabsch/xds/html_doc/xds_parameters.html:
>
> "To save space it is allowed to compress the images by using the UNIX
> compress, gzip, or bzip2 routines. On data processing XDS will automatically
> recognize and expand the compressed images files. The file name extensions
> (.Z, .z, .gz, bz2) due to the compression routines should not be included in
> the generic file name template."
>
> I thought I remembered that mosflm also supports gzipped images but didn't
> find a reference within 2 minutes.
>
> I'm surprised to hear that you get such a high compression rate with mccd
> images.
>
> Cheers, Tim
>
>
> On Thu, May 06, 2010 at 12:24:47PM +0100, Ian Tickle wrote:
>> All -
>>
>> No doubt this topic has come up before on the BB: I'd like to ask
>> about the current capabilities of the various integration programs (in
>> practice we use only MOSFLM & XDS) for reading compressed diffraction
>> images from synchrotrons.  AFAICS XDS has limited support for reading
>> compressed images (TIFF format from the MARCCD detector and CCP4
>> compressed format from the Oxford Diffraction CCD); MOSFLM doesn't
>> seem to support reading compressed images at all (I'm sure Harry will
>> correct me if I'm wrong about this!).  I'm really thinking about
>> gzipped files here: bzip2 no doubt gives marginally smaller files but
>> is very slow.  Currently we bring back uncompressed images but it
>> seems to me that this is not the most efficient way of doing things -
>> or is it just that my expectation that it's more efficient to read
>> compressed images and uncompress in memory is not realised in practice?
>> For example the AstexViewer molecular viewer software currently reads
>> gzipped CCP4 maps directly and gunzips them in memory; this improves
>> the response time by a modest factor of ~ 1.5, but this is because
>> electron density maps are 'dense' from a compression point of view;
>> X-ray diffraction images tend to have much more 'empty space' and the
>> compression factor is usually considerably higher (as much as
>> 10-fold).
>>
>> On a recent trip we collected more data than we anticipated & the
>> uncompressed data no longer fitted on our USB disk (the data is backed
>> up to the USB disk as it's collected), so we would have definitely
>> benefited from compression!  However file size is *not* the issue:
>> disk space is cheap after all.  My point is that compressed images
>> surely require much less disk I/O to read.  In this respect bringing
>> back compressed images and then uncompressing back to a local disk
>> completely defeats the object of compression - you actually more than
>> double the I/O instead of reducing it!  We see this when we try to
>> process the ~150 datasets that we bring back on our PC cluster and the
>> disk I/O completely cripples the disk server machine (and everyone
>> who's trying to use it at the same time!) unless we're careful to
>> limit the number of simultaneous jobs.  When we routinely start to use
>> the Pilatus detector on the beamlines this is going to be even more of
>> an issue.  Basically we have plenty of processing power from the
>> cluster: the disk I/O is the bottleneck.  Now you could argue that we
>> should spread the load over more disks or maybe spend more on faster
>> disk controllers, but the whole point about disks is they're cheap, we
>> don't need the extra I/O bandwidth for anything else, and you
>> shouldn't need to spend a fortune, particularly if there are ways of
>> making the software more efficient, which after all will benefit
>> everyone.
>>
>> Cheers
>>
>> -- Ian
>
> --
> --
> Tim Gruene
> Institut fuer anorganische Chemie
> Tammannstr. 4
> D-37077 Goettingen
>
> GPG Key ID = A46BEE1A
