On Mon, Apr 20, 2009 at 6:39 PM, John Levon <john.levon at sun.com> wrote:
> On Mon, Apr 20, 2009 at 09:25:51AM -0700, Rich Burridge wrote:
>
>>       bzip2 is a free and open source block sorting lossless data
>>       compression algorithm with comparatively high compression efficiency.
>>
>>       pbzip2 is a parallel implementation of the bzip2 algorithm using
>>       pthreads written in C++ by Jeff Gilchrist that retains file
>>       compatibility with the common bzip2(1) application included in
>>       Solaris and many other operating systems
>
> Is there a reason we're not delivering this version as the real bzip2,
> then just providing a symlink? What is the advantage of the non-parallel
> implementation exactly to mean it needs a new name?

I use pbzip2 and bzip2 a lot.

They should be kept separate.

For one thing, I would expect to get regular bzip2 if I called it by that name,
and likewise pbzip2 if I asked for pbzip2.

There are key behavioural differences that I've seen:

 - pbzip2 will use all the available CPUs by default. You really don't want to
make it that easy to saturate a machine; it can be very unpleasant for the
other users of the machine.

 - To decompress a file that was compressed by regular bzip2, pbzip2 requires
memory equal to the size of the file. That file may be extremely large, in
which case decompression may not work at all.
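
To illustrate the first point, here is a minimal sketch of capping pbzip2's
processor count from a wrapper script. It assumes pbzip2 is installed and on
the PATH; it relies on pbzip2's -p# option, which sets the number of
processors to use. The function name and the default cap of 2 are hypothetical
choices made for this example, not anything pbzip2 or Solaris provides.

```python
import os


def pbzip2_command(path, max_cpus=2):
    """Build a pbzip2 command line that uses at most max_cpus processors.

    Hypothetical helper for illustration only. The default cap of 2 is an
    arbitrary assumption; pick whatever is polite on a shared machine.
    """
    available = os.cpu_count() or 1          # cpu_count() may return None
    cpus = max(1, min(max_cpus, available))  # never ask for fewer than 1
    return ["pbzip2", "-p{}".format(cpus), path]


# Example use (requires pbzip2 to be installed):
#   import subprocess
#   subprocess.run(pbzip2_command("big.tar", max_cpus=4), check=True)
```

Without an explicit -p, pbzip2 autodetects the processor count, which is the
behaviour described above.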

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
