Peter Tribble <peter.tribble at gmail.com> writes:
> On Mon, Apr 20, 2009 at 6:39 PM, John Levon <john.levon at sun.com> wrote:
>> On Mon, Apr 20, 2009 at 09:25:51AM -0700, Rich Burridge wrote:
>>
>>> bzip2 is a free and open source block sorting lossless data
>>> compression algorithm with comparatively high compression efficiency.
>>>
>>> pbzip2 is a parallel implementation of the bzip2 algorithm using
>>> pthreads written in C++ by Jeff Gilchrist that retains file
>>> compatibility with the common bzip2(1) application included in
>>> Solaris and many other operating systems
>>
>> Is there a reason we're not delivering this version as the real bzip2,
>> then just providing a symlink? What is the advantage of the non-parallel
>> implementation exactly to mean it needs a new name?
>
> I use pbzip2 and bzip2 a lot.
>
> They should be kept separate.
>
> For one thing, I would expect to get regular bzip2 if I called it by that
> name, likewise pbzip2.
>
> There are key behavioural differences that I've seen:
>
> - pbzip2 by default will use all the available cpus. You really don't want to
>   make it that easy to saturate a machine - it can be very unpleasant on the
>   other users of the machine.
>
> - pbzip2 requires memory equal to the size of the file to decompress a
>   file compressed by bzip2, which may be extremely large and may not
>   work at all
I believe it's also the case that older versions of bzip2 (< 1.0.2) are
unable to decompress files compressed by pbzip2.

--
Boyd