It depends :)

All the commits I have made so far represent distinct pieces of
functionality that someone could choose to use on their own, so if
commons-compress were released tomorrow they'd be usable. That said,
I'm still polishing the last high-level class, which provides the
consumer-friendlier API for creating a zip or a jar file. Even once
that is done, some of the underlying classes will be visible.

ScatterZipOutputStream is client-extensible so that clients can provide
different kinds of backing store for the intermediate output. I might
be able to hide the StreamCompressor that is part of the extensible
type hierarchy for ScatterZipOutputStream; I'll take a look.

As you may have seen, there is one subclass implemented right now,
org.apache.commons.compress.archivers.zip.ScatterZipOutputStream.FileScatterOutputStream,
which writes straight to a file. Clients can make their own backing
implementations of ScatterZipOutputStream. In my git repo I have 2
other implementations, one using the commons-io DeferredOutputStream
and another using a custom "OffloadingDeferredOutputstream", which
offloads to disk after N bytes but keeps what has already been written
in memory. (Deferred, by contrast, flushes *everything* to disk once it
switches to disk-based storage.) Adding these last two implementations
to commons-compress would imply adding at least an optional dependency
on commons-io, and I have been deferring that question :) I can think
of at least a 4th implementation, which would use off-heap memory for
storage.
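
As a rough illustration of the offloading idea described above, here is
a minimal sketch. The class name OffloadingOutputStream and its methods
are hypothetical; this is not the implementation mentioned above, and
it is not part of commons-compress or commons-io. It only shows the
principle: the first N bytes stay in memory, everything after that goes
to a temp file, and nothing is moved out of memory once the switch
happens.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.SequenceInputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of an "offload after N bytes" backing store. */
class OffloadingOutputStream extends OutputStream {
    private final int threshold; // maximum number of bytes kept in memory
    private final ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private Path overflowFile;   // created lazily on the first overflow
    private OutputStream overflow;

    OffloadingOutputStream(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] { (byte) b }, 0, 1);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        // Fill the in-memory part first; it is never moved to disk later.
        int toMemory = Math.min(len, threshold - memory.size());
        if (toMemory > 0) {
            memory.write(buf, off, toMemory);
        }
        int rest = len - toMemory;
        if (rest > 0) {
            if (overflow == null) {
                overflowFile = Files.createTempFile("scatter", ".tmp");
                overflow = Files.newOutputStream(overflowFile);
            }
            overflow.write(buf, off + toMemory, rest);
        }
    }

    @Override
    public void close() throws IOException {
        if (overflow != null) {
            overflow.close();
        }
    }

    boolean spilledToDisk() {
        return overflowFile != null;
    }

    /** Call after close(): memory part first, then the on-disk part. */
    InputStream openForReading() throws IOException {
        InputStream head = new ByteArrayInputStream(memory.toByteArray());
        if (overflowFile == null) {
            return head;
        }
        // Deleting the temp file after use is left out of this sketch.
        return new SequenceInputStream(head, Files.newInputStream(overflowFile));
    }
}
```

A caller would write to it like any other OutputStream, close it, and
then read the data back in order via openForReading().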

Kristian


2014-12-23 12:02 GMT+01:00 Emmanuel Bourg <ebo...@apache.org>:
> On 22/12/2014 16:24, krosenv...@apache.org wrote:
>> Author: krosenvold
>> Date: Mon Dec 22 15:24:02 2014
>> New Revision: 1647329
>>
>> URL: http://svn.apache.org/r1647329
>> Log:
>> COMPRESS-296 Parallel compression. Added StreamCompressor and 
>> ScatterZipOutputStream.
>>
>> StreamCompressor is an extract of the deflation algorithm from
>> ZipArchiveOutputStream, which unfortunately was too conflated with
>> writing a file in a particular structure. Using the actual zip file
>> format as an intermediate format for scatter-streams turned out to be
>> fairly inefficient. ScatterZipOutputStream is 2-3x faster than using
>> a zip file as intermediate format.
>>
>> It would be possible to refactor ZipArchiveOutputStream to use
>> StreamCompressor, but there would be a slight break in backward
>> compatibility regarding the protected writeOut method, which is moved
>> to the StreamCompressor class.
>
> Thank you Kristian. Is it possible to make the new classes package
> private or do they have to be part of the public API?
>
> Emmanuel Bourg
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
