Let me try again to explain what is going on.

OpenSSL uses ZLIB compression in the following manner:
        compress() is called on each block of data transmitted.
        For every block, that is equivalent to
        deflateInit() + deflate() + deflateEnd().

On a reliable continuous stream of data you can use it in the following way:
        You call deflateInit() when the connection is established.
        You call deflate() for each block to transmit using Z_SYNC_FLUSH.
        When the connection closes, you call deflateEnd().

In the latter case, you do not initialize and destroy the dictionary for
each block you transmit.

Now deflate() takes a flush argument, and three of its values matter here:
Z_NO_FLUSH, Z_SYNC_FLUSH and Z_FULL_FLUSH.  For interactive applications,
you need to flush, otherwise your block of data may sit in the compressor
until more data pushes it out.  With Z_SYNC_FLUSH, you force the compressor
to output the compressed data immediately.  With Z_FULL_FLUSH, you
additionally reset the compressor's state.

I ran tests with these options on a sample of our typical data stream.  We
got a compression factor of 6:1 with Z_SYNC_FLUSH and 2:1 with
Z_FULL_FLUSH.  With Z_SYNC_FLUSH, the dictionary is not trashed.

Because OpenSSL resets the compressor's state after each block of data, the
way it uses ZLIB should give results similar to those with Z_FULL_FLUSH.

I hope this clarifies things.

So I am still wondering if there is a reason why each block of data is
compressed independently from the previous one in the OpenSSL use of
compression.  Is it an architectural constraint?


Eric Le Saux
Electronic Arts

-----Original Message-----
From: Bear Giles [mailto:bgiles@coyotesong.com]
Sent: Monday, November 11, 2002 8:14 PM
To: [EMAIL PROTECTED]
Subject: Re: OpenSSL and compression using ZLIB

Le Saux, Eric wrote:
> 
> I am trying to understand why ZLIB is being used that way.  Here is what 
> gives better results on a continuous reliable stream of data:
>  
> 1)       You create a z_stream for sending, and another z_stream for 
> receiving.
> 
> 2)       You call deflateInit() and inflateInit() on them, respectively, 
> when the communication is established.
> 
> 3)       For each data block you send, you call deflate() on it.  For 
> each data block you receive, you call inflate() on it.

You then die from the latency in the inflation/deflation routines.  You 
have to flush the deflater for each block, and depending on how you do 
it your performance is the same as deflating each block separately.

> 4)       When the connection is terminated, you call deflateEnd() and 
> inflateEnd() respectively.

...

> But by far, the main advantage is that you can achieve good compression 
> even for very small blocks of data.  The "dictionary" window stays open 
> for the whole communication stream, making it possible to compress a 
> message by reference to a number of previously sent messages.

If you do a Z_SYNC_FLUSH (iirc), it blows the dictionary.  This is 
intentional, since you can restart the inflater at every SYNC mark.

I thought there was also a mode to flush the buffer (including any 
necessary padding for partial bytes) but not blowing the dictionary, but 
I'm not sure how portable it is.

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]