Re: [HACKERS] directory archive format for pg_dump

Heikki Linnakangas Wed, 01 Dec 2010 23:50:15 -0800

On 02.12.2010 04:35, Joachim Wieland wrote:

There is one thing however that I am not in favor of, which is the
removal of the "sizeHint" parameter for the read functions. The reason
for this parameter is not very clear now without LZF but I have tried
to put in a few comments to explain the situation (which you have
taken out as well :-) ).


The point is that zlib is a stream based compression algorithm, you
just stuff data in and from time to time you get data out and in the
end you explicitly flush the compressor. The read function can just
return as many bytes as it wants and we can just hand it all over to
zlib. Other compression algorithms however are block based and first
write a block header that contains the information on the next data
block, including uncompressed and compressed sizes. Now with the
sizeHint parameter I used, the compressor could tell the read function
that it just wants to read the fixed size header (6 bytes IIRC). In
the header it would look up the compressed size for the next block and
would then ask the read function to get exactly this amount of data,
decompress it and go on with the next block, and so forth...

Of course you can possibly do that memory management inside the
compressor with an extra buffer holding what you got in excess but
it's a pain. If you removed that part on purpose on the grounds that
there is no block based compression algorithm in core and probably
never will be, then that's okay :-)

Yeah, we're not going to have lzf built-in anytime soon. The externalcommand approach seems like the best way to support additionalcompression algorithms, and I don't think it could do anything withsizeHint. And the custom format didn't obey sizeHint anyway, because itreads one custom-format block at a time.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] directory archive format for pg_dump

Reply via email to