Tom Lane wrote:
Joachim Wieland <j...@mcknight.de> writes:
If we still cannot do this, then what I am asking is: What does the
project need to be able to at least link against such a compression
algorithm?

Well, what we *really* need is a convincing argument that it's worth
taking some risk for.  I find that not obvious.  You can pipe the output
of pg_dump into your-choice-of-compressor, for example, and that gets
you the ability to spread the work across multiple CPUs in addition to
eliminating legal risk to the PG project.  And in any case the general
impression seems to be that the main dump-speed bottleneck is on the
backend side not in pg_dump's compression.
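
(For concreteness, the pipe approach would be something along the lines of

    pg_dump mydb | pigz -p 8 > mydb.sql.gz

assuming a multi-threaded compressor such as pigz is available; the database
name and thread count are just placeholders.)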

Legal risks aside (I'm not a lawyer, so I cannot comment on that), the current situation imho is:

* for a plain pg_dump the backend is the bottleneck
* for a pg_dump -Fc with compression, compression is a huge bottleneck
* for pg_dump | gzip, it is usually compression (or bytea and some other datatypes in <9.0)
* for a parallel dump you can either dump uncompressed and compress afterwards, which increases disk space requirements (and if you need a parallel dump you usually have a large database) and complexity (because you have to think about how to parallelize the compression manually)
* for a parallel dump that compresses inline, you are limited by the compression algorithm on a per-core basis, and given that the current inline compression overhead is huge, you lose a lot of the benefit of parallel dump (a rough sketch of this trade-off follows below)
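
To make the last two points concrete with a single pg_dump process (the same
trade-off applies per worker in a parallel dump); database name, compression
level and file names are placeholders:

    # inline compression in the custom format: the process is limited by
    # single-threaded zlib
    pg_dump -Fc -Z 6 mydb > mydb.dump

    # no inline compression plus a separate compression step afterwards:
    # needs disk space for the full uncompressed archive first
    pg_dump -Fc -Z 0 mydb > mydb.dump && gzip mydb.dump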


Stefan
