On 08.02.2021 22:23, Daniil Zakhlystov wrote:
Hi everyone,

I’ve been running some experiments with an on-the-fly compression switch lately 
and have some updates.

...
pg_restore of IMDB database test results

Chunked compression with only CopyData or DataRow compression (second approach):
time:
real    2m27.947s
user    0m45.453s
sys     0m3.113s
RX bytes diff, human:  1.8837M
TX bytes diff, human:  1.2810G

Permanent compression:
time:
real    2m15.810s
user    0m42.822s
sys     0m2.022s
RX bytes diff, human:  2.3274M
TX bytes diff, human:  1.2761G

Without compression:
time:
real    2m38.245s
user    0m18.946s
sys     0m2.443s
RX bytes diff, human:  5.6117M
TX bytes diff, human:  3.8227G


Also, I’ve run pgbench tests and measured the CPU load. Since chunked 
compression did not compress any messages except CopyData and DataRow, it 
showed lower CPU usage than permanent compression. The full report with graphs 
is available in the Google doc above.
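
To illustrate the idea of compressing only CopyData and DataRow messages, here is a minimal Python sketch. It is not the code from the pull request: the helper names `encode_messages` and `decode_messages` are hypothetical, zlib stands in for whatever algorithm is negotiated, and each payload is compressed independently instead of going through one streaming compressor with chunk framing. The only protocol facts it relies on are the message type bytes 'd' (CopyData) and 'D' (DataRow).

```python
import zlib

# PostgreSQL protocol message type bytes: 'd' = CopyData, 'D' = DataRow.
COMPRESSIBLE_TYPES = {b"d", b"D"}


def encode_messages(messages):
    """Hypothetical helper: compress only CopyData/DataRow payloads.

    `messages` is an iterable of (type_byte, payload) pairs.
    Returns a list of (type_byte, is_compressed, body) tuples.
    """
    encoded = []
    for msg_type, payload in messages:
        if msg_type in COMPRESSIBLE_TYPES:
            # Bulk data: worth spending CPU on compression.
            encoded.append((msg_type, True, zlib.compress(payload)))
        else:
            # Small control messages pass through untouched.
            encoded.append((msg_type, False, payload))
    return encoded


def decode_messages(encoded):
    """Reverse of encode_messages: decompress only the flagged bodies."""
    return [
        (msg_type, zlib.decompress(body) if is_compressed else body)
        for msg_type, is_compressed, body in encoded
    ]


if __name__ == "__main__":
    sample = [(b"Q", b"select 1"), (b"D", b"x" * 1000), (b"d", b"y" * 1000)]
    for msg_type, is_compressed, body in encode_messages(sample):
        print(msg_type, is_compressed, len(body))
```

Skipping the short control messages is what would keep the CPU cost down in a pgbench-style workload, where most traffic is not bulk row data.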

Pull request with the second approach implemented:
https://github.com/postgrespro/libpq_compression/pull/7

Also, in this pull request, I’ve made the following changes:
- extracted the general-purpose streaming compression API into a separate 
structure (ZStream) so it can be used in other places without tx_func and 
rx_func; maybe the other compression patches can utilize it?
- did some refactoring of ZpqStream
- moved the SSL and ZPQ buffered read data checks into a separate function, 
pqReadPending

What do you think of the results above? I think that the implemented approach 
is viable, but maybe I missed something in my tests.

Sorry, but my interpretation of your results is completely different:
permanent compression is faster than chunked compression (2m15 vs. 2m27) and consumes less CPU (about 44 vs. 48 sec of user + sys time).
The RX size is slightly larger (by about 0.5 MB), but the TX size is smaller (by about 5 MB).
So permanent compression is better from all points of view: it is faster, consumes less CPU, and reduces network traffic!
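
For anyone who wants to check the arithmetic, here is a small Python sketch that recomputes the comparison from the numbers quoted above. The variable names are mine, and the TX difference assumes that 1G in the "human" output equals 1000M; with 1024M the result is still roughly 5 MB.

```python
# Figures copied from the pg_restore results quoted above.
runs = {
    "chunked":   {"user": 45.453, "sys": 3.113, "rx_m": 1.8837, "tx_g": 1.2810},
    "permanent": {"user": 42.822, "sys": 2.022, "rx_m": 2.3274, "tx_g": 1.2761},
}


def cpu_seconds(run):
    """Total CPU time: user plus sys, in seconds."""
    return run["user"] + run["sys"]


cpu_chunked = cpu_seconds(runs["chunked"])
cpu_permanent = cpu_seconds(runs["permanent"])

# Extra RX traffic with permanent compression, in M.
rx_extra_m = runs["permanent"]["rx_m"] - runs["chunked"]["rx_m"]

# TX traffic saved by permanent compression, in M.
# Assumption: 1G = 1000M in the "human" output.
tx_saved_m = (runs["chunked"]["tx_g"] - runs["permanent"]["tx_g"]) * 1000

print(f"CPU (user + sys): chunked {cpu_chunked:.3f}s, permanent {cpu_permanent:.3f}s")
print(f"RX: permanent receives {rx_extra_m:.4f}M more")
print(f"TX: permanent sends {tx_saved_m:.1f}M less")
```

With these inputs the CPU totals are 48.566 s for chunked and 44.844 s for permanent compression, which is where the rounded 48 and 44 come from.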

From my point of view, your results just confirm my original opinion that the ability to control compression on the fly and to use different compression algorithms for TX and RX data
just complicates the implementation and gives no significant advantages.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


