(2013/10/19 14:58), Amit Kapila wrote:
> On Tue, Oct 15, 2013 at 11:41 AM, KONDO Mitsumasa
> <kondo.mitsum...@lab.ntt.co.jp> wrote:
> I think in general also snappy is mostly preferred for its low CPU
> usage not for compression, but overall my vote is also for snappy.
I think low CPU usage is the most important factor in WAL compression.
This is because WAL writes are sequential, so a small improvement in compression ratio will not change PostgreSQL's performance much, especially when a RAID card with a write-back cache is used. Furthermore, PG executes each query in a single process, so a compression algorithm with high CPU usage will lower performance.

>> I found a compression algorithm test in HBase. I haven't read the details, but it
>> indicates the snappy algorithm gets the best performance.
>> http://blog.erdemagaoglu.com/post/4605524309/lzo-vs-snappy-vs-lzf-vs-zlib-a-comparison-of
>
> The dataset used for performance is quite different from the data
> which we are talking about here (WAL).
> "These are the scores for a data which consist of 700kB rows, each
> containing a binary image data. They probably won’t apply to things
> like numeric or text data."
Yes, you are right. We need to test the compression algorithms on WAL writes.
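
A rough starting point for such a test is sketched below. It is only an illustration under stated assumptions: Python's standard zlib at levels 1 and 9 stands in for a "fast" and a "slow" algorithm (snappy is not in the standard library), and make_sample() is a hypothetical generator of WAL-like records, not real WAL. A real test would have to run inside PostgreSQL on actual WAL data.

```python
import time
import zlib


def make_sample(n_records=20000):
    """Build synthetic, WAL-like bytes (an assumption, not real WAL):
    a repetitive record header followed by a varying payload."""
    parts = []
    for i in range(n_records):
        parts.append(b"XLOG|rel=16384|blk=%08d|" % (i % 512))
        parts.append(b"payload-%d;" % (i * 2654435761 % 1000003))
    return b"".join(parts)


def measure(data, level):
    """Compress data once at the given zlib level.
    Returns (compressed size / original size, elapsed seconds)."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Make sure the compressed data round-trips before trusting the numbers.
    assert zlib.decompress(packed) == data
    return len(packed) / len(data), elapsed


if __name__ == "__main__":
    sample = make_sample()
    for level in (1, 9):
        ratio, seconds = measure(sample, level)
        print("level=%d ratio=%.3f time=%.4fs" % (level, ratio, seconds))
```

The point of the comparison is how much extra time the higher level costs for how much extra reduction, since that time is spent by the single process doing the WAL write.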

>> I think it is necessary to make best efforts in community than I do the best
>> choice with strict test.
>
> Sure, it is good to make effort to select the best algorithm, but if
> you are combining this patch with inclusion of new compression
> algorithm in PG, it can only make the patch to take much longer time.
I think that once our direction is clearly decided, it is easy to make the patch.
If the direction of the compression patch is still not clear, it will become a troublesome patch like the sync-rep patch.

> In general, my thinking is that we should prefer compression to reduce
> IO (WAL volume), because reducing WAL volume has other benefits as
> well like sending it to subscriber nodes. I think it will help cases
> where due to less n/w bandwidth, the disk allocated for WAL becomes
> full due to high traffic on master and then users need some
> alternative methods to handle such situations.
Are you talking about archiving WAL files? Their volume is easy to reduce by adding a compression command to the copy command in archive_command.
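
For example, a postgresql.conf fragment along these lines compresses each WAL segment as it is archived. This is only a sketch: it assumes gzip is installed on the server, and /mnt/server/archivedir is a placeholder archive directory, not a path from this thread.

```
# Compress each completed WAL segment while archiving it.
# %p is the path of the segment to archive, %f is its file name.
# "test ! -f" refuses to overwrite a file that was already archived.
archive_command = 'test ! -f /mnt/server/archivedir/%f.gz && gzip < %p > /mnt/server/archivedir/%f.gz'

# During recovery the segments must be decompressed again.
restore_command = 'gunzip < /mnt/server/archivedir/%f.gz > %p'
```

A slower but stronger compressor could be substituted here without affecting WAL write performance, because this command runs outside the WAL write path.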

> I think many users would like to use a method which can reduce WAL
> volume and the users which don't find it enough useful in their
> environments due to decrease in TPS or not significant reduction in
> WAL have the option to disable it.
I am in favor of selecting the compression algorithm for higher performance. If we need to compress WAL files more, in spite of lower performance, we can change the archive copy command to use a high-compression algorithm and add documentation on how to compress archived WAL files in archive_command. Is that wrong? In fact, many NoSQL systems use snappy for higher performance.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers