Re: [HACKERS] Compression of full-page-writes
Marking this patch as returned with feedback for this CF, moving it to the next one. I doubt that there will be much progress here for the next couple of days, so let's try at least to get something for this release cycle. -- Michael
Re: [HACKERS] Compression of full-page-writes
On Fri, Jan 2, 2015 at 11:52 AM, Bruce Momjian br...@momjian.us wrote: OK, so given your stats, the feature gives a 12.5% reduction in I/O. If that is significant, shouldn't we see a performance improvement? If we don't see a performance improvement, is I/O reduction worthwhile? Is it valuable in that it gives non-database applications more I/O to use? Is that all? I suggest we at least document that this feature is mostly useful for I/O reduction, and maybe say CPU usage and performance might be negatively impacted. OK, here is the email I remember from Fujii Masao in this same thread that showed a performance improvement for WAL compression: http://www.postgresql.org/message-id/CAHGQGwGqG8e9YN0fNCUZqTTT=hnr7ly516kft5ffqf4pp1q...@mail.gmail.com Why are we not seeing the 33% compression and 15% performance improvement he saw? What am I missing here? Bruce, some database workloads are I/O bound and others are CPU bound. Any patch that reduces I/O by using CPU is going to be a win when the system is I/O bound and a loss when it is CPU bound. I'm not really sure what else to say about that; it seems pretty obvious. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Compression of full-page-writes
On Fri, Jan 9, 2015 at 9:49 PM, Rahila Syed rahilasyed...@gmail.com wrote: So this test can be used to evaluate how shorter records influence performance since the master waits for flush confirmation from the standby, right? Yes. This test can help measure performance improvement due to reduced I/O on standby as master waits for WAL records flush on standby. It may be interesting to run such tests with more concurrent connections at the same time, like 32 or 64. -- Michael
Re: [HACKERS] Compression of full-page-writes
So this test can be used to evaluate how shorter records influence performance since the master waits for flush confirmation from the standby, right? Yes. This test can help measure performance improvement due to reduced I/O on standby as master waits for WAL records flush on standby. Isn't that GB and not MB? Yes. That is a typo. It should be GB. How many FPWs have been generated and how many dirty buffers have been flushed for the 3 checkpoints of each test? Any data about the CPU activity? The above data is not available for this run. I will rerun the tests to gather it. Thank you, Rahila Syed
Re: [HACKERS] Compression of full-page-writes
On Thu, Jan 8, 2015 at 11:59 PM, Rahila Syed rahilasyed...@gmail.com wrote: Below are performance numbers in case of synchronous replication with and without fpw compression using latest version of patch (version 14). The patch helps improve performance considerably. Both master and standby are on the same machine in order to get numbers independent of network overhead.

So this test can be used to evaluate how shorter records influence performance since the master waits for flush confirmation from the standby, right?

The compression patch helps to increase tps by 10%. It also helps reduce I/O to disk, latency and total runtime for a fixed number of transactions as shown below. The compression of WAL is quite high, around 40%.

Compression            on                       off
WAL generated          23037180520 (~23.04MB)   38196743704 (~38.20MB)

Isn't that GB and not MB?

TPS                    264.18                   239.34
Latency average        60.541 ms                66.822 ms
Latency stddev         126.567 ms               130.434 ms
Total writes to disk   145045.310 MB            192357.250 MB
Runtime                15141.0 s                16712.0 s

How many FPWs have been generated and how many dirty buffers have been flushed for the 3 checkpoints of each test? Any data about the CPU activity? -- Michael
Re: [HACKERS] Compression of full-page-writes
Hello,

Below are performance numbers in case of synchronous replication with and without fpw compression using latest version of patch (version 14). The patch helps improve performance considerably. Both master and standby are on the same machine in order to get numbers independent of network overhead.

The compression patch helps to increase tps by 10%. It also helps reduce I/O to disk, latency and total runtime for a fixed number of transactions as shown below. The compression of WAL is quite high, around 40%.

pgbench scale: 1000
pgbench command: pgbench -c 16 -j 16 -r -t 25 -M prepared
To ensure that data is not highly compressible, empty filler columns were altered using
alter table pgbench_accounts alter column filler type text using gen_random_uuid()::text
checkpoint_segments = 1024
checkpoint_timeout = 5min
fsync = on

Compression            on                       off
WAL generated          23037180520 (~23.04MB)   38196743704 (~38.20MB)
TPS                    264.18                   239.34
Latency average        60.541 ms                66.822 ms
Latency stddev         126.567 ms               130.434 ms
Total writes to disk   145045.310 MB            192357.250 MB
Runtime                15141.0 s                16712.0 s

Server specifications:
Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos
RAM: 32GB
Disk: HDD 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm

Thank you,
Rahila Syed
Re: [HACKERS] Compression of full-page-writes
On Sat, Jan 3, 2015 at 1:52 AM, Bruce Momjian br...@momjian.us wrote: On Fri, Jan 2, 2015 at 10:15:57AM -0600, k...@rice.edu wrote: On Fri, Jan 02, 2015 at 01:01:06PM +0100, Andres Freund wrote: On 2014-12-31 16:09:31 -0500, Bruce Momjian wrote: I still don't understand the value of adding WAL compression, given the high CPU usage and minimal performance improvement. The only big advantage is WAL storage, but again, why not just compress the WAL file when archiving. before: pg_xlog is 800GB after: pg_xlog is 600GB. I'm damned sure that many people would be happy with that, even if the *per backend* overhead is a bit higher. And no, compression of archives when archiving helps *zap* with that (streaming, wal_keep_segments, checkpoint_timeout). As discussed before. Greetings, Andres Freund +1 On an I/O constrained system assuming 50:50 table:WAL I/O, in the case above you can process 100GB of transaction data at the cost of a bit more CPU. OK, so given your stats, the feature gives a 12.5% reduction in I/O. If that is significant, shouldn't we see a performance improvement? If we don't see a performance improvement, is I/O reduction worthwhile? Is it valuable in that it gives non-database applications more I/O to use? Is that all? I suggest we at least document that this feature is mostly useful for I/O reduction, and maybe say CPU usage and performance might be negatively impacted. OK, here is the email I remember from Fujii Masao in this same thread that showed a performance improvement for WAL compression: http://www.postgresql.org/message-id/CAHGQGwGqG8e9YN0fNCUZqTTT=hnr7ly516kft5ffqf4pp1q...@mail.gmail.com Why are we not seeing the 33% compression and 15% performance improvement he saw? Because the benchmarks I and Michael used are very different. I just used pgbench, but he used his simple test SQLs (see http://www.postgresql.org/message-id/cab7npqsc97o-ue5paxfmukwcxe_jioyxo1m4a0pmnmyqane...@mail.gmail.com). Furthermore, the data type of pgbench_accounts.filler column is character(84) and its content is empty, so pgbench_accounts is very compressible. This is one of the reasons I could see good performance improvement and high compression ratio. Regards, -- Fujii Masao
Re: [HACKERS] Compression of full-page-writes
On Sat, Jan 3, 2015 at 2:24 AM, Bruce Momjian br...@momjian.us wrote: On Fri, Jan 2, 2015 at 02:18:12PM -0300, Claudio Freire wrote: On Fri, Jan 2, 2015 at 2:11 PM, Andres Freund and...@2ndquadrant.com wrote: , I now see the compression patch as something that has negatives, so has to be set by the user, and only wins in certain cases. I am disappointed, and am trying to figure out how this became such a marginal win for 9.5. :-( I find the notion that a multi digit space reduction is a marginal win pretty ridiculous and way too narrow focused. Our WAL volume is a *significant* problem in the field. And it mostly consists out of FPWs spacewise. One thing I'd like to point out, is that in cases where WAL I/O is an issue (ie: WAL archiving), usually people already compress the segments during archiving. I know I do, and I know it's recommended on the web, and by some consultants. So, I wouldn't want this FPW compression, which is desirable in replication scenarios if you can spare the CPU cycles (because of streaming), adversely affecting WAL compression during archiving. To be specific, desirable in streaming replication scenarios that don't use SSL compression. (What percentage is that?) It is something we should mention in the docs for this feature? Even if SSL is used in replication, FPW compression is useful. It can reduce the amount of I/O in the standby side. Sometimes I've seen walreceiver's I/O had become a performance bottleneck especially in synchronous replication cases. FPW compression can be useful for those cases, for example. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Sat, Jan 3, 2015 at 1:52 AM, Bruce Momjian br...@momjian.us wrote: I suggest we at least document that this feature is mostly useful for I/O reduction, and maybe say CPU usage and performance might be negatively impacted. FWIW, that's mentioned in the documentation included in the patch.. -- Michael
Re: [HACKERS] Compression of full-page-writes
On 2014-12-31 16:09:31 -0500, Bruce Momjian wrote: I still don't understand the value of adding WAL compression, given the high CPU usage and minimal performance improvement. The only big advantage is WAL storage, but again, why not just compress the WAL file when archiving. before: pg_xlog is 800GB after: pg_xlog is 600GB. I'm damned sure that many people would be happy with that, even if the *per backend* overhead is a bit higher. And no, compression of archives when archiving helps *zap* with that (streaming, wal_keep_segments, checkpoint_timeout). As discussed before. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Jan 2, 2015 at 10:15:57AM -0600, k...@rice.edu wrote: On Fri, Jan 02, 2015 at 01:01:06PM +0100, Andres Freund wrote: On 2014-12-31 16:09:31 -0500, Bruce Momjian wrote: I still don't understand the value of adding WAL compression, given the high CPU usage and minimal performance improvement. The only big advantage is WAL storage, but again, why not just compress the WAL file when archiving. before: pg_xlog is 800GB after: pg_xlog is 600GB. I'm damned sure that many people would be happy with that, even if the *per backend* overhead is a bit higher. And no, compression of archives when archiving helps *zap* with that (streaming, wal_keep_segments, checkpoint_timeout). As discussed before. Greetings, Andres Freund +1 On an I/O constrained system assuming 50:50 table:WAL I/O, in the case above you can process 100GB of transaction data at the cost of a bit more CPU. OK, so given your stats, the feature gives a 12.5% reduction in I/O. If that is significant, shouldn't we see a performance improvement? If we don't see a performance improvement, is I/O reduction worthwhile? Is it valuable in that it gives non-database applications more I/O to use? Is that all? I suggest we at least document that this feature is mostly useful for I/O reduction, and maybe say CPU usage and performance might be negatively impacted. OK, here is the email I remember from Fujii Masao in this same thread that showed a performance improvement for WAL compression: http://www.postgresql.org/message-id/CAHGQGwGqG8e9YN0fNCUZqTTT=hnr7ly516kft5ffqf4pp1q...@mail.gmail.com Why are we not seeing the 33% compression and 15% performance improvement he saw? What am I missing here? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] Compression of full-page-writes
On 2015-01-02 12:06:33 -0500, Bruce Momjian wrote: On Fri, Jan 2, 2015 at 05:55:52PM +0100, Andres Freund wrote: On 2015-01-02 11:52:42 -0500, Bruce Momjian wrote: Why are we not seeing the 33% compression and 15% performance improvement he saw? What am I missing here? To see performance improvements something needs to be the bottleneck. If WAL writes/flushes aren't that in the tested scenario, you won't see a performance benefit. Amdahl's law and all that. I don't understand your negativity about the topic. I remember the initial post from Masao in August 2013 showing a performance boost, so I assumed, while we had the concurrent WAL insert performance improvement in 9.4, this was going to be our 9.5 WAL improvement. I don't think it makes sense to compare features/improvements that way. While the WAL insert performance improvement required no tuning and was never a negative It's actually a negative in some cases. , I now see the compression patch as something that has negatives, so has to be set by the user, and only wins in certain cases. I am disappointed, and am trying to figure out how this became such a marginal win for 9.5. :-( I find the notion that a multi digit space reduction is a marginal win pretty ridiculous and way too narrow focused. Our WAL volume is a *significant* problem in the field. And it mostly consists out of FPWs spacewise. My negativity is not that I don't want it, but I want to understand why it isn't better than I remembered. You are basically telling me it was always a marginal win. :-( Boohoo! No, I didn't. I told you that *IN ONE BENCHMARK* wal writes apparently are not the bottleneck. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Jan 2, 2015 at 2:11 PM, Andres Freund and...@2ndquadrant.com wrote: , I now see the compression patch as something that has negatives, so has to be set by the user, and only wins in certain cases. I am disappointed, and am trying to figure out how this became such a marginal win for 9.5. :-( I find the notion that a multi digit space reduction is a marginal win pretty ridiculous and way too narrow focused. Our WAL volume is a *significant* problem in the field. And it mostly consists out of FPWs spacewise. One thing I'd like to point out, is that in cases where WAL I/O is an issue (ie: WAL archiving), usually people already compress the segments during archiving. I know I do, and I know it's recommended on the web, and by some consultants. So, I wouldn't want this FPW compression, which is desirable in replication scenarios if you can spare the CPU cycles (because of streaming), adversely affecting WAL compression during archiving. Has anyone tested the compressability of WAL segments with FPW compression on? AFAIK, both pglz and lz4 output should still be compressible with deflate, but I've never tried. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
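[Editorial note, not from the thread] One rough way to answer Claudio's question is to run a WAL segment produced with FPW compression enabled through deflate and compare the ratio against a segment produced without it. The standalone sketch below assumes zlib is installed; the segment path is only a placeholder.

/*
 * Rough sketch: report how well a WAL segment still compresses with
 * zlib's deflate.  Build with: cc wal_ratio.c -o wal_ratio -lz
 */
#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

int
main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : "pg_xlog/000000010000000000000001";
    FILE       *f = fopen(path, "rb");

    if (f == NULL)
    {
        perror("fopen");
        return 1;
    }

    /* Read the whole segment (16MB by default) into memory. */
    fseek(f, 0, SEEK_END);
    long        srclen = ftell(f);
    fseek(f, 0, SEEK_SET);
    unsigned char *src = malloc(srclen);
    if (fread(src, 1, srclen, f) != (size_t) srclen)
    {
        fprintf(stderr, "short read\n");
        return 1;
    }
    fclose(f);

    /* Compress with deflate at the default level and report the ratio. */
    uLongf      dstlen = compressBound(srclen);
    unsigned char *dst = malloc(dstlen);
    if (compress2(dst, &dstlen, src, srclen, Z_DEFAULT_COMPRESSION) != Z_OK)
    {
        fprintf(stderr, "compress2 failed\n");
        return 1;
    }
    printf("%s: %ld -> %lu bytes (%.1f%%)\n",
           path, srclen, (unsigned long) dstlen, 100.0 * dstlen / srclen);
    free(src);
    free(dst);
    return 0;
}

Running it on segments archived with and without wal_compression would show whether the pglz-compressed FPWs leave much for the archiver's compressor to squeeze out.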
Re: [HACKERS] Compression of full-page-writes
On Fri, Jan 2, 2015 at 05:55:52PM +0100, Andres Freund wrote: On 2015-01-02 11:52:42 -0500, Bruce Momjian wrote: Why are we not seeing the 33% compression and 15% performance improvement he saw? What am I missing here? To see performance improvements something needs to be the bottleneck. If WAL writes/flushes aren't that in the tested scenario, you won't see a performance benefit. Amdahl's law and all that. I don't understand your negativity about the topic. I remember the initial post from Masao in August 2013 showing a performance boost, so I assumed, while we had the concurrent WAL insert performance improvement in 9.4, this was going to be our 9.5 WAL improvement. While the WAL insert performance improvement required no tuning and was never a negative, I now see the compression patch as something that has negatives, so has to be set by the user, and only wins in certain cases. I am disappointed, and am trying to figure out how this became such a marginal win for 9.5. :-( My negativity is not that I don't want it, but I want to understand why it isn't better than I remembered. You are basically telling me it was always a marginal win. :-( Boohoo! -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Jan 02, 2015 at 01:01:06PM +0100, Andres Freund wrote: On 2014-12-31 16:09:31 -0500, Bruce Momjian wrote: I still don't understand the value of adding WAL compression, given the high CPU usage and minimal performance improvement. The only big advantage is WAL storage, but again, why not just compress the WAL file when archiving. before: pg_xlog is 800GB after: pg_xlog is 600GB. I'm damned sure that many people would be happy with that, even if the *per backend* overhead is a bit higher. And no, compression of archives when archiving helps *zap* with that (streaming, wal_keep_segments, checkpoint_timeout). As discussed before. Greetings, Andres Freund +1 On an I/O constrained system assuming 50:50 table:WAL I/O, in the case above you can process 100GB of transaction data at the cost of a bit more CPU. Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Jan 2, 2015 at 06:11:29PM +0100, Andres Freund wrote: My negativity is not that I don't want it, but I want to understand why it isn't better than I remembered. You are basically telling me it was always a marginal win. :-( Boohoo! No, I didn't. I told you that *IN ONE BENCHMARK* wal writes apparently are not the bottleneck. What I have not seen is any recent benchmarks that show it as a win, while the original email did, so I was confused. I tried to explain exactly how I viewed things --- you can not like it, but that is how I look for upcoming features, and where we should focus our time. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 2015-01-02 11:52:42 -0500, Bruce Momjian wrote: Why are we not seeing the 33% compression and 15% performance improvement he saw? What am I missing here? To see performance improvements something needs to be the bottleneck. If WAL writes/flushes aren't that in the tested scenario, you won't see a performance benefit. Amdahl's law and all that. I don't understand your negativity about the topic. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Jan 2, 2015 at 02:18:12PM -0300, Claudio Freire wrote: On Fri, Jan 2, 2015 at 2:11 PM, Andres Freund and...@2ndquadrant.com wrote: , I now see the compression patch as something that has negatives, so has to be set by the user, and only wins in certain cases. I am disappointed, and am trying to figure out how this became such a marginal win for 9.5. :-( I find the notion that a multi digit space reduction is a marginal win pretty ridiculous and way too narrow focused. Our WAL volume is a *significant* problem in the field. And it mostly consists out of FPWs spacewise. One thing I'd like to point out, is that in cases where WAL I/O is an issue (ie: WAL archiving), usually people already compress the segments during archiving. I know I do, and I know it's recommended on the web, and by some consultants. So, I wouldn't want this FPW compression, which is desirable in replication scenarios if you can spare the CPU cycles (because of streaming), adversely affecting WAL compression during archiving. To be specific, desirable in streaming replication scenarios that don't use SSL compression. (What percentage is that?) It is something we should mention in the docs for this feature? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
* Bruce Momjian (br...@momjian.us) wrote: To be specific, desirable in streaming replication scenarios that don't use SSL compression. (What percentage is that?) It is something we should mention in the docs for this feature? Considering how painful the SSL renegotiation problems were and the CPU overhead, I'd be surprised if many high-write-volume replication environments use SSL at all. There's a lot of win to be had from compression of FPWs, but it's like most compression in that there are trade-offs to be had and environments where it won't be a win, but I believe those cases to be the minority. Thanks, Stephen
Re: [HACKERS] Compression of full-page-writes
On Tue, Dec 30, 2014 at 01:27:44PM +0100, Andres Freund wrote: On 2014-12-30 21:23:38 +0900, Michael Paquier wrote: On Tue, Dec 30, 2014 at 6:21 PM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2013-08-30 at 09:57 +0300, Heikki Linnakangas wrote: Speeding up the CRC calculation obviously won't help with the WAL volume per se, ie. you still generate the same amount of WAL that needs to be shipped in replication. But then again, if all you want to do is to reduce the volume, you could just compress the whole WAL stream. Was this point addressed? Compressing the whole record is interesting for multi-insert records, but as we need to keep the compressed data in a pre-allocated buffer until WAL is written, we can only compress things within a given size range. The point is, even if we define a lower bound, compression is going to perform badly with an application that generates for example many small records that are just higher than the lower bound... Unsurprisingly for small records this was bad: So why are you bringing it up? That's not an argument for anything, except not doing it in such a simplistic way. I still don't understand the value of adding WAL compression, given the high CPU usage and minimal performance improvement. The only big advantage is WAL storage, but again, why not just compress the WAL file when archiving. I thought we used to see huge performance benefits from WAL compression, but not any more? Has the UPDATE WAL compression removed that benefit? Am I missing something? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Thu, Jan 1, 2015 at 2:39 AM, Bruce Momjian br...@momjian.us wrote: On Tue, Dec 30, 2014 at 01:27:44PM +0100, Andres Freund wrote: On 2014-12-30 21:23:38 +0900, Michael Paquier wrote: On Tue, Dec 30, 2014 at 6:21 PM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2013-08-30 at 09:57 +0300, Heikki Linnakangas wrote: Speeding up the CRC calculation obviously won't help with the WAL volume per se, ie. you still generate the same amount of WAL that needs to be shipped in replication. But then again, if all you want to do is to reduce the volume, you could just compress the whole WAL stream. Was this point addressed? Compressing the whole record is interesting for multi-insert records, but as we need to keep the compressed data in a pre-allocated buffer until WAL is written, we can only compress things within a given size range. The point is, even if we define a lower bound, compression is going to perform badly with an application that generates for example many small records that are just higher than the lower bound... Unsurprisingly for small records this was bad: So why are you bringing it up? That's not an argument for anything, except not doing it in such a simplistic way. I still don't understand the value of adding WAL compression, given the high CPU usage and minimal performance improvement. The only big advantage is WAL storage, but again, why not just compress the WAL file when archiving. I thought we used to see huge performance benefits from WAL compression, but not any more? I think there can be a performance benefit for the cases when the data is compressible, but it would be a loss otherwise. The main thing is that the current compression algorithm (pg_lz) used is not so favorable for non-compressible data. Has the UPDATE WAL compression removed that benefit? Good question, I think there might be some impact due to that, but in general for page level compression still there will be much more to compress. In general, I think this idea has merit with respect to compressible data, and to save for the cases where it will not perform well, there is an on/off switch for this feature and in future if PostgreSQL has some better compression method, we can consider the same as well. One thing that we need to think is whether users can decide with ease when to enable this global switch. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Compression of full-page-writes
On Thu, Jan 1, 2015 at 2:10 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Thu, Jan 1, 2015 at 2:39 AM, Bruce Momjian br...@momjian.us wrote: So why are you bringing it up? That's not an argument for anything, except not doing it in such a simplistic way. I still don't understand the value of adding WAL compression, given the high CPU usage and minimal performance improvement. The only big advantage is WAL storage, but again, why not just compress the WAL file when archiving. When doing some tests with pgbench for a fixed number of transactions, I also noticed a reduction in replay time as well, see for example some results here: http://www.postgresql.org/message-id/CAB7nPqRv6RaSx7hTnp=g3dyqou++fel0uioyqpllbdbhayb...@mail.gmail.com I thought we used to see huge performance benefits from WAL compression, but not any more? I think there can be a performance benefit for the cases when the data is compressible, but it would be a loss otherwise. The main thing is that the current compression algorithm (pg_lz) used is not so favorable for non-compressible data. Yes definitely. Switching to a different algorithm would be the next step forward. We have been discussing mainly lz4, which has a friendly license; I think that it would be worth studying other things as well once we have all the infrastructure in place. Has the UPDATE WAL compression removed that benefit? Good question, I think there might be some impact due to that, but in general for page level compression still there will be much more to compress. That may be a good thing to put a number on. We could try to patch a build with a revert of a3115f0d and measure the difference in WAL size that it creates. Thoughts? In general, I think this idea has merit with respect to compressible data, and to save for the cases where it will not perform well, there is an on/off switch for this feature and in future if PostgreSQL has some better compression method, we can consider the same as well. One thing that we need to think is whether users can decide with ease when to enable this global switch. The opposite is true as well, we shouldn't force the user to have data compressed even if the switch is disabled. -- Michael
Re: [HACKERS] Compression of full-page-writes
On Fri, 2013-08-30 at 09:57 +0300, Heikki Linnakangas wrote: Speeding up the CRC calculation obviously won't help with the WAL volume per se, ie. you still generate the same amount of WAL that needs to be shipped in replication. But then again, if all you want to do is to reduce the volume, you could just compress the whole WAL stream. Was this point addressed? How much benefit is there to compressing the data before it goes into the WAL stream versus after? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Tue, Dec 30, 2014 at 6:21 PM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2013-08-30 at 09:57 +0300, Heikki Linnakangas wrote: Speeding up the CRC calculation obviously won't help with the WAL volume per se, ie. you still generate the same amount of WAL that needs to be shipped in replication. But then again, if all you want to do is to reduce the volume, you could just compress the whole WAL stream. Was this point addressed? Compressing the whole record is interesting for multi-insert records, but as we need to keep the compressed data in a pre-allocated buffer until WAL is written, we can only compress things within a given size range. The point is, even if we define a lower bound, compression is going to perform badly with an application that generates for example many small records that are just higher than the lower bound... Unsurprisingly for small records this was bad: http://www.postgresql.org/message-id/cab7npqsc97o-ue5paxfmukwcxe_jioyxo1m4a0pmnmyqane...@mail.gmail.com Now are there still people interested in seeing the amount of time spent in the CRC calculation depending on the record length? Isn't that worth speaking on the CRC thread btw? I'd imagine that it would be simple to evaluate the effect of the CRC calculation within a single process using a bit of getrusage. How much benefit is there to compressing the data before it goes into the WAL stream versus after? Here is a good list: http://www.postgresql.org/message-id/20141212145330.gk31...@awork2.anarazel.de Regards, -- Michael
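[Editorial note, not from the thread] A minimal standalone illustration of the getrusage() idea mentioned above: measure user CPU time spent looping over a record-sized buffer. The byte-sum loop is only a stand-in for the real WAL CRC calculation, and the buffer size and iteration count are arbitrary.

/*
 * Sketch: time a per-record computation with getrusage(RUSAGE_SELF).
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <sys/resource.h>

static double
user_seconds(void)
{
    struct rusage ru;

    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6;
}

int
main(void)
{
    const size_t reclen = 8192;     /* pretend record length */
    const int   iters = 100000;
    char       *buf = malloc(reclen);
    unsigned long sum = 0;

    memset(buf, 'x', reclen);

    double      start = user_seconds();
    for (int i = 0; i < iters; i++)
        for (size_t j = 0; j < reclen; j++)
            sum += (unsigned char) buf[j];      /* stand-in for the CRC */
    double      elapsed = user_seconds() - start;

    printf("processed %d x %zu bytes in %.3f s of user CPU (sum=%lu)\n",
           iters, reclen, elapsed, sum);
    free(buf);
    return 0;
}

Varying reclen would give a rough per-record-length CPU profile of whatever computation is dropped into the inner loop.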
Re: [HACKERS] Compression of full-page-writes
On 2014-12-30 21:23:38 +0900, Michael Paquier wrote: On Tue, Dec 30, 2014 at 6:21 PM, Jeff Davis pg...@j-davis.com wrote: On Fri, 2013-08-30 at 09:57 +0300, Heikki Linnakangas wrote: Speeding up the CRC calculation obviously won't help with the WAL volume per se, ie. you still generate the same amount of WAL that needs to be shipped in replication. But then again, if all you want to do is to reduce the volume, you could just compress the whole WAL stream. Was this point addressed? Compressing the whole record is interesting for multi-insert records, but as we need to keep the compressed data in a pre-allocated buffer until WAL is written, we can only compress things within a given size range. The point is, even if we define a lower bound, compression is going to perform badly with an application that generates for example many small records that are just higher than the lower bound... Unsurprisingly for small records this was bad: So why are you bringing it up? That's not an argument for anything, except not doing it in such a simplistic way. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Tue, Dec 9, 2014 at 10:45 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Mon, Dec 8, 2014 at 3:17 PM, Simon Riggs si...@2ndquadrant.com wrote: On 8 December 2014 at 11:46, Michael Paquier michael.paqu...@gmail.com wrote: I don't really like those new names, but I'd prefer wal_compression_level if we go down that road with 'none' as default value. We may still decide in the future to support compression at the record level instead of context level, particularly if we have an API able to do palloc_return_null_at_oom, so the idea of WAL compression is not related only to FPIs IMHO. We may yet decide, but the pglz implementation is not effective on smaller record lengths. Nor has any testing been done to show that is even desirable. It's even much worse for non-compressible (or less-compressible) WAL data. To check the actual effect, I have ran few tests with the patch (0001-Move-pg_lzcompress.c-to-src-common, 0002-Support-compression-for-full-page-writes-in-WAL) and the data shows that for worst case (9 short and 1 long, short changed) there is dip of ~56% in runtime where the compression is less (~20%) and a ~35% of dip in runtime for the small record size (two short fields, no change) where compression is ~28%. For best case (one short and one long field, no change), the compression is more than 2 times and there is an improvement in runtime of ~4%. Note that in worst case, I am using random string due to which the compression is less and it seems to me that worst is not by far the worst because we see some compression in that case as well. I think this might not be the best test to measure the effect of this patch, but still it has data for various compression ratio's which could indicate the value of this patch. Test case used to take below data is attached with this mail. Seeing this data, one way to mitigate the cases where it can cause performance impact is to have a table level compression flag which we have discussed last year during development of WAL compression for Update operation as well. 
Performance Data -
m/c configuration -
IBM POWER-8
24 cores, 192 hardware threads
RAM = 492GB

Non-default parameters -
checkpoint_segments - 256
checkpoint_timeout - 15 min

wal_compression=off

testname | wal_generated | duration
---------------------------------------------+---------------+------------------
two short fields, no change                  |     540055720 | 12.1288201808929
two short fields, no change                  |     542911816 | 11.8804960250854
two short fields, no change                  |     540063400 | 11.7856659889221
two short fields, one changed                |     540055792 | 11.9835240840912
two short fields, one changed                |     540056624 | 11.9008920192719
two short fields, one changed                |     540059560 | 12.064150094986
two short fields, both changed               |     581813832 | 10.290940847
two short fields, both changed               |     579823384 | 12.4431331157684
two short fields, both changed               |     579896448 | 12.5214929580688
one short and one long field, no change      |     320058048 | 5.04950094223022
one short and one long field, no change      |     321150040 | 5.24907302856445
one short and one long field, no change      |     320055072 | 5.07368278503418
ten tiny fields, all changed                 |     620765680 | 14.2868521213531
ten tiny fields, all changed                 |     620681176 | 14.2786719799042
ten tiny fields, all changed                 |     620684600 | 14.21634316
hundred tiny fields, all changed             |     306317512 | 6.98173499107361
hundred tiny fields, all changed             |     308039000 | 7.03955984115601
hundred tiny fields, all changed             |     307117016 | 7.11708188056946
hundred tiny fields, half changed            |     306483392 | 7.06978106498718
hundred tiny fields, half changed            |     309336056 | 7.07678198814392
hundred tiny fields, half changed            |     306317432 | 7.02817606925964
hundred tiny fields, half nulled             |     219931376 | 6.29952597618103
hundred tiny fields, half nulled             |     221001240 | 6.34559392929077
hundred tiny fields, half nulled             |     219933072 | 6.36759996414185
9 short and 1 long, short changed            |     253761248 | 4.37235498428345
9 short and 1 long, short changed            |     253763040 | 4.34973502159119
9 short and 1 long, short changed            |     253760280 | 4.34902000427246
(27 rows)

wal_compression = on

testname | wal_generated | duration
---------------------------------------------+---------------+------------------
two short fields, no change                  |     420569264 | 18.1419389247894
two short fields, no change                  |     423401960 | 16.0569458007812
two short fields, no change                  |     420568240 | 15.9060699939728
two short fields, one changed                |     420769880 | 15.4179458618164
Re: [HACKERS] Compression of full-page-writes
On Thu, Dec 11, 2014 at 10:33 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Tue, Dec 9, 2014 at 4:09 AM, Robert Haas robertmh...@gmail.com wrote: On Sun, Dec 7, 2014 at 9:30 PM, Simon Riggs si...@2ndquadrant.com wrote: * parameter should be SUSET - it doesn't *need* to be set only at server start since all records are independent of each other Why not USERSET? There's no point in trying to prohibit users from doing things that will cause bad performance because they can do that anyway. Using SUSET or USERSET has a small memory cost: we should unconditionally palloc the buffers containing the compressed data until WAL is written out. We could always call an equivalent of InitXLogInsert when this parameter is updated but that would be bug-prone IMO and it does not plead in favor of code simplicity. I don't understand what you're saying here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Dec 12, 2014 at 10:23 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Dec 11, 2014 at 10:33 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Tue, Dec 9, 2014 at 4:09 AM, Robert Haas robertmh...@gmail.com wrote: On Sun, Dec 7, 2014 at 9:30 PM, Simon Riggs si...@2ndquadrant.com wrote: * parameter should be SUSET - it doesn't *need* to be set only at server start since all records are independent of each other Why not USERSET? There's no point in trying to prohibit users from doing things that will cause bad performance because they can do that anyway. Using SUSET or USERSET has a small memory cost: we should unconditionally palloc the buffers containing the compressed data until WAL is written out. We could always call an equivalent of InitXLogInsert when this parameter is updated but that would be bug-prone IMO and it does not plead in favor of code simplicity. I don't understand what you're saying here. I just meant that the scratch buffers used to store temporarily the compressed and uncompressed data should be palloc'd all the time, even if the switch is off. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Dec 12, 2014 at 11:32 PM, Robert Haas robertmh...@gmail.com wrote: On Fri, Dec 12, 2014 at 9:15 AM, Michael Paquier michael.paqu...@gmail.com wrote: I just meant that the scratch buffers used to store temporarily the compressed and uncompressed data should be palloc'd all the time, even if the switch is off. If they're fixed size, you can just put them on the heap as static globals. static char space_for_stuff[65536]; Well sure :) Or whatever you need. I don't think that's a cost worth caring about. OK, I thought it was. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Dec 12, 2014 at 9:15 AM, Michael Paquier michael.paqu...@gmail.com wrote: I just meant that the scratch buffers used to store temporarily the compressed and uncompressed data should be palloc'd all the time, even if the switch is off. If they're fixed size, you can just put them on the heap as static globals. static char space_for_stuff[65536]; Or whatever you need. I don't think that's a cost worth caring about. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Dec 12, 2014 at 9:34 AM, Michael Paquier michael.paqu...@gmail.com wrote: I don't think that's a cost worth caring about. OK, I thought it was. Space on the heap that never gets used is basically free. The OS won't actually allocate physical memory unless the pages are actually accessed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Tue, Dec 9, 2014 at 4:09 AM, Robert Haas robertmh...@gmail.com wrote: On Sun, Dec 7, 2014 at 9:30 PM, Simon Riggs si...@2ndquadrant.com wrote: * parameter should be SUSET - it doesn't *need* to be set only at server start since all records are independent of each other Why not USERSET? There's no point in trying to prohibit users from doing things that will cause bad performance because they can do that anyway. Using SUSET or USERSET has a small memory cost: we should unconditionally palloc the buffers containing the compressed data until WAL is written out. We could always call an equivalent of InitXLogInsert when this parameter is updated but that would be bug-prone IMO and it does not plead in favor of code simplicity. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Tue, Dec 9, 2014 at 2:15 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Mon, Dec 8, 2014 at 3:17 PM, Simon Riggs si...@2ndquadrant.com wrote: On 8 December 2014 at 11:46, Michael Paquier michael.paqu...@gmail.com wrote: I don't really like those new names, but I'd prefer wal_compression_level if we go down that road with 'none' as default value. We may still decide in the future to support compression at the record level instead of context level, particularly if we have an API able to do palloc_return_null_at_oom, so the idea of WAL compression is not related only to FPIs IMHO. We may yet decide, but the pglz implementation is not effective on smaller record lengths. Nor has any testing been done to show that is even desirable. It's even much worse for non-compressible (or less-compressible) WAL data. I am not clear here that how a simple on/off switch could address such cases because the data could be sometimes dependent on which table user is doing operations (means schema or data in some tables are more prone for compression in which case it can give us benefits). I think may be we should think something on lines what Robert has touched in one of his e-mails (context-aware compression strategy).

So, I have been doing some measurements using the patch compressing FPWs and had a look at the transaction latency using pgbench -P 1 with those parameters on my laptop:
shared_buffers=512MB
checkpoint_segments=1024
checkpoint_timeout = 5min
fsync=off

A checkpoint was executed just before a 20-min run, so 3 checkpoints at least kicked in during each measurement, roughly that:
pgbench -i -s 100
psql -c 'checkpoint;'
date ~/report.txt
pgbench -P 1 -c 16 -j 16 -T 1200 2 ~/report.txt

1) Compression of FPW:
latency average: 9.007 ms
latency stddev: 25.527 ms
tps = 1775.614812 (including connections establishing)

Here is the latency when a checkpoint that wrote 28% of the buffers begun (570s):
progress: 568.0 s, 2000.9 tps, lat 8.098 ms stddev 23.799
progress: 569.0 s, 1873.9 tps, lat 8.442 ms stddev 22.837
progress: 570.2 s, 1622.4 tps, lat 9.533 ms stddev 24.027
progress: 571.0 s, 1633.4 tps, lat 10.302 ms stddev 27.331
progress: 572.1 s, 1588.4 tps, lat 9.908 ms stddev 25.728
progress: 573.1 s, 1579.3 tps, lat 10.186 ms stddev 25.782

All the other checkpoints have the same profile, giving that the transaction latency increases by roughly 1.5~2ms to 10.5~11ms.
2) No compression of FPW:
latency average: 8.507 ms
latency stddev: 25.052 ms
tps = 1870.368880 (including connections establishing)

Here is the latency for a checkpoint that wrote 28% of buffers:
progress: 297.1 s, 1997.9 tps, lat 8.112 ms stddev 24.288
progress: 298.1 s, 1990.4 tps, lat 7.806 ms stddev 21.849
progress: 299.0 s, 1986.9 tps, lat 8.366 ms stddev 22.896
progress: 300.0 s, 1648.1 tps, lat 9.728 ms stddev 25.811
progress: 301.0 s, 1806.5 tps, lat 8.646 ms stddev 24.187
progress: 302.1 s, 1810.9 tps, lat 8.960 ms stddev 24.201
progress: 303.0 s, 1831.9 tps, lat 8.623 ms stddev 23.199
progress: 304.0 s, 1951.2 tps, lat 8.149 ms stddev 22.871

Here is another one that began around 600s (20% of buffers):
progress: 594.0 s, 1738.8 tps, lat 9.135 ms stddev 25.140
progress: 595.0 s, 893.2 tps, lat 18.153 ms stddev 67.186
progress: 596.1 s, 1671.0 tps, lat 9.470 ms stddev 25.691
progress: 597.1 s, 1580.3 tps, lat 10.189 ms stddev 26.430
progress: 598.0 s, 1570.9 tps, lat 10.089 ms stddev 23.684
progress: 599.2 s, 1657.0 tps, lat 9.385 ms stddev 23.794
progress: 600.0 s, 1665.5 tps, lat 10.280 ms stddev 25.857
progress: 601.1 s, 1571.7 tps, lat 9.851 ms stddev 25.341
progress: 602.1 s, 1577.7 tps, lat 10.056 ms stddev 25.331
progress: 603.0 s, 1600.1 tps, lat 10.329 ms stddev 25.429
progress: 604.0 s, 1593.8 tps, lat 10.004 ms stddev 26.816

Not sure what happened here, the burst has been a bit higher. However roughly the latency was never higher than 10.5ms for the non-compression case. With those measurements I am getting more or less 1ms of latency difference between the compression and non-compression cases when checkpoint show up. Note that fsync is disabled. Also, I am still planning to hack a patch able to compress directly records with a scratch buffer up 32k and see the difference with what I got here. For now, the results are attached. Comments welcome. -- Michael
Re: [HACKERS] Compression of full-page-writes
On 8 December 2014 at 11:46, Michael Paquier michael.paqu...@gmail.com wrote: * ideally we'd like to be able to differentiate the types of usage. which then allows the user to control the level of compression depending upon the type of action. My first cut at what those settings should be are ALL LOGICAL PHYSICAL VACUUM. VACUUM - only compress while running vacuum commands PHYSICAL - only compress while running physical DDL commands (ALTER TABLE set tablespace, CREATE INDEX), i.e. those that wouldn't typically be used for logical decoding LOGICAL - compress FPIs for record types that change tables ALL - all user commands (each level includes all prior levels) Well, that's clearly an optimization so I don't think this should be done for a first shot but those are interesting fresh ideas. It is important that we offer an option that retains user performance. I don't see that as an optimisation, but as an essential item. The current feature will reduce WAL volume, at the expense of foreground user performance. Worse, that will all happen around time of new checkpoint, so I expect this will have a large impact. Presumably testing has been done to show the impact on user response times? If not, we need that. The most important distinction is between foreground and background tasks. If you think the above is too complex, then we should make the parameter into a USET, but set it to on in VACUUM, CLUSTER and autovacuum. Technically speaking, note that we would need to support such things with a new API to switch a new context flag in registered_buffers of xloginsert.c for each block, and decide if the block is compressed based on this context flag, and the compression level wanted. * name should not be wal_compression - we're not compressing all wal records, just fpis. There is no evidence that we even want to compress other record types, nor that our compression mechanism is effective at doing so. Simple = keep name as compress_full_page_writes Though perhaps we should have it called wal_compression_level I don't really like those new names, but I'd prefer wal_compression_level if we go down that road with 'none' as default value. We may still decide in the future to support compression at the record level instead of context level, particularly if we have an API able to do palloc_return_null_at_oom, so the idea of WAL compression is not related only to FPIs IMHO. We may yet decide, but the pglz implementation is not effective on smaller record lengths. Nor has any testing been done to show that is even desirable. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
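[Editorial note, not from the thread] As an illustration of what a multi-level setting like the one Simon proposes could look like, the sketch below follows the config_enum_entry pattern used for existing enum GUCs. The parameter name, enum symbols and helper function are made up for the example, the guc.c ConfigureNamesEnum wiring is omitted, and nothing here is part of the posted patch.

/* Backend-code sketch: assumes postgres.h and utils/guc.h are included. */

typedef enum WalCompressionLevel
{
    WAL_COMPRESSION_NONE = 0,
    WAL_COMPRESSION_VACUUM,     /* vacuum-driven FPWs only */
    WAL_COMPRESSION_PHYSICAL,   /* + physical DDL such as CREATE INDEX */
    WAL_COMPRESSION_LOGICAL,    /* + record types that change tables */
    WAL_COMPRESSION_ALL         /* all user commands */
} WalCompressionLevel;

static const struct config_enum_entry wal_compression_level_options[] = {
    {"none", WAL_COMPRESSION_NONE, false},
    {"vacuum", WAL_COMPRESSION_VACUUM, false},
    {"physical", WAL_COMPRESSION_PHYSICAL, false},
    {"logical", WAL_COMPRESSION_LOGICAL, false},
    {"all", WAL_COMPRESSION_ALL, false},
    {NULL, 0, false}
};

/*
 * Each level includes all prior levels, so the FPW assembly path would
 * only need an ordered comparison against the category of the current
 * operation.
 */
static inline bool
fpw_compression_enabled(WalCompressionLevel setting, WalCompressionLevel action)
{
    return setting != WAL_COMPRESSION_NONE && action <= setting;
}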
Re: [HACKERS] Compression of full-page-writes
On Sun, Dec 7, 2014 at 9:30 PM, Simon Riggs si...@2ndquadrant.com wrote: * parameter should be SUSET - it doesn't *need* to be set only at server start since all records are independent of each other Why not USERSET? There's no point in trying to prohibit users from doing things that will cause bad performance because they can do that anyway. * ideally we'd like to be able to differentiate the types of usage. which then allows the user to control the level of compression depending upon the type of action. My first cut at what those settings should be are ALL LOGICAL PHYSICAL VACUUM. VACUUM - only compress while running vacuum commands PHYSICAL - only compress while running physical DDL commands (ALTER TABLE set tablespace, CREATE INDEX), i.e. those that wouldn't typically be used for logical decoding LOGICAL - compress FPIs for record types that change tables ALL - all user commands (each level includes all prior levels) Interesting idea, but what evidence do we have that a simple on/off switch isn't good enough? * name should not be wal_compression - we're not compressing all wal records, just fpis. There is no evidence that we even want to compress other record types, nor that our compression mechanism is effective at doing so. Simple = keep name as compress_full_page_writes Quite right. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 2014-12-08 14:09:19 -0500, Robert Haas wrote: records, just fpis. There is no evidence that we even want to compress other record types, nor that our compression mechanism is effective at doing so. Simple = keep name as compress_full_page_writes Quite right. I don't really agree with this. There's lots of records which can be quite big where compression could help a fair bit. Most prominently HEAP2_MULTI_INSERT + INIT_PAGE. During initial COPY that's the biggest chunk of WAL. And these are big and repetitive enough that compression is very likely to be beneficial. I still think that just compressing the whole record if it's above a certain size is going to be better than compressing individual parts. Michael argued that that'd be complicated because of the varying size of the required 'scratch space'. I don't buy that argument though. It's easy enough to simply compress all the data in some fixed chunk size. I.e. always compress 64kb in one go. If there's more compress that independently. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
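[Editorial note, not from the thread] A sketch of the fixed-chunk scheme described above, illustrative only: try_compress_chunk() is a hypothetical stand-in for whatever compressor would actually be used (pglz, lz4, ...), assumed to return the compressed length or -1 when the data did not shrink or fit.

#include <stdint.h>
#include <string.h>

#define COMPRESS_CHUNK_SIZE (64 * 1024)

/* hypothetical compressor stand-in */
extern int32_t try_compress_chunk(const char *src, int32_t srclen,
                                  char *dst, int32_t dstcap);

/*
 * Compress 'len' bytes from 'data' chunk by chunk into 'out' (capacity
 * 'outcap').  Each chunk is prefixed with its compressed length so the
 * chunks can be decompressed independently at replay.  Returns the total
 * output size, or -1 if the result would not fit or did not compress,
 * in which case the caller falls back to the uncompressed record.
 */
static int32_t
compress_record_chunks(const char *data, int32_t len, char *out, int32_t outcap)
{
    int32_t     written = 0;

    while (len > 0)
    {
        int32_t chunk = (len > COMPRESS_CHUNK_SIZE) ? COMPRESS_CHUNK_SIZE : len;
        int32_t avail = outcap - written - (int32_t) sizeof(int32_t);

        if (avail <= 0)
            return -1;

        int32_t clen = try_compress_chunk(data, chunk,
                                          out + written + sizeof(int32_t), avail);
        if (clen < 0)
            return -1;

        memcpy(out + written, &clen, sizeof(int32_t));  /* chunk header */
        written += (int32_t) sizeof(int32_t) + clen;
        data += chunk;
        len -= chunk;
    }
    return written;
}

The point of the fixed chunk size is that the scratch buffer never has to grow with the record, which is the objection the chunking idea is meant to answer.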
Re: [HACKERS] Compression of full-page-writes
On Mon, Dec 8, 2014 at 2:21 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-12-08 14:09:19 -0500, Robert Haas wrote: records, just fpis. There is no evidence that we even want to compress other record types, nor that our compression mechanism is effective at doing so. Simple = keep name as compress_full_page_writes Quite right. I don't really agree with this. There's lots of records which can be quite big where compression could help a fair bit. Most prominently HEAP2_MULTI_INSERT + INIT_PAGE. During initial COPY that's the biggest chunk of WAL. And these are big and repetitive enough that compression is very likely to be beneficial. I still think that just compressing the whole record if it's above a certain size is going to be better than compressing individual parts. Michael argued thta that'd be complicated because of the varying size of the required 'scratch space'. I don't buy that argument though. It's easy enough to simply compress all the data in some fixed chunk size. I.e. always compress 64kb in one go. If there's more compress that independently. I agree that idea is worth considering. But I think we should decide which way is better and then do just one or the other. I can't see the point in adding wal_compress=full_pages now and then offering an alternative wal_compress=big_records in 9.5. I think it's also quite likely that there may be cases where context-aware compression strategies can be employed. For example, the prefix/suffix compression of updates that Amit did last cycle exploit the likely commonality between the old and new tuple. We might have cases like that where there are meaningful trade-offs to be made between CPU and I/O, or other reasons to have user-exposed knobs. I think we'll be much happier if those are completely separate GUCs, so we can say things like compress_gin_wal=true and compress_brin_effort=3.14 rather than trying to have a single wal_compress GUC and assuming that we can shoehorn all future needs into it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 12/08/2014 09:21 PM, Andres Freund wrote: I still think that just compressing the whole record if it's above a certain size is going to be better than compressing individual parts. Michael argued thta that'd be complicated because of the varying size of the required 'scratch space'. I don't buy that argument though. It's easy enough to simply compress all the data in some fixed chunk size. I.e. always compress 64kb in one go. If there's more compress that independently. Doing it in fixed-size chunks doesn't help - you have to hold onto the compressed data until it's written to the WAL buffers. But you could just allocate a large enough scratch buffer, and give up if it doesn't fit. If the compressed data doesn't fit in e.g. 3 * 8kb, it didn't compress very well, so there's probably no point in compressing it anyway. Now, an exception to that might be a record that contains something else than page data, like a commit record with millions of subxids, but I think we could live with not compressing those, even though it would be beneficial to do so. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
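Heikki's bounded-scratch-buffer variant is even simpler to picture. In this sketch compress_data() is a hypothetical stand-in compressor (returns the output length, or -1 if the destination is too small), and the 3 * 8kB limit is the give-up threshold he describes; none of this is actual patch code.

#define BLCKSZ					8192			/* PostgreSQL page size */
#define COMPRESS_SCRATCH_SIZE	(3 * BLCKSZ)	/* give up beyond 3 * 8kB */

extern int compress_data(const char *src, int srclen, char *dst, int dstcap);

static char compress_scratch[COMPRESS_SCRATCH_SIZE];

/* Returns 1 and sets *complen if the record compressed usefully,
 * 0 if the caller should just write it uncompressed. */
static int
try_compress_record(const char *rec, int reclen, int *complen)
{
	int		out = compress_data(rec, reclen, compress_scratch,
								COMPRESS_SCRATCH_SIZE);

	if (out < 0 || out >= reclen)
		return 0;				/* compressed poorly; keep the raw record */

	*complen = out;
	return 1;
}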
Re: [HACKERS] Compression of full-page-writes
On Tue, Dec 9, 2014 at 5:33 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 12/08/2014 09:21 PM, Andres Freund wrote: I still think that just compressing the whole record if it's above a certain size is going to be better than compressing individual parts. Michael argued thta that'd be complicated because of the varying size of the required 'scratch space'. I don't buy that argument though. It's easy enough to simply compress all the data in some fixed chunk size. I.e. always compress 64kb in one go. If there's more compress that independently. Doing it in fixed-size chunks doesn't help - you have to hold onto the compressed data until it's written to the WAL buffers. But you could just allocate a large enough scratch buffer, and give up if it doesn't fit. If the compressed data doesn't fit in e.g. 3 * 8kb, it didn't compress very well, so there's probably no point in compressing it anyway. Now, an exception to that might be a record that contains something else than page data, like a commit record with millions of subxids, but I think we could live with not compressing those, even though it would be beneficial to do so. Another thing to consider is the possibility to control at GUC level what is the maximum size of a record we allow to compress. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 9 December 2014 at 04:09, Robert Haas robertmh...@gmail.com wrote: On Sun, Dec 7, 2014 at 9:30 PM, Simon Riggs si...@2ndquadrant.com wrote: * parameter should be SUSET - it doesn't *need* to be set only at server start since all records are independent of each other Why not USERSET? There's no point in trying to prohibit users from doing things that will cause bad performance because they can do that anyway. Yes, I think USERSET would work fine for this. * ideally we'd like to be able to differentiate the types of usage. which then allows the user to control the level of compression depending upon the type of action. My first cut at what those settings should be are ALL LOGICAL PHYSICAL VACUUM. VACUUM - only compress while running vacuum commands PHYSICAL - only compress while running physical DDL commands (ALTER TABLE set tablespace, CREATE INDEX), i.e. those that wouldn't typically be used for logical decoding LOGICAL - compress FPIs for record types that change tables ALL - all user commands (each level includes all prior levels) Interesting idea, but what evidence do we have that a simple on/off switch isn't good enough? Yes, I think that was overcooked. What I'm thinking is that in the long run we might have groups of parameters attached to different types of action, so we wouldn't need, for example, two parameters for work_mem and maintenance_work_mem. We'd just have work_mem and then a scheme that has different values of work_mem for different action types. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 9 December 2014 at 04:21, Andres Freund and...@2ndquadrant.com wrote: On 2014-12-08 14:09:19 -0500, Robert Haas wrote: records, just fpis. There is no evidence that we even want to compress other record types, nor that our compression mechanism is effective at doing so. Simple = keep name as compress_full_page_writes Quite right. I don't really agree with this. There's lots of records which can be quite big where compression could help a fair bit. Most prominently HEAP2_MULTI_INSERT + INIT_PAGE. During initial COPY that's the biggest chunk of WAL. And these are big and repetitive enough that compression is very likely to be beneficial. Yes, you're right there. I was forgetting those aren't FPIs. However they are close enough that it wouldn't necessarily affect the naming of a parameter that controls such compression. I still think that just compressing the whole record if it's above a certain size is going to be better than compressing individual parts. I think it's OK to think it, but we should measure it. For now then, I remove my objection to a commit of this patch based upon parameter naming/rethinking. We have a fine tradition of changing the names after the release is mostly wrapped, so let's pick a name in a few months' time when the dust has settled on what's in. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Mon, Dec 8, 2014 at 3:17 PM, Simon Riggs si...@2ndquadrant.com wrote: On 8 December 2014 at 11:46, Michael Paquier michael.paqu...@gmail.com wrote: I don't really like those new names, but I'd prefer wal_compression_level if we go down that road with 'none' as default value. We may still decide in the future to support compression at the record level instead of context level, particularly if we have an API able to do palloc_return_null_at_oom, so the idea of WAL compression is not related only to FPIs IMHO. We may yet decide, but the pglz implementation is not effective on smaller record lengths. Nor has any testing been done to show that is even desirable. It's even much worse for non-compressible (or less-compressible) WAL data. I am not clear how a simple on/off switch could address such cases, because compressibility can depend on which tables the user is operating on (the schema or data in some tables may be more amenable to compression, in which case compression gives us benefits). I think maybe we should think along the lines of what Robert touched on in one of his e-mails (a context-aware compression strategy). With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Compression of full-page-writes
On Thu, Dec 4, 2014 at 8:37 PM, Michael Paquier wrote I pondered something that Andres mentioned upthread: we may not do the compression in WAL record only for blocks, but also at record level. Hence joining the two ideas together I think that we should definitely have a different GUC to control the feature, consistently for all the images. Let's call it wal_compression, with the following possible values: - on, meaning that a maximum of compression is done, for this feature basically full_page_writes = on. - full_page_writes, meaning that full page writes are compressed - off, default value, to disable completely the feature. This would let room for another mode: 'record', to completely compress a record. For now though, I think that a simple on/off switch would be fine for this patch. Let's keep things simple. +1 for a separate parameter for compression Some changed thoughts to the above * parameter should be SUSET - it doesn't *need* to be set only at server start since all records are independent of each other * ideally we'd like to be able to differentiate the types of usage. which then allows the user to control the level of compression depending upon the type of action. My first cut at what those settings should be are ALL LOGICAL PHYSICAL VACUUM. VACUUM - only compress while running vacuum commands PHYSICAL - only compress while running physical DDL commands (ALTER TABLE set tablespace, CREATE INDEX), i.e. those that wouldn't typically be used for logical decoding LOGICAL - compress FPIs for record types that change tables ALL - all user commands (each level includes all prior levels) * name should not be wal_compression - we're not compressing all wal records, just fpis. There is no evidence that we even want to compress other record types, nor that our compression mechanism is effective at doing so. Simple = keep name as compress_full_page_writes Though perhaps we should have it called wal_compression_level -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
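Purely to make the proposal concrete: if levels like these were ever adopted, they would presumably surface through PostgreSQL's ordinary enum-GUC machinery (struct config_enum_entry from utils/guc.h). The identifiers and values below are invented for illustration and are not taken from any posted patch.

#include "utils/guc.h"

typedef enum
{
	WAL_COMPRESSION_OFF = 0,
	WAL_COMPRESSION_VACUUM,		/* only while running vacuum commands */
	WAL_COMPRESSION_PHYSICAL,	/* + physical DDL (ALTER TABLE SET TABLESPACE, CREATE INDEX) */
	WAL_COMPRESSION_LOGICAL,	/* + record types that change tables */
	WAL_COMPRESSION_ALL			/* all user commands */
} WalCompressionLevel;

/* Each level includes all prior levels, so callers can use a simple >= test. */
static const struct config_enum_entry wal_compression_options[] = {
	{"off", WAL_COMPRESSION_OFF, false},
	{"vacuum", WAL_COMPRESSION_VACUUM, false},
	{"physical", WAL_COMPRESSION_PHYSICAL, false},
	{"logical", WAL_COMPRESSION_LOGICAL, false},
	{"all", WAL_COMPRESSION_ALL, false},
	{NULL, 0, false}
};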
Re: [HACKERS] Compression of full-page-writes
On Mon, Dec 8, 2014 at 11:30 AM, Simon Riggs si...@2ndquadrant.com wrote: * parameter should be SUSET - it doesn't *need* to be set only at server start since all records are independent of each other Check. * ideally we'd like to be able to differentiate the types of usage. which then allows the user to control the level of compression depending upon the type of action. My first cut at what those settings should be are ALL LOGICAL PHYSICAL VACUUM. VACUUM - only compress while running vacuum commands PHYSICAL - only compress while running physical DDL commands (ALTER TABLE set tablespace, CREATE INDEX), i.e. those that wouldn't typically be used for logical decoding LOGICAL - compress FPIs for record types that change tables ALL - all user commands (each level includes all prior levels) Well, that's clearly an optimization so I don't think this should be done for a first shot but those are interesting fresh ideas. Technically speaking, note that we would need to support such things with a new API to switch a new context flag in registered_buffers of xloginsert.c for each block, and decide if the block is compressed based on this context flag, and the compression level wanted. * name should not be wal_compression - we're not compressing all wal records, just fpis. There is no evidence that we even want to compress other record types, nor that our compression mechanism is effective at doing so. Simple = keep name as compress_full_page_writes Though perhaps we should have it called wal_compression_level I don't really like those new names, but I'd prefer wal_compression_level if we go down that road with 'none' as default value. We may still decide in the future to support compression at the record level instead of context level, particularly if we have an API able to do palloc_return_null_at_oom, so the idea of WAL compression is not related only to FPIs IMHO. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
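To illustrate what a per-block context flag in registered_buffers could look like from the caller's side: the XLogBeginInsert()/XLogRegisterBuffer(block_id, buffer, flags) shape below matches the xloginsert API of that era, but REGBUF_COMPRESS is an invented flag and the function body is only schematic, not working code.

/* Hypothetical flag; the real REGBUF_* flags live in access/xloginsert.h. */
#define REGBUF_COMPRESS		0x20	/* request FPI compression for this block */

static void
log_change_with_optional_compression(Buffer buf, bool want_compression)
{
	uint8		flags = REGBUF_STANDARD;

	if (want_compression)
		flags |= REGBUF_COMPRESS;	/* compress only where it is likely to pay off */

	XLogBeginInsert();
	XLogRegisterBuffer(0, buf, flags);
	/* ... XLogRegisterData() and XLogInsert() as usual ... */
}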
Re: [HACKERS] Compression of full-page-writes
On Wed, Jun 11, 2014 at 10:05 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Tue, Jun 10, 2014 at 11:49 PM, Rahila Syed rahilasye...@gmail.com wrote: Hello , In order to facilitate changing of compression algorithms and to be able to recover using WAL records compressed with different compression algorithms, information about compression algorithm can be stored in WAL record. XLOG record header has 2 to 4 padding bytes in order to align the WAL record. This space can be used for a new flag in order to store information about the compression algorithm used. Like the xl_info field of XlogRecord struct, 8 bits flag can be constructed with the lower 4 bits of the flag used to indicate which backup block is compressed out of 0,1,2,3. Higher four bits can be used to indicate state of compression i.e off,lz4,snappy,pglz. The flag can be extended to incorporate more compression algorithms added in future if any. What is your opinion on this? -1 for any additional bytes in WAL record to control such things, having one single compression that we know performs well and relying on it makes the life of user and developer easier. IIUC even when we adopt only one algorithm, additional at least one bit is necessary to see whether this backup block is compressed or not. This flag is necessary only for backup block, so there is no need to use the header of each WAL record. What about just using the backup block header? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
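To picture the one-byte layout described above (low four bits mark which of backup blocks 0..3 are compressed, high four bits name the algorithm), something like the following would do. All of these names are invented for illustration; nothing like them exists in the tree.

#define XLR_COMPRESS_BKP(n)			(1 << (n))	/* backup block n (0..3) is compressed */
#define XLR_COMPRESS_ALG_MASK		0xF0
#define XLR_COMPRESS_ALG_NONE		(0 << 4)
#define XLR_COMPRESS_ALG_PGLZ		(1 << 4)
#define XLR_COMPRESS_ALG_SNAPPY		(2 << 4)
#define XLR_COMPRESS_ALG_LZ4		(3 << 4)

/* Example: backup blocks 0 and 2 compressed with lz4 */
/* uint8 flag = XLR_COMPRESS_BKP(0) | XLR_COMPRESS_BKP(2) | XLR_COMPRESS_ALG_LZ4; */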
Re: [HACKERS] Compression of full-page-writes
On Wed, Jun 11, 2014 at 4:19 PM, Fujii Masao masao.fu...@gmail.com wrote: IIUC even when we adopt only one algorithm, additional at least one bit is necessary to see whether this backup block is compressed or not. This flag is necessary only for backup block, so there is no need to use the header of each WAL record. What about just using the backup block header? +1. We can also steal a few bits from ForkNumber field in the backup block header if required. Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee
Re: [HACKERS] Compression of full-page-writes
Hello , In order to facilitate changing of compression algorithms and to be able to recover using WAL records compressed with different compression algorithms, information about compression algorithm can be stored in WAL record. XLOG record header has 2 to 4 padding bytes in order to align the WAL record. This space can be used for a new flag in order to store information about the compression algorithm used. Like the xl_info field of XlogRecord struct, 8 bits flag can be constructed with the lower 4 bits of the flag used to indicate which backup block is compressed out of 0,1,2,3. Higher four bits can be used to indicate state of compression i.e off,lz4,snappy,pglz. The flag can be extended to incorporate more compression algorithms added in future if any. What is your opinion on this? Thank you, Rahila Syed On Tue, May 27, 2014 at 9:27 AM, Rahila Syed rahilasyed...@gmail.com wrote: Hello All, 0001-CompressBackupBlock_snappy_lz4_pglz extends patch on compression of full page writes to include LZ4 and Snappy . Changes include making compress_backup_block GUC from boolean to enum. Value of the GUC can be OFF, pglz, snappy or lz4 which can be used to turn off compression or set the desired compression algorithm. 0002-Support_snappy_lz4 adds support for LZ4 and Snappy in PostgreSQL. It uses Andres’s patch for getting Makefiles working and has a few wrappers to make the function calls to LZ4 and Snappy compression functions and handle varlena datatypes. Patch Courtesy: Pavan Deolasee These patches serve as a way to test various compression algorithms. These are WIP yet. They don’t support changing compression algorithms on standby . Also, compress_backup_block GUC needs to be merged with full_page_writes. The patch uses LZ4 high compression(HC) variant. I have conducted initial tests which I would like to share and solicit feedback Tests use JDBC runner TPC-C benchmark to measure the amount of WAL compression ,tps and response time in each of the scenarios viz . Compression = OFF , pglz, LZ4 , snappy ,FPW=off Server specifications: Processors:Intel® Xeon ® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos RAM: 32GB Disk : HDD 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos 1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm Benchmark: Scale : 100 Command :java JR /home/postgres/jdbcrunner-1.2/scripts/tpcc.js -sleepTime 600,350,300,250,250 Warmup time : 1 sec Measurement time : 900 sec Number of tx types : 5 Number of agents : 16 Connection pool size : 16 Statement cache size : 40 Auto commit : false Sleep time : 600,350,300,250,250 msec Checkpoint segments:1024 Checkpoint timeout:5 mins Scenario WAL generated(bytes) Compression (bytes) TPS (tx1,tx2,tx3,tx4,tx5) No_compress 2220787088 (~2221MB) NULL 13.3,13.3,1.3,1.3,1.3 tps Pglz 1796213760 (~1796MB) 424573328 (19.11%) 13.1,13.1,1.3,1.3,1.3 tps Snappy 1724171112 (~1724MB) 496615976( 22.36%) 13.2,13.2,1.3,1.3,1.3 tps LZ4(HC)1658941328 (~1659MB) 561845760(25.29%) 13.2,13.2,1.3,1.3,1.3 tps FPW(off) 139384320(~139 MB)NULL 13.3,13.3,1.3,1.3,1.3 tps As per measurement results, WAL reduction using LZ4 is close to 25% which shows 6 percent increase in WAL reduction when compared to pglz . WAL reduction in snappy is close to 22 % . The numbers for compression using LZ4 and Snappy doesn’t seem to be very high as compared to pglz for given workload. This can be due to in-compressible nature of the TPC-C data which contains random strings Compression does not have bad impact on the response time. 
In fact, response times for Snappy and LZ4 are much better than no compression, with almost 1/2 to 1/3 of the response times of no-compression (FPW=on) and FPW=off. The response time order for each type of compression is Pglz > Snappy > LZ4.

Scenario       Response time (tx1,tx2,tx3,tx4,tx5)
no_compress    ,1848,4221,6791,5747 msec
pglz           4275,2659,1828,4025,3326 msec
Snappy         3790,2828,2186,1284,1120 msec
LZ4(HC)        2519,2449,1158,2066,2065 msec
FPW(off)       6234,2430,3017,5417,5885 msec

LZ4 and Snappy are almost at par with each other in terms of response time, as the average response times of the five types of transactions remain almost the same for both. 0001-CompressBackupBlock_snappy_lz4_pglz.patch http://postgresql.1045698.n5.nabble.com/file/n5805044/0001-CompressBackupBlock_snappy_lz4_pglz.patch 0002-Support_snappy_lz4.patch http://postgresql.1045698.n5.nabble.com/file/n5805044/0002-Support_snappy_lz4.patch -- View this message in context: http://postgresql.1045698.n5.nabble.com/Compression-of-full-page-writes-tp5769039p5805044.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: [HACKERS] Compression of full-page-writes
On Tue, Jun 10, 2014 at 11:49 PM, Rahila Syed rahilasye...@gmail.com wrote: Hello , In order to facilitate changing of compression algorithms and to be able to recover using WAL records compressed with different compression algorithms, information about compression algorithm can be stored in WAL record. XLOG record header has 2 to 4 padding bytes in order to align the WAL record. This space can be used for a new flag in order to store information about the compression algorithm used. Like the xl_info field of XlogRecord struct, 8 bits flag can be constructed with the lower 4 bits of the flag used to indicate which backup block is compressed out of 0,1,2,3. Higher four bits can be used to indicate state of compression i.e off,lz4,snappy,pglz. The flag can be extended to incorporate more compression algorithms added in future if any. What is your opinion on this? -1 for any additional bytes in WAL record to control such things, having one single compression that we know performs well and relying on it makes the life of user and developer easier. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Thu, May 29, 2014 at 7:21 PM, Simon Riggs si...@2ndquadrant.com wrote: On 29 May 2014 01:07, Bruce Momjian br...@momjian.us wrote: On Wed, May 28, 2014 at 04:04:13PM +0100, Simon Riggs wrote: On 28 May 2014 15:34, Fujii Masao masao.fu...@gmail.com wrote: Also, compress_backup_block GUC needs to be merged with full_page_writes. Basically I agree with you because I don't want to add new GUC very similar to the existing one. But could you imagine the case where full_page_writes = off. Even in this case, FPW is forcibly written only during base backup. Such FPW also should be compressed? Which compression algorithm should be used? If we want to choose the algorithm for such FPW, we would not be able to merge those two GUCs. IMO it's OK to always use the best compression algorithm for such FPW and merge them, though. I'd prefer a new name altogether torn_page_protection = 'full_page_writes' torn_page_protection = 'compressed_full_page_writes' torn_page_protection = 'none' this allows us to add new techniques later like torn_page_protection = 'background_FPWs' or torn_page_protection = 'double_buffering' when/if we add those new techniques Uh, how would that work if you want to compress the background_FPWs? Use compressed_background_FPWs? We've currently got 1 technique for torn page protection, soon to have 2 and with a 3rd on the horizon and likely to receive effort in next release. It seems sensible to have just one parameter to describe the various techniques, as suggested. I'm suggesting that we plan for how things will look when we have the 3rd one as well. Alternate suggestions welcome. Is even compression of double buffer worthwhile? If yes, what about separating the GUC parameter into torn_page_protection and something like full_page_compression? ISTM that any combination of settings of those parameters can work. torn_page_protection = 'FPW', 'background FPW', 'none', 'double buffer' full_page_compression = 'no', 'pglz', 'lz4', 'snappy' Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 29 May 2014 01:07, Bruce Momjian br...@momjian.us wrote: On Wed, May 28, 2014 at 04:04:13PM +0100, Simon Riggs wrote: On 28 May 2014 15:34, Fujii Masao masao.fu...@gmail.com wrote: Also, compress_backup_block GUC needs to be merged with full_page_writes. Basically I agree with you because I don't want to add new GUC very similar to the existing one. But could you imagine the case where full_page_writes = off. Even in this case, FPW is forcibly written only during base backup. Such FPW also should be compressed? Which compression algorithm should be used? If we want to choose the algorithm for such FPW, we would not be able to merge those two GUCs. IMO it's OK to always use the best compression algorithm for such FPW and merge them, though. I'd prefer a new name altogether torn_page_protection = 'full_page_writes' torn_page_protection = 'compressed_full_page_writes' torn_page_protection = 'none' this allows us to add new techniques later like torn_page_protection = 'background_FPWs' or torn_page_protection = 'double_buffering' when/if we add those new techniques Uh, how would that work if you want to compress the background_FPWs? Use compressed_background_FPWs? We've currently got 1 technique for torn page protection, soon to have 2 and with a 3rd on the horizon and likely to receive effort in next release. It seems sensible to have just one parameter to describe the various techniques, as suggested. I'm suggesting that we plan for how things will look when we have the 3rd one as well. Alternate suggestions welcome. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
Thanks for extending and revising the FPW-compress patch! Could you add your patch into next CF? Sure. I will make improvements and add it to next CF. Isn't it worth measuring the recovery performance for each compression algorithm? Yes I will post this soon. On Wed, May 28, 2014 at 8:04 PM, Fujii Masao masao.fu...@gmail.com wrote: On Tue, May 27, 2014 at 12:57 PM, Rahila Syed rahilasyed...@gmail.com wrote: Hello All, 0001-CompressBackupBlock_snappy_lz4_pglz extends patch on compression of full page writes to include LZ4 and Snappy . Changes include making compress_backup_block GUC from boolean to enum. Value of the GUC can be OFF, pglz, snappy or lz4 which can be used to turn off compression or set the desired compression algorithm. 0002-Support_snappy_lz4 adds support for LZ4 and Snappy in PostgreSQL. It uses Andres’s patch for getting Makefiles working and has a few wrappers to make the function calls to LZ4 and Snappy compression functions and handle varlena datatypes. Patch Courtesy: Pavan Deolasee Thanks for extending and revising the FPW-compress patch! Could you add your patch into next CF? Also, compress_backup_block GUC needs to be merged with full_page_writes. Basically I agree with you because I don't want to add new GUC very similar to the existing one. But could you imagine the case where full_page_writes = off. Even in this case, FPW is forcibly written only during base backup. Such FPW also should be compressed? Which compression algorithm should be used? If we want to choose the algorithm for such FPW, we would not be able to merge those two GUCs. IMO it's OK to always use the best compression algorithm for such FPW and merge them, though. Tests use JDBC runner TPC-C benchmark to measure the amount of WAL compression ,tps and response time in each of the scenarios viz . Compression = OFF , pglz, LZ4 , snappy ,FPW=off Isn't it worth measuring the recovery performance for each compression algorithm? Regards, -- Fujii Masao
Re: [HACKERS] Compression of full-page-writes
On Thu, May 29, 2014 at 11:21:44AM +0100, Simon Riggs wrote: Uh, how would that work if you want to compress the background_FPWs? Use compressed_background_FPWs? We've currently got 1 technique for torn page protection, soon to have 2 and with a 3rd on the horizon and likely to receive effort in next release. It seems sensible to have just one parameter to describe the various techniques, as suggested. I'm suggesting that we plan for how things will look when we have the 3rd one as well. Alternate suggestions welcome. I was just pointing out that we might need compression to be a separate boolean variable from the type of page tear protection. I know I am usually anti-adding-variables, but in this case it seems trying to have one variable control several things will lead to confusion. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Tue, May 27, 2014 at 12:57 PM, Rahila Syed rahilasyed...@gmail.com wrote: Hello All, 0001-CompressBackupBlock_snappy_lz4_pglz extends patch on compression of full page writes to include LZ4 and Snappy . Changes include making compress_backup_block GUC from boolean to enum. Value of the GUC can be OFF, pglz, snappy or lz4 which can be used to turn off compression or set the desired compression algorithm. 0002-Support_snappy_lz4 adds support for LZ4 and Snappy in PostgreSQL. It uses Andres’s patch for getting Makefiles working and has a few wrappers to make the function calls to LZ4 and Snappy compression functions and handle varlena datatypes. Patch Courtesy: Pavan Deolasee Thanks for extending and revising the FPW-compress patch! Could you add your patch into next CF? Also, compress_backup_block GUC needs to be merged with full_page_writes. Basically I agree with you because I don't want to add new GUC very similar to the existing one. But could you imagine the case where full_page_writes = off. Even in this case, FPW is forcibly written only during base backup. Such FPW also should be compressed? Which compression algorithm should be used? If we want to choose the algorithm for such FPW, we would not be able to merge those two GUCs. IMO it's OK to always use the best compression algorithm for such FPW and merge them, though. Tests use JDBC runner TPC-C benchmark to measure the amount of WAL compression ,tps and response time in each of the scenarios viz . Compression = OFF , pglz, LZ4 , snappy ,FPW=off Isn't it worth measuring the recovery performance for each compression algorithm? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 28 May 2014 15:34, Fujii Masao masao.fu...@gmail.com wrote: Also, compress_backup_block GUC needs to be merged with full_page_writes. Basically I agree with you because I don't want to add new GUC very similar to the existing one. But could you imagine the case where full_page_writes = off. Even in this case, FPW is forcibly written only during base backup. Such FPW also should be compressed? Which compression algorithm should be used? If we want to choose the algorithm for such FPW, we would not be able to merge those two GUCs. IMO it's OK to always use the best compression algorithm for such FPW and merge them, though. I'd prefer a new name altogether torn_page_protection = 'full_page_writes' torn_page_protection = 'compressed_full_page_writes' torn_page_protection = 'none' this allows us to add new techniques later like torn_page_protection = 'background_FPWs' or torn_page_protection = 'double_buffering' when/if we add those new techniques -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Wed, May 28, 2014 at 04:04:13PM +0100, Simon Riggs wrote: On 28 May 2014 15:34, Fujii Masao masao.fu...@gmail.com wrote: Also, compress_backup_block GUC needs to be merged with full_page_writes. Basically I agree with you because I don't want to add new GUC very similar to the existing one. But could you imagine the case where full_page_writes = off. Even in this case, FPW is forcibly written only during base backup. Such FPW also should be compressed? Which compression algorithm should be used? If we want to choose the algorithm for such FPW, we would not be able to merge those two GUCs. IMO it's OK to always use the best compression algorithm for such FPW and merge them, though. I'd prefer a new name altogether torn_page_protection = 'full_page_writes' torn_page_protection = 'compressed_full_page_writes' torn_page_protection = 'none' this allows us to add new techniques later like torn_page_protection = 'background_FPWs' or torn_page_protection = 'double_buffering' when/if we add those new techniques Uh, how would that work if you want to compress the background_FPWs? Use compressed_background_FPWs? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
Hello All,

0001-CompressBackupBlock_snappy_lz4_pglz extends the patch on compression of full page writes to include LZ4 and Snappy. Changes include making the compress_backup_block GUC an enum instead of a boolean. The value of the GUC can be OFF, pglz, snappy or lz4, which can be used to turn off compression or set the desired compression algorithm.

0002-Support_snappy_lz4 adds support for LZ4 and Snappy in PostgreSQL. It uses Andres's patch for getting Makefiles working and has a few wrappers to make the function calls to LZ4 and Snappy compression functions and handle varlena datatypes. Patch Courtesy: Pavan Deolasee

These patches serve as a way to test various compression algorithms. They are WIP yet. They don't support changing compression algorithms on standby. Also, the compress_backup_block GUC needs to be merged with full_page_writes. The patch uses the LZ4 high compression (HC) variant.

I have conducted initial tests which I would like to share and solicit feedback on. The tests use the JDBC runner TPC-C benchmark to measure the amount of WAL compression, tps and response time in each of the scenarios, viz. Compression = OFF, pglz, LZ4, snappy, FPW=off.

Server specifications:
Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos
RAM: 32GB
Disk: HDD 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
      1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm

Benchmark:
Scale: 100
Command: java JR /home/postgres/jdbcrunner-1.2/scripts/tpcc.js -sleepTime 600,350,300,250,250
Warmup time: 1 sec
Measurement time: 900 sec
Number of tx types: 5
Number of agents: 16
Connection pool size: 16
Statement cache size: 40
Auto commit: false
Sleep time: 600,350,300,250,250 msec
Checkpoint segments: 1024
Checkpoint timeout: 5 mins

Scenario       WAL generated (bytes)    Compression (bytes)    TPS (tx1,tx2,tx3,tx4,tx5)
No_compress    2220787088 (~2221 MB)    NULL                   13.3,13.3,1.3,1.3,1.3 tps
Pglz           1796213760 (~1796 MB)    424573328 (19.11%)     13.1,13.1,1.3,1.3,1.3 tps
Snappy         1724171112 (~1724 MB)    496615976 (22.36%)     13.2,13.2,1.3,1.3,1.3 tps
LZ4(HC)        1658941328 (~1659 MB)    561845760 (25.29%)     13.2,13.2,1.3,1.3,1.3 tps
FPW(off)        139384320 (~139 MB)     NULL                   13.3,13.3,1.3,1.3,1.3 tps

As per the measurement results, WAL reduction using LZ4 is close to 25%, which is about 6 percentage points more WAL reduction than pglz. WAL reduction with snappy is close to 22%. The numbers for compression using LZ4 and Snappy don't seem to be very high as compared to pglz for the given workload. This can be due to the incompressible nature of the TPC-C data, which contains random strings.

Compression does not have a bad impact on response time. In fact, response times for Snappy and LZ4 are much better than no compression, with almost 1/2 to 1/3 of the response times of no-compression (FPW=on) and FPW=off. The response time order for each type of compression is Pglz > Snappy > LZ4.

Scenario       Response time (tx1,tx2,tx3,tx4,tx5)
no_compress    ,1848,4221,6791,5747 msec
pglz           4275,2659,1828,4025,3326 msec
Snappy         3790,2828,2186,1284,1120 msec
LZ4(HC)        2519,2449,1158,2066,2065 msec
FPW(off)       6234,2430,3017,5417,5885 msec

LZ4 and Snappy are almost at par with each other in terms of response time, as the average response times of the five types of transactions remain almost the same for both. 
0001-CompressBackupBlock_snappy_lz4_pglz.patch http://postgresql.1045698.n5.nabble.com/file/n5805044/0001-CompressBackupBlock_snappy_lz4_pglz.patch 0002-Support_snappy_lz4.patch http://postgresql.1045698.n5.nabble.com/file/n5805044/0002-Support_snappy_lz4.patch -- View this message in context: http://postgresql.1045698.n5.nabble.com/Compression-of-full-page-writes-tp5769039p5805044.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Sun, May 11, 2014 at 7:30 PM, Simon Riggs si...@2ndquadrant.com wrote: On 30 August 2013 04:55, Fujii Masao masao.fu...@gmail.com wrote: My idea is very simple, just compress FPW because FPW is a big part of WAL. I used pglz_compress() as a compression method, but you might think that other method is better. We can add something like FPW-compression-hook for that later. The patch adds new GUC parameter, but I'm thinking to merge it to full_page_writes parameter to avoid increasing the number of GUC. That is, I'm thinking to change full_page_writes so that it can accept new value 'compress'. * Result [tps] 1386.8 (compress_backup_block = off) 1627.7 (compress_backup_block = on) [the amount of WAL generated during running pgbench] 4302 MB (compress_backup_block = off) 1521 MB (compress_backup_block = on) Compressing FPWs definitely makes sense for bulk actions. I'm worried that the loss of performance occurs by greatly elongating transaction response times immediately after a checkpoint, which were already a problem. I'd be interested to look at the response time curves there. Yep, I agree that we should check how the compression of FPW affects the response time, especially just after checkpoint starts. I was thinking about this and about our previous thoughts about double buffering. FPWs are made in foreground, so will always slow down transaction rates. If we could move to double buffering we could avoid FPWs altogether. Thoughts? If I understand the double buffering correctly, it would eliminate the need for FPW. But I'm not sure how easy we can implement the double buffering. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Tue, May 13, 2014 at 3:33 AM, Fujii Masao masao.fu...@gmail.com wrote: On Sun, May 11, 2014 at 7:30 PM, Simon Riggs si...@2ndquadrant.com wrote: On 30 August 2013 04:55, Fujii Masao masao.fu...@gmail.com wrote: My idea is very simple, just compress FPW because FPW is a big part of WAL. I used pglz_compress() as a compression method, but you might think that other method is better. We can add something like FPW-compression-hook for that later. The patch adds new GUC parameter, but I'm thinking to merge it to full_page_writes parameter to avoid increasing the number of GUC. That is, I'm thinking to change full_page_writes so that it can accept new value 'compress'. * Result [tps] 1386.8 (compress_backup_block = off) 1627.7 (compress_backup_block = on) [the amount of WAL generated during running pgbench] 4302 MB (compress_backup_block = off) 1521 MB (compress_backup_block = on) Compressing FPWs definitely makes sense for bulk actions. I'm worried that the loss of performance occurs by greatly elongating transaction response times immediately after a checkpoint, which were already a problem. I'd be interested to look at the response time curves there. Yep, I agree that we should check how the compression of FPW affects the response time, especially just after checkpoint starts. I was thinking about this and about our previous thoughts about double buffering. FPWs are made in foreground, so will always slow down transaction rates. If we could move to double buffering we could avoid FPWs altogether. Thoughts? If I understand the double buffering correctly, it would eliminate the need for FPW. But I'm not sure how easy we can implement the double buffering. There is already a patch on the double buffer write to eliminate the FPW. But It has some performance problem because of CRC calculation for the entire page. http://www.postgresql.org/message-id/1962493974.656458.1327703514780.javamail.r...@zimbra-prod-mbox-4.vmware.com I think this patch can be further modified with a latest multi core CRC calculation and can be used for testing. Regards, Hari Babu Fujitsu Australia -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
Hello, What kind of error did you get at the server crash? Assertion error? If yes, it might be because of the conflict with 4a170ee9e0ebd7021cb1190fabd5b0cbe2effb8e. This commit forbids palloc from being called within a critical section, but the patch does that and then the assertion error happens. That's a bug of the patch.

It seems to be this:

STATEMENT: create table test (id integer);
TRAP: FailedAssertion(!(CritSectionCount == 0 || (CurrentMemoryContext) == ErrorContext || (MyAuxProcType == CheckpointerProcess)), File: mcxt.c, Line: 670)
LOG: server process (PID 29721) was terminated by signal 6: Aborted
DETAIL: Failed process was running: drop table test;
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

How do I resolve this? Thank you, Sameer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
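The TRAP above is exactly the check that commit adds (no palloc inside a critical section), so the usual way out is for the patch to set up its compression scratch memory before entering the critical section and reuse it afterwards. Roughly, with hypothetical names rather than the actual patch code:

#define COMPRESS_SCRATCH_SIZE	(3 * BLCKSZ)	/* whatever bound the patch picks */

static char *compression_scratch = NULL;

/* Called once, outside any critical section (e.g. at backend startup). */
static void
init_compression_scratch(void)
{
	if (compression_scratch == NULL)
		compression_scratch = MemoryContextAlloc(TopMemoryContext,
												 COMPRESS_SCRATCH_SIZE);
}

/*
 * Later, in the WAL-insertion path:
 *
 *		START_CRIT_SECTION();
 *		... compress into compression_scratch; no palloc() here ...
 *		END_CRIT_SECTION();
 */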
Re: [HACKERS] Compression of full-page-writes
On 30 August 2013 04:55, Fujii Masao masao.fu...@gmail.com wrote: My idea is very simple, just compress FPW because FPW is a big part of WAL. I used pglz_compress() as a compression method, but you might think that other method is better. We can add something like FPW-compression-hook for that later. The patch adds new GUC parameter, but I'm thinking to merge it to full_page_writes parameter to avoid increasing the number of GUC. That is, I'm thinking to change full_page_writes so that it can accept new value 'compress'. * Result [tps] 1386.8 (compress_backup_block = off) 1627.7 (compress_backup_block = on) [the amount of WAL generated during running pgbench] 4302 MB (compress_backup_block = off) 1521 MB (compress_backup_block = on) Compressing FPWs definitely makes sense for bulk actions. I'm worried that the loss of performance occurs by greatly elongating transaction response times immediately after a checkpoint, which were already a problem. I'd be interested to look at the response time curves there. Maybe it makes sense to compress FPWs if we do, say, N FPW writes in a transaction. Just ideas. I was thinking about this and about our previous thoughts about double buffering. FPWs are made in foreground, so will always slow down transaction rates. If we could move to double buffering we could avoid FPWs altogether. Thoughts? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
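For anyone wanting to see what "just compress FPW with pglz_compress()" boils down to, a schematic sketch follows. It assumes the pglz_compress(source, slen, dest, strategy) form that returns the compressed length or -1 (the shape the in-core API eventually took; the header path is the post-9.5 one), and the helper name is invented here.

#include "common/pg_lzcompress.h"

/*
 * Try to compress one full-page image before it is attached to the WAL
 * record. Returns the compressed length, or -1 meaning "store the page
 * uncompressed". 'dest' must have room for PGLZ_MAX_OUTPUT(BLCKSZ) bytes.
 */
static int
compress_fpw_sketch(const char *page, char *dest)
{
	int32		len = pglz_compress(page, BLCKSZ, dest, PGLZ_strategy_default);

	if (len < 0 || len >= BLCKSZ)
		return -1;				/* not worth it; keep the raw page */

	return len;
}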
Re: [HACKERS] Compression of full-page-writes
Hello, Done. Attached is the updated version of the patch. I was trying to check WAL reduction using this patch on the latest available git version of Postgres using JDBC runner with the tpcc benchmark. patching_problems.txt http://postgresql.1045698.n5.nabble.com/file/n5803482/patching_problems.txt I did resolve the patching conflicts and then compiled the source, removing a couple of compiler errors in the process. But the server crashes in the compress mode, i.e. the moment any WAL is generated. Works fine in 'on' and 'off' mode. Clearly I must be resolving patch conflicts incorrectly, as this patch applied cleanly earlier. Is there a version of the source where I could apply the patch cleanly? Thank you, Sameer -- View this message in context: http://postgresql.1045698.n5.nabble.com/Compression-of-full-page-writes-tp5769039p5803482.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
Sameer Thakur samthaku...@gmail.com writes: I was trying to check WAL reduction using this patch on latest available git version of Postgres using JDBC runner with tpcc benchmark. patching_problems.txt http://postgresql.1045698.n5.nabble.com/file/n5803482/patching_problems.txt I did resolve the patching conflicts and then compiled the source, removing couple of compiler errors in process. But the server crashes in the compress mode i.e. the moment any WAL is generated. Works fine in 'on' and 'off' mode. Clearly i must be resolving patch conflicts incorrectly as this patch applied cleanly earlier. Is there a version of the source where i could apply it the patch cleanly? If the patch used to work, it's a good bet that what broke it is the recent pgindent run: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=0a7832005792fa6dad171f9cadb8d587fe0dd800 It's going to need to be rebased past that, but doing so by hand would be tedious, and evidently was error-prone too. If you've got pgindent installed, you could consider applying the patch to the parent of that commit, pgindent'ing the whole tree, and then diffing against that commit to generate an updated patch. See src/tools/pgindent/README for some build/usage notes about pgindent. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Sat, May 10, 2014 at 8:33 PM, Sameer Thakur samthaku...@gmail.com wrote: Hello, Done. Attached is the updated version of the patch. I was trying to check WAL reduction using this patch on latest available git version of Postgres using JDBC runner with tpcc benchmark. patching_problems.txt http://postgresql.1045698.n5.nabble.com/file/n5803482/patching_problems.txt I did resolve the patching conflicts and then compiled the source, removing couple of compiler errors in process. But the server crashes in the compress mode i.e. the moment any WAL is generated. Works fine in 'on' and 'off' mode. What kind of error did you get at the server crash? Assertion error? If yes, it might be because of the conflict with 4a170ee9e0ebd7021cb1190fabd5b0cbe2effb8e. This commit forbids palloc from being called within a critical section, but the patch does that and then the assertion error happens. That's a bug of the patch. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Oct 11, 2013 at 12:30:41PM +0900, Fujii Masao wrote: Sure. To be honest, when I received the same request from Andres, I did that benchmark. But unfortunately because of machine trouble, I could not report it, yet. Will do that again. Here is the benchmark result:

* Result
[tps]
1386.8 (compress_backup_block = off)
1317.306391 (full_page_writes = on)
1628.407752 (compress)

[the amount of WAL generated during running pgbench]
1319 MB (on)
326 MB (compress)

[time required to replay WAL generated during running pgbench]
19s (on)
2013-10-11 12:05:09 JST LOG: redo starts at F/F128
2013-10-11 12:05:28 JST LOG: redo done at 10/446B7BF0
12s (on)
2013-10-11 12:06:22 JST LOG: redo starts at F/F128
2013-10-11 12:06:34 JST LOG: redo done at 10/446B7BF0
12s (on)
2013-10-11 12:07:19 JST LOG: redo starts at F/F128
2013-10-11 12:07:31 JST LOG: redo done at 10/446B7BF0
8s (compress)
2013-10-11 12:17:36 JST LOG: redo starts at 10/5028
2013-10-11 12:17:44 JST LOG: redo done at 10/655AE478
8s (compress)
2013-10-11 12:18:26 JST LOG: redo starts at 10/5028
2013-10-11 12:18:34 JST LOG: redo done at 10/655AE478
8s (compress)
2013-10-11 12:19:07 JST LOG: redo starts at 10/5028
2013-10-11 12:19:15 JST LOG: redo done at 10/655AE478

Fujii, are you still working on this? I sure hope so. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Sat, Feb 1, 2014 at 10:22 AM, Bruce Momjian br...@momjian.us wrote: On Fri, Oct 11, 2013 at 12:30:41PM +0900, Fujii Masao wrote: Sure. To be honest, when I received the same request from Andres, I did that benchmark. But unfortunately because of machine trouble, I could not report it, yet. Will do that again. Here is the benchmark result: * Result [tps] 1317.306391 (full_page_writes = on) 1628.407752 (compress) [the amount of WAL generated during running pgbench] 1319 MB (on) 326 MB (compress) [time required to replay WAL generated during running pgbench] 19s (on) 2013-10-11 12:05:09 JST LOG: redo starts at F/F128 2013-10-11 12:05:28 JST LOG: redo done at 10/446B7BF0 12s (on) 2013-10-11 12:06:22 JST LOG: redo starts at F/F128 2013-10-11 12:06:34 JST LOG: redo done at 10/446B7BF0 12s (on) 2013-10-11 12:07:19 JST LOG: redo starts at F/F128 2013-10-11 12:07:31 JST LOG: redo done at 10/446B7BF0 8s (compress) 2013-10-11 12:17:36 JST LOG: redo starts at 10/5028 2013-10-11 12:17:44 JST LOG: redo done at 10/655AE478 8s (compress) 2013-10-11 12:18:26 JST LOG: redo starts at 10/5028 2013-10-11 12:18:34 JST LOG: redo done at 10/655AE478 8s (compress) 2013-10-11 12:19:07 JST LOG: redo starts at 10/5028 2013-10-11 12:19:15 JST LOG: redo done at 10/655AE478 Fujii, are you still working on this? I sure hope so. Yes, but it's too late to implement and post new patch in this development cycle of 9.4dev. I will propose that in next CF. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Mon, Oct 21, 2013 at 11:52 PM, Fujii Masao masao.fu...@gmail.com wrote: So, our consensus is to introduce the hooks for FPW compression so that users can freely select their own best compression algorithm? Also, probably we need to implement at least one compression contrib module using that hook, maybe it's based on pglz or snappy. I don't favor making this pluggable. I think we should pick snappy or lz4 (or something else), put it in the tree, and use it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
Robert Haas robertmh...@gmail.com writes: On Mon, Oct 21, 2013 at 11:52 PM, Fujii Masao masao.fu...@gmail.com wrote: So, our consensus is to introduce the hooks for FPW compression so that users can freely select their own best compression algorithm? Also, probably we need to implement at least one compression contrib module using that hook, maybe it's based on pglz or snappy. I don't favor making this pluggable. I think we should pick snappy or lz4 (or something else), put it in the tree, and use it. I agree. Hooks in this area are going to be a constant source of headaches, vastly outweighing any possible benefit. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Thu, Oct 24, 2013 at 11:07:38AM -0400, Robert Haas wrote: On Mon, Oct 21, 2013 at 11:52 PM, Fujii Masao masao.fu...@gmail.com wrote: So, our consensus is to introduce the hooks for FPW compression so that users can freely select their own best compression algorithm? Also, probably we need to implement at least one compression contrib module using that hook, maybe it's based on pglz or snappy. I don't favor making this pluggable. I think we should pick snappy or lz4 (or something else), put it in the tree, and use it. Hi, My vote would be for lz4, since it has faster single-thread compression and decompression speeds, with the decompression speed being almost 2X snappy's decompression speed. Both are BSD licensed, so that is not an issue. The base code for lz4 is C and it is C++ for snappy. There is also a HC (high-compression) variant for lz4 that pushes its compression rate to about the same as zlib (-1) and uses the same decompressor, which can provide data even faster due to better compression. Some more real world tests would be useful, which is really where being pluggable would help. Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
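For reference, with liblz4's C API the HC variant is just a different compression entry point; the decompressor is shared, which is why better compression can also mean faster reads. The sketch below uses the modern function names (LZ4_compress_default, LZ4_compress_HC, LZ4_decompress_safe) rather than the 2013-era ones, purely as an illustration.

#include <stdio.h>
#include <stdlib.h>
#include <lz4.h>
#include <lz4hc.h>

/* Compress the same input with the fast and the HC variant, then decompress
 * the HC output with the single shared decompressor. */
static void
lz4_vs_lz4hc_demo(const char *src, int srclen)
{
	int		cap = LZ4_compressBound(srclen);
	char   *fast = malloc(cap);
	char   *hc = malloc(cap);
	char   *back = malloc(srclen);

	int		n_fast = LZ4_compress_default(src, fast, srclen, cap);
	int		n_hc = LZ4_compress_HC(src, hc, srclen, cap, 9);	/* higher level, better ratio */

	printf("fast: %d bytes, HC: %d bytes\n", n_fast, n_hc);

	/* The decompressor does not care which variant produced the data. */
	LZ4_decompress_safe(hc, back, n_hc, srclen);

	free(fast);
	free(hc);
	free(back);
}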
Re: [HACKERS] Compression of full-page-writes
On Thu, Oct 24, 2013 at 11:40 AM, k...@rice.edu k...@rice.edu wrote: On Thu, Oct 24, 2013 at 11:07:38AM -0400, Robert Haas wrote: On Mon, Oct 21, 2013 at 11:52 PM, Fujii Masao masao.fu...@gmail.com wrote: So, our consensus is to introduce the hooks for FPW compression so that users can freely select their own best compression algorithm? Also, probably we need to implement at least one compression contrib module using that hook, maybe it's based on pglz or snappy. I don't favor making this pluggable. I think we should pick snappy or lz4 (or something else), put it in the tree, and use it. Hi, My vote would be for lz4 since it has faster single thread compression and decompression speeds with the decompression speed being almost 2X snappy's decompression speed. The both are BSD licensed so that is not an issue. The base code for lz4 is c and it is c++ for snappy. There is also a HC (high-compression) varient for lz4 that pushes its compression rate to about the same as zlib (-1) which uses the same decompressor which can provide data even faster due to better compression. Some more real world tests would be useful, which is really where being pluggable would help. Well, it's probably a good idea for us to test, during the development cycle, which algorithm works better for WAL compression, and then use that one. Once we make that decision, I don't see that there are many circumstances in which a user would care to override it. Now if we find that there ARE reasons for users to prefer different algorithms in different situations, that would be a good reason to make it configurable (or even pluggable). But if we find that no such reasons exist, then we're better off avoiding burdening users with the need to configure a setting that has only one sensible value. It seems fairly clear from previous discussions on this mailing list that snappy and lz4 are the top contenders for the position of compression algorithm favored by PostgreSQL. I am wondering, though, whether it wouldn't be better to add support for both - say we added both to libpgcommon, and perhaps we could consider moving pglz there as well. That would allow easy access to all of those algorithms from both front-end and backend-code. If we can make the APIs parallel, it should very simple to modify any code we add now to use a different algorithm than the one initially chosen if in the future we add algorithms to or remove algorithms from the list, or if one algorithm is shown to outperform another in some particular context. I think we'll do well to isolate the question of adding support for these algorithms form the current patch or any other particular patch that may be on the table, and FWIW, I think having two leading contenders and adding support for both may have a variety of advantages over crowning a single victor. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
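One way to read "make the APIs parallel" is a thin, algorithm-agnostic wrapper in src/common so the WAL code never names a specific library. Nothing like the following exists in libpgcommon; it is only a sketch of the shape such an interface could take, with invented names throughout.

typedef enum
{
	PG_COMPRESSION_PGLZ,
	PG_COMPRESSION_LZ4,
	PG_COMPRESSION_SNAPPY
} pg_compress_algorithm;

typedef struct pg_compressor
{
	pg_compress_algorithm algorithm;

	/* Both return the output length, or -1 for "did not fit / not worth it". */
	int			(*compress) (const char *src, int srclen, char *dst, int dstcap);
	int			(*decompress) (const char *src, int srclen, char *dst, int dstcap);
} pg_compressor;

/*
 * A caller would then do something like:
 *
 *		const pg_compressor *c = pg_get_compressor(PG_COMPRESSION_LZ4);
 *		int len = c->compress(page, BLCKSZ, scratch, scratch_size);
 *
 * and switching algorithms never touches the WAL code itself.
 */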
Re: [HACKERS] Compression of full-page-writes
On Thu, Oct 24, 2013 at 12:22:59PM -0400, Robert Haas wrote: On Thu, Oct 24, 2013 at 11:40 AM, k...@rice.edu k...@rice.edu wrote: On Thu, Oct 24, 2013 at 11:07:38AM -0400, Robert Haas wrote: On Mon, Oct 21, 2013 at 11:52 PM, Fujii Masao masao.fu...@gmail.com wrote: So, our consensus is to introduce the hooks for FPW compression so that users can freely select their own best compression algorithm? Also, probably we need to implement at least one compression contrib module using that hook, maybe it's based on pglz or snappy. I don't favor making this pluggable. I think we should pick snappy or lz4 (or something else), put it in the tree, and use it. Hi, My vote would be for lz4 since it has faster single thread compression and decompression speeds with the decompression speed being almost 2X snappy's decompression speed. The both are BSD licensed so that is not an issue. The base code for lz4 is c and it is c++ for snappy. There is also a HC (high-compression) varient for lz4 that pushes its compression rate to about the same as zlib (-1) which uses the same decompressor which can provide data even faster due to better compression. Some more real world tests would be useful, which is really where being pluggable would help. Well, it's probably a good idea for us to test, during the development cycle, which algorithm works better for WAL compression, and then use that one. Once we make that decision, I don't see that there are many circumstances in which a user would care to override it. Now if we find that there ARE reasons for users to prefer different algorithms in different situations, that would be a good reason to make it configurable (or even pluggable). But if we find that no such reasons exist, then we're better off avoiding burdening users with the need to configure a setting that has only one sensible value. It seems fairly clear from previous discussions on this mailing list that snappy and lz4 are the top contenders for the position of compression algorithm favored by PostgreSQL. I am wondering, though, whether it wouldn't be better to add support for both - say we added both to libpgcommon, and perhaps we could consider moving pglz there as well. That would allow easy access to all of those algorithms from both front-end and backend-code. If we can make the APIs parallel, it should very simple to modify any code we add now to use a different algorithm than the one initially chosen if in the future we add algorithms to or remove algorithms from the list, or if one algorithm is shown to outperform another in some particular context. I think we'll do well to isolate the question of adding support for these algorithms form the current patch or any other particular patch that may be on the table, and FWIW, I think having two leading contenders and adding support for both may have a variety of advantages over crowning a single victor. +++1 Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Thu, Oct 24, 2013 at 8:37 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Oct 21, 2013 at 11:52 PM, Fujii Masao masao.fu...@gmail.com wrote: So, our consensus is to introduce the hooks for FPW compression so that users can freely select their own best compression algorithm? Also, probably we need to implement at least one compression contrib module using that hook, maybe it's based on pglz or snappy. I don't favor making this pluggable. I think we should pick snappy or lz4 (or something else), put it in the tree, and use it. The reason why the discussion went towards making it pluggable (or at least what made me to think like that) was because of below reasons: a. what somebody needs to do to make snappy or lz4 in the tree, is it only performance/compression data for some of the scenario's or some other legal stuff as well, if it is only performance/compression then what will be the scenario's (is pgbench sufficient?). b. there can be cases where one or the other algorithm can be better or not doing compression is better. For example in one of the other patches where we were trying to achieve WAL reduction in Update operation (http://www.postgresql.org/message-id/8977cb36860c5843884e0a18d8747b036b9a4...@szxeml558-mbs.china.huawei.com), Heikki has came up with a test (where data is not much compressible), in such a case, the observation was that LZ was better than native compression method used in that patch and Snappy was better than LZ and not doing compression could be considered preferable in such a scenario because all the algorithm's were reducing TPS for that case. Now I think it is certainly better if we could choose one of the algorithms (snappy or lz4) and test them for most used scenario's for compression and performance and call it done, but I think giving at least an option to user to make compression altogether off should be still considered. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 2013-10-22 12:52:09 +0900, Fujii Masao wrote: So, our consensus is to introduce the hooks for FPW compression so that users can freely select their own best compression algorithm? No, I don't think that's consensus yet. If you want to make it configurable on that level you need to have: 1) compression format signature on fpws 2) mapping between identifiers for compression formats and the libraries implementing them. Otherwise you can only change the configuration at initdb time... Also, probably we need to implement at least one compression contrib module using that hook, maybe it's based on pglz or snappy. From my tests for toast compression I'd suggest starting with lz4. I'd suggest starting by publishing test results with more modern compression formats, but without hacks like increasing padding. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
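To spell out the two requirements Andres lists, here is a sketch of a per-block compression identifier plus a handler table. The identifiers and the table are invented for this example and are not part of the posted patch; the point is only that the method id travels with each backup block, so replay does not depend on the local GUC setting.

#include <stddef.h>

/* Identifier stored with each full-page image; part of the WAL format,
 * so values must never be renumbered or reused. */
typedef enum FpwCompressionMethod
{
    FPW_COMPRESSION_NONE = 0,
    FPW_COMPRESSION_PGLZ = 1,
    FPW_COMPRESSION_LZ4 = 2,
    FPW_COMPRESSION_SNAPPY = 3
} FpwCompressionMethod;

/* Mapping between identifiers and the code implementing them. */
typedef struct FpwCompressionHandler
{
    FpwCompressionMethod method;
    size_t      (*compress) (const char *src, size_t srclen,
                             char *dst, size_t dstlen);
    size_t      (*decompress) (const char *src, size_t srclen,
                               char *dst, size_t dstlen);
} FpwCompressionHandler;

/*
 * The writer records the method id in the backup-block header; redo looks
 * the id up here rather than trusting its own configuration, which is what
 * lets the setting change without initdb.
 */
static const FpwCompressionHandler fpw_handlers[] = {
    {FPW_COMPRESSION_NONE, NULL, NULL},
    /* {FPW_COMPRESSION_PGLZ, pglz_fpw_compress, pglz_fpw_decompress}, ... */
};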
Re: [HACKERS] Compression of full-page-writes
(2013/10/22 12:52), Fujii Masao wrote: On Tue, Oct 22, 2013 at 12:47 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Mon, Oct 21, 2013 at 4:40 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2013/10/19 14:58), Amit Kapila wrote: On Tue, Oct 15, 2013 at 11:41 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: In general, my thinking is that we should prefer compression to reduce IO (WAL volume), because reducing WAL volume has other benefits as well like sending it to subscriber nodes. I think it will help cases where due to less n/w bandwidth, the disk allocated for WAL becomes full due to high traffic on master and then users need some alternative methods to handle such situations. Do you talk about archiving WAL file? One of the points I am talking about is sending data over the network to subscriber nodes for streaming replication, and another is WAL in pg_xlog. Both scenarios benefit if there is less WAL volume. It can be easy to reduce volume if we add a compression command to the copy command at archive_command. Okay. I think many users would like to use a method which can reduce WAL volume, and the users who don't find it useful enough in their environments, due to a decrease in TPS or no significant reduction in WAL, have the option to disable it. I favor selecting a compression algorithm for higher performance. If we need to compress WAL files more, in spite of lesser performance, we can change the archive copy command to a high-compression algorithm and add documentation on how to compress archived WAL files at archive_command. Is that wrong? No, it is not wrong, but there are scenarios as mentioned above where less WAL volume can be beneficial. In fact, many NoSQL systems use snappy for the purpose of higher performance. Okay, you can also check the results with the snappy algorithm, but don't just rely completely on snappy for this patch, you might want to think of another alternative for this patch. So, our consensus is to introduce the hooks for FPW compression so that users can freely select their own best compression algorithm? Yes, it will also be good for future improvement. But I think WAL compression for a disaster recovery system should be done in the walsender and walreceiver processes, and that is the proper architecture for a DR system. A high-compression-ratio, high-CPU-usage algorithm for FPW might hurt performance on the master server. If we can set the compression algorithm in walsender and walreceiver, performance stays the same or better, and WAL send performance will improve. Regards, -- Mitsumasa KONDO NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Wed, Oct 23, 2013 at 7:05 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2013/10/22 12:52), Fujii Masao wrote: On Tue, Oct 22, 2013 at 12:47 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Mon, Oct 21, 2013 at 4:40 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2013/10/19 14:58), Amit Kapila wrote: On Tue, Oct 15, 2013 at 11:41 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: In actual, many of NoSQLs use snappy for purpose of higher performance. Okay, you can also check the results with snappy algorithm, but don't just rely completely on snappy for this patch, you might want to think of another alternative for this patch. So, our consensus is to introduce the hooks for FPW compression so that users can freely select their own best compression algorithm? Yes, it will be also good for future improvement. But I think WAL compression for disaster recovery system should be need in walsender and walreceiver proccess, and it is propety architecture for DR system. Higher compression ratio with high CPU usage algorithm in FPW might affect bad for perfomance in master server. This is true, thats why there is a discussion for pluggable API for compression of WAL, we should try to choose best algorithm from the available choices. Even after that I am not sure it works same for all kind of loads, so user will have option to completely disable it as well. If we can set compression algorithm in walsender and walreciever, performance is same as before or better, and WAL send performance will be better. Do you mean to say that walsender should compress the data before sending and then walreceiver will decompress it, if yes then won't it add extra overhead on standby, or do you think as walreceiver has to read less data from socket, so it will compensate for it. I think may be we should consider this if the test results are good, but lets not try to do this until the current patch proves that such mechanism is good for WAL compression. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
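To make the walsender/walreceiver idea a bit more concrete, here is a rough sketch of what a length-prefixed compressed WAL frame could look like on the wire. Nothing like this exists in the patch being discussed; the frame layout, the function name, and the choice of LZ4 are purely illustrative assumptions. The receiving side would undo it with LZ4_decompress_safe() into a buffer of raw_len bytes, which is where the extra CPU on the standby that Amit mentions would be spent, in exchange for reading fewer bytes from the socket.

#include <stdint.h>
#include <string.h>
#include <lz4.h>

/* Invented frame header: the raw length is needed to size the output
 * buffer on the walreceiver side. */
typedef struct CompressedWalFrame
{
    uint32_t    raw_len;        /* length before compression */
    uint32_t    comp_len;       /* length of the LZ4 payload that follows */
} CompressedWalFrame;

/* Sender side: returns the total frame size written into 'out',
 * or 0 when compression does not pay off and the raw stream should be sent. */
static size_t
wal_frame_pack(const char *wal, uint32_t wal_len, char *out, size_t out_size)
{
    CompressedWalFrame hdr;
    int         n;

    if (out_size <= sizeof(hdr))
        return 0;
    n = LZ4_compress_default(wal, out + sizeof(hdr),
                             (int) wal_len, (int) (out_size - sizeof(hdr)));
    if (n <= 0 || (uint32_t) n >= wal_len)
        return 0;
    hdr.raw_len = wal_len;
    hdr.comp_len = (uint32_t) n;
    memcpy(out, &hdr, sizeof(hdr));
    return sizeof(hdr) + (size_t) n;
}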
Re: [HACKERS] Compression of full-page-writes
(2013/10/19 14:58), Amit Kapila wrote: On Tue, Oct 15, 2013 at 11:41 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I think in general also snappy is mostly preferred for it's low CPU usage not for compression, but overall my vote is also for snappy. I think low CPU usage is the best important factor in WAL compression. It is because WAL write is sequencial write, so few compression ratio improvement cannot change PostgreSQL's performance, and furthermore raid card with writeback feature. Furthermore PG executes programs by single proccess, high CPU usage compression algorithm will cause lessor performance. I found compression algorithm test in HBase. I don't read detail, but it indicates snnapy algorithm gets best performance. http://blog.erdemagaoglu.com/post/4605524309/lzo-vs-snappy-vs-lzf-vs-zlib-a-comparison-of The dataset used for performance is quite different from the data which we are talking about here (WAL). These are the scores for a data which consist of 700kB rows, each containing a binary image data. They probably won’t apply to things like numeric or text data. Yes, you are right. We need testing about compression algorithm in WAL write. I think it is necessary to make best efforts in community than I do the best choice with strict test. Sure, it is good to make effort to select the best algorithm, but if you are combining this patch with inclusion of new compression algorithm in PG, it can only make the patch to take much longer time. I think if our direction is specifically decided, it is easy to make the patch. Complession patch's direction isn't still become clear, it will be a troublesome patch which is like sync-rep patch. In general, my thinking is that we should prefer compression to reduce IO (WAL volume), because reducing WAL volume has other benefits as well like sending it to subscriber nodes. I think it will help cases where due to less n/w bandwidth, the disk allocated for WAL becomes full due to high traffic on master and then users need some alternative methods to handle such situations. Do you talk about archiving WAL file? It can easy to reduce volume that we set and add compression command with copy command at archive_command. I think many users would like to use a method which can reduce WAL volume and the users which don't find it enough useful in their environments due to decrease in TPS or not significant reduction in WAL have the option to disable it. I favor to select compression algorithm for higher performance. If we need to compress WAL file more, in spite of lessor performance, we can change archive copy command with high compression algorithm and add documents that how to compress archive WAL files at archive_command. Does it wrong? In actual, many of NoSQLs use snappy for purpose of higher performance. Regards, -- Mitsumasa KONDO NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Mon, Oct 21, 2013 at 4:40 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2013/10/19 14:58), Amit Kapila wrote: On Tue, Oct 15, 2013 at 11:41 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: In general, my thinking is that we should prefer compression to reduce IO (WAL volume), because reducing WAL volume has other benefits as well like sending it to subscriber nodes. I think it will help cases where due to less n/w bandwidth, the disk allocated for WAL becomes full due to high traffic on master and then users need some alternative methods to handle such situations. Do you talk about archiving WAL file? One of the points what I am talking about is sending data over network to subscriber nodes for streaming replication and another is WAL in pg_xlog. Both scenario's get benefited if there is is WAL volume. It can easy to reduce volume that we set and add compression command with copy command at archive_command. Okay. I think many users would like to use a method which can reduce WAL volume and the users which don't find it enough useful in their environments due to decrease in TPS or not significant reduction in WAL have the option to disable it. I favor to select compression algorithm for higher performance. If we need to compress WAL file more, in spite of lessor performance, we can change archive copy command with high compression algorithm and add documents that how to compress archive WAL files at archive_command. Does it wrong? No, it is not wrong, but there are scenario's as mentioned above where less WAL volume can be beneficial. In actual, many of NoSQLs use snappy for purpose of higher performance. Okay, you can also check the results with snappy algorithm, but don't just rely completely on snappy for this patch, you might want to think of another alternative for this patch. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Tue, Oct 22, 2013 at 12:47 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Mon, Oct 21, 2013 at 4:40 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2013/10/19 14:58), Amit Kapila wrote: On Tue, Oct 15, 2013 at 11:41 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: In general, my thinking is that we should prefer compression to reduce IO (WAL volume), because reducing WAL volume has other benefits as well like sending it to subscriber nodes. I think it will help cases where due to less n/w bandwidth, the disk allocated for WAL becomes full due to high traffic on master and then users need some alternative methods to handle such situations. Do you talk about archiving WAL file? One of the points what I am talking about is sending data over network to subscriber nodes for streaming replication and another is WAL in pg_xlog. Both scenario's get benefited if there is is WAL volume. It can easy to reduce volume that we set and add compression command with copy command at archive_command. Okay. I think many users would like to use a method which can reduce WAL volume and the users which don't find it enough useful in their environments due to decrease in TPS or not significant reduction in WAL have the option to disable it. I favor to select compression algorithm for higher performance. If we need to compress WAL file more, in spite of lessor performance, we can change archive copy command with high compression algorithm and add documents that how to compress archive WAL files at archive_command. Does it wrong? No, it is not wrong, but there are scenario's as mentioned above where less WAL volume can be beneficial. In actual, many of NoSQLs use snappy for purpose of higher performance. Okay, you can also check the results with snappy algorithm, but don't just rely completely on snappy for this patch, you might want to think of another alternative for this patch. So, our consensus is to introduce the hooks for FPW compression so that users can freely select their own best compression algorithm? Also, probably we need to implement at least one compression contrib module using that hook, maybe it's based on pglz or snappy. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Tue, Oct 22, 2013 at 9:22 AM, Fujii Masao masao.fu...@gmail.com wrote: On Tue, Oct 22, 2013 at 12:47 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Mon, Oct 21, 2013 at 4:40 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2013/10/19 14:58), Amit Kapila wrote: On Tue, Oct 15, 2013 at 11:41 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: In actual, many of NoSQLs use snappy for purpose of higher performance. Okay, you can also check the results with snappy algorithm, but don't just rely completely on snappy for this patch, you might want to think of another alternative for this patch. So, our consensus is to introduce the hooks for FPW compression so that users can freely select their own best compression algorithm? We can also provide GUC for whether to enable WAL compression, which I think you are also planing to include based on some previous e-mails in this thread. You can consider my vote for this idea. However I think we should wait to see if anyone else have objection to this idea. Also, probably we need to implement at least one compression contrib module using that hook, maybe it's based on pglz or snappy. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Tue, Oct 15, 2013 at 11:41 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2013/10/15 13:33), Amit Kapila wrote: Snappy is good mainly for un-compressible data, see the link below: http://www.postgresql.org/message-id/CAAZKuFZCOCHsswQM60ioDO_hk12tA7OG3YcJA8v=4yebmoa...@mail.gmail.com This result was gotten in ARM architecture, it is not general CPU. Please see detail document. http://www.reddit.com/r/programming/comments/1aim6s/lz4_extremely_fast_compression_algorithm/c8y0ew9 I think in general also snappy is mostly preferred for it's low CPU usage not for compression, but overall my vote is also for snappy. I found compression algorithm test in HBase. I don't read detail, but it indicates snnapy algorithm gets best performance. http://blog.erdemagaoglu.com/post/4605524309/lzo-vs-snappy-vs-lzf-vs-zlib-a-comparison-of The dataset used for performance is quite different from the data which we are talking about here (WAL). These are the scores for a data which consist of 700kB rows, each containing a binary image data. They probably won’t apply to things like numeric or text data. In fact, most of modern NoSQL storages use snappy. Because it has good performance and good licence(BSD license). I think it is bit difficult to prove that any one algorithm is best for all kind of loads. I think it is necessary to make best efforts in community than I do the best choice with strict test. Sure, it is good to make effort to select the best algorithm, but if you are combining this patch with inclusion of new compression algorithm in PG, it can only make the patch to take much longer time. In general, my thinking is that we should prefer compression to reduce IO (WAL volume), because reducing WAL volume has other benefits as well like sending it to subscriber nodes. I think it will help cases where due to less n/w bandwidth, the disk allocated for WAL becomes full due to high traffic on master and then users need some alternative methods to handle such situations. I think many users would like to use a method which can reduce WAL volume and the users which don't find it enough useful in their environments due to decrease in TPS or not significant reduction in WAL have the option to disable it. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Wed, Oct 16, 2013 at 01:42:34PM +0900, KONDO Mitsumasa wrote: (2013/10/15 22:01), k...@rice.edu wrote: Google's lz4 is also a very nice algorithm with 33% better compression performance than snappy and 2X the decompression performance in some benchmarks also with a bsd license: https://code.google.com/p/lz4/ If we judge only performance, we will select lz4. However, we should think another important factor which is software robustness, achievement, bug fix history, and etc... If we see unknown bugs, can we fix it or improve algorithm? It seems very difficult, because we only use it and don't understand algorihtms. Therefore, I think that we had better to select robust and having more user software. Regards, -- Mitsumasa KONDO NTT Open Source Software Hi, Those are all very good points. lz4 however is being used by Hadoop. It is implemented natively in the Linux 3.11 kernel and the BSD version of the ZFS filesystem supports the lz4 algorithm for on-the-fly compression. With more and more CPU cores available in modern system, using an algorithm with very fast decompression speeds can make storing data, even in memory, in a compressed form can reduce space requirements in exchange for a higher CPU cycle cost. The ability to make those sorts of trade-offs can really benefit from a plug-able compression algorithm interface. Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
(2013/10/15 13:33), Amit Kapila wrote: Snappy is good mainly for un-compressible data, see the link below: http://www.postgresql.org/message-id/CAAZKuFZCOCHsswQM60ioDO_hk12tA7OG3YcJA8v=4yebmoa...@mail.gmail.com That result was obtained on an ARM architecture, which is not a typical CPU. Please see the detailed discussion: http://www.reddit.com/r/programming/comments/1aim6s/lz4_extremely_fast_compression_algorithm/c8y0ew9 I found a compression algorithm test for HBase. I have not read the details, but it indicates the snappy algorithm gets the best performance. http://blog.erdemagaoglu.com/post/4605524309/lzo-vs-snappy-vs-lzf-vs-zlib-a-comparison-of In fact, most modern NoSQL storage engines use snappy, because it has good performance and a good license (BSD). I think it is a bit difficult to prove that any one algorithm is best for all kinds of loads. I think it is better to make a best effort as a community than for me alone to make the best choice through strict testing. Regards, -- Mitsumasa KONDO NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Tue, Oct 15, 2013 at 03:11:22PM +0900, KONDO Mitsumasa wrote: (2013/10/15 13:33), Amit Kapila wrote: Snappy is good mainly for un-compressible data, see the link below: http://www.postgresql.org/message-id/CAAZKuFZCOCHsswQM60ioDO_hk12tA7OG3YcJA8v=4yebmoa...@mail.gmail.com This result was gotten in ARM architecture, it is not general CPU. Please see detail document. http://www.reddit.com/r/programming/comments/1aim6s/lz4_extremely_fast_compression_algorithm/c8y0ew9 I found compression algorithm test in HBase. I don't read detail, but it indicates snnapy algorithm gets best performance. http://blog.erdemagaoglu.com/post/4605524309/lzo-vs-snappy-vs-lzf-vs-zlib-a-comparison-of In fact, most of modern NoSQL storages use snappy. Because it has good performance and good licence(BSD license). I think it is bit difficult to prove that any one algorithm is best for all kind of loads. I think it is necessary to make best efforts in community than I do the best choice with strict test. Regards, -- Mitsumasa KONDO NTT Open Source Software Center Google's lz4 is also a very nice algorithm with 33% better compression performance than snappy and 2X the decompression performance in some benchmarks also with a bsd license: https://code.google.com/p/lz4/ Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
(2013/10/15 22:01), k...@rice.edu wrote: Google's lz4 is also a very nice algorithm with 33% better compression performance than snappy and 2X the decompression performance in some benchmarks, also with a BSD license: https://code.google.com/p/lz4/ If we judge only by performance, we will select lz4. However, we should also consider other important factors such as software robustness, track record, bug-fix history, and so on. If we hit unknown bugs, can we fix them or improve the algorithm? That seems very difficult, because we only use the library and don't understand its algorithms. Therefore, I think we had better select software that is robust and has more users. Regards, -- Mitsumasa KONDO NTT Open Source Software -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
(2013/10/13 0:14), Amit Kapila wrote: On Fri, Oct 11, 2013 at 10:36 PM, Andres Freund and...@2ndquadrant.com wrote: But maybe pglz is just not a good fit for this, it really isn't a very good algorithm in this day and age. +1. We need a compression algorithm that is faster than pglz, which is a general-purpose compression algorithm, to avoid the CPU bottleneck. I think pglz does not have good performance, and it is something of a fossil among compression algorithms. So we need to switch to a more recent compression algorithm for a better future. Do you think that if WAL reduction or performance with another compression algorithm (for ex. snappy) is better, then the chances of getting the new compression algorithm into postgresql will be higher? Recent compression algorithm papers (including snappy's) have indicated as much. I think that is enough to select an algorithm. It may also be good work for postgres. Regards, -- Mitsumasa KONDO NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Tue, Oct 15, 2013 at 6:30 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2013/10/13 0:14), Amit Kapila wrote: On Fri, Oct 11, 2013 at 10:36 PM, Andres Freund and...@2ndquadrant.com wrote: But maybe pglz is just not a good fit for this, it really isn't a very good algorithm in this day and aage. +1. This compression algorithm is needed more faster than pglz which is like general compression algorithm, to avoid the CPU bottle-neck. I think pglz doesn't have good performance, and it is like fossil compression algorithm. So we need to change latest compression algorithm for more better future. Do you think that if WAL reduction or performance with other compression algorithm (for ex. snappy) is better, then chances of getting the new compression algorithm in postresql will be more? Latest compression algorithms papers(also snappy) have indecated. I think it is enough to select algorithm. It may be also good work in postgres. Snappy is good mainly for un-compressible data, see the link below: http://www.postgresql.org/message-id/CAAZKuFZCOCHsswQM60ioDO_hk12tA7OG3YcJA8v=4yebmoa...@mail.gmail.com I think it is bit difficult to prove that any one algorithm is best for all kind of loads. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 11/10/13 19:06, Andres Freund wrote: On 2013-10-11 09:22:50 +0530, Amit Kapila wrote: I think it will be difficult to prove, by using any compression algorithm, that it compresses in most of the scenarios. In many cases it can so happen that the WAL will also not be reduced and tps can also come down if the data is non-compressible, because any compression algorithm will have to try to compress the data and it will burn some cpu for that, which in turn will reduce tps. Then those concepts maybe aren't such a good idea after all. Storing lots of compressible data in an uncompressed fashion isn't an all that common usecase. I most certainly don't want postgres to optimize for blank padded data, especially if it can hurt other scenarios. Just not enough benefit. That said, I actually have relatively high hopes for compressing full page writes. There often enough is lot of repetitiveness between rows on the same page that it should be useful outside of such strange scenarios. But maybe pglz is just not a good fit for this, it really isn't a very good algorithm in this day and age. Hm. There is a clear benefit for compressible data and clearly no benefit for incompressible data. How about letting autovacuum sample the compressibility of pages on a per relation/index basis and set a flag that triggers this functionality where it provides a benefit? That is not hugely more magical than figuring out whether the data ends up in the heap or in a toast table, as it is now. -- Jesper -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
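Jesper's idea could be prototyped with a very small heuristic along these lines. Everything here is hypothetical — there is no such relation flag or autovacuum hook today, the 25% threshold is arbitrary, and LZ4 is used only because it is cheap enough to run over sampled pages.

#include <stdbool.h>
#include <stddef.h>
#include <lz4.h>

#define SAMPLE_BLCKSZ 8192

/* Test-compress a few sampled pages; return true if FPW compression for
 * this relation looks worth its CPU cost. */
static bool
relation_looks_compressible(const char *pages, int npages)
{
    long        raw = 0;
    long        compressed = 0;
    char        scratch[LZ4_COMPRESSBOUND(SAMPLE_BLCKSZ)];
    int         i;

    for (i = 0; i < npages; i++)
    {
        int         n = LZ4_compress_default(pages + (size_t) i * SAMPLE_BLCKSZ,
                                             scratch, SAMPLE_BLCKSZ,
                                             (int) sizeof(scratch));

        raw += SAMPLE_BLCKSZ;
        compressed += (n > 0) ? n : SAMPLE_BLCKSZ;
    }
    /* require an expected saving of at least 25% before setting the flag */
    return npages > 0 && compressed * 4 <= raw * 3;
}

Autovacuum would already have the pages in hand, so the cost of the sample is mostly the handful of extra compression calls.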
Re: [HACKERS] Compression of full-page-writes
On Fri, Oct 11, 2013 at 10:36 PM, Andres Freund and...@2ndquadrant.com wrote: On 2013-10-11 09:22:50 +0530, Amit Kapila wrote: I think it will be difficult to prove by using any compression algorithm, that it compresses in most of the scenario's. In many cases it can so happen that the WAL will also not be reduced and tps can also come down if the data is non-compressible, because any compression algorithm will have to try to compress the data and it will burn some cpu for that, which inturn will reduce tps. Then those concepts maybe aren't such a good idea after all. Storing lots of compressible data in an uncompressed fashion isn't an all that common usecase. I most certainly don't want postgres to optimize for blank padded data, especially if it can hurt other scenarios. Just not enough benefit. That said, I actually have relatively high hopes for compressing full page writes. There often enough is lot of repetitiveness between rows on the same page that it should be useful outside of such strange scenarios. But maybe pglz is just not a good fit for this, it really isn't a very good algorithm in this day and aage. Do you think that if WAL reduction or performance with other compression algorithm (for ex. snappy) is better, then chances of getting the new compression algorithm in postresql will be more? Wouldn't it be okay, if we have GUC to enable it and have pluggable api for calling compression method, with this we can even include other compression algorithm's if they proved to be good and reduce the dependency of this patch on inclusion of new compression methods in postgresql? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 10 October 2013 23:06 Fujii Masao wrote: On Wed, Oct 9, 2013 at 1:35 PM, Haribabu kommi haribabu.ko...@huawei.com wrote: Thread-1 Threads-2 Head code FPW compressHead code FPW compress Pgbench-org 5min138(0.24GB) 131(0.04GB) 160(0.28GB) 163(0.05GB) Pgbench-1000 5min 140(0.29GB) 128(0.03GB) 160(0.33GB) 162(0.02GB) Pgbench-org 15min 141(0.59GB) 136(0.12GB) 160(0.65GB) 162(0.14GB) Pgbench-1000 15min 138(0.81GB) 134(0.11GB) 159(0.92GB) 162(0.18GB) Pgbench-org - original pgbench Pgbench-1000 - changed pgbench with a record size of 1000. This means that you changed the data type of pgbench_accounts.filler to char(1000)? Yes, I changed the filler column as char(1000). Regards, Hari babu. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 2013-10-11 09:22:50 +0530, Amit Kapila wrote: I think it will be difficult to prove by using any compression algorithm, that it compresses in most of the scenario's. In many cases it can so happen that the WAL will also not be reduced and tps can also come down if the data is non-compressible, because any compression algorithm will have to try to compress the data and it will burn some cpu for that, which inturn will reduce tps. Then those concepts maybe aren't such a good idea after all. Storing lots of compressible data in an uncompressed fashion isn't an all that common usecase. I most certainly don't want postgres to optimize for blank padded data, especially if it can hurt other scenarios. Just not enough benefit. That said, I actually have relatively high hopes for compressing full page writes. There often enough is lot of repetitiveness between rows on the same page that it should be useful outside of such strange scenarios. But maybe pglz is just not a good fit for this, it really isn't a very good algorithm in this day and aage. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
Hi, I did a partial review of this patch, wherein I focused on the patch and the code itself, as I saw other contributors already did some testing on it, so that we know it applies cleanly and works to some good extent. Fujii Masao masao.fu...@gmail.com writes: In this patch, full_page_writes accepts three values: on, compress, and off. When it's set to compress, the full page image is compressed before it's inserted into the WAL buffers. Code review: In full_page_writes_str() why are you returning unrecognized rather than doing an ELOG(ERROR, …) for this unexpected situation? The code switches to compression (or tries to) when the following condition is met: + if (fpw <= FULL_PAGE_WRITES_COMPRESS) + { + rdt->data = CompressBackupBlock(page, BLCKSZ - bkpb->hole_length, &(rdt->len)); We have + typedef enum FullPageWritesLevel + { + FULL_PAGE_WRITES_OFF = 0, + FULL_PAGE_WRITES_COMPRESS, + FULL_PAGE_WRITES_ON + } FullPageWritesLevel; + #define FullPageWritesIsNeeded(fpw) (fpw >= FULL_PAGE_WRITES_COMPRESS) I don't much like using the <= test against an ENUM and I'm not sure I understand the intention you have here. It somehow looks like a typo and disagrees with the macro. What about using the FullPageWritesIsNeeded macro, and maybe rewriting the macro as #define FullPageWritesIsNeeded(fpw) \ (fpw == FULL_PAGE_WRITES_COMPRESS || fpw == FULL_PAGE_WRITES_ON) Also, having on imply compress is a little funny to me. Maybe we should just finish our testing and be happy to always compress the full page writes. What would the downside be, exactly (on a busy IO system, writing less data even if needing more CPU will be the right trade-off)? I like that you're checking the savings of the compressed data with respect to the uncompressed data and cancel the compression if there's no gain. I wonder if your test accounts for enough padding and headers, though, given the results we saw in other tests made in this thread. Why do we have both the static function full_page_writes_str() and the macro FullPageWritesStr, with two different implementations issuing either true and false or on and off? ! unsigned hole_offset:15, /* number of bytes before hole */ ! flags:2, /* state of a backup block, see below */ ! hole_length:15; /* number of bytes in hole */ I don't understand that. I wanted to use that patch as leverage to smoothly discover the internals of our WAL system but won't have the time to do that here. That said, I don't even know that C syntax. + #define BKPBLOCK_UNCOMPRESSED 0 /* uncompressed */ + #define BKPBLOCK_COMPRESSED 1 /* comperssed */ There's a typo in the comment above. [time required to replay WAL generated during running pgbench] 61s (on) 1209911 transactions were replayed, recovery speed: 19834.6 transactions/sec 39s (compress) 1445446 transactions were replayed, recovery speed: 37062.7 transactions/sec 37s (off) 1629235 transactions were replayed, recovery speed: 44033.3 transactions/sec How did you get those numbers? pg_basebackup before the test and archiving, then a PITR maybe? Is it possible to do the same test with the same number of transactions to replay, I guess using the -t parameter rather than the -T one for this testing. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
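For reference, the shape the review is suggesting, written out in full. The enum matches what the patch quotes above, the macro is the review's proposed rewrite, and the last struct is only a standalone illustration of the C bit-field syntax being asked about (the same idiom ItemIdData uses); the name BkpBlockBits is invented here.

typedef enum FullPageWritesLevel
{
    FULL_PAGE_WRITES_OFF = 0,
    FULL_PAGE_WRITES_COMPRESS,
    FULL_PAGE_WRITES_ON
} FullPageWritesLevel;

/* The review's rewrite: spell out both levels instead of relying on the
 * ordering of the enum values. */
#define FullPageWritesIsNeeded(fpw) \
    ((fpw) == FULL_PAGE_WRITES_COMPRESS || (fpw) == FULL_PAGE_WRITES_ON)

/* The ":15 / :2 / :15" notation declares bit-fields: three members packed
 * into one 32-bit word.  A standalone example of the syntax: */
typedef struct BkpBlockBits
{
    unsigned    hole_offset:15, /* number of bytes before hole */
                flags:2,        /* compression state of the block */
                hole_length:15; /* number of bytes in hole */
} BkpBlockBits;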
Re: [HACKERS] Compression of full-page-writes
On Tue, Oct 8, 2013 at 10:07 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: Hi, I tested the dbt-2 benchmark on a single instance and with synchronous replication. Thanks! Unfortunately, my benchmark results did not show many differences...
* Test server
Server: HP Proliant DL360 G7
CPU: Xeon E5640 2.66GHz (1P/4C)
Memory: 18GB(PC3-10600R-9)
Disk: 146GB(15k)*4 RAID1+0
RAID controller: P410i/256MB
* Result
** Single instance **
           | NOTPM   | 90%tile   | Average | S.Deviation
-----------+---------+-----------+---------+------------
no-patched | 3322.93 | 20.469071 | 5.882   | 10.478
patched    | 3315.42 | 19.086105 | 5.669   | 9.108
** Synchronous Replication **
           | NOTPM   | 90%tile   | Average | S.Deviation
-----------+---------+-----------+---------+------------
no-patched | 3275.55 | 21.332866 | 6.072   | 9.882
patched    | 3318.82 | 18.141807 | 5.757   | 9.829
** Detail of result: http://pgstatsinfo.projects.pgfoundry.org/DBT-2_Fujii_patch/
I set full_page_writes = compress with Fujii's patch in DBT-2. But it does not seem to have much effect on eliminating WAL files. Could you let me know how many WAL records were generated during each benchmark? I think that this benchmark result clearly means that the patch has only limited effects on the reduction of WAL volume and the performance improvement unless the database contains highly-compressible data like pgbench_accounts.filler. But if we can use another compression algorithm, maybe we can reduce WAL volume very much. I'm not sure what algorithm is good for WAL compression, though. It might be better to introduce a hook for compression of FPW so that users can freely use their compression module, rather than just using pglz_compress(). Thought? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
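For what such a hook might look like, here is a sketch following the usual PostgreSQL hook convention. None of this exists in the tree: the hook type, the variable, and the contrib module below are invented for illustration; the only real pieces are the PG_MODULE_MAGIC/_PG_init machinery that any loadable module uses.

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

/* Hypothetical hook: returns the compressed length, or 0 for "store raw". */
typedef int32 (*fpw_compress_hook_type) (const char *src, int32 srclen,
                                         char *dst, int32 dstlen);

/* This variable would live in xlog.c; NULL selects the built-in pglz path. */
extern fpw_compress_hook_type fpw_compress_hook;

void        _PG_init(void);

static int32
my_fpw_compress(const char *src, int32 srclen, char *dst, int32 dstlen)
{
    /* call into lz4, snappy, ... here; returning 0 means "could not compress" */
    return 0;
}

void
_PG_init(void)
{
    fpw_compress_hook = my_fpw_compress;
}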
Re: [HACKERS] Compression of full-page-writes
On Wed, Oct 9, 2013 at 1:35 PM, Haribabu kommi haribabu.ko...@huawei.com wrote: On 08 October 2013 18:42 KONDO Mitsumasa wrote: (2013/10/08 20:13), Haribabu kommi wrote: I will test with sync_commit=on mode and provide the test results. OK. Thanks! Pgbench test results with synchronous_commit mode as on. Thanks! Thread-1 Threads-2 Head code FPW compressHead code FPW compress Pgbench-org 5min138(0.24GB) 131(0.04GB) 160(0.28GB) 163(0.05GB) Pgbench-1000 5min 140(0.29GB) 128(0.03GB) 160(0.33GB) 162(0.02GB) Pgbench-org 15min 141(0.59GB) 136(0.12GB) 160(0.65GB) 162(0.14GB) Pgbench-1000 15min 138(0.81GB) 134(0.11GB) 159(0.92GB) 162(0.18GB) Pgbench-org - original pgbench Pgbench-1000 - changed pgbench with a record size of 1000. This means that you changed the data type of pgbench_accounts.filler to char(1000)? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Oct 11, 2013 at 1:20 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Hi, I did a partial review of this patch, wherein I focused on the patch and the code itself, as I saw other contributors already did some testing on it, so that we know it applies cleanly and work to some good extend. Thanks a lot! In full_page_writes_str() why are you returning unrecognized rather than doing an ELOG(ERROR, …) for this unexpected situation? It's because the similar functions 'wal_level_str' and 'dbState' also return 'unrecognized' in the unexpected situation. I just implemented full_page_writes_str() in the same manner. If we do an elog(ERROR) in that case, pg_xlogdump would fail to dump the 'broken' (i.e., unrecognized fpw is set) WAL file. I think that some users want to use pg_xlogdump to investigate the broken WAL file, so doing an elog(ERROR) seems not good to me. The code switches to compression (or trying to) when the following condition is met: + if (fpw = FULL_PAGE_WRITES_COMPRESS) + { + rdt-data = CompressBackupBlock(page, BLCKSZ - bkpb-hole_length, (rdt-len)); We have + typedef enum FullPageWritesLevel + { + FULL_PAGE_WRITES_OFF = 0, + FULL_PAGE_WRITES_COMPRESS, + FULL_PAGE_WRITES_ON + } FullPageWritesLevel; + #define FullPageWritesIsNeeded(fpw) (fpw = FULL_PAGE_WRITES_COMPRESS) I don't much like using the = test against and ENUM and I'm not sure I understand the intention you have here. It somehow looks like a typo and disagrees with the macro. I thought that FPW should be compressed only when full_page_writes is set to 'compress' or 'off'. That is, 'off' implies a compression. When it's set to 'off', FPW is basically not generated, so there is no need to call CompressBackupBlock() in that case. But only during online base backup, FPW is forcibly generated even when it's set to 'off'. So I used the check fpw = FULL_PAGE_WRITES_COMPRESS there. What about using the FullPageWritesIsNeeded macro, and maybe rewriting the macro as #define FullPageWritesIsNeeded(fpw) \ (fpw == FULL_PAGE_WRITES_COMPRESS || fpw == FULL_PAGE_WRITES_ON) I'm OK to change the macro so that the = test is not used. Also, having on imply compress is a little funny to me. Maybe we should just finish our testing and be happy to always compress the full page writes. What would the downside be exactly (on buzy IO system writing less data even if needing more CPU will be the right trade-off). on doesn't imply compress. When full_page_writes is set to on, FPW is not compressed at all. I like that you're checking the savings of the compressed data with respect to the uncompressed data and cancel the compression if there's no gain. I wonder if your test accounts for enough padding and headers though given the results we saw in other tests made in this thread. I'm afraid that the patch has only limited effects in WAL reduction and performance improvement unless the database contains highly-compressible data like large blank characters column. It really depends on the contents of the database. So, obviously FPW compression should not be the default. Maybe we can treat it as just tuning knob. Why do we have both the static function full_page_writes_str() and the macro FullPageWritesStr, with two different implementations issuing either true and false or on and off? First I was thinking to use on and off because they are often used as the setting value of boolean GUC. But unfortunately the existing pg_xlogdump uses true and false to show the value of full_page_writes in WAL. 
To avoid breaking backward compatibility, I implemented the true/false version of the function. I'm really not sure how many people want such compatibility in pg_xlogdump, though. ! unsigned hole_offset:15, /* number of bytes before hole */ ! flags:2, /* state of a backup block, see below */ ! hole_length:15; /* number of bytes in hole */ I don't understand that. I wanted to use that patch as leverage to smoothly discover the internals of our WAL system but won't have the time to do that here. We need the flag indicating whether each FPW is compressed or not. If no such flag exists in WAL, the standby cannot determine whether it should decompress each FPW or not, and then cannot replay the WAL containing FPW properly. That is, I just used a 'space' in the header of FPW to have such a flag. That said, I don't even know that C syntax. The struct 'ItemIdData' uses the same C syntax. + #define BKPBLOCK_UNCOMPRESSED 0 /* uncompressed */ + #define BKPBLOCK_COMPRESSED 1 /* comperssed */ There's a typo in the comment above. Yep. [time required to replay WAL generated during running pgbench] 61s (on) 1209911 transactions were replayed, recovery speed: 19834.6 transactions/sec 39s
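The two behaviours described here — mark each image with its state, and fall back to the raw page when compression saves nothing — fit together roughly like the sketch below. The BKPBLOCK_* values are the ones quoted from the patch; the struct, the helper, and the function-pointer argument are invented for the example.

#include <string.h>

#define BKPBLOCK_UNCOMPRESSED 0
#define BKPBLOCK_COMPRESSED   1

/* Invented container for one full-page image. */
typedef struct FpwImage
{
    unsigned    flags;          /* BKPBLOCK_COMPRESSED or BKPBLOCK_UNCOMPRESSED */
    unsigned    length;         /* bytes actually stored in data[] */
    char        data[8192];
} FpwImage;

static void
fpw_store(FpwImage *img, const char *page, unsigned pagelen,
          unsigned (*try_compress) (const char *src, unsigned srclen,
                                    char *dst, unsigned dstlen))
{
    unsigned    clen = try_compress(page, pagelen, img->data, sizeof(img->data));

    if (clen > 0 && clen < pagelen)
    {
        /* the flag tells the standby it must decompress at replay time */
        img->flags = BKPBLOCK_COMPRESSED;
        img->length = clen;
    }
    else
    {
        /* no saving: store the raw page, so replay needs no extra CPU */
        img->flags = BKPBLOCK_UNCOMPRESSED;
        memcpy(img->data, page, pagelen);
        img->length = pagelen;
    }
}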
Re: [HACKERS] Compression of full-page-writes
Hi, On 2013-10-11 03:44:01 +0900, Fujii Masao wrote: I'm afraid that the patch has only limited effects in WAL reduction and performance improvement unless the database contains highly-compressible data like large blank characters column. It really depends on the contents of the database. So, obviously FPW compression should not be the default. Maybe we can treat it as just tuning knob. Have you tried using lz4 (or snappy) instead of pglz? There's a patch adding it to pg in http://archives.postgresql.org/message-id/20130621000900.GA12425%40alap2.anarazel.de If this really is only a benefit in scenarios with lots of such data, I have to say I have my doubts about the benefits of the patch. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Fri, Oct 11, 2013 at 3:44 AM, Fujii Masao masao.fu...@gmail.com wrote: On Fri, Oct 11, 2013 at 1:20 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Hi, I did a partial review of this patch, wherein I focused on the patch and the code itself, as I saw other contributors already did some testing on it, so that we know it applies cleanly and work to some good extend. Thanks a lot! In full_page_writes_str() why are you returning unrecognized rather than doing an ELOG(ERROR, …) for this unexpected situation? It's because the similar functions 'wal_level_str' and 'dbState' also return 'unrecognized' in the unexpected situation. I just implemented full_page_writes_str() in the same manner. If we do an elog(ERROR) in that case, pg_xlogdump would fail to dump the 'broken' (i.e., unrecognized fpw is set) WAL file. I think that some users want to use pg_xlogdump to investigate the broken WAL file, so doing an elog(ERROR) seems not good to me. The code switches to compression (or trying to) when the following condition is met: + if (fpw = FULL_PAGE_WRITES_COMPRESS) + { + rdt-data = CompressBackupBlock(page, BLCKSZ - bkpb-hole_length, (rdt-len)); We have + typedef enum FullPageWritesLevel + { + FULL_PAGE_WRITES_OFF = 0, + FULL_PAGE_WRITES_COMPRESS, + FULL_PAGE_WRITES_ON + } FullPageWritesLevel; + #define FullPageWritesIsNeeded(fpw) (fpw = FULL_PAGE_WRITES_COMPRESS) I don't much like using the = test against and ENUM and I'm not sure I understand the intention you have here. It somehow looks like a typo and disagrees with the macro. I thought that FPW should be compressed only when full_page_writes is set to 'compress' or 'off'. That is, 'off' implies a compression. When it's set to 'off', FPW is basically not generated, so there is no need to call CompressBackupBlock() in that case. But only during online base backup, FPW is forcibly generated even when it's set to 'off'. So I used the check fpw = FULL_PAGE_WRITES_COMPRESS there. What about using the FullPageWritesIsNeeded macro, and maybe rewriting the macro as #define FullPageWritesIsNeeded(fpw) \ (fpw == FULL_PAGE_WRITES_COMPRESS || fpw == FULL_PAGE_WRITES_ON) I'm OK to change the macro so that the = test is not used. Also, having on imply compress is a little funny to me. Maybe we should just finish our testing and be happy to always compress the full page writes. What would the downside be exactly (on buzy IO system writing less data even if needing more CPU will be the right trade-off). on doesn't imply compress. When full_page_writes is set to on, FPW is not compressed at all. I like that you're checking the savings of the compressed data with respect to the uncompressed data and cancel the compression if there's no gain. I wonder if your test accounts for enough padding and headers though given the results we saw in other tests made in this thread. I'm afraid that the patch has only limited effects in WAL reduction and performance improvement unless the database contains highly-compressible data like large blank characters column. It really depends on the contents of the database. So, obviously FPW compression should not be the default. Maybe we can treat it as just tuning knob. Why do we have both the static function full_page_writes_str() and the macro FullPageWritesStr, with two different implementations issuing either true and false or on and off? First I was thinking to use on and off because they are often used as the setting value of boolean GUC. 
But unfortunately the existing pg_xlogdump uses true and false to show the value of full_page_writes in WAL. To avoid breaking the backward compatibility, I implmented the true/false version of function. I'm really not sure how many people want such a compatibility of pg_xlogdump, though. ! unsignedhole_offset:15, /* number of bytes before hole */ ! flags:2,/* state of a backup block, see below */ ! hole_length:15; /* number of bytes in hole */ I don't understand that. I wanted to use that patch as a leverage to smoothly discover the internals of our WAL system but won't have the time to do that here. We need the flag indicating whether each FPW is compressed or not. If no such a flag exists in WAL, the standby cannot determine whether it should decompress each FPW or not, and then cannot replay the WAL containing FPW properly. That is, I just used a 'space' in the header of FPW to have such a flag. That said, I don't even know that C syntax. The struct 'ItemIdData' uses the same C syntax. + #define BKPBLOCK_UNCOMPRESSED 0 /* uncompressed */ + #define BKPBLOCK_COMPRESSED 1 /* comperssed */ There's a typo in the comment above. Yep. [time required to replay WAL generated during running pgbench] 61s
Re: [HACKERS] Compression of full-page-writes
On Fri, Oct 11, 2013 at 8:35 AM, Andres Freund and...@2ndquadrant.com wrote: Hi, On 2013-10-11 03:44:01 +0900, Fujii Masao wrote: I'm afraid that the patch has only limited effects in WAL reduction and performance improvement unless the database contains highly-compressible data like large blank characters column. It really depends on the contents of the database. So, obviously FPW compression should not be the default. Maybe we can treat it as just tuning knob. Have you tried using lz4 (or snappy) instead of pglz? There's a patch adding it to pg in http://archives.postgresql.org/message-id/20130621000900.GA12425%40alap2.anarazel.de Yeah, it's worth checking them! Will do that. If this really is only a benefit in scenarios with lots of such data, I have to say I have my doubts about the benefits of the patch. Yep, maybe the patch needs to be redesigned. Currently in the patch compression is performed per FPW, i.e., the size of data to compress is just 8KB. If we can increase the size of data to compress, we might be able to improve the compression ratio. For example, by storing all outstanding WAL data temporarily in local buffer, compressing them, and then storing the compressed WAL data to WAL buffers. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
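A very rough sketch of the batching idea — accumulate record data in a local buffer and compress it in one go rather than per 8KB image — follows. Everything in it (names, the 64KB limit, the use of LZ4) is invented for illustration, and it deliberately ignores the hard parts such as LSN assignment, crash consistency, and concurrent WAL inserters.

#include <string.h>
#include <lz4.h>

#define BATCH_LIMIT (64 * 1024)

typedef struct WalCompressBatch
{
    char        raw[BATCH_LIMIT];
    int         used;
} WalCompressBatch;

/* Returns 0 if the record was appended, 1 if the caller must flush first. */
static int
batch_add(WalCompressBatch *b, const char *rec, int len)
{
    if (b->used + len > BATCH_LIMIT)
        return 1;
    memcpy(b->raw + b->used, rec, len);
    b->used += len;
    return 0;
}

/* Compress the whole batch with one call; a larger input gives the
 * compressor more repeated row data to reference than a single page does. */
static int
batch_compress(const WalCompressBatch *b, char *dst, int dstlen)
{
    return LZ4_compress_default(b->raw, dst, b->used, dstlen);
}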
Re: [HACKERS] Compression of full-page-writes
On Fri, Oct 11, 2013 at 5:05 AM, Andres Freund and...@2ndquadrant.com wrote: Hi, On 2013-10-11 03:44:01 +0900, Fujii Masao wrote: I'm afraid that the patch has only limited effects in WAL reduction and performance improvement unless the database contains highly-compressible data like large blank characters column. It really depends on the contents of the database. So, obviously FPW compression should not be the default. Maybe we can treat it as just tuning knob. Have you tried using lz4 (or snappy) instead of pglz? There's a patch adding it to pg in http://archives.postgresql.org/message-id/20130621000900.GA12425%40alap2.anarazel.de If this really is only a benefit in scenarios with lots of such data, I have to say I have my doubts about the benefits of the patch. I think it will be difficult to prove by using any compression algorithm, that it compresses in most of the scenario's. In many cases it can so happen that the WAL will also not be reduced and tps can also come down if the data is non-compressible, because any compression algorithm will have to try to compress the data and it will burn some cpu for that, which inturn will reduce tps. As this patch is giving a knob to user to turn compression on/off, so users can decide if they want such benefit. Now some users can say that they have no idea, how or what kind of data will be there in their databases, so such kind of users should not use this option, but on the other side some users know that they have similar pattern of data, so they can get benefit out of such optimisations. For example in telecom industry, i have seen that they have lot of data as CDR's (call data records) in their HLR databases for which the data records will be different but of same pattern. Being said above, I think both this patch and my patch WAL reduction for Update (https://commitfest.postgresql.org/action/patch_view?id=1209) are using same technique for WAL compression and can lead to similar consequences in different ways. So I suggest to have unified method to enable WAL Compression for both the patches. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
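If the two patches ended up sharing one knob, the conventional way to expose it would be an enum GUC; a sketch using the standard config_enum_entry machinery is below. The GUC name wal_compression and its value set are purely illustrative here, not something either patch proposes.

#include "postgres.h"
#include "utils/guc.h"

typedef enum WalCompressionMode
{
    WAL_COMPRESSION_OFF = 0,
    WAL_COMPRESSION_FPW,        /* compress full-page images only */
    WAL_COMPRESSION_UPDATE,     /* compress update-delta records only */
    WAL_COMPRESSION_ALL
} WalCompressionMode;

/* Candidate value list for a hypothetical wal_compression GUC. */
static const struct config_enum_entry wal_compression_options[] = {
    {"off", WAL_COMPRESSION_OFF, false},
    {"fpw", WAL_COMPRESSION_FPW, false},
    {"update", WAL_COMPRESSION_UPDATE, false},
    {"all", WAL_COMPRESSION_ALL, false},
    {NULL, 0, false}
};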
Re: [HACKERS] Compression of full-page-writes
(2013/10/08 17:33), Haribabu kommi wrote: The checkpoint_timeout and checkpoint_segments are increased to make sure no checkpoint happens during the test run. Your setting easily triggers checkpoints with checkpoint_segments = 256. I don't know the number of disks in your test server; on my test server, which has 4 magnetic disks (15k rpm), postgres generates 50 - 100 WAL files per minute. And I cannot understand your setting of sync_commit = off. This setting tends to cause a CPU bottleneck and data loss, and it is not typical database usage. Therefore, your test is not a fair comparison for Fujii's patch. Going back to my DBT-2 benchmark, I have not gotten better performance (almost the same performance). So I am now checking the hunks, my settings, and whether something is wrong in Fujii's patch. I am going to try to send test results tonight. Regards, -- Mitsumasa KONDO NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 2013-09-11 12:43:21 +0200, Andres Freund wrote: On 2013-09-11 19:39:14 +0900, Fujii Masao wrote: * Benchmark pgbench -c 32 -j 4 -T 900 -M prepared scaling factor: 100 checkpoint_segments = 1024 checkpoint_timeout = 5min (every checkpoint during benchmark were triggered by checkpoint_timeout) * Result [tps] 1344.2 (full_page_writes = on) 1605.9 (compress) 1810.1 (off) [the amount of WAL generated during running pgbench] 4422 MB (on) 1517 MB (compress) 885 MB (off) [time required to replay WAL generated during running pgbench] 61s (on) 1209911 transactions were replayed, recovery speed: 19834.6 transactions/sec 39s (compress) 1445446 transactions were replayed, recovery speed: 37062.7 transactions/sec 37s (off) 1629235 transactions were replayed, recovery speed: 44033.3 transactions/sec ISTM for those benchmarks you should use an absolute number of transactions, not one based on elapsed time. Otherwise the comparison isn't really meaningful. I really think we need to see recovery time benchmarks with a constant amount of transactions to judge this properly. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 08 October 2013 15:22 KONDO Mitsumasa wrote: (2013/10/08 17:33), Haribabu kommi wrote: The checkpoint_timeout and checkpoint_segments are increased to make sure no checkpoint happens during the test run. Your setting easily triggers checkpoints with checkpoint_segments = 256. I don't know the number of disks in your test server; on my test server, which has 4 magnetic disks (15k rpm), postgres generates 50 - 100 WAL files per minute. A manual checkpoint was executed before the start of the test, and it was verified that no checkpoint happened during the run by increasing checkpoint_warning. And I cannot understand your setting of sync_commit = off. This setting tends to cause a CPU bottleneck and data loss, and it is not typical database usage. Therefore, your test is not a fair comparison for Fujii's patch. I chose the sync_commit=off mode because it generates more tps and thus increases the volume of WAL. I will test with sync_commit=on mode and provide the test results. Regards, Hari babu. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers