On Mon, Jan 20, 2014 at 9:49 PM, Robert Haas <robertmh...@gmail.com> wrote:
>
> I ran Heikki's test suite on latest master and latest master plus
> pgrb_delta_encoding_v4.patch on a PPC64 machine, but the results
> didn't look too good. The only tests where the WAL volume changed by
> more than half a percent were the "one short and one long field, no
> change" test, where it dropped by 17%, but at the expense of an
> increase in duration of 38%; and the "hundred tiny fields, half
> nulled" test, where it dropped by 2% without a change in runtime.
>
> Unfortunately, some of the tests where WAL didn't change significantly
> took a runtime hit - in particular, "hundred tiny fields, half
> changed" slowed down by 10% and "hundred tiny fields, all changed" by
> 8%.

I think this part of the result is positive, as with earlier approaches
the dip here was > 20%. Refer to the results posted at:
http://www.postgresql.org/message-id/51366323.8070...@vmware.com

> I've attached the full results in OpenOffice format.
>
> Profiling the "one short and one long field, no change" test turns up
> the following:
>
>     51.38%   postgres   pgrb_delta_encode
>     23.58%   postgres   XLogInsert
>      2.54%   postgres   heap_update
>      1.09%   postgres   LWLockRelease
>      0.90%   postgres   LWLockAcquire
>      0.89%   postgres   palloc0
>      0.88%   postgres   log_heap_update
>      0.84%   postgres   HeapTupleSatisfiesMVCC
>      0.75%   postgres   ExecModifyTable
>      0.73%   postgres   hash_search_with_hash_value
>
> Yipes. That's a lot more than I remember this costing before. And I
> don't understand why I'm seeing such a large time hit on this test
> where you actually saw a significant time *reduction*. One
> possibility is that you may have been running with a default
> checkpoint_segments value or one that's low enough to force
> checkpointing activity during the test. I ran with
> checkpoint_segments=300.

I ran with checkpoint_segments = 128, and when I ran with v4 I also see
a WAL reduction similar to what you are seeing, except that in my case
the runtimes of the two are almost the same (I think in your case disk
writes are fast, so the CPU overhead is more visible).

I think the major difference in the above test is due to the below part
of the code:

pgrb_find_match()
{
..
+	/* if (match_chunk)
+	{
+		while (*ip == *hp)
+		{
+			matchlen++;
+			ip++;
+			hp++;
+		}
+	} */
}

Basically, if we don't go for a longer match, then for a test where most
of the data is similar ("one short and one long field, no change"), the
encoder has to do the below extra steps with no advantage:

a. copy extra tags
b. calculate the rolling hash
c. find the match

I think the major cost here is due to (a), but the others might also not
be free. To confirm the theory, if we run the test with the above code
un-commented, there should be a significant change in both WAL reduction
and runtime for this test (a rough sketch of what that extension does is
included below).

I have one idea to avoid the overhead of step (a), which is to combine
the tags: don't write a tag until we hit any un-matching data, and when
un-matched data is found, combine all the previously matched data and
write it as one tag. This should eliminate the overhead due to step (a).
A rough sketch of this idea is also below.
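To make the first point concrete, here is a self-contained rendering of
what the commented-out block in pgrb_find_match() does (the function name
extend_match and the bounds checks are my additions, not the patch code):
once a chunk-level match has been found, keep extending it byte by byte,
so one long match can be emitted as a single tag instead of re-hashing
and re-matching every following chunk.

#include <stddef.h>

/*
 * Sketch only: extend an already-found chunk match byte by byte.
 * 'ip' points into the new tuple, 'hp' into the history (old tuple);
 * 'matchlen' is the length matched so far.  The bounds checks are
 * added here for safety; the quoted snippet relies on its callers.
 */
static size_t
extend_match(const char *ip, const char *ip_end,
             const char *hp, const char *hp_end,
             size_t matchlen)
{
	while (ip < ip_end && hp < hp_end && *ip == *hp)
	{
		matchlen++;
		ip++;
		hp++;
	}
	return matchlen;
}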
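And here is a rough sketch of the tag-combining idea. Everything in it is
hypothetical (the chunk size, the find_chunk_in_history / out_match_tag /
out_literal helpers, and the assumption that a combined tag needs the
matched chunks to be contiguous in the history); it is only meant to show
the shape of the loop: keep growing the pending match while chunks keep
matching, and emit a single tag only when un-matched data shows up.

#include <stddef.h>
#include <stdbool.h>

#define CHUNK_SIZE 4	/* assumed chunk length */

/* assumed stand-ins for the real rolling-hash lookup and output routines */
extern bool find_chunk_in_history(const char *chunk, size_t *hist_off);
extern void out_match_tag(char **dest, size_t hist_off, size_t len);
extern void out_literal(char **dest, char byte);

static void
encode_combining_tags(const char *src, size_t src_len, char *dest)
{
	size_t	sp = 0;
	size_t	run_off = 0;	/* history offset where the pending run starts */
	size_t	run_len = 0;	/* length of the pending (not yet written) run */

	while (sp + CHUNK_SIZE <= src_len)
	{
		size_t	hist_off;

		if (find_chunk_in_history(src + sp, &hist_off))
		{
			if (run_len > 0 && hist_off != run_off + run_len)
			{
				/* matched, but not contiguous with the run: flush it */
				out_match_tag(&dest, run_off, run_len);
				run_len = 0;
			}
			if (run_len == 0)
				run_off = hist_off;
			/* grow the run; note that no tag is written here */
			run_len += CHUNK_SIZE;
			sp += CHUNK_SIZE;
		}
		else
		{
			/* un-matched data: write one combined tag for the whole run */
			if (run_len > 0)
			{
				out_match_tag(&dest, run_off, run_len);
				run_len = 0;
			}
			out_literal(&dest, src[sp++]);
		}
	}

	/* flush any pending run, then the trailing bytes as literals */
	if (run_len > 0)
		out_match_tag(&dest, run_off, run_len);
	while (sp < src_len)
		out_literal(&dest, src[sp++]);
}

Note that steps (b) and (c) still happen for every chunk in this sketch;
only the per-tag overhead of step (a) goes away, which is the point of
the idea.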
Can we think of any way in which, in spite of doing longer matches, we
can retain the sanctity of this approach? One way could be to check
whether the match after the chunk is long enough that it matches the
rest of the string, but I think that can create problems in some other
cases.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers