On Fri, Jan 31, 2014 at 12:33 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Thu, Jan 30, 2014 at 12:23 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> On Wed, Jan 29, 2014 at 8:13 PM, Heikki Linnakangas
>> <hlinnakan...@vmware.com> wrote:
>>
>> After basic verification of back-to-pglz-like-delta-encoding-1, I will
>> take the data with both the patches and report the same.
>
> I have corrected the problems reported in back-to-pglz-like-delta-encoding-1,
> removed hindex from pgrb_delta_encoding_v6, and attached are new versions
> of both patches.
>
> I/O Reduction Data
> -----------------------------
> Non-default settings:
> autovacuum = off
> checkpoint_segments = 256
> checkpoint_timeout = 15min
>
> Observations
> --------------------
> 1. With both patches, the WAL reduction is similar: ~37% for
>    "one short and one long field, no change" and ~12% for
>    "hundred tiny fields, half nulled".
> 2. With pgrb_delta_encoding_v7, there is ~19% CPU reduction for the best
>    case, "one short and one long field, no change".
> 3. With pgrb_delta_encoding_v7, there is approximately 8~9% CPU overhead
>    for cases where there is no match.
> 4. With pgrb_delta_encoding_v7, there is approximately 15~18% CPU overhead
>    for the "hundred tiny fields, half nulled" case.
> 5. With back-to-pglz-like-delta-encoding-2, the data is mostly similar,
>    except for "hundred tiny fields, half nulled", where the CPU overhead
>    is much higher.
>
> The case ("hundred tiny fields, half nulled") where the CPU overhead is
> visible is due to repetitive data; if we take some random or different
> data, it will not be there.
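To make the repetitive-data effect concrete, below is a minimal, hypothetical
sketch of a pglz-style history table with hash chains (illustration only;
names like hist_add/hist_find_longest are made up and this is not the actual
patch code). With highly repetitive input, nearly every offset hashes to the
same bucket, so the longest-match loop has to traverse back through almost
the entire history on each lookup:

#include <stddef.h>

#define HIST_HASH_SIZE		8192
#define HIST_MAX_ENTRIES	4096

typedef struct HistEntry
{
	struct HistEntry *next;		/* next (older) entry in the same bucket */
	const char *pos;			/* start of this chunk in the old tuple */
} HistEntry;

static HistEntry *hist_buckets[HIST_HASH_SIZE];
static HistEntry hist_entries[HIST_MAX_ENTRIES];
static int	hist_count = 0;
static const char *hist_end;	/* end of the old tuple */

/* Toy hash over a 4-byte window; the caller guarantees 4 readable bytes. */
static unsigned int
hist_hash(const char *p)
{
	return (((unsigned char) p[0] << 6) ^ ((unsigned char) p[1] << 4) ^
			((unsigned char) p[2] << 2) ^ (unsigned char) p[3]) &
		(HIST_HASH_SIZE - 1);
}

/*
 * Entries are pushed at the *front* of the bucket's chain, so the chain
 * head is always the most recently seen (i.e. highest) offset.
 */
static void
hist_add(const char *pos)
{
	unsigned int hash = hist_hash(pos);
	HistEntry  *e;

	if (hist_count >= HIST_MAX_ENTRIES)
		return;					/* this sketch does no entry recycling */
	e = &hist_entries[hist_count++];
	e->pos = pos;
	e->next = hist_buckets[hash];
	hist_buckets[hash] = e;
}

/*
 * Longest-match search.  With repetitive input such as "hundred tiny
 * fields, half nulled", nearly every offset lands in one bucket, so this
 * loop traverses back through almost the whole history on every lookup.
 */
static size_t
hist_find_longest(const char *input, const char *input_end,
				  const char **matchp)
{
	size_t		best = 0;
	HistEntry  *e;

	for (e = hist_buckets[hist_hash(input)]; e != NULL; e = e->next)
	{
		size_t		len = 0;

		while (input + len < input_end && e->pos + len < hist_end &&
			   e->pos[len] == input[len])
			len++;
		if (len > best)
		{
			best = len;
			*matchp = e->pos;
		}
	}
	return best;
}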
To verify this theory, I have added one new test which is almost the same as
"hundred tiny fields, half nulled"; the difference is that it uses a
non-repetitive string. The results are as below:

Unpatched
--------------
                       testname                       | wal_generated |     duration
------------------------------------------------------+---------------+------------------
 nine short and one long field, thirty percent change |     698912496 | 12.1819660663605
 nine short and one long field, thirty percent change |     698906048 | 11.9409539699554
 nine short and one long field, thirty percent change |     698910904 | 11.9367880821228

Patch pgrb_delta_encoding_v7
------------------------------------------------
                       testname                       | wal_generated |     duration
------------------------------------------------------+---------------+------------------
 nine short and one long field, thirty percent change |     559840840 | 11.6027710437775
 nine short and one long field, thirty percent change |     559829440 | 11.8239741325378
 nine short and one long field, thirty percent change |     560141352 | 11.6789472103119

Patch back-to-pglz-like-delta-encoding-2
----------------------------------------------------------
                       testname                       | wal_generated |     duration
------------------------------------------------------+---------------+------------------
 nine short and one long field, thirty percent change |     544391432 | 12.3666560649872
 nine short and one long field, thirty percent change |     544378616 | 11.8833730220795
 nine short and one long field, thirty percent change |     544376888 | 11.9487581253052
(3 rows)

The basic idea of the new test is that one part of the tuple is unchanged and
the other part is changed; here the unchanged part contains a random string
rather than a repetitive set of chars. The new test is added along with the
other tests in the attached file.

Observation
-------------------
LZ-like delta encoding gives more WAL reduction and chunk-wise encoding has
slightly better CPU usage, but overall both are quite close.

> I think the main reason for the overhead is that we store the last offset
> of matching data at the front of the history chain, so during a match it
> has to traverse back many times to find the longest possible match; in the
> real world it won't be the case that most history entries contain the same
> hash index, so it should not have an effect.

If we want to improve CPU usage for cases like "hundred tiny fields, half
nulled" (which I think is not important), forming the history table by
traversing from the end rather than the beginning can serve the purpose. I
have not tried it, but I think it can certainly help; a rough sketch of the
idea is below.

Do you think the overall data is acceptable?
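For illustration, here is one way to read that suggestion (hypothetical and
untried; it reuses hist_add() and hist_end from the sketch earlier in this
mail). Since entries are prepended to their bucket's chain, the direction in
which we scan the old tuple decides which offsets sit at the chain heads:

/*
 * Build the history over the old tuple front-to-back: with front
 * insertion (hist_add above), the chain heads end up holding the
 * *latest* offsets, so a longest-match search on repetitive data must
 * traverse the whole chain before it sees the early offsets.
 */
static void
hist_build_forward(const char *old, const char *old_end)
{
	const char *p;

	hist_end = old_end;
	for (p = old; p + 4 <= old_end; p++)
		hist_add(p);
}

/*
 * Build it back-to-front instead: the chain heads now hold the
 * *earliest* offsets.  An early offset has the most room to extend
 * before the end of the old tuple, so a search that stops at the first
 * "good enough" match can terminate without walking the rest of the
 * chain.
 */
static void
hist_build_backward(const char *old, const char *old_end)
{
	long		i;

	hist_end = old_end;
	for (i = (long) (old_end - old) - 4; i >= 0; i--)
		hist_add(old + i);
}

With backward construction, a search that accepts the first sufficiently long
candidate tends to stop at an early offset instead of first wading through
every later duplicate, which is exactly the repetitive-data worst case.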
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment: wal-update-testsuite.sh (Bourne shell script)