Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

Tomas Vondra Wed, 02 Nov 2016 10:19:22 -0700

On 11/02/2016 05:52 PM, Amit Kapila wrote:

On Wed, Nov 2, 2016 at 9:01 AM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:

On 11/01/2016 08:13 PM, Robert Haas wrote:


On Mon, Oct 31, 2016 at 5:48 PM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:


The one remaining thing is the strange zig-zag behavior, but that might
easily be due to scheduling in the kernel, or something else. I don't
consider it a blocker for any of the patches, though.


The only reason I could think of for that zig-zag behaviour is
frequent access to multiple clog pages, which could be due to the
reasons below:

a. A transaction and its subtransactions (IIRC, Dilip's case has one
main transaction and two subtransactions) can't fit into the same page,
in which case the group_update optimization won't apply and I don't
think we can do anything about it.
b. In the same group, multiple clog pages are being accessed.  It is
not a likely scenario, but it can happen and we might be able to
improve a bit if that is happening.
c. Transactions try to update different clog pages at the same time.
I think, as mentioned upthread, we can handle it by using slots and
allowing multiple groups to work together instead of a single group.
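
To make cases (a) and (b) concrete, here is a minimal sketch of how a
transaction ID maps to a clog page. It assumes the default 8 kB block
size and 2 status bits per transaction, as in clog.c. The helper names
(xid_to_page, same_clog_page, group_pages) are made up for illustration
and are not functions from the patches.

```python
# Sketch only: constants mirror clog.c with the default BLCKSZ (assumption).
BLCKSZ = 8192                                        # default block size in bytes
CLOG_BITS_PER_XACT = 2                               # status bits per transaction
CLOG_XACTS_PER_BYTE = 8 // CLOG_BITS_PER_XACT        # 4 transactions per byte
CLOG_XACTS_PER_PAGE = BLCKSZ * CLOG_XACTS_PER_BYTE   # 32768 transactions per page


def xid_to_page(xid):
    """Return the clog page number that holds the status bits of xid."""
    return xid // CLOG_XACTS_PER_PAGE


def same_clog_page(xid, subxids):
    """Case (a): True if the transaction and all subtransactions share a page."""
    page = xid_to_page(xid)
    return all(xid_to_page(sub) == page for sub in subxids)


def group_pages(member_xids):
    """Case (b): the set of distinct clog pages touched by one group."""
    return {xid_to_page(xid) for xid in member_xids}
```

For example, a main transaction with XID 32767 and subtransactions
32768 and 32769 straddles a page boundary, so same_clog_page() returns
False for it. That is the situation in which the first new log message
would be expected.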

To check if there is any impact due to (a) or (b), I have added a few
log messages to the code (patch - group_update_clog_v9_log). The log
message could be "all xacts are not on same page" or "Group contains
different pages".

Patch group_update_clog_v9_slots tries to address (c). So if there
is any problem due to (c), this patch should improve the situation.

Can you please try to run the test where you saw the zig-zag behaviour
with both patches separately? I think if anything is due to postgres,
then you will see either one of the new log messages or improved
performance. OTOH, if we see the same behaviour, then I think we can
probably assume it is due to scheduler activity and move on. Also, one
point to note here is that even when the performance is down in that
curve, it is equal to or better than HEAD.


Will do.

Based on the results with more client counts (increment by 6 clients
instead of 36), I think this really looks like something unrelated to
any of the patches - kernel, CPU, or something already present in
current master.


The attached results show that:

(a) master shows the same zig-zag behavior - No idea why this wasn't
observed on the previous runs.

(b) group_update actually seems to improve the situation, because the
performance keeps stable up to 72 clients, while on master the
fluctuation starts way earlier.

I'll redo the tests with a newer kernel - this was on 3.10.x which is
what Red Hat 7.2 uses, I'll try on 4.8.6. Then I'll try with the patches
you submitted, if the 4.8.6 kernel does not help.


Overall, I'm convinced this issue is unrelated to the patches.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


