On Mon, Oct 5, 2015 at 6:34 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Fri, Sep 11, 2015 at 8:01 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>>
>> If I am not wrong, we need a difference of 1048576 transactions
>> between records to make each CLOG access a disk access, so if we
>> increment the XID counter by 100, then probably every 10000th (or
>> multiple of 10000) transaction would go for disk access.
>>
>> The number 1048576 is derived by the calculation below:
>> #define CLOG_XACTS_PER_BYTE 4
>> #define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)
>>
>> Transaction difference required for each transaction to go for disk
>> access:
>> CLOG_XACTS_PER_PAGE * num_clog_buffers.
>
> That guarantees that every xid occupies its own 32-contiguous-pages
> chunk of clog.
>
> But clog pages are not pulled in and out in 32-page chunks, but in
> one-page chunks.  So you would only need a difference of 32,768 to get
> every real transaction to live on its own clog page, which means every
> look-up of a different real transaction would have to do a page
> replacement.

Agreed, but that doesn't affect the result of the test done above.
(The arithmetic behind these numbers is worked out at the end of this
mail.)

> (I think your references to disk access here are misleading.  Isn't
> the issue here the contention on the lock that controls the page
> replacement, not the actual IO?)

The point is that if no I/O is needed, then all the read accesses for
transaction status just take Shared locks; however, if there is an
I/O, an Exclusive lock is needed.  (A condensed sketch of that code
path also appears at the end of this mail.)

> I've attached a patch that allows you to set the guc "JJ_xid", which
> makes it burn the given number of xids every time a new one is asked
> for.  (The patch introduces lots of other stuff as well, but I didn't
> feel like ripping the irrelevant parts out--if you don't set any of
> the other gucs it introduces from their defaults, they shouldn't cause
> you trouble.)  I think there are other tools around that do the same
> thing, but this is the one I know about.  It is easy to drive the
> system into wrap-around shutdown with this, so lowering
> autovacuum_vacuum_cost_delay is a good idea.
>
> Actually I haven't attached it, because then the commitfest app will
> list it as the patch needing review; instead I've put it here:
> https://drive.google.com/file/d/0Bzqrh1SO9FcERV9EUThtT3pacmM/view?usp=sharing

Thanks; I think this could also be used for testing.

>> I think reducing it so that every 100th access for transaction status
>> is a disk access is sufficient to prove that there is no regression
>> with the patch for the scenario Andres asked about, or do you think
>> it is not?
>>
>> Another possibility here could be to try commenting out the fsync in
>> the CLOG path to see how much it impacts the performance of this
>> test, and then of the pgbench test.  I am not sure there will be any
>> impact, because even if every 100th transaction goes for disk access,
>> that is still less than the WAL fsync we have to perform for each
>> transaction.
>
> You mentioned that your clog is not on ssd, but surely at this scale
> of hardware, the hdd the clog is on has a bbu in front of it, no?

Yes.

> But I thought Andres' concern was not about fsync, but about the fact
> that the SLRU does linear scans (repeatedly) of the buffers while
> holding the control lock?  At some point, scanning more and more
> buffers under the lock is going to cause more contention than scanning
> fewer buffers and just evicting a page will.

Yes, at some point that could matter, but I could not see any impact at
64 or 128 Clog buffers.
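For reference, a minimal standalone sketch of the arithmetic discussed
above.  It assumes the default BLCKSZ of 8192 and num_clog_buffers = 32
(the buffer count is an assumption; it is the value that makes the
quoted figure of 1048576 come out, and 1048576 / 32768 = 32 is also
where the "32-contiguous-pages chunk" above comes from):

#include <stdio.h>

/* Constants as in src/backend/access/transam/clog.c; BLCKSZ is the
 * default 8192-byte block size. */
#define BLCKSZ              8192
#define CLOG_XACTS_PER_BYTE 4   /* 2 status bits per transaction */
#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)

int
main(void)
{
    int num_clog_buffers = 32;  /* assumed; matches the figures above */

    /* 8192 * 4 = 32768: spacing xids this far apart puts each real
     * transaction on its own clog page (Jeff's point), so every
     * lookup of a different transaction forces a page replacement. */
    printf("xids per clog page: %d\n", CLOG_XACTS_PER_PAGE);

    /* 32768 * 32 = 1048576: spacing xids this far apart additionally
     * guarantees the page is in none of the buffers (the original
     * calculation), making every access a disk access. */
    printf("xid spacing to defeat all %d buffers: %d\n",
           num_clog_buffers, CLOG_XACTS_PER_PAGE * num_clog_buffers);
    return 0;
}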
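And for the Shared-vs-Exclusive lock discussion, a condensed sketch of
how an SLRU status lookup resolves, modeled on
SimpleLruReadPage_ReadOnly() in src/backend/access/transam/slru.c
(abridged, not verbatim; details vary by version).  It also shows the
linear scan under the control lock that Jeff mentions: the loop length
grows with the number of buffer slots.

int
SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, TransactionId xid)
{
    SlruShared  shared = ctl->shared;
    int         slotno;

    /* Hit path: readers share the control lock, so they do not block
     * each other.  Note the linear scan over all buffer slots. */
    LWLockAcquire(shared->ControlLock, LW_SHARED);
    for (slotno = 0; slotno < shared->num_slots; slotno++)
    {
        if (shared->page_number[slotno] == pageno &&
            shared->page_status[slotno] != SLRU_PAGE_EMPTY &&
            shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS)
        {
            SlruRecentlyUsed(shared, slotno);
            return slotno;      /* caller releases the shared lock */
        }
    }

    /* Miss path: retake the lock exclusively and do a regular read,
     * which may evict a page and perform real I/O. */
    LWLockRelease(shared->ControlLock);
    LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE);
    return SimpleLruReadPage(ctl, pageno, true, xid);
}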
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com