On Sat, Jul 25, 2015 at 10:12 AM, Amit Kapila <amit.kapil...@gmail.com>
wrote:

>
> >
> > Numbers look impressive and definitely show that the idea is worth
> > pursuing. I tried the patch on my laptop. Unfortunately, at least for
> > 4 and 8 clients, I did not see any improvement.
> >
>
> I can't help with this because I think we need a somewhat
> bigger machine to test the impact of the patch.
>
>
I understand. IMHO it would be a good idea, though, to ensure that the patch
does not cause a regression in other setups, such as on a less powerful
machine or when running with a smaller number of clients.


>
> I was making that point even without my patch. Basically, I tried
> commenting out ProcArrayLock in ProcArrayEndTransaction.
>
>
I did not get that. You mean the TPS is the same if you run with
ProcArrayLock commented out in ProcArrayEndTransaction? Is that safe to do?


> > But those who don't get the lock will sleep and hence the contention is
> > moved somewhere else, at least partially.
> >
>
> Sure, if contention is reduced at one place it will move
> to the next lock.
>

What I meant was that the lock may not show up in the contention profile
because all but one of the processes now sleep while their work is done by
the group leader. So the total time spent in the critical section remains
the same, but it is not visible in the profile. Sure, your benchmark numbers
show this is still better than all processes contending for the lock.


>
>
> No, autovacuum generates I/O due to which sometimes there
> is more variation in Write tests.
>

Sure, but on average does it still show similar improvements? Or does the
test become I/O-bound, shifting the bottleneck somewhere else? Can you
please post those numbers as well when you get a chance?


>
> I can do something like that if others also agree with this new
> API in the LWLock series, but personally I don't think lwlock.c is
> the right place to expose an API for this work.  Broadly, the work
> we are doing can be thought of as the sub-tasks below:
>
> 1. Advertise each backend's xid.
> 2. Push all backends except one onto a global list.
> 3. Wait until someone wakes us and check if the xid is cleared;
>    repeat until the xid is cleared.
> 4. Acquire the lock.
> 5. Pop all the backends, clear each one's xid, and use
>    their published xids to advance the global latestCompletedXid.
> 6. Release the lock.
> 7. Wake all the processes waiting for their xid to be cleared,
>    marking each backend's xid as cleared before waking it.
>
> So among these, only step 2 can be common among different
> algorithms; the others need work specific to each optimization.
>
>
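For concreteness, here is a rough sketch of how those steps might look.
Every field and constant name below (groupXid, groupNext, groupFirst,
INVALID_PGPROCNO, GroupClearXid) is made up for illustration and not taken
from your patch:

#include "postgres.h"
#include "access/transam.h"
#include "port/atomics.h"
#include "storage/lwlock.h"
#include "storage/pg_sema.h"
#include "storage/proc.h"

/* Illustrative only: groupXid/groupNext are assumed PGPROC fields,
 * groupFirst an assumed PROC_HDR field, and INVALID_PGPROCNO a sentinel
 * meaning "list is empty". */
static void
GroupClearXid(PGPROC *proc, TransactionId latestXid)
{
    uint32      nextidx;

    /* Step 1: advertise this backend's xid for the leader to clear. */
    proc->groupXid = latestXid;

    /* Step 2: push ourselves onto the global list with a CAS loop. */
    nextidx = pg_atomic_read_u32(&ProcGlobal->groupFirst);
    do
    {
        pg_atomic_write_u32(&proc->groupNext, nextidx);
    } while (!pg_atomic_compare_exchange_u32(&ProcGlobal->groupFirst,
                                             &nextidx,
                                             (uint32) proc->pgprocno));

    if (nextidx != INVALID_PGPROCNO)
    {
        /* Step 3: someone ahead of us is the leader; sleep until woken,
         * then recheck that our xid was really cleared (the semaphore
         * also acts as a read barrier). */
        for (;;)
        {
            PGSemaphoreLock(&proc->sem);
            if (proc->groupXid == InvalidTransactionId)
                break;
        }
        return;
    }

    /* Step 4: we are the leader; acquire the lock for the whole group. */
    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);

    /* Step 5: pop the list, clear each member's advertised xid, and use
     * the published xids to advance the global latestCompletedXid. */

    /* Step 6: release the lock. */
    LWLockRelease(ProcArrayLock);

    /* Step 7: reset each waiter's groupXid to InvalidTransactionId, then
     * wake it with PGSemaphoreUnlock(&member->sem). */
}
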
Right, but if we could encapsulate that work in a function that just needs
to work on some shared memory, I think we could build such infrastructure.
It's possible, though, that more elaborate infrastructure is needed than
just one function pointer. For example, in this case we also want to set
latestCompletedXid after clearing the xids of all pending processes.
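
Something along these lines is the shape I have in mind; all names here are
made up for illustration:

/* A per-member callback runs for each popped backend while the lock is
 * held; a finish callback runs once afterwards (e.g. to advance
 * latestCompletedXid) before the lock is released. */
typedef void (*GroupMemberCallback) (PGPROC *member, void *arg);
typedef void (*GroupFinishCallback) (void *arg);

extern void PerformGroupedUpdate(LWLock *lock,
                                 pg_atomic_uint32 *groupFirst,
                                 GroupMemberCallback member_cb,
                                 GroupFinishCallback finish_cb,
                                 void *arg);

The list push and the sleep/wake machinery (step 2, plus the generic parts
of steps 3 and 7) would then live inside PerformGroupedUpdate, and only the
xid-specific work would stay with the caller.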


>
> >
> > Regarding the patch, the compare-and-exchange function calls that you've
> > used would work only on 64-bit machines, right? You would need to use
> > the equivalent 32-bit calls on a 32-bit machine.
> >
>
> I thought the internal API would automatically take care of it;
> for example, for MSVC it uses _InterlockedCompareExchange64.
> If that doesn't work on 32-bit systems or is not defined, then
> we would have to use the 32-bit version, but I am not certain
> about that.
>
>
Hmm. The pointer will be a 32-bit field on a 32-bit machine; I don't see how
exchanging it with a 64-bit integer would be safe.
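
One way to sidestep the pointer-width issue entirely would be to link the
group members through their 32-bit pgprocno values instead of PGPROC
pointers, so the same u32 compare-and-exchange works on both architectures.
A sketch, where groupFirst and groupNext are assumed (hypothetical)
pg_atomic_uint32 fields:

    uint32      head = pg_atomic_read_u32(&ProcGlobal->groupFirst);

    do
    {
        /* Point our next-link at the current head before the swap... */
        pg_atomic_write_u32(&proc->groupNext, head);
        /* ...and retry if someone else changed the head meanwhile (on
         * failure the CAS reloads 'head' with the current value). */
    } while (!pg_atomic_compare_exchange_u32(&ProcGlobal->groupFirst,
                                             &head,
                                             (uint32) proc->pgprocno));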

Thanks,
Pavan


-- 
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
