2009/9/14 Pierre Frédéric Caillaud <li...@peufeu.com>

>
> A little bit of a reply to Jeff's email about WALInsertLock.
>
> This patch instruments LWLocks, it is controlled with the following
> #define's in lwlock.c :
>
> LWLOCK_STATS
> LWLOCK_TIMING_STATS
>
> It is an upgrade of the current lwlock stats.
>

Hi Pierre,

Have you looked at the total execution time with and without
LWLOCK_TIMING_STATS enabled?

I've implemented something similar to this myself (only without attempting
to make it portable or otherwise worthy of submitting as a general-interest
patch).  What I found is that attempting to time every "hold" substantially
increased the overall run time, which I worry distorts the reported times
(cue bad Heisenberg analogies).  The problem is that gettimeofday is slow,
and on some multi-processor systems it is a global point of serialization,
making it even slower.  I decided to time only the time spent blocked
waiting for the lock, and not the time spent holding it.  That way
gettimeofday is called only twice if you actually need to block, and not at
all if you get the lock immediately.  This had a much smaller effect on
runtime, and the info produced was sufficient for my purposes.
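
For what it's worth, here is a minimal sketch of that approach.  It is not
my actual code; a trivial compare-and-swap/usleep loop stands in for
PostgreSQL's LWLock spinlock and semaphore machinery.  The point is just
where the two gettimeofday calls sit on the slow path:

#include <sys/time.h>
#include <stdint.h>
#include <stdbool.h>
#include <unistd.h>

typedef struct
{
    volatile int  locked;       /* stand-in for the real LWLock state */
    uint64_t      block_count;  /* acquisitions that had to block */
    uint64_t      block_usecs;  /* total microseconds spent blocked */
} InstrumentedLock;

/* Trivial stand-in for the real spinlock-protected lock test. */
static bool
try_acquire(InstrumentedLock *lock)
{
    return __sync_bool_compare_and_swap(&lock->locked, 0, 1);
}

/* Trivial stand-in; the real code sleeps on a semaphore instead. */
static void
wait_until_granted(InstrumentedLock *lock)
{
    while (!try_acquire(lock))
        usleep(100);
}

void
instrumented_acquire(InstrumentedLock *lock)
{
    struct timeval start, stop;

    /* Fast path: no gettimeofday at all if the lock is free. */
    if (try_acquire(lock))
        return;

    /* Slow path: exactly two gettimeofday calls, bracketing the wait. */
    gettimeofday(&start, NULL);
    wait_until_granted(lock);
    gettimeofday(&stop, NULL);

    lock->block_count++;
    lock->block_usecs += (uint64_t)
        ((stop.tv_sec - start.tv_sec) * 1000000L
         + (stop.tv_usec - start.tv_usec));
}

Timing the hold as well would put a gettimeofday pair on the fast path of
every acquisition, which is where the overhead I saw came from.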

Not that this changes your conclusion.  With or without that distortion, I
completely believe that WALInsertLock is the bottleneck for parallel bulk
copy into unindexed tables.  I just haven't found anything else for which it
is the primary bottleneck.  I think the only real solution for bulk copy is
to call XLogInsert less often.  For example, it could build each block in
local memory, then, once the block is full, copy it into shared buffers and
toss the entire block into WAL in one call.  Easier said than implemented,
of course.
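
Very roughly, the shape I have in mind is something like the sketch below.
All of the helper names are hypothetical placeholders, not existing
PostgreSQL functions; the point is only that the WAL call happens once per
filled block rather than once per tuple:

#include <string.h>

#define LOCAL_BLCKSZ 8192       /* assumes the default block size */

typedef struct
{
    char  data[LOCAL_BLCKSZ];   /* backend-local page image */
    int   used;                 /* bytes filled so far */
} LocalBlock;

/* Placeholder: a real patch would go through the buffer manager. */
static void
flush_block_to_shared_buffers(LocalBlock *blk)
{
    (void) blk;
}

/* Placeholder: a real patch would do a single XLogInsert of the page. */
static void
wal_log_whole_block(LocalBlock *blk)
{
    (void) blk;
}

/* Append one tuple; when the block fills up, write and WAL-log it once. */
void
bulk_copy_tuple(LocalBlock *blk, const void *tup, int len)
{
    if (blk->used + len > LOCAL_BLCKSZ)
    {
        flush_block_to_shared_buffers(blk);
        wal_log_whole_block(blk);   /* one WAL call per block, not per tuple */
        blk->used = 0;
    }
    memcpy(blk->data + blk->used, tup, len);
    blk->used += len;
}

The hard parts, of course, are the ones the placeholders hide: visibility,
crash recovery, and playing nicely with the buffer manager.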

Cheers,

Jeff
