Re: [HACKERS] PATCH: pgbench - merging transaction logs

Fabien COELHO Sat, 02 May 2015 06:32:05 -0700


Hello,

The counters are updated when the transaction is finished anyway?


Yes, but the thread does not know it's time to write the results until
it completes the first transaction after the interval ends ...

Let's say the very first query in thread #1 takes a minute for some
reason, while the other threads process 100 transactions per second. So
before the thread #1 can report 0 transactions for the first second, the
other threads already have results for the 60 intervals.

I think there's no way to make this work except for somehow tracking
timestamp of the last submitted results for each thread, and only
flushing results older than the minimum of the timestamps. But that's
not trivial - it certainly is more complicated than just writing into a
shared file descriptor.
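
The bookkeeping described above can be sketched as follows. This is only an illustration, not code from the patch: the names flush_watermark and can_flush and the microsecond timestamps are assumptions, and "older than the minimum" is read here as a strict comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: last_submit_us[i] is the timestamp (in microseconds)
 * of the last results submitted by thread i. The watermark is the minimum
 * over all threads; nothing newer than it is known to be complete.
 */
int64_t flush_watermark(const int64_t *last_submit_us, size_t nthreads)
{
    assert(nthreads > 0);

    int64_t min = last_submit_us[0];
    for (size_t i = 1; i < nthreads; i++)
    {
        if (last_submit_us[i] < min)
            min = last_submit_us[i];
    }
    return min;
}

/* An interval may be flushed only if it ended before the watermark. */
int can_flush(int64_t interval_end_us,
              const int64_t *last_submit_us, size_t nthreads)
{
    return interval_end_us < flush_watermark(last_submit_us, nthreads);
}
```

With one thread stuck on its first query, its timestamp stays at the start of the run, so the watermark does not move and no interval can be flushed, whatever the other threads have already submitted.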

I agree that such an approach would be horrible for a very limited
value. However, I was suggesting that a transaction is counted only when
it is finished, so the aggregated data is to be understood as referring
to "finished transactions in the interval", and what is in progress
would be counted in the next interval anyway.
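
A minimal sketch of this "count when finished" rule, assuming microsecond timestamps and a fixed interval length (the function name interval_of and its parameters are hypothetical, not pgbench code):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the index of the aggregation interval a
 * transaction is counted in. Only the completion time matters; when the
 * transaction started plays no role.
 */
int64_t interval_of(int64_t finish_us, int64_t bench_start_us,
                    int64_t interval_us)
{
    assert(interval_us > 0);
    assert(finish_us >= bench_start_us);

    return (finish_us - bench_start_us) / interval_us;
}
```

Under this rule a transaction that starts in the first second but completes after a minute is simply counted in the interval where it completes, so a thread never has to report anything for intervals in which it finished nothing.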

Merging results for each transaction would not have this issue, but it
would also use the lock much more frequently, and that seems like a
pretty bad idea (especially for the workloads with short transactions
that you suggested are bad match for detailed log - this would make the
aggregated log bad too).

Also notice that all the threads will try to merge the data (and
thus acquire the lock) at almost the same time - this is especially true
for very short transactions. I would be surprised if this did not cause
issues on read-only workloads with large numbers of threads.


ISTM that the aggregated version should fare better than the detailed
log, whatever is done: the key performance issue is that fprintf is
slow. With the aggregated log these calls are infrequent, and only
arithmetic remains in a critical section.
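
To illustrate what "only arithmetic in a critical section" could look like, here is a rough sketch using a pthread mutex. The struct and function names (AggValues, AggStats, agg_add, agg_flush) and the output format are assumptions made for the example, not the patch's actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-interval counters shared by all threads. */
typedef struct
{
    int64_t count;
    double  sum;
    double  sum_sq;
    double  min;
    double  max;
} AggValues;

typedef struct
{
    pthread_mutex_t lock;
    AggValues       v;
} AggStats;

void agg_init(AggStats *a)
{
    pthread_mutex_init(&a->lock, NULL);
    a->v = (AggValues) {0};
}

/* Called once per finished transaction: a few additions under the lock. */
void agg_add(AggStats *a, double latency_us)
{
    pthread_mutex_lock(&a->lock);
    if (a->v.count == 0 || latency_us < a->v.min)
        a->v.min = latency_us;
    if (a->v.count == 0 || latency_us > a->v.max)
        a->v.max = latency_us;
    a->v.count++;
    a->v.sum += latency_us;
    a->v.sum_sq += latency_us * latency_us;
    pthread_mutex_unlock(&a->lock);
}

/*
 * Called once per interval: snapshot and reset under the lock, then do
 * the slow fprintf outside of it. Returns the number of transactions
 * written for the interval.
 */
int64_t agg_flush(AggStats *a, FILE *out, int64_t interval_start_s)
{
    AggValues snap;

    assert(out != NULL);

    pthread_mutex_lock(&a->lock);
    snap = a->v;
    a->v = (AggValues) {0};
    pthread_mutex_unlock(&a->lock);

    fprintf(out, "%lld %lld %.0f %.0f %.0f %.0f\n",
            (long long) interval_start_s, (long long) snap.count,
            snap.sum, snap.sum_sq, snap.min, snap.max);
    return snap.count;
}
```

The formatting cost is paid once per interval instead of once per transaction, and it is paid after the lock has been released.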

(2) The feature would not be available for the thread-emulation with
this approach, but I do not see this as a particular issue as I
think that it is pretty much only dead code and a maintenance burden.


I'm not willing to investigate that, nor am I willing to implement
another feature that works only sometimes (I've done that in the past,
and I find it a bad practice).


[...]

After the small discussion I triggered, I've submitted a patch to drop
thread fork-emulation from pgbench.

[...]
Also, if the lock for the shared buffer is cheaper than the lock
required for fprintf, it may still be an improvement.


Yep. "fprintf" does a lot of processing, so it is the main issue.

--
Fabien.


