pgbench changes, when adding the throttling stuff. Having the start time
taken when the thread really starts is just sanity, and I needed that
just to rule out that it was the source of the "strange" measures.

I don't get it; why is taking the time just after pthread_create() more sane
than taking it just before pthread_create()?

Thread creation also seems to be expensive, maybe up to 0.1 seconds under some conditions (?). Under --rate, this creation delay means that throttling is lagging behind schedule by about that time, so all the first transactions are trying to catch up.

typically far more expensive than pthread_create().  The patch for threaded
pgbench made the decision to account for pthread_create() as though it were
part of establishing the connection.  You're proposing to not account for it
at all.  Both of those designs are reasonable to me, but I do not comprehend the
benefit you anticipate from switching from one to the other.

-j 800 vs -j 100: ISTM that if you create more threads, the time delay
incurred is cumulative, so the strangeness of the result should worsen.

Not in general; we do one INSTR_TIME_SET_CURRENT() per thread, just before
calling pthread_create().  However, thread 0 is a special case; we set its
start time first and actually start it last.  Your observation of cumulative
delay fits those facts.

Yep, that must be thread 0, which has a very large delay. I think it is simpler for each thread to record its start time once it has started, without exception.

Initializing the thread-0 start time later, just before calling its threadRun(), should clear this anomaly without changing other aspects of the measurement.

Always taking the thread start time when the thread has started solves the issue as well, and it is homogeneous across all cases, so the solution I suggest seems reasonable and simple.
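
To make the suggestion concrete, here is a minimal sketch of "each thread records its own start time", not the actual pgbench code. It assumes thread 0 runs in the calling thread after the others have been launched, as described above. The names worker_t, worker_run and run_workers are hypothetical, and clock_gettime() stands in for INSTR_TIME_SET_CURRENT().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical per-thread state; not pgbench's actual structure. */
typedef struct
{
    pthread_t       handle;
    struct timespec start_time; /* written by the thread itself */
    int             started;
} worker_t;

/* Thread body: the first thing it does is take its own start time. */
static void *
worker_run(void *arg)
{
    worker_t *w = arg;

    clock_gettime(CLOCK_MONOTONIC, &w->start_time);
    w->started = 1;
    /* ... connect and run transactions here ... */
    return NULL;
}

/*
 * Launch workers 1..nthreads-1, then run worker 0 in the calling thread.
 * No start time is taken before pthread_create(), so worker 0 is not
 * charged for the time spent creating the other threads.
 */
static int
run_workers(worker_t *workers, int nthreads)
{
    int created = 1;    /* index one past the last thread created */
    int rc = 0;

    assert(nthreads >= 1);

    for (int i = 1; i < nthreads; i++)
    {
        if (pthread_create(&workers[i].handle, NULL,
                           worker_run, &workers[i]) != 0)
        {
            rc = -1;
            break;
        }
        created = i + 1;
    }

    if (rc == 0)
        worker_run(&workers[0]);    /* same code path as every other worker */

    for (int i = 1; i < created; i++)
        pthread_join(workers[i].handle, NULL);

    return rc;
}
```

With this layout there is no special case left: worker 0 goes through the same entry point as the others, so its start time is taken when it actually begins.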

While pondering this area of the code, it occurs to me -- shouldn't we initialize the throttle rate trigger later, after establishing connections and sending startup queries? As it stands, we build up a schedule deficit during those tasks. Was that deliberate?
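
As a rough illustration of that deficit (a sketch only, assuming an evenly spaced schedule for simplicity; the function name and parameters are hypothetical), the number of transactions already overdue when a thread becomes ready can be estimated from the gap between trigger initialization and readiness:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate how many scheduled transactions are already overdue when a
 * thread is ready to run its first one.
 *
 * trigger_init_us: time at which the throttle trigger was initialized
 * ready_us:        time at which the thread is ready to send transactions
 * rate_tps:        target rate in transactions per second
 *
 * Assumes a fixed interval of 1/rate between scheduled transactions.
 */
static int64_t
overdue_transactions(int64_t trigger_init_us, int64_t ready_us, double rate_tps)
{
    double interval_us;

    assert(rate_tps > 0.0);
    assert(ready_us >= trigger_init_us);

    interval_us = 1e6 / rate_tps;
    return (int64_t) ((double) (ready_us - trigger_init_us) / interval_us);
}
```

For example, a 0.1 second setup delay at a target of 100 transactions per second leaves about 10 transactions to catch up on, whereas initializing the trigger at the moment the thread is ready leaves none.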

In principle, I agree with you.

The connection creation time is another matter, but it depends on the options set. Under some options the connection is opened and closed for every transaction, so there is no point in excluding it from the measure or from the scheduling, and I want to avoid having to distinguish those cases. Moreover, ISTM that one of the threads reuses the existing connection while the others recreate it. So I left it "as is".

--
Fabien.

