Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing
On Sun, Oct 06, 2013 at 09:48:04AM +0200, Fabien COELHO wrote: > Also, I'm not sure of any system used with pgbench that do not have > threads, so ISTM that the thread fork-emulation hack is more or less > useless, and it is pretty masochistic to maintain when adding > features. Fair point. When added, the code pertaining to fork-emulation was well-isolated, and that remained the case as recently as 9.3. Your recent --progress patch was the first to suffer contortions for the benefit of that mode. (The per-statement latencies feature just declined to support it.) > >For the time being, I propose the attached comment patch. > > It comment seems very clear to me. I do not understand why it starts > with XXX, though. PostgreSQL code uses that notation regularly. When I add it, I typically have in mind "the following is not fully satisfying, but it's not bad enough to make a point of improving". I've committed the comment patch. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing
Note that this is pretty standard stuff with threads, ISTM that it would solve most of the issues, *but* this is not possible with the "thread fork emulation" implemented by pgbench, which really means no threads at all. You could do those same things in the fork emulation mode using anonymous shared memory, like we do in the server. That would permit removing the current "#ifdef PTHREAD_FORK_EMULATION" wart, too. Yep, but I'm not sure that would reduce pgbench code complexity. Also, I'm not sure of any system used with pgbench that do not have threads, so ISTM that the thread fork-emulation hack is more or less useless, and it is pretty masochistic to maintain when adding features. For the time being, I propose the attached comment patch. It comment seems very clear to me. I do not understand why it starts with XXX, though. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing
On Tue, Oct 01, 2013 at 02:50:29PM +0200, Fabien COELHO wrote: > I do not think that there is a clean and simple way to take the > start/stop period into account when computing global performances of > a run. The TPC-C benchmark tells to ignore the warmup/closure > period, whatever they are, and only perform measures within the > steady state. However the full graph must be provided when the bench > is provided. That makes sense to me. "pgbench --progress" and "pgbench --log --aggregate-interval" are good tools for excluding non-steady-state periods. > About better measures: If I could rely on having threads, I would > simply synchronise the threads at the beginning so that they > actually start after they are all created, and one thread would > decide when to stop and set a shared volatile variable to stop all > transactions more or less at once. In this case, the thread start > time would be taken just after the synchronization, and maybe only > by thread 0 would be enough. > > Note that this is pretty standard stuff with threads, ISTM that it > would solve most of the issues, *but* this is not possible with the > "thread fork emulation" implemented by pgbench, which really means > no threads at all. You could do those same things in the fork emulation mode using anonymous shared memory, like we do in the server. That would permit removing the current "#ifdef PTHREAD_FORK_EMULATION" wart, too. For the time being, I propose the attached comment patch. -- Noah Misch EnterpriseDB http://www.enterprisedb.com diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index 926246e..816400f 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -2896,7 +2896,16 @@ main(int argc, char **argv) } disconnect_all(state, nclients); - /* get end time */ + /* +* XXX We compute results as though every client of every thread started +* and finished at the same time. That model can diverge noticeably from +* reality for a short benchmark run involving relatively many threads. +* The first thread may process notably many transactions before the last +* thread begins. Improving the model alone would bring limited benefit, +* because performance during those periods of partial thread count can +* easily exceed steady state performance. This is one of the many ways +* short runs convey deceptive performance figures. +*/ INSTR_TIME_SET_CURRENT(total_time); INSTR_TIME_SUBTRACT(total_time, start_time); printResults(ttype, total_xacts, nclients, threads, nthreads, -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing
Hello Noah, Thread create time seems to be expensive as well, maybe up 0.1 seconds under some conditions (?). Under --rate, this create delay means that throttling is laging behind schedule by about that time, so all the first transactions are trying to catch up. threadRun() already initializes throttle_trigger with a fresh timestamp. Please detail how the problem remains despite that. Indeed, I did this kludge because I could not rely on the "before fork" start time as it was (possibly) creating a "rush" at the beginning of the run under --rate. The version I submitted takes the start time after the thread is created, and use it directly for throttling, so the start time is taken once per thread and used instead of retaking it because the first one cannot be relied on. [...] Fine detailed analysis! Opinions, other ideas? I do not think that there is a clean and simple way to take the start/stop period into account when computing global performances of a run. The TPC-C benchmark tells to ignore the warmup/closure period, whatever they are, and only perform measures within the steady state. However the full graph must be provided when the bench is provided. About better measures: If I could rely on having threads, I would simply synchronise the threads at the beginning so that they actually start after they are all created, and one thread would decide when to stop and set a shared volatile variable to stop all transactions more or less at once. In this case, the thread start time would be taken just after the synchronization, and maybe only by thread 0 would be enough. Note that this is pretty standard stuff with threads, ISTM that it would solve most of the issues, *but* this is not possible with the "thread fork emulation" implemented by pgbench, which really means no threads at all. A possible compromise would be to do just that when actual threads are used, and let it more or less as it is when fork emulation is on... -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing (fwd)
[oops, resent because stalled, wrong From!] Hello Noah, Thread create time seems to be expensive as well, maybe up 0.1 seconds under some conditions (?). Under --rate, this create delay means that throttling is laging behind schedule by about that time, so all the first transactions are trying to catch up. threadRun() already initializes throttle_trigger with a fresh timestamp. Please detail how the problem remains despite that. Indeed, I did this kludge because I could not rely on the "before fork" start time as it was (possibly) creating a "rush" at the beginning of the run under --rate. The version I submitted takes the start time after the thread is created, and use it directly for throttling, so the start time is taken once per thread and used instead of retaking it because the first one cannot be relied on. [...] Fine detailed analysis! Opinions, other ideas? I do not think that there is a clean and simple way to take the start/stop period into account when computing global performances of a run. The TPC-C benchmark tells to ignore the warmup/closure period, whatever they are, and only perform measures within the steady state. However the full graph must be provided when the bench is provided. About better measures: If I could rely on having threads, I would simply synchronise the threads at the beginning so that they actually start after they are all created, and one thread would decide when to stop and set a shared volatile variable to stop all transactions more or less at once. In this case, the thread start time would be taken just after the synchronization, and maybe only by thread 0 would be enough. Note that this is pretty standard stuff with threads, ISTM that it would solve most of the issues, *but* this is not possible with the "thread fork emulation" implemented by pgbench, which really means no threads at all. A possible compromise would be to do just that when actual threads are used, and let it more or less as it is when fork emulation is on... -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing
On Thu, Sep 26, 2013 at 01:41:01PM +0200, Fabien COELHO wrote: > >I don't get it; why is taking the time just after pthread_create() more sane > >than taking it just before pthread_create()? > > Thread create time seems to be expensive as well, maybe up 0.1 > seconds under some conditions (?). Under --rate, this create delay > means that throttling is laging behind schedule by about that time, > so all the first transactions are trying to catch up. threadRun() already initializes throttle_trigger with a fresh timestamp. Please detail how the problem remains despite that. > >typically far more expensive than pthread_create(). The patch for threaded > >pgbench made the decision to account for pthread_create() as though it were > >part of establishing the connection. You're proposing to not account for it > >all. Both of those designs are reasonable to me, but I do not comprehend the > >benefit you anticipate from switching from one to the other. > > > >>-j 800 vs -j 100 : ITM that if I you create more threads, the time delay > >>incurred is cumulative, so the strangeness of the result should worsen. > > > >Not in general; we do one INSTR_TIME_SET_CURRENT() per thread, just before > >calling pthread_create(). However, thread 0 is a special case; we set its > >start time first and actually start it last. Your observation of cumulative > >delay fits those facts. > > Yep, that must be thread 0 which has a very large delay. I think it > is simpler that each threads record its start time when it has > started, without exception. > > > Initializing the thread-0 start time later, just before calling > >its threadRun(), should clear this anomaly without changing other > >aspects of the measurement. > > Always taking the thread start time when the thread is started does > solve the issue as well, and it is homogeneous for all cases, so the > solution I suggest seems reasonable and simple. To exercise the timing semantics before and after your proposed change, I added a "sleep(1);" before the pthread_create() call. Here are the results with and without "-j", with and without pgbench-measurements-v5.patch: $ echo 'select 1' >test.sql # just the sleep(1) addition $ env time pgbench -c4 -t1000 -S -n -f test.sql | grep tps tps = 6784.410104 (including connections establishing) tps = 7094.701854 (excluding connections establishing) 0.03user 0.07system 0:00.60elapsed 16%CPU (0avgtext+0avgdata 0maxresident)k $ env time pgbench -j4 -c4 -t1000 -S -n -f test.sql | grep tps tps = 1224.327010 (including connections establishing) tps = 2274.160899 (excluding connections establishing) 0.02user 0.03system 0:03.27elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k # w/ pgbench-measurements-v5.patch $ env time pgbench -c4 -t1000 -S -n -f test.sql | grep tps tps = 6792.393877 (including connections establishing) tps = 7207.142278 (excluding connections establishing) 0.08user 0.06system 0:00.60elapsed 23%CPU (0avgtext+0avgdata 0maxresident)k $ env time pgbench -j4 -c4 -t1000 -S -n -f test.sql | grep tps tps = 1212.040409 (including connections establishing) tps = 1214.728830 (excluding connections establishing) 0.09user 0.06system 0:03.31elapsed 4%CPU (0avgtext+0avgdata 0maxresident)k Rather than, as I supposed before, excluding the cost of thread start entirely, pgbench-measurements-v5.patch has us count pthread_create() as part of the main runtime. I now see the cumulative delay you mentioned, but pgbench-measurements-v5.patch does not fix it. The problem is centered on the fact that pgbench.c:main() calculates a single total_time and models each thread as having run for that entire period. If pthread_create() is slow, reality diverges considerably from that model; some threads start notably late, and other threads finish notably early. The threadRun() runtime intervals in the last test run above are actually something like this: thread 1: 1.0s - 1.3s thread 2: 2.0s - 2.3s thread 3: 3.0s - 3.3s thread 0: 3.0s - 3.3s Current pgbench instead models every thread as having run 0.0s - 3.3s, hence the numbers reported. To make the numbers less surprising, we could axe the global total_time=end_time-start_time and instead accumulate total_time on a per-thread basis, just as we now accumulate conn_time on a per-thread basis. That ceases charging threads for time spent not-yet-running or already-finished, but it can add its own inaccuracy. Performance during a period in which some clients have yet to start is not interchangeable with performance when all clients are running. pthread_create() slowness would actually make the workload seem to perform better. An alternate strategy would be to synchronize the actual start of command issuance across threads. All threads would start and make their database connections, then signal readiness. Once the first thread notices that every other thread is ready, it would direct them to actually start issuing queries. This might minimize the result skew problem of the first strateg
Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing
pgbench changes, when adding the throttling stuff. Having the start time taken when the thread really starts is just sanity, and I needed that just to rule out that it was the source of the "strange" measures. I don't get it; why is taking the time just after pthread_create() more sane than taking it just before pthread_create()? Thread create time seems to be expensive as well, maybe up 0.1 seconds under some conditions (?). Under --rate, this create delay means that throttling is laging behind schedule by about that time, so all the first transactions are trying to catch up. typically far more expensive than pthread_create(). The patch for threaded pgbench made the decision to account for pthread_create() as though it were part of establishing the connection. You're proposing to not account for it all. Both of those designs are reasonable to me, but I do not comprehend the benefit you anticipate from switching from one to the other. -j 800 vs -j 100 : ITM that if I you create more threads, the time delay incurred is cumulative, so the strangeness of the result should worsen. Not in general; we do one INSTR_TIME_SET_CURRENT() per thread, just before calling pthread_create(). However, thread 0 is a special case; we set its start time first and actually start it last. Your observation of cumulative delay fits those facts. Yep, that must be thread 0 which has a very large delay. I think it is simpler that each threads record its start time when it has started, without exception. Initializing the thread-0 start time later, just before calling its threadRun(), should clear this anomaly without changing other aspects of the measurement. Always taking the thread start time when the thread is started does solve the issue as well, and it is homogeneous for all cases, so the solution I suggest seems reasonable and simple. While pondering this area of the code, it occurs to me -- shouldn't we initialize the throttle rate trigger later, after establishing connections and sending startup queries? As it stands, we build up a schedule deficit during those tasks. Was that deliberate? On the principle, I agree with you. The connection creation time is another thing, but it depends on the options set. Under some options the connection is open and closed for every transaction, so there is no point in avoiding it in the measure or in the scheduling, and I want to avoid having to distinguish those cases. Morover, ISTM that one of the thread reuse the existing connection while other recreate is. So I left it "as is". -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing
Concerning one of the eventual would-be split patches... On Tue, Sep 24, 2013 at 10:41:17PM +0200, Fabien COELHO wrote: > - Take thread start time at the beginning of the thread (!) > >Otherwise it includes pretty slow thread/fork system start times in >the measurements. May help with bug #8326. This also helps with throttling, >as the start time is used to adjust the rate: if it is not the actual >start time, there is a bit of a rush at the beginning in order to >catch up. Taking the start time when the thread actually starts is >the sane thing to do to avoid surprises at possibly strange measures >on short runs. On Sun, Sep 22, 2013 at 10:08:54AM +0200, Fabien COELHO wrote: > I really spent *hours* debugging stupid measures on the previous round of > pgbench changes, when adding the throttling stuff. Having the start time > taken when the thread really starts is just sanity, and I needed that > just to rule out that it was the source of the "strange" measures. I don't get it; why is taking the time just after pthread_create() more sane than taking it just before pthread_create()? We report separate TPS values for "including connections establishing" and "excluding connections establishing" because establishing the database connection is expensive, typically far more expensive than pthread_create(). The patch for threaded pgbench made the decision to account for pthread_create() as though it were part of establishing the connection. You're proposing to not account for it all. Both of those designs are reasonable to me, but I do not comprehend the benefit you anticipate from switching from one to the other. > -j 800 vs -j 100 : ITM that if I you create more threads, the time delay > incurred is cumulative, so the strangeness of the result should worsen. Not in general; we do one INSTR_TIME_SET_CURRENT() per thread, just before calling pthread_create(). However, thread 0 is a special case; we set its start time first and actually start it last. Your observation of cumulative delay fits those facts. Initializing the thread-0 start time later, just before calling its threadRun(), should clear this anomaly without changing other aspects of the measurement. While pondering this area of the code, it occurs to me -- shouldn't we initialize the throttle rate trigger later, after establishing connections and sending startup queries? As it stands, we build up a schedule deficit during those tasks. Was that deliberate? Thanks, nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers