Re: [HACKERS] pgbench progress report improvements - split 3 v2 - A
On Sat, Oct 5, 2013 at 6:10 PM, Noah Misch n...@leadboat.com wrote: The sum of the squares of the latencies wraps after 2^63/(10^12 * avg_latency * nclients) seconds. That's unlikely to come up with the ordinary pgbench script, but one can reach it in a few hours when benchmarking a command that runs for many seconds. If we care, we can track the figure as a double. I merely added a comment about it. Using a double seems wise to me. I think that both int64 double are fine. The likelyhood of having underflows (double) or overflows/wraparounds (int64) seems pretty low for any practical run. I took the former because it is exact... up to when it is totally wrong. The later one may provide underestimated results silently: it would fail more continuously. So I'm okay if it is changed to double consistently. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2 - A
pgbench already offers two schedules of pgbench --initialize messaging, message-per-100k-rows and message-per-5s. A user too picky to find satisfaction in either option can filter the messages through grep, sed et al. We patched pgbench on two occasions during the 9.3 cycle to arrive at that status quo. Had I participated, I may well have voted for something like your proposal over pgbench --quiet. Now that we've released --quiet, a proposal for an additional message schedule option needs to depict a clear and convincing step forward. This proposal does not rise to that level. The step forward is to have the same options apply to *both* modes, instead of options for one mode (default 100k or --quiet 5s), and a convenient --progress (adjustable seconds) for the other. It is convincing to me because I hate commands with non orthogonal options, as I'm sure not to notice that one option is for one mode and the other for another, and to get caught by it. But this is clearly a matter of design taste. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2 - A
On Sat, Oct 5, 2013 at 6:10 PM, Noah Misch n...@leadboat.com wrote: The sum of the squares of the latencies wraps after 2^63/(10^12 * avg_latency * nclients) seconds. That's unlikely to come up with the ordinary pgbench script, but one can reach it in a few hours when benchmarking a command that runs for many seconds. If we care, we can track the figure as a double. I merely added a comment about it. Using a double seems wise to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2 - A
Hello Noah, Patch (4): Redefine latency as reported by pgbench and report lag more. Here is a first partial patch, which focusses on measuring latency and reporting the measure under --progress. This patch contains the features pertaining to both hypothetical patches (3) and (4), not just (4) like I requested. Sorry, I misunderstood the expected scope of your request. The sum of the squares of the latencies wraps after 2^63/(10^12 * avg_latency * nclients) seconds. That's unlikely to come up with the ordinary pgbench script, but one can reach it in a few hours when benchmarking a command that runs for many seconds. If we care, we can track the figure as a double. I merely added a comment about it. Indeed I thought about that. I choose int64 because the overflow seemed very unlikely: it would required about 1 billion pretty large 100 ms latency (2^16.6 µs) transactions to wrap around, which is a multi-year one thread run. So I stayed homogeneous to the other accumulator and the surveyed data type. Also, the measure is exact with int64, but rounding or undeflows could happen with double. Adding a comment about it is a good idea. I restored applicable parts of your update to the --progress documentation from pgbench-measurements-v5.patch. The patch made output like this: progress: 7.2 s, 1.7 tps, 205.225 stddev 3.484 ms lat, 45.472 ms lag [...] Indeed, the postfix English-like version is not very clear. progress: 36.0 s, 115.2 tps, lat avg 9.678 ms stddev 1.792, lag 0.143 ms I switched to that, except that I removed the word avg to save horizontal space and since lag is also an average though not labelled as such. Yep, space is a concern. That was one of the reason why I used +-. + printf(latency average: %.3f ms\n, + 1000.0 * duration / normal_xacts); I incorporated the nclients factor needed here. Oops!? Thanks a log for the fixes and the improvements. Which part do you want as a next step? -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2 - A
On Sun, Oct 06, 2013 at 09:40:40AM +0200, Fabien COELHO wrote: Which part do you want as a next step? I think we're done; here are the distinct changes in your original patch, along with their dispositions: Patch (1): Change the default from --progress=0 to --progress=5 Rejected Patch (2): Make --initialize mode respect --progress. Rejected Patch (3): Report the standard deviation of latency. Committed Patch (4): Redefine latency as reported by pgbench and report lag more. Committed Patch (5): Take thread start time at the beginning of the thread. Returned with Feedback Patch (6): Reduce and compensate throttling underestimation bias. Committed Patch (7): Documentation and --help corrections Committed Patch (8): Remove stray use of float. Committed -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2 - A
Hello Noah, Patch (1): Change the default from --progress=0 to --progress=5 Rejected Yep. Too bad, esp as the default is meaningless and does not scale. Patch (2): Make --initialize mode respect --progress. Rejected I missed this one... This is just about having the same option (--progress) to set the progress report delay both under benchmarking and initialization. I do not understand where and why it would be rejected, especially as it does not imply to change the current default behavior. Patch (5): Take thread start time at the beginning of the thread. Returned with Feedback Hmmm. I fought back on the feedback:-) I thought my arguments where good enough to consider an accept. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2 - A
On Sun, Oct 06, 2013 at 05:04:56PM +0200, Fabien COELHO wrote: Patch (2): Make --initialize mode respect --progress. Rejected I missed this one... See the second half of this message, including quoted material: http://www.postgresql.org/message-id/CA+TgmoZNXkm-EtszHX=kwq34h5ni4cs8dg31no86cmdryaq...@mail.gmail.com Patch (5): Take thread start time at the beginning of the thread. Returned with Feedback Hmmm. I fought back on the feedback:-) I thought my arguments where good enough to consider an accept. Here is the feedback in question: http://www.postgresql.org/message-id/20130930223621.ga125...@tornado.leadboat.com With or without the patch, reported performance figures are uninformative when thread start time is substantial relative to benchmark duration. A mere time accounting change will not help much; improving this requires tighter synchronization around the start of actual query activity across threads. I didn't read anything in your response as disagreement with that conclusion. nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2 - A
Patch (2): Make --initialize mode respect --progress. Rejected I missed this one... See the second half of this message, including quoted material: http://www.postgresql.org/message-id/CA+TgmoZNXkm-EtszHX=kwq34h5ni4cs8dg31no86cmdryaq...@mail.gmail.com I did not read Peter Haas comments as whether --progress should be used for both modes, although it is in the section about it. All his argumentation is about *not changing any defaults*. I understood that his -1 applied about the dropping the row-counting based reporting: Robert complains about any break[ing] backward compatibility again one release later in the paragraph, not really about --progress becoming significant under -i, so this would be a veto agains what you called Patch (1). So what I've done in version 5 of the patch on this subject is that --progress specifies the interval of the progress report in any mode, and quiet just set it to five seconds for initialization as is the current behavior. Version 5 does not change any current default behavior, as I understood that this was the requirement expressed by Robert Haas. Patch (5): Take thread start time at the beginning of the thread. Returned with Feedback Hmmm. I fought back on the feedback:-) I thought my arguments where good enough to consider an accept. Here is the feedback in question: http://www.postgresql.org/message-id/20130930223621.ga125...@tornado.leadboat.com With or without the patch, reported performance figures are uninformative when thread start time is substantial relative to benchmark duration. A mere time accounting change will not help much; improving this requires tighter synchronization around the start of actual query activity across threads. I didn't read anything in your response as disagreement with that conclusion. In my answer to the message you mention. http://www.postgresql.org/message-id/alpine.DEB.2.02.1310011422130.324@sto I explained why I had to re-take gettimeofday under --rate because the start_time value cannot be relied on. The code I suggested simplifies the logic by taking the time only once, instead of ignoring the first one because it does not make sense. It seems to me that the logical answer to this argument could have been ok. But it seems that you do not percieve my logic. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2 - A
On Sun, Oct 06, 2013 at 06:44:11PM +0200, Fabien COELHO wrote: Patch (2): Make --initialize mode respect --progress. Rejected I missed this one... See the second half of this message, including quoted material: http://www.postgresql.org/message-id/CA+TgmoZNXkm-EtszHX=kwq34h5ni4cs8dg31no86cmdryaq...@mail.gmail.com I did not read Peter Haas comments as whether --progress should be used for both modes, although it is in the section about it. All his argumentation is about *not changing any defaults*. I understood that his -1 applied about the dropping the row-counting based reporting: Robert complains about any break[ing] backward compatibility again one release later in the paragraph, not really about --progress becoming significant under -i, so this would be a veto agains what you called Patch (1). So what I've done in version 5 of the patch on this subject is that --progress specifies the interval of the progress report in any mode, and quiet just set it to five seconds for initialization as is the current behavior. Version 5 does not change any current default behavior, as I understood that this was the requirement expressed by Robert Haas. I don't know whether that accurately describes Robert's position, but I will elaborate on my own position: pgbench already offers two schedules of pgbench --initialize messaging, message-per-100k-rows and message-per-5s. A user too picky to find satisfaction in either option can filter the messages through grep, sed et al. We patched pgbench on two occasions during the 9.3 cycle to arrive at that status quo. Had I participated, I may well have voted for something like your proposal over pgbench --quiet. Now that we've released --quiet, a proposal for an additional message schedule option needs to depict a clear and convincing step forward. This proposal does not rise to that level. Patch (5): Take thread start time at the beginning of the thread. Returned with Feedback Hmmm. I fought back on the feedback:-) I thought my arguments where good enough to consider an accept. Here is the feedback in question: http://www.postgresql.org/message-id/20130930223621.ga125...@tornado.leadboat.com With or without the patch, reported performance figures are uninformative when thread start time is substantial relative to benchmark duration. A mere time accounting change will not help much; improving this requires tighter synchronization around the start of actual query activity across threads. I didn't read anything in your response as disagreement with that conclusion. In my answer to the message you mention. http://www.postgresql.org/message-id/alpine.DEB.2.02.1310011422130.324@sto I explained why I had to re-take gettimeofday under --rate because the start_time value cannot be relied on. The code I suggested simplifies the logic by taking the time only once, instead of ignoring the first one because it does not make sense. It seems to me that the logical answer to this argument could have been ok. But it seems that you do not percieve my logic. Yeah, perhaps I don't understand. Throttling behaves fine, because pgbench already does take a fresh timestamp in threadRun(). Your patch makes thread-start_time adequate in place of that fresh timestamp. If that were the patch's only effect, it would be okay as a minor code cleanup. That is not the patch's only effect. Its other effect, explored thoroughly in my post referenced above, changes one bad user-visible behavior to a different bad user-visible behavior. When a code cleanup has such a side effect, we don't commit it. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2 - A
On Thu, Sep 26, 2013 at 08:50:15PM +0200, Fabien COELHO wrote: Patch (4): Redefine latency as reported by pgbench and report lag more. Here is a first partial patch, which focusses on measuring latency and reporting the measure under --progress. This patch contains the features pertaining to both hypothetical patches (3) and (4), not just (4) like I requested. The sum of the squares of the latencies wraps after 2^63/(10^12 * avg_latency * nclients) seconds. That's unlikely to come up with the ordinary pgbench script, but one can reach it in a few hours when benchmarking a command that runs for many seconds. If we care, we can track the figure as a double. I merely added a comment about it. I restored applicable parts of your update to the --progress documentation from pgbench-measurements-v5.patch. The patch made output like this: progress: 7.2 s, 1.7 tps, 205.225 stddev 3.484 ms lat, 45.472 ms lag I read that as five facts: 7.2 s 1.7 tps 205.225 stddev 3.484 ms lat 45.472 ms lag That was a wrong reading; 205.225 is the latency average and 3.484 is the latency standard deviation. Let's be consistent about the placement of labels relative to their figures. Upthread, had you proposed this format: progress: 36.0 s, 115.2 tps, lat avg 9.678 ms stddev 1.792, lag 0.143 ms I switched to that, except that I removed the word avg to save horizontal space and since lag is also an average though not labelled as such. + printf(latency average: %.3f ms\n, +1000.0 * duration / normal_xacts); I incorporated the nclients factor needed here. Committed with those changes. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2
My feelings on the patch split haven't changed; your three bullet points call for four separate patches. Conflicting patches are bad, but dependent patches are okay; Indeed, this is the only solution if you do not want one patch. Note that it will not possible to backtrack one of the patch but the last one without conflicts. just disclose the dependency order. How about this: as a next step, please extract just this feature that I listed last Saturday: Patch (4): Redefine latency as reported by pgbench and report lag more. Once that's committed, we can move on to others. Ok, I'll submit a first part, hopefully today, possibly the one you suggest, about fixing and extending latency measure under --rate and reporting it under progress. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2 - A
Patch (4): Redefine latency as reported by pgbench and report lag more. Here is a first partial patch, which focusses on measuring latency and reporting the measure under --progress. ** Improve pgbench measurements progress report Measure transaction latency instead under --rate and --progress. The previous computed figure does not make sense under throttling: as sleep throttling time was included in the figures, the displayed results were plain false. The latency and its standard deviation are printed out under progress and in the final report when available. It could be made always available, but that would require to accept additional gettimeofday calls. I do not think that there is a performance issue here, but there is significant opposition to the idea. Sample output under benchmarking with --progress=1 progress: 1.0 s, 2626.1 tps, 0.374 stddev 0.597 ms lat progress: 2.0 s, 2766.6 tps, 0.358 stddev 0.588 ms lat progress: 3.0 s, 2567.4 tps, 0.385 stddev 0.665 ms lat progress: 4.0 s, 3014.2 tps, 0.328 stddev 0.593 ms lat progress: 5.0 s, 2959.3 tps, 0.334 stddev 0.553 ms lat ... progress: 16.0 s, 5009.2 tps, 0.197 stddev 0.381 ms lat ... progress: 24.0 s, 7051.2 tps, 0.139 stddev 0.284 ms lat ... progress: 50.0 s, 6860.5 tps, 0.143 stddev 0.052 ms lat ... The warmup is quite fast because the DB is on a SSD. In the beginning the standard deviation is well over the average transaction time, but when the steady state is reached (later) it is much below. -- Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index 06dd709..a48498a 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -172,6 +172,7 @@ int agg_interval; /* log aggregates instead of individual * transactions */ int progress = 0; /* thread progress report every this seconds */ int progress_nclients = 0; /* number of clients for progress report */ +int progress_nthreads = 0; /* number of threads for progress report */ bool is_connect; /* establish connection for each transaction */ bool is_latencies; /* report per-command latencies */ int main_pid; /* main process id used in log filename */ @@ -214,6 +215,8 @@ typedef struct int nvariables; instr_time txn_begin; /* used for measuring transaction latencies */ instr_time stmt_begin; /* used for measuring statement latencies */ + int64 txn_latencies; /* cumulated latencies */ + int64 txn_sqlats; /* cumulated square latencies */ bool is_throttled; /* whether transaction throttling is done */ int use_file; /* index in sql_files for this client */ bool prepared[MAX_FILES]; @@ -243,6 +246,8 @@ typedef struct { instr_time conn_time; int xacts; + int64 latencies; + int64 sqlats; int64 throttle_lag; int64 throttle_lag_max; } TResult; @@ -1003,6 +1008,21 @@ top: } /* + * always record latency under progress or throttling + */ + if ((progress || throttle_delay) commands[st-state + 1] == NULL) + { + instr_time now, diff; + int64 latency; + INSTR_TIME_SET_CURRENT(now); + diff = now; + INSTR_TIME_SUBTRACT(diff, st-txn_begin); + latency = INSTR_TIME_GET_MICROSEC(diff); + st-txn_latencies += latency; + st-txn_sqlats += latency*latency; + } + + /* * if transaction finished, record the time it took in the log */ if (logfile commands[st-state + 1] == NULL) @@ -1191,8 +1211,8 @@ top: goto top; } - /* Record transaction start time if logging is enabled */ - if (logfile st-state == 0) + /* Record transaction start time under logging, progress or throttling */ + if ((logfile || progress || throttle_delay) st-state == 0) INSTR_TIME_SET_CURRENT(st-txn_begin); /* Record statement start time if per-command latencies are requested */ @@ -2096,6 +2116,7 @@ static void printResults(int ttype, int normal_xacts, int nclients, TState *threads, int nthreads, instr_time total_time, instr_time conn_total_time, + int64 total_latencies, int64 total_sqlats, int64 throttle_lag, int64 throttle_lag_max) { double time_include, @@ -2135,6 +2156,22 @@ printResults(int ttype, int normal_xacts, int nclients, normal_xacts); } + if (throttle_delay || progress) + { + /* compute and show latency average and standard deviation */ + double latency = 0.001 * total_latencies / normal_xacts; + double sqlat = (double) total_sqlats / normal_xacts; + printf(latency average: %.3f ms\n + latency stddev: %.3f ms\n, + latency, 0.001 * sqrt(sqlat - 100*latency*latency)); + } + else + { + /* only an average latency computed from the duration is available */ + printf(latency average: %.3f ms\n, + 1000.0 * duration / normal_xacts); + } + if (throttle_delay) { /* @@ -2143,7 +2180,7 @@ printResults(int ttype, int normal_xacts, int nclients, * transaction. The measured lag may be caused by thread/client load, * the database load, or the Poisson
Re: [HACKERS] pgbench progress report improvements - split 2
On Tue, Sep 24, 2013 at 08:42:15PM +0200, Fabien COELHO wrote: meet all those goals simultaneously with simpler code, can we not? int64 wait = (int64) (throttle_delay * Min(7.0, -log(1 - pg_erand48(thread-random_state; If you truncate roughly the multipler, as it is done here with min, you necessarily create a bias, my random guess would be a 0.5% under estimation, but maybe it is more... Thus you would have to compute and the correcting factor as well. Its computation is a little bit less easy than with the quantized formula where I just used a simple SUM, and you have to really do the maths here. So I would keep the simple solution, but it is fine with me if you do the maths! Ah, true; I guess each approach has different advantages. I've committed your latest version. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2
On Tue, Sep 24, 2013 at 10:41:17PM +0200, Fabien COELHO wrote: These changes are coupled because measures are changed, and their reporting as well. Submitting separate patches for these different features would result in conflicting or dependent patches, so I wish to avoid that if possible. My feelings on the patch split haven't changed; your three bullet points call for four separate patches. Conflicting patches are bad, but dependent patches are okay; just disclose the dependency order. How about this: as a next step, please extract just this feature that I listed last Saturday: Patch (4): Redefine latency as reported by pgbench and report lag more. Once that's committed, we can move on to others. nm -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 2
Hello Noah, meet all those goals simultaneously with simpler code, can we not? int64 wait = (int64) (throttle_delay * Min(7.0, -log(1 - pg_erand48(thread-random_state; If you truncate roughly the multipler, as it is done here with min, you necessarily create a bias, my random guess would be a 0.5% under estimation, but maybe it is more... Thus you would have to compute and the correcting factor as well. Its computation is a little bit less easy than with the quantized formula where I just used a simple SUM, and you have to really do the maths here. So I would keep the simple solution, but it is fine with me if you do the maths! -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements
Dear Robert, (1) ... I really don't think it's a good idea to change the default behavior. We've had many discussions about how the overhead of gettimeofday() can sometimes be surprisingly large; I don't think we should assume it's guaranteed to be small in this case. Also, changing established defaults will annoy users who like to present defaults; I don't see any reason to assume that your preferences will be universal. In doubtful cases we should favor leaving the behavior the way it's been in previous releases, because backward-compatibility is very desirable. I argued in another mail precisely, with worst figures found on the net, about the relative overhead of gettimeofday as used by pgbench, which is also handling network traffic and waiting for the database to perform actual transactions. I do not thing that I'm assuming, and I'm trying to argument with objective data. Anyway, this particular behavior is preserved by the current version of the patch, so no worry. The additional gettimeofday is *not* perform under standard silent benchmarking, and the standard deviation of the latency is not measured, because it can't. It is however measured under --rate and --progress. It is necessary for rate because otherwise the computed latency is false. It is done for progress because if you are interested to know what is going on, I assume that it would make sense to provide this data. I must admit that I think, un-universaly, that people should care to know that their transaction latency can vary by several order of magnitudes, but this opinion is clearly not shared. I tried to preserve the row-counting behavior because I thought that someone would object otherwise, but I would be really in favor of dropping the row-counting report behavior altogether and only keep the time triggered report. -1 for changing this again. Frankly, I might have come down in a different place if I had understood exactly how this was going to end up being committed; it ended up being quite different from what I was expecting. But I really think that relitigating this just so we can break backward compatibility again one release later is not a good use of anyone's time, or a good idea in general. The current status on my small (SSD) laptop is that pgbench -i throws about 10 lines of 100,000-rows progress report per second. I must be a slow reader because I can't read that fast. So either other users can read much faster than me, or they have slower computers:-) ISTM that it is no big deal to change this kind of thing on a minor contrib tool of postgresql if it is reasonnably better and useful, and I would be surprise and seeing protests about changing an unreadable flow to a readable one. Anyway, let us keep this default behavior, as I hope there must be people who like it. Well, if anyone could tell me that he/she likes better having 10 lines a second thrown on the screen than a regular progress report every few seconds, I would feel less stupid at reinstating this behavior and re-submitting a new version of the patch. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3
On 09/22/2013 10:44 PM, Fabien COELHO wrote: Due to the possibly repetitive table structure of the data, maybe CSV would make sense as well. It is less extensible, but it is directly usable by sqlite or pg. I'd be OK with CSV. At least I wouldn't be regexing the output. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3 v2
Split 3 of the initial submission, which actually deal with data measured and reported on stderr under various options. This version currently takes into account many comments by Noah Mish. In particular, the default no report behavior under benchmarking is not changed, although I really think that it should. Also, the standard deviation is only shown when available, which is not in under all settings, because of concerns expressed about the cost of additional calls to gettimeofday. ISTM that these concerned are misplaced in this particular case. Yet another version to fulfill requirements by Robert Haas not to change current default behaviors. The initialization progress is reported every 100-k rows, and there is no progress report for benchmarking. Improve pgbench measurements progress report These changes are coupled because measures are changed, and their reporting as well. Submitting separate patches for these different features would result in conflicting or dependent patches, so I wish to avoid that if possible. - Use same progress and quiet options both under init bench. However, the default reporting is NOT changed, as required by Robert Haas. It is currently every 100k rows when initializing, and nothing when benchmarking. I would suggest to change this default to a few seconds for both. The 100-k row report is very verbose an unreadable on good hardware. For benchmarking, the rational is that it is error prone as it must run a long time to be significant because of warmup effects (esp on HDD, less true on SSD). Seeing how the current performance evolve would help pgbench users realise that. See down below a sample output. - Measure transaction latency instead of computing it, for --rate and --progress. The previous computed figure does not make sense under throttling: as sleep throttling time was included in the figures, the displayed results were plain false. The latency and its standard deviation are printed out under progress and in the final report when available. It could be made always available, but that would require to accept additional gettimeofday calls. I do not think that there is a performance issue here, but there is significant opposition to the idea. - Take thread start time at the beginning of the thread (!) Otherwise it includes pretty slow thread/fork system start times in the measurements. May help with bug #8326. This also helps with throttling, as the start time is used to adjust the rate: if it is not the actual start time, there is a bit of a rush at the beginning in order to catch up. Taking the start time when the thread actually starts is the sane thing to do to avoid surprises at possibly strange measures on short runs. Sample output under initialization with --progress=1 creating tables... 1126000 of 300 tuples (37%) done (elapsed 1.00 s, remaining 1.67 s). 2106000 of 300 tuples (70%) done (elapsed 2.00 s, remaining 0.85 s). 2824000 of 300 tuples (94%) done (elapsed 3.00 s, remaining 0.19 s). 300 of 300 tuples (100%) done (elapsed 3.19 s, remaining 0.00 s). vacuum... set primary keys... done. Sample output under benchmarking with --progress=1 progress: 1.0 s, 2626.1 tps, 0.374 stddev 0.597 ms lat progress: 2.0 s, 2766.6 tps, 0.358 stddev 0.588 ms lat progress: 3.0 s, 2567.4 tps, 0.385 stddev 0.665 ms lat progress: 4.0 s, 3014.2 tps, 0.328 stddev 0.593 ms lat progress: 5.0 s, 2959.3 tps, 0.334 stddev 0.553 ms lat ... progress: 16.0 s, 5009.2 tps, 0.197 stddev 0.381 ms lat ... progress: 24.0 s, 7051.2 tps, 0.139 stddev 0.284 ms lat ... progress: 50.0 s, 6860.5 tps, 0.143 stddev 0.052 ms lat ... The warmup is quite fast because the DB is on a SSD. In the beginning the standard deviation is well over the average transaction time, but when the steady state is reached (later) it is much below. -- Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index ad8e272..934244b 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -167,11 +167,13 @@ char *index_tablespace = NULL; #define SCALE_32BIT_THRESHOLD 2 bool use_log; /* log transaction latencies to a file */ -bool use_quiet; /* quiet logging onto stderr */ int agg_interval; /* log aggregates instead of individual * transactions */ -int progress = 0; /* thread progress report every this seconds */ +bool use_quiet = false; /* more or less quiet logging onto stderr */ +int progress = -1; /* report every this seconds. + 0 is no report, -1 is not set yet */ int progress_nclients = 0; /* number of clients for progress report */ +int progress_nthreads = 0; /* number of threads for progress report */ bool is_connect; /* establish connection for each transaction */ bool is_latencies; /* report per-command latencies */ int main_pid;
Re: [HACKERS] pgbench progress report improvements - split 3
On 9/22/13 2:58 PM, Fabien wrote: Split 3 of the initial submission, which actually deal with data measured and reported on stderr under various options. It seems this patch doesn't apply. Does it need the first two applied first? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3
Dear Peter, Split 3 of the initial submission, which actually deal with data measured and reported on stderr under various options. It seems this patch doesn't apply. Does it need the first two applied first? Oops. Indeed. The patch is fully independent of the two others. It was generated with git diff -b master, because I was told not to change spacing, but then it must be applied with patch -l -p1. Please find attached a new version without this requirement. postgresql git branch test postgresql git checkout test Switched to branch 'test' postgresql patch -p1 ../pgbench-measurements-v4.patch patching file contrib/pgbench/pgbench.c patching file doc/src/sgml/pgbench.sgml -- Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index ad8e272..2590b12 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -167,11 +167,12 @@ char *index_tablespace = NULL; #define SCALE_32BIT_THRESHOLD 2 bool use_log; /* log transaction latencies to a file */ -bool use_quiet; /* quiet logging onto stderr */ int agg_interval; /* log aggregates instead of individual * transactions */ -int progress = 0; /* thread progress report every this seconds */ +int progress = -1; /* report every this seconds. + 0 is quiet, -1 is not set yet */ int progress_nclients = 0; /* number of clients for progress report */ +int progress_nthreads = 0; /* number of threads for progress report */ bool is_connect; /* establish connection for each transaction */ bool is_latencies; /* report per-command latencies */ int main_pid; /* main process id used in log filename */ @@ -214,6 +215,8 @@ typedef struct int nvariables; instr_time txn_begin; /* used for measuring transaction latencies */ instr_time stmt_begin; /* used for measuring statement latencies */ + int64 txn_latencies; /* cumulated latencies */ + int64 txn_sqlats; /* cumulated square latencies */ bool is_throttled; /* whether transaction throttling is done */ int use_file; /* index in sql_files for this client */ bool prepared[MAX_FILES]; @@ -243,6 +246,8 @@ typedef struct { instr_time conn_time; int xacts; + int64 latencies; + int64 sqlats; int64 throttle_lag; int64 throttle_lag_max; } TResult; @@ -347,7 +352,6 @@ usage(void) -i, --initialize invokes initialization mode\n -F, --fillfactor=NUM set fill factor\n -n, --no-vacuum do not run VACUUM after initialization\n - -q, --quiet quiet logging (one message each 5 seconds)\n -s, --scale=NUM scaling factor\n --foreign-keys create foreign key constraints between tables\n --index-tablespace=TABLESPACE\n @@ -367,7 +371,6 @@ usage(void) (default: simple)\n -n, --no-vacuum do not run VACUUM before tests\n -N, --skip-some-updates skip updates of pgbench_tellers and pgbench_branches\n - -P, --progress=NUM show thread progress report every NUM seconds\n -r, --report-latencies report average latency per command\n -R, --rate=SPEC target rate in transactions per second\n -s, --scale=NUM report this scale factor in output\n @@ -382,6 +385,8 @@ usage(void) -d, --debug print debugging output\n -h, --host=HOSTNAME database server host or socket directory\n -p, --port=PORT database server port number\n + -P, --progress=NUM show progress report every NUM seconds\n + -q, --quiet no progress report\n -U, --username=USERNAME connect as specified database user\n -V, --versionoutput version information, then exit\n -?, --help show this help, then exit\n @@ -1003,6 +1008,21 @@ top: } /* + * always record latency under progress or throttling + */ + if ((progress || throttle_delay) commands[st-state + 1] == NULL) + { + instr_time now, diff; + int64 latency; + INSTR_TIME_SET_CURRENT(now); + diff = now; + INSTR_TIME_SUBTRACT(diff, st-txn_begin); + latency = INSTR_TIME_GET_MICROSEC(diff); + st-txn_latencies += latency; + st-txn_sqlats += latency*latency; + } + + /* * if transaction finished, record the time it took in the log */ if (logfile commands[st-state + 1] == NULL) @@ -1191,8 +1211,11 @@ top: goto top; } - /* Record transaction start time if logging is enabled */ - if (logfile st-state == 0) + /* + * Record transaction start time + * if logging, progress or throttling is enabled + */ + if ((logfile || progress || throttle_delay) st-state == 0) INSTR_TIME_SET_CURRENT(st-txn_begin); /* Record statement start time if per-command latencies are requested */ @@ -1678,25 +1701,8 @@ init(bool is_no_vacuum) exit(1); } - /* - * If we want to
Re: [HACKERS] pgbench progress report improvements
On Sat, Sep 21, 2013 at 4:55 AM, Fabien COELHO coe...@cri.ensmp.fr wrote: - Use progress option both under init bench. Activate progress report by default, every 5 seconds. When initializing, --quiet reverts to the old every 100,000 insertions behavior... Patch (1): Change the default from --progress=0 to --progress=5 This has a disadvantage of causing two extra gettimeofday() calls per transaction. That's not cheap on all platforms; a user comparing pgbench results across versions will need to make a point of disabling this again to make his results comparable. Given that threat and uncertainty about which default would be more popular, I think we should drop this part. The gettimeofday call time is very small compared to network and database (disk!) accesses involved in a pgbench run. On my small laptop, gettimeofday takes about 0.0002 seconds (0.02 µs -- micro seconds) per call, which is typically under 1/1000 of the fastest local-socket read-only one-table cache-hit prepared trivial transaction. This is a little different when initializing, but I put a guard on gettimeofday in that case. If this is a blocker, I can put 0, but I really do not think it is necessary because of the performance impact. I really don't think it's a good idea to change the default behavior. We've had many discussions about how the overhead of gettimeofday() can sometimes be surprisingly large; I don't think we should assume it's guaranteed to be small in this case. Also, changing established defaults will annoy users who like to present defaults; I don't see any reason to assume that your preferences will be universal. In doubtful cases we should favor leaving the behavior the way it's been in previous releases, because backward-compatibility is very desirable. Patch (2): Make --initialize mode respect --progress. The definition you've chosen for --quiet makes that option contrary to its own name: message-per-100k-tuples is typically more voluminous than your proposed new default of message-per-5s. In fact, since --quiet currently switches from message-per-100k-tuples to message-per-5s, your patch gives it the exact opposite of its present effect. During the 9.3 development cycle, we _twice_ made changes pertaining to --initialize message frequency: http://www.postgresql.org/message-id/flat/20120620.170427.347012755716399568.t-is...@sraoss.co.jp http://www.postgresql.org/message-id/flat/1346472039.18010.10.ca...@vanquo.pezone.net That gives me pause about working through yet another change; we keep burning time on this minor issue. I totally agree that this quiet is not convincing! My actual opinion is that quiet should just mean quiet:-), i.e. no progress report. I tried to preserve the row-counting behavior because I thought that someone would object otherwise, but I would be really in favor of dropping the row-counting report behavior altogether and only keep the time triggered report. -1 for changing this again. Frankly, I might have come down in a different place if I had understood exactly how this was going to end up being commited; it ended up being quite different from what I was expecting. But I really think that relitigating this just so we can break backward compatibility again one release later is not a good use of anyone's time, or a good idea in general. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 1
On Sun, Sep 22, 2013 at 08:44:08PM +0200, Fabien COELHO wrote: pgbench: minor update of documentation help message. Use NUM in help message for homogeneity with other options. The target *start* time of the transaction is set by the stochastic process which is doing the throttling (--rate), not the end time. Committed. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 2
On Sun, Sep 22, 2013 at 08:46:55PM +0200, Fabien wrote: pgbench: reduce and compensate throttling underestimation bias. This is a consequence of relying on an integer random generator, which allow to ensure that delays inserted stay reasonably in range of the target average delay. The bias was about 0.5% with 1000 distinct values. Multiplier added to compensate the 0.05% bias with 1 distinct integer values, so there is no bias now. --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -929,13 +929,17 @@ top: * that the series of delays will approximate a Poisson distribution * centered on the throttle_delay time. * - * 1000 implies a 6.9 (-log(1/1000)) to 0.0 (log 1.0) delay multiplier. + * 1 implies a 9.2 (-log(1/1)) to 0.0 (log 1) delay multiplier, + * and results in a 0.055 % target underestimation bias: + * + * SELECT 1.0/AVG(-LN(i/1.0)) FROM generate_series(1,1) AS i; + * = 1.00055271703266335474 * * If transactions are too slow or a given wait is shorter than * a transaction, the next transaction will start right away. */ - int64 wait = (int64) - throttle_delay * -log(getrand(thread, 1, 1000)/1000.0); + int64 wait = (int64) (throttle_delay * + 1.00055271703 * -log(getrand(thread, 1, 1)/1.0)); thread-throttle_trigger += wait; On Sun, Sep 22, 2013 at 10:08:54AM +0200, Fabien COELHO wrote: Then I understand why you want to remove the bias, but why do you also increase the upper bound? Because the bias was significantly larger for 1000 (about 0.5%), so this also contributed to reduce said bias, and 9.2 times the average target seems as reasonnable a bound as 6.9. The magnitude of the bias is unimportant if known exactly, as you do here. Smaller quanta do give you more possible distinct sleep durations, which is a nice bonus. What I'm hearing is that you don't especially want to change the upper bound, but changing it was an acceptable side effect. However, we can meet all those goals simultaneously with simpler code, can we not? int64 wait = (int64) (throttle_delay * Min(7.0, -log(1 - pg_erand48(thread-random_state; -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements
Dear Noah, Thanks for your answers and remarks, [...] I'll split some part of the patch where there is no coupling, but I do not want to submit conflicting patches. Those benefits aren't compelling enough to counterbalance the risk of gettimeofday() overhead affecting results. (Other opinions welcome.) Yep. If I really leave gettimeofday out, I cannot measure the stddev, so I'll end up with: --rate= gettimeofday, mean ( stddev) measured, because they cannot be derived otherwise. no --rate = no (or less) gettimeofday, mean computed from total time and no stddev report because it cannot be computed. That is annoying, because I do want the standard deviation, and this mean one ifs complexity here and there. ISTM that when running under a target time, the (hopefully very small) overhead is only one gettimeofday call, because the other one is taken anyway to check whether the thread should stop. Or I can add a yet another option, say --stddev, to ask for standard deviation, which will imply the additional gettimeofday call... For a tool like pgbench that requires considerable skill to use effectively, changing a default only helps slightly. It doesn't take much of a risk to make us better off leaving the default unchanged. I can put a 0 default... but even very experienced people will be bitten over and over. Why should I care? ISTM that the default should be the safe option, and experienced user can request -quiet if they want it. [...] I tried to preserve the row-counting behavior because I thought that someone would object otherwise, but I would be really in favor of dropping the row-counting report behavior altogether and only keep the time triggered report. I would be fine with dropping the row-counting behavior. But why subject this distant side issue to its third round of bikeshedding in 18 months? I was not involved:-) The 10 behavior is the initial old version, and only applies to initialization. Someone found it too verbose when scaling, and I agree, so made a quick patch which preserves the old behavior (someone must have said: whatever, do not change the default!) but allowed to switch to a less noisy version, hence the -quiet which is not really quiet. This would be the defective result of a compromise:-) If I follow your request not to change any default, I cannot merge cleanly the -i bench behaviors, as currenty -i does have a default progress report and its own -quiet, and benchmarking does not. The current behavior is inconsistent. I would prefer something consistent, preferably always show a progress report, and -quiet hides it (fully), or if you really insist no progress report, and --progress shows it, and the -quiet option is removed. - Take thread start time at the beginning of the thread. That theory is sound, but I would like at least one report reproducing that behavior and finding that this change improved it. [...] so I'm inclined to leave it alone. I really spent *hours* debugging stupid measures on the previous round of pgbench changes, when adding the throttling stuff. Having the start time taken when the thread really starts is just sanity, and I needed that just to rule out that it was the source of the strange measures. -j 800 vs -j 100 : ITM that if I you create more threads, the time delay incurred is cumulative, so the strangeness of the result should worsen. 800 threads ~ possibly 10 seconds of creation time, to be compared to a few seconds of run time. Shouldn't it just be: int64 wait = (int64) (throttle_delay * -log(1 - pg_erand48(thread-random_state))); [...] Ah; that makes sense. Then I understand why you want to remove the bias, but why do you also increase the upper bound? Because the bias was significantly larger for 1000 (about 0.5%), so this also contributed to reduce said bias, and 9.2 times the average target seems as reasonnable a bound as 6.9. It is also printed without --rate. There is a if above because there is one report with lag (under --rate), and one without. The code I quoted is for the final report in printResults(), and that only shows latency mean/stddev when using --rate. The progress reporting in threadRun() does have two variations as you describe. Indeed, I took it for the progress report. I'll check. It must be consistent whether under --rate or not. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements
It is also printed without --rate. There is a if above because there is one report with lag (under --rate), and one without. The code I quoted is for the final report in printResults(), and that only shows latency mean/stddev when using --rate. The progress reporting in threadRun() does have two variations as you describe. Indeed, I took it for the progress report. I'll check. It must be consistent whether under --rate or not. I checked. The current status is that stddev is currently only available under --progress or --rate, and displayed in the final report if --rate. I can add stddev in the final report *if* progress was on, but not otherwise. This means that under standard benchmarking (no --rate and without progress if it is not enabled by default) stddev cannot be shown. This is consistent with your requirement that gettimeofday calls must be avoided, but results in this necessary inconsistency in reporting results. The alternative is to always measure, but that would imply to always call gettimeofday, and I understood that it is a nogo for you. I think that your performance concerns are needless: I confirm a 50,000,000 gettimeofday calls/s on my laptop. On a desktop workstation I have 43 millions calls/s. The slowest I have found on an old server is 3.3 millions calls/s, that is 0.3 µs per call. Even so, this is to be compared to the fastest possible transaction latency I got which is about 140 µs (one local prepared in-memory table select on my laptop), so the overhead 1/450 of the minimum possible transaction time, pretty negligeable in my opinion. For transaction times which involve disk accesses, the latency is in ms and the overhead well under 1/1000. There are some measures about gettimeofday overhead reported in: http://www.atl.external.lmco.com/projects/QoS/POSIX_html/index.html The worst figure is for a Pentium 166 MHz (!), the overhead is 86.6 µs. However typical values are around or under 1 µs, even for old Pentium II 350 MHz PCs, and this is less that 1% of our minimum possible transaction time. Note also that under option --report-latencies, the number of gettimeofday calls is pretty large, and nobody complained. So I'm of the opinion that we should not worry at all about the number of gettimeofday calls in pgbench. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 1
Split 1 of the initial submission. pgbench: minor update of documentation help message. Use NUM in help message for homogeneity with other options. The target *start* time of the transaction is set by the stochastic process which is doing the throttling (--rate), not the end time. -- Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index ad8e272..06dd709 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -369,7 +369,7 @@ usage(void) -N, --skip-some-updates skip updates of pgbench_tellers and pgbench_branches\n -P, --progress=NUM show thread progress report every NUM seconds\n -r, --report-latencies report average latency per command\n - -R, --rate=SPEC target rate in transactions per second\n + -R, --rate=NUM target rate in transactions per second\n -s, --scale=NUM report this scale factor in output\n -S, --select-onlyperform SELECT-only transactions\n -t, --transactions number of transactions each client runs diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml index 49a79b1..5871b45 100644 --- a/doc/src/sgml/pgbench.sgml +++ b/doc/src/sgml/pgbench.sgml @@ -429,7 +429,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /para para The rate is targeted by starting transactions along a -Poisson-distributed schedule time line. The expected finish time +Poisson-distributed schedule time line. The expected start time schedule moves forward based on when the client first started, not when the previous transaction ended. That approach means that when transactions go past their original scheduled end time, it is -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 2
Split 2 of the initial submission pgbench: reduce and compensate throttling underestimation bias. This is a consequence of relying on an integer random generator, which allow to ensure that delays inserted stay reasonably in range of the target average delay. The bias was about 0.5% with 1000 distinct values. Multiplier added to compensate the 0.05% bias with 1 distinct integer values, so there is no bias now. -- Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index ad8e272..572573a 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -929,13 +929,17 @@ top: * that the series of delays will approximate a Poisson distribution * centered on the throttle_delay time. * - * 1000 implies a 6.9 (-log(1/1000)) to 0.0 (log 1.0) delay multiplier. + * 1 implies a 9.2 (-log(1/1)) to 0.0 (log 1) delay multiplier, + * and results in a 0.055 % target underestimation bias: + * + * SELECT 1.0/AVG(-LN(i/1.0)) FROM generate_series(1,1) AS i; + * = 1.00055271703266335474 * * If transactions are too slow or a given wait is shorter than * a transaction, the next transaction will start right away. */ - int64 wait = (int64) - throttle_delay * -log(getrand(thread, 1, 1000)/1000.0); + int64 wait = (int64) (throttle_delay * + 1.00055271703 * -log(getrand(thread, 1, 1)/1.0)); thread-throttle_trigger += wait; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3
Split 3 of the initial submission, which actually deal with data measured and reported on stderr under various options. This version currently takes into account many comments by Noah Mish. In particular, the default no report behavior under benchmarking is not changed, although I really think that it should. Also, the standard deviation is only shown when available, which is not in under all settings, because of concerns expressed about the cost of additional calls to gettimeofday. ISTM that these concerned are misplaced in this particular case. Improve pgbench measurements progress report These changes are coupled because measures are changed, and their reporting as well. Submitting separate patches for these interlinked features would result in conflicting or dependent patches, so I wish to avoid that. Also it would require more reviewer time. - Use same progress and quiet options both under init bench. The reporting is every 5 seconds for initialization, and currently no report for benchmarking. I strongly suggest to change this default to 5 seconds for both, if people do not veto any change of the default behavior... The rational is that benchmarking is error prone as it must run a long time to be significant because of warmup effects (esp on HDD, less true on SSD). Seeing how the current performance evolve would help pgbench users realise that. See down below a sample output. The previous progress report under initialization, which printed a line every 100,000 rows, is removed, as it is much to verbose for most hardware, and pretty long for any significant scale. - Measure transaction latency instead of computing it, for --rate and --progress. The previous computed figure does not make sense under throttling: as sleep throttling time was included in the figures, the displayed results were plain false. The latency and its standard deviation are printed out under progress and in the final report when available. It could be made always available, but that would require to accept additional gettimeofday calls. I do not think that there is a performance issue here, but there is no concensus yet. - Take thread start time at the beginning of the thread (!) Otherwise it includes pretty slow thread/fork system start times in the measurements. May help with bug #8326. This also helps with throttling, as the start time is used to adjust the rate: if it is not the actual start time, there is a bit of a rush at the beginning in order to catch up. Taking the start time when the thread actually starts is the sane thing to do to avoid surprises at possibly strange measures on short runs. Sample output under initialization with --progress=1 creating tables... 1126000 of 300 tuples (37%) done (elapsed 1.00 s, remaining 1.67 s). 2106000 of 300 tuples (70%) done (elapsed 2.00 s, remaining 0.85 s). 2824000 of 300 tuples (94%) done (elapsed 3.00 s, remaining 0.19 s). 300 of 300 tuples (100%) done (elapsed 3.19 s, remaining 0.00 s). vacuum... set primary keys... done. Sample output under benchmarking with --progress=1 progress: 1.0 s, 2626.1 tps, 0.374 stddev 0.597 ms lat progress: 2.0 s, 2766.6 tps, 0.358 stddev 0.588 ms lat progress: 3.0 s, 2567.4 tps, 0.385 stddev 0.665 ms lat progress: 4.0 s, 3014.2 tps, 0.328 stddev 0.593 ms lat progress: 5.0 s, 2959.3 tps, 0.334 stddev 0.553 ms lat ... progress: 16.0 s, 5009.2 tps, 0.197 stddev 0.381 ms lat ... progress: 24.0 s, 7051.2 tps, 0.139 stddev 0.284 ms lat ... progress: 50.0 s, 6860.5 tps, 0.143 stddev 0.052 ms lat ... The warmup is quite fast because the DB is on a SSD. In the beginning the standard deviation is well over the average transaction time, but when the steady state is reached (later) it is much below. -- Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index ad8e272..0bdf8dd 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -167,11 +167,12 @@ char *index_tablespace = NULL; #define SCALE_32BIT_THRESHOLD 2 bool use_log; /* log transaction latencies to a file */ -bool use_quiet; /* quiet logging onto stderr */ int agg_interval; /* log aggregates instead of individual * transactions */ -int progress = 0; /* thread progress report every this seconds */ +int progress = -1; /* report every this seconds. + 0 is quiet, -1 is not set yet */ int progress_nclients = 0; /* number of clients for progress report */ +int progress_nthreads = 0; /* number of threads for progress report */ bool is_connect; /* establish connection for each transaction */ bool is_latencies; /* report per-command latencies */ int main_pid; /* main process id used in log filename */ @@ -214,6 +215,8 @@ typedef struct int nvariables; instr_time txn_begin; /* used for measuring
Re: [HACKERS] pgbench progress report improvements - split 3
Fabien, As long as you're hacking pgbench output, what about offering a JSON option instead of the text output? I'm working on automating pgbench performance testing, and having the output in a proper delimited format would be really helpful. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements - split 3
Hello Josh, As long as you're hacking pgbench output, what about offering a JSON option instead of the text output? I'm working on automating pgbench performance testing, and having the output in a proper delimited format would be really helpful. I'm more a grep | cut | ... person, but I do understand your point about automation. My favorite hack is to load benchmark data into sqlite and perform query to build data extract for gnuplot figures. So I can put that in the stack of things to do when I have some time. Due to the possibly repetitive table structure of the data, maybe CSV would make sense as well. It is less extensible, but it is directly usable by sqlite or pg. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements
Hello Noah, Thanks a lot for all these comments! I'm not planning to apply all of them directly, especially removing features that I think really desirable. Please find a defense against some of your suggestions. I wish to wait for some more feedback about these arguments before spending time in heavy changes. Improve pgbench measurements progress report These changes are loosely coupled; please separate them into several patch files: I thought about this. I submitted a bunch of very small pgbench patches to the previous commit fest, and it resulted in a lot of duplicated tests by reviewers, which made it a waste of their precious time. ISTM that what you suggest will result in more reviewer time, which is the critical resource of the commitfest. Moreover, splitting the patch as you suggest, especially in 7 patches, would result in conflicting or dependent patches, which creates yet more issues for the reviews. There is *some* coupling, because the reports are about the measurements performed, and I change both to try to have something consistent in the end. I can separate the doc fix, but this is just one line... Basically, I tried to optimize reviewer time by a combined patch focussed on pgbench measurements and their reporting. But this does not prevent discussing features separately! - Use progress option both under init bench. Activate progress report by default, every 5 seconds. When initializing, --quiet reverts to the old every 100,000 insertions behavior... Patch (1): Change the default from --progress=0 to --progress=5 This has a disadvantage of causing two extra gettimeofday() calls per transaction. That's not cheap on all platforms; a user comparing pgbench results across versions will need to make a point of disabling this again to make his results comparable. Given that threat and uncertainty about which default would be more popular, I think we should drop this part. The gettimeofday call time is very small compared to network and database (disk!) accesses involved in a pgbench run. On my small laptop, gettimeofday takes about 0.0002 seconds (0.02 µs -- micro seconds) per call, which is typically under 1/1000 of the fastest local-socket read-only one-table cache-hit prepared trivial transaction. This is a little different when initializing, but I put a guard on gettimeofday in that case. If this is a blocker, I can put 0, but I really do not think it is necessary because of the performance impact. About the default behavior : it should be what is more useful. I found that running pgbench to get significant results requires long minutes or even hours because of warmup time. Running a command for 30 minutes without any feedback is annoying. The second point is that the feedback would help user notice that the figures evolve slowly but significantly, and realise that their too short runs are not significant at all. So it seems to me that it should be the desirable behavior, and --progress=0 is always possible anyway for the very performance (over-)conscious tester. Patch (2): Make --initialize mode respect --progress. The definition you've chosen for --quiet makes that option contrary to its own name: message-per-100k-tuples is typically more voluminous than your proposed new default of message-per-5s. In fact, since --quiet currently switches from message-per-100k-tuples to message-per-5s, your patch gives it the exact opposite of its present effect. During the 9.3 development cycle, we _twice_ made changes pertaining to --initialize message frequency: http://www.postgresql.org/message-id/flat/20120620.170427.347012755716399568.t-is...@sraoss.co.jp http://www.postgresql.org/message-id/flat/1346472039.18010.10.ca...@vanquo.pezone.net That gives me pause about working through yet another change; we keep burning time on this minor issue. I totally agree that this quiet is not convincing! My actual opinion is that quiet should just mean quiet:-), i.e. no progress report. I tried to preserve the row-counting behavior because I thought that someone would object otherwise, but I would be really in favor of dropping the row-counting report behavior altogether and only keep the time triggered report. - Measure transaction latency instead of computing it. Patch (3): Report the standard deviation of latency. Seems potentially useful; see inline comments below. In my limited testing, the interesting latency cases involved stray clusters of transactions showing 10x-100x mean latency. If that's typical, I'd doubt mean/stddev is a great analytical tool. But I have little reason to believe that my experience here was representative. The stddev measurement is sensitive to the cache hit ratio, so this is a good indicator of whether the steady state is reached. Indeed, mean is the most basic tool, and stddev the second most basic... They are better than nothing, and the stddev does add value.
Re: [HACKERS] pgbench progress report improvements
On Sat, Sep 21, 2013 at 4:55 AM, Fabien COELHO coe...@cri.ensmp.fr wrote: I suggest getting the term stddev in there somehow, maybe like this: progress: 37.0 s, 115.2 tps, latency avg=8.678 ms stddev=1.792 My issue is to try to keep the line width under control so as to avoid line breaks in the terminal. Under throttling, there is also the time lag appended to the line. Moreover, using 'xxx=figure breaks simple cut pipelining to extract the figures, so I would prefer to stick to spaces. Maybe: progress: 36.0 s, 115.2 tps, lat avg 9.678 ms stddev 1.792, lag 0.143 ms but I liked my +- approach:-) 100 +- 3 implies a range of 97 to 103 and no values are outside of that range.
Re: [HACKERS] pgbench progress report improvements
Moreover, using 'xxx=figure breaks simple cut pipelining to extract the figures, so I would prefer to stick to spaces. Maybe: progress: 36.0 s, 115.2 tps, lat avg 9.678 ms stddev 1.792, lag 0.143 ms but I liked my +- approach:-) 100 +- 3 implies a range of 97 to 103 and no values are outside of that range. This is one of the possible meaning, but ISTM that it is not exclusive. Anyway here we do not have a symmetric distribution of run times. Well, it seems I'm alone in liking my +-, so I'll backtrack on that. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements
On Sat, Sep 21, 2013 at 10:55:54AM +0200, Fabien COELHO wrote: Improve pgbench measurements progress report These changes are loosely coupled; please separate them into several patch files: I thought about this. I submitted a bunch of very small pgbench patches to the previous commit fest, and it resulted in a lot of duplicated tests by reviewers, which made it a waste of their precious time. ISTM that what you suggest will result in more reviewer time, which is the critical resource of the commitfest. Moreover, splitting the patch as you suggest, especially in 7 patches, would result in conflicting or dependent patches, which creates yet more issues for the reviews. There is *some* coupling, because the reports are about the measurements performed, and I change both to try to have something consistent in the end. I can separate the doc fix, but this is just one line... Post all the patch files to the same mailing list thread and have them reviewed under a single CommitFest entry. A reviewer or committer preferring less granularity can squash patches together effortlessly. The inverse task, splitting a patch, is not so mechanical. Consequently, err on the side of more, smaller patches; it almost always saves reviewer and committer time. Basically, I tried to optimize reviewer time by a combined patch focussed on pgbench measurements and their reporting. Relatedness of changes is a continuum, and committers sometimes disagree on how far to subdivide before commit. This patch is not a borderline case; removing a bias from --rate throttle delays does not belong in the same commit as enabling progress reports by default. But this does not prevent discussing features separately! Actually, it does hamper debate. With smaller patches, a person interested in just one issue can read just that patch. With a monolithic patch, one must take in the entire thing to fully comprehend any single issue. Patch (1): Change the default from --progress=0 to --progress=5 This has a disadvantage of causing two extra gettimeofday() calls per transaction. That's not cheap on all platforms; a user comparing pgbench results across versions will need to make a point of disabling this again to make his results comparable. Given that threat and uncertainty about which default would be more popular, I think we should drop this part. The gettimeofday call time is very small compared to network and database (disk!) accesses involved in a pgbench run. On my small laptop, gettimeofday takes about 0.0002 seconds (0.02 µs -- micro seconds) per call, which is typically under 1/1000 of the fastest local-socket read-only one-table cache-hit prepared trivial transaction. Many systems do have cheap gettimeofday(). Many don't. We recently added contrib/pg_test_timing in response to the variability in this area. I found that running pgbench to get significant results requires long minutes or even hours because of warmup time. Running a command for 30 minutes without any feedback is annoying. The second point is that the feedback would help user notice that the figures evolve slowly but significantly, and realise that their too short runs are not significant at all. So it seems to me that it should be the desirable behavior, and --progress=0 is always possible anyway for the very performance (over-)conscious tester. Those benefits aren't compelling enough to counterbalance the risk of gettimeofday() overhead affecting results. (Other opinions welcome.) For a tool like pgbench that requires considerable skill to use effectively, changing a default only helps slightly. It doesn't take much of a risk to make us better off leaving the default unchanged. Patch (2): Make --initialize mode respect --progress. The definition you've chosen for --quiet makes that option contrary to its own name: message-per-100k-tuples is typically more voluminous than your proposed new default of message-per-5s. In fact, since --quiet currently switches from message-per-100k-tuples to message-per-5s, your patch gives it the exact opposite of its present effect. During the 9.3 development cycle, we _twice_ made changes pertaining to --initialize message frequency: http://www.postgresql.org/message-id/flat/20120620.170427.347012755716399568.t-is...@sraoss.co.jp http://www.postgresql.org/message-id/flat/1346472039.18010.10.ca...@vanquo.pezone.net That gives me pause about working through yet another change; we keep burning time on this minor issue. I totally agree that this quiet is not convincing! My actual opinion is that quiet should just mean quiet:-), i.e. no progress report. I tried to preserve the row-counting behavior because I thought that someone would object otherwise, but I would be really in favor of dropping the row-counting report behavior altogether and only keep the time triggered report. I would be fine with
Re: [HACKERS] pgbench progress report improvements
For others following along, Pavel Stehule reviewed this on a different thread: http://www.postgresql.org/message-id/cafj8pravyuhv8x3feewasggvdcijogvlzsavd7rmwn9mw8r...@mail.gmail.com On Tue, Aug 06, 2013 at 10:47:14AM +0200, Fabien wrote: Improve pgbench measurements progress report These changes are loosely coupled; please separate them into several patch files: - Use progress option both under init bench. Activate progress report by default, every 5 seconds. When initializing, --quiet reverts to the old every 100,000 insertions behavior... Patch (1): Change the default from --progress=0 to --progress=5 This has a disadvantage of causing two extra gettimeofday() calls per transaction. That's not cheap on all platforms; a user comparing pgbench results across versions will need to make a point of disabling this again to make his results comparable. Given that threat and uncertainty about which default would be more popular, I think we should drop this part. Patch (2): Make --initialize mode respect --progress. The definition you've chosen for --quiet makes that option contrary to its own name: message-per-100k-tuples is typically more voluminous than your proposed new default of message-per-5s. In fact, since --quiet currently switches from message-per-100k-tuples to message-per-5s, your patch gives it the exact opposite of its present effect. During the 9.3 development cycle, we _twice_ made changes pertaining to --initialize message frequency: http://www.postgresql.org/message-id/flat/20120620.170427.347012755716399568.t-is...@sraoss.co.jp http://www.postgresql.org/message-id/flat/1346472039.18010.10.ca...@vanquo.pezone.net That gives me pause about working through yet another change; we keep burning time on this minor issue. - Measure transaction latency instead of computing it. The previous computed figure does not make sense under throttling, as sleep throttling time was included in the figures. The latency and its standard deviation are printed out under progress and in the final report. Patch (3): Report the standard deviation of latency. Seems potentially useful; see inline comments below. In my limited testing, the interesting latency cases involved stray clusters of transactions showing 10x-100x mean latency. If that's typical, I'd doubt mean/stddev is a great analytical tool. But I have little reason to believe that my experience here was representative. Patch (4): Redefine latency as reported by pgbench and report lag more. If we do this, should we also augment the --log output to contain the figures necessary to reconstruct this calculation? I think that would mean adding fields for the time when the first command of the transaction was sent. - Take thread start time at the beginning of the thread. Otherwise it includes pretty slow thread/fork system start times in the measurements. May help with bug #8326. Patch (5) That theory is sound, but I would like at least one report reproducing that behavior and finding that this change improved it. - Reduce and compensate throttling underestimation bias. This is a consequence of relying on an integer random generator. It was about 0.5% with 1000 distinct values. Multiplier added to compensate the 0.05% bias with 1 distinct integer values. Patch (6) See inline comment below. - Updated documentation help message. Include any relevant documentation and --help updates with the patch necessitating them. If there are errors in the documentation today, put fixes for those in a separate patch (7). Additionally, this patch contains numerous whitespace-only changes. Do not mix those into a patch that makes a functional change. Most of them would be done or undone by the next pgindent run, so I suggest simply dropping them. See the first entry here: https://wiki.postgresql.org/wiki/Submitting_a_Patch#Reasons_your_patch_might_be_returned Patches (5) and (6) are nicely small and free of UI questions. At a minimum, let's try to get those done in this CommitFest. I propose dropping patches (1) and (2). Patches (3) and (4) have open questions, but I think they have good value potential. Sample output under benchmarking with --progress=1 progress: 36.0 s, 10.9 tps, 91.864 +- 124.874 ms lat progress: 37.0 s, 115.2 tps, 8.678 +- 1.792 ms lat x +- y does not convey that y is a standard deviation. I suggest getting the term stddev in there somehow, maybe like this: progress: 37.0 s, 115.2 tps, latency avg=8.678 ms stddev=1.792 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c - -R, --rate=SPEC target rate in transactions per second\n + -R, --rate=NUM target rate in transactions per second\n This would fall under patch (7) if you feel it's needed. @@ -928,14 +931,18 @@ top: * Use inverse transform sampling to randomly generate a delay,
Re: [HACKERS] pgbench progress report improvements
On Tue, 2013-08-06 at 10:47 +0200, Fabien wrote: Here is a patch submission for reference to the next commitfest. Please replace the tabs in the SGML files with spaces. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench progress report improvements
Here is a patch submission for reference to the next commitfest. Please replace the tabs in the SGML files with spaces. Attached a v2 with tabs replaced by spaces. -- Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c index ad8e272..a90b188 100644 --- a/contrib/pgbench/pgbench.c +++ b/contrib/pgbench/pgbench.c @@ -103,7 +103,6 @@ extern int optind; #define MAXCLIENTS 1024 #endif -#define LOG_STEP_SECONDS 5 /* seconds between log messages */ #define DEFAULT_NXACTS 10 /* default nxacts */ int nxacts = 0; /* number of transactions per client */ @@ -167,11 +166,11 @@ char *index_tablespace = NULL; #define SCALE_32BIT_THRESHOLD 2 bool use_log; /* log transaction latencies to a file */ -bool use_quiet; /* quiet logging onto stderr */ int agg_interval; /* log aggregates instead of individual * transactions */ -int progress = 0; /* thread progress report every this seconds */ +int progress = 5; /* report every this seconds, 0 is quiet */ int progress_nclients = 0; /* number of clients for progress report */ +int progress_nthreads = 0; /* number of threads for progress report */ bool is_connect; /* establish connection for each transaction */ bool is_latencies; /* report per-command latencies */ int main_pid; /* main process id used in log filename */ @@ -214,6 +213,8 @@ typedef struct int nvariables; instr_time txn_begin; /* used for measuring transaction latencies */ instr_time stmt_begin; /* used for measuring statement latencies */ + int64 txn_latencies; /* cumulated latencies */ + int64 txn_sqlats; /* cumulated square latencies */ bool is_throttled; /* whether transaction throttling is done */ int use_file; /* index in sql_files for this client */ bool prepared[MAX_FILES]; @@ -243,8 +244,10 @@ typedef struct { instr_time conn_time; int xacts; - int64 throttle_lag; - int64 throttle_lag_max; + int64 latencies; + int64 sqlats; + int64 throttle_lag; + int64 throttle_lag_max; } TResult; /* @@ -347,7 +350,6 @@ usage(void) -i, --initialize invokes initialization mode\n -F, --fillfactor=NUM set fill factor\n -n, --no-vacuum do not run VACUUM after initialization\n - -q, --quiet quiet logging (one message each 5 seconds)\n -s, --scale=NUM scaling factor\n --foreign-keys create foreign key constraints between tables\n --index-tablespace=TABLESPACE\n @@ -367,9 +369,8 @@ usage(void) (default: simple)\n -n, --no-vacuum do not run VACUUM before tests\n -N, --skip-some-updates skip updates of pgbench_tellers and pgbench_branches\n - -P, --progress=NUM show thread progress report every NUM seconds\n -r, --report-latencies report average latency per command\n - -R, --rate=SPEC target rate in transactions per second\n + -R, --rate=NUM target rate in transactions per second\n -s, --scale=NUM report this scale factor in output\n -S, --select-onlyperform SELECT-only transactions\n -t, --transactions number of transactions each client runs @@ -382,6 +383,8 @@ usage(void) -d, --debug print debugging output\n -h, --host=HOSTNAME database server host or socket directory\n -p, --port=PORT database server port number\n + -P, --progress=NUM show progress report every NUM seconds (default 5)\n + -q, --quiet quiet progress report\n -U, --username=USERNAME connect as specified database user\n -V, --versionoutput version information, then exit\n -?, --help show this help, then exit\n @@ -928,14 +931,18 @@ top: * Use inverse transform sampling to randomly generate a delay, such * that the series of delays will approximate a Poisson distribution * centered on the throttle_delay time. - * - * 1000 implies a 6.9 (-log(1/1000)) to 0.0 (log 1.0) delay multiplier. + * + * 1 implies a 9.2 (-log(1/1)) to 0.0 (log 1) delay multiplier, + * and results in a 0.055 % target underestimation bias: + * + * SELECT 1.0/AVG(-LN(i/1.0)) FROM generate_series(1,1) AS i; + * = 1.00055271703266335474 * * If transactions are too slow or a given wait is shorter than * a transaction, the next transaction will start right away. */ - int64 wait = (int64) - throttle_delay * -log(getrand(thread, 1, 1000)/1000.0); + int64 wait = (int64) (throttle_delay * + 1.00055271703 * -log(getrand(thread, 1, 1)/1.0)); thread-throttle_trigger += wait; @@ -1003,6 +1010,21 @@ top: } /* + * always record latency under progress or throttling + */ + if ((progress || throttle_delay)