Here is a patch submission for reference to the next commitfest.
Improve pgbench measurements & progress report
- Use progress option both under init & bench.
Activate progress report by default, every 5 seconds.
When initializing, --quiet reverts to the old every 100,000 insertions
behavior...
- Measure transaction latency instead of computing it.
The previous computed figure does not make sense under throttling,
as sleep throttling time was included in the figures.
The latency and its standard deviation are printed out under progress
and in the final report.
- Take thread start time at the beginning of the thread.
Otherwise it includes pretty slow thread/fork system start times in
the measurements. May help with bug #8326.
- Reduce and compensate throttling underestimation bias.
This is a consequence of relying on an integer random generator.
It was about 0.5% with 1000 distinct values. Multiplier added to
compensate the 0.05% bias with 10000 distinct integer values.
- Updated documentation & help message.
Sample output under initialization with --progress=1
creating tables...
223000 of 1000000 tuples (22%) done (elapsed 1.01 s, remaining 3.52 s).
430200 of 1000000 tuples (43%) done (elapsed 2.03 s, remaining 2.68 s).
507300 of 1000000 tuples (50%) done (elapsed 3.44 s, remaining 3.34 s).
571400 of 1000000 tuples (57%) done (elapsed 4.05 s, remaining 3.03 s).
786800 of 1000000 tuples (78%) done (elapsed 5.03 s, remaining 1.36 s).
893200 of 1000000 tuples (89%) done (elapsed 6.02 s, remaining 0.72 s).
1000000 of 1000000 tuples (100%) done (elapsed 6.93 s, remaining 0.00 s).
...
Sample output under benchmarking with --progress=1
progress: 1.0 s, 116.7 tps, 8.528 +- 1.217 ms lat
progress: 2.0 s, 117.7 tps, 8.493 +- 1.165 ms lat
progress: 3.0 s, 115.5 tps, 8.654 +- 1.650 ms lat
progress: 4.0 s, 116.8 tps, 8.559 +- 1.365 ms lat
progress: 5.0 s, 111.7 tps, 8.947 +- 3.475 ms lat
progress: 6.0 s, 116.7 tps, 8.563 +- 1.387 ms lat
progress: 7.0 s, 116.6 tps, 8.572 +- 1.474 ms lat
...
# oops, things happen:
progress: 36.0 s, 10.9 tps, 91.864 +- 124.874 ms lat
progress: 37.0 s, 115.2 tps, 8.678 +- 1.792 ms lat
...
--
Fabien.
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index ad8e272..a90b188 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -103,7 +103,6 @@ extern int optind;
#define MAXCLIENTS 1024
#endif
-#define LOG_STEP_SECONDS 5 /* seconds between log messages */
#define DEFAULT_NXACTS 10 /* default nxacts */
int nxacts = 0; /* number of transactions per client */
@@ -167,11 +166,11 @@ char *index_tablespace = NULL;
#define SCALE_32BIT_THRESHOLD 20000
bool use_log; /* log transaction latencies to a file */
-bool use_quiet; /* quiet logging onto stderr */
int agg_interval; /* log aggregates instead of individual
* transactions */
-int progress = 0; /* thread progress report every this seconds */
+int progress = 5; /* report every this seconds, 0 is quiet */
int progress_nclients = 0; /* number of clients for progress report */
+int progress_nthreads = 0; /* number of threads for progress report */
bool is_connect; /* establish connection for each transaction */
bool is_latencies; /* report per-command latencies */
int main_pid; /* main process id used in log filename */
@@ -214,6 +213,8 @@ typedef struct
int nvariables;
instr_time txn_begin; /* used for measuring transaction latencies */
instr_time stmt_begin; /* used for measuring statement latencies */
+ int64 txn_latencies; /* cumulated latencies */
+ int64 txn_sqlats; /* cumulated square latencies */
bool is_throttled; /* whether transaction throttling is done */
int use_file; /* index in sql_files for this client */
bool prepared[MAX_FILES];
@@ -243,8 +244,10 @@ typedef struct
{
instr_time conn_time;
int xacts;
- int64 throttle_lag;
- int64 throttle_lag_max;
+ int64 latencies;
+ int64 sqlats;
+ int64 throttle_lag;
+ int64 throttle_lag_max;
} TResult;
/*
@@ -347,7 +350,6 @@ usage(void)
" -i, --initialize invokes initialization mode\n"
" -F, --fillfactor=NUM set fill factor\n"
" -n, --no-vacuum do not run VACUUM after initialization\n"
- " -q, --quiet quiet logging (one message each 5 seconds)\n"
" -s, --scale=NUM scaling factor\n"
" --foreign-keys create foreign key constraints between tables\n"
" --index-tablespace=TABLESPACE\n"
@@ -367,9 +369,8 @@ usage(void)
"(default: simple)\n"
" -n, --no-vacuum do not run VACUUM before tests\n"
" -N, --skip-some-updates skip updates of pgbench_tellers and pgbench_branches\n"
- " -P, --progress=NUM show thread progress report every NUM seconds\n"
" -r, --report-latencies report average latency per command\n"
- " -R, --rate=SPEC target rate in transactions per second\n"
+ " -R, --rate=NUM target rate in transactions per second\n"
" -s, --scale=NUM report this scale factor in output\n"
" -S, --select-only perform SELECT-only transactions\n"
" -t, --transactions number of transactions each client runs "
@@ -382,6 +383,8 @@ usage(void)
" -d, --debug print debugging output\n"
" -h, --host=HOSTNAME database server host or socket directory\n"
" -p, --port=PORT database server port number\n"
+ " -P, --progress=NUM show progress report every NUM seconds (default 5)\n"
+ " -q, --quiet quiet progress report\n"
" -U, --username=USERNAME connect as specified database user\n"
" -V, --version output version information, then exit\n"
" -?, --help show this help, then exit\n"
@@ -928,14 +931,18 @@ top:
* Use inverse transform sampling to randomly generate a delay, such
* that the series of delays will approximate a Poisson distribution
* centered on the throttle_delay time.
- *
- * 1000 implies a 6.9 (-log(1/1000)) to 0.0 (log 1.0) delay multiplier.
+ *
+ * 10000 implies a 9.2 (-log(1/10000)) to 0.0 (log 1) delay multiplier,
+ * and results in a 0.055 % target underestimation bias:
+ *
+ * SELECT 1.0/AVG(-LN(i/10000.0)) FROM generate_series(1,10000) AS i;
+ * = 1.000552717032611116335474
*
* If transactions are too slow or a given wait is shorter than
* a transaction, the next transaction will start right away.
*/
- int64 wait = (int64)
- throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
+ int64 wait = (int64) (throttle_delay *
+ 1.00055271703 * -log(getrand(thread, 1, 10000)/10000.0));
thread->throttle_trigger += wait;
@@ -1003,6 +1010,21 @@ top:
}
/*
+ * always record latency under progress or throttling
+ */
+ if ((progress || throttle_delay) && commands[st->state + 1] == NULL)
+ {
+ instr_time now, diff;
+ int64 latency;
+ INSTR_TIME_SET_CURRENT(now);
+ diff = now;
+ INSTR_TIME_SUBTRACT(diff, st->txn_begin);
+ latency = INSTR_TIME_GET_MICROSEC(diff);
+ st->txn_latencies += latency;
+ st->txn_sqlats += latency*latency;
+ }
+
+ /*
* if transaction finished, record the time it took in the log
*/
if (logfile && commands[st->state + 1] == NULL)
@@ -1182,17 +1204,20 @@ top:
}
/*
- * This ensures that a throttling delay is inserted before proceeding
- * with sql commands, after the first transaction. The first transaction
- * throttling is performed when first entering doCustom.
+ * This ensures that a throttling delay is inserted before proceeding
+ * with sql commands, after the first transaction. The first transaction
+ * throttling is performed when first entering doCustom.
*/
if (trans_needs_throttle) {
trans_needs_throttle = false;
goto top;
}
- /* Record transaction start time if logging is enabled */
- if (logfile && st->state == 0)
+ /*
+ * Record transaction start time
+ * if logging, progress or throttling is enabled
+ */
+ if ((logfile || progress || throttle_delay) && st->state == 0)
INSTR_TIME_SET_CURRENT(st->txn_begin);
/* Record statement start time if per-command latencies are requested */
@@ -1682,7 +1707,7 @@ init(bool is_no_vacuum)
* If we want to stick with the original logging, print a message each
* 100k inserted rows.
*/
- if ((!use_quiet) && (j % 100000 == 0))
+ if (!progress && (j % 100000 == 0))
{
INSTR_TIME_SET_CURRENT(diff);
INSTR_TIME_SUBTRACT(diff, start);
@@ -1690,13 +1715,15 @@ init(bool is_no_vacuum)
elapsed_sec = INSTR_TIME_GET_DOUBLE(diff);
remaining_sec = (scale * naccounts - j) * elapsed_sec / j;
- fprintf(stderr, INT64_FORMAT " of " INT64_FORMAT " tuples (%d%%) done (elapsed %.2f s, remaining %.2f s).\n",
+ fprintf(stderr,
+ INT64_FORMAT " of " INT64_FORMAT
+ " tuples (%d%%) done (elapsed %.2f s, remaining %.2f s).\n",
j, (int64) naccounts * scale,
(int) (((int64) j * 100) / (naccounts * scale)),
elapsed_sec, remaining_sec);
}
/* let's not call the timing for each row, but only each 100 rows */
- else if (use_quiet && (j % 100 == 0))
+ else if (progress && (j % 100 == 0))
{
INSTR_TIME_SET_CURRENT(diff);
INSTR_TIME_SUBTRACT(diff, start);
@@ -1705,14 +1732,17 @@ init(bool is_no_vacuum)
remaining_sec = (scale * naccounts - j) * elapsed_sec / j;
/* have we reached the next interval (or end)? */
- if ((j == scale * naccounts) || (elapsed_sec >= log_interval * LOG_STEP_SECONDS))
+ if ((j == scale * naccounts) ||
+ (elapsed_sec >= log_interval * progress))
{
- fprintf(stderr, INT64_FORMAT " of " INT64_FORMAT " tuples (%d%%) done (elapsed %.2f s, remaining %.2f s).\n",
+ fprintf(stderr, INT64_FORMAT " of " INT64_FORMAT
+ " tuples (%d%%) done (elapsed %.2f s, remaining %.2f s).\n",
j, (int64) naccounts * scale,
- (int) (((int64) j * 100) / (naccounts * scale)), elapsed_sec, remaining_sec);
+ (int) (((int64) j * 100) / (naccounts * scale)),
+ elapsed_sec, remaining_sec);
/* skip to the next interval */
- log_interval = (int) ceil(elapsed_sec / LOG_STEP_SECONDS);
+ log_interval = (int) ceil(elapsed_sec / progress);
}
}
@@ -2096,6 +2126,7 @@ static void
printResults(int ttype, int normal_xacts, int nclients,
TState *threads, int nthreads,
instr_time total_time, instr_time conn_total_time,
+ int64 total_latencies, int64 total_sqlats,
int64 throttle_lag, int64 throttle_lag_max)
{
double time_include,
@@ -2143,8 +2174,15 @@ printResults(int ttype, int normal_xacts, int nclients,
* transaction. The measured lag may be caused by thread/client load,
* the database load, or the Poisson throttling process.
*/
- printf("average rate limit schedule lag: %.3f ms (max %.3f ms)\n",
- 0.001 * throttle_lag / normal_xacts, 0.001 * throttle_lag_max);
+ double latency = 0.001 * total_latencies / normal_xacts;
+ double sqlat = (double) total_sqlats / normal_xacts;
+
+ printf("average latency under rate limit: %.3f +- %.3f ms\n"
+ "average rate limit lag: %.3f ms (max %.3f ms)\n",
+ latency,
+ 0.001 * sqrt(sqlat - 1000000*latency*latency),
+ 0.001 * throttle_lag / normal_xacts,
+ 0.001 * throttle_lag_max);
}
printf("tps = %f (including connections establishing)\n", tps_include);
@@ -2254,7 +2292,9 @@ main(int argc, char **argv)
instr_time start_time; /* start up time */
instr_time total_time;
instr_time conn_total_time;
- int total_xacts;
+ int total_xacts = 0;
+ int64 total_latencies = 0;
+ int64 total_sqlats = 0;
int64 throttle_lag = 0;
int64 throttle_lag_max = 0;
@@ -2410,7 +2450,7 @@ main(int argc, char **argv)
use_log = true;
break;
case 'q':
- use_quiet = true;
+ progress = 0;
break;
case 'f':
ttype = 3;
@@ -2554,13 +2594,6 @@ main(int argc, char **argv)
exit(1);
}
- /* -q may be used only with -i */
- if (use_quiet && !is_init_mode)
- {
- fprintf(stderr, "quiet-logging is allowed only in initialization mode (-i)\n");
- exit(1);
- }
-
/* --sampling-rate may must not be used with --aggregate-interval */
if (sample_rate > 0.0 && agg_interval > 0)
{
@@ -2608,6 +2641,7 @@ main(int argc, char **argv)
*/
main_pid = (int) getpid();
progress_nclients = nclients;
+ progress_nthreads = nthreads;
if (nclients > 1)
{
@@ -2796,12 +2830,10 @@ main(int argc, char **argv)
{
TState *thread = &threads[i];
- INSTR_TIME_SET_CURRENT(thread->start_time);
-
/* the first thread (i = 0) is executed by main thread */
if (i > 0)
{
- int err = pthread_create(&thread->thread, NULL, threadRun, thread);
+ int err = pthread_create(&thread->thread, NULL, threadRun, thread);
if (err != 0 || thread->thread == INVALID_THREAD)
{
@@ -2816,22 +2848,25 @@ main(int argc, char **argv)
}
/* wait for threads and accumulate results */
- total_xacts = 0;
INSTR_TIME_SET_ZERO(conn_total_time);
for (i = 0; i < nthreads; i++)
{
- void *ret = NULL;
+ void *ret = NULL;
if (threads[i].thread == INVALID_THREAD)
+ /* run thread 0 */
ret = threadRun(&threads[i]);
else
+ /* wait for other threads */
pthread_join(threads[i].thread, &ret);
if (ret != NULL)
{
- TResult *r = (TResult *) ret;
+ TResult *r = (TResult *) ret;
total_xacts += r->xacts;
+ total_latencies += r->latencies;
+ total_sqlats += r->sqlats;
throttle_lag += r->throttle_lag;
if (r->throttle_lag_max > throttle_lag_max)
throttle_lag_max = r->throttle_lag_max;
@@ -2845,7 +2880,8 @@ main(int argc, char **argv)
INSTR_TIME_SET_CURRENT(total_time);
INSTR_TIME_SUBTRACT(total_time, start_time);
printResults(ttype, total_xacts, nclients, threads, nthreads,
- total_time, conn_total_time, throttle_lag, throttle_lag_max);
+ total_time, conn_total_time,
+ total_latencies, total_sqlats, throttle_lag, throttle_lag_max);
return 0;
}
@@ -2863,21 +2899,24 @@ threadRun(void *arg)
int remains = nstate; /* number of remaining clients */
int i;
/* for reporting progress: */
- int64 thread_start = INSTR_TIME_GET_MICROSEC(thread->start_time);
- int64 last_report = thread_start;
- int64 next_report = last_report + progress * 1000000;
- int64 last_count = 0;
+ int64 thread_start, last_report, next_report;
+ int64 last_count = 0, last_lats = 0, last_sqlats = 0, last_lags = 0;
AggVals aggs;
/*
- * Initialize throttling rate target for all of the thread's clients. It
- * might be a little more accurate to reset thread->start_time here too.
- * The possible drift seems too small relative to typical throttle delay
- * times to worry about it.
+ * This is the real start time of the thread. Take it here to avoid
+ * including thread creation time in measures.
*/
- INSTR_TIME_SET_CURRENT(start);
- thread->throttle_trigger = INSTR_TIME_GET_MICROSEC(start);
+ INSTR_TIME_SET_CURRENT(thread->start_time);
+
+ /*
+ * Initialize throttling rate target for all of the thread's clients.
+ */
+ thread_start = INSTR_TIME_GET_MICROSEC(thread->start_time);
+ last_report = thread_start;
+ next_report = last_report + progress * 1000000;
+ thread->throttle_trigger = thread_start;
thread->throttle_lag = 0;
thread->throttle_lag_max = 0;
@@ -3066,21 +3105,40 @@ threadRun(void *arg)
if (now >= next_report)
{
/* generate and show report */
- int64 count = 0;
+ int64 count = 0, lats = 0, sqlats = 0;
+ int64 lags = thread->throttle_lag;
int64 run = now - last_report;
- float tps, total_run, latency;
+ double tps, total_run, latency, sqlat, stdev, lag;
for (i = 0 ; i < nstate ; i++)
- count += state[i].cnt;
+ {
+ count += state[i].cnt;
+ lats += state[i].txn_latencies;
+ sqlats += state[i].txn_sqlats;
+ }
total_run = (now - thread_start) / 1000000.0;
tps = 1000000.0 * (count - last_count) / run;
- latency = 1000.0 * nstate / tps;
+ latency = 0.001 * (lats - last_lats) / (count - last_count);
+ sqlat = 1.0 * (sqlats - last_sqlats) / (count - last_count);
+ stdev = 0.001 * sqrt(sqlat - 1000000.0 * latency * latency);
+ lag = 0.001 * (lags - last_lags) / (count - last_count);
- fprintf(stderr, "progress %d: %.1f s, %.1f tps, %.3f ms lat\n",
- thread->tid, total_run, tps, latency);
+ if (throttle_delay)
+ fprintf(stderr,
+ "progress %d: %.1f s, %.1f tps, "
+ "%.3f +- %.3f ms lat, %.3f ms lag\n",
+ thread->tid, total_run, tps, latency, stdev, lag);
+ else
+ fprintf(stderr,
+ "progress %d: %.1f s, %.1f tps, "
+ "%.3f +- %.3f ms lat\n",
+ thread->tid, total_run, tps, latency, stdev);
last_count = count;
+ last_lats = lats;
+ last_sqlats = sqlats;
+ last_lags = lags;
last_report = now;
next_report += progress * 1000000;
}
@@ -3096,21 +3154,41 @@ threadRun(void *arg)
if (now >= next_report)
{
/* generate and show report */
- int64 count = 0;
+ int64 count = 0, lats = 0, sqlats = 0, lags = 0;
int64 run = now - last_report;
- float tps, total_run, latency;
+ float tps, total_run, latency, sqlat, lag, stdev;
for (i = 0 ; i < progress_nclients ; i++)
- count += state[i].cnt;
+ {
+ count += state[i].cnt;
+ lats += state[i].txn_latencies;
+ sqlats += state[i].txn_sqlats;
+ }
+
+ for (i = 0 ; i < progress_nthreads ; i++)
+ lags += thread[i].throttle_lag;
total_run = (now - thread_start) / 1000000.0;
tps = 1000000.0 * (count - last_count) / run;
- latency = 1000.0 * progress_nclients / tps;
+ latency = 0.001 * (lats - last_lats) / (count - last_count);
+ sqlat = 1.0 * (sqlats - last_sqlats) / (count - last_count);
+ stdev = 0.001 * sqrt(sqlat - 1000000.0 * latency * latency);
+ lag = 0.001 * (lags - last_lags) / (count - last_count);
- fprintf(stderr, "progress: %.1f s, %.1f tps, %.3f ms lat\n",
- total_run, tps, latency);
+ if (throttle_delay)
+ fprintf(stderr,
+ "progress: %.1f s, %.1f tps, "
+ "%.3f +- %.3f ms lat, %.3f ms lag\n",
+ total_run, tps, latency, stdev, lag);
+ else
+ fprintf(stderr,
+ "progress: %.1f s, %.1f tps, %.3f +- %.3f ms lat\n",
+ total_run, tps, latency, stdev);
last_count = count;
+ last_lats = lats;
+ last_sqlats = sqlats;
+ last_lags = lags;
last_report = now;
next_report += progress * 1000000;
}
@@ -3122,8 +3200,14 @@ done:
INSTR_TIME_SET_CURRENT(start);
disconnect_all(state, nstate);
result->xacts = 0;
+ result->latencies = 0;
+ result->sqlats = 0;
for (i = 0; i < nstate; i++)
+ {
result->xacts += state[i].cnt;
+ result->latencies += state[i].txn_latencies;
+ result->sqlats += state[i].txn_sqlats;
+ }
result->throttle_lag = thread->throttle_lag;
result->throttle_lag_max = thread->throttle_lag_max;
INSTR_TIME_SET_CURRENT(end);
diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml
index 49a79b1..d074bde 100644
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@@ -182,18 +182,6 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
</varlistentry>
<varlistentry>
- <term><option>-q</option></term>
- <term><option>--quiet</option></term>
- <listitem>
- <para>
- Switch logging to quiet mode, producing only one progress message per 5
- seconds. The default logging prints one message each 100000 rows, which
- often outputs many lines per second (especially on good hardware).
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
<term><option>-s</option> <replaceable>scale_factor</></term>
<term><option>--scale=</option><replaceable>scale_factor</></term>
<listitem>
@@ -408,16 +396,6 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
</varlistentry>
<varlistentry>
- <term><option>-P</option> <replaceable>sec</></term>
- <term><option>--progress=</option><replaceable>sec</></term>
- <listitem>
- <para>
- Show progress report every <literal>sec</> seconds.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
<term><option>-R</option> <replaceable>rate</></term>
<term><option>--rate=</option><replaceable>rate</></term>
<listitem>
@@ -429,13 +407,13 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
</para>
<para>
The rate is targeted by starting transactions along a
- Poisson-distributed schedule time line. The expected finish time
+ Poisson-distributed schedule time line. The expected start time
schedule moves forward based on when the client first started, not
when the previous transaction ended. That approach means that when
transactions go past their original scheduled end time, it is
possible for later ones to catch up again.
</para>
- <para>
+ <para>
When throttling is active, the average and maximum transaction
schedule lag time are reported in ms. This is the delay between
the original scheduled transaction time and the actual transaction
@@ -591,6 +569,38 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
</varlistentry>
<varlistentry>
+ <term><option>-P</option> <replaceable>sec</></term>
+ <term><option>--progress=</option><replaceable>sec</></term>
+ <listitem>
+ <para>
+ Show progress report every <literal>sec</> seconds.
+
+ When initializing, the report displays the number of inserted rows
+ and its total percentage, the passed time and estimated remaining time
+ for completion.
+
+ When benchmarking, the report displays the time since the beginning
+ of the run, the tps since the last report, the transaction latency
+ average and standard deviation. Under throttling (<option>-R</option>),
+ also includes the lag time behind schedule.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-q</option></term>
+ <term><option>--quiet</option></term>
+ <listitem>
+ <para>
+ Switch off regular (5 second default) progress report.
+
+ When initializing, this revert to the old behavior of having
+ a report every 100,000 tuple insertions.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>-U</option> <replaceable>login</></term>
<term><option>--username=</option><replaceable>login</></term>
<listitem>
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers