Re: pgbench logging broken by time logic changes

Yugo NAGATA Wed, 16 Jun 2021 20:25:43 -0700

On Wed, 16 Jun 2021 21:11:41 +0200 (CEST)
Fabien COELHO <coe...@cri.ensmp.fr> wrote:


> 
> > pg_time_now(). This uses INSTR_TIME_SET_CURRENT in it, but this macro
> > can call clock_gettime(CLOCK_MONOTONIC[_RAW], ) instead of gettimeofday
> > or clock_gettime(CLOCK_REALTIME, ). When CLOCK_MONOTONIC[_RAW] is used,
> > clock_gettime doesn't return epoch time. Therefore, we can use
> > INSTR_TIME_SET_CURRENT aiming to calculate a duration, but we should
> > not have used this to get the current timestamp.
> >
> > I think we can fix this issue by using gettimeofday for logging as before
> > this was changed. I attached the patch.
> 
> I cannot say that I'm thrilled by having multiple tv stuff back in several 
> place. I can be okay with one, though. What about the attached? Does it 
> make sense?

At first, I also thought of fixing pg_time_now() to use gettimeofday() instead
of INSTR_TIME_SET_CURRENT, but I noticed that using INSTR_TIME_SET_CURRENT is
proper to measure time interval.  I mean, this macro uses
lock_gettime(CLOCK_MONOTONIC, ) if avilable, which give reliable interval
timing even in the face of changes to the system clock due to NTP.

The commit 547f04e7 changed all of INSTR_TIME_SET_CURRENT, gettimeofday(), and
time() to pg_now_time() which calls INSTR_TIME_SET_CURRENT in it. So, my patch
intented to revert these changes only about gettimeofday() and time(), and 
remain
changes about INSTR_TIME_SET_CURRENT.

I attached the updated patch because I forgot to revert pg_now_time() to 
time(NULL).

Another idea to fix is adding 'use_epoch' flag to pg_now_time() like below, 

 pg_time_now(bool use_epoch)
 {
     if (use_epoch)
      {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec * 1000000 + tv.tv_usec;
       }
     else
       {
         instr_time      now;
         INSTR_TIME_SET_CURRENT(now);
         return (pg_time_usec_t) INSTR_TIME_GET_MICROSEC(now);
       }
 }

or making an additional function that returns epoch time.


By the way, there is another advantage of using clock_gettime(). That is, this
returns precise results in nanoseconds. However, we can not benefit from it 
because
pg_time_now() converts the value to uint64 by using INSTR_TIME_GET_MICROSEC. So,
if we would like more precious statistics in pgbench, it might be better to use
INSTR_TIME_GET_MICROSEC instead of pg_time_now in other places.

Regards,
Yugo Nagata

-- 
Yugo NAGATA <nag...@sraoss.co.jp>

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d7479925cb..83e60a2ce6 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -334,7 +334,7 @@ typedef int64 pg_time_usec_t;
  */
 typedef struct StatsData
 {
-	pg_time_usec_t start_time;	/* interval start time, for aggregates */
+	time_t		start_time;		/* interval start time, for aggregates */
 	int64		cnt;			/* number of transactions, including skipped */
 	int64		skipped;		/* number of transactions skipped under --rate
 								 * and --latency-limit */
@@ -1279,9 +1279,9 @@ mergeSimpleStats(SimpleStats *acc, SimpleStats *ss)
  * the given value.
  */
 static void
-initStats(StatsData *sd, pg_time_usec_t start)
+initStats(StatsData *sd, time_t start_time)
 {
-	sd->start_time = start;
+	sd->start_time = start_time;
 	sd->cnt = 0;
 	sd->skipped = 0;
 	initSimpleStats(&sd->latency);
@@ -3778,7 +3778,6 @@ doLog(TState *thread, CState *st,
 	  StatsData *agg, bool skipped, double latency, double lag)
 {
 	FILE	   *logfile = thread->logfile;
-	pg_time_usec_t now = pg_time_now();
 
 	Assert(use_log);
 
@@ -3793,6 +3792,8 @@ doLog(TState *thread, CState *st,
 	/* should we aggregate the results or not? */
 	if (agg_interval > 0)
 	{
+		time_t		now = time(NULL);
+
 		/*
 		 * Loop until we reach the interval of the current moment, and print
 		 * any empty intervals in between (this may happen with very low tps,
@@ -3830,16 +3831,21 @@ doLog(TState *thread, CState *st,
 	}
 	else
 	{
+		struct timeval tv;
+
+		gettimeofday(&tv, NULL);
+
 		/* no, print raw transactions */
 		if (skipped)
 			fprintf(logfile, "%d " INT64_FORMAT " skipped %d " INT64_FORMAT " "
 					INT64_FORMAT,
-					st->id, st->cnt, st->use_file, now / 1000000, now % 1000000);
+					st->id, st->cnt, st->use_file,
+					(long) tv.tv_sec, (long) tv.tv_usec);
 		else
 			fprintf(logfile, "%d " INT64_FORMAT " %.0f %d " INT64_FORMAT " "
 					INT64_FORMAT,
 					st->id, st->cnt, latency, st->use_file,
-					now / 1000000, now % 1000000);
+					(long) tv.tv_sec, (long) tv.tv_usec);
 		if (throttle_delay)
 			fprintf(logfile, " %.0f", lag);
 		fputc('\n', logfile);
@@ -5455,7 +5461,16 @@ printProgressReport(TState *threads, int64 test_start, pg_time_usec_t now,
 
 	if (progress_timestamp)
 	{
-		snprintf(tbuf, sizeof(tbuf), "%.3f s", PG_TIME_GET_DOUBLE(now));
+		/*
+		 * On some platforms the current system timestamp is available in
+		 * now_time, but rather than get entangled with that, we just eat the
+		 * cost of an extra syscall in all cases.
+		 */
+		struct timeval tv;
+
+		gettimeofday(&tv, NULL);
+		snprintf(tbuf, sizeof(tbuf), "%ld.%03ld s",
+				 (long) tv.tv_sec, (long) (tv.tv_usec / 1000));
 	}
 	else
 	{
@@ -6601,7 +6616,7 @@ threadRun(void *arg)
 	thread->bench_start = start;
 	thread->throttle_trigger = start;
 
-	initStats(&aggs, start);
+	initStats(&aggs, time(NULL));
 	last = aggs;
 
 	/* loop till all clients have terminated */

Re: pgbench logging broken by time logic changes

Reply via email to