Re: [HACKERS] too much pgbench init output

Tomas Vondra Sun, 16 Sep 2012 15:26:39 -0700

On 5.9.2012 06:17, Robert Haas wrote:
> On Tue, Sep 4, 2012 at 11:31 PM, Peter Eisentraut <pete...@gmx.net> wrote:
>> On Tue, 2012-09-04 at 23:14 -0400, Robert Haas wrote:
>>> Actually, this whole thing seems like a solution in search of a
>>> problem to me.  We just reduced the verbosity of pgbench -i tenfold in
>>> the very recent past - I would have thought that enough to address
>>> this problem.  But maybe not.
>>
>> The problem is that
>>
>> a) It blasts out too much output and everything scrolls off the screen,
>> and
>>
>> b) There is no indication of where the end is.
>>
>> These are independent problems, and I'd be happy to address them
>> separately if there are such specific concerns attached to this.
>>
>> Speaking of tenfold, we could reduce the output frequency tenfold to
>> once every 1000000, which would alleviate this problem for a while
>> longer.
> 
> Well, I wouldn't object to displaying a percentage on each output
> line.  But I don't really like the idea of having them less frequent
> than they already are, because if you run into a situation that makes
> pgbench -i run slowly, as I occasionally do, it's marginal to tell the
> difference between "slow" and "completely hung" even with the current
> level of verbosity.
> 
> However, we could add a -q flag to run more quietly, or something like
> that.  Actually, I'd even be fine with making the default quieter,
> though we can't use -v for verbose since that's already taken.  But
> I'd like to preserve the option of getting the current amount of
> output because sometimes I need that to troubleshoot problems.
> Actually it'd be nice to even get a bit more output: say, a timestamp
> on each line, and a completion percentage... but now I'm getting
> greedy.


Hi,

I've been thinking about this a bit more, and I propose adding an
option that sets the "logging step", i.e. the number of items (given
either directly or as a percentage) between log lines.

The attached patch defines a new option "--logging-step" that accepts
either an integer or a percentage. For example, if you want to print a
line every 1000 tuples, you can do this

  $ pgbench -i -s 1000 --logging-step 1000 testdb

and if you want to print a line every 5%, you can do this

  $ pgbench -i -s 1000 --logging-step 5% testdb

and that's it.
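
To make the two accepted forms concrete, here is a minimal sketch of how
such an argument could be parsed and validated. It is not the code in the
attached patch: parse_logging_step() is a hypothetical helper, and the
choice to reject anything that resolves to less than one row is an
assumption. That check matters because a step of zero would make
"j % log_step" divide by zero.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a logging-step argument: either a row count ("1000") or a
 * percentage of total_rows ("5%").  Returns the step in rows, or -1 if
 * the argument is malformed or resolves to less than one row.
 * Hypothetical helper, not part of the attached patch.
 */
long
parse_logging_step(const char *arg, long total_rows)
{
	char	   *end;
	size_t		len;

	if (arg == NULL || (len = strlen(arg)) == 0 || total_rows <= 0)
		return -1;

	if (arg[len - 1] == '%')
	{
		double		pct;
		long		step;

		errno = 0;
		pct = strtod(arg, &end);
		/* the number must end exactly at the trailing '%' */
		if (errno != 0 || end != arg + len - 1)
			return -1;
		/* written this way so that NaN is rejected as well */
		if (!(pct > 0.0 && pct <= 100.0))
			return -1;
		step = (long) (pct * (double) total_rows / 100.0);
		return (step >= 1) ? step : -1;
	}
	else
	{
		long		step;

		errno = 0;
		step = strtol(arg, &end, 10);
		if (errno != 0 || end == arg || *end != '\0' || step < 1)
			return -1;
		return step;
	}
}
```

With a total of 100000 rows, "21%" resolves to a step of 21000 rows and
"1000" stays 1000, while "", "%", "0" and "abc" are all rejected.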

Moreover, the patch adds the elapsed time and an estimate of the remaining
time to each log line. So for example with a step of 21% you may get this:

creating tables...
21000 of 100000 tuples (21%) done (elapsed 1.56 s, remaining 5.85 s).
42000 of 100000 tuples (42%) done (elapsed 3.15 s, remaining 4.35 s).
63000 of 100000 tuples (63%) done (elapsed 4.73 s, remaining 2.78 s).
84000 of 100000 tuples (84%) done (elapsed 6.30 s, remaining 1.20 s).
100000 of 100000 tuples (100%) done (elapsed 8.17 s, remaining 0.00 s).
vacuum...
set primary keys...
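
The remaining time is a simple linear extrapolation: the patch computes
(total - done) * elapsed / done, i.e. it assumes the rows still to be
loaded take as long per row as the rows loaded so far. A standalone sketch
of that arithmetic follows; estimate_remaining_sec() is a name made up for
illustration, and returning zero when nothing has been loaded yet is an
assumption, not something the patch does.

```c
/*
 * Estimate the remaining load time by linear extrapolation from the
 * progress so far.  Hypothetical helper mirroring the formula used in
 * the patch; returns 0 when there is nothing to extrapolate from or
 * nothing left to load.
 */
double
estimate_remaining_sec(long done, long total, double elapsed_sec)
{
	if (done <= 0 || done >= total)
		return 0.0;
	return (double) (total - done) * elapsed_sec / (double) done;
}
```

For the second line of the output above, 58000 rows remain after 42000
rows took 3.15 s, so the estimate is 58000 * 3.15 / 42000 = 4.35 s.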

Now, I've had a hard time with the patch - no matter what I do, I get an
"invalid option" error whenever I try to run it from the command line,
for some reason. But when I run it from gdb, it works just fine.

kind regards
Tomas

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index f5ac3b1..ce7e240 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -130,6 +130,11 @@ int                        foreign_keys = 0;
 int                    unlogged_tables = 0;
 
 /*
+ * logging step (inserts)
+ */
+int                    log_step = 100000;
+
+/*
  * tablespace selection
  */
 char      *tablespace = NULL;
@@ -356,6 +361,8 @@ usage(void)
                   "               create tables in the specified tablespace\n"
                   "  --unlogged-tables\n"
                   "               create tables as unlogged tables\n"
+                  "  --logging-step NUM\n"
+                  "               how often to print info about init progress\n"
                   "\nBenchmarking options:\n"
                "  -c NUM       number of concurrent database clients (default: 1)\n"
                   "  -C           establish new connection for each transaction\n"
@@ -1340,6 +1347,10 @@ init(bool is_no_vacuum)
        char            sql[256];
        int                     i;
 
+       /* used to track elapsed time and estimate of the remaining time */
+       instr_time      start, diff;
+       double elapsed_sec, remaining_sec;
+
        if ((con = doConnect()) == NULL)
                exit(1);
 
@@ -1408,6 +1419,8 @@ init(bool is_no_vacuum)
        }
        PQclear(res);
 
+       INSTR_TIME_SET_CURRENT(start);
+
        for (i = 0; i < naccounts * scale; i++)
        {
                int                     j = i + 1;
@@ -1419,10 +1432,18 @@ init(bool is_no_vacuum)
                        exit(1);
                }
 
-               if (j % 100000 == 0)
-                       fprintf(stderr, "%d of %d tuples (%d%%) done.\n",
+               if (j % log_step == 0 || j == scale * naccounts)
+               {
+                       INSTR_TIME_SET_CURRENT(diff);
+                       INSTR_TIME_SUBTRACT(diff, start);
+                       
+                       elapsed_sec = INSTR_TIME_GET_DOUBLE(diff);
+                       remaining_sec = (scale * naccounts - j) * elapsed_sec / j;
+                       
+                       fprintf(stderr, "%d of %d tuples (%d%%) done (elapsed %.2f s, remaining %.2f s).\n",
                                        j, naccounts * scale,
-                                       j * 100 / (naccounts * scale));
+                                       j * 100 / (naccounts * scale), elapsed_sec, remaining_sec);
+               }
        }
        if (PQputline(con, "\\.\n"))
        {
@@ -1901,6 +1922,7 @@ main(int argc, char **argv)
        int                     do_vacuum_accounts = 0; /* do vacuum accounts before testing? */
        int                     ttype = 0;              /* transaction type. 0: TPC-B, 1: SELECT only,
                                                                 * 2: skip update of branches and tellers */
+       float           log_step_pct = 0;       /* logging step in percent */
        int                     optindex;
        char       *filename = NULL;
        bool            scale_given = false;
@@ -1920,6 +1942,7 @@ main(int argc, char **argv)
                {"index-tablespace", required_argument, NULL, 3},
                {"tablespace", required_argument, NULL, 2},
                {"unlogged-tables", no_argument, &unlogged_tables, 1},
+               {"logging-step", required_argument, NULL, 6},
                {NULL, 0, NULL, 0}
        };
 
@@ -2125,6 +2148,14 @@ main(int argc, char **argv)
                        case 3:                         /* index-tablespace */
                                index_tablespace = optarg;
                                break;
+                       case 6:
+                               if (optarg[strlen(optarg)-1] == '%') {
+                                       optarg[strlen(optarg)-1] = '\0';
+                                       log_step_pct = atof(optarg);
+                               } else {
+                                       log_step = atol(optarg);
+                               }
+                               break;
                        default:
                                fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
                                exit(1);
@@ -2144,6 +2175,11 @@ main(int argc, char **argv)
                        dbName = "";
        }
 
+       /* compute the log_step from total number of accounts and log_step_pct */
+       if (log_step_pct != 0) {
+               log_step = log_step_pct * naccounts * scale / 100;
+       }
+
        if (is_init_mode)
        {
                init(is_no_vacuum);
