Re: [HACKERS] pgbench progress report improvements

Fabien COELHO Sun, 22 Sep 2013 01:10:29 -0700


Dear Noah,


Thanks for your answers and remarks,

[...]

I'll split out the parts of the patch where there is no coupling, but I do not want to submit conflicting patches.

Those benefits aren't compelling enough to counterbalance the risk of
gettimeofday() overhead affecting results.  (Other opinions welcome.)

Yep. If I really leave gettimeofday out, I cannot measure the stddev, so I'll end up with:


 --rate    => gettimeofday, mean (& stddev) measured, because they
    cannot be derived otherwise.

 no --rate => no (or less) gettimeofday, mean computed from total time and
    no stddev report because it cannot be computed.

That is annoying, because I do want the standard deviation, and this means extra "if" complexity here and there.

ISTM that when running under a target time, the (hopefully very small) overhead is only one gettimeofday call, because the other one is taken anyway to check whether the thread should stop.

Or I can add yet another option, say --stddev, to ask for the standard deviation, which will imply the additional gettimeofday call...

For a tool like pgbench that requires considerable skill to use effectively, changing a default only helps slightly. It doesn't take much of a risk to make us better off leaving the default unchanged.

I can put a 0 default... but even very experienced people will be bitten over and over. Why should I care? ISTM that the default should be the safe option, and experienced users can request "-quiet" if they want it.

[...]
I tried to preserve the row-counting behavior because I thought that
someone would object otherwise, but I would be really in favor of
dropping the row-counting report behavior altogether and only keep the
time triggered report.


I would be fine with dropping the row-counting behavior.  But why subject this
distant side issue to its third round of bikeshedding in 18 months?


I was not involved:-)

The 100000 behavior is the initial & old version, and only applies to initialization. Someone found it too verbose when scaling, and I agree, so they made a quick patch which preserved the old behavior (someone must have said: whatever, do not change the default!) but allowed switching to a less noisy version, hence the -quiet which is not really quiet. This would be the defective result of a compromise:-)

If I follow your request not to change any default, I cannot cleanly merge the -i & bench behaviors, as currently -i does have a default progress report and its own -quiet, and benchmarking does not.

The current behavior is inconsistent. I would prefer something consistent: preferably always show a progress report, with -quiet hiding it (fully); or, if you really insist, no progress report by default, with --progress showing it and the -quiet option removed.

 - Take thread start time at the beginning of the thread.


That theory is sound, but I would like at least one report reproducing that
behavior and finding that this change improved it.


[...] so I'm inclined to leave it alone.

I really spent *hours* debugging stupid measures on the previous round of pgbench changes, when adding the throttling stuff. Having the start time taken when the thread really starts is just sanity, and I needed that just to rule out that it was the source of the "strange" measures.

-j 800 vs -j 100: ISTM that if you create more threads, the time delay incurred is cumulative, so the strangeness of the result should worsen. 800 threads ~ possibly 10 seconds of creation time, to be compared to a few seconds of run time.

 Shouldn't it just be:

                int64 wait = (int64) (throttle_delay *
                        -log(1 - pg_erand48(thread->random_state)));

[...]

Ah; that makes sense.  Then I understand why you want to remove the bias, but
why do you also increase the upper bound?

Because the bias was significantly larger for 1000 (about 0.5%), so this also contributed to reducing said bias, and 9.2 times the average target seems as reasonable a bound as 6.9.

It is also printed without --rate. There is an "if" above because there is
one report with "lag" (under --rate), and one without.


The code I quoted is for the final report in printResults(), and that only
shows latency mean/stddev when using --rate.  The progress reporting in
threadRun() does have two variations as you describe.

Indeed, I took it for the progress report. I'll check. It must be consistent whether under --rate or not.


--
Fabien.

