Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing

2013-10-06 Thread Noah Misch
On Sun, Oct 06, 2013 at 09:48:04AM +0200, Fabien COELHO wrote:
> Also, I'm not sure of any system used with pgbench that do not have
> threads, so ISTM that the thread fork-emulation hack is more or less
> useless, and it is pretty masochistic to maintain when adding
> features.

Fair point.  When added, the code pertaining to fork-emulation was
well-isolated, and that remained the case as recently as 9.3.  Your recent
--progress patch was the first to suffer contortions for the benefit of that
mode.  (The per-statement latencies feature just declined to support it.)

> >For the time being, I propose the attached comment patch.
> 
> It comment seems very clear to me. I do not understand why it starts
> with XXX, though.

PostgreSQL code uses that notation regularly.  When I add it, I typically have
in mind "the following is not fully satisfying, but it's not bad enough to
make a point of improving".

I've committed the comment patch.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing

2013-10-06 Thread Fabien COELHO



Note that this is pretty standard stuff with threads, ISTM that it
would solve most of the issues, *but* this is not possible with the
"thread fork emulation" implemented by pgbench, which really means
no threads at all.


You could do those same things in the fork emulation mode using anonymous
shared memory, like we do in the server.  That would permit removing the
current "#ifdef PTHREAD_FORK_EMULATION" wart, too.


Yep, but I'm not sure that would reduce pgbench code complexity. Also, I'm 
not sure of any system used with pgbench that do not have threads, so ISTM 
that the thread fork-emulation hack is more or less useless, and it is 
pretty masochistic to maintain when adding features.



For the time being, I propose the attached comment patch.


It comment seems very clear to me. I do not understand why it starts with 
XXX, though.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing

2013-10-05 Thread Noah Misch
On Tue, Oct 01, 2013 at 02:50:29PM +0200, Fabien COELHO wrote:
> I do not think that there is a clean and simple way to take the
> start/stop period into account when computing global performances of
> a run. The TPC-C benchmark tells to ignore the warmup/closure
> period, whatever they are, and only perform measures within the
> steady state. However the full graph must be provided when the bench
> is provided.

That makes sense to me.  "pgbench --progress" and "pgbench --log
--aggregate-interval" are good tools for excluding non-steady-state periods.

> About better measures: If I could rely on having threads, I would
> simply synchronise the threads at the beginning so that they
> actually start after they are all created, and one thread would
> decide when to stop and set a shared volatile variable to stop all
> transactions more or less at once. In this case, the thread start
> time would be taken just after the synchronization, and maybe only
> by thread 0 would be enough.
> 
> Note that this is pretty standard stuff with threads, ISTM that it
> would solve most of the issues, *but* this is not possible with the
> "thread fork emulation" implemented by pgbench, which really means
> no threads at all.

You could do those same things in the fork emulation mode using anonymous
shared memory, like we do in the server.  That would permit removing the
current "#ifdef PTHREAD_FORK_EMULATION" wart, too.

For the time being, I propose the attached comment patch.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 926246e..816400f 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -2896,7 +2896,16 @@ main(int argc, char **argv)
}
disconnect_all(state, nclients);
 
-   /* get end time */
+   /*
+* XXX We compute results as though every client of every thread started
+* and finished at the same time.  That model can diverge noticeably 
from
+* reality for a short benchmark run involving relatively many threads.
+* The first thread may process notably many transactions before the 
last
+* thread begins.  Improving the model alone would bring limited 
benefit,
+* because performance during those periods of partial thread count can
+* easily exceed steady state performance.  This is one of the many ways
+* short runs convey deceptive performance figures.
+*/
INSTR_TIME_SET_CURRENT(total_time);
INSTR_TIME_SUBTRACT(total_time, start_time);
printResults(ttype, total_xacts, nclients, threads, nthreads,

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing

2013-10-01 Thread Fabien COELHO


Hello Noah,


Thread create time seems to be expensive as well, maybe up 0.1
seconds under some conditions (?). Under --rate, this create delay
means that throttling is laging behind schedule by about that time,
so all the first transactions are trying to catch up.


threadRun() already initializes throttle_trigger with a fresh timestamp.
Please detail how the problem remains despite that.


Indeed, I did this kludge because I could not rely on the "before fork" 
start time as it was (possibly) creating a "rush" at the beginning of the 
run under --rate.


The version I submitted takes the start time after the thread is created, 
and use it directly for throttling, so the start time is taken once per 
thread and used instead of retaking it because the first one cannot be 
relied on.



[...]


Fine detailed analysis!


Opinions, other ideas?


I do not think that there is a clean and simple way to take the start/stop 
period into account when computing global performances of a run. The TPC-C 
benchmark tells to ignore the warmup/closure period, whatever they are, 
and only perform measures within the steady state. However the full graph 
must be provided when the bench is provided.


About better measures: If I could rely on having threads, I would simply 
synchronise the threads at the beginning so that they actually start after 
they are all created, and one thread would decide when to stop and set a 
shared volatile variable to stop all transactions more or less at once. In 
this case, the thread start time would be taken just after the 
synchronization, and maybe only by thread 0 would be enough.


Note that this is pretty standard stuff with threads, ISTM that it would 
solve most of the issues, *but* this is not possible with the "thread fork 
emulation" implemented by pgbench, which really means no threads at all.


A possible compromise would be to do just that when actual threads are 
used, and let it more or less as it is when fork emulation is on...


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing (fwd)

2013-10-01 Thread Fabien COELHO


[oops, resent because stalled, wrong From!]

Hello Noah,


Thread create time seems to be expensive as well, maybe up 0.1
seconds under some conditions (?). Under --rate, this create delay
means that throttling is laging behind schedule by about that time,
so all the first transactions are trying to catch up.


threadRun() already initializes throttle_trigger with a fresh timestamp.
Please detail how the problem remains despite that.


Indeed, I did this kludge because I could not rely on the "before fork" start 
time as it was (possibly) creating a "rush" at the beginning of the run under 
--rate.


The version I submitted takes the start time after the thread is created, and 
use it directly for throttling, so the start time is taken once per thread and 
used instead of retaking it because the first one cannot be relied on.



[...]


Fine detailed analysis!


Opinions, other ideas?


I do not think that there is a clean and simple way to take the start/stop 
period into account when computing global performances of a run. The TPC-C 
benchmark tells to ignore the warmup/closure period, whatever they are, and 
only perform measures within the steady state. However the full graph must be 
provided when the bench is provided.


About better measures: If I could rely on having threads, I would simply 
synchronise the threads at the beginning so that they actually start after they 
are all created, and one thread would decide when to stop and set a shared 
volatile variable to stop all transactions more or less at once. In this case, 
the thread start time would be taken just after the synchronization, and maybe 
only by thread 0 would be enough.


Note that this is pretty standard stuff with threads, ISTM that it would solve 
most of the issues, *but* this is not possible with the "thread fork emulation" 
implemented by pgbench, which really means no threads at all.


A possible compromise would be to do just that when actual threads are used, 
and let it more or less as it is when fork emulation is on...


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing

2013-09-30 Thread Noah Misch
On Thu, Sep 26, 2013 at 01:41:01PM +0200, Fabien COELHO wrote:
> >I don't get it; why is taking the time just after pthread_create() more sane
> >than taking it just before pthread_create()?
> 
> Thread create time seems to be expensive as well, maybe up 0.1
> seconds under some conditions (?). Under --rate, this create delay
> means that throttling is laging behind schedule by about that time,
> so all the first transactions are trying to catch up.

threadRun() already initializes throttle_trigger with a fresh timestamp.
Please detail how the problem remains despite that.

> >typically far more expensive than pthread_create().  The patch for threaded
> >pgbench made the decision to account for pthread_create() as though it were
> >part of establishing the connection.  You're proposing to not account for it
> >all.  Both of those designs are reasonable to me, but I do not comprehend the
> >benefit you anticipate from switching from one to the other.
> >
> >>-j 800 vs -j 100 : ITM that if I you create more threads, the time delay
> >>incurred is cumulative, so the strangeness of the result should worsen.
> >
> >Not in general; we do one INSTR_TIME_SET_CURRENT() per thread, just before
> >calling pthread_create().  However, thread 0 is a special case; we set its
> >start time first and actually start it last.  Your observation of cumulative
> >delay fits those facts.
> 
> Yep, that must be thread 0 which has a very large delay. I think it
> is simpler that each threads record its start time when it has
> started, without exception.
> 
> > Initializing the thread-0 start time later, just before calling
> >its threadRun(), should clear this anomaly without changing other
> >aspects of the measurement.
> 
> Always taking the thread start time when the thread is started does
> solve the issue as well, and it is homogeneous for all cases, so the
> solution I suggest seems reasonable and simple.

To exercise the timing semantics before and after your proposed change, I
added a "sleep(1);" before the pthread_create() call.  Here are the results
with and without "-j", with and without pgbench-measurements-v5.patch:

$ echo 'select 1' >test.sql

# just the sleep(1) addition
$ env time pgbench -c4 -t1000 -S -n -f test.sql | grep tps
tps = 6784.410104 (including connections establishing)
tps = 7094.701854 (excluding connections establishing)
0.03user 0.07system 0:00.60elapsed 16%CPU (0avgtext+0avgdata 0maxresident)k

$ env time pgbench -j4 -c4 -t1000 -S -n -f test.sql | grep tps
tps = 1224.327010 (including connections establishing)
tps = 2274.160899 (excluding connections establishing)
0.02user 0.03system 0:03.27elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k

# w/ pgbench-measurements-v5.patch
$ env time pgbench -c4 -t1000 -S -n -f test.sql | grep tps
tps = 6792.393877 (including connections establishing)
tps = 7207.142278 (excluding connections establishing)
0.08user 0.06system 0:00.60elapsed 23%CPU (0avgtext+0avgdata 0maxresident)k

$ env time pgbench -j4 -c4 -t1000 -S -n -f test.sql | grep tps
tps = 1212.040409 (including connections establishing)
tps = 1214.728830 (excluding connections establishing)
0.09user 0.06system 0:03.31elapsed 4%CPU (0avgtext+0avgdata 0maxresident)k


Rather than, as I supposed before, excluding the cost of thread start
entirely, pgbench-measurements-v5.patch has us count pthread_create() as part
of the main runtime.  I now see the cumulative delay you mentioned, but
pgbench-measurements-v5.patch does not fix it.  The problem is centered on the
fact that pgbench.c:main() calculates a single total_time and models each
thread as having run for that entire period.  If pthread_create() is slow,
reality diverges considerably from that model; some threads start notably
late, and other threads finish notably early.  The threadRun() runtime
intervals in the last test run above are actually something like this:

thread 1: 1.0s - 1.3s
thread 2: 2.0s - 2.3s
thread 3: 3.0s - 3.3s
thread 0: 3.0s - 3.3s

Current pgbench instead models every thread as having run 0.0s - 3.3s, hence
the numbers reported.  To make the numbers less surprising, we could axe the
global total_time=end_time-start_time and instead accumulate total_time on a
per-thread basis, just as we now accumulate conn_time on a per-thread basis.
That ceases charging threads for time spent not-yet-running or
already-finished, but it can add its own inaccuracy.  Performance during a
period in which some clients have yet to start is not interchangeable with
performance when all clients are running.  pthread_create() slowness would
actually make the workload seem to perform better.

An alternate strategy would be to synchronize the actual start of command
issuance across threads.  All threads would start and make their database
connections, then signal readiness.  Once the first thread notices that every
other thread is ready, it would direct them to actually start issuing queries.
This might minimize the result skew problem of the first strateg

Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing

2013-09-26 Thread Fabien COELHO



pgbench changes, when adding the throttling stuff. Having the start time
taken when the thread really starts is just sanity, and I needed that
just to rule out that it was the source of the "strange" measures.


I don't get it; why is taking the time just after pthread_create() more sane
than taking it just before pthread_create()?


Thread create time seems to be expensive as well, maybe up 0.1 seconds 
under some conditions (?). Under --rate, this create delay means that 
throttling is laging behind schedule by about that time, so all the first 
transactions are trying to catch up.



typically far more expensive than pthread_create().  The patch for threaded
pgbench made the decision to account for pthread_create() as though it were
part of establishing the connection.  You're proposing to not account for it
all.  Both of those designs are reasonable to me, but I do not comprehend the
benefit you anticipate from switching from one to the other.


-j 800 vs -j 100 : ITM that if I you create more threads, the time delay
incurred is cumulative, so the strangeness of the result should worsen.


Not in general; we do one INSTR_TIME_SET_CURRENT() per thread, just before
calling pthread_create().  However, thread 0 is a special case; we set its
start time first and actually start it last.  Your observation of cumulative
delay fits those facts.


Yep, that must be thread 0 which has a very large delay. I think it is 
simpler that each threads record its start time when it has started, 
without exception.


 Initializing the thread-0 start time later, just before calling its 
threadRun(), should clear this anomaly without changing other aspects of 
the measurement.


Always taking the thread start time when the thread is started does solve 
the issue as well, and it is homogeneous for all cases, so the solution I 
suggest seems reasonable and simple.


While pondering this area of the code, it occurs to me -- shouldn't we 
initialize the throttle rate trigger later, after establishing 
connections and sending startup queries?  As it stands, we build up a 
schedule deficit during those tasks.  Was that deliberate?


On the principle, I agree with you.

The connection creation time is another thing, but it depends on the 
options set. Under some options the connection is open and closed for 
every transaction, so there is no point in avoiding it in the measure or 
in the scheduling, and I want to avoid having to distinguish those cases. 
Morover, ISTM that one of the thread reuse the existing connection while 
other recreate is. So I left it "as is".


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench - exclude pthread_create() from connection start timing

2013-09-25 Thread Noah Misch
Concerning one of the eventual would-be split patches...

On Tue, Sep 24, 2013 at 10:41:17PM +0200, Fabien COELHO wrote:
>  - Take thread start time at the beginning of the thread (!)
>
>Otherwise it includes pretty slow thread/fork system start times in
>the measurements. May help with bug #8326. This also helps with throttling,
>as the start time is used to adjust the rate: if it is not the actual
>start time, there is a bit of a rush at the beginning in order to
>catch up. Taking the start time when the thread actually starts is
>the sane thing to do to avoid surprises at possibly strange measures
>on short runs.

On Sun, Sep 22, 2013 at 10:08:54AM +0200, Fabien COELHO wrote:
> I really spent *hours* debugging stupid measures on the previous round of 
> pgbench changes, when adding the throttling stuff. Having the start time  
> taken when the thread really starts is just sanity, and I needed that 
> just to rule out that it was the source of the "strange" measures.

I don't get it; why is taking the time just after pthread_create() more sane
than taking it just before pthread_create()?  We report separate TPS values
for "including connections establishing" and "excluding connections
establishing" because establishing the database connection is expensive,
typically far more expensive than pthread_create().  The patch for threaded
pgbench made the decision to account for pthread_create() as though it were
part of establishing the connection.  You're proposing to not account for it
all.  Both of those designs are reasonable to me, but I do not comprehend the
benefit you anticipate from switching from one to the other.

> -j 800 vs -j 100 : ITM that if I you create more threads, the time delay  
> incurred is cumulative, so the strangeness of the result should worsen.

Not in general; we do one INSTR_TIME_SET_CURRENT() per thread, just before
calling pthread_create().  However, thread 0 is a special case; we set its
start time first and actually start it last.  Your observation of cumulative
delay fits those facts.  Initializing the thread-0 start time later, just
before calling its threadRun(), should clear this anomaly without changing
other aspects of the measurement.


While pondering this area of the code, it occurs to me -- shouldn't we
initialize the throttle rate trigger later, after establishing connections and
sending startup queries?  As it stands, we build up a schedule deficit during
those tasks.  Was that deliberate?

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers