Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-22 Thread Greg Smith
Attached is an update that I think sorts out all of the documentation 
concerns.  I broke this section into paragraphs now that it's getting so 
long too.


The only code change is that this now labels the controversial lag here 
"average rate limit schedule lag".  That way if someone wants to 
introduce other measures of rate limit lag, like a more 
transaction-oriented one, you might call that "average rate limit 
transaction lag" and tell the two apart.


The rewritten documentation here tries to communicate that there is a 
schedule that acts like it was pre-computed at the start of each client 
too.  It's not ever adjusted based on what individual transactions do. 
I also noted the way this can cause schedule lag for some time after a 
slow transaction finishes, since that's the main issue observed so far.
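
As an illustration of that pre-computed schedule, here is a minimal 
standalone sketch (illustrative only, not the patch code itself; the 
target delay and loop count are arbitrary) of how each scheduled start 
time is derived from the previous trigger plus an exponentially 
distributed wait, never from when transactions actually finish:

#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	int64_t throttle_delay = 100000;	/* target mean delay in usec (10 tps) */
	int64_t trigger = 0;				/* next scheduled start time, usec */

	for (int i = 0; i < 5; i++)
	{
		/* uniform in (0,1], mapped through -log() to an exponential wait
		 * whose mean is throttle_delay */
		double	u = (rand() % 1000 + 1) / 1000.0;
		int64_t	wait = (int64_t) (throttle_delay * -log(u));

		trigger += wait;	/* the schedule only ever moves forward by the
							 * sampled waits; it never looks at when the
							 * previous transaction ended */
		printf("transaction %d scheduled to start at %lld us\n",
			   i, (long long) trigger);
	}
	return 0;
}

Because the trigger only advances by the sampled waits, a slow transaction 
leaves every following start time unchanged, which is exactly why lag can 
persist for a while after it finishes.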


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.com
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
new file mode 100644
index 2ad8f0b..4e6b608
*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
*** int unlogged_tables = 0;
*** 137,142 ****
--- 137,148 ----
  double		sample_rate = 0.0;
  
  /*
+  * When threads are throttled to a given rate limit, this is the target delay
+  * to reach that rate in usec.  0 is the default and means no throttling.
+  */
+ int64		throttle_delay = 0;
+ 
+ /*
   * tablespace selection
   */
  char	   *tablespace = NULL;
*** typedef struct
*** 202,212 ****
--- 208,220 ----
  	int			listen;			/* 0 indicates that an async query has been
  								 * sent */
  	int			sleeping;		/* 1 indicates that the client is napping */
+ 	bool		throttling;		/* whether nap is for throttling */
  	int64		until;			/* napping until (usec) */
  	Variable   *variables;		/* array of variable definitions */
  	int			nvariables;
  	instr_time	txn_begin;		/* used for measuring transaction latencies */
  	instr_time	stmt_begin;		/* used for measuring statement latencies */
+ 	bool		is_throttled;	/* whether transaction throttling is done */
  	int			use_file;		/* index in sql_files for this client */
  	bool		prepared[MAX_FILES];
  } CState;
*** typedef struct
*** 224,229 ****
--- 232,240 ----
  	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
  	int		   *exec_count;		/* number of cmd executions (per Command) */
  	unsigned short random_state[3];		/* separate randomness for each thread */
+ 	int64		throttle_trigger;	/* previous/next throttling (us) */
+ 	int64		throttle_lag;		/* total transaction lag behind throttling */
+ 	int64		throttle_lag_max;	/* max transaction lag */
  } TState;
  
  #define INVALID_THREAD		((pthread_t) 0)
*** typedef struct
*** 232,237 ****
--- 243,250 ----
  {
  	instr_time	conn_time;
  	int			xacts;
+ 	int64		throttle_lag;
+ 	int64		throttle_lag_max;
  } TResult;
  
  /*
*** usage(void)
*** 356,361 ****
--- 369,375 ----
  		 "  -N, --skip-some-updates  skip updates of pgbench_tellers and pgbench_branches\n"
  		 "  -P, --progress=NUM   show thread progress report every NUM seconds\n"
  		 "  -r, --report-latencies   report average latency per command\n"
+ 		 "  -R, --rate SPEC  target rate in transactions per second\n"
  		 "  -s, --scale=NUM  report this scale factor in output\n"
  		 "  -S, --select-only        perform SELECT-only transactions\n"
  		 "  -t, --transactions   number of transactions each client runs "
*** doCustom(TState *thread, CState *st, ins
*** 898,914 ****
  {
  	PGresult   *res;
  	Command   **commands;
  
  top:
  	commands = sql_files[st->use_file];
  
  	if (st->sleeping)
  	{							/* are we sleeping? */
  		instr_time	now;
  
  		INSTR_TIME_SET_CURRENT(now);
! 		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
  			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
  		else
  			return true;		/* Still sleeping, nothing to do here */
  	}
--- 912,973 ----
  {
  	PGresult   *res;
  	Command   **commands;
+ 	bool		trans_needs_throttle = false;
  
  top:
  	commands = 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-22 Thread Alvaro Herrera
Greg Smith wrote:

Thanks.  I didn't look at the code, but while trying to read the docs:

 +<para>
 + High rate limit schedule lag values, that is values not small with
 + respect to the actual transaction latency, indicate that something is
 + amiss in the throttling process.

I couldn't really parse the above.  Of the first six words, which one is
a verb?  Is there a noun that needs to be plural?  Is there a word that
shouldn't be there?

... Oh, I think it makes sense if I assume that "rate limit schedule lag"
is a single concept .. but if so, that phrase seems too many words for it.
(So when the RLSL values are high, this indicates a problem.  Is that
what the above means?)

Also, it took me a while to understand what "values not small" means.  I
think there must be a way to phrase this that's easier to understand.

   High lag can highlight a subtle
 + problem there even if the target rate limit is met in the end.  One
 + possible cause of schedule lage is insufficient pgbench threads to
 + handle all of the clients.

typo "lage" above.

   To improve that, consider reducing the
 + number of clients, increasing the number of threads in pgbench, or
 + running pgbench on a separate host.  Another possibility is that the
 + database is not keeping up with the load at some point.  When that
 + happens, you will have to reduce the expected transaction rate to
 + lower schedule lag.
 +</para>

Thanks

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-22 Thread Fabien COELHO


Hello Greg,

Thanks for the improvement!


I have a small reservation about "finish/end time schedule" in the second 
paragraph, or maybe there is something that I do not understand. There is 
no schedule for finishing anything, only start times are scheduled, so I 
wish the text could avoid suggesting that finish times are scheduled.


The rate is targeted by starting transactions along a 
Poisson-distributed schedule time line.  The expected



finish time schedule


-> start time schedule

moves forward based on when the client first started, not when 
the previous transaction ended.



That approach means that when transactions go past their original 
scheduled end time, it is possible for later ones to catch up again.


-> That approach means that long transactions can result in later 
transactions being late with respect to the schedule, while short 
transactions make it possible for late ones to catch up again.


Would you be ok with that?

--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-22 Thread Fabien COELHO


Hello Alvaro,


Thanks.  I didn't look at the code, but while trying to read the docs:


+<para>
+ High rate limit schedule lag values, that is values not small with
+ respect to the actual transaction latency, indicate that something is
+ amiss in the throttling process.


I couldn't really parse the above.  Of the first six words, which one is
a verb?


None. "High values for the time lag measured with respect to the rate 
limit schedule."


Is there a noun that needs to be plural?  Is there a word that shouldn't 
be there?


I do not think so.


... Oh, I think it makes sense if I assume that "rate limit schedule lag"
is a single concept .. but if so, that phrase seems too many words for it.
(So when the RLSL values are high, this indicates a problem.  Is that
what the above means?)


Yep!


Also, it took me a while to understand what "values not small" means.  I
think there must be a way to phrase this that's easier to understand.


That's what we are trying to do, but we still need to be precise. With 
fewer words it seems more understandable, but previous versions showed 
that the meaning was ambiguous, that is, people put their own intuitive 
definition of lag, which resulted in surprises at the measures and their 
cumulative behavior. The alternative was either to change what is 
measured, but I insisted that this measure is the useful one, or to try to 
reduce the ambiguity of the documentation, the result of efforts by Greg & 
myself with your help to debug :-)



   High lag can highlight a subtle
+ problem there even if the target rate limit is met in the end.


I'm fine with that, if it is clear from the context that the lag we're 
talking about is the one defined in the preceding paragraph. Greg, what 
do you think?


+ One possible cause of schedule lage is insufficient pgbench threads 
to handle all of the clients.


typo "lage" above.


Indeed.

   To improve that, consider reducing the
 + number of clients, increasing the number of threads in pgbench, or
 + running pgbench on a separate host.  Another possibility is that the
 + database is not keeping up with the load at some point.  When that
 + happens, you will have to reduce the expected transaction rate to
 + lower schedule lag.
 +</para>


Thanks for your help!

--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-22 Thread Greg Smith
Very minor update with V19 here, to reflect Alvaro's comments.  The 
tricky part now reads like this:


High rate limit schedule lag values, that is lag values that are large 
compared to the actual transaction latency, indicate that something is 
amiss in the throttling process.  High schedule lag can highlight a 
subtle problem there even if the target rate limit is met in the end.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.com
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
new file mode 100644
index 2ad8f0b..4e6b608
*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
*** int unlogged_tables = 0;
*** 137,142 ****
--- 137,148 ----
  double		sample_rate = 0.0;
  
  /*
+  * When threads are throttled to a given rate limit, this is the target delay
+  * to reach that rate in usec.  0 is the default and means no throttling.
+  */
+ int64		throttle_delay = 0;
+ 
+ /*
   * tablespace selection
   */
  char	   *tablespace = NULL;
*** typedef struct
*** 202,212 ****
--- 208,220 ----
  	int			listen;			/* 0 indicates that an async query has been
  								 * sent */
  	int			sleeping;		/* 1 indicates that the client is napping */
+ 	bool		throttling;		/* whether nap is for throttling */
  	int64		until;			/* napping until (usec) */
  	Variable   *variables;		/* array of variable definitions */
  	int			nvariables;
  	instr_time	txn_begin;		/* used for measuring transaction latencies */
  	instr_time	stmt_begin;		/* used for measuring statement latencies */
+ 	bool		is_throttled;	/* whether transaction throttling is done */
  	int			use_file;		/* index in sql_files for this client */
  	bool		prepared[MAX_FILES];
  } CState;
*** typedef struct
*** 224,229 ****
--- 232,240 ----
  	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
  	int		   *exec_count;		/* number of cmd executions (per Command) */
  	unsigned short random_state[3];		/* separate randomness for each thread */
+ 	int64		throttle_trigger;	/* previous/next throttling (us) */
+ 	int64		throttle_lag;		/* total transaction lag behind throttling */
+ 	int64		throttle_lag_max;	/* max transaction lag */
  } TState;
  
  #define INVALID_THREAD		((pthread_t) 0)
*** typedef struct
*** 232,237 ****
--- 243,250 ----
  {
  	instr_time	conn_time;
  	int			xacts;
+ 	int64		throttle_lag;
+ 	int64		throttle_lag_max;
  } TResult;
  
  /*
*** usage(void)
*** 356,361 ****
--- 369,375 ----
  		 "  -N, --skip-some-updates  skip updates of pgbench_tellers and pgbench_branches\n"
  		 "  -P, --progress=NUM   show thread progress report every NUM seconds\n"
  		 "  -r, --report-latencies   report average latency per command\n"
+ 		 "  -R, --rate SPEC  target rate in transactions per second\n"
  		 "  -s, --scale=NUM  report this scale factor in output\n"
  		 "  -S, --select-only        perform SELECT-only transactions\n"
  		 "  -t, --transactions   number of transactions each client runs "
*** doCustom(TState *thread, CState *st, ins
*** 898,914 ****
  {
  	PGresult   *res;
  	Command   **commands;
  
  top:
  	commands = sql_files[st->use_file];
  
  	if (st->sleeping)
  	{							/* are we sleeping? */
  		instr_time	now;
  
  		INSTR_TIME_SET_CURRENT(now);
! 		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
  			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
  		else
  			return true;		/* Still sleeping, nothing to do here */
  	}
--- 912,973 ----
  {
  	PGresult   *res;
  	Command   **commands;
+ 	bool		trans_needs_throttle = false;
  
  top:
  	commands = sql_files[st->use_file];
  
+ 	/*
+ 	 * Handle throttling once per transaction by sleeping.  It is simpler
+ 	 * to do this here rather than at the end, because so much complicated
+ 	 * logic happens below when statements finish.
+ 	 */
+ 	if (throttle_delay && ! st->is_throttled)
+ 	{
+ 		/*
+ 		 * Use inverse transform sampling to randomly generate a delay, such
+ 		 * that the 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-22 Thread David Fetter
On Mon, Jul 22, 2013 at 01:49:39PM -0400, Greg Smith wrote:
 Very minor update with V19 here, to reflect Alvaro's comments.  The
 tricky part now reads like this:
 
 High rate limit schedule lag values,

"High values of the rate limit schedule lag measurement"?

 that is lag values that are large compared to the actual transaction
 latency, indicate that something is amiss in the throttling process.
 High schedule lag can highlight a subtle problem there even if the
 target rate limit is met in the end.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-22 Thread Tatsuo Ishii
 Very minor update with V19 here, to reflect Alvaro's comments.  The
 tricky part now reads like this:
 
 High rate limit schedule lag values, that is lag values that are large
 compared to the actual transaction latency, indicate that something is
 amiss in the throttling process.  High schedule lag can highlight a
 subtle problem there even if the target rate limit is met in the end.

I have committed this along with a slight modification. I changed
"--rate rate" to "--rate=rate" to follow the option style of pgbench.

Also I have removed a space in "--progress= sec" in the doc, which was
probably mistakenly added by a previous commit.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-21 Thread Tatsuo Ishii
Greg,

 Yes, I already took a look at it briefly.  The updates move in the
 right direction, but I can edit them usefully before commit.  I'll
 have that done by tomorrow and send out a new version.  I'm hopeful
 that v18 will finally be the one that everyone likes.

Have you done it?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-18 Thread Fabien COELHO


Hello Tatsuo,


I think I'm starting to understand what's going on.  Suppose there are
n transactions to be issued by pgbench and it decides each schedule d(0),
d(1)... d(n). Actually the schedule d(i) (which is stored in
st->until) is decided by the following code:

int64 wait = (int64)
	throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
thread->throttle_trigger += wait;
st->until = thread->throttle_trigger;


Yep. Let us say d(i) is the target starting time for transaction i, that 
is throttle_trigger above.



st->until represents the time for a transaction to be finished by.
Now the transaction i finishes at t(i).


No, it is the time for the **start** of the transaction. The client is 
sleeping until this time. We can only try to control the beginning of 
the transaction. It ends when it ends!



So the lag l(i) = t(i) - d(i) if the transaction is behind.


Transaction i lags behind if it *starts* later than d(i). If it effectively 
starts at t(i), t(i) >= d(i), the lag is l(i) = t(i) - d(i). When it 
completes is not the problem of the scheduler.


Then next transaction i+1 begins. The lag l(i+1) = t(i+1) - d(i+1) and 
so on. At the end of pgbench, it shows the average lag as 
sum(l(0)...l(n))/n.


Yes.


Now suppose we have 3 transactions and each has the following values:

d(0) = 10
d(1) = 20
d(2) = 30

t(0) = 100
t(1) = 110
t(2) = 120

That says pgbench expects a duration of 10 for each transaction.


pgbench does not expect any duration, but your proposed scheduling d(i) 
cannot be followed if the duration is more than 10.


With your above figures, with d(i) the expected start time and t(i) the 
actual start time, for some reason pgbench was not around to start the 
transaction before time 100 (maybe the OS switched the process off to 
attend to other stuff) although it should have started at 10, so l(0) = 
90. Then the second transaction starts readily at 110, but was nevertheless 
expected at 20: 90 lag again. Same for the last one. All transactions 
started 90 units after their scheduled time, the cumulative lag is 270, 
and the average lag is 90.


If I take another example.

 - Scheduled start time d(0 .. 3) = 0 20 40 60
 - Durations D(0 .. 3) = 15 25 50 10
 - Actual start time for transactions
   t(0) = 3 (it is late by 3 for some reason), completes by 18
   t(1) = t(0)+D(0) + some more lag for some reason = 21, completes by 46
   t(2) = t(1)+D(1) + no additional lag here = 46, completes by 96
   t(3) = t(2)+D(2) + some more lag for some reason = 97, completes by 107

The l(0 .. 3) = 3-0, 21-20, 46-40, 97-60

Total lag is 3 + 1 + 6 + 37 = 47

Average lag = 47/4 = 11.75

In this example, some lag is due to the process (3 at the beginning, 1 on 
the second transaction), some other is due to a transaction duration which 
impact the following transactions.
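
To make the arithmetic easy to check, here is a tiny sketch (illustrative 
only) that recomputes the totals from the d(i) and t(i) values above:

#include <stdio.h>

int main(void)
{
	int		d[] = {0, 20, 40, 60};	/* scheduled start times */
	int		t[] = {3, 21, 46, 97};	/* actual start times */
	int		n = 4, total = 0;

	for (int i = 0; i < n; i++)
		total += t[i] - d[i];		/* l(i) = t(i) - d(i) */

	printf("total lag = %d, average lag = %.2f\n", total, total / (double) n);
	/* prints: total lag = 47, average lag = 11.75 */
	return 0;
}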



However actually pgbench calculates like this:

average lag = (t(0)-d(0) + t(1)-d(1) + t(2)-d(2))/3
   = (100-10 + 110-20 + 120-30)/3
   = (90 + 90 + 90)/3
   = 90


Yes, this is correct.


Looks like too much lag is calculated. The difference between the lag
which pgbench calculates and the expected one will keep growing if a lag
happens earlier. I guess the reason why my Linux box shows a bigger lag
than Mac OS X is that the first transaction or early transactions run
more slowly than the ones run later.


Possibly.


Of course this conclusion depends on the definition of the average
rate limit lag of pgbench. So you might have other opinion. However
the way how pgbench calculates the average lag is not expected at
least for me.


Indeed, I think that it really depends on your definition of lag. The lag 
I defined is the time between the scheduled transaction start time and the 
actual transaction start time. This is a measure of how well pgbench is 
able to follow the stochastic process, and if pgbench is constantly late 
then it accumulates a lot, but that basically means that there are not 
enough (cpu) resources to run pgbench cleanly.


What you seem to expect is the average transaction latency. This is also a 
useful measure, and I'm planning to add a clean measure of that when under 
throttling, and also with --progress, as the current computation based on 
tps is not significant under throttling.


But that is a plan for the next commitfest!

--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-18 Thread Fabien COELHO


Hello Greg,

The lag computation was not the interesting part of this feature to me.  As I 
said before, I considered it more of a debugging level thing than a number 
people would analyze as much as you did.  I understand why you don't like it 
though.  If the reference time was moved forward to match the transaction end 
each time, I think that would give the lag definition you're looking for. 
That's fine to me too, if Fabien doesn't have a good reason to reject the 
idea.  We would need to make sure that doesn't break some part of the design 
too.


I really think that the information currently computed is useful. First, 
as you note, for debug, not really debugging the throttling feature which 
works fine, but being able to debug performance if something goes wrong 
while running a bench. Another reason why it is useful is that from a 
client perspective it measures whether the database system is coping with 
the load without incurring additional delays by processing client 
requests (say from the web server) far behind their actual (i.e. 
scheduled) occurrences.


So my recommendation is: please keep this measure as it is, and if you want 
the other lag measure, why not add it as well next to the previous one?


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-18 Thread Tatsuo Ishii
Fabien,

 Hello again Tatsuo,
 
 For your information, included is the patch against git master head to
 implement the lag in the way I proposed. With the patch, I get more
 consistent numbers on Linux (and Mac OS X).
 
 I must disagree with your proposal: At least, it does not provide the
 information I want, but another one.
 
 ISTM that this patch measures the lag which is due to pgbench thread
 coming around to deal with a transaction after sleeping. I would
 expect that to be quite small most of the time, so I agree that it
 must be reassuringly consistent.
 
 However it does not provide the information I want, which is the
 measure of the health of pgbench with respect to the stochastic
 process scheduled transactions.
 
 Basically you propose a partial lag measure, which will be smaller,
 but which does not tell whether pgbench is able to follow the
 schedule, which is the information I find useful in this context to
 appreciate if the throttling is going well.

I don't care what kind of measurement is provided by pgbench. However,
as a committer I *do* care that what the measurement means is clearly
described in the doc. I think the current measurement method will cause
enough confusion if it's not provided with a detailed explanation. Could
you please provide a doc update?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-18 Thread Fabien COELHO


Hello Tatsuo

I think the current measurement method will cause enough confusion if it's 
not provided with a detailed explanation. Could you please provide a doc 
update?


Please find a v17 proposition with an updated and extended documentation, 
focussed on clarifying the lag measure and its implications, and taking 
into account the recent discussion on the list with you & Greg.


However I'm not a native English speaker, so if you find that some parts are 
not clear enough, please tell me what can be improved further.


--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 2ad8f0b..752119d 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When threads are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -202,11 +208,13 @@ typedef struct
 	int			listen;			/* 0 indicates that an async query has been
 								 * sent */
 	int			sleeping;		/* 1 indicates that the client is napping */
+	bool		throttling;		/* whether nap is for throttling */
 	int64		until;			/* napping until (usec) */
 	Variable   *variables;		/* array of variable definitions */
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		is_throttled;	/* whether transaction throttling is done */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -224,6 +232,9 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+	int64   throttle_trigger; 	/* previous/next throttling (us) */
+	int64   throttle_lag; 		/* total transaction lag behind throttling */
+	int64   throttle_lag_max; 	/* max transaction lag */
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -232,6 +243,8 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64   throttle_lag;
+	int64   throttle_lag_max;
 } TResult;
 
 /*
@@ -356,6 +369,7 @@ usage(void)
 		 "  -N, --skip-some-updates  skip updates of pgbench_tellers and pgbench_branches\n"
 		 "  -P, --progress=NUM   show thread progress report every NUM seconds\n"
 		 "  -r, --report-latencies   report average latency per command\n"
+		 "  -R, --rate SPEC  target rate in transactions per second\n"
 		 "  -s, --scale=NUM  report this scale factor in output\n"
 		 "  -S, --select-only        perform SELECT-only transactions\n"
 		 "  -t, --transactions   number of transactions each client runs "
@@ -898,17 +912,62 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 {
 	PGresult   *res;
 	Command   **commands;
+	bool		trans_needs_throttle = false;
 
 top:
 	commands = sql_files[st->use_file];
 
+	/*
+	 * Handle throttling once per transaction by sleeping.  It is simpler
+	 * to do this here rather than at the end, because so much complicated
+	 * logic happens below when statements finish.
+	 */
+	if (throttle_delay && ! st->is_throttled)
+	{
+		/*
+		 * Use inverse transform sampling to randomly generate a delay, such
+		 * that the series of delays will approximate a Poisson distribution
+		 * centered on the throttle_delay time.
+		 *
+		 * 1000 implies a 6.9 (-log(1/1000)) to 0.0 (log 1.0) delay multiplier.
+		 *
+		 * If transactions are too slow or a given wait is shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64 wait = (int64)
+			throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
+
+		thread->throttle_trigger += wait;
+
+		st->until = thread->throttle_trigger;
+		st->sleeping = 1;
+		st->throttling = true;
+		st->is_throttled = true;
+		if (debug)
+			fprintf(stderr, "client %d throttling " INT64_FORMAT " us\n",
+					st->id, wait);
+	}
+
 	if (st->sleeping)
 	{			/* are we sleeping? */
 		instr_time	now;
+		int64 now_us;
 
 		INSTR_TIME_SET_CURRENT(now);
-		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
+		now_us = INSTR_TIME_GET_MICROSEC(now);
+		if (st->until <= now_us)
+		{
 			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
+			if (st->throttling)
+			{
+				/* Measure lag of throttled transaction relative to target */
+				int64 lag = now_us - st->until;
+				thread->throttle_lag += lag;
+				if (lag > thread->throttle_lag_max)
+					thread->throttle_lag_max = lag;
+				st->throttling = false;
+			}
+		}
 		else
 			return true;		/* Still sleeping, nothing to do here */
 	}
@@ -1095,6 +1154,15 @@ top:
 			st->state = 0;
 			st->use_file = (int) getrand(thread, 0, num_files - 1);
 			commands = 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-18 Thread Fabien COELHO


Hello again Tatsuo,


For your information, included is the patch against git master head to
implement the lag in the way I proposed. With the patch, I get more
consistent numbers on Linux (and Mac OS X).


I must disagree with your proposal: At least, it does not provide the 
information I want, but another one.


ISTM that this patch measures the lag which is due to pgbench thread 
coming around to deal with a transaction after sleeping. I would expect 
that to be quite small most of the time, so I agree that it must be 
reassuringly consistent.


However it does not provide the information I want, which is the measure 
of the health of pgbench with respect to the stochastic process scheduled 
transactions.


Basically you propose a partial lag measure, which will be smaller, but 
which does not tell whether pgbench is able to follow the schedule, which 
is the information I find useful in this context to appreciate if the 
throttling is going well.


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-18 Thread Tatsuo Ishii
 Hello Tatsuo
 
 I think the current measurement method will cause enough confusion if it's
 not provided with a detailed explanation. Could you please provide a doc
 update?
 
 Please find a v17 proposition with an updated and extended
 documentation, focussed on clarifying the lag measure and its
 implications, and taking into account the recent discussion on the
 list with you & Greg.

Thanks!

 However I'm not a native English speaker, so if you find that some parts
 are not clear enough, please tell me what can be improved further.

I'm not a native English speaker either... Greg, could you please
review the document?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-18 Thread Greg Smith

On 7/18/13 6:45 PM, Tatsuo Ishii wrote:

I'm not a native English speaker either... Greg, could you please
review the document?


Yes, I already took a look at it briefly.  The updates move in the 
right direction, but I can edit them usefully before commit.  I'll have 
that done by tomorrow and send out a new version.  I'm hopeful that v18 
will finally be the one that everyone likes.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-18 Thread Fabien COELHO


I'm not a native English speaker either... Greg, could you please 
review the document?


Yes, I already took a look at it briefly.  The updates move in the right 
direction, but I can edit them usefully before commit.


Great, thanks for your help!

--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Fabien COELHO



To clarify what state this is all in: Fabien's latest
pgbench-throttle-v15.patch is the "ready for a committer" version.  The
last two revisions are just tweaking the comments at this point, and
his version is more correct than my last one.


Got it. I will take care of this.


Please find attached an updated version which solves conflicts introduced 
by the progress patch.


--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 23ee53c..4111e8c 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When threads are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -202,11 +208,13 @@ typedef struct
 	int			listen;			/* 0 indicates that an async query has been
 								 * sent */
 	int			sleeping;		/* 1 indicates that the client is napping */
+	bool		throttling;		/* whether nap is for throttling */
 	int64		until;			/* napping until (usec) */
 	Variable   *variables;		/* array of variable definitions */
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		is_throttled;	/* whether transaction throttling is done */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -224,6 +232,9 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+	int64   throttle_trigger; 	/* previous/next throttling (us) */
+	int64   throttle_lag; 		/* total transaction lag behind throttling */
+	int64   throttle_lag_max; 	/* max transaction lag */
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -232,6 +243,8 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64   throttle_lag;
+	int64   throttle_lag_max;
 } TResult;
 
 /*
@@ -356,6 +369,7 @@ usage(void)
 		 "  -N, --skip-some-updates  skip updates of pgbench_tellers and pgbench_branches\n"
 		 "  -P, --progress NUM   show thread progress report every NUM seconds\n"
 		 "  -r, --report-latencies   report average latency per command\n"
+		 "  -R, --rate SPEC  target rate in transactions per second\n"
 		 "  -s, --scale=NUM  report this scale factor in output\n"
 		 "  -S, --select-only        perform SELECT-only transactions\n"
 		 "  -t, --transactions   number of transactions each client runs "
@@ -898,17 +912,62 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 {
 	PGresult   *res;
 	Command   **commands;
+	bool		trans_needs_throttle = false;
 
 top:
 	commands = sql_files[st->use_file];
 
+	/*
+	 * Handle throttling once per transaction by sleeping.  It is simpler
+	 * to do this here rather than at the end, because so much complicated
+	 * logic happens below when statements finish.
+	 */
+	if (throttle_delay && ! st->is_throttled)
+	{
+		/*
+		 * Use inverse transform sampling to randomly generate a delay, such
+		 * that the series of delays will approximate a Poisson distribution
+		 * centered on the throttle_delay time.
+		 *
+		 * 1000 implies a 6.9 (-log(1/1000)) to 0.0 (log 1.0) delay multiplier.
+		 *
+		 * If transactions are too slow or a given wait is shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64 wait = (int64)
+			throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
+
+		thread->throttle_trigger += wait;
+
+		st->until = thread->throttle_trigger;
+		st->sleeping = 1;
+		st->throttling = true;
+		st->is_throttled = true;
+		if (debug)
+			fprintf(stderr, "client %d throttling " INT64_FORMAT " us\n",
+					st->id, wait);
+	}
+
 	if (st->sleeping)
 	{			/* are we sleeping? */
 		instr_time	now;
+		int64 now_us;
 
 		INSTR_TIME_SET_CURRENT(now);
-		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
+		now_us = INSTR_TIME_GET_MICROSEC(now);
+		if (st->until <= now_us)
+		{
 			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
+			if (st->throttling)
+			{
+				/* Measure lag of throttled transaction relative to target */
+				int64 lag = now_us - st->until;
+				thread->throttle_lag += lag;
+				if (lag > thread->throttle_lag_max)
+					thread->throttle_lag_max = lag;
+				st->throttling = false;
+			}
+		}
 		else
 			return true;		/* Still sleeping, nothing to do here */
 	}
@@ -1095,6 +1154,15 @@ top:
 			st->state = 0;
 			st->use_file = (int) getrand(thread, 0, num_files - 1);
 			commands = sql_files[st->use_file];
+			st->is_throttled = false;
+			/*
+			 * No transaction is underway anymore, which means there is nothing
+			 * to 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Tatsuo Ishii
 To clarify what state this is all in: Fabien's latest
 pgbench-throttle-v15.patch is the "ready for a committer" version.  The
 last two revisions are just tweaking the comments at this point, and
 his version is more correct than my last one.

 Got it. I will take care of this.
 
 Please find attached an updated version which solves conflicts
 introduced by the progress patch.

Thanks, but I already solved the conflict and fixed some minor
indentation issues. Now I have a question regarding the function.

./pgbench -p 5433 -S -T 10 -R 10000 test
starting vacuum...end.
transaction type: SELECT only
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 10 s
number of transactions actually processed: 71339
average rate limit lag: 862.534 ms (max 2960.913 ms)
tps = 7133.745911 (including connections establishing)
tps = 7135.130810 (excluding connections establishing)

What does "average rate limit lag" mean? From the manual:

-R rate
--rate rate

Execute transactions targeting the specified rate instead of
running as fast as possible (the default). The rate is given in
transactions per second. If the targeted rate is above the maximum
possible rate these transactions can execute at, the rate limit
won't have any impact on results. The rate is targeted by starting
transactions along a Poisson-distributed event time line. When a
rate limit is active, the average and maximum transaction lag time
(the delay between the scheduled and actual transaction start
times) are reported in ms. High values indicate that the database
could not handle the scheduled load at some time.

So in my understanding the number shows the delay time before *each*
transaction starts. If my understanding is correct, why

71339 (total transactions) * 862.534 ms = 61532 sec

could exceed 10 seconds, which is the total run time?

Also I noticed a small bug.

./pgbench -R 0 test
invalid rate limit: 0

Shouldn't this be treated as if -R is not specified? Actually in the program:

/*
 * When threads are throttled to a given rate limit, this is the target delay
 * to reach that rate in usec.  0 is the default and means no throttling.
 */
int64   throttle_delay = 0;

So it seems treating "-R 0" as meaning no throttling makes more sense to me.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Fabien COELHO


Hello Tatsuo,


Now I have question regarding the function.

./pgbench -p 5433 -S -T 10 -R 1 test
tps = 7133.745911 (including connections establishing)

What does "average rate limit lag" mean? From the manual:
[...]
So in my understanding the number shows the delay time before *each*
transaction starts.


... with respect to the schedule time assigned by the rate-limiting 
stochastic process. This is to detect that rate limiting does not work 
properly.


If my understanding is correct, why 71339 (total transactions) * 862.534 
ms = 61532 sec could exceed 10 seconds, which is the total run time?


It is possible, because each transaction is far behind schedule, and you 
cumulate the lateness.


Say you have transactions scheduled every 0.1 second, but they take 2 
seconds to complete:


 1. scheduled at 0.0, start at 0.0
 2. scheduled at 0.1, start at 2.0, 1.9 second lag
 3. scheduled at 0.2, start at 4.0, 3.8 second lag, cumulative lag 5.7 s
 4. scheduled at 0.3, start at 6.0, 5.7 second lag, cumulative lag 11.4 s
 5. scheduled at 0.4, start at 8.0, 7.6 second lag, cumulative lag 19.0 s
 6. scheduled at 0.5, never starts

If we stop at 10.0 seconds, 5 transactions have been processed, the average 
lag is about 3.8 seconds, and the cumulative lag is 19.0 seconds. The lag of a 
given transaction can cover lag from previous ones.
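
For anyone who wants to reproduce these numbers, here is a tiny sketch 
(illustrative only; it hard-codes the 0.1 second schedule step and the 
2 second transaction duration from the example above):

#include <stdio.h>

int main(void)
{
	double	schedule_step = 0.1;	/* a transaction is scheduled every 0.1 s */
	double	duration = 2.0;			/* but each one takes 2.0 s to run */
	double	run_time = 10.0;		/* total bench duration */
	double	cumulative = 0.0;
	int		n = 0;

	/* transactions run back to back, so transaction n starts at n * 2.0 s
	 * while it was scheduled at n * 0.1 s; the lags overlap in wall-clock
	 * time, which is why their sum can exceed the run duration */
	for (double start = 0.0; start + duration <= run_time; start += duration, n++)
		cumulative += start - n * schedule_step;

	printf("%d transactions, cumulative lag %.1f s, average %.1f s\n",
		   n, cumulative, cumulative / n);
	/* prints: 5 transactions, cumulative lag 19.0 s, average 3.8 s */
	return 0;
}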


Basically, if the lag is anything but small, it means that the database 
cannot handle the load and that something is amiss. In your example you 
required 10000 tps, but the database can only handle 7000 tps.


Note that the database could catch up at some point: say it usually can 
handle more than 10000 tps, but while a database dump is running it 
falls far behind schedule, and then once the dump is done it goes back to 
nominal and late transactions are finally processed. The max lag would 
show that something was amiss during the bench, even if the average lag 
is quite low.


Also I noticed a small bug.

./pgbench -R 0 test
invalid rate limit: 0

Shouldn't this be treated as if -R is not specified? Actually in the program:

/*
* When threads are throttled to a given rate limit, this is the target delay
* to reach that rate in usec.  0 is the default and means no throttling.
*/
int64   throttle_delay = 0;

So it seems treating "-R 0" as meaning no throttling makes more sense to me.


Note that the rate is expressed in tps, which makes sense to users, but the 
internal variable is in usec, which is more useful for scheduling, and is 
the inverse of the other.


So -R 0 would mean zero tps, that is an infinite delay, but a 0 delay 
would require an infinite tps. As requiring 0 tps does not make sense, I 
decided to disable that. If you really feel that -R 0 should mean 
"disable the feature", I'm fine with that, but this is not exactly logical 
wrt tps.
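
For reference, the tps-to-delay conversion implied by this discussion 
looks roughly like the following sketch (the helper name and error 
handling are illustrative, not the patch's actual option parsing):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* hypothetical helper: the internal delay between transaction starts
 * is the inverse of the requested rate, converted to microseconds */
static int64_t rate_to_delay(double rate_tps)
{
	if (rate_tps <= 0.0)
	{
		/* zero tps would mean an infinite delay, so reject it */
		fprintf(stderr, "invalid rate limit: %g\n", rate_tps);
		exit(1);
	}
	return (int64_t) (1000000.0 / rate_tps);
}

int main(void)
{
	printf("%lld us\n", (long long) rate_to_delay(5000.0));	/* 200 us */
	return 0;
}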


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Tatsuo Ishii
 Hello Tatsuo,
 
 Now I have question regarding the function.

 ./pgbench -p 5433 -S -T 10 -R 10000 test
 tps = 7133.745911 (including connections establishing)

 What does average rate limit lag mean? From the manual:
 [...]
 So in my understanding the number shows the delay time before *each*
 transaction starts.
 
 ... with respect to the schedule time assigned by the rate-limiting
 stochastic process. This is to detect that rate limiting does not work
 properly.
 
 If my understanding is correct, why 71339 (total transactions) *
 862.534 ms = 61532 sec could exceed 10 seconds, which is the total run
 time?
 
 It is possible, because each transaction is far behind schedule, and
 you cumulate the lateness.
 
 Say you have transactions scheduled every 0.1 second, but they take 2
 seconds to complete:
 
  1. scheduled at 0.0, start at 0.0
  2. scheduled at 0.1, start at 2.0, 1.9 second lag
  3. scheduled at 0.2, start at 4.0, 3.8 second lag, cumulative lag 5.7
  s
  4. scheduled at 0.3, start at 6.0, 5.7 second lag, cumulative lag 11.4
  s
  5. scheduled at 0.4, start at 8.0, 7.6 second lag, cumulative lag 19.0
  s
  6. scheduled at 0.5, never starts
 
 If we stop at 10.0 seconds, 5 transactions have been processed, the
 average lag is about 3.8 seconds, and the cumulative lag is 19.0
 seconds. The lag of a given transaction can cover lag from previous
 ones.
 
 Basically, if the lag is anything but small, it means that the
 database cannot handle the load and that something is amiss. In your
 example you required 10000 tps, but the database can only handle 7000
 tps.
 
 Note that the database could catch up at some point: say it usually can
 handle more than 10000 tps, but while a database dump is running it
 falls far behind schedule, and then once the dump is done it goes back
 to nominal and late transactions are finally processed. The max lag
 would show that something was amiss during the bench, even if the
 average lag is quite low.

Thanks for the detailed explanations. I now understand the function.

 Also I noticed a small bug.

 ./pgbench -R 0 test
 invalid rate limit: 0

 Shouldn't this be treated as if -R is not specified? Actually in the
 program:

 /*
 * When threads are throttled to a given rate limit, this is the target
 * delay
 * to reach that rate in usec.  0 is the default and means no throttling.
 */
 int64throttle_delay = 0;

 So it seems treating "-R 0" as meaning no throttling makes more sense
 to me.
 
 Note that the rate is expressed in tps, which makes sense to users, but
 the internal variable is in usec, which is more useful for scheduling,
 and is the inverse of the other.
 
 So -R 0 would mean zero tps, that is an infinite delay, but a 0 delay
 would require an infinite tps. As requiring 0 tps does not make sense,
 I decided to disable that. If you really feel that -R 0 should mean
 "disable the feature", I'm fine with that, but this is not exactly
 logical wrt tps.

Ok, your statement seems to be fair. Unless someone complains about the
point, I'll leave it as it is.

I'm going to test your patches on Mac OS X and Windows.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Greg Smith

On 7/17/13 2:31 AM, Tatsuo Ishii wrote:

./pgbench -p 5433 -S -T 10 -R 10000 test
average rate limit lag: 862.534 ms (max 2960.913 ms)
tps = 7133.745911 (including connections establishing)
tps = 7135.130810 (excluding connections establishing)

What does "average rate limit lag" mean?


The whole concept of lag with the rate limit is complicated.  At one 
point I thought this should be a debugging detail, rather than exposing 
the user to it.


The problem is that if you do that, you might not notice that your limit 
failed to work as expected.  Maybe it's good enough in a case like this 
that the user will see they tried to limit at 10000, but they only got 
7135, so something must not have worked as expected.


Tatsuo:  most of my tests were on Mac OS and Linux, I actually tested 
the Mac version a lot more than any other here.  I didn't do any testing 
on Windows.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Fabien COELHO



Thanks for the detailed explanations. I now understand the function.


Good. I've looked into the documentation. I'm not sure how I could improve 
it significantly without adding a lot of text which would also add a lot 
of confusion to the casual reader.



I'm going to test your patches on Mac OS X and Windows.


Great! I cannot do that.

--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Fabien COELHO



The whole concept of lag with the rate limit is complicated.


I must agree on that point, their interpretation is subtle.

At one point I thought this should be a debugging detail, rather than 
exposing the user to it. The problem is that if you do that, you might 
not notice that your limit failed to work as expected.  Maybe it's good 
enough in a case like this that the user will see they tried to limit at 
10000, but they only got 7135, so something must not have worked as 
expected.


Yep. As I suggested in answering Tatsuo, the process can catch up 
later, so you could have 10000 in the end even with something amiss.


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Tatsuo Ishii
 The whole concept of lag with the rate limit is complicated.
 
 I must agree on that point, their interpretation is subtle.
 
 At one point I thought this should be a debugging detail, rather than
 exposing the user to it. The problem is that if you do that, you might
 not notice that your limit failed to work as expected.  Maybe it's
 good enough in a case like this that the user will see they tried to
 limit at 10000, but they only got 7135, so something must not have
 worked as expected.
 
 Yep. As I suggested in answering Tatsuo, the process can catch up
 later, so you could have 10000 in the end even with something amiss.

Fabian,

I tried another case. First, I ran pgbench without -R.

$ ./pgbench -p 5433 -S -n -c 10 -T 300 test
./pgbench -p 5433 -S -n -c 10 -T 300 test
transaction type: SELECT only
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 300 s
number of transactions actually processed: 2945652
tps = 9818.741060 (including connections establishing)
tps = 9819.389689 (excluding connections establishing)

So I thought I could squeeze 10000 TPS from my box.
Then I tried with -R 5000 tps.

$ ./pgbench -p 5433 -S -n -c 10 -T 300 -R 5000 test
./pgbench -p 5433 -S -n -c 10 -T 300 -R 5000 test
transaction type: SELECT only
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 300 s
number of transactions actually processed: 1510640
average rate limit lag: 0.304 ms (max 19.101 ms)
tps = 5035.409397 (including connections establishing)
tps = 5035.731093 (excluding connections establishing)

As you can see, I got about 5000 tps as expected. But I'm confused by
the lag:

0.304 ms * 1510640 = 459.2 seconds, which is longer than 300 seconds
(specified by -T). Am I missing something?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Tatsuo Ishii
 The whole concept of lag with the rate limit is complicated.
 
 I must agree on that point, their interpretation is subtle.
 
 At one point I thought this should be a debugging detail, rather than
 exposing the user to it. The problem is that if you do that, you might
 not notice that your limit failed to work as expected.  Maybe it's
 good enough in a case like this that the user will see they tried to
 limit at 10000, but they only got 7135, so something must not have
 worked as expected.
 
 Yep. As I suggested in answering Tatsuo, the process can catch up
 later, so you could have 10000 in the end even with something amiss.
 
 Fabian,
 
 I tried another case. First, I ran pgbench without -R.
 
 $ ./pgbench -p 5433 -S -n -c 10 -T 300 test
 ./pgbench -p 5433 -S -n -c 10 -T 300 test
 transaction type: SELECT only
 scaling factor: 1
 query mode: simple
 number of clients: 10
 number of threads: 1
 duration: 300 s
 number of transactions actually processed: 2945652
 tps = 9818.741060 (including connections establishing)
 tps = 9819.389689 (excluding connections establishing)
 
 So I thought I could squeeze 10000 TPS from my box.
 Then I tried with -R 5000 tps.
 
 $ ./pgbench -p 5433 -S -n -c 10 -T 300 -R 5000 test
 ./pgbench -p 5433 -S -n -c 10 -T 300 -R 5000 test
 transaction type: SELECT only
 scaling factor: 1
 query mode: simple
 number of clients: 10
 number of threads: 1
 duration: 300 s
 number of transactions actually processed: 1510640
 average rate limit lag: 0.304 ms (max 19.101 ms)
 tps = 5035.409397 (including connections establishing)
 tps = 5035.731093 (excluding connections establishing)
 
 As you can see, I got about 5000 tps as expected. But I'm confused by
 the lag:
 
 0.304 ms * 1510640 = 459.2 seconds, which is longer than 300 seconds
 (specified by -T). Am I missing something?

BTW, the system was Linux (kernel 3.0.77).

Now I tried on Mac OS X.

$ pgbench -S -n -c 10 -T 10  test
transaction type: SELECT only
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 10 s
number of transactions actually processed: 67333
tps = 6730.940132 (including connections establishing)
tps = 6751.078966 (excluding connections establishing)

$ pgbench -S -n -c 10 -T 10  -R 3000 test
transaction type: SELECT only
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 10 s
number of transactions actually processed: 29840
average rate limit lag: 0.089 ms (max 27.301 ms)
tps = 2983.707895 (including connections establishing)
tps = 2991.919611 (excluding connections establishing)

0.089 ms * 29840 = 2.66 seconds. Not too bad compared with 10
seconds. On Linux maybe the overhead to calculate the lag is bigger
than on Mac OS X? Just my wild guess though...
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Fabien COELHO



tps = 9818.741060 (including connections establishing)

So I thought I could squeeze 10000 TPS from my box.
Then I tried with -R 5000 tps.

number of transactions actually processed: 1510640
average rate limit lag: 0.304 ms (max 19.101 ms)
tps = 5035.409397 (including connections establishing)

As you can see, I got about 5000 tps as expected.


Yep, it works:-)


But I'm confused by the lag:

0.304 ms * 1510640 = 459.2 seconds, which is longer than 300 seconds
(specified by -T). Am I missing something?


The lag is reasonable, although not too good. One transaction is about 
1.2 ms, the lag is much smaller than that, and you are at about 50% of the 
maximum load. I've got similar figures on my box for such settings. It 
improves if you reduce the number of clients.


If you reduce the number of clients, or add more threads, the lag is 
reduced.



BTW, the system was Linux (kernel 3.0.77).



tps = 6730.940132 (including connections establishing)
$ pgbench -S -n -c 10 -T 10  -R 3000 test



average rate limit lag: 0.089 ms (max 27.301 ms)
tps = 2983.707895 (including connections establishing)

0.089 ms * 29840 = 2.66 seconds. Not too bad compared with 10
seconds.


Indeed, that is better. Transactions are about 1.5 ms and you run at about 
45% of the maximum load here.



On Linux maybe the overhead to calculate the lag is bigger
than on Mac OS X? Just my wild guess though...


I would be surprised if the issue were the cost of computing the 
measure, compared to network connections and the like. With -S the bench 
is cpu bound. Possibly a better scheduler/thread management on OSX? Or 
more available cores? Well, I do not know! At high load with clients 
running on the same box as the server, and with more clients & server than 
available cores, there is a lot of competition between processes, and 
between clients that share a unique thread, and a lot of context switching 
which will result in a measured lag.


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Tatsuo Ishii
 tps = 9818.741060 (including connections establishing)

 So I thought I could squeeze 10000 TPS from my box.
 Then I tried with -R 5000 tps.

 number of transactions actually processed: 1510640
 average rate limit lag: 0.304 ms (max 19.101 ms)
 tps = 5035.409397 (including connections establishing)

 As you can see, I got about 5000 tps as expected.
 
 Yep, it works:-)
 
 But I'm confused by the lag:

 0.304 ms * 1510640 = 459.2 seconds, which is longer than 300 seconds
 (specified by -T). Am I missing something?
 
 The lag is reasonable, although not too good. One transaction is
 about 1.2 ms, the lag is much smaller than that, and you are at about
 50% of the maximum load. I've got similar figures on my box for such
 settings. It improves if you reduce the number of clients.

No, 5000 TPS = 1/5000 = 0.2 ms per transaction, no? However pgbench
says the average lag is 0.304 ms. So the lag is longer than the
transaction itself.

 If you reduce the number of clients, or add more threads, the lag is
 reduced.
 
 BTW, the system was Linux (kernel 3.0.77).
 
 tps = 6730.940132 (including connections establishing)
 $ pgbench -S -n -c 10 -T 10  -R 3000 test
 
 average rate limit lag: 0.089 ms (max 27.301 ms)
 tps = 2983.707895 (including connections establishing)

 0.089 ms * 29840 = 2.66 seconds. Not too bad compared with 10
 seconds.
 
 Indeed, that is better. Transactions are about 1.5 ms and you run at
 about 45% of the maximum load here.
 
 On Linux maybe the overhead to calculate the lag is bigger
 than Mac OS X? Just my wild guess though...
 
 I would be surprised if the issue were the cost of computing the
 measure, compared to network connections and the like. With -S the
 bench is CPU bound. Possibly a better scheduler/thread management on
 OSX? Or more available cores?

The number of cores is the same.  I don't understand why the number of
cores is related, though. Anyway, as you can see in the Mac OS X case,
TPS itself is no better than on the Linux box.

  Well, I do not know! At high load with
 clients running on the same box as the server, and with more clients &
 server processes than available cores, there is a lot of competition between
 processes, and between clients that share a single thread, and a lot of
 context switching which will result in a measured lag.

Hmm... I would like to have a cleaner explanation/evidence before
committing the patch.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Fabien COELHO


Hello Tatsuo,


The lag is reasonable, although not too good. One transaction is
about 1.2 ms, the lag is much smaller than that, and you are at about
50% of the maximum load. I've got similar figures on my box for such
settings. It improves if you reduce the number of clients.


No, 5000 TPS = 1/5000 = 0.2 ms per transaction, no?


Hmmm... Yes, and no:-)

Transactions are handled in parallel because there are 10 clients. I look 
at actual transaction times (latency) from a client perspective, not the 
apparent time due to parallelism, and compare it to the measured 
lag, which is also measured per client.


The transaction time I reported is derived from your maximum tps per 
client: 10 clients / 8300 tps = 1.2 ms per transaction. However, there are 10 
transactions in progress in parallel.


When you're running at 50% load, the clients basically spend 1.2 ms doing 
a transaction (sending queries, getting results), and 1.2 ms sleeping 
because of rate limiting. The reported 0.3 ms lag means that when sleeping 
1.2 ms the client tends to start a little bit later, after 1.5 ms, but this 
latency does not show up in the throughput figures because the next sleep 
will just be smaller, to catch up.
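
To illustrate the catch-up behaviour, here is a toy simulation (this is 
not pgbench code; the names and numbers are made up for the example). One 
wake-up is 300 us late, which is reported as lag once, but the following 
nap is simply shorter, so both the schedule and the throughput recover:

#include <stdio.h>

int main(void)
{
	long	schedule = 0;	/* next scheduled start (us) */
	long	now = 0;		/* simulated clock (us) */
	int		i;

	for (i = 0; i < 4; i++)
	{
		schedule += 2400;	/* 50% load: 1.2 ms tx + 1.2 ms nap */
		if (now < schedule)
			now = schedule;	/* nap until the scheduled start */
		if (i == 1)
			now += 300;		/* the OS wakes us 300 us late, once */
		printf("tx %d starts at %ld us, lag %ld us\n",
			   i, now, now - schedule);
		now += 1200;		/* run the transaction */
	}
	return 0;
}

This prints a 300 us lag for the second transaction and zero for the 
others: the nap before the third transaction is only 900 us instead of 
1200 us, so no throughput is lost.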


As you have 10 clients in one pgbench thread, the schedule says to start 
a new transaction for a client at a certain time, but the pgbench process 
is late to actually handle this client query because it is doing other 
things, like attending to one of the other clients, or being switched out to 
run a server process.


However pgbench says the average lag is 0.304 ms. So the lag is longer than 
the transaction itself.


See above.


I would be surprised if the issue were the cost of computing the
measure, compared to network connections and the like. With -S the
bench is CPU bound. Possibly a better scheduler/thread management on
OSX? Or more available cores?


The number of cores is the same.  I don't understand why the number of cores
is related, though.


In my mind, because pgbench -S is CPU bound, and with -c 10 you have 
to run pgbench and 10 postgres backends, that is 11 processes competing 
for CPU time. If you have 11 cores that is mostly fine; if you have fewer, 
then there will be some delay depending on how the process scheduler in 
the OS allocates CPU time. With a load of 50%, about 6 cores 
should be okay to run the load transparently (client & server).


 Well, I do not know! At high load with clients running on the same box 
as the server, and with more clients & server processes than available 
cores, there is a lot of competition between processes, and between clients 
that share a single thread, and a lot of context switching which will 
result in a measured lag.


Hmm... I would like to have a cleaner explanation/evidence before
committing the patch.


The lag measures you report seem pretty consistent to me given the load 
you're requiring, for a CPU-bound bench, with more processes to run than 
available cores. At least, I'm buying my own explanation, and I hope to be 
convincing.


If you want to isolate yourself from such effects, pgbench must run on a 
different host than the server, with as many threads as there are cores 
available, and not too many clients per thread. This is also true without 
throttling, but it shows more under throttling because of the lag 
(latency) measures.
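
For instance, something along these lines (the host name and numbers are 
only illustrative; this assumes an 8-core client machine driving a remote 
server):

$ pgbench -h dbserver -S -n -j 8 -c 16 -T 300 -R 5000 test

That is one thread per core and only two clients per thread, so a stuck 
client can delay at most one other client sharing its thread.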


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Tatsuo Ishii
Fabien,

 Hello Tatsuo,
 
 The lag is reasonable, although not too good. One transaction is
 about 1.2 ms, the lag is much smaller than that, and you are at about
 50% of the maximum load. I've got similar figures on my box for such
 settings. It improves if you reduce the number of clients.

 No, 5000 TPS = 1/5000 = 0.2 ms per transaction, no?
 
 Hmmm... Yes, and no:-)
 
 Transactions are handled in parallel because there are 10 clients. I
 look at actual transaction times (latency) from a client perspective,
 not the apparent time due to parallelism, and compare it to the
 measured lag, which is also measured per client.
 
 The transaction time I reported is derived from your maximum tps per
 client: 10 clients / 8300 tps = 1.2 ms per transaction. However, there are
 10 transactions in progress in parallel.
 
 When you're running at 50% load, the clients basically spend 1.2 ms
 doing a transaction (sending queries, getting results), and 1.2 ms
 sleeping because of rate limiting. The reported 0.3 ms lag means that
 when sleeping 1.2 ms the client tends to start a little bit later, after
 1.5 ms, but this latency does not show up in the throughput figures
 because the next sleep will just be smaller, to catch up.
 
 As you have 10 clients in one pgbench thread, the schedule says to
 start a new transaction for a client at a certain time, but the
 pgbench process is late to actually handle this client query because
 it is doing other things, like attending to one of the other clients, or
 being switched out to run a server process.
 
 However pgbench says the average lag is 0.304 ms. So the lag is longer
 than the transaction itself.
 
 See above.
 
 I would be surprised if the issue were the cost of computing the
 measure, compared to network connections and the like. With -S the
 bench is CPU bound. Possibly a better scheduler/thread management on
 OSX? Or more available cores?

 The number of cores is the same.  I don't understand why the number of
 cores is related, though.
 
 In my mind, because pgbench -S is CPU bound, and with -c 10 you
 have to run pgbench and 10 postgres backends, that is 11 processes
 competing for CPU time. If you have 11 cores that is mostly fine; if
 you have fewer, then there will be some delay depending on how the
 process scheduler in the OS allocates CPU time. With a
 load of 50%, about 6 cores should be okay to run the load
 transparently (client & server).
 
  Well, I do not know! At high load with clients running on the same box
  as the server, and with more clients & server processes than available
  cores, there is a lot of competition between processes, and between
  clients that share a single thread, and a lot of context switching
  which will result in a measured lag.

 Hmm... I would like to have a cleaner explanation/evidence before
 committing the patch.
 
 The lag measures you report seem pretty consistent to me given the
 load you're requiring, for a CPU-bound bench, with more processes to run
 than available cores. At least, I'm buying my own explanation, and I
 hope to be convincing.
 
 If you want to isolate yourself from such effects, pgbench must run on
 a different host than the server, with as many threads as there are
 cores available, and not too many clients per thread. This is also
 true without throttling, but it shows more under throttling because of
 the lag (latency) measures.

I think I'm starting to understand what's going on.  Suppose there are
n transactions issued by pgbench, and it decides a schedule d(0),
d(1)... d(n). Actually the schedule d(i) (which is stored in
st->until) is decided by the following code:

int64 wait = (int64)
throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);

thread->throttle_trigger += wait;

st->until = thread->throttle_trigger;

st->until represents the time by which the transaction is supposed to
run. Now transaction i finishes at t(i).  So the lag l(i) = t(i)
- d(i) if the transaction is behind.  Then the next transaction i+1
begins. The lag l(i+1) = t(i+1) - d(i+1) and so on. At the end of
pgbench, it shows the average lag as sum(l(0)...l(n))/n.

Now suppose we have 3 transactions and each has following values:

d(0) = 10
d(1) = 20
d(2) = 30

t(0) = 100
t(1) = 110
t(2) = 120

That says pgbench expects a duration of 10 for each
transaction. Actually, the first transaction runs slowly for some
reason and the lag = 100 - 10 = 90. However, tx(1) and tx(2) are
finished on schedule because they spend only 10 (110-100 = 10, 120-110
= 10). So the expected average lag would be 90/3 = 30.

However actually pgbench calculates like this:

average lag = (t(0)-d(0) + t(1)-d(1) + t(2)-d(2))/3
= (100-10 + 110-20 + 120-30)/3
= (90 + 90 + 90)/3
= 90

Looks like too much lag is calculated. The difference between the lag
which pgbench calculates and the expected one will keep growing if a lag
happens earlier. I guess that is why my Linux box shows bigger lag than 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Greg Smith

On 7/17/13 9:16 PM, Tatsuo Ishii wrote:

Now suppose we have 3 transactions and each has following values:

d(0) = 10
d(1) = 20
d(2) = 30

t(0) = 100
t(1) = 110
t(2) = 120

That says pgbench expects a duration of 10 for each
transaction. Actually, the first transaction runs slowly for some
reason and the lag = 100 - 10 = 90. However, tx(1) and tx(2) are
finished on schedule because they spend only 10 (110-100 = 10, 120-110
= 10). So the expected average lag would be 90/3 = 30.


The clients are not serialized here in any significant way, even when 
they share a single process/thread.  I did many rounds of tracing 
through this code with timestamps on each line, and the sequence of 
events here will look like this:


client 0:  send SELECT... to server.  yield to next client.
client 1:  send SELECT... to server.  yield to next client.
client 2:  send SELECT... to server.  yield to next client.
select():  wait for the first response from any client.
client 0:  receive response.  complete transaction, compute lag.
client 1:  receive response.  complete transaction, compute lag.
client 2:  receive response.  complete transaction, compute lag.

There is nothing here that is queuing the clients one after the other. 
If (0) takes 100ms before its reply comes back, (1) and (2) can receive 
their reply back and continue forward at any time.  They are not waiting 
for (0); it has yielded control while waiting for a response.  All three 
times are independent once you reach the select() point where all are 
active.
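
Here is a compilable toy model of that multiplexing, with pipes standing 
in for the server connections (an illustration of the select() pattern, 
not the actual pgbench event loop):

#include <stdio.h>
#include <unistd.h>
#include <sys/select.h>

int main(void)
{
	int		pipes[3][2];
	int		got[3] = {0, 0, 0};
	int		i, done = 0;
	char	c;

	for (i = 0; i < 3; i++)
		pipe(pipes[i]);

	/* pretend the server answers clients 2, 0 and 1, in any order */
	write(pipes[2][1], "x", 1);
	write(pipes[0][1], "x", 1);
	write(pipes[1][1], "x", 1);

	while (done < 3)
	{
		fd_set	rd;
		int		maxfd = 0;

		FD_ZERO(&rd);
		for (i = 0; i < 3; i++)
			if (!got[i])
			{
				FD_SET(pipes[i][0], &rd);
				if (pipes[i][0] > maxfd)
					maxfd = pipes[i][0];
			}
		/* wait for the first response from any still-active client */
		select(maxfd + 1, &rd, NULL, NULL, NULL);
		for (i = 0; i < 3; i++)
			if (!got[i] && FD_ISSET(pipes[i][0], &rd))
			{
				read(pipes[i][0], &c, 1);
				printf("client %d got its response\n", i);
				got[i] = 1;
				done++;
			}
	}
	return 0;
}

A reply that is slow to arrive on one pipe never blocks the reads on the 
others.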


In this situation, if the server gets stuck doing something such that it 
takes 100ms before any client receives a response, it is correct to 
penalize every client for that latency.  All three clients could have 
received the information earlier if the server had any to send them.  If 
they did not, they all were suffering from some sort of lag.


I'm not even sure why you spaced the start times out at intervals of 10. 
 If I were constructing an example like this, I'd have them start at 
times of 0, 1, 2--as fast as the CPU can fire off statements 
basically--and then start waiting from that point.  If client 1 takes 10 
units of time to send its query out before client 2 runs, and the rate 
goal requires 10 units of time, the rate you're asking for is impossible.


For sorting out what's going on with your two systems, I would recommend 
turning on debugging output with -d and looking at the new 
per-transaction latency numbers that the feature reports.  If your 
theory that the lag is going up as the test proceeds is true, that 
should show up in the individual latency numbers too.


Based on what I saw during weeks of testing here, I would be more 
suspicious that there's a system level difference between your two 
servers than to blame the latency calculation.  I saw a *lot* of weird 
system issues myself when I started looking that carefully at sustained 
throughput.  The latency reports from the perspective of Fabien's code 
were always reasonable though.  When something delays every client, it 
counts that against every active client's lag, and that's the right 
thing to do.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Tatsuo Ishii
 On 7/17/13 9:16 PM, Tatsuo Ishii wrote:
 Now suppose we have 3 transactions and each has following values:

 d(0) = 10
 d(1) = 20
 d(2) = 30

 t(0) = 100
 t(1) = 110
 t(2) = 120

 That says pgbench expects a duration of 10 for each
 transaction. Actually, the first transaction runs slowly for some
 reason and the lag = 100 - 10 = 90. However, tx(1) and tx(2) are
 finished on schedule because they spend only 10 (110-100 = 10, 120-110
 = 10). So the expected average lag would be 90/3 = 30.
 
 The clients are not serialized here in any significant way, even when
 they shared a single process/thread.  I did many rounds of tracing
 through this code with timestamps on each line, and the sequence of
 events here will look like this:

My example is for the 1-client case. So concurrent clients are not the
issue here.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Greg Smith

On 7/17/13 11:34 PM, Tatsuo Ishii wrote:

My example is for the 1-client case. So concurrent clients are not the
issue here.


Sorry about that, with your clarification I see what you were trying to 
explain now.  The code initializes the target time like this:


thread->throttle_trigger = INSTR_TIME_GET_MICROSEC(start);

And then each time a transaction fires, it advances the reference time 
forward based on the expected rate:


thread->throttle_trigger += wait;

It does *not* reset thread->throttle_trigger based on when the previous 
transaction ended / when the next transaction started.  If the goal is 
10us transaction times, it beats a steady drum saying the transactions 
should come at 10us, 20us, 30us (on average--there's some randomness in 
the goals).  It does not pay any attention to when the previous 
transactions finished.


That means that if an early transaction takes an extra 1000us, every 
transaction after that will also show as 1000us late--even if all of 
them take 10us.  You expect that those later transactions will show 0 
lag, since they took the right duration.  For that to happen, 
thread->throttle_trigger would need to be re-initialized with the 
current time at the end of each completed transaction.
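
To recheck the arithmetic of the example under both definitions, here is a 
small standalone program (illustrative only; the "reset" variant is the 
alternative definition being discussed, not code from the patch):

#include <stdio.h>

int main(void)
{
	long	d[3] = {10, 20, 30};	/* scheduled start times */
	long	t[3] = {100, 110, 120}; /* actual times */
	long	sched_lag = 0, reset_lag;
	int		i;

	/* schedule-based lag, as the patch reports it */
	for (i = 0; i < 3; i++)
		sched_lag += t[i] - d[i];	/* 90 + 90 + 90 */

	/* reset-based lag: only the first transaction is late by 90; the
	 * later ones come 10 apart, exactly on their per-tx budget */
	reset_lag = (t[0] - d[0]) + (t[1] - t[0] - 10) + (t[2] - t[1] - 10);

	printf("schedule-based average lag: %ld\n", sched_lag / 3);	/* 90 */
	printf("reset-based average lag:    %ld\n", reset_lag / 3);	/* 30 */
	return 0;
}

The two definitions reproduce the 90 vs. 30 numbers from the example 
above.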


The lag computation was not the interesting part of this feature to me. 
 As I said before, I considered it more of a debugging level thing than 
a number people would analyze as much as you did.  I understand why you 
don't like it though.  If the reference time was moved forward to match 
the transaction end each time, I think that would give the lag 
definition you're looking for.  That's fine to me too, if Fabien doesn't 
have a good reason to reject the idea.  We would need to make sure that 
doesn't break some part of the design too.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Tatsuo Ishii
 Sorry about that, with your clarification I see what you were trying
 to explain now.  The code initializes the target time like this:
 
 thread->throttle_trigger = INSTR_TIME_GET_MICROSEC(start);
 
 And then each time a transaction fires, it advances the reference time
 forward based on the expected rate:
 
 thread->throttle_trigger += wait;
 
 It does *not* reset thread->throttle_trigger based on when the
 previous transaction ended / when the next transaction started.  If
 the goal is 10us transaction times, it beats a steady drum saying the
 transactions should come at 10us, 20us, 30us (on average--there's some
 randomness in the goals).  It does not pay any attention to when the
 previous transactions finished.
 
 That means that if an early transaction takes an extra 1000us, every
 transaction after that will also show as 1000us late--even if all of
 them take 10us.  You expect that those later transactions will show 0
 lag, since they took the right duration.  For that to happen,
 thread->throttle_trigger would need to be re-initialized with the
 current time at the end of each completed transaction.

Yes, that's exactly what I understand from the code.

 The lag computation was not the interesting part of this feature to
 me.  As I said before, I considered it more of a debugging level thing
 than a number people would analyze as much as you did.  I understand
 why you don't like it though.  If the reference time was moved forward
 to match the transaction end each time, I think that would give the
 lag definition you're looking for.  That's fine to me too, if Fabien
 doesn't have a good reason to reject the idea.  We would need to make
 sure that doesn't break some part of the design too.

I would like to hear from Fabien about the issue too.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-17 Thread Tatsuo Ishii
 Sorry about that, with your clarification I see what you were trying
 to explain now.  The code initializes the target time like this:
 
 thread->throttle_trigger = INSTR_TIME_GET_MICROSEC(start);
 
 And then each time a transaction fires, it advances the reference time
 forward based on the expected rate:
 
 thread->throttle_trigger += wait;
 
 It does *not* reset thread->throttle_trigger based on when the
 previous transaction ended / when the next transaction started.  If
 the goal is 10us transaction times, it beats a steady drum saying the
 transactions should come at 10us, 20us, 30us (on average--there's some
 randomness in the goals).  It does not pay any attention to when the
 previous transactions finished.
 
 That means that if an early transaction takes an extra 1000us, every
 transaction after that will also show as 1000us late--even if all of
 them take 10us.  You expect that those later transactions will show 0
 lag, since they took the right duration.  For that to happen,
 thread->throttle_trigger would need to be re-initialized with the
 current time at the end of each completed transaction.
 
 Yes, that's exactly what I understand from the code.
 
 The lag computation was not the interesting part of this feature to
 me.  As I said before, I considered it more of a debugging level thing
 than a number people would analyze as much as you did.  I understand
 why you don't like it though.  If the reference time was moved forward
 to match the transaction end each time, I think that would give the
 lag definition you're looking for.  That's fine to me too, if Fabien
 doesn't have a good reason to reject the idea.  We would need to make
 sure that doesn't break some part of the design too.
 
 I would like to hear from Fabien about the issue too.

For your information, included is the patch against git master head to
implement the lag in the way I proposed. With the patch, I get more
consistent numbers on Linux (and Mac OS X).
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 2ad8f0b..57e62dc 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When threads are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -202,11 +208,15 @@ typedef struct
 	int			listen;			/* 0 indicates that an async query has been
  * sent */
 	int			sleeping;		/* 1 indicates that the client is napping */
+	bool		throttling;	/* whether nap is for throttling */
 	int64		until;			/* napping until (usec) */
+	int64		wait;			/* randomly generated delay (usec) */
 	Variable   *variables;		/* array of variable definitions */
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
+	instr_time  txn_begin_throttle;		/* tx start time used when transaction throttling enabled */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		is_throttled;	/* whether transaction throttling is done */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -224,6 +234,9 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+	int64   throttle_trigger; 	/* previous/next throttling (us) */
+	int64   throttle_lag; 		/* total transaction lag behind throttling */
+	int64   throttle_lag_max; 	/* max transaction lag */
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -232,6 +245,8 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64   throttle_lag;
+	int64   throttle_lag_max;
 } TResult;
 
 /*
@@ -356,6 +371,7 @@ usage(void)
 		 -N, --skip-some-updates  skip updates of pgbench_tellers and pgbench_branches\n
 		 -P, --progress=NUM   show thread progress report every NUM seconds\n
 		 -r, --report-latencies   report average latency per command\n
+		 -R, --rate=SPEC  target rate in transactions per second\n
 		 -s, --scale=NUM  report this scale factor in output\n
 		 -S, --select-onlyperform SELECT-only transactions\n
 		 -t, --transactions   number of transactions each client runs 
@@ -898,19 +914,80 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 {
 	PGresult   *res;
 	Command   **commands;
+	bool		trans_needs_throttle = false;
 
 top:
 	commands = sql_files[st->use_file];
 
+	/*
+	 * Handle throttling once per transaction by sleeping.  It is simpler
+	 * to do this here rather than at 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-16 Thread Tatsuo Ishii
 To clarify what state this is all in: Fabien's latest
 pgbench-throttle-v15.patch is the "ready for a committer" version.  The
 last two revisions are just tweaking the comments at this point, and
 his version is more correct than my last one.

Got it. I will take care of this.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-15 Thread Greg Smith
To clarify what state this is all in:  Fabien's latest 
pgbench-throttle-v15.patch is the "ready for a committer" version.  The 
last two revisions are just tweaking the comments at this point, and his 
version is more correct than my last one.


My little pgbench-delay-finish-v1.patch is a brand new bug fix of sorts 
that, while trivial, isn't ready for commit yet.  I'll start a separate 
e-mail thread and CF entry for that later.  Fabien has jumped into 
initial review comments of that already below, but the throttle feature 
isn't dependent on that.  The finish delay just proves that the latency 
spikes I was getting here aren't directly tied to the throttle feature.


On 7/14/13 5:42 PM, Fabien COELHO wrote:

* ISTM that the impact of the chosen 1000 should appear somewhere.


I don't have a problem with that, but I didn't see that the little table 
you included was enough to do that.  I think if someone knows how this 
type of random generation works, they don't need the comment to analyze 
the impact.  And if they don't know, that comment alone wasn't enough to 
help them figure it out.  That's why I added some terms that might help 
point the right way for someone who wanted to search for more 
information instead.


The text of pgbench is not really the right place to put a lecture about 
how to generate numbers with a target probability distribution function. 
 Normally the code comments try to recommend references for this sort 
of thing instead.  I didn't find a really good one in a quick search though.



About 123456 12345 vs 123456.012345: My data parser is usually gnuplot
or my eyes, and in both cases the latter option is better:-)


pgbench-tools uses gnuplot too.  If I were doing this again today from 
scratch, I would recommend using the epoch time format compatible with 
it you suggested.  I need to look into this whole topic a little more 
before we get into that though.  This patch just wasn't the right place 
to get into that change.



About the little patch: Parameter "ok" should be renamed to something
meaningful (say "do_finish"?).


It's saying if the connection finished ok or not.  I think exactly 
what is done with that information is an implementation detail that 
doesn't need to be exposed to the function interface.  We might change 
how this is tied to PQfinish later.



Also, it seems that when timer is
exceeded in doCustom it is called with true, but maybe you intended that
it should be called with false??


The way timeouts are handled right now is a known messy thing.  Exactly 
what you should do with a client that has hit one isn't obvious.  Try 
again?  Close the connection and abort?  The code has a way it handles 
that now, and I didn't want to change it any.



it is important to remove the connection because it serves as a marker
to know whether a client must run or not.


This little hack moved around how clients finished enough to get rid of 
the weird issue with your patch I was bothered by.  You may be right 
that the change isn't really correct because of how the connection is 
compared to null as a way to see if it's active.  I initially added a 
more complicated finished state to the whole mess that tracked this 
more carefully.  I may need to return to that idea now.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-15 Thread Fabien COELHO



* ISTM that the impact of the chosen 1000 should appear somewhere.


I don't have a problem with that, but I didn't see that the little table you 
included was enough to do that.  I think if someone knows how this type of 
random generation works, they don't need the comment to analyze the impact. 
And if they don't know, that comment alone wasn't enough to help them figure 
it out.  That's why I added some terms that might help point the right way 
for someone who wanted to search for more information instead.


Sure. I agree that comments are not the right place for a lecture about 
Poisson stochastic processes. Only the 1000 parameter has an impact on 
the maximum delay that can be incurred with respect to the target average 
delay, and I think that this information is relevant for a comment.
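
Concretely, since getrand(thread, 1, cutoff)/cutoff is at least 1/cutoff, 
a single wait is bounded by -log(1/cutoff) times the target delay. A quick 
standalone check of the multipliers (a sketch, not patch code):

#include <math.h>
#include <stdio.h>

int main(void)
{
	double	cutoffs[] = {1000, 10000, 100000, 1000000};
	int		i;

	for (i = 0; i < 4; i++)
		printf("cutoff %7.0f -> max multiplier %.1f\n",
			   cutoffs[i], -log(1.0 / cutoffs[i]));
	return 0;
}

With the chosen 1000, a single wait can thus reach about 6.9 times the 
target average delay.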


to generate numbers with a target probability distribution function. 
Normally the code comments tries to recommend references for this sort 
of thing instead.  I didn't find a really good one in a quick search 
though.


Yep. Maybe http://en.wikipedia.org/wiki/Exponential_distribution.

--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-14 Thread Greg Smith

On 7/13/13 12:13 PM, Fabien COELHO wrote:

My 0.02€: if it means adding complexity to the pgbench code, I think
that it is not worth it. The point of pgbench is to look at a steady
state, not to end in the most graceful possible way as far as measures
are concerned.


That's how some people use pgbench.  I'm just as likely to use it to 
characterize system latency.  If there's a source of latency that's 
specific to the pgbench code, I want that out of there even if it's hard.


But we don't have to argue about that because it isn't.  The attached 
new patch seems to fix the latency spikes at the end, with -2 lines of 
new code!  With that resolved I did a final pass across the rate limit 
code too, attached as a v14 and ready for a committer.  I don't really 
care what order these two changes are committed, there's no hard 
dependency, but I would like to see them both go in eventually.


No functional code was changed from your v13 except for tweaking the 
output.  The main thing I did was expand/edit comments and rename a few 
variables to try and make this easier to read.  If you have any 
objections to my cosmetic changes feel free to post an update.  I've put 
a good bit of time into trying to simplify this further, thinking it 
can't really be this hard.  But this seems to be the minimum complexity 
that still works given the mess of the pgbench state machine.  Every 
change I try now breaks something.


To wrap up the test results motivating my little pgbench-delay-finish 
patch, the throttled cases that were always giving 100ms of latency 
clustered at the end here now look like this:


average rate limit lag: 0.181 ms (max 53.108 ms)
tps = 10088.727398 (including connections establishing)
tps = 10105.775864 (excluding connections establishing)

There are still some of these cases where latency spikes, but they're 
not as big and they're randomly distributed throughout the run.  The 
problem I had with the ones at the end is how they tended to happen a 
few times in a row.  I kept seeing multiple of these ~50ms lulls adding 
up to a huge one, because the source of the lag kept triggering at every 
connection close.


pgbench was already cleaning up all of its connections at the end, after 
all the transactions were finished.  It looks safe to me to just rely on 
that for calling PQfinish in the normal case.  And calls to client_done 
already label themselves ok or abort, the code just didn't do anything 
with that state before.  I tried adding some more complicated state 
tracking, but that adds complexity while doing the exact same thing as 
the simple implementation I did.


The only part of your code change I reverted was altering the latency 
log transaction timestamps to read like 1373821907.65702 instead of 
1373821907 65702.  Both formats were considered when I added that 
feature, and I completely understand why you'd like to change it.  One 
problem is that doing so introduces a new class of float parsing and 
rounding issues to consumers of that data.  I'd rather not revisit that 
without a better reason to break the output format.  Parsing tools like 
my pgbench-tools already struggle trying to support multiple versions of 
pgbench, and I don't think there's enough benefit to the float format to 
bother breaking them today.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
new file mode 100644
index 08095a9..8dc81e5
*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
*** preparedStatementName(char *buffer, int 
*** 862,875 
  static bool
  clientDone(CState *st, bool ok)
  {
!   /*
!* When the connection finishes normally, don't call PQfinish yet.
!* PQfinish can cause significant delays in other clients that are
!* still running.  Rather than finishing all of them here, in the
!* normal case clients are instead closed in bulk by disconnect_all,
!* after they have all stopped.
!*/
!   if ((st->con != NULL) && ok)
{
PQfinish(st->con);
st->con = NULL;
--- 862,870 
  static bool
  clientDone(CState *st, bool ok)
  {
!   (void) ok;  /* unused */
! 
!   if (st->con != NULL)
{
PQfinish(st->con);
st->con = NULL;
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 80203d6..da88bd7 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ intunlogged_tables = 0;
 double sample_rate = 0.0;
 
 /*
+ * When threads are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64  throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char  *tablespace = NULL;

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-14 Thread Fabien COELHO


Hello Greg,

But we don't have to argue about that because it isn't.  The attached new 
patch seems to fix the latency spikes at the end, with -2 lines of new code!


Hmmm... that looks like not too much complexity:-)

With that resolved I did a final pass across the rate limit code too, 
attached as a v14 and ready for a committer.


You attached my v13. Could you send your v14?

--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-14 Thread Greg Smith

On 7/14/13 2:48 PM, Fabien COELHO wrote:

You attached my v13. Could you send your v14?


Correct patch (and the little one from me again) attached this time.

--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
new file mode 100644
index 08095a9..6564057
*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
*** int unlogged_tables = 0;
*** 137,142 
--- 137,148 
  doublesample_rate = 0.0;
  
  /*
+  * When threads are throttled to a given rate limit, this is the target delay
+  * to reach that rate in usec.  0 is the default and means no throttling.
+  */
+ int64 throttle_delay = 0;
+ 
+ /*
   * tablespace selection
   */
  char *tablespace = NULL;
*** typedef struct
*** 200,210 
--- 206,218 
int listen; /* 0 indicates that an 
async query has been
 * sent */
int sleeping;   /* 1 indicates that the 
client is napping */
+   bool		throttling; /* whether nap is for throttling */
int64   until;  /* napping until (usec) */
Variable   *variables;  /* array of variable definitions */
int nvariables;
instr_time  txn_begin;  /* used for measuring 
transaction latencies */
instr_time  stmt_begin; /* used for measuring statement 
latencies */
+   bool		is_throttled;   /* whether transaction should be 
throttled */
int use_file;   /* index in sql_files 
for this client */
boolprepared[MAX_FILES];
  } CState;
*** typedef struct
*** 222,227 
--- 230,238 
instr_time *exec_elapsed;   /* time spent executing cmds (per 
Command) */
int*exec_count; /* number of cmd executions 
(per Command) */
unsigned short random_state[3]; /* separate randomness for each 
thread */
+   int64   throttle_trigger;   /* previous/next throttling (us) */
+   int64   throttle_lag;   /* total transaction lag behind 
throttling */
+   int64   throttle_lag_max;   /* max transaction lag */
  } TState;
  
  #define INVALID_THREAD((pthread_t) 0)
*** typedef struct
*** 230,235 
--- 241,248 
  {
instr_time  conn_time;
int xacts;
+   int64   throttle_lag;
+   int64   throttle_lag_max;
  } TResult;
  
  /*
*** usage(void)
*** 353,358 
--- 366,372 
 -n, --no-vacuum  do not run VACUUM before tests\n
 -N, --skip-some-updates  skip updates of pgbench_tellers 
and pgbench_branches\n
 -r, --report-latencies   report average latency per 
command\n
+-R, --rate SPEC  target rate in transactions per 
second\n
 -s, --scale=NUM  report this scale factor in 
output\n
 -S, --select-onlyperform SELECT-only 
transactions\n
 -t, --transactions   number of transactions each 
client runs 
*** doCustom(TState *thread, CState *st, ins
*** 900,916 
  {
PGresult   *res;
Command   **commands;
  
  top:
	commands = sql_files[st->use_file];
  
	if (st->sleeping)
{   /* are we 
sleeping? */
instr_time  now;
  
INSTR_TIME_SET_CURRENT(now);
!   if (st->until <= INSTR_TIME_GET_MICROSEC(now))
st->sleeping = 0;   /* Done sleeping, go ahead with 
next command */
else
return true;/* Still sleeping, nothing to 
do here */
}
--- 914,973 
  {
PGresult   *res;
Command   **commands;
+   bool		trans_needs_throttle = false;
  
  top:
	commands = sql_files[st->use_file];
  
+   /*
+* Handle throttling once per transaction by sleeping.  It is simpler
+* to do this here rather than at the end, because so much complicated
+* logic happens below when statements finish.
+*/
+   if (throttle_delay && ! st->is_throttled)
+   {
+   /*
+* Use inverse transform sampling to randomly generate a delay, 
such
+* that the series of delays will approximate a Poisson 
distribution
+* centered on the throttle_delay time.
+*
+* If transactions are too slow or a given wait is shorter than
+* a 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-14 Thread Fabien COELHO


Hello Greg,


Correct patch (and the little one from me again) attached this time.


Please find an updated v15 with only comment changes:

* The new comment about is_throttled was inverted wrt the meaning of the 
variable, at least to my understanding.


* ISTM that the impact of the chosen 1000 should appear somewhere.

* The trans_needs_throttle comment was slightly wrong: the first 
transaction is also throttled, when initially entering doCustom.



About 123456 12345 vs 123456.012345: My data parser is usually gnuplot 
or my eyes, and in both cases the latter option is better:-)



About the little patch: Parameter "ok" should be renamed to something 
meaningful (say "do_finish"?). Also, it seems that when the timer is exceeded 
in doCustom it is called with "true", but maybe you intended that it should 
be called with "false"?? Moreover, under normal circumstances (throttling is 
significantly below the maximum rate) PQfinish will be called directly by 
threadRun while interrupting sleeping threads. Also, it is important to 
remove the connection because it serves as a marker to know whether a 
client must run or not. So basically I do not understand how it works. 
Note that it does not mean that it does not work, it just means that I do 
not really understand:-)


--
Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 8dc81e5..4b379ab 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When threads are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -200,11 +206,13 @@ typedef struct
 	int			listen;			/* 0 indicates that an async query has been
  * sent */
 	int			sleeping;		/* 1 indicates that the client is napping */
+	bool		throttling;	/* whether nap is for throttling */
 	int64		until;			/* napping until (usec) */
 	Variable   *variables;		/* array of variable definitions */
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		is_throttled;	/* whether transaction throttling is done */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -222,6 +230,9 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+	int64   throttle_trigger; 	/* previous/next throttling (us) */
+	int64   throttle_lag; 		/* total transaction lag behind throttling */
+	int64   throttle_lag_max; 	/* max transaction lag */
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -230,6 +241,8 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64   throttle_lag;
+	int64   throttle_lag_max;
 } TResult;
 
 /*
@@ -353,6 +366,7 @@ usage(void)
 		 -n, --no-vacuum  do not run VACUUM before tests\n
 		 -N, --skip-some-updates  skip updates of pgbench_tellers and pgbench_branches\n
 		 -r, --report-latencies   report average latency per command\n
+		 -R, --rate SPEC  target rate in transactions per second\n
 		 -s, --scale=NUM  report this scale factor in output\n
 		 -S, --select-onlyperform SELECT-only transactions\n
 		 -t, --transactions   number of transactions each client runs 
@@ -895,17 +909,62 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 {
 	PGresult   *res;
 	Command   **commands;
+	bool		trans_needs_throttle = false;
 
 top:
 	commands = sql_files[st->use_file];
 
+	/*
+	 * Handle throttling once per transaction by sleeping.  It is simpler
+	 * to do this here rather than at the end, because so much complicated
+	 * logic happens below when statements finish.
+	 */
+	if (throttle_delay && ! st->is_throttled)
+	{
+		/*
+		 * Use inverse transform sampling to randomly generate a delay, such
+		 * that the series of delays will approximate a Poisson distribution
+		 * centered on the throttle_delay time.
+ *
+ * 1000 implies a 6.9 (-log(1/1000)) to 0.0 (log 1.0) delay multiplier.
+		 *
+		 * If transactions are too slow or a given wait is shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64 wait = (int64)
+			throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
+
+		thread->throttle_trigger += wait;
+
+		st->until = thread->throttle_trigger;
+		st->sleeping = 1;
+		st->throttling = true;
+		st->is_throttled = true;
+		if (debug)
+			fprintf(stderr, "client %d throttling " INT64_FORMAT " us\n",
+					st->id, wait);
+	}
+
 	if (st->sleeping)
 	{			/* are we 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-13 Thread Greg Smith

On 6/30/13 2:04 AM, Fabien COELHO wrote:

My guess is the OS. PQfinish or select are system calls that
present opportunities to switch context. I think that the OS is passing
time with other processes on the same host, especially postgres
backends, when it is not with the client.


I went looking for other instances of this issue in pgbench results; 
that's what I got lost in for the last two weeks.  It's subtle because the 
clients normally all end in one very short burst of time, but I have 
found evidence of PQfinish issues elsewhere.  Evidence still seems to 
match the theory that throttling highlights this only because it spreads 
out the ending a bit more.  Also, it happens to be a lot worse on the 
Mac I did initial testing with, and I don't have nearly as many Mac 
pgbench results.


There's a refactoring possible here that seems to make this whole class 
of problem go away.  If I change pgbench so that PQfinish isn't called 
for any client until *all* of the clients are actually finished with 
transactions, the whole issue goes away.  I'm going to package that hack 
the right way into its own little change, revisit the throttling code, 
and then this all should wrap up nicely.  I'd like to get this one out 
of the commitfest so I can move onto looking at something else.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-13 Thread Fabien COELHO


Hello Greg,

There's a refactoring possible here that seems to make this whole class of 
problem go away.  If I change pgbench so that PQfinish isn't called for any 
client until *all* of the clients are actually finished with transactions, 
the whole issue goes away.


Sure. If the explanation is that because of the load the OS is just 
switching to a postgres process while PQfinish is being called, then 
waiting for the end of other transactions means that postgres processes 
don't have anything to do anymore, so process switching is much less 
likely, so nothing would show up... However this is really hiding the 
underlying fact from the measures rather than solving anything, IMHO.



I'm going to package that hack the right way into its own little change,
revisit the throttling code, and then this all should wrap up nicely.


My 0.02€: if it means adding complexity to the pgbench code, I think that 
it is not worth it. The point of pgbench is to look at a steady state, not 
to end in the most graceful possible way as far as measures are concerned. 
On the other hand, if it does not add too much complexity, why not!


I'd like to get this one out of the commitfest so I can move onto 
looking at something else.


Thanks.

--
Fabien.


Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-08 Thread Josh Berkus
On 06/29/2013 04:11 PM, Greg Smith wrote:
 I need to catch up with revisions done to this feature since I started
 instrumenting my copy more heavily.  I hope I can get this ready for
 commit by Monday.  I've certainly beaten on the feature for long enough
 now.

Greg, any progress?  Haven't seen an update on this in 10 days.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-30 Thread Fabien COELHO


[...] Why?  I don't know exactly why, but I am sure that pgbench isn't 
doing anything weird.  It's either libpq acting funny, or the OS.


My guess is the OS. PQfinish or select are system calls that 
present opportunities to switch context. I think that the OS is passing 
time with other processes on the same host, especially postgres backends, 
when it is not with the client. In order to test that, pgbench should run 
on a dedicated box with fewer threads than the number of available cores, 
or user time could be measured in addition to elapsed time. Also, testing 
with many clients per thread means that if any client is stuck all other 
clients incur an artificial latency: measures are intrinsically fragile.


I need to catch up with revisions done to this feature since I started 
instrumenting my copy more heavily.  I hope I can get this ready for 
commit by Monday.


Ok, thanks!

--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-29 Thread Greg Smith

On 6/22/13 12:54 PM, Fabien COELHO wrote:

After some poking around, and pursuing various red herrings, I resorted
to measuring the delay of calling PQfinish(), which is really the only
special thing going on at the end of a pgbench run...


This wasn't what I was seeing, but it's related.  I've proved to myself 
the throttle change isn't responsible for the weird stuff I'm seeing now. 
 I'd like to rearrange when PQfinish happens now based on what I'm 
seeing, but that's not related to this review.


I duplicated the PQfinish problem you found too.  On my Linux system, 
calls to PQfinish are normally about 36 us long.  They will sometimes 
get lost for 15ms before they return.  That's a different problem 
though, because the ones I'm seeing on my Mac are sometimes 150ms. 
PQfinish never takes quite that long.


PQfinish doesn't pause for a long time on this platform.  But it does 
*something* that causes socket select() polling to stutter.  I have 
instrumented everything interesting in this part of the pgbench code, 
and here is the problem event.


1372531862.062236 select with no timeout sleeping=0
1372531862.109111 select returned 6 sockets latency 46875 us

Here select() is called with 0 sleeping processes, 11 that are done, and 
14 that are running.  The running ones have all sent SELECT statements 
to the server, and they are waiting for a response.  Some of them 
received some data from the server, but they haven't gotten the entire 
response back.  (The PQfinish calls could be involved in how that happened)


With that setup, select runs for 47 *ms* before it gets the next byte to 
a client.  During that time 6 clients get responses back to it, but it 
stays stuck in there for a long time anyway.  Why?  I don't know exactly 
why, but I am sure that pgbench isn't doing anything weird.  It's either 
libpq acting funny, or the OS.  When pgbench is waiting on a set of 
sockets, and none of them are returning anything, that's interesting. 
But there's nothing pgbench can do about it.


The cause/effect here is that the randomness in the throttling code 
spreads out when all the connections end a bit. There are more times 
during which you might have 20 connections finished while 5 still run.


I need to catch up with revisions done to this feature since I started 
instrumenting my copy more heavily.  I hope I can get this ready for 
commit by Monday.  I've certainly beaten on the feature for long enough now.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-27 Thread Fabien COELHO


Please find attached a v12, which under timer_exceeded interrupts 
clients which are being throttled instead of waiting for the end of the 
transaction, as the transaction is not started yet.


Please find attached a v13 which fixes conflicts introduced by the long 
options patch committed by Robert Haas.


--
Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 80203d6..da88bd7 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When threads are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -200,11 +206,13 @@ typedef struct
 	int			listen;			/* 0 indicates that an async query has been
  * sent */
 	int			sleeping;		/* 1 indicates that the client is napping */
+	bool		throttling;	/* whether nap is for throttling */
 	int64		until;			/* napping until (usec) */
 	Variable   *variables;		/* array of variable definitions */
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		throttled;  /* whether current transaction was throttled */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -222,6 +230,10 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+	int64   throttle_trigger;  /* previous/next throttling (us) */
+	int64   throttle_lag;  /* total transaction lag behind throttling */
+	int64   throttle_lag_max;  /* max transaction lag */
+
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -230,6 +242,8 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64   throttle_lag;
+	int64   throttle_lag_max;
 } TResult;
 
 /*
@@ -353,6 +367,7 @@ usage(void)
 		 -n, --no-vacuum  do not run VACUUM before tests\n
 		 -N, --skip-some-updates  skip updates of pgbench_tellers and pgbench_branches\n
 		 -r, --report-latencies   report average latency per command\n
+		 -R, --rate SPEC  target rate in transactions per second\n
 		 -s, --scale=NUM  report this scale factor in output\n
 		 -S, --select-onlyperform SELECT-only transactions\n
 		 -t, --transactions   number of transactions each client runs 
@@ -895,17 +910,58 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 {
 	PGresult   *res;
 	Command   **commands;
+	bool		do_throttle = false;
 
 top:
 	commands = sql_files[st->use_file];
 
+	/* handle throttling once per transaction by inserting a sleep.
+	 * this is simpler than doing it at the end.
+	 */
+	if (throttle_delay && ! st->throttled)
+	{
+		/* compute delay to approximate a Poisson distribution
+		 * 1000000 = 13.8 .. 0 multiplier
+		 *  100000 = 11.5 .. 0
+		 *   10000 =  9.2 .. 0
+		 *    1000 =  6.9 .. 0
+		 * if transactions are too slow or a given wait shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64 wait = (int64)
+			throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
+
+		thread-throttle_trigger += wait;
+
+		st-until = thread-throttle_trigger;
+		st-sleeping = 1;
+		st-throttling = true;
+		st-throttled = true;
+		if (debug)
+			fprintf(stderr, client %d throttling INT64_FORMAT us\n,
+	st-id, wait);
+	}
+
 	if (st->sleeping)
 	{							/* are we sleeping? */
 		instr_time	now;
+		int64		now_us;
 
 		INSTR_TIME_SET_CURRENT(now);
-		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
+		now_us = INSTR_TIME_GET_MICROSEC(now);
+		if (st->until <= now_us)
+		{
 			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
+			if (st->throttling)
+			{
+				/* measure lag of throttled transaction */
+				int64		lag = now_us - st->until;
+				thread->throttle_lag += lag;
+				if (lag > thread->throttle_lag_max)
+					thread->throttle_lag_max = lag;
+				st->throttling = false;
+			}
+		}
 		else
 			return true;		/* Still sleeping, nothing to do here */
 	}
@@ -1034,7 +1090,7 @@ top:
 	 * This is more than we really ought to know about
 	 * instr_time
 	 */
-	fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
+	fprintf(logfile, "%d %d %.0f %d %ld.%06ld\n",
 			st->id, st->cnt, usec, st->use_file,
 			(long) now.tv_sec, (long) now.tv_usec);
 #else
@@ -1043,7 +1099,7 @@ top:
 	 * On Windows, instr_time doesn't provide a timestamp
 	 * anyway
 	 */
-	fprintf(logfile, "%d %d %.0f %d 0 0\n",
+	fprintf(logfile, "%d %d %.0f %d 0.0\n",
 			st->id, st->cnt, usec, st->use_file);
 #endif
 }
@@ -1092,6 +1148,13 @@ 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-23 Thread Fabien COELHO


So my conclusion, supported by these experiments, is that the issue is 
somewhere within PQfinish(), possibly in interaction with pgbench doings, 
but is *NOT* related in any way to the throttling patch, as it predates 
it. Greg stumbled upon it because he looked at latencies.


An additional thought:

The latency measures *elapsed* time. As a small laptop is running 30 
clients and their server processes at a significant load, there is a lot 
of context switching going on, so maybe it just happens that the pgbench 
process is switched off and then back on while PQfinish() is running, so 
the point would really be that the host is loaded, and that's it. I'm not 
sure of the likelihood of such an event. It may be more frequent after 
timer_exceeded, because the system is closing postgres processes, and 
would depend on what the kernel process scheduler does.


So the explanation would be: your system is loaded, and it shows in subtle 
ways here and there when you do detailed measures. That is life.


Basically this is a summary of my (long) experience with performance 
experiments on computers. What are you really measuring? What is really 
happening?


When a system is loaded, there are many things which interact with one 
another and induce particular effects on performance measures. So usually 
what is measured is not what one expects.


Greg thought that he was measuring transaction latencies, but it was 
really client contention in a thread. I thought that I was measuring 
PQfinish() time, but maybe it is the probability of a context switch.


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-23 Thread Fabien COELHO



An additional thought:


Yet another thought, hopefully final on this subject.

I think that the probability of a context switch is higher when calling 
PQfinish than in other parts of pgbench because it contains system calls 
(e.g. closing the network connection) where the kernel is likely to stop 
this process and activate another one.


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-22 Thread Fabien COELHO


Dear Robert and Greg,


I think so.  If it doesn't get fixed now, it's not likely to get fixed
later.  And the fact that nobody understands why it's happening is
kinda worrisome...


Possibly, but I think that it is not my fault:-)

So, I spent some time at performance debugging...

First, I finally succeeded in reproducing Greg Smith's spikes on my Ubuntu 
laptop. Getting an actual spike needs short transactions and many clients 
per thread. With the standard pgbench transaction and not too many clients 
per thread it is more of a little bump, or it is not there at all.


After some poking around, and pursuing various red herrings, I resorted to 
measuring the delay of calling PQfinish(), which is really the only 
special thing going on at the end of a pgbench run...


BINGO!

In his tests Greg is using one thread. Once the overall timer is exceeded, 
clients start closing their connections by calling PQfinish once their 
transactions are done. This call takes between a few us and a few ... ms. 
So if some client's closing time hits the bad figures, the transactions of 
other clients are artificially delayed by that time, and they seem to 
have a high latency, but it is really because the thread was inside another 
client's PQfinish and was not available to process the data. If you have 
one thread per client, no problem, especially as the final PQfinish() time 
is not measured at all by pgbench:-) Also, the more clients, the higher 
the spike, because more of them are being stopped and may hit the bad figures.


Here is a trace, with the simple SELECT transaction.

  sh> ./pgbench --rate=14000 -T 10 -r -l -c 30 -S bench
  ...

  sh> less pgbench_log.*

 # short transactions, about 0.250 ms
 ...
 20 4849 241 0 1371916583.455400
 21 4844 256 0 1371916583.455470
 22 4832 348 0 1371916583.455569
 23 4829 218 0 1371916583.455627

   ** TIMER EXCEEDED **

 25 4722 390 0 1371916583.455820
 25 done in 1560 [1,2]  # BING, 1560 us for PQfinish()
 26 4557 1969 0 1371916583.457407
 26 done in 21 [1,2]
 27 4372 1969 0 1371916583.457447
 27 done in 19 [1,2]
 28 4009 1910 0 1371916583.457486
 28 done in 1445 [1,2]  # BOUM
 2 interrupted in 1300 [0,0]# BANG
 3 interrupted in 15 [0,0]
 4 interrupted in 40 [0,0]
 5 interrupted in 203 [0,0] # boum?
 6 interrupted in 1352 [0,0]# ZIP
 7 interrupted in 18 [0,0]
 ...
 23 interrupted in 12 [0,0]

 ## the cumulative PQfinish() time above is about 6 ms, which
 ## appears as apparent latency for the next clients:

  0 4880 6521 0 1371916583.462157
  0 done in 9 [1,2]
  1 4877 6397 0 1371916583.462194
  1 done in 9 [1,2]
 24 4807 6796 0 1371916583.462217
 24 done in 9 [1,2]
 ...

Note that the bad measures also appear when there is no throttling:

  sh> ./pgbench -T 10 -r -l -c 30 -S bench
  sh> grep 'done.*[0-9][0-9][0-9]' pgbench_log.*
   0 done in 1974 [1,2]
   1 done in 312 [1,2]
   3 done in 2159 [1,2]
   7 done in 409 [1,2]
  11 done in 393 [1,2]
  12 done in 2212 [1,2]
  13 done in 1458 [1,2]
  ## other clients execute PQfinish in less than 100 us

This "done" is issued by my instrumented version of clientDone().

The issue also appears if I instrument pgbench from master, without 
anything from the throttling patch applied at all:


  sh> git diff master
  diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
  index 1303217..7c5ea81 100644
  --- a/contrib/pgbench/pgbench.c
  +++ b/contrib/pgbench/pgbench.c
  @@ -869,7 +869,15 @@ clientDone(CState *st, bool ok)

   if (st->con != NULL)
   {
   +   instr_time now;
   +   int64 s0, s1;
   +   INSTR_TIME_SET_CURRENT(now);
   +   s0 = INSTR_TIME_GET_MICROSEC(now);
       PQfinish(st->con);
   +   INSTR_TIME_SET_CURRENT(now);
   +   s1 = INSTR_TIME_GET_MICROSEC(now);
   +   fprintf(stderr, "%d done in %ld [%d,%d]\n",
   +           st->id, s1-s0, st->listen, st->state);
       st->con = NULL;
   }
   return false;   /* always false */

  sh> ./pgbench -T 10 -r -l -c 30 -S bench 2> x.err

  sh> grep 'done.*[0-9][0-9][0-9]' x.err
  14 done in 1985 [1,2]
  16 done in 276 [1,2]
  17 done in 1418 [1,2]

So my conclusion, supported by these experiments, is that the issue is 
somewhere within PQfinish(), possibly in interaction with pgbench doings, 
but is *NOT* related in any way to the throttling patch, as it predates 
it. Greg stumbled upon it because he looked at latencies.


I'll submit a slightly improved v12 shortly.

--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-22 Thread Fabien COELHO


Please find attached a v12 which, under timer_exceeded, interrupts clients 
that are being throttled instead of waiting for the end of the 
transaction, as the transaction has not started yet.


I've also changed the log format that I used for debugging the apparent 
latency issue:


  x y z 12345677 1234  ->  x y z 12345677.001234

It seems much clearer that way.

--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-22 Thread Fabien COELHO


Please find attached a v12 which, under timer_exceeded, interrupts clients 
that are being throttled instead of waiting for the end of the 
transaction, as the transaction has not started yet.


Oops, I forgot the attachment. Here it is!

--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 1303217..4c9e55d 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When threads are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -200,11 +206,13 @@ typedef struct
 	int			listen;			/* 0 indicates that an async query has been
  * sent */
 	int			sleeping;		/* 1 indicates that the client is napping */
+	bool		throttling;		/* whether nap is for throttling */
 	int64		until;			/* napping until (usec) */
 	Variable   *variables;		/* array of variable definitions */
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		throttled;  /* whether current transaction was throttled */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -222,6 +230,10 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+	int64   throttle_trigger;  /* previous/next throttling (us) */
+	int64   throttle_lag;  /* total transaction lag behind throttling */
+	int64   throttle_lag_max;  /* max transaction lag */
+
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -230,6 +242,8 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64   throttle_lag;
+	int64   throttle_lag_max;
 } TResult;
 
 /*
@@ -355,6 +369,8 @@ usage(void)
 		   "  -n           do not run VACUUM before tests\n"
 		   "  -N           do not update tables \"pgbench_tellers\" and \"pgbench_branches\"\n"
 		   "  -r           report average latency per command\n"
+		   "  -R SPEC, --rate SPEC\n"
+		   "               target rate in transactions per second\n"
 		   "  -s NUM       report this scale factor in output\n"
 		   "  -S           perform SELECT-only transactions\n"
 		   "  -t NUM       number of transactions each client runs (default: 10)\n"
@@ -898,17 +914,58 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 {
 	PGresult   *res;
 	Command   **commands;
+	bool		do_throttle = false;
 
 top:
 	commands = sql_files[st->use_file];
 
+	/* handle throttling once per transaction by inserting a sleep.
+	 * this is simpler than doing it at the end.
+	 */
+	if (throttle_delay && !st->throttled)
+	{
+		/* compute delay to approximate a Poisson distribution
+		 * 1000000 => 13.8 .. 0 multiplier
+		 *  100000 => 11.5 .. 0
+		 *   10000 =>  9.2 .. 0
+		 *    1000 =>  6.9 .. 0
+		 * if transactions are too slow or a given wait shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64		wait = (int64)
+			throttle_delay * -log(getrand(thread, 1, 1000) / 1000.0);
+
+		thread->throttle_trigger += wait;
+
+		st->until = thread->throttle_trigger;
+		st->sleeping = 1;
+		st->throttling = true;
+		st->throttled = true;
+		if (debug)
+			fprintf(stderr, "client %d throttling " INT64_FORMAT " us\n",
+					st->id, wait);
+	}
+
 	if (st->sleeping)
 	{							/* are we sleeping? */
 		instr_time	now;
+		int64		now_us;
 
 		INSTR_TIME_SET_CURRENT(now);
-		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
+		now_us = INSTR_TIME_GET_MICROSEC(now);
+		if (st->until <= now_us)
+		{
 			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
+			if (st->throttling)
+			{
+				/* measure lag of throttled transaction */
+				int64		lag = now_us - st->until;
+				thread->throttle_lag += lag;
+				if (lag > thread->throttle_lag_max)
+					thread->throttle_lag_max = lag;
+				st->throttling = false;
+			}
+		}
 		else
 			return true;		/* Still sleeping, nothing to do here */
 	}
@@ -1037,7 +1094,7 @@ top:
 	 * This is more than we really ought to know about
 	 * instr_time
 	 */
-	fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
+	fprintf(logfile, "%d %d %.0f %d %ld.%06ld\n",
 			st->id, st->cnt, usec, st->use_file,
 			(long) now.tv_sec, (long) now.tv_usec);
 #else
@@ -1046,7 +1103,7 @@ top:
 	 * On Windows, instr_time doesn't provide a timestamp
 	 * anyway
 	 */
-	fprintf(logfile, "%d %d %.0f %d 0 0\n",
+	fprintf(logfile, "%d %d %.0f %d 0.0\n",
 			st->id, st->cnt, usec, st->use_file);
 #endif
 }
@@ -1095,6 +1152,13 @@ top:
 			st->state = 0;
 			st->use_file = (int) getrand(thread, 0, num_files - 1);
 			commands = 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-21 Thread Robert Haas
On Wed, Jun 19, 2013 at 2:42 PM, Fabien COELHO coe...@cri.ensmp.fr wrote:
 number of transactions actually processed: 301921
 Just a thought before spending too much time on this subtle issue.

 The patch worked reasonably for 301900 transactions in your above run, and
 the last few, fewer than the number of clients, show strange latency
 figures which suggest that something is amiss in some corner case when
 pgbench is stopping. However, the point of pgbench is to test a steady
 state, not to achieve the cleanest stop at the end of a run.

 So my question is: should this issue be a blocker wrt the feature?

I think so.  If it doesn't get fixed now, it's not likely to get fixed
later.  And the fact that nobody understands why it's happening is
kinda worrisome...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-19 Thread Fabien COELHO



I'm still getting the same sort of pauses waiting for input with your v11.


Alas.

This is a pretty frustrating problem; I've spent about two days so far trying 
to narrow down how it happens.  I've attached the test program I'm using.  It 
seems related to my picking a throttled rate that's close to (but below) the 
maximum possible on your system.  I'm using 10,000 on a system that can do 
about 16,000 TPS when running pgbench in debug mode.


This problem is 100% reproducible here; happens every time.  This is a laptop 
running Mac OS X.  It's possible the problem is specific to that platform. 
I'm doing all my tests with the database itself setup for development, with 
debug and assertions on.  The lag spikes seem smaller without assertions on, 
but they are still there.


Here's a sample:

transaction type: SELECT only


What is this test script? I'm using plain pgbench for my tests.


scaling factor: 10
query mode: simple
number of clients: 25
number of threads: 1
duration: 30 s
number of transactions actually processed: 301921
average transaction lag: 1.133 ms (max 137.683 ms)
tps = 10011.527543 (including connections establishing)
tps = 10027.834189 (excluding connections establishing)

And those slow ones are all at the end of the latency log file, as shown in 
column 3 here:


22 11953 3369 0 1371578126 954881
23 11926 3370 0 1371578126 954918
3 12238 30310 0 1371578126 984634
7 12205 30350 0 1371578126 984742
8 12207 30359 0 1371578126 984792
11 12176 30325 0 1371578126 984837
13 12074 30292 0 1371578126 984882
0 12288 175452 0 1371578127 126340
9 12194 171948 0 1371578127 126421
12 12139 171915 0 1371578127 126466
24 11876 175657 0 1371578127 126507


Indeed, there are two spikes, but not all clients are concerned.

As I have not seen that, debugging is hard. I'll give it a try 
tomorrow.


When no one is sleeping, the timeout becomes infinite, so only returning data 
will break it.  This is intended behavior though.


This is not coherent. Under --throttle there should basically always be 
someone asleep, unless the server cannot cope with the load and *all* 
transactions are late. A no-timeout state looks pretty unrealistic, 
because it means that there is no throttling.


I don't think the st-listen related code has anything to do with this 
either.  That flag is only used to track when clients have completed sending 
their first query over to the server.  Once reaching that point once, 
afterward they should be listening for results each time they exit the 
doCustom() code.


This assumption seems false if you can have a sleep at the beginning of 
the sequence, which is what throttling does, but which any custom script 
can also do: the client is expected to wait before sending any command, 
so there can be no select underway in that case.


So listen should be set to 1 when a select has been sent, and set back to 0 
when the result data have all been received.
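
A minimal sketch of that discipline (illustrative only, using the libpq 
calls pgbench already relies on; "command" stands for the current query):

	/* Illustrative sketch, not a patch: "listen" tracks whether a query
	 * is in flight for this client. */
	if (PQsendQuery(st->con, command) == 0)
		return clientDone(st, false);	/* send failed: abort this client */
	st->listen = 1;		/* a select has been sent: results are expected */

	/* ... and once PQgetResult() has returned NULL, i.e. all result data
	 * for the query has been consumed: */
	st->listen = 0;		/* nothing in flight any more */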


doCustom makes implicit assumptions about what is going on, whereas it 
should focus on looking at the incoming state, performing operations, and 
leaving with a state which corresponds to the actual status, without 
assumptions about what is going to happen next.


st-listen goes to 1 very soon after startup and then it stays there, 
and that logic seems fine too.


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-19 Thread Fabien COELHO



number of transactions actually processed: 301921


Just a thought before spending too much time on this subtle issue.

The patch worked reasonably for 301900 transactions in your above run, 
and the last few, fewer than the number of clients, show strange 
latency figures which suggest that something is amiss in some corner case 
when pgbench is stopping. However, the point of pgbench is to test a 
steady state, not to achieve the cleanest stop at the end of a run.


So my question is: should this issue be a blocker wrt the feature?

--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-18 Thread Greg Smith
I'm still getting the same sort of pauses waiting for input with your 
v11.  This is a pretty frustrating problem; I've spent about two days so 
far trying to narrow down how it happens.  I've attached the test 
program I'm using.  It seems related to my picking a throttled rate 
that's close to (but below) the maximum possible on your system.  I'm 
using 10,000 on a system that can do about 16,000 TPS when running 
pgbench in debug mode.


This problem is 100% reproducible here; happens every time.  This is a 
laptop running Mac OS X.  It's possible the problem is specific to that 
platform.  I'm doing all my tests with the database itself setup for 
development, with debug and assertions on.  The lag spikes seem smaller 
without assertions on, but they are still there.


Here's a sample:

transaction type: SELECT only
scaling factor: 10
query mode: simple
number of clients: 25
number of threads: 1
duration: 30 s
number of transactions actually processed: 301921
average transaction lag: 1.133 ms (max 137.683 ms)
tps = 10011.527543 (including connections establishing)
tps = 10027.834189 (excluding connections establishing)

And those slow ones are all at the end of the latency log file, as shown 
in column 3 here:


22 11953 3369 0 1371578126 954881
23 11926 3370 0 1371578126 954918
3 12238 30310 0 1371578126 984634
7 12205 30350 0 1371578126 984742
8 12207 30359 0 1371578126 984792
11 12176 30325 0 1371578126 984837
13 12074 30292 0 1371578126 984882
0 12288 175452 0 1371578127 126340
9 12194 171948 0 1371578127 126421
12 12139 171915 0 1371578127 126466
24 11876 175657 0 1371578127 126507

Note that there are two long pauses here, which happen maybe half of 
the time.  There's a 27 ms pause and then a 145 ms one.


The exact sequence for when the pauses show up is:

-All remaining clients have sent their SELECT statement and are waiting 
for a response.  They are not sleeping, they're waiting for the server 
to return a query result.
-A client receives part of the data it wants, but there is still data 
left.  This is the path through pgbench.c where the if 
(PQisBusy(st->con)) test is true after receiving some information.  I 
hacked up some logging that distinguishes those as a "client %d partial 
receive" to make this visible.
-A select() call is made with no timeout, so it can only be satisfied by 
more data being returned.
-Around ~100ms (can be less, can be more) goes by before that select() 
returns more data to the client, where normally it would happen in ~2ms.


You were concerned about a possible bug in the timeout code.  If you 
hack up the select statement to show some state information, the setup 
for the pauses at the end always looks like this:


calling select min_usec=9223372036854775807 sleeping=0

When no one is sleeping, the timeout becomes infinite, so only returning 
data will break it.  This is intended behavior though.  This exact same 
sequence and select() call parameters happen in pgbench code every time 
at the end of a run.  But clients without the throttling patch applied 
never seem to get into the state where they pause for so long, waiting 
for one of the active clients to receive the next bit of result.
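
To make that infinite-timeout state concrete, here is a simplified, 
self-contained sketch of the kind of computation the thread loop performs; 
names and layout are approximations, not the actual pgbench source:

#include <stdint.h>
#include <sys/select.h>

/* Sketch: derive a select() timeout from the nap deadlines of sleeping
 * clients.  Returns 1 when no client sleeps, meaning the caller should
 * pass a NULL timeout and block until the server sends data. */
static int
compute_timeout(const int64_t *until, const int *sleeping, int nclients,
                int64_t now_usec, struct timeval *timeout)
{
	int64_t		min_usec = INT64_MAX;	/* "infinite" sentinel */

	for (int i = 0; i < nclients; i++)
	{
		int64_t		this_usec;

		if (!sleeping[i])
			continue;			/* waiting on the server, not napping */
		this_usec = until[i] - now_usec;
		if (this_usec < 0)
			this_usec = 0;		/* overdue: wake up immediately */
		if (this_usec < min_usec)
			min_usec = this_usec;
	}

	if (min_usec == INT64_MAX)
		return 1;				/* caller passes NULL: only data wakes us */

	timeout->tv_sec = min_usec / 1000000;
	timeout->tv_usec = min_usec % 1000000;
	return 0;
}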


I don't think the st-listen related code has anything to do with this 
either.  That flag is only used to track when clients have completed 
sending their first query over to the server.  Once reaching that point 
once, afterward they should be listening for results each time they 
exit the doCustom() code.  st-listen goes to 1 very soon after startup 
and then it stays there, and that logic seems fine too.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
#!/bin/bash
make
make install
rm pgbench_log* nohup.out
#nohup pgbench -c 25 -T 30 -l -d -f select.sql -s 10 pgbench
nohup pgbench -c 25 -T 30 -l -d -S -R 10000 pgbench
tail -n 50 pgbench_log*




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-15 Thread Fabien COELHO


Hello Greg,

I've done some more testing with the test patch. I have not seen any 
spike at the end of the throttled run.


The attached version 11 patch ensures that throttle-added sleeps are 
not included in latency measures (-r) and that throttling is performed 
right at the beginning of a transaction. There is an ugly goto to do that.


I think there is still a latent bug in the code with "listen", which should 
be set back to 0 in other places, but this bug is already there.


--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 1303217..2fc6f6a 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When threads are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -205,6 +211,7 @@ typedef struct
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		throttled;  /* whether current transaction was throttled */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -222,6 +229,10 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+	int64   throttle_trigger;  /* previous/next throttling (us) */
+	int64   throttle_lag;  /* total transaction lag behind throttling */
+	int64   throttle_lag_max;  /* max transaction lag */
+
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -230,6 +241,8 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64   throttle_lag;
+	int64   throttle_lag_max;
 } TResult;
 
 /*
@@ -355,6 +368,8 @@ usage(void)
 		   "  -n           do not run VACUUM before tests\n"
 		   "  -N           do not update tables \"pgbench_tellers\" and \"pgbench_branches\"\n"
 		   "  -r           report average latency per command\n"
+		   "  -R SPEC, --rate SPEC\n"
+		   "               target rate in transactions per second\n"
 		   "  -s NUM       report this scale factor in output\n"
 		   "  -S           perform SELECT-only transactions\n"
 		   "  -t NUM       number of transactions each client runs (default: 10)\n"
@@ -898,17 +913,56 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 {
 	PGresult   *res;
 	Command   **commands;
+	bool		do_throttle = false;
 
 top:
 	commands = sql_files[st->use_file];
 
+	/* handle throttling once per transaction by inserting a sleep.
+	 * this is simpler than doing it at the end.
+	 */
+	if (throttle_delay && !st->throttled)
+	{
+		/* compute delay to approximate a Poisson distribution
+		 * 1000000 => 13.8 .. 0 multiplier
+		 *  100000 => 11.5 .. 0
+		 *   10000 =>  9.2 .. 0
+		 *    1000 =>  6.9 .. 0
+		 * if transactions are too slow or a given wait shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64		wait = (int64)
+			throttle_delay * -log(getrand(thread, 1, 1000) / 1000.0);
+
+		thread->throttle_trigger += wait;
+
+		st->until = thread->throttle_trigger;
+		st->sleeping = 1;
+		st->throttled = true;
+		if (debug)
+			fprintf(stderr, "client %d throttling " INT64_FORMAT " us\n",
+					st->id, wait);
+	}
+
 	if (st->sleeping)
 	{							/* are we sleeping? */
 		instr_time	now;
+		int64		now_us;
 
 		INSTR_TIME_SET_CURRENT(now);
-		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
+		now_us = INSTR_TIME_GET_MICROSEC(now);
+		if (st->until <= now_us)
+		{
 			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
+			if (throttle_delay && st->state == 0)
+			{
+				/* measure lag of throttled transaction */
+				int64		lag = now_us - st->until;
+				thread->throttle_lag += lag;
+				if (lag > thread->throttle_lag_max)
+					thread->throttle_lag_max = lag;
+			}
+		}
 		else
 			return true;		/* Still sleeping, nothing to do here */
 	}
@@ -1095,6 +1149,9 @@ top:
 			st->state = 0;
 			st->use_file = (int) getrand(thread, 0, num_files - 1);
 			commands = sql_files[st->use_file];
+			st->throttled = false;
+			st->listen = 0;
+			do_throttle = (throttle_delay > 0);
 		}
 	}
 
@@ -1113,6 +1170,12 @@ top:
 		INSTR_TIME_ACCUM_DIFF(*conn_time, end, start);
 	}
 
+	if (do_throttle) {
+		/* delay throttling after reopening the connection */
+		do_throttle = false;
+		goto top;
+	}
+
 	/* Record transaction start time if logging is enabled */
 	if (logfile && st->state == 0)
 		INSTR_TIME_SET_CURRENT(st->txn_begin);
@@ -2017,7 +2080,8 @@ process_builtin(char *tb)
 static void
 printResults(int ttype, int normal_xacts, int nclients,
 			 TState *threads, int nthreads,
-			 instr_time total_time, instr_time conn_total_time)
+			 instr_time 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-14 Thread Greg Smith

On 6/12/13 3:19 AM, Fabien COELHO wrote:

If you are still worried: if you run the very same command without
throttling and measure the same latency, does the same thing happens at
the end? My guess is that it should be yes. If it is no, I'll try out
pgbench-tools.


It looks like it happens rarely for one client without the rate limit, 
but that increases to every time for multiple clients with limiting in 
place.  pgbench-tools just graphs the output from the latency log. 
Here's a setup that runs the test I'm doing:


$ createdb pgbench
$ pgbench -i -s 10 pgbench
$ pgbench -S -c 25 -T 30 -l pgbench && tail -n 40 pgbench_log*

Sometimes there are no slow entries, but I've seen this once so far:

0 21822 1801 0 1371217462 945264
1 21483 1796 0 1371217462 945300
8 20891 1931 0 1371217462 945335
14 20520 2084 0 1371217462 945374
15 20517 1991 0 1371217462 945410
16 20393 1928 0 1371217462 945444
17 20183 2000 0 1371217462 945479
18 20277 2209 0 1371217462 945514
23 20316 2114 0 1371217462 945549
22 20267 250128 0 1371217463 193656

The third column is the latency for that transaction.  Notice how it's a 
steady ~2000 us except for the very last transaction, which takes 
250,128 us.  That's the odd part: these short SELECT statements 
should never take that long.  It suggests something weird is 
happening with how the client exits; probably its latency number is 
being computed after more work than it should include.


Here's what a rate limited run looks like for me.  Note that I'm still 
using the version I re-submitted since that's where I ran into this 
issue, I haven't merged your changes to split the rate among each client 
here--which means this is 400 TPS per client == 10,000 TPS total:


$ pgbench -S -c 25 -T 30 -R 400 -l pgbench && tail -n 40 pgbench_log

7 12049 2070 0 1371217859 195994
22 12064 2228 0 1371217859 196115
18 11957 1570 0 1371217859 196243
23 12130 989 0 1371217859 196374
8 11922 1598 0 1371217859 196646
11 12229 4833 0 1371217859 196702
21 11981 1943 0 1371217859 196754
20 11930 1026 0 1371217859 196799
14 11990 13119 0 1371217859 208014
^^^ fast section
vvv delayed section
1 11982 91926 0 1371217859 287862
2 12033 116601 0 1371217859 308644
6 12195 115957 0 1371217859 308735
17 12130 114375 0 1371217859 308776
0 12026 115507 0 1371217859 308822
3 11948 118228 0 1371217859 308859
4 12061 113484 0 1371217859 308897
5 12110 113586 0 1371217859 308933
9 12032 117744 0 1371217859 308969
10 12045 114626 0 1371217859 308989
12 11953 113372 0 1371217859 309030
13 11883 114405 0 1371217859 309066
15 12018 116069 0 1371217859 309101
16 11890 115727 0 1371217859 309137
19 12140 114006 0 1371217859 309177
24 11884 115782 0 1371217859 309212

There's almost 90,000 usec of latency showing up between epoch time 
1371217859.208014 and 1371217859.287862 here.  What's weird about it is 
that the longer the test runs, the larger the gap is.  If collecting the 
latency data itself caused the problem, that would make sense, so maybe 
this is related to flushing that out to disk.


If you want to look just at the latency numbers without the other 
columns in the way you can use:


cat pgbench_log.* | awk {'print $3'}

That is how I was evaluating the smoothness of the rate limit, by 
graphing those latency values.  pgbench-tools takes those and a derived 
TPS/s number and plots them, which made it easier for me to spot this 
weirdness.  But I've already moved on to analyzing the raw latency data 
instead; I can see the issue without the graph once I've duplicated the 
conditions.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-14 Thread Greg Smith
I don't have this resolved yet, but I think I've identified the cause. 
Updating here mainly so Fabien doesn't duplicate my work trying to track 
this down.  I'm going to keep banging at this until it's resolved now 
that I got this far.


Here's a slow transaction:

1371226017.568515 client 1 executing \set naccounts 100000 * :scale
1371226017.568537 client 1 throttling 6191 us
1371226017.747858 client 1 executing \setrandom aid 1 :naccounts
1371226017.747872 client 1 sending SELECT abalance FROM pgbench_accounts 
WHERE aid = 268721;

1371226017.789816 client 1 receiving

That confirms it is getting stuck at the throttling step.  Looks like 
the code pauses there because it's trying to overload the "sleeping" 
state that was already in pgbench, but handle it in a special way inside 
of doCustom(), and that doesn't always work.


The problem is that pgbench doesn't always stay inside doCustom when a 
client sleeps.  It exits there to poll for incoming messages from the 
other clients, via select() on a shared socket.  It's not safe to assume 
doCustom will be running regularly; that's only true if clients keep 
returning messages.


So as long as other clients keep banging on the shared socket, doCustom 
is called regularly, and everything works as expected.  But at the end 
of the test run that happens less often, and that's when the problem 
shows up.


pgbench already has a \sleep command, and the way that delay is 
handled happens inside threadRun() instead.  The pausing of the rate 
limit throttle needs to operate in the same place.  I have to redo a few 
things to confirm this actually fixes the issue, as well as look at 
Fabien's later updates to this since I wandered off debugging.  I'm sure 
it's in the area of code I'm poking at now though.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-14 Thread Fabien COELHO


pgbench already has a \sleep command, and the way that delay is 
handled happens inside threadRun() instead.  The pausing of the rate 
limit throttle needs to operate in the same place.


It does operate at the same place. The throttling is performed by 
inserting a sleep first thing when starting a new transaction. So if 
there is a weirdness, it should show as well without throttling but with a 
fixed \sleep instead?
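
For reference, the delay computation is small enough to demonstrate 
standalone. In this sketch getrand() is replaced by stdlib rand() for 
self-containment (the real code uses pgbench's per-thread random state); 
the waits are exponentially distributed with mean throttle_delay, which is 
what makes the transaction start times approximate a Poisson process:

#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Standalone sketch of the throttle-delay computation. */
static int64_t
throttle_wait(int64_t throttle_delay)
{
	/* uniform in (0.001, 1.0]: same 1/1000 granularity as the patch */
	double		u = (rand() % 1000 + 1) / 1000.0;

	return (int64_t) (throttle_delay * -log(u));
}

int
main(void)
{
	int64_t		total = 0;
	int			n = 100000;

	for (int i = 0; i < n; i++)
		total += throttle_wait(100);	/* 100 us target, i.e. ~10000 tps */

	/* the average should come out close to the 100 us target */
	printf("average wait: %.2f us\n", (double) total / n);
	return 0;
}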


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-14 Thread Fabien COELHO


Hello Greg,

I think that the weirdness really comes from the way transaction times 
are measured, their interactions with throttling, and latent bugs in the 
code.


One issue is that the throttling time was included in the measurement, but 
not the first time, because txn_begin is not set at the beginning of 
doCustom.


Also, flag st->listen is set to 1 but *never* set back to 0...
 sh> grep listen pgbench.c
int listen;
if (st->listen)
st->listen = 1;
st->listen = 1;
st->listen = 1;
st->listen = 1;
st->listen = 1;
st->listen = 1;

ISTM that I can fix the weirdness by inserting an ugly "goto top;", but 
I would feel better about it by removing all gotos and reordering some 
actions in doCustom in a more logical way. However, that would be a bigger 
patch.


Please find attached 2 patches:

 - the first is the full throttle patch which ensures that the
   txn_begin is taken at a consistent point, after throttling,
   which requires resetting listen. There is an ugly goto.
   I've also put times in a consistent format in the log,
   789.012345 instead of 789 12345.

 - the second patch just shows the diff between v10 and the first one.

--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 8c202bf..dc4f819 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When threads are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -205,6 +211,7 @@ typedef struct
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		throttled;  /* whether current transaction was throttled */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -222,6 +229,10 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+	int64   throttle_trigger;  /* previous/next throttling (us) */
+	int64   throttle_lag;  /* total transaction lag behind throttling */
+	int64   throttle_lag_max;  /* max transaction lag */
+
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -230,6 +241,8 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64   throttle_lag;
+	int64   throttle_lag_max;
 } TResult;
 
 /*
@@ -355,6 +368,8 @@ usage(void)
 		   "  -n           do not run VACUUM before tests\n"
 		   "  -N           do not update tables \"pgbench_tellers\" and \"pgbench_branches\"\n"
 		   "  -r           report average latency per command\n"
+		   "  -R SPEC, --rate SPEC\n"
+		   "               target rate in transactions per second\n"
 		   "  -s NUM       report this scale factor in output\n"
 		   "  -S           perform SELECT-only transactions\n"
 		   "  -t NUM       number of transactions each client runs (default: 10)\n"
@@ -898,17 +913,56 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 {
 	PGresult   *res;
 	Command   **commands;
+	bool		do_throttle = false;
 
 top:
	commands = sql_files[st->use_file];
 
+	/* handle throttling once per transaction by inserting a sleep.
+	 * this is simpler than doing it at the end.
+	 */
+	if (throttle_delay && !st->throttled)
+	{
+		/* compute delay to approximate a Poisson distribution
+		 * 1000000 => 13.8 .. 0 multiplier
+		 *  100000 => 11.5 .. 0
+		 *   10000 =>  9.2 .. 0
+		 *    1000 =>  6.9 .. 0
+		 * if transactions are too slow or a given wait shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64		wait = (int64)
+			throttle_delay * -log(getrand(thread, 1, 1000) / 1000.0);
+
+		thread->throttle_trigger += wait;
+
+		st->until = thread->throttle_trigger;
+		st->sleeping = 1;
+		st->throttled = true;
+		if (debug)
+			fprintf(stderr, "client %d throttling " INT64_FORMAT " us\n",
+					st->id, wait);
+	}
+
 	if (st->sleeping)
 	{							/* are we sleeping? */
 		instr_time	now;
+		int64		now_us;
 
 		INSTR_TIME_SET_CURRENT(now);
-		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
+		now_us = INSTR_TIME_GET_MICROSEC(now);
+		if (st->until <= now_us)
+		{
 			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
+			if (throttle_delay && st->state == 0)
+			{
+				/* measure lag of throttled transaction */
+				int64		lag = now_us - st->until;
+				thread->throttle_lag += lag;
+				if (lag > thread->throttle_lag_max)
+					thread->throttle_lag_max = lag;
+			}
+		}
 		else
 			return true;		/* 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-14 Thread Greg Smith

On 6/14/13 3:50 PM, Fabien COELHO wrote:

I think that the weirdness really comes from the way transactions times
are measured, their interactions with throttling, and latent bugs in the
code.


measurement times, no; interactions with throttling, no.  If it was 
either of those I'd have finished this off days ago.  Latent bugs, 
possibly.  We may discover there's nothing wrong with your code at the 
end here, that it just makes hitting this bug more likely. 
Unfortunately today is the day *some* bug is popping up, and I want to 
get it squashed before I'll be happy.


The lag is actually happening during a kernel call that isn't working as 
expected.  I'm not sure whether this bug was there all along if \sleep 
was used, or if it's specific to the throttle sleep.


 Also, flag st-listen is set to 1 but *never* set back to 0...

I noticed that st-listen was weird too, and that's on my short list of 
suspicious things I haven't figured out yet.


I added a bunch more logging as pgbench steps through its work to track 
down where it's stuck at.  Until the end all transactions look like this:


1371238832.084783 client 10 throttle lag 2 us
1371238832.084783 client 10 executing \setrandom aid 1 :naccounts
1371238832.084803 client 10 sending SELECT abalance FROM 
pgbench_accounts WHERE aid = 753099;

1371238832.084840 calling select
1371238832.086539 client 10 receiving
1371238832.086539 client 10 finished

All clients who hit lag spikes at the end are going through this 
sequence instead:


1371238832.085912 client 13 throttle lag 790 us
1371238832.085912 client 13 executing \setrandom aid 1 :naccounts
1371238832.085931 client 13 sending SELECT abalance FROM 
pgbench_accounts WHERE aid = 564894;

1371238832.086592 client 13 receiving
1371238832.086662 calling select
1371238832.235543 client 13 receiving
1371238832.235543 client 13 finished

Note the "calling select" here that wasn't in the normal length 
transaction before it.  The client is receiving something here, but 
rather than finishing the transaction it falls through and ends up at 
the select() system call outside of doCustom.  All of the clients that 
are sleeping when the system slips into one of these long select() calls 
get stuck behind it.  I'm not 100% sure, but I think this only 
happens when all remaining clients are sleeping.


Here's another one, it hits the receive that doesn't finish the 
transaction earlier (1371238832.086587) but then falls into the same 
select() call at 1371238832.086662:


1371238832.085884 client 12 throttle lag 799 us
1371238832.085884 client 12 executing \setrandom aid 1 :naccounts
1371238832.085903 client 12 sending SELECT abalance FROM 
pgbench_accounts WHERE aid = 299080;

1371238832.086587 client 12 receiving
1371238832.086662 calling select
1371238832.231032 client 12 receiving
1371238832.231032 client 12 finished

Investigation is still going here...

--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-14 Thread Fabien COELHO



I think that the weirdness really comes from the way transactions times
are measured, their interactions with throttling, and latent bugs in the
code.


measurement times, no; interactions with throttling, no.  If it was either of 
those I'd have finished this off days ago.  Latent bugs, possibly.  We may 
discover there's nothing wrong with your code at the end here,


To summarize my point: I think my v10 code does not take into account all 
of the strangeness in doCustom, and I'm pretty sure that there is no point 
in including throttle sleeps in the latency measures, which was more or less 
the case. So it is somehow a bug which only shows up if you look at the 
latency measures, but the tps are fine.


that it just makes hitting this bug more likely. Unfortunately today is 
the day *some* bug is popping up, and I want to get it squashed before 
I'll be happy.


The lag is actually happening during a kernel call that isn't working as 
expected.  I'm not sure whether this bug was there all along if \sleep was 
used, or if it's specific to the throttle sleep.


The throttle sleep is inserted out of the state machine. That is why in 
the test patch I added a goto to ensure that it is always taken at the 
right time, that is when state==0 and before txn_begin is set, and not 
possibly between other states when doCustom happens to be recalled after a 
return.


I added a bunch more logging as pgbench steps through its work to track down 
where it's stuck at.  Until the end all transactions look like this:


1371238832.084783 client 10 throttle lag 2 us
1371238832.084783 client 10 executing \setrandom aid 1 :naccounts
1371238832.084803 client 10 sending SELECT abalance FROM pgbench_accounts 
WHERE aid = 753099;

1371238832.084840 calling select
1371238832.086539 client 10 receiving
1371238832.086539 client 10 finished

All clients who hit lag spikes at the end are going through this sequence

 instead:


1371238832.085912 client 13 throttle lag 790 us
1371238832.085912 client 13 executing \setrandom aid 1 :naccounts
1371238832.085931 client 13 sending SELECT abalance FROM pgbench_accounts 
WHERE aid = 564894;

1371238832.086592 client 13 receiving
1371238832.086662 calling select
1371238832.235543 client 13 receiving
1371238832.235543 client 13 finished


Note the "calling select" here that wasn't in the normal length transaction 
before it.  The client is receiving something here, but rather than it 
finishing the transaction it falls through and ends up at the select() system 
call outside of doCustom.  All of the clients that are sleeping when the 
system slips into one of these long select() calls are getting stuck behind 
it.  I'm not 100% sure, but I think this only happens when all remaining 
clients are sleeping.


Note: in both the slow cases there is a "receiving" between sending and 
select. This suggests that the "goto top" at the very end of doCustom is 
followed in one case but not the other.


ISTM that there is a timeout passed to select which is computed based on 
the current sleeping time of each client. I'm pretty sure that is not a 
well-tested path...


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-12 Thread Fabien COELHO


Did you look at the giant latency spikes at the end of the test run I 
submitted the graph for?  I wanted to nail down what was causing those 
before worrying about the startup timing.


If you are still worried: if you run the very same command without 
throttling and measure the same latency, does the same thing happens at 
the end? My guess is that it should be yes. If it is no, I'll try out 
pgbench-tools.


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-11 Thread Greg Smith

On 6/10/13 6:02 PM, Fabien COELHO wrote:

  - the tps is global, with a mutex to share the global stochastic process
  - there is an adaptation for the fork emulation
  - I do not know whether this works with Win32 pthread stuff.


Instead of this complexity, can we just split the TPS input per client? 
 That's all I was thinking of here, not adding a new set of threading 
issues.  If 10,000 TPS is requested and there's 10 clients, just set the 
delay so that each of them targets 1,000 TPS.
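
Back-of-the-envelope sketch of that split (illustrative numbers and 
variable names only):

#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	/* illustrative only: 10,000 tps requested, 10 workers share it */
	double		rate = 10000.0;
	int			nworkers = 10;

	int64_t		delay_us = (int64_t) (1000000.0 / rate);	/* 100 us overall */

	delay_us *= nworkers;		/* 1000 us between txns for each worker */

	printf("per-worker target: one transaction every %lld us => %.0f tps total\n",
		   (long long) delay_us, nworkers * 1000000.0 / delay_us);
	return 0;
}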


I'm guessing it's more accurate to have them all communicate as you've 
done here, but it seems like a whole class of new bugs and potential 
bottlenecks could come out of that.  Whenever someone touches the 
threading model for pgbench it usually gives a stack of build farm 
headaches.  Better to avoid those unless there's really a compelling 
reason to go through that.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-11 Thread Fabien COELHO



  - the tps is global, with a mutex to share the global stochastic process
  - there is an adaptation for the fork emulation
  - I do not know whether this works with Win32 pthread stuff.


Instead of this complexity,


Well, the mutex impact is very localized in the code. The complexity is 
more linked to the three thread implementations intermixed there.



can we just split the TPS input per client?


Obviously it is possible. Note that it is more logical to do that per 
thread. I did one shared stochastic process because it makes more sense to 
have just one.


That's all I was thinking of here, not adding a new set of threading issues. 
If 10,000 TPS is requested and there's 10 clients, just set the delay so that 
each of them targets 1,000 TPS.


Ok, so I understand that a mutex is too much!

I'm guessing it's more accurate to have them all communicate as you've done 
here, but it seems like a whole class of new bugs and potential bottlenecks 
could come out of that.


I do not think that there is a performance or locking contention issue: it 
is about getting a mutex for a section which performs one integer add and 
two integer assignments, that is about 3 instructions, to be compared 
with the task of performing database operations over the network. There 
are several orders of magnitude between those tasks. It would take a more 
than terrible mutex implementation to have any significant impact.
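
For concreteness, a minimal sketch of such a mutex-protected shared 
trigger (illustrative names, not the submitted code):

#include <pthread.h>
#include <stdint.h>

/* Sketch of the shared stochastic process: the critical section really
 * is just one add and two assignments. */
static pthread_mutex_t throttle_mutex = PTHREAD_MUTEX_INITIALIZER;
static int64_t shared_trigger;		/* next scheduled start time, in us */

static int64_t
schedule_next(int64_t wait)
{
	int64_t		until;

	pthread_mutex_lock(&throttle_mutex);
	shared_trigger += wait;		/* the add ... */
	until = shared_trigger;		/* ... and one of the assignments */
	pthread_mutex_unlock(&throttle_mutex);
	return until;				/* caller stores it into st->until */
}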


Whenever someone touches the threading model for pgbench it usually 
gives a stack of build farm headaches.  Better to avoid those unless 
there's really a compelling reason to go through that.


I agree that the threading model in pgbench is a mess, mostly because of 
the 3 concurrent implementations intermixed in the code. Getting rid of 
the fork emulation and the win32 special handling, and only keeping the 
pthread implementation, which seems to be available even on Windows, would 
be a relief. I'm not sure whether there is still a rationale for having 
these 3 implementations, but it guarantees a maintenance mess:-(


I'll submit a version without a mutex, but ISTM that this one is 
conceptually cleaner, although I'm not sure about what happens on Windows.


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-11 Thread Fabien COELHO


Submission 10:

 - per thread throttling instead of a global process with a mutex.
   this avoids a mutex, and the process is shared between clients
   of a given thread.

 - ISTM that the thread start time should be initialized at the
   beginning of threadRun instead of in the loop *before* thread creation,
   otherwise the thread creation delays are incorporated in the
   performance measure, but ISTM that the point of pgbench is not to
   measure thread creation performance...

   I've added a comment suggesting where it should be put instead,
   first thing in threadRun (see the sketch below).
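
A sketch of the suggested placement (an illustration, not the submitted 
hunk; threadRun(), TState, instr_time, and INSTR_TIME_SET_CURRENT() are 
pgbench's, the rest is schematic):

static void *
threadRun(void *arg)
{
	TState	   *thread = (TState *) arg;
	instr_time	thread_start;

	/* take the start time first thing in the thread itself, so
	 * pthread_create() overhead stays out of the measured interval */
	INSTR_TIME_SET_CURRENT(thread_start);

	/* ... benchmark loop unchanged ... */
	return NULL;
}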

--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 8c202bf..8d7979b 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When threads are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -205,6 +211,7 @@ typedef struct
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		throttled;  /* whether current transaction was throttled */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -222,6 +229,10 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+	int64   throttle_trigger;  /* previous/next throttling (us) */
+	int64   throttle_lag;  /* avg lag transaction behind throttling */
+	int64   throttle_lag_max;  /* max lag */
+
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -230,6 +241,8 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64   throttle_lag;
+	int64   throttle_lag_max;
 } TResult;
 
 /*
@@ -355,6 +368,8 @@ usage(void)
 		   "  -n           do not run VACUUM before tests\n"
 		   "  -N           do not update tables \"pgbench_tellers\" and \"pgbench_branches\"\n"
 		   "  -r           report average latency per command\n"
+		   "  -R SPEC, --rate SPEC\n"
+		   "               target rate in transactions per second\n"
 		   "  -s NUM       report this scale factor in output\n"
 		   "  -S           perform SELECT-only transactions\n"
 		   "  -t NUM       number of transactions each client runs (default: 10)\n"
@@ -902,13 +917,51 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 top:
	commands = sql_files[st->use_file];
 
+	/* handle throttling once per transaction by inserting a sleep.
+	 * this is simpler than doing it at the end.
+	 */
+	if (throttle_delay && !st->throttled)
+	{
+		/* compute delay to approximate a Poisson distribution
+		 * 1000000 => 13.8 .. 0 multiplier
+		 *  100000 => 11.5 .. 0
+		 *   10000 =>  9.2 .. 0
+		 *    1000 =>  6.9 .. 0
+		 * if transactions are too slow or a given wait shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64		wait = (int64)
+			throttle_delay * -log(getrand(thread, 1, 1000) / 1000.0);
+
+		thread->throttle_trigger += wait;
+
+		st->until = thread->throttle_trigger;
+		st->sleeping = 1;
+		st->throttled = true;
+		if (debug)
+			fprintf(stderr, "client %d throttling " INT64_FORMAT " us\n",
+					st->id, wait);
+	}
+
 	if (st->sleeping)
 	{							/* are we sleeping? */
 		instr_time	now;
+		int64		now_us;
 
 		INSTR_TIME_SET_CURRENT(now);
-		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
+		now_us = INSTR_TIME_GET_MICROSEC(now);
+		if (st->until <= now_us)
+		{
 			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
+			if (throttle_delay && st->state == 0)
+			{
+				/* measure lag of throttled transaction */
+				int64		lag = now_us - st->until;
+				thread->throttle_lag += lag;
+				if (lag > thread->throttle_lag_max)
+					thread->throttle_lag_max = lag;
+			}
+		}
 		else
 			return true;		/* Still sleeping, nothing to do here */
 	}
@@ -1095,6 +1148,7 @@ top:
 			st->state = 0;
 			st->use_file = (int) getrand(thread, 0, num_files - 1);
 			commands = sql_files[st->use_file];
+			st->throttled = false;
 		}
 	}
 
@@ -2017,7 +2071,8 @@ process_builtin(char *tb)
 static void
 printResults(int ttype, int normal_xacts, int nclients,
 			 TState *threads, int nthreads,
-			 instr_time total_time, instr_time conn_total_time)
+			 instr_time total_time, instr_time conn_total_time,
+			 int64 throttle_lag, int64 throttle_lag_max)
 {
 	double		time_include,
 tps_include,
@@ -2055,6 +2110,18 @@ printResults(int ttype, int normal_xacts, int nclients,
 		printf("number of transactions actually processed: %d\n",
 			   normal_xacts);
 	}
+
+	if (throttle_delay)
+	{
+		/* Report average transaction lag under throttling, i.e. the delay
+		   

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-11 Thread Greg Smith

On 6/11/13 4:11 PM, Fabien COELHO wrote:

  - ISTM that the thread start time should be initialized at the
beginning of threadRun instead of in the loop *before* thread creation,
otherwise the thread creation delays are incorporated in the
performance measure, but ISTM that the point of pgbench is not to
measure thread creation performance...


I noticed that, but it seemed a pretty minor issue.  Did you look at the 
giant latency spikes at the end of the test run I submitted the graph 
for?  I wanted to nail down what was causing those before worrying about 
the startup timing.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-11 Thread Fabien COELHO



  - ISTM that the thread start time should be initialized at the
beginning of threadRun instead of in the loop *before* thread creation,
otherwise the thread creation delays are incorporated in the
performance measure, but ISTM that the point of pgbench is not to
measure thread creation performance...


I noticed that, but it seemed a pretty minor issue.


Not for me, because the max lag measured in my first version was really 
the thread creation time, which is not very interesting.


Did you look at the giant latency spikes at the end of the test run I 
submitted the graph for?


I've looked at the graph you sent. I must admit that I did not understand 
exactly what is measured and where it is measured. Because of its position 
at the end of the run, I thought of some disconnection-related effects 
when the pgbench run is interrupted by a timeout signal, so some things 
are done more slowly. Fine with me: we are stopping anyway, and out of 
the steady state.


I wanted to nail down what was causing those before worrying about the 
startup timing.


Well, the short answer is that I'm not worried by that, for the reason 
explained above. I would be worried if it was anywhere else but where the 
transactions are interrupted, the connections are closed and the threads 
are stopped. I may be wrong:-)


--
Fabien.




Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-10 Thread Fabien COELHO


Hello Greg,

Thanks for this very detailed review and the suggestions!

I'll submit a new patch


Question 1: should it report the maximum lag encountered?


I haven't found the lag measurement to be very useful yet, outside of 
debugging the feature itself.  Accordingly I don't see a reason to add even 
more statistics about the number outside of testing the code.  I'm seeing 
some weird lag problems that this will be useful for right now, though; 
more on that in a few places below.


I'll explain below why it is really interesting to get this figure, and 
why it is not really available as precisely elsewhere.



Question 2: the next step would be to have the current lag shown under
option --progress, but that would mean having a combined --throttle
--progress patch submission, or maybe dependencies between patches.


This is getting too far ahead.


Ok!

Let's get the throttle part nailed down before introducing even more 
moving parts into this.  I've attached an updated patch that changes a 
few things around already.  I'm not done with this yet and it needs some 
more review before commit, but it's not too far away from being ready.


Ok. I'll submit a new version by the end of the week.

This feature works quite well.  On a system that will run at 25K TPS without 
any limit, I did a run with 25 clients and a rate of 400/second, aiming at 
10,000 TPS, and that's what I got:


number of clients: 25
number of threads: 1
duration: 60 s
number of transactions actually processed: 599620
average transaction lag: 0.307 ms
tps = 9954.779317 (including connections establishing)
tps = 9964.947522 (excluding connections establishing)

I never thought of implementing the throttle like this before,


Stochastic processes are a little bit magic:-)

but it seems to work out well so far.  Check out tps.png to see the 
smoothness of the TPS curve (the graphs came out of pgbench-tools). 
There's a little more play outside of the target than ideal for this 
case.  Maybe it's worth tightening the Poisson curve a bit around its 
center?


The point of a Poisson distribution is to model random events of the kind 
that are a little bit irregular, such as web requests or clients queuing 
at a taxi stop. I cannot really change the formula, but if you want to 
argue with Siméon Denis Poisson, his current address is the 19th section 
of the Père Lachaise graveyard in Paris:-)


More seriously, the only parameter that can be changed is the "1000000.0" 
which drives the granularity of the Poisson process. A smaller value would 
mean a smaller potential multiplier, that is, how far from the average time 
the schedule can go. This may come under "tightening", although it would 
depart from a perfect process and possibly be a little less 
"smooth"... for a given definition of "tight", "perfect" and "smooth":-)
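For illustration, here is a standalone toy sketch of that inverse 
transform sampling step. It is mine, not from any patch version; 
poisson_wait() and its granularity argument are made up for the example, 
and rand() stands in for pgbench's per-thread getrand():

#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Draw one Poisson-process inter-arrival wait in microseconds.  u is
 * (approximately) uniform in [1/granularity, 1], so -log(u) spans
 * 0 .. log(granularity): 6.9 for 1000, 13.8 for 1000000.  Its mean is
 * close to 1, so the average wait stays near target_delay. */
static int64_t
poisson_wait(int64_t target_delay, int granularity)
{
	double u = (double) (1 + rand() % granularity) / granularity;

	return (int64_t) (target_delay * -log(u));
}

int
main(void)
{
	int64_t total = 0;
	int i;

	for (i = 0; i < 100000; i++)
		total += poisson_wait(100, 1000);	/* 100 us target => ~10000 tps */
	printf("average wait: %f us\n", total / 100000.0);
	return 0;
}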


[...] What I did instead was think of this as a transaction rate target, 
which makes the help a whole lot simpler:


 -R SPEC, --rate SPEC
  target rate per client in transactions per second


Ok, I'm fine with this name.

Made the documentation easier to write too.  I'm not quite done with that 
yet, the docs wording in this updated patch could still be better.


I'm not an English native speaker, any help is welcome here. I'll do my 
best.


I personally would like this better if --rate specified a *total* rate across 
all clients.


Ok, I can do that, with some reworking so that the stochastic process is 
shared by all threads instead of being within each client. This means a 
lock between threads to access some variables, which should not impact 
the test much. Another option is to have a per-thread stochastic process.
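A minimal sketch of that shared variant (assumed names, no error 
handling; again rand() stands in for the per-thread getrand() the patch 
uses) keeps one global event line and reserves the next scheduled start 
time under a mutex:

#include <math.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

static pthread_mutex_t trigger_mutex = PTHREAD_MUTEX_INITIALIZER;
static int64_t throttle_trigger;	/* next scheduled start time (us) */

/* Advance the global schedule by one exponential inter-arrival time and
 * return the start time this client just reserved. */
static int64_t
reserve_start_time(int64_t target_delay)
{
	int64_t wait = (int64_t)
		(target_delay * -log((1 + rand() % 1000) / 1000.0));
	int64_t scheduled;

	pthread_mutex_lock(&trigger_mutex);
	throttle_trigger += wait;
	scheduled = throttle_trigger;
	pthread_mutex_unlock(&trigger_mutex);
	return scheduled;
}

Since the lock only protects a single addition, contention should indeed 
be negligible compared to the transactions themselves.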


However, there are examples of both types of settings in the 
program already, so there's no one precedent for which is right here.  -t is 
per-client and now -R is too; I'd prefer it to be like -T instead.  It's not 
that important though, and the code is cleaner as it's written right now. 
Maybe this is better; I'm not sure.


I like the idea of just one process instead of a per-client one. I did not 
try that at the beginning because the implementation is less straightforward.


On the topic of this weird latency spike issue, I did see that show up in 
some of the results too.


Your example illustrates *exactly* why the lag measure was added.

The Poisson process generates an ideal event line (that is, irregularly 
scheduled transaction start times targeting the expected tps) which 
induces a varying load that the database is trying to handle.


If a transaction cannot start right away, it is deferred with respect to 
its scheduled start time. The measured lag reports exactly that: the 
clients do not handle the load. There may be some catchup later, that is, 
the clients come back in line with the scheduled transactions.
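Concretely, the bookkeeping described above can be sketched like this (a 
toy restatement, not the patch code; the field names merely mirror it):

#include <stdint.h>

typedef struct
{
	int64_t throttle_lag;		/* summed lag, for the average */
	int64_t throttle_lag_max;	/* worst single transaction */
} LagStats;

/* scheduled is the precomputed event-line start time, now_us the actual
 * start; a transaction that starts on time contributes zero lag, since
 * the client never wakes up early. */
static void
record_lag(LagStats *ts, int64_t scheduled, int64_t now_us)
{
	int64_t lag = now_us - scheduled;

	ts->throttle_lag += lag;
	if (lag > ts->throttle_lag_max)
		ts->throttle_lag_max = lag;
}

Dividing throttle_lag by the number of transactions at the end of the run 
gives the reported average.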


I need to put this measure here because the scheduled time is only 
known to pgbench and not available elsewhere. The max would really be more 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-10 Thread Fabien COELHO


Here is submission v9 based on your v8 version.

 - the tps is global, with a mutex to share the global stochastic process
 - there is an adaptation for the fork emulation
 - I do not know whether this works with the Win32 pthread stuff.
 - reduced multiplier ln(1000000) -> ln(1000)
 - avg & max throttling lag are reported

There's a little more play outside of the target than ideal for this 
case.  Maybe it's worth tightening the Poisson curve a bit around its 
center?


A stochastic process moves around the target value, but is not right on 
it.


--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 8c202bf..6e52dee 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -75,6 +75,7 @@ static int	pthread_join(pthread_t th, void **thread_return);
 #else
 /* Use emulation with fork. Rename pthread identifiers to avoid conflicts */
 
+#define PTHREAD_FORK_EMULATION
 #include <sys/wait.h>
 
 #define pthread_t			pg_pthread_t
@@ -82,6 +83,11 @@ static int	pthread_join(pthread_t th, void **thread_return);
 #define pthread_create			pg_pthread_create
 #define pthread_join			pg_pthread_join
 
+#define pthread_mutex_t int
+#define PTHREAD_MUTEX_INITIALIZER 0
+#define pthread_mutex_lock(m)
+#define pthread_mutex_unlock(m)
+
 typedef struct fork_pthread *pthread_t;
 typedef int pthread_attr_t;
 
@@ -137,6 +143,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * When clients are throttled to a given rate limit, this is the target delay
+ * to reach that rate in usec.  0 is the default and means no throttling.
+ */
+int64		throttle_delay = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -205,6 +217,7 @@ typedef struct
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	bool		throttled;  /* whether current transaction was throttled */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -222,6 +235,8 @@ typedef struct
 	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
 	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
+	int64   throttle_lag;   /* transaction lag behind throttling */
+	int64   throttle_lag_max;
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -230,9 +245,17 @@ typedef struct
 {
 	instr_time	conn_time;
 	int			xacts;
+	int64   throttle_lag;
+	int64   throttle_lag_max;
 } TResult;
 
 /*
+ * throttling management
+ */
+pthread_mutex_t throttle_trigger_mutex = PTHREAD_MUTEX_INITIALIZER;
+int64 throttle_trigger;	/* previous/next throttling (us) */
+
+/*
  * queries read from files
  */
 #define SQL_COMMAND		1
@@ -355,6 +378,8 @@ usage(void)
 		 "-n   do not run VACUUM before tests\n"
 		 "-N   do not update tables \"pgbench_tellers\" and \"pgbench_branches\"\n"
 		 "-r   report average latency per command\n"
+		 "-R SPEC, --rate SPEC\n"
+		 "  target rate in transactions per second\n"
 		 "-s NUM   report this scale factor in output\n"
 		 "-S   perform SELECT-only transactions\n"
 	   "-t NUM   number of transactions each client runs (default: 10)\n"
@@ -902,13 +927,53 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 top:
 	commands = sql_files[st->use_file];
 
+	/* handle throttling once per transaction by inserting a sleep.
+	 * this is simpler than doing it at the end.
+	 */
+	if (throttle_delay && !st->throttled)
+	{
+		/* compute delay to approximate a Poisson distribution
+		 * 1000000 => 13.8 .. 0 multiplier
+		 *  100000 => 11.5 .. 0
+		 *   10000 =>  9.2 .. 0
+		 *    1000 =>  6.9 .. 0
+		 * if transactions are too slow or a given wait shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64 wait = (int64)
+			throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
+
+		pthread_mutex_lock(&throttle_trigger_mutex);
+		throttle_trigger += wait;
+		st->until = throttle_trigger;
+		pthread_mutex_unlock(&throttle_trigger_mutex);
+
+		st->sleeping = 1;
+		st->throttled = true;
+		if (debug)
+			fprintf(stderr, "client %d throttling " INT64_FORMAT " us\n",
+					st->id, wait);
+	}
+
 	if (st->sleeping)
 	{			/* are we sleeping? */
 		instr_time	now;
+		int64 now_us;
 
 		INSTR_TIME_SET_CURRENT(now);
-		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
+		now_us = INSTR_TIME_GET_MICROSEC(now);
+		if (st->until <= now_us)
+		{
 			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
+			if (throttle_delay && st->state == 0)
+			{
+				/* measure lag of throttled transaction */
+				int64 lag = now_us - st->until;
+				thread->throttle_lag += lag;
+				if (lag > thread->throttle_lag_max)
+					thread->throttle_lag_max = lag;
+			}
+		}
 		else
 			return true;		/* Still sleeping, nothing to do here */
 	}
@@ 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-08 Thread Greg Smith

On 6/1/13 5:00 AM, Fabien COELHO wrote:

Question 1: should it report the maximum lag encountered?


I haven't found the lag measurement to be very useful yet, outside of 
debugging the feature itself.  Accordingly I don't see a reason to add 
even more statistics about the number outside of testing the code.  I'm 
seeing some weird lag problems that this will be useful for right now, 
though; more on that in a few places below.



Question 2: the next step would be to have the current lag shown under
option --progress, but that would mean having a combined --throttle
--progress patch submission, or maybe dependencies between patches.


This is getting too far ahead.  Let's get the throttle part nailed down 
before introducing even more moving parts into this.  I've attached an 
updated patch that changes a few things around already.  I'm not done 
with this yet and it needs some more review before commit, but it's not 
too far away from being ready.


This feature works quite well.  On a system that will run at 25K TPS 
without any limit, I did a run with 25 clients and a rate of 400/second, 
aiming at 10,000 TPS, and that's what I got:


number of clients: 25
number of threads: 1
duration: 60 s
number of transactions actually processed: 599620
average transaction lag: 0.307 ms
tps = 9954.779317 (including connections establishing)
tps = 9964.947522 (excluding connections establishing)

I never thought of implementing the throttle like this before, but it 
seems to work out well so far.  Check out tps.png to see the smoothness 
of the TPS curve (the graphs came out of pgbench-tools).  There's a 
little more play outside of the target than ideal for this case.  Maybe 
it's worth tightening the Poisson curve a bit around its center?


The main implementation issue I haven't looked into yet is why things 
can get weird at the end of the run.  See the latency.png graph attached 
and you can see what I mean.


I didn't like the naming on this option or all of the ways you could 
specify the delay.  None of those really added anything, since you can 
get every other behavior by specifying a non-integer TPS.  And using the 
word "throttle" inside the code is fine, but I didn't like exposing that 
implementation detail more than it had to be.


What I did instead was think of this as a transaction rate target, which 
makes the help a whole lot simpler:


  -R SPEC, --rate SPEC
   target rate per client in transactions per second

Made the documentation easier to write too.  I'm not quite done with 
that yet; the docs wording in this updated patch could still be better.


I personally would like this better if --rate specified a *total* rate 
across all clients.  However, there are examples of both types of 
settings in the program already, so there's no one precedent for which 
is right here.  -t is per-client and now -R is too; I'd prefer it to be 
like -T instead.  It's not that important though, and the code is 
cleaner as it's written right now.  Maybe this is better; I'm not sure.


I did some basic error handling checks on this and they seemed good; the 
program rejects target rates of <= 0.


On the topic of this weird latency spike issue, I did see that show up 
in some of the results too.  Here's one where I tried to specify a rate 
higher than the system can actually handle, 80000 TPS total on a 
SELECT-only test


$ pgbench -S -T 30 -c 8 -j 4 -R10000tps pgbench
starting vacuum...end.
transaction type: SELECT only
scaling factor: 100
query mode: simple
number of clients: 8
number of threads: 4
duration: 30 s
number of transactions actually processed: 761779
average transaction lag: 10298.380 ms
tps = 25392.312544 (including connections establishing)
tps = 25397.294583 (excluding connections establishing)

It was actually limited by the capabilities of the hardware, 25K TPS. 
10298 ms of lag per transaction can't be right though.


Some general patch submission suggestions for you as a new contributor:

-When re-submitting something with improvements, it's a good idea to add 
a version number to the patch so reviewers can tell them apart easily. 
But there is no reason to change the subject line of the e-mail each 
time.  I followed that standard here.  If you updated this again I would 
name the file pgbench-throttle-v9.patch but keep the same e-mail subject.


-There were some extra carriage return characters in your last 
submission.  Wasn't a problem this time, but if you can get rid of those 
that makes for a better patch.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
attachment: tps.png
attachment: latency.png

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 8c202bf..799dfcd 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int	unlogged_tables = 0;
 double sample_rate = 0.0;
 
 /*
+ * When clients are 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-01 Thread Fabien COELHO


New submission for the next commit fest.

This new version also reports the average lag time, i.e. the delay between 
scheduled and actual transaction start times. This may help detect whether 
things went smoothly, or if at some time some delay was introduced because 
of the load and some catchup was done afterwards.


Question 1: should it report the maximum lag encountered?

Question 2: the next step would be to have the current lag shown under 
option --progress, but that would mean having a combined --throttle 
--progress patch submission, or maybe dependencies between patches.


--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 8ff6623..9b5adc2 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * whether clients are throttled to a given rate, expressed as a delay in us.
+ * 0, the default, means no throttling.
+ */
+int64		throttle = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -205,6 +211,9 @@ typedef struct
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	int64		trigger;		/* previous/next throttling (us) */
+	bool		throttled;  /* whether current transaction was throttled */
+	int64		throttle_lag;   /* transaction lag behind throttling */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -348,6 +357,9 @@ usage(void)
 		 "-D VARNAME=VALUE\n"
 		 "  define variable for use by custom script\n"
 		 "-f FILENAME  read transaction script from FILENAME\n"
+		 "-H SPEC, --throttle SPEC\n"
+		 "  delay in seconds to throttle each client\n"
+		 "  sample specs: 0.025 40tps 25ms 25000us\n"
 		 "-j NUM   number of threads (default: 1)\n"
 		 "-l   write transaction times to log file\n"
 		 "-M simple|extended|prepared\n"
@@ -902,13 +914,40 @@ doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVa
 top:
 	commands = sql_files[st->use_file];
 
+	/* handle throttling once per transaction by inserting a sleep.
+	 * this is simpler than doing it at the end.
+	 */
+	if (throttle && !st->throttled)
+	{
+		/* compute delay to approximate a Poisson distribution
+		 * 1000000 => 13.8 .. 0 multiplier
+		 * if transactions are too slow or a given wait shorter than
+		 * a transaction, the next transaction will start right away.
+		 */
+		int64 wait = (int64)
+			throttle * -log(getrand(thread, 1, 1000000)/1000000.0);
+		st->trigger += wait;
+		st->sleeping = 1;
+		st->until = st->trigger;
+		st->throttled = true;
+		if (debug)
+			fprintf(stderr, "client %d throttling " INT64_FORMAT " us\n",
+					st->id, wait);
+	}
+
 	if (st->sleeping)
 	{			/* are we sleeping? */
 		instr_time	now;
-
+		int64 now_us;
 		INSTR_TIME_SET_CURRENT(now);
-		if (st->until <= INSTR_TIME_GET_MICROSEC(now))
+		now_us = INSTR_TIME_GET_MICROSEC(now);
+		if (st->until <= now_us)
+		{
 			st->sleeping = 0;	/* Done sleeping, go ahead with next command */
+			if (throttle && st->state == 0)
+				/* measure lag of throttled transaction */
+				st->throttle_lag += (now_us - st->until);
+		}
 		else
 			return true;		/* Still sleeping, nothing to do here */
 	}
@@ -1091,6 +1130,7 @@ top:
 			st->state = 0;
 			st->use_file = (int) getrand(thread, 0, num_files - 1);
 			commands = sql_files[st->use_file];
+			st->throttled = false;
 		}
 	}
 
@@ -2012,7 +2052,8 @@ process_builtin(char *tb)
 
 /* print out results */
 static void
-printResults(int ttype, int normal_xacts, int nclients,
+printResults(int ttype, int normal_xacts,
+			 CState *clients, int nclients,
 			 TState *threads, int nthreads,
 			 instr_time total_time, instr_time conn_total_time)
 {
@@ -2052,6 +2093,23 @@ printResults(int ttype, int normal_xacts, int nclients,
 		printf("number of transactions actually processed: %d\n",
 			   normal_xacts);
 	}
+
+	if (throttle)
+	{
+		/* Report average transaction lag under throttling, i.e. the delay
+		   between scheduled and actual start times for the transaction.
+		   The measured lag may be linked to the thread/client load,
+		   the database load, or the Poisson throttling process.
+		   should it report the maximum encountered lag?
+		 */
+		int64 throttle_lag = 0;
+		int c;
+		for (c = 0; c < nclients; c++)
+			throttle_lag += clients[c].throttle_lag;
+		printf("average transaction lag: %.3f ms\n",
+			   0.001 * throttle_lag / normal_xacts);
+	}
+
 	printf("tps = %f (including connections establishing)\n", tps_include);
 	printf("tps = %f (excluding connections establishing)\n", tps_exclude);
 
@@ -2112,6 +2170,7 @@ main(int argc, char **argv)
 		{"unlogged-tables", no_argument, &unlogged_tables, 1},
 		{"sampling-rate", required_argument, NULL, 4},
 		{"aggregate-interval", required_argument, NULL, 5},
+		{"throttle", required_argument, NULL, 'H'},
 		{NULL, 0,