Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-06-19 Thread Fabien COELHO



>> Because you may want to put something very readable and understandable in
>> a script and like long options, or have to type it interactively every day
>> in a terminal and like short ones. Most UNIX commands include both kinds.
>
> Would it make sense then to add long versions for all the other standard
> options too?


Yep. It is really a stylistic (pedantic?) matter. See for pgbench:

https://commitfest.postgresql.org/action/patch_view?id=1106

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-06-19 Thread Jan Wieck
On 06/19/13 14:34, Fabien COELHO wrote:
> 
>>> The use case of the option is to be able to generate a continuous gentle
>>> load for functional tests, e.g. in a practice session with students or for
>>> testing features on a laptop.
>>
>> Why does this need two option formats (-H and --throttle)?
> 
> On the latest version it is --rate and -R.
> 
> Because you may want to put something very readable and understandable in 
> a script and like long options, or have to type it interactively every day 
> in a terminal and like short ones. Most UNIX commands include both kinds.
> 

Would it make sense then to add long versions for all the other standard
options too?


Jan

-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin




Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-06-19 Thread Fabien COELHO



>> The use case of the option is to be able to generate a continuous gentle
>> load for functional tests, e.g. in a practice session with students or for
>> testing features on a laptop.
>
> Why does this need two option formats (-H and --throttle)?

On the latest version it is --rate and -R.

Because you may want to put something very readable and understandable in
a script and like long options, or have to type it interactively every day
in a terminal and like short ones. Most UNIX commands include both kinds.


--
Fabien.




Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-06-19 Thread Jan Wieck
On 05/01/13 04:57, Fabien COELHO wrote:
> 
> Add --throttle to pgbench
> 
> Each client is throttled to the specified rate, which can be expressed in
> tps or in time (s, ms, us). Throttling is achieved by scheduling
> transactions along a Poisson distribution.
>
> This is an update of the previous proposal which fixes a typo in the sgml
> documentation.
>
> The use case of the option is to be able to generate a continuous gentle
> load for functional tests, e.g. in a practice session with students or for
> testing features on a laptop.

Why does this need two option formats (-H and --throttle)?


Jan

-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin




Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-06-09 Thread Andres Freund
On 2013-06-09 17:50:13 +0800, Craig Ringer wrote:
> On 05/31/2013 03:41 PM, Fabien COELHO wrote:
> >
> >>> However I'm not sure that pg_stat_replication currently has the
> >>> necessary information on either side to measure the lag (in time
> >>> transactions, but how do I know when a transaction was committed? or
> >>> number of transactions?).
> >>
> >> The BDR codebase now has a handy function to report when a transaction
> >> was committed, pg_get_transaction_committime(xid) .
> >
> > This looks handy for monitoring a replication setup.
> > It should really be in core...
> >
> > Any plans? Or are there other ways to get this kind of information in
> > core?

> pg_get_transaction_committime isn't trivial to just add to core because
> it requires a commit time to be recorded with commit records in the
> transaction logs, among other changes.

The commit records actually already have that information available
(cf. xl_xact_commit(_compact) in xact.h); the problem is having a
data structure which collects all that.
That's why the committs (written by Alvaro) added an SLRU mapping xids
to timestamps. And yes, we want to submit that sometime.

The pg_xlog_wait_remote_apply(), pg_xlog_wait_remote_receive() functions
however don't need any additional infrastructure, so I think those are
easier and less controversial to add.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-06-09 Thread Craig Ringer
On 05/31/2013 03:41 PM, Fabien COELHO wrote:
>
>>> However I'm not sure that pg_stat_replication currently has the
>>> necessary information on either side to measure the lag (in time
>>> transactions, but how do I know when a transaction was committed? or
>>> number of transactions?).
>>
>> The BDR codebase now has a handy function to report when a transaction
>> was committed, pg_get_transaction_committime(xid) .
>
> This looks handy for monitoring a replication setup.
> It should really be in core...
>
> Any plans? Or are there other ways to get this kind of information in
> core?

Yes, it's my understanding that the idea is to eventually get all the
BDR functionality merged, piece by piece, including the commit time
tracking feature.

pg_get_transaction_committime isn't trivial to just add to core because
it requires a commit time to be recorded with commit records in the
transaction logs, among other changes.

I don't know if Andres or any of the others involved are planning on
trying to get this particular feature merged in 9.4, but I wouldn't be
too surprised since (AFAIK) it's fairly self-contained and would be
useful for monitoring streaming replication setups as well.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-05-31 Thread Fabien COELHO



>> However I'm not sure that pg_stat_replication currently has the
>> necessary information on either side to measure the lag (in time
>> transactions, but how do I know when a transaction was committed? or
>> number of transactions?).
>
> The BDR codebase now has a handy function to report when a transaction
> was committed, pg_get_transaction_committime(xid).

This looks handy for monitoring a replication setup.
It should really be in core...

Any plans? Or are there other ways to get this kind of information in core?

--
Fabien.




Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-05-30 Thread Andres Freund
On 2013-05-30 15:54:01 +0800, Craig Ringer wrote:
> On 05/30/2013 03:10 PM, Craig Ringer wrote:
> > On 05/28/2013 07:52 PM, Fabien COELHO wrote:
> >> However I'm not sure that pg_stat_replication currently has the
> >> necessary information on either side to measure the lag (in time
> >> transactions, but how do I know when a transaction was committed? or
> >> number of transactions?). 
> > The BDR codebase now has a handy function to report when a transaction
> > was committed, pg_get_transaction_committime(xid) .
> >
> > It also adds pg_xlog_wait_remote_apply and pg_xlog_wait_remote_receive
> > that can be used with pg_current_xlog_location() to wait until one or
> > all replicas have caught up, or with LSNs from pg_stat_replication to
> > (say) wait until all replicas have caught up with the most up-to-date one.
> >
> > I don't think these depend on anything BDR-specific
> They do, however, require changes to Pg core. These aren't functions you
> can just borrow and add to an extension, they require additional changes
> to core to collect the data they use.

pg_xlog_wait_remote_receive() doesn't require changes afaics and should
be easily packable as an extension. We might want to make it use the
sync commit infrastructure at some point instead of essentially busy
waiting, but...

'committs' - the mapping of xids to timestamp certainly does though.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-05-30 Thread Craig Ringer
On 05/30/2013 03:10 PM, Craig Ringer wrote:
> On 05/28/2013 07:52 PM, Fabien COELHO wrote:
>> However I'm not sure that pg_stat_replication currently has the
>> necessary information on either side to measure the lag (in time
>> transactions, but how do I know when a transaction was committed? or
>> number of transactions?). 
> The BDR codebase now has a handy function to report when a transaction
> was committed, pg_get_transaction_committime(xid) .
>
> It also adds pg_xlog_wait_remote_apply and pg_xlog_wait_remote_receive
> that can be used with pg_current_xlog_location() to wait until one or
> all replicas have caught up, or with LSNs from pg_stat_replication to
> (say) wait until all replicas have caught up with the most up-to-date one.
>
> I don't think these depend on anything BDR-specific
They do, however, require changes to Pg core. These aren't functions you
can just borrow and add to an extension, they require additional changes
to core to collect the data they use.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-05-30 Thread Craig Ringer
On 05/28/2013 07:52 PM, Fabien COELHO wrote:
>
> However I'm not sure that pg_stat_replication currently has the
> necessary information on either side to measure the lag (in time
> transactions, but how do I know when a transaction was committed? or
> number of transactions?). 

The BDR codebase now has a handy function to report when a transaction
was committed, pg_get_transaction_committime(xid).

It also adds pg_xlog_wait_remote_apply and pg_xlog_wait_remote_receive
that can be used with pg_current_xlog_location() to wait until one or
all replicas have caught up, or with LSNs from pg_stat_replication to
(say) wait until all replicas have caught up with the most up-to-date one.

I don't think these depend on anything BDR-specific, though Andres or
Álvaro would be able to say for sure. Take a look in:

git://git.postgresql.org/git/users/andresfreund/postgres.git

on the 'bdr' branch. Be aware that it is rebased regularly, though the
'0.4' tag applied earlier today will remain constant and contains the
functions of interest.

I hope this helps.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-05-28 Thread Fabien COELHO



>> You can try to use and improve the --progress option in another patch
>> submission which shows how things are going.
>
> That'll certainly be useful, but won't solve this issue. The thing is
> that with asynchronous replication you need to know how long it takes
> until all nodes are back in sync, with no replication lag.
>
> I can probably do it with a custom pgbench script, but I'm tempted to
> add support for timing that part separately with a "wait command" to run
> at the end of the benchmark.

ISTM that a separate process not related to pgbench should try to monitor
the master-slave async lag, as it is interesting information anyway...

However I'm not sure that pg_stat_replication currently has the necessary
information on either side to measure the lag (in time, but how do I know
when a transaction was committed? or in number of transactions?).
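
A rough sketch of one measure that pg_stat_replication does allow: lag in
bytes, taken as the difference between two WAL positions such as
sent_location and replay_location. It does not answer the question of lag in
time or in number of transactions. The helper names (parse_lsn,
lsn_lag_bytes) are made up for illustration, and the arithmetic assumes the
plain 64-bit position layout used from PostgreSQL 9.3 on; on the server side
pg_xlog_location_diff() performs the same subtraction.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Parse the text form of a WAL position ("hi/lo", both hexadecimal) into a
 * 64-bit byte position.  Returns 0 on success, -1 on malformed input.
 * Hypothetical helper, assuming the 64-bit layout of PostgreSQL 9.3 and later.
 */
static int
parse_lsn(const char *text, uint64_t *lsn)
{
	unsigned int hi;
	unsigned int lo;
	char		trailing;

	/* exactly two hex fields must match; a third match means trailing junk */
	if (sscanf(text, "%X/%X%c", &hi, &lo, &trailing) != 2)
		return -1;
	*lsn = ((uint64_t) hi << 32) | lo;
	return 0;
}

/*
 * Store in *lag how many bytes of WAL the replica position is behind the
 * master position.  Returns 0 on success, -1 if either position is malformed.
 */
static int
lsn_lag_bytes(const char *master_pos, const char *replica_pos, int64_t *lag)
{
	uint64_t	master;
	uint64_t	replica;

	if (parse_lsn(master_pos, &master) != 0 ||
		parse_lsn(replica_pos, &replica) != 0)
		return -1;
	*lag = (int64_t) (master - replica);
	return 0;
}
```

For example, a master position of 0/3000060 and a replica position of
0/3000000 give a lag of 96 bytes. A monitoring process could poll the two
columns and feed them to such a function, independently of pgbench.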


--
Fabien.




Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-05-28 Thread Craig Ringer
On 05/28/2013 04:13 PM, Fabien COELHO wrote:
>
> You can try to use and improve the --progress option in another patch
> submission which shows how things are going. 
That'll certainly be useful, but won't solve this issue. The thing is
that with asynchronous replication you need to know how long it takes
until all nodes are back in sync, with no replication lag.

I can probably do it with a custom pgbench script, but I'm tempted to
add support for timing that part separately with a "wait command" to run
at the end of the benchmark.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-05-28 Thread Fabien COELHO



>>> The use case of the option is to be able to generate a continuous gentle
>>> load for functional tests, e.g. in a practice session with students or for
>>> testing features on a laptop.
>>
>> If you add this to
>> https://commitfest.postgresql.org/action/commitfest_view?id=18 I'll
>> review it next month.  I have a lot of use cases for a pgbench that
>> doesn't just run at 100% all the time.
>
> As do I - in particular, if time permits I'll merge this patch into my
> working copy of pgbench so I can find the steady-state transaction rate
> where BDR replication's lag is stable and doesn't increase continually.
> Right now I don't really have any way of doing that, only measuring how
> long it takes to catch up once the test run completes.

You can try to use and improve the --progress option in another patch
submission which shows how things are going.


--
Fabien.




Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-05-27 Thread Craig Ringer
On 05/02/2013 12:56 AM, Greg Smith wrote:
> On 5/1/13 4:57 AM, Fabien COELHO wrote:
>> The use case of the option is to be able to generate a continuous gentle
>> load for functional tests, e.g. in a practice session with students or for
>> testing features on a laptop.
>
> If you add this to
> https://commitfest.postgresql.org/action/commitfest_view?id=18 I'll
> review it next month.  I have a lot of use cases for a pgbench that
> doesn't just run at 100% all the time.
As do I - in particular, if time permits I'll merge this patch into my
working copy of pgbench so I can find the steady-state transaction rate
where BDR replication's lag is stable and doesn't increase continually.
Right now I don't really have any way of doing that, only measuring how
long it takes to catch up once the test run completes.
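
One way to look for that steady-state rate, once pgbench can be throttled,
is a bisection over the target rate. This is only a sketch under stated
assumptions: the probe callback, which would have to run a throttled pgbench
for a while and report whether replication lag kept growing, is
hypothetical, and simulated_lag_grows merely stands in for it with a made-up
fixed capacity.

```c
#include <stdbool.h>

/* Probe: returns true when replication lag keeps growing at this rate. */
typedef bool (*lag_grows_fn) (double rate_tps, void *arg);

/*
 * Bisect between a rate known to be sustainable (lo_tps) and one known to
 * make lag grow (hi_tps).  Returns the highest rate found at which lag
 * stayed stable, within tolerance_tps (which must be > 0).
 */
static double
find_sustainable_rate(double lo_tps, double hi_tps, double tolerance_tps,
					  lag_grows_fn lag_grows, void *arg)
{
	/* invariant: lag is stable at lo_tps and grows at hi_tps */
	while (hi_tps - lo_tps > tolerance_tps)
	{
		double		mid_tps = lo_tps + (hi_tps - lo_tps) / 2.0;

		if (lag_grows(mid_tps, arg))
			hi_tps = mid_tps;
		else
			lo_tps = mid_tps;
	}
	return lo_tps;
}

/*
 * Stand-in probe for illustration only: pretends the replica can apply at
 * most *arg transactions per second, so any higher rate makes lag grow.
 */
static bool
simulated_lag_grows(double rate_tps, void *arg)
{
	double		capacity_tps = *(const double *) arg;

	return rate_tps > capacity_tps;
}
```

With the simulated probe and a capacity of 730 tps, a search between 0 and
2000 tps with a tolerance of 1 tps ends within 1 tps below the capacity. A
real probe would be far slower and noisier, so a coarser tolerance and
repeated measurements per step would probably be needed.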

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-05-02 Thread Fabien COELHO


Hello Greg,

> If you add this to
> https://commitfest.postgresql.org/action/commitfest_view?id=18 I'll review it
> next month.

Ok. Thanks. I just did that.

> I have a lot of use cases for a pgbench that doesn't just run at 100%
> all the time.  I had tried to simulate something with simple sleep
> calls, but I realized it was going to take a stronger math basis to do
> the job well.
>
> The situations where I expect this to be useful all require collecting
> latency data and then both plotting it and doing some statistical analysis.
> pgbench-tools computes worst-case and 90th percentile latency for example,
> along with the graph over time.  There's a useful concept that some of the
> official TPC tests have:  how high can you get the throughput while still
> keeping the latency within certain parameters. Right now we have no way to
> simulate that.  What we see with write-heavy pgbench is that latency goes
> crazy (>60 second commits sometimes) if all you do is hit the server with
> maximum throughput.  That's interesting, but it's not necessarily relevant in
> many cases.

Indeed. It is a good thing that my proposed feature can help in more
situations than my particular need.


--
Fabien.




Re: [HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-05-01 Thread Greg Smith

On 5/1/13 4:57 AM, Fabien COELHO wrote:

> The use case of the option is to be able to generate a continuous gentle
> load for functional tests, e.g. in a practice session with students or for
> testing features on a laptop.


If you add this to 
https://commitfest.postgresql.org/action/commitfest_view?id=18 I'll 
review it next month.  I have a lot of use cases for a pgbench that 
doesn't just run at 100% all the time.  I had tried to simulate 
something with simple sleep calls, but I realized it was going to take a 
stronger math basis to do the job well.


The situations where I expect this to be useful all require collecting 
latency data and then both plotting it and doing some statistical 
analysis.  pgbench-tools computes worst-case and 90th percentile latency 
for example, along with the graph over time.  There's a useful concept 
that some of the official TPC tests have:  how high can you get the 
throughput while still keeping the latency within certain parameters. 
Right now we have no way to simulate that.  What we see with write-heavy 
pgbench is that latency goes crazy (>60 second commits sometimes) if all 
you do is hit the server with maximum throughput.  That's interesting, 
but it's not necessarily relevant in many cases.
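
As a small illustration of the statistics mentioned above, worst-case and
90th percentile latency can be computed from a list of per-transaction
latencies roughly as follows. This is a sketch using the nearest-rank
percentile definition; the function names are illustrative and it is not a
description of how pgbench-tools is implemented.

```c
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator ordering doubles ascending */
static int
compare_latency(const void *a, const void *b)
{
	double		x = *(const double *) a;
	double		y = *(const double *) b;

	return (x > y) - (x < y);
}

/*
 * Sort the n latencies in place (n must be > 0) and return the value at
 * percentile pct (1..100), nearest-rank definition: the smallest sample
 * such that at least pct percent of the samples are <= it.
 */
static double
latency_percentile(double *latencies, size_t n, unsigned int pct)
{
	size_t		rank;

	qsort(latencies, n, sizeof(double), compare_latency);
	rank = (n * pct + 99) / 100;	/* ceil(n * pct / 100) in integers */
	return latencies[rank - 1];
}
```

The worst case is simply the 100th percentile. With the ten samples 12, 7,
9, 250, 8, 11, 10, 9, 13 and 60000 ms, the 90th percentile is 250 ms and the
worst case 60000 ms, which shows how a single stalled commit dominates the
maximum while barely moving the percentile.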


--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com




[HACKERS] [PATCH] add --throttle to pgbench (submission 3)

2013-05-01 Thread Fabien COELHO


Add --throttle to pgbench

Each client is throttled to the specified rate, which can be expressed in
tps or in time (s, ms, us). Throttling is achieved by scheduling
transactions along a Poisson distribution.


This is an update of the previous proposal which fixes a typo in the sgml
documentation.


The use case of the option is to be able to generate a continuous gentle
load for functional tests, e.g. in a practice session with students or for
testing features on a laptop.
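
For readers who want to experiment with the rate specification described
above (tps, or a delay in s, ms or us), the conversion to a per-client delay
in microseconds can be sketched as below. This is an independent
illustration, assuming strict unit suffixes and a hypothetical function name
parse_throttle_spec; it is stricter than the rough unit handling in the
attached patch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Convert a throttle specification to a per-client delay in microseconds.
 * Accepted forms: "0.025" or "0.025s" (seconds), "25ms", "25000us", "40tps".
 * Returns -1 if the specification is not understood or out of range.
 */
static int64_t
parse_throttle_spec(const char *spec)
{
	char	   *unit;
	double		value = strtod(spec, &unit);
	double		delay_us;

	if (unit == spec || !(value > 0.0))
		return -1;				/* no number, or not a positive one */

	if (strcmp(unit, "") == 0 || strcmp(unit, "s") == 0)
		delay_us = value * 1000000.0;
	else if (strcmp(unit, "ms") == 0)
		delay_us = value * 1000.0;
	else if (strcmp(unit, "us") == 0)
		delay_us = value;
	else if (strcmp(unit, "tps") == 0)
		delay_us = 1000000.0 / value;
	else
		return -1;				/* unknown unit */

	/* reject delays below 1 us or absurdly large ones (also catches inf) */
	if (!(delay_us >= 1.0 && delay_us <= 1e15))
		return -1;
	return (int64_t) (delay_us + 0.5);	/* round to nearest microsecond */
}
```

The four sample specs from the usage text, 0.025, 40tps, 25ms and 25000us,
all map to the same delay of 25000 us.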


--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index bc01f07..0142ed0 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -137,6 +137,12 @@ int			unlogged_tables = 0;
 double		sample_rate = 0.0;
 
 /*
+ * whether clients are throttled to a given rate, expressed as a delay in us.
+ * 0, the default means no throttling.
+ */
+int64		throttle = 0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -204,6 +210,8 @@ typedef struct
 	int			nvariables;
 	instr_time	txn_begin;		/* used for measuring transaction latencies */
 	instr_time	stmt_begin;		/* used for measuring statement latencies */
+	int64		trigger;		/* previous/next throttling (us) */
+	bool		throttled;  /* whether current transaction was throttled */
 	int			use_file;		/* index in sql_files for this client */
 	bool		prepared[MAX_FILES];
 } CState;
@@ -361,6 +369,9 @@ usage(void)
 		   "  -S   perform SELECT-only transactions\n"
 	 "  -t NUM   number of transactions each client runs (default: 10)\n"
 		   "  -T NUM   duration of benchmark test in seconds\n"
+		   "  -H SPEC, --throttle SPEC\n"
+		   "   delay in second to throttle each client\n"
+		   "   sample specs: 0.025 40tps 25ms 25000us\n"
 		   "  -v   vacuum all four standard tables before tests\n"
 		   "\nCommon options:\n"
 		   "  -d print debugging output\n"
@@ -1027,7 +1038,7 @@ top:
 			}
 		}
 
-		if (commands[st->state]->type == SQL_COMMAND)
+		if (!st->throttled && commands[st->state]->type == SQL_COMMAND)
 		{
 			/*
 			 * Read and discard the query result; note this is not included in
@@ -1049,26 +1060,54 @@ top:
 			discard_response(st);
 		}
 
+		/* some stuff done at the end */
 		if (commands[st->state + 1] == NULL)
 		{
-			if (is_connect)
+			/* disconnect if required and needed */
+			if (is_connect && st->con)
 			{
 				PQfinish(st->con);
 				st->con = NULL;
 			}
 
-			++st->cnt;
-			if ((st->cnt >= nxacts && duration <= 0) || timer_exceeded)
-				return clientDone(st, true);	/* exit success */
+			/* update transaction counter once, and possibly end */
+			if (!st->throttled)
+			{
+				++st->cnt;
+				if ((st->cnt >= nxacts && duration <= 0) || timer_exceeded)
+					return clientDone(st, true);	/* exit success */
+			}
+
+			/* handle throttling once, as the last post-transaction stuff */
+			if (throttle && !st->throttled)
+			{
+				/* compute delay to approximate a Poisson distribution
+				 * 100 => 13.8 .. 0 multiplier
+				 * if transactions are too slow or a given wait shorter than
+				 * a transaction, the next transaction will start right away.
+				 */
+				int64 wait = (int64)
+					throttle * -log(getrand(thread, 1, 100)/100.0);
+				st->trigger += wait;
+				st->sleeping = 1;
+				st->until = st->trigger;
+				st->throttled = true;
+				if (debug)
+					fprintf(stderr, "client %d throttling %d us\n",
+							st->id, (int) wait);
+				return true;
+			}
 		}
 
 		/* increment state counter */
 		st->state++;
 		if (commands[st->state] == NULL)
 		{
+			/* reset */
 			st->state = 0;
 			st->use_file = (int) getrand(thread, 0, num_files - 1);
 			commands = sql_files[st->use_file];
+			st->throttled = false;
 		}
 	}
 
@@ -2086,6 +2125,7 @@ main(int argc, char **argv)
 		{"unlogged-tables", no_argument, &unlogged_tables, 1},
 		{"sampling-rate", required_argument, NULL, 4},
 		{"aggregate-interval", required_argument, NULL, 5},
+		{"throttle", required_argument, NULL, 'H'},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -2152,7 +2192,7 @@ main(int argc, char **argv)
 	state = (CState *) pg_malloc(sizeof(CState));
 	memset(state, 0, sizeof(CState));
 
-	while ((c = getopt_long(argc, argv, "ih:nvp:dqSNc:j:Crs:t:T:U:lf:D:F:M:", long_options, &optindex)) != -1)
+	while ((c = getopt_long(argc, argv, "ih:nvp:dqSNc:j:Crs:t:T:U:lf:D:F:M:H:", long_options, &optindex)) != -1)
 	{
 		switch (c)
 		{
@@ -2307,6 +2347,26 @@ main(int argc, char **argv)
 					exit(1);
 				}
 				break;
+			case 'H':
+			{
+				/* get a double from the beginning of option value */
+				double throttle_value = atof(optarg);
+				if (throttle_value <= 0.0)
+				{
+					fprintf(stderr, "invalid throttle value: %s\n", optarg);
+					exit(1);
+				}
+				/* rough handling of possible units */
+				if (strstr(optarg, "us"))
+					throttle = (int64) throttle_value;
+				else if (strstr(optarg, "ms"))
+					throttle = (int64) (1000.0 * throttle_value);
+				else if (strstr(optarg, "tps"))
+