Hello Tatsuo,

I think I'm starting to understand what's going on.  Suppose n
transactions are issued by pgbench and it decides a schedule d(0),
d(1), ... d(n-1) for them. The schedule d(i) (which is stored in
st->until) is decided by the following code:

                int64 wait = (int64)
                        throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
                thread->throttle_trigger += wait;
                st->until = thread->throttle_trigger;

Yep. Let us say d(i) is the target starting time of transaction i, that is, "throttle_trigger" in the code above.
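
For reference, here is a minimal standalone sketch of how that schedule is built: each wait is drawn from an exponential distribution, so the target starts approximate a Poisson process at the target rate. The rand() call merely stands in for getrand(thread, 1, 1000); everything else is illustrative, not the actual pgbench code.

    /* Sketch only: build the schedule d(0), d(1), ... by accumulating
     * exponentially distributed waits, as in the snippet above. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    int
    main(void)
    {
        double  throttle_delay = 100.0;   /* target mean time between starts */
        int64_t throttle_trigger = 0;     /* cumulative schedule, i.e. d(i) */
        int     i;

        srand(42);
        for (i = 0; i < 5; i++)
        {
            /* uniform integer in [1, 1000], standing in for getrand(thread, 1, 1000) */
            long    u = 1 + rand() % 1000;
            int64_t wait = (int64_t) (throttle_delay * -log(u / 1000.0));

            throttle_trigger += wait;     /* d(i): target start of transaction i */
            printf("d(%d) = %lld\n", i, (long long) throttle_trigger);
        }
        return 0;
    }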

st->until represents the time by which the transaction should be
finished. Now suppose transaction i finishes at t(i).

No, it is the time for the **start** of the transaction. The client is sleeping "until" this time. We can only try to control the beginning of the transaction. It ends when it ends!

So the lag l(i) = t(i) - d(i) if the transaction is behind.

Transaction i "lags behind" if it *starts* later that d(i). If it start effectively at t(i), t(i)>=d(i), lag l(i) = t(i)-d(i). When it completes is not the problem of the scheduler.

Then the next transaction i+1 begins, with lag l(i+1) = t(i+1) - d(i+1), and so on. At the end of the run, pgbench shows the average lag as (l(0) + ... + l(n-1))/n.

Yes.

Now suppose we have 3 transactions, each with the following values:

d(0) = 10
d(1) = 20
d(2) = 30

t(0) = 100
t(1) = 110
t(2) = 120

That says pgbench expects a duration of 10 for each transaction.

pgbench does not expect any particular duration, but the schedule d(i) above cannot be followed if transactions take longer than 10.

With your figures above, taking d(i) as the expected start time and t(i) as the actual start time, for some reason pgbench was not able to start a transaction before time 100 (maybe the OS switched the process out to attend to other things) although it should have started at 10, so l(0) = 90. Then the second transaction starts readily at 110, but it was expected at 20, so the lag is 90 again. Same for the last one. All transactions started 90 units after their scheduled time: the cumulative lag is 270 and the average lag is 90.
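
Just to spell out that bookkeeping in code, here is a toy version of the accumulation with exactly those figures (nothing pgbench-specific, just the l(i) = t(i) - d(i) sum):

    /* Toy check of the example above: l(i) = t(i) - d(i),
     * average lag = sum of l(i) over the number of transactions. */
    #include <stdio.h>

    int
    main(void)
    {
        double d[] = {10, 20, 30};      /* scheduled start times d(i) */
        double t[] = {100, 110, 120};   /* actual start times t(i) */
        int    n = 3, i;
        double total_lag = 0.0;

        for (i = 0; i < n; i++)
            total_lag += t[i] - d[i];   /* lag of transaction i */

        printf("total lag = %g, average lag = %g\n", total_lag, total_lag / n);
        /* prints: total lag = 270, average lag = 90 */
        return 0;
    }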

Let me take another example.

 - Scheduled start time d(0 .. 3) = 0 20 40 60
 - Durations D(0 .. 3) = 15 25 50 10
 - Actual start times of the transactions
   t(0) = 3 (it is late by 3 for some reason), completes by 18
   t(1) = t(0) + D(0) + some more lag for some reason = 21, completes by 46
   t(2) = t(1) + D(1) + no additional lag here = 46, completes by 96
   t(3) = t(2) + D(2) + some more lag for some reason = 97, completes by 107

So l(0 .. 3) = 3-0, 21-20, 46-40, 97-60 = 3, 1, 6, 37.

Total lag is 3 + 1 + 6 + 37 = 47

Average lag = 47/4 = 11.75

In this example, some lag is due to the process itself (3 at the beginning, 1 on the second transaction), and the rest is due to transaction durations, which impact the following transactions.
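
One way to reproduce those numbers in code, assuming a transaction starts at the later of its scheduled time and the end of the previous one, plus an arbitrary extra delay (the delays 3, 1, 0, 1 below are made up to match the figures above):

    /* Sketch of the second example: the actual start is the later of the
     * scheduled start and the end of the previous transaction, plus some
     * extra process-induced delay.  Reproduces lags 3, 1, 6, 37. */
    #include <stdio.h>

    int
    main(void)
    {
        double d[]     = {0, 20, 40, 60};    /* scheduled start times d(i) */
        double dur[]   = {15, 25, 50, 10};   /* transaction durations D(i) */
        double extra[] = {3, 1, 0, 1};       /* arbitrary extra delays */
        int    n = 4, i;
        double prev_end = 0.0, total_lag = 0.0;

        for (i = 0; i < n; i++)
        {
            double start = (d[i] > prev_end ? d[i] : prev_end) + extra[i];
            double lag   = start - d[i];

            printf("t(%d) = %g, l(%d) = %g\n", i, start, i, lag);
            prev_end = start + dur[i];
            total_lag += lag;
        }
        printf("total lag = %g, average lag = %g\n", total_lag, total_lag / n);
        /* prints: total lag = 47, average lag = 11.75 */
        return 0;
    }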

However, pgbench actually calculates it like this:

average lag = (t(0)-d(0) + t(1)-d(1) + t(2)-d(2))/3
            = (100-10 + 110-20 + 120-30)/3
            = (90 + 90 + 90)/3
            = 90

Yes, this is correct.

It looks like too much lag is calculated. The difference between the
lag that pgbench calculates and the expected one grows if a lag
happens earlier. I guess the reason my Linux box shows a bigger lag
than Mac OS X is that the first transaction, or the early
transactions, run more slowly than the ones run later.

Possibly.

Of course this conclusion depends on the definition of pgbench's
"average rate limit lag", so you might have a different opinion.
However, the way pgbench calculates the average lag is not what I
expected, at least.

Indeed, I think it really depends on your definition of lag. The lag I defined is the time between the scheduled transaction start time and the actual transaction start time. It is a measure of how well pgbench is able to follow the stochastic process; if pgbench is constantly late then the lag accumulates a lot, but that basically means that there are not enough (CPU) resources to run pgbench cleanly.

What you seem to expect is the average transaction latency. This is also a useful measure, and I'm planning to add a clean measurement of it under throttling, and also with --progress, as the current computation based on tps is not meaningful under throttling.
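
To illustrate the distinction, here is a rough sketch of the bookkeeping I have in mind (hypothetical, not what pgbench currently does; the struct and field names are just for illustration): the schedule lag compares the actual start to d(i), while the latency compares the end to the actual start. The numbers are those of the second example above.

    /* Hypothetical bookkeeping, per transaction i:
     *   schedule lag : actual_start - scheduled_start (how late we began)
     *   latency      : end - actual_start             (how long it ran)  */
    #include <stdio.h>

    typedef struct
    {
        double scheduled_start;   /* d(i), decided by the throttling schedule */
        double actual_start;      /* t(i), when the transaction really began */
        double end;               /* when the transaction completed */
    } txn_times;

    int
    main(void)
    {
        txn_times txn[] = {
            {0, 3, 18}, {20, 21, 46}, {40, 46, 96}, {60, 97, 107}
        };
        int    n = 4, i;
        double total_lag = 0.0, total_latency = 0.0;

        for (i = 0; i < n; i++)
        {
            total_lag     += txn[i].actual_start - txn[i].scheduled_start;
            total_latency += txn[i].end - txn[i].actual_start;
        }
        printf("average schedule lag = %g\n", total_lag / n);      /* 11.75 */
        printf("average latency     = %g\n", total_latency / n);   /* 25 */
        return 0;
    }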

But that is a plan for the next commitfest!

--
Fabien.

