Hello Tatsuo,

I think I'm starting to understand what's going on.  Suppose n
transactions are issued by pgbench and it decides a schedule d(0),
d(1), ... d(n-1) for them. The schedule d(i) (which is stored in
st->until) is decided by the following code:

                int64 wait = (int64)
                        throttle_delay * -log(getrand(thread, 1, 1000)/1000.0);
                thread->throttle_trigger += wait;
                st->until = thread->throttle_trigger;

Yep. Let us say d(i) is the target starting time of transaction i, that is, "throttle_trigger" in the code above.
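
For reference, here is a minimal standalone sketch of how that schedule is built: each wait is drawn from an exponential distribution, so the target starts approximate a Poisson process at the target rate. The rand() call merely stands in for getrand(thread, 1, 1000); everything else is illustrative, not the actual pgbench code.

    /* Sketch only: build the schedule d(0), d(1), ... by accumulating
     * exponentially distributed waits, as in the snippet above. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    int
    main(void)
    {
        double  throttle_delay = 100.0;   /* target mean time between starts */
        int64_t throttle_trigger = 0;     /* cumulative schedule, i.e. d(i) */
        int     i;

        srand(42);
        for (i = 0; i < 5; i++)
        {
            /* uniform integer in [1, 1000], standing in for getrand(thread, 1, 1000) */
            long    u = 1 + rand() % 1000;
            int64_t wait = (int64_t) (throttle_delay * -log(u / 1000.0));

            throttle_trigger += wait;     /* d(i): target start of transaction i */
            printf("d(%d) = %lld\n", i, (long long) throttle_trigger);
        }
        return 0;
    }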

st->until represents the time by which the transaction should be
finished. Now suppose transaction i finishes at t(i).

No, it is the time for the **start** of the transaction. The client is sleeping "until" this time. We can only try to control the beginning of the transaction. It ends when it ends!

So the lag l(i) = t(i) - d(i) if the transaction is behind.

Transaction i "lags behind" if it *starts* later that d(i). If it start effectively at t(i), t(i)>=d(i), lag l(i) = t(i)-d(i). When it completes is not the problem of the scheduler.

Then the next transaction i+1 begins, with lag l(i+1) = t(i+1) - d(i+1), and so on. At the end of the run, pgbench shows the average lag as (l(0) + ... + l(n-1))/n.

Yes.

Now suppose we have 3 transactions, each with the following values:

d(0) = 10
d(1) = 20
d(2) = 30

t(0) = 100
t(1) = 110
t(2) = 120

That says pgbench expects a duration of 10 for each transaction.

pgbench does not expect any particular duration, but the schedule d(i) above cannot be followed if transactions take longer than 10.

With your figures above, taking d(i) as the expected start time and t(i) as the actual start time, for some reason pgbench was not able to start a transaction before time 100 (maybe the OS switched the process out to attend to other things) although it should have started at 10, so l(0) = 90. Then the second transaction starts readily at 110, but it was expected at 20, so the lag is 90 again. Same for the last one. All transactions started 90 units after their scheduled time: the cumulative lag is 270 and the average lag is 90.
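
Just to spell out that bookkeeping in code, here is a toy version of the accumulation with exactly those figures (nothing pgbench-specific, just the l(i) = t(i) - d(i) sum):

    /* Toy check of the example above: l(i) = t(i) - d(i),
     * average lag = sum of l(i) over the number of transactions. */
    #include <stdio.h>

    int
    main(void)
    {
        double d[] = {10, 20, 30};      /* scheduled start times d(i) */
        double t[] = {100, 110, 120};   /* actual start times t(i) */
        int    n = 3, i;
        double total_lag = 0.0;

        for (i = 0; i < n; i++)
            total_lag += t[i] - d[i];   /* lag of transaction i */

        printf("total lag = %g, average lag = %g\n", total_lag, total_lag / n);
        /* prints: total lag = 270, average lag = 90 */
        return 0;
    }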

Let me take another example.

 - Scheduled start time d(0 .. 3) = 0 20 40 60
 - Durations D(0 .. 3) = 15 25 50 10
 - Actual start times of the transactions
   t(0) = 3 (it is late by 3 for some reason), completes by 18
   t(1) = t(0) + D(0) + some more lag for some reason = 21, completes by 46
   t(2) = t(1) + D(1) + no additional lag here = 46, completes by 96
   t(3) = t(2) + D(2) + some more lag for some reason = 97, completes by 107

So l(0 .. 3) = 3-0, 21-20, 46-40, 97-60 = 3, 1, 6, 37.

Total lag is 3 + 1 + 6 + 37 = 47

Average lag = 47/4 = 11.75

In this example, some lag is due to the process itself (3 at the beginning, 1 on the second transaction), and the rest is due to transaction durations, which impact the following transactions.
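
One way to reproduce those numbers in code, assuming a transaction starts at the later of its scheduled time and the end of the previous one, plus an arbitrary extra delay (the delays 3, 1, 0, 1 below are made up to match the figures above):

    /* Sketch of the second example: the actual start is the later of the
     * scheduled start and the end of the previous transaction, plus some
     * extra process-induced delay.  Reproduces lags 3, 1, 6, 37. */
    #include <stdio.h>

    int
    main(void)
    {
        double d[]     = {0, 20, 40, 60};    /* scheduled start times d(i) */
        double dur[]   = {15, 25, 50, 10};   /* transaction durations D(i) */
        double extra[] = {3, 1, 0, 1};       /* arbitrary extra delays */
        int    n = 4, i;
        double prev_end = 0.0, total_lag = 0.0;

        for (i = 0; i < n; i++)
        {
            double start = (d[i] > prev_end ? d[i] : prev_end) + extra[i];
            double lag   = start - d[i];

            printf("t(%d) = %g, l(%d) = %g\n", i, start, i, lag);
            prev_end = start + dur[i];
            total_lag += lag;
        }
        printf("total lag = %g, average lag = %g\n", total_lag, total_lag / n);
        /* prints: total lag = 47, average lag = 11.75 */
        return 0;
    }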

However, pgbench actually calculates it like this:

average lag = (t(0)-d(0) + t(1)-d(1) + t(2)-d(2))/3
            = (100-10 + 110-20 + 120-30)/3
            = (90 + 90 + 90)/3
            = 90

Yes, this is correct.

It looks like too much lag is calculated. The difference between the
lag that pgbench calculates and the expected one grows if a lag
happens earlier. I guess the reason my Linux box shows a bigger lag
than Mac OS X is that the first transaction, or the early
transactions, run more slowly than the ones run later.

Possibly.

Of course this conclusion depends on the definition of pgbench's
"average rate limit lag", so you might have a different opinion.
However, the way pgbench calculates the average lag is not what I
expected, at least.

Indeed, I think it really depends on your definition of lag. The lag I defined is the time between the scheduled transaction start time and the actual transaction start time. It is a measure of how well pgbench is able to follow the stochastic process; if pgbench is constantly late then the lag accumulates a lot, but that basically means that there are not enough (CPU) resources to run pgbench cleanly.

What you seem to expect is the average transaction latency. This is also a useful measure, and I'm planning to add a clean measurement of it under throttling, and also with --progress, as the current computation based on tps is not meaningful under throttling.
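
To illustrate the distinction, here is a rough sketch of the bookkeeping I have in mind (hypothetical, not what pgbench currently does; the struct and field names are just for illustration): the schedule lag compares the actual start to d(i), while the latency compares the end to the actual start. The numbers are those of the second example above.

    /* Hypothetical bookkeeping, per transaction i:
     *   schedule lag : actual_start - scheduled_start (how late we began)
     *   latency      : end - actual_start             (how long it ran)  */
    #include <stdio.h>

    typedef struct
    {
        double scheduled_start;   /* d(i), decided by the throttling schedule */
        double actual_start;      /* t(i), when the transaction really began */
        double end;               /* when the transaction completed */
    } txn_times;

    int
    main(void)
    {
        txn_times txn[] = {
            {0, 3, 18}, {20, 21, 46}, {40, 46, 96}, {60, 97, 107}
        };
        int    n = 4, i;
        double total_lag = 0.0, total_latency = 0.0;

        for (i = 0; i < n; i++)
        {
            total_lag     += txn[i].actual_start - txn[i].scheduled_start;
            total_latency += txn[i].end - txn[i].actual_start;
        }
        printf("average schedule lag = %g\n", total_lag / n);      /* 11.75 */
        printf("average latency     = %g\n", total_latency / n);   /* 25 */
        return 0;
    }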

But that is a plan for the next commitfest!

--
Fabien.

