Very good analysis, Suresh. I agree with you that we should keep this simple and not introduce a lot of overhead to determine the constants.

--
Øystein

Suresh Thalamati wrote:

My two cents:
If the goal is to auto-tune the checkpoint interval, I think the amount of log generated is a good indication of how fast the system is. By finding out how fast the log is being generated, one can predict, within a reasonable error, how long recovery will take on that system.

It might be worth finding a simple solution to start with, rather than trying to get it perfect. How about a simple scheme to tune the checkpoint interval, like the following:

Say on a particular system:

1) We can generate N amount of log in X amount of time on a running system.

2) R is the fraction of time it takes to recover the log generated in X amount of time. One can pre-calculate this factor by doing some tests instead of trying to find it on each boot.

3) Y is the worst-case recovery time Derby users are likely to see by default.

4) C is the amount of log generated after which a checkpoint should be scheduled.

The ideal checkpoint interval (in log size) should be something like:

   C = (N / X) * (Y / R)


For example:

X = 5 min, N = 50 MB, Y = 1 min, R = 0.5

C = (50 / 5) * (1 / 0.5) = 20 MB
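The arithmetic above can be sketched in a few lines of Java. This is just an illustration of the proposed formula, not Derby code; the class and parameter names are made up for the example.

```java
public class CheckpointTuning {
    /**
     * Estimate the checkpoint log interval C (in MB).
     *
     * @param n log generated (MB) during the sample window
     * @param x length of the sample window (minutes)
     * @param y target worst-case recovery time (minutes)
     * @param r pre-calculated fraction: recovery time divided by the
     *          time it took to generate that log
     * @return log size (MB) after which the next checkpoint should run
     */
    static double checkpointInterval(double n, double x, double y, double r) {
        // C = (N / X) * (Y / R): the log rate times the amount of
        // generation time that Y minutes of recovery corresponds to.
        return (n / x) * (y / r);
    }

    public static void main(String[] args) {
        // The example from the mail: X = 5 min, N = 50 MB, Y = 1 min, R = 0.5
        System.out.println(checkpointInterval(50, 5, 1, 0.5) + " MB"); // 20.0 MB
    }
}
```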


One could use a formula like the above at each checkpoint to tune when the next checkpoint should occur. I understand that the first time it will be completely off, but the interval will stabilize after a few checkpoints.
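One way the "stabilizes after a few checkpoints" part could work is to smooth the measured log rate at each checkpoint, e.g. with an exponentially weighted average. Again, this is only a hypothetical sketch; the names, the initial guess, and the smoothing weight are all assumptions, not anything Derby actually does.

```java
public class AdaptiveCheckpoint {
    static final double TARGET_RECOVERY_MIN = 1.0; // Y, desired recovery time
    static final double RECOVERY_FRACTION = 0.5;   // R, pre-calculated off-line
    static final double SMOOTHING = 0.5;           // weight given to the newest sample

    // Initial guess for the log rate (MB/min); the first interval
    // computed from it may be completely off, as noted above.
    private double logRateMbPerMin = 10.0;

    /** Called at each checkpoint with what happened since the previous one. */
    double nextIntervalMb(double logWrittenMb, double elapsedMin) {
        double sampleRate = logWrittenMb / elapsedMin;
        // Exponentially weighted average of the observed log rate.
        logRateMbPerMin = SMOOTHING * sampleRate
                        + (1 - SMOOTHING) * logRateMbPerMin;
        // Same formula as before: C = rate * (Y / R).
        return logRateMbPerMin * (TARGET_RECOVERY_MIN / RECOVERY_FRACTION);
    }
}
```

With the constants above, a checkpoint that observed 100 MB of log in 5 minutes would blend the measured 20 MB/min with the previous 10 MB/min estimate and schedule the next checkpoint after 30 MB.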

I think that irrespective of whatever approach is finally implemented, we have to watch the overhead introduced to get it exactly right; I don't think users will really care if recovery takes a few seconds more or less.

Aside from tuning the checkpoint interval, I think we should find ways to minimize the effect of checkpoints on system throughput. I guess that is a different topic.



Thanks
-suresht
