On Tue, Nov 19, 2013 at 9:13 AM, Mike James <[email protected]> wrote:

>  I am using slony1 log-shipping to replicate some data to a remote
> location thru a VPN tunnel. A cron job running every minute applies the
> logs to the remote slave. So the latency is at least one minute. Is there a
> way to more precisely measure the latency between the origin and slave?
>

I'd say that things are less deterministic than that, unfortunately.

The expected latency is actually more like 30 seconds, though there are
cases that can induce arbitrarily longer delays.

"Why 30s?" is interesting...

Consider the scenario where there are generally small sets of changes being
applied, and replication is keeping up to date nicely.  Suppose, further,
there is one SYNC generated every second.

With your "apply once per minute" approach, consider when logs are being
applied at the top of the minute.  There are 60 logs there, one from a
minute ago, one from 59 seconds ago, and so forth.  There will be one log
that is just 1s old, so that the "latency" for that log will be just 1s,
considerably better than 1m.  On average, the latency, across the 60 logs,
will be 30s.
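The averaging argument above can be checked with a couple of lines. With one SYNC per second and a batch applied once per minute, the batch contains logs aged 1s through 60s at apply time:

```python
# Ages (in seconds) of the 60 logs sitting in the batch when the
# once-per-minute cron job fires: the newest is 1s old, the oldest 60s.
ages = range(1, 61)
mean_latency = sum(ages) / len(ages)
print(mean_latency)  # 30.5 -- the "around 30s, probably a little more" below
```

(The mean of 1..60 is 30.5s, which is why "a little more than 30s" is the honest way to state it.)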

This indicates that the number you're measuring as "1 minute latency" isn't
a minimum (some logs will get applied almost instantly after generation,
if you're lucky), nor a mean (the mean is likely around 30s, probably a
little more), but rather a "worst case latency."  Under normal conditions,
you expect latency to be at most 1 minute.
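If you want to measure it rather than reason about it, one approach is to have the cron job record, for each archive file it applies, the gap between the file's creation time and the moment it was applied.  This is just a sketch, not Slony-provided tooling: the function name and the use of file mtime as a stand-in for SYNC generation time are my assumptions, and the exact archive filename format will vary.

```python
import os
import time

def apply_latencies(log_dir, apply_time=None):
    """Hypothetical helper: per-file latency in seconds, computed as
    (time of apply) minus (file mtime).  Assumes each archive file's
    mtime approximates when its SYNC was generated on the origin."""
    apply_time = apply_time if apply_time is not None else time.time()
    return {name: apply_time - os.path.getmtime(os.path.join(log_dir, name))
            for name in os.listdir(log_dir)}
```

Calling this from the cron job just before (or after) loading the files, and logging the min/mean/max of the resulting values, would give you an actual latency distribution instead of a guess.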

Unfortunately, that's not nearly the end of the story.  By a suitable
pattern of updates, one may induce arbitrary increases to that "worst case
latency."

It's pretty easy to do: I just have to set up a transaction on the origin
that does a large amount of replicable work.
    BEGIN;
    UPDATE some_replicated_table
       SET [some change]
     WHERE [lots and lots of tuples are affected];
    COMMIT;

That set of work will go into the next SYNC that is generated immediately
after the COMMIT.  Supposing the set of updates takes 5 minutes to query
from sl_log_{1,2} and assemble into a log, and it takes 10 minutes to load
that on the remote slave, then we have just induced a latency of 15
minutes.  (That it's 5 minutes and 10 minutes for those activities is
purely made up on my part; it should nonetheless be easy to believe that if
you COMMIT a transaction that touched a million tuples in replicated
tables, it will take a while to process that.  Increase the number of
tuples as needed to get whatever result worries you most!)
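The shape of that worst case is just a sum, which is worth writing down because it makes clear that neither term is bounded by your cron interval.  The numbers are the same invented ones as above:

```python
# Toy model of the induced worst case: both terms grow with the size of
# the committed transaction, independent of how often the cron job runs.
assemble_s = 5 * 60   # invented: time to query sl_log_{1,2} and build the log
apply_s = 10 * 60     # invented: time to load that log on the slave
worst_case_s = assemble_s + apply_s
print(worst_case_s / 60)  # 15.0 minutes
```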
_______________________________________________
Slony1-general mailing list
[email protected]
http://lists.slony.info/mailman/listinfo/slony1-general
