It's a work in progress. That information is currently published by every region server in the master cluster (since it's push replication, not pull) through JMX under the name "ageOfLastShippedOp". It's far from perfect though: if a region server fails to replicate and starts retrying, the age won't change but the actual lag will keep going up. It will also have to be revisited when we add multiple slaves, since publishing the same single metric for multiple slaves really wouldn't work.
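
To work around the "age stops changing while it retries" problem on the monitoring side, one option is to track how long the reported value has been frozen. Below is a minimal sketch of that idea, not anything that ships with HBase. The class name ShippedAgeWatcher, its methods, and the threshold are all hypothetical, and it assumes you already poll "ageOfLastShippedOp" from each region server over JMX and pass in the sampled value yourself. Note that a frozen value is ambiguous: it can also mean the cluster is simply idle, so treat a hit as a reason to look at the logs, not as proof that replication is stuck.

```java
/**
 * Hypothetical helper (not part of HBase): watches successive samples of a
 * region server's "ageOfLastShippedOp" value and reports when the value has
 * stayed unchanged for at least a configured amount of time.
 */
class ShippedAgeWatcher {
    private final long stallThresholdMs; // how long a frozen value is tolerated
    private long lastAgeMs = -1;         // last observed metric value
    private long lastChangeAtMs = -1;    // sample time at which the value last changed

    ShippedAgeWatcher(long stallThresholdMs) {
        this.stallThresholdMs = stallThresholdMs;
    }

    /**
     * Records one sample taken at nowMs.
     * Returns true if the value has been frozen for at least the threshold.
     */
    boolean observe(long ageMs, long nowMs) {
        if (lastChangeAtMs < 0 || ageMs != lastAgeMs) {
            // First sample, or the metric moved: restart the clock.
            lastAgeMs = ageMs;
            lastChangeAtMs = nowMs;
            return false;
        }
        return nowMs - lastChangeAtMs >= stallThresholdMs;
    }

    /** Time since the value last changed, or 0 if nothing was sampled yet. */
    long frozenForMs(long nowMs) {
        return lastChangeAtMs < 0 ? 0 : nowMs - lastChangeAtMs;
    }
}
```

You would keep one watcher per region server, call observe() on every poll, and alert when it returns true.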

J-D

On Thu, Mar 3, 2011 at 9:10 AM, Bill Graham <billgra...@gmail.com> wrote:
> Actually, how far behind replication is w.r.t. edit logs is different
> than how out of sync they are, but you get the idea.
>
> On Thu, Mar 3, 2011 at 9:07 AM, Bill Graham <billgra...@gmail.com> wrote:
>> One more question for the FAQ:
>>
>> 6. Is it possible for an admin to tell just how out of sync the two
>> clusters are? Something like Seconds_Behind_Master in MySQL's SHOW
>> SLAVE STATUS?
>>
>>
>> On Wed, Mar 2, 2011 at 9:32 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
>>> Although, I would add that this feature is still experimental so who knows :)
>>>
>>> I think the worst that happened to us was that replication was broken
>>> (see the jira where if the master loses it's zk session with the slave
>>> zk ensemble, it requires a HBase restart on the master side) for a few
>>> days because of maintenance of the link between the two datacenters
>>> which took more than a minute. When we finally did restart the master
>>> cluster, it had to process about 2TBs of HLogs... those ICVs can
>>> really generate a lot of data!
>>>
>>> J-D
>>>
>>> On Wed, Mar 2, 2011 at 9:25 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
>>>>> 5. If one is adding replication on the *production* Master cluster,
>>>>> what's the worst thing that can happen to this Master cluster?
>>>>> Nothing scary other than changing configs + interruption during a
>>>>> restart? (which is currently still bad because of region assignments?)
>>>>>
>>>>
>>>> The replication code is pretty much encapsulated from the rest of the
>>>> region server code, it won't mess with your Puts or change your
>>>> birthday date.
>>>>
>>>> With 0.90 the regions are reassigned where they were before, so it's
>>>> really just the block cache that gets screwed.
>>>>
>>>> J-D