Re: Questions about HBase Cluster Replication

Jean-Daniel Cryans Thu, 03 Mar 2011 09:21:39 -0800

It's a work in progress. That information is currently published by
every region server in the master cluster (since it's push
replication, not pull) through JMX under the name
"ageOfLastShippedOp". It's really not perfect though: if a region
server fails to replicate and starts retrying, the age won't change
but the actual lag will go up. It will also have to be revisited when
we add multiple slaves, since you don't really want to publish the
same metric for multiple slaves... it really wouldn't work.
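
For anyone who wants to poll that metric, here is a minimal sketch in
Java using only the standard javax.management API. The attribute name
comes from above, but the MBean's object name isn't given here, so the
sketch assumes nothing about it and simply scans every MBean for an
attribute called "ageOfLastShippedOp". The class name
ReplicationAgeProbe, the findAttribute helper and the host:port
argument are made up for illustration, and the service URL form
assumes the region server exposes the usual JMX RMI connector.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper class, not part of HBase.
class ReplicationAgeProbe {

    // Returns every MBean on the connection that exposes the named
    // attribute, mapped to the attribute's current value.
    static Map<ObjectName, Object> findAttribute(MBeanServerConnection conn,
                                                 String attribute) throws Exception {
        Map<ObjectName, Object> found = new LinkedHashMap<>();
        // queryNames(null, null) lists all registered MBeans.
        for (ObjectName name : conn.queryNames(null, null)) {
            for (MBeanAttributeInfo info : conn.getMBeanInfo(name).getAttributes()) {
                if (attribute.equals(info.getName())) {
                    found.put(name, conn.getAttribute(name, attribute));
                }
            }
        }
        return found;
    }

    static void print(Map<ObjectName, Object> values) {
        values.forEach((name, value) -> System.out.println(name + " = " + value));
    }

    public static void main(String[] args) throws Exception {
        String attribute = "ageOfLastShippedOp";
        if (args.length == 0) {
            // No argument: scan this JVM's own MBean server (smoke test only).
            print(findAttribute(ManagementFactory.getPlatformMBeanServer(), attribute));
            return;
        }
        // args[0] is an assumed "host:port" of a region server's JMX endpoint.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + args[0] + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            print(findAttribute(connector.getMBeanServerConnection(), attribute));
        }
    }
}
```

Since the age stops moving while a region server is stuck retrying, a
flat reading from a probe like this shouldn't be taken as proof that
replication has caught up.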


J-D

On Thu, Mar 3, 2011 at 9:10 AM, Bill Graham <billgra...@gmail.com> wrote:
> Actually, how far behind replication is w.r.t. edit logs is different
> than how out of sync they are, but you get the idea.
>
> On Thu, Mar 3, 2011 at 9:07 AM, Bill Graham <billgra...@gmail.com> wrote:
>> One more question for the FAQ:
>>
>> 6. Is it possible for an admin to tell just how out of sync the two
>> clusters are? Something like Seconds_Behind_Master in MySQL's SHOW
>> SLAVE STATUS?
>>
>>
>> On Wed, Mar 2, 2011 at 9:32 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
>>> Although, I would add that this feature is still experimental so who knows :)
>>>
>>> I think the worst that happened to us was that replication was broken
>>> (see the jira where, if the master loses its zk session with the slave
>>> zk ensemble, it requires an HBase restart on the master side) for a few
>>> days because of maintenance of the link between the two datacenters
>>> which took more than a minute. When we finally did restart the master
>>> cluster, it had to process about 2 TB of HLogs... those ICVs can
>>> really generate a lot of data!
>>>
>>> J-D
>>>
>>> On Wed, Mar 2, 2011 at 9:25 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
>>>>> 5. If one is adding replication on the *production* Master cluster, what's the
>>>>> worst thing that can happen to this Master cluster?  Nothing scary other than
>>>>> changing configs + interruption during a restart? (which is currently still bad
>>>>> because of region assignments?)
>>>>>
>>>>
>>>> The replication code is pretty much isolated from the rest of the
>>>> region server code; it won't mess with your Puts or change your
>>>> birthday date.
>>>>
>>>> With 0.90 the regions are reassigned where they were before, so it's
>>>> really just the block cache that gets screwed.
>>>>
>>>> J-D
>>>>
>>>
>>
>
