Yeah, I should have mentioned that: it's master-master, and on cdh4b1.
But replication on that specific slave table is disabled (so,
effectively, it's master-slave for this test).

Is this the same as yours (replication-config wise), or shall I enable
replication on the destination table too?
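
(For reference, a minimal sketch of how per-table replication scope is
toggled from the HBase shell in this 0.92/CDH4-era setup; 'usertable' and
'f1' are placeholder names, not the actual test table:)

  disable 'usertable'
  # REPLICATION_SCOPE 1 = ship this family's edits to peers, 0 = keep local
  alter 'usertable', {NAME => 'f1', REPLICATION_SCOPE => 1}
  enable 'usertable'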

Thanks,
Himanshu

On Tue, May 1, 2012 at 8:01 PM, Jerry Lam <chiling...@gmail.com> wrote:
> Hi Himanshu:
>
> Thanks for following up! I did look through the logs and there were some
> exceptions. I'm not sure whether those exceptions contributed to the problem I
> saw a week ago.
> I am aware of the latency between the time the master says "Nothing to
> replicate" and the time it actually takes for the data to show up on the
> slave. I remember waiting 12 hours for the replication to finish (i.e.
> starting the test before leaving the office and checking the result the next
> day), and the data was still not fully replicated.
>
> By the way, is your test running with master-slave replication or 
> master-master replication?
>
> I will resume this soon; I was busy with something else for the past week or
> so.
>
> Best Regards,
>
> Jerry
>
> On 2012-05-01, at 6:41 PM, Himanshu Vashishtha wrote:
>
>> Hello Jerry,
>>
>> Did you try this again?
>>
>> Whenever you try next, can you please share the logs somehow?
>>
>> I tried replicating your scenario today, but had no luck reproducing it. I
>> used the same workload you copied here; the master cluster has 5 nodes and
>> the slave has just 2; I made tiny regions of 8 MB (with the memstore
>> flushing at 8 MB too), so that I have 1,200+ regions even for 200k rows; I
>> ran the workload with 16, 24 and 32 client threads, but the verifyrep
>> MapReduce job says it's good.
>> Yes, I ran the verifyrep command after seeing the "there is nothing to
>> replicate" message on all the regionservers; sometimes it was a bit
>> slow.
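
(Roughly, the small-region settings and the verifyrep run described above look
like the following; the peer id '1' and the table name 'usertable' are
placeholders, not the actual test values:)

  # hbase-site.xml on the master cluster -- force many tiny regions
  #   hbase.hregion.max.filesize        = 8388608   (8 MB)
  #   hbase.hregion.memstore.flush.size = 8388608   (8 MB)

  # compare the master table against peer '1' row by row; the job reports
  # GOODROWS/BADROWS counters when it finishes
  hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication 1 usertable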
>>
>>
>> Thanks,
>> Himanshu
>>
>> On Mon, Apr 23, 2012 at 11:57 AM, Jean-Daniel Cryans
>> <jdcry...@apache.org> wrote:
>>>> I will try your suggestion today with master-slave replication enabled
>>>> from Cluster A -> Cluster B.
>>>
>>> Please do.
>>>
>>>> Last Friday, I tried to limit the variability/the moving parts of the
>>>> replication components. I reduced Cluster B to only 1 regionserver and had
>>>> Cluster A replicate data from one region only, without region splitting
>>>> (so I have a 1-to-1 region replication setup). During the benchmark, I
>>>> moved the region between different regionservers in Cluster A (note there
>>>> are still 3 regionservers in Cluster A). I ran this test 5 times and no
>>>> data was lost. Does that mean something? My feeling is that there are some
>>>> glitches/corner cases that have not been covered in cyclic replication (or
>>>> HBase replication in general). Note that this happens only when the load
>>>> is high.
>>>
>>> And have you looked at the logs? Any obvious exceptions coming up?
>>> Replication uses the normal HBase client to insert the data on the
>>> other cluster and this is what handles regions moving around.
>>>
>>>>
>>>> By the way, why do we need to have a ZooKeeper quorum not managed by HBase
>>>> for replication to work (it is described in the HBase documentation)?
>>>
>>> It says you *should* do it, not that you *need* to do it :)
>>>
>>> But basically replication is ZK-heavy, and getting a better
>>> understanding of it starts with handling ZooKeeper yourself.
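
(In practice that boils down to something like the following; the hostnames are
placeholders and this assumes the 0.92/CDH4-era configuration keys:)

  # conf/hbase-env.sh -- tell HBase not to start/stop ZooKeeper itself
  export HBASE_MANAGES_ZK=false

  # conf/hbase-site.xml -- point each cluster at its own external quorum and
  # turn replication support on
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
  <property>
    <name>hbase.replication</name>
    <value>true</value>
  </property>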
>>>
>>> J-D
>
