I'd say it mostly depends on your tolerance for regions being
unavailable while recovery happens. You have to account for the ZK
timeout (60 secs by default), plus the time to split the logs (I don't
have a good metric for that; it's usually fast, but you should try it
with your data), plus the time to reassign the regions and, once they
open, to replay the split logs. Some tips:

 - A lower ZK timeout means a long GC pause is more likely to take
down your region server (well, the master assumes the RS is dead).
Full GCs do happen. We have it at 40 seconds here.
 - Smaller HLogs / fewer HLogs means you will force flush regions a
lot more often, which keeps the amount of data to replay low. Here we
have hbase.regionserver.maxlogs=8 so we do flush a lot, but we serve a
live website. In the future, distributed log splitting (once
committed) will make replay faster.
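
For reference, the two knobs above map to hbase-site.xml properties;
a sketch with the values mentioned in this thread (40-second ZK
session timeout, 8 max HLogs) -- verify the defaults against your
HBase version before copying:

```xml
<!-- hbase-site.xml: tuning values discussed above -->
<property>
  <name>zookeeper.session.timeout</name>
  <!-- 40 seconds, in milliseconds; default is 60000 (60 secs) -->
  <value>40000</value>
</property>
<property>
  <name>hbase.regionserver.maxlogs</name>
  <!-- fewer outstanding HLogs forces earlier flushes,
       so there is less log data to replay after a crash -->
  <value>8</value>
</property>
```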

J-D

On Wed, Sep 29, 2010 at 6:12 AM, Daniel Einspanjer
<deinspan...@mozilla.com> wrote:
>  Question regarding configuration and tuning...
>
> Our current configuration/schema has fairly low HLog rollover sizes to keep
> the possibility of data loss to a minimum.  When we upgrade to 0.89 with
> append support, I imagine we'll be able to safely set this to a much larger
> size.  Are there any rough guidelines for what good values should be now?
>
> -Daniel
>
> On 9/28/10 6:13 PM, Buttler, David wrote:
>>
>> Fantastic news, I look forward to it
>> Dave
>>
>> -----Original Message-----
>> From: Todd Lipcon [mailto:t...@cloudera.com]
>> Sent: Tuesday, September 28, 2010 11:25 AM
>> To: user@hbase.apache.org
>> Subject: Re: Upgrading 0.20.6 ->  0.89
>>
>> On Tue, Sep 28, 2010 at 9:35 AM, Buttler, David<buttl...@llnl.gov>  wrote:
>>
>>> I currently suggest that you use the CDH3 hadoop package.  Apparently
>>> StumbleUpon has a production version of 0.89 that they are using.  It
>>> would
>>> be helpful if Cloudera put that in their distribution.
>>>
>>>
>> Working on it ;-) CDH3b3 should be available in about 2 weeks and will
>> include an HBase version that's very similar to what StumbleUpon has
>> published.
>>
>> -Todd
>>
>>>
>>> -----Original Message-----
>>> From: Mark Laffoon [mailto:mlaff...@semanticresearch.com]
>>> Sent: Tuesday, September 28, 2010 8:00 AM
>>> To: user@hbase.apache.org
>>> Subject: Upgrading 0.20.6 ->  0.89
>>>
>>> We're using 0.20.6; we have a non-trivial application using many aspects
>>> of hbase; we have a couple of customers in production; we understand this
>>> is still pre-release, however we don't want to lose any data.
>>>
>>>
>>>
>>> Will upgrading to 0.89 be a PITA?
>>>
>>> Should we expect to be able to upgrade the servers without losing data?
>>>
>>> Will there be tons of client code changes?
>>>
>>> What about configuration changes (especially little changes that will
>>> bite
>>> us)?
>>>
>>> Do we need/want to upgrade hadoop at all (we're on 0.20.2)?
>>>
>>> If we do upgrade, what is the recommended package to get it from?
>>>
>>>
>>>
>>> Thanks in advance for any or all answers,
>>>
>>> Mark
>>>
>>>
>>
>