Hi Sachin,

We have been running multiwal in production at Alibaba for over two years
and have seen no problems. Facebook also runs multiwal in production. Please
refer to HBASE-14457 <https://issues.apache.org/jira/browse/HBASE-14457> for
more details.

There's also a JIRA, HBASE-15131
<https://issues.apache.org/jira/browse/HBASE-15131>, proposing to turn on
multiwal by default. It is still under discussion, so please feel free to
share your opinion there.

Regarding the issue you hit, what is hbase.regionserver.maxlogs set to in
your environment? By default it's 32, which means the number of un-archived
WALs on each RS shouldn't exceed 32. However, when multiwal is enabled, 32
logs are allowed for each WAL group, so with two groups a single RS can hold
up to 64 WALs.
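
To make the arithmetic concrete, here is a minimal Python sketch of that
cap. The function name is made up for illustration, and the group count of 2
is an assumption inferred from the 64 figure above (I believe it is
controlled by hbase.wal.regiongrouping.numgroups, but please check your
version):

```python
def max_unarchived_wals(maxlogs_per_group: int, wal_groups: int) -> int:
    """Upper bound on un-archived WALs for one region server.

    maxlogs_per_group: value of hbase.regionserver.maxlogs (32 in this thread).
    wal_groups: number of WAL groups; 1 without multiwal, 2 is assumed here
                for multiwal (hypothetical value, verify against your config).
    """
    if maxlogs_per_group < 1 or wal_groups < 1:
        raise ValueError("both values must be positive")
    # The limit applies per group, so the per-RS cap scales with the groups.
    return maxlogs_per_group * wal_groups


single_wal_cap = max_unarchived_wals(32, 1)
multiwal_cap = max_unarchived_wals(32, 2)
print(single_wal_cap, multiwal_cap)
```

A higher cap means the "too many WALs" forced flush kicks in later, which is
probably why the problem seemed to go away after enabling multiwal.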

Let me explain further how this leads to RegionTooBusyException:
1. If the number of un-archived WALs exceeds the setting, the RS checks the
oldest WAL and flushes all regions involved in it.
2. If the data ingestion rate is high and the WAL keeps rolling, many small
HFiles get flushed out, faster than compaction can keep up with.
3. When the number of HFiles in one store exceeds
hbase.hstore.blockingStoreFiles (10 by default), the flush is delayed
for hbase.hstore.blockingWaitTime (90s by default).
4. While data ingestion continues but the flush is delayed, the memstore
size can exceed its upper limit, and RegionTooBusyException is thrown.
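
Steps 3 and 4 can be sketched as a rough back-of-the-envelope model in
Python. This is not HBase code: the function names are hypothetical, and the
128 MiB flush size and block multiplier of 4 are assumed values that may
differ in your setup. It only estimates whether a memstore would reach its
blocking size while a flush is being held back:

```python
MIB = 1024 * 1024


def blocking_memstore_size(flush_size_bytes: int, block_multiplier: int) -> int:
    # Assumption: the blocking limit is the flush size times a multiplier.
    return flush_size_bytes * block_multiplier


def seconds_until_blocked(current_bytes: int, ingest_bytes_per_s: float,
                          blocking_bytes: int) -> float:
    # Time for the memstore to grow from its current size to the limit.
    if current_bytes >= blocking_bytes:
        return 0.0
    if ingest_bytes_per_s <= 0:
        return float("inf")
    return (blocking_bytes - current_bytes) / ingest_bytes_per_s


def delayed_flush_hits_limit(store_files: int, blocking_store_files: int,
                             blocking_wait_s: float, current_bytes: int,
                             ingest_bytes_per_s: float,
                             blocking_bytes: int) -> bool:
    # Step 3: the flush is only delayed when the store has too many HFiles.
    if store_files <= blocking_store_files:
        return False
    # Step 4: writes get rejected if the limit is reached during the delay.
    wait = seconds_until_blocked(current_bytes, ingest_bytes_per_s, blocking_bytes)
    return wait < blocking_wait_s


limit = blocking_memstore_size(128 * MIB, 4)  # assumed values
fast = delayed_flush_hits_limit(11, 10, 90, 128 * MIB, 10 * MIB, limit)
slow = delayed_flush_hits_limit(11, 10, 90, 128 * MIB, 1 * MIB, limit)
print(limit // MIB, fast, slow)
```

With these assumed numbers, 10 MiB/s of ingestion into one region fills the
remaining 384 MiB in about 38 seconds, well inside the 90s delay, while
1 MiB/s does not.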

Hope this information helps.

Best Regards,
Yu

On 6 June 2017 at 13:39, Sachin Jain <sachinjain...@gmail.com> wrote:

> Hi,
>
> I was in the middle of a situation where I was getting
> *RegionTooBusyException* with log something like:
>
>     *Above Memstore limit, regionName = X ... memstore size = Y and
> blockingMemstoreSize = Z*
>
> This potentially hinted me towards *hotspotting* of a particular region. So
> I fixed my keyspace partitioning to have more uniform distribution per
> region. It did not completely fix the problem but definitely delayed it a
> bit.
>
> Next thing, I enabled *multiWal*. As I remember there is a configuration
> which leads to flushing of memstores when the threshold of wal is reached.
> Upon doing this, problem seems to go away.
>
> But this raises a couple of questions:
>
> 1. Are there any repercussions of using *multiWal* in a production
> environment?
> 2. If there are no repercussions and only benefits of using *multiWal*, why
> is this not turned on by default. Let other consumers turn it off in
> certain (whatever) scenarios.
>
> PS: *Hbase Configuration*
> Single Node (Local Setup) v1.3.1 Ubuntu 16 Core machine.
>
> Thanks
> -Sachin
>
