Thanks for clearing that up. I'm bumping the thread with your message to ping anyone who can assist, since this use case probably affects a lot of people.
Thanks!

On Wednesday, November 20, 2013, Himanshu Vashishtha wrote:

> Re: "The 32 limit makes HBase go into stress mode, and dump all regions
> contained in those 32 WAL files."
>
> Pardon, I haven't read all your data points/details thoroughly, but the
> above statement is not true. Rather, it looks at the oldest WAL file and
> flushes those regions which would free that WAL file.
>
> But I agree that in general, with this kind of workload, we should handle
> WAL files more intelligently and free up those WAL files which don't have
> any dependency (that is, all their entries are already flushed) when
> archiving. We do that in trunk, but not in any released version.
>
>
> On Sat, Nov 16, 2013 at 11:16 AM, Asaf Mesika <asaf.mes...@gmail.com> wrote:
>
> > First, I forgot to mention that <customerId> in our case is
> > MD5(<customerId>).
> > In our case, we have so much data flowing in that we end up having a
> > region per <customerId><bucket> pretty quickly, and even that is split
> > into different regions by specific date ranges (timestamp).
> >
> > We're not witnessing a hotspot issue. I built some scripts in Java and
> > awk and saw that 66% of our customers use more than one RS.
> >
> > We have two main serious issues, a primary and a secondary one.
> >
> > Our primary issue is slow regions vs. fast regions. Recall that a region
> > represents, as I detailed before, a specific <customerId><bucket>. Some
> > customers send 50x more data than other customers in a given time frame
> > (2 hrs - 1 day). So on one RS we have regions getting 10 write requests
> > per hour next to regions getting 50k write requests per hour.
> > The region mapped to the slow-filling customerId never reaches the 256MB
> > flush limit and hence isn't flushed, while the regions mapped to the
> > fast-filling customerId flush very quickly since they fill up very
> > quickly.
> > Say the 1st WAL file contains a put for the slow-filling customerId, and
> > the fast-filling customerId fills up the rest of that file. After 20-30
> > seconds the file gets rolled, and another file fills up with the
> > fast-filling customerId. After a while we reach 32 WAL files. The 1st
> > file wasn't deleted since its region wasn't flushed. The 32 limit makes
> > HBase go into stress mode and dump all regions contained in those 32 WAL
> > files.
> > In our case, we saw that it flushes 111 regions. Many of the resulting
> > store files are 3KB-3MB in size, so our compaction queue starts filling
> > up with store files that need to be compacted.
> > At the end of the road, the RS dies.
> >
> > Our secondary issue is empty regions - we get to a situation where a
> > region is mapped to a specific <customerId>, <bucket>, and date range
> > (1/7 - 3/7). Then, once we are in August (our TTL is set to 30 days),
> > those regions become empty and will never be filled again.
> > We assume this somehow wreaks havoc in the load balancer, and MSLAB
> > probably also steals 1-2 GB of memory for those empty regions.
> >
> > Thanks!
> >
> >
> > On Sat, Nov 16, 2013 at 7:25 PM, Mike Axiak <m...@axiak.net> wrote:
> >
> > > Hi,
> > >
> > > One new key pattern that we're starting to use is a salt based on a
> > > shard. For example, let's take your key:
> > >
> > > <customerId><bucket><timestampInMs><uniqueId>
> > >
> > > Consider a shard between 0 and 15 inclusive.
> > > We determine this with:
> > >
> > > <shard> = abs(hash32(uniqueId) % 16)
> > >
> > > We can then define a salt based on the customerId and the shard:
> > >
> > > <salt> = hash32(<shard><customerId>)
> > >
> > > So the new key becomes:
> > >
> > > <salt><customerId><timestampInMs><uniqueId>
> > >
> > > This will distribute the data for a given customer across the N shards
> > > that you pick, while remaining a deterministic function of a given row
> > > key (so long as the number of shards you pick is fixed; otherwise you
> > > have to migrate the data). Placing the bucket after the customerId
> > > doesn't help distribute a single customer's data at all. Furthermore,
> > > by using a separate hash (instead of just <shard><customerId>), you're
> > > guaranteeing that new data will appear in a somewhat random location
> > > (i.e., solving the problem of adding a bunch of new data for a new
> > > customer).
> > >
> > > I have a key simulation script in Python that I can start tweaking and
> > > share with people if they'd like.
> > >
> > > Hope this helps,
> > > Mike
> > >
> > >
> > > On Sat, Nov 16, 2013 at 1:16 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > >
> > > > bq. all regions of that customer
> > > > >
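
P.S. For anyone who wants to experiment with Mike's salting idea, below is a
rough, self-contained Java sketch of how the key construction could look. The
hash function (a small FNV-1a standing in for "hash32"), the 16-shard count,
the class and helper names, and the fixed-width string layout are all
illustrative assumptions on my part, not something taken from this thread or
from Mike's Python script.

import java.nio.charset.StandardCharsets;

/**
 * Hypothetical sketch of the salted row key described above:
 *   <salt><customerId><timestampInMs><uniqueId>
 * where
 *   shard = abs(hash32(uniqueId) % NUM_SHARDS)
 *   salt  = hash32(<shard><customerId>)
 */
public class SaltedKeySketch {

    // Must stay fixed; changing it changes every salt, so existing rows
    // would have to be migrated.
    static final int NUM_SHARDS = 16;

    /** Simple 32-bit FNV-1a hash; a stand-in for whatever "hash32" is used. */
    static int hash32(String s) {
        int h = 0x811c9dc5;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return h;
    }

    static String buildKey(String customerId, long timestampMs, String uniqueId) {
        // Shard is derived from uniqueId, so one customer's rows spread over
        // NUM_SHARDS buckets instead of one contiguous key range.
        int shard = Math.abs(hash32(uniqueId) % NUM_SHARDS);
        // Salt mixes the shard with the customerId, so a brand-new customer's
        // data does not all land in a single, previously empty region.
        int salt = hash32(shard + customerId);
        // Fixed-width hex salt and zero-padded timestamp keep keys sortable
        // within a salt bucket.
        return String.format("%08x%s%013d%s", salt, customerId, timestampMs, uniqueId);
    }

    public static void main(String[] args) {
        // The same customer with two different uniqueIds usually gets two
        // different salts, i.e. lands on different regions.
        System.out.println(buildKey("cust-42", System.currentTimeMillis(), "evt-0001"));
        System.out.println(buildKey("cust-42", System.currentTimeMillis(), "evt-0002"));
    }
}

The property that matters for the slow-region/fast-region problem is that a
single customer's writes fan out over up to NUM_SHARDS key prefixes, while a
reader who knows customerId and uniqueId (or who is willing to issue one scan
per shard) can still reconstruct the salt deterministically.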