Ok thanks for the explanation!

On Nov 1, 2014 8:20 AM, "lars hofhansl" <la...@apache.org> wrote:
> I do not believe that to be true.
> HBase only uses region boundaries to identify useful scan ranges during
> the setup of the job. These ranges will work regardless of whether the
> number of regions increases later. The worst case is that a single
> mapper might scan multiple regions (those that are the result of a
> split of the region it was supposed to scan).
> Regions are unavailable for a short time during a split, but the mappers
> are normal HBase clients and so they wait out the splits by retrying.
>
> -- Lars
>
> From: Flavio Pompermaier <pomperma...@okkam.it>
> To: user@hbase.apache.org
> Sent: Friday, October 31, 2014 10:23 AM
> Subject: Re: Region split during mapreduce
>
> The problem is that I don't know whether what they say at that link is
> true or not.
> In the past I experienced several problems running mapreduce jobs on a
> "live" HBase table, but I didn't know that mapreduce jobs could crash
> while a region was splitting.
> Do I have to create a snapshot if I want to use TableSnapshotInputFormat,
> or does it handle the creation and deletion of a snapshot automatically?
> Is there any detailed reference about how to deal with such an event
> during mapreduce jobs?
>
> Thanks for the support,
> Flavio
>
> On Fri, Oct 31, 2014 at 6:12 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> > Flavio:
> > Have you considered using TableSnapshotInputFormat ?
> >
> > See TableMapReduceUtil#initTableSnapshotMapperJob()
> >
> > Cheers
> >
> > On Fri, Oct 31, 2014 at 10:01 AM, Flavio Pompermaier <pomperma...@okkam.it>
> > wrote:
> >
> > > Is there anybody here..?
> > >
> > > On Thu, Oct 30, 2014 at 2:28 PM, Flavio Pompermaier <pomperma...@okkam.it>
> > > wrote:
> > >
> > > > Any help about this..?
> > > >
> > > > On Wed, Oct 29, 2014 at 9:08 AM, Flavio Pompermaier <pomperma...@okkam.it>
> > > > wrote:
> > > >
> > > >> Hi to all,
> > > >> I was reading
> > > >> http://www.abcn.net/2014/07/spark-hbase-result-keyvalue-bytearray.html?m=1
> > > >> and they say "still using
> > > >> org.apache.hadoop.hbase.mapreduce.TableInputFormat is a big problem,
> > > >> your job will fail when one of HBase Region for target HBase table is
> > > >> splitting! because the original region will be offline by splitting".
> > > >>
> > > >> Is that true?
> > > >> Is there a solution to that?
> > > >>
> > > >> Best,
> > > >> Flavio
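On the snapshot question asked above: TableSnapshotInputFormat reads an already-existing snapshot, so creating the snapshot before the job and deleting it afterwards is the caller's responsibility. A minimal driver sketch against the HBase 1.x client API (table name, snapshot name, restore dir, and MyMapper are placeholders, not from this thread):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class SnapshotScanJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {

      // 1. Take the snapshot yourself before submitting the job --
      //    TableSnapshotInputFormat does not create one for you.
      admin.snapshot("mytable-snap", TableName.valueOf("mytable"));

      Job job = Job.getInstance(conf, "scan-over-snapshot");
      Scan scan = new Scan();

      // 2. Point the job at the snapshot instead of the live table.
      //    The restore dir must be a temp directory on the same
      //    filesystem as the HBase root dir.
      TableMapReduceUtil.initTableSnapshotMapperJob(
          "mytable-snap", scan, MyMapper.class,
          ImmutableBytesWritable.class, Result.class,
          job, true, new Path("/tmp/snap-restore"));

      job.waitForCompletion(true);

      // 3. Clean up the snapshot when done -- again, not automatic.
      admin.deleteSnapshot("mytable-snap");
    }
  }
}
```

Since the mappers read the snapshot's files directly from HDFS rather than going through the region servers, region splits on the live table cannot affect the job while it runs.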