At Lithium, we power Klout using HBase. We load Klout scores for about 500
million users into HBase every night. We noticed that while a load was
running, the performance of klout.com was severely degraded. We also see
severely degraded performance during operations like compactions.
To mitigate this, we stood up two HBase clusters in an
"Active/Standby" configuration (not the built-in replication, but something
else entirely). We serve data from the "Active" cluster and load data into
the "Standby", then swap: the next load goes into the other cluster while we
serve from the cluster that just got the update.
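
The swap described above could be sketched roughly as follows. This is
illustrative Python only, not Lithium's actual tooling: the names `Cluster`,
`ActiveStandbyPair`, `run_on_standby_then_swap` and `nightly_load` are
hypothetical, and a plain dict stands in for an HBase table. The point is
just the ordering: heavy work touches only the standby, and the roles flip
once that work has finished.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Cluster:
    """Stand-in for one HBase cluster; `scores` mimics a table of user scores."""
    name: str
    scores: Dict[str, float] = field(default_factory=dict)


class ActiveStandbyPair:
    """Tracks which of two clusters serves reads and which takes heavy work."""

    def __init__(self, first: Cluster, second: Cluster) -> None:
        self.active = first
        self.standby = second

    def read(self, user_id: str) -> float:
        # Reads only ever touch the active cluster.
        return self.active.scores[user_id]

    def run_on_standby_then_swap(self, work: Callable[[Cluster], None]) -> None:
        # Heavy work (a bulk load, a compaction, ...) runs on the standby only.
        work(self.standby)
        # Swap roles: the freshly updated cluster starts serving.
        self.active, self.standby = self.standby, self.active


def nightly_load(new_scores: Dict[str, float]) -> Callable[[Cluster], None]:
    """Build a work function that replaces a cluster's scores (hypothetical)."""
    def work(cluster: Cluster) -> None:
        cluster.scores = dict(new_scores)
    return work


pair = ActiveStandbyPair(Cluster("a", {"u1": 40.0}), Cluster("b"))
pair.run_on_standby_then_swap(nightly_load({"u1": 42.0}))
print(pair.active.name, pair.read("u1"))  # b 42.0
```

Because `run_on_standby_then_swap` takes any callable, the same shape would
cover other disruptive maintenance on the standby, not only the nightly load.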

We don't use coprocessors, so we didn't have the problem you're describing.
However, in our configuration, what we would do is upgrade the coprocessor
in the "Standby" and then swap the clusters. But since you would have to
stand up a second HBase cluster, this may be a non-starter for you. Just
another option thrown into the mix. :)

On Wed Oct 29 2014 at 12:07:02 PM Michael Segel <mse...@segel.com> wrote:

> Well you could redesign your cp.
>
> There is a way to work around the issue by creating a cp that's really a
> framework and then manage the cps in a different jvm(s) using messaging
> between the two.
> So if you want to reload or restart your cp, you can do it outside of the
> RS.
>
> It's a bit more work...
>
>
> On Oct 29, 2014, at 9:21 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> > Rolling restart of servers may have bigger impact on operations - server
> > hosting hbase:meta would be involved which has more impact compared to
> > disabling / enabling user table.
> >
> > You should give ample timeout to your client. The following is an
> > incomplete list of configs (you can find their explanation on
> > http://hbase.apache.org/book.html):
> >
> > hbase.client.scanner.timeout.period
> > hbase.rpc.timeout
> >
> > Cheers
> >
> > On Tue, Oct 28, 2014 at 11:18 PM, Hayden Marchant <hayd...@amobee.com>
> > wrote:
> >
> >> Thanks all for confirming what I thought was happening.
> >>
> >> I am considering implementing a pattern similar to Iain's in that I
> >> version the path of the cp, and disable/enable the table while
> >> upgrading the cp metadata.
> >>
> >> However, what are the operational considerations of disabling a table
> >> for a number of seconds, versus a rolling restart of region servers?
> >> Assuming that, however hard I try, there still might be a process or
> >> two accessing that table at that time, what sort of error handling will
> >> I need to be more aware of now? (I assume that MapReduce would recover
> >> from either of these two strategies?)
> >>
> >> Thanks,
> >> Hayden
> >>
> >> ________________________________________
> >> From: iain wright <iainw...@gmail.com>
> >> Sent: Wednesday, October 29, 2014 1:51 AM
> >> To: user@hbase.apache.org
> >> Subject: Re: Upgrading a coprocessor
> >>
> >> Hi Hayden,
> >>
> >> We ran into the same thing & ended up going with a rudimentary cp deploy
> >> script for appending epoch to the cp name, placing on hdfs, and
> >> disabling/modifying hbase table/enabling
> >>
> >> Here's the issue for this:
> >> https://issues.apache.org/jira/browse/HBASE-9046
> >>
> >> -
> >>
> >> --
> >> Iain Wright
> >>
> >> On Tue, Oct 28, 2014 at 10:51 AM, Bharath Vissapragada <
> >> bhara...@cloudera.com> wrote:
> >>
> >>> Hi Hayden,
> >>>
> >>> Currently there is no workaround. We can't unload already loaded
> >>> classes unless we make changes to HBase's classloader design, and I
> >>> believe it's not that trivial.
> >>>
> >>> - Bharath
> >>>
> >>> On Tue, Oct 28, 2014 at 2:52 AM, Hayden Marchant <hayd...@amobee.com>
> >>> wrote:
> >>>
> >>>> I have been using a RegionObserver coprocessor on my HBase 0.94.6
> >>>> cluster for quite a while and it works great. I am currently upgrading
> >>>> the functionality. When doing some testing in our integration
> >>>> environment I met with the issue that even when I uploaded a new
> >>>> version of my coprocessor jar to HDFS, HBase did not recognize it, and
> >>>> it kept using the old version.
> >>>>
> >>>> I even disabled/reenabled the table - no help. Even with a new table,
> >>>> it still loads the old class. Only when I changed the location of the
> >>>> jar in HDFS did it load the new version.
> >>>>
> >>>> I looked at the source code of CoprocessorHost and I see that it is
> >>>> forever holding a classloaderCache with no mechanism for clearing it
> >>>> out.
> >>>>
> >>>> I assume that if I restart the region server it will take the new
> >>>> version of my coprocessor.
> >>>>
> >>>> Is there any workaround for upgrading a coprocessor without either
> >>>> changing the path, or restarting the HBase region server?
> >>>>
> >>>> Thanks,
> >>>> Hayden
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> Bharath Vissapragada
> >>> <http://www.cloudera.com>
> >>>
> >>
>
>
