Just an update here. I've got something working locally that can run against either a 0.94.17 hbase or a 1.0 hbase transparently. I implemented it as laid out above, but there were a bunch of gotchas. It helps that we maintain our own fork of each version, since I needed to make some supplemental changes in each to make things easier. I will do a writeup of all the gotchas later in the process.
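To make the shim concrete, here's roughly the shape of it. This is an illustrative sketch only: the class names (ShimTable, Old094Table, New10Table, ShimTableFactory) are hypothetical stand-ins, and in-memory maps take the place of the real CDH4 HTableInterface and the shaded CDH5 client, so it compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the common surface our clients code against
// (in the real thing this is HTableInterface).
interface ShimTable {
    void put(String row, String value);
    String get(String row);
}

// Would delegate to the unshaded 0.94 client classes.
class Old094Table implements ShimTable {
    private final Map<String, String> rows = new HashMap<>();
    public void put(String row, String value) { rows.put(row, value); }
    public String get(String row) { return rows.get(row); }
}

// Would delegate to the shaded CDH5 classes (org.apache.hadoop.cdh5.hbase.*).
class New10Table implements ShimTable {
    private final Map<String, String> rows = new HashMap<>();
    public void put(String row, String value) { rows.put(row, value); }
    public String get(String row) { return rows.get(row); }
}

final class ShimTableFactory {
    enum ClusterVersion { CDH4_094, CDH5_10 }

    // Our client-side per-cluster config decides which delegate backs a table.
    static ShimTable forCluster(ClusterVersion version) {
        return version == ClusterVersion.CDH4_094
                ? new Old094Table()
                : new New10Table();
    }
}
```

Client code only ever sees the common interface, which is what lets us bounce services cluster-by-cluster rather than all at once.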
Next steps:

- Convert server-side coprocessors
- Apply the same or similar shim logic to our TableInputFormat and other mapreduce interfaces

A couple notes for the devs:

- I love that 1.0 has a separate hbase-client artifact. Unfortunately the TableInputFormat and other mapreduce classes live in hbase-server for some reason, so the end result is that I basically need to pull the entire hbase super-artifact into my clients. I may move these to hbase-client in my local fork if that is possible.
- There are a few places where you are statically calling HBaseConfiguration.create(). This makes things hard for people like us who have a lot of libraries built around HBase. In our clients we inject configuration properties from our own configuration servers to supplement hbase-site/hbase-default.xml. When HBaseConfiguration.create() is called, it disregards these changes. In my local fork I hacked in a LazyConfigurationHolder, which just keeps a static reference to a Configuration but has a setter. This allows me to inject my customized Configuration object into the hbase stack.
  - (For reference, the places you do this are, at least, ProtobufUtil and ConnectionManager)
  - Hadoop also does something like this in their UserGroupInformation class, but they do provide a setConfiguration method. Ideally there are no static calls to create a Configuration, but this is an ok compromise where necessary.

I can put JIRAs in for these if it makes sense.

On Tue, May 5, 2015 at 10:48 PM, Bryan Beaudreault <bbeaudrea...@hubspot.com> wrote:

> Thanks for the response guys!
>
>> You've done a review of HTI in 1.0 vs 0.94 to make sure we've not
>> mistakenly dropped anything you need? (I see that stuff has moved around
>> but HTI should have everything still from 0.94)
>
> Yea, so far so good for HTI features.
>
>> Sounds like you have experience copying tables in background in a manner
>> that minimally impinges serving given you have dev'd your own in-house
>> cluster cloning tools?
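For reference, the LazyConfigurationHolder mentioned in my notes above looks roughly like this. This is a sketch, not the exact code from our fork; java.util.Properties stands in for org.apache.hadoop.conf.Configuration so the example is self-contained.

```java
import java.util.Properties;

// Keeps a static reference to the process-wide config, but with a setter so
// our bootstrap can inject a customized config before any hbase code runs.
final class LazyConfigurationHolder {
    private static volatile Properties conf;

    private LazyConfigurationHolder() {}

    // Client bootstrap injects the supplemented configuration here.
    static void setConfiguration(Properties c) {
        conf = c;
    }

    // Call sites that used to invoke HBaseConfiguration.create() statically
    // go through this instead: the injected config wins, and only if nothing
    // was injected do we fall back to creating a fresh one.
    static Properties getConfiguration() {
        Properties c = conf;
        if (c == null) {
            synchronized (LazyConfigurationHolder.class) {
                if (conf == null) {
                    // In the real version: HBaseConfiguration.create()
                    conf = new Properties();
                }
                c = conf;
            }
        }
        return c;
    }
}
```

The lazy fallback keeps existing behavior for anyone who never calls the setter, which is what makes it a low-risk patch inside places like ProtobufUtil and ConnectionManager.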
>> You will use the time while tables are read-only to 'catch-up' the
>> difference between the last table copy and data that has come in since?
>
> Correct, we have some tools left over from our 0.92 to 0.94 upgrade, which
> we've used for cluster copies. It basically does an incremental distcp by
> comparing the file length and md5 of each table in the target and source
> cluster, then only copies the diffs. We can get very close to real time
> with this, then switch to read-only, do some flushes, and do one final copy
> to catch up. We have done this many times for various cluster moves.
>
>> CDH4 has pb2.4.1 in it as opposed to pb2.5.0 in cdh5?
>
> Good to know, will keep this in mind! We already shade some of the
> dependencies of hbase such as guava, apache commons http, and joda. We
> will do the same for protobuf.
>
>> Can you 'talk out loud' as you try stuff Bryan and if we can't
>> help highlevel, perhaps we can help on specifics.
>
> Gladly! I feel like I have a leg up since I've already survived the 0.92
> to 0.94 migration, so I'm glad to share my experiences with this migration as
> well. I'll update this thread as I move along. I also plan to release a
> blog post on the ordeal once it's all said and done.
>
> We just created our initial shade of hbase. I'm leaving tomorrow for
> HBaseCon, but plan on tackling and testing all of this next week once I'm
> back from SF. If anyone is facing similar upgrade challenges I'd be happy
> to compare notes.
>
>> If your clients are interacting with HDFS then you need to go the route of
>> shading around PB and it's hard, but HBase-wise only HBase 0.98 and 1.0 use
>> PBs in the RPC protocol and it shouldn't be any problem as long as you
>> don't need security
>
> Thankfully we don't interact directly with the HDFS of hbase. There is
> some interaction with the HDFS of our CDH4 hadoop clusters though. I'll be
> experimenting with these incompatibilities soon and will post here.
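For anyone curious, the incremental comparison described above boils down to something like this. Purely an illustrative sketch: the real tool walks both clusters' table directories with the Hadoop FileSystem API and real checksums, while here a map of file path to a "length:md5" fingerprint stands in for each cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class IncrementalCopyPlanner {
    // Returns the source files that are missing from the target, or whose
    // length/md5 fingerprint differs -- i.e. the only files worth copying
    // on this pass. Unchanged files are skipped entirely.
    static List<String> filesToCopy(Map<String, String> source,
                                    Map<String, String> target) {
        List<String> toCopy = new ArrayList<>();
        for (Map.Entry<String, String> e : source.entrySet()) {
            String targetFingerprint = target.get(e.getKey());
            if (targetFingerprint == null || !targetFingerprint.equals(e.getValue())) {
                toCopy.add(e.getKey());
            }
        }
        return toCopy;
    }
}
```

Run repeatedly, each pass shrinks the diff, which is why the final read-only catch-up copy is small enough to keep the window to a few minutes.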
> Hopefully I'll be able to separate them enough to not cause an issue.
> Thankfully we have not moved to secure HBase yet. That's actually on the
> to-do list, but hoping to do it *after* the CDH upgrade.
>
> ---
>
> Thanks again guys. I'm expecting this will be a drawn out process
> considering our scope, but will be happy to keep updates here as I proceed.
>
> On Tue, May 5, 2015 at 10:31 PM, Esteban Gutierrez <este...@cloudera.com> wrote:
>
>> Just to add a little bit to what St.Ack said:
>>
>> --
>> Cloudera, Inc.
>>
>> On Tue, May 5, 2015 at 3:53 PM, Stack <st...@duboce.net> wrote:
>>
>> > On Tue, May 5, 2015 at 8:58 AM, Bryan Beaudreault <bbeaudrea...@hubspot.com> wrote:
>> >
>> > > Hello,
>> > >
>> > > I'm about to start tackling our upgrade path for 0.94 to 1.0+. We have 6
>> > > production hbase clusters, 2 hadoop clusters, and hundreds of
>> > > APIs/daemons/crons/etc hitting all of these things. Many of these clients
>> > > hit multiple clusters in the same process. Daunting to say the least.
>> >
>> > Nod.
>> >
>> > > We can't take full downtime on any of these, though we can take read-only.
>> > > And ideally we could take read-only on each cluster in a staggered fashion.
>> > >
>> > > From a client perspective, all of our code currently assumes an
>> > > HTableInterface, which gives me some wiggle room I think. With that in
>> > > mind, here's my current plan:
>> >
>> > You've done a review of HTI in 1.0 vs 0.94 to make sure we've not
>> > mistakenly dropped anything you need? (I see that stuff has moved around
>> > but HTI should have everything still from 0.94)
>> >
>> > > - Shade CDH5 to something like org.apache.hadoop.cdh5.hbase.
>> > > - Create a shim implementation of HTableInterface. This shim would
>> > > delegate to either the old cdh4 APIs or the new shaded CDH5 classes,
>> > > depending on the cluster being talked to.
>> > > - Once the shim is in place across all clients, I will put each cluster
>> > > into read-only (a client side config of ours), migrate data to a new CDH5
>> > > cluster, then bounce affected services so they look there instead. I will
>> > > do this for each cluster in sequence.
>> >
>> > Sounds like you have experience copying tables in background in a manner
>> > that minimally impinges serving given you have dev'd your own in-house
>> > cluster cloning tools?
>> >
>> > You will use the time while tables are read-only to 'catch-up' the
>> > difference between the last table copy and data that has come in since?
>> >
>> > > This provides a great rollback strategy, and with our existing in-house
>> > > cluster cloning tools we can minimize the read-only window to a few
>> > > minutes if all goes well.
>> > >
>> > > There are a couple gotchas I can think of with the shim, which I'm hoping
>> > > some of you might have ideas/opinions on:
>> > >
>> > > 1) Since protobufs are used for communication, we will have to avoid
>> > > shading those particular classes as they need to match the
>> > > package/classnames on the server side. I think this should be fine, as
>> > > these are net-new, not conflicting with CDH4 artifacts. Any
>> > > additions/concerns here?
>> >
>> > CDH4 has pb2.4.1 in it as opposed to pb2.5.0 in cdh5?
>>
>> If your clients are interacting with HDFS then you need to go the route of
>> shading around PB and it's hard, but HBase-wise only HBase 0.98 and 1.0 use
>> PBs in the RPC protocol and it shouldn't be any problem as long as you
>> don't need security (this is mostly because the client does a UGI in the
>> client and it's easy to patch on both 0.94 and 1.0 to avoid the call to UGI).
>> Another option is to move your application to asynchbase and it should be
>> clever enough to handle both HBase versions.
>>
>> > I myself have little experience going a shading route so have little to
>> > contribute. Can you 'talk out loud' as you try stuff Bryan and if we can't
>> > help highlevel, perhaps we can help on specifics.
>> >
>> > St.Ack
>>
>> cheers,
>> esteban.