Just to add a little bit to what St.Ack said:

--
Cloudera, Inc.


On Tue, May 5, 2015 at 3:53 PM, Stack <st...@duboce.net> wrote:

> On Tue, May 5, 2015 at 8:58 AM, Bryan Beaudreault <bbeaudrea...@hubspot.com> wrote:
>
> > Hello,
> >
> > I'm about to start tackling our upgrade path for 0.94 to 1.0+. We have 6
> > production hbase clusters, 2 hadoop clusters, and hundreds of
> > APIs/daemons/crons/etc hitting all of these things.  Many of these clients
> > hit multiple clusters in the same process.  Daunting to say the least.
> >
> >
> Nod.
>
>
>
> > We can't take full downtime on any of these, though we can take read-only.
> > And ideally we could take read-only on each cluster in a staggered fashion.
> >
> > From a client perspective, all of our code currently assumes an
> > HTableInterface, which gives me some wiggle room I think.  With that in
> > mind, here's my current plan:
> >
>
> You've done a review of HTI in 1.0 vs 0.94 to make sure we've not
> mistakenly dropped anything you need? (I see that stuff has moved around
> but HTI should have everything still from 0.94)
>
>
> >
> > - Shade CDH5 to something like org.apache.hadoop.cdh5.hbase.
> > - Create a shim implementation of HTableInterface.  This shim would
> > delegate to either the old cdh4 APIs or the new shaded CDH5 classes,
> > depending on the cluster being talked to.
> > - Once the shim is in place across all clients, I will put each cluster
> > into read-only (a client side config of ours), migrate data to a new CDH5
> > cluster, then bounce affected services so they look there instead. I will
> > do this for each cluster in sequence.
> >
> >
> Sounds like you have experience copying tables in background in a manner
> that minimally impinges serving given you have dev'd your own in-house
> cluster cloning tools?
>
> You will use the time while tables are read-only to 'catch-up' the
> difference between the last table copy and data that has come in since?
>
>
>
> > This provides a great rollback strategy, and with our existing in-house
> > cluster cloning tools we can minimize the read-only window to a few minutes
> > if all goes well.
> >
> > There are a couple gotchas I can think of with the shim, which I'm hoping
> > some of you might have ideas/opinions on:
> >
> > 1) Since protobufs are used for communication, we will have to avoid
> > shading those particular classes as they need to match the
> > package/classnames on the server side.  I think this should be fine, as
> > these are net-new, not conflicting with CDH4 artifacts.  Any
> > additions/concerns here?
> >
> >
> CDH4 has pb2.4.1 in it as opposed to pb2.5.0 in cdh5?
>

If your clients are interacting with HDFS, then you need to go the route of
shading around PB, and it's hard. HBase-wise, though, only HBase 0.98 and 1.0
use PBs in the RPC protocol, so it shouldn't be a problem as long as you
don't need security (this is mostly because the client makes UGI calls, and
it's easy to patch both 0.94 and 1.0 to avoid calling UGI).
Another option is to move your application to asynchbase, which should be
clever enough to handle both HBase versions.
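
For what it's worth, the shim idea from Bryan's plan could be sketched
roughly as below. This is only an illustration under stated assumptions:
TableShim, LegacyTable, ShadedTable, ReadOnlyGuard and ShimFactory are all
hypothetical names, the two backends are in-memory stand-ins rather than
real HBase clients, and the interface is a tiny invented subset, not the
actual HTableInterface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for the part of HTableInterface that clients use.
interface TableShim {
    byte[] get(String row);
    void put(String row, byte[] value);
}

// Stand-in for a wrapper around the unshaded 0.94 (CDH4) client.
class LegacyTable implements TableShim {
    private final Map<String, byte[]> store = new HashMap<>();
    public byte[] get(String row) { return store.get(row); }
    public void put(String row, byte[] value) { store.put(row, value); }
}

// Stand-in for a wrapper around the shaded 1.0 (CDH5) client.
class ShadedTable implements TableShim {
    private final Map<String, byte[]> store = new HashMap<>();
    public byte[] get(String row) { return store.get(row); }
    public void put(String row, byte[] value) { store.put(row, value); }
}

// Passes reads through and rejects writes while a cluster is being migrated.
class ReadOnlyGuard implements TableShim {
    private final TableShim delegate;
    ReadOnlyGuard(TableShim delegate) { this.delegate = delegate; }
    public byte[] get(String row) { return delegate.get(row); }
    public void put(String row, byte[] value) {
        throw new UnsupportedOperationException("cluster is read-only");
    }
}

// Picks a backend per cluster from client-side config (assumed to be two sets of names).
class ShimFactory {
    private final Set<String> migratedClusters;
    private final Set<String> readOnlyClusters;

    ShimFactory(Set<String> migratedClusters, Set<String> readOnlyClusters) {
        this.migratedClusters = migratedClusters;
        this.readOnlyClusters = readOnlyClusters;
    }

    TableShim open(String cluster) {
        TableShim table = migratedClusters.contains(cluster)
                ? new ShadedTable()
                : new LegacyTable();
        return readOnlyClusters.contains(cluster) ? new ReadOnlyGuard(table) : table;
    }
}
```

Callers only ever see TableShim, so moving a cluster over is a config change
(add it to the migrated set) plus a bounce, and rolling back is the reverse.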



> I myself have little experience going a shading route so have little to
> contribute. Can you 'talk out loud' as you try stuff Bryan and if we can't
> help highlevel, perhaps we can help on specifics.
>
> St.Ack
>

cheers,
esteban.
