Just an update here. I've got something working locally that can run against either a 0.94.17 hbase or a 1.0 hbase transparently. I implemented it as laid out above, but there were a bunch of gotchas. It helps that we maintain our own fork of each version, since I needed to make some supplemental changes in each to make things easier. I will do a writeup of all the gotchas later in the process.
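To make the shim concrete, here's roughly the shape of it. This is an illustrative sketch only: the class names (ShimTable, Old094Table, New10Table, ShimTableFactory) are hypothetical stand-ins, and in-memory maps take the place of the real CDH4 HTableInterface and the shaded CDH5 client, so it compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the common surface our clients code against
// (in the real thing this is HTableInterface).
interface ShimTable {
    void put(String row, String value);
    String get(String row);
}

// Would delegate to the unshaded 0.94 client classes.
class Old094Table implements ShimTable {
    private final Map<String, String> rows = new HashMap<>();
    public void put(String row, String value) { rows.put(row, value); }
    public String get(String row) { return rows.get(row); }
}

// Would delegate to the shaded CDH5 classes (org.apache.hadoop.cdh5.hbase.*).
class New10Table implements ShimTable {
    private final Map<String, String> rows = new HashMap<>();
    public void put(String row, String value) { rows.put(row, value); }
    public String get(String row) { return rows.get(row); }
}

final class ShimTableFactory {
    enum ClusterVersion { CDH4_094, CDH5_10 }

    // Our client-side per-cluster config decides which delegate backs a table.
    static ShimTable forCluster(ClusterVersion version) {
        return version == ClusterVersion.CDH4_094
                ? new Old094Table()
                : new New10Table();
    }
}
```

Client code only ever sees the common interface, which is what lets us bounce services cluster-by-cluster rather than all at once.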
Next steps:

- Convert server-side coprocessors
- Apply the same or similar shim logic to our TableInputFormat and other mapreduce interfaces

A couple notes for the devs:

- I love that 1.0 has a separate hbase-client artifact. Unfortunately the TableInputFormat and other mapreduce classes live in hbase-server for some reason, so the end result is that I basically need to pull the entire hbase super-artifact into my clients. I may move these to hbase-client in my local fork if that is possible.
- There are a few places where you are statically calling HBaseConfiguration.create(). This makes things hard for people like us who have a lot of libraries built around HBase. In our clients we inject configuration properties from our own configuration servers to supplement hbase-site/hbase-default.xml. When HBaseConfiguration.create() is called, it disregards these changes. In my local fork I hacked in a LazyConfigurationHolder, which just keeps a static reference to a Configuration but has a setter. This allows me to inject my customized Configuration object into the hbase stack.
  - (For reference, the places you do this are, at least, ProtobufUtil and ConnectionManager)
  - Hadoop also does something like this in their UserGroupInformation class, but they do provide a setConfiguration method. Ideally there are no static calls to create a Configuration, but this is an ok compromise where necessary.

I can put JIRAs in for these if it makes sense.

On Tue, May 5, 2015 at 10:48 PM, Bryan Beaudreault <bbeaudrea...@hubspot.com> wrote:

> Thanks for the response guys!
>
>> You've done a review of HTI in 1.0 vs 0.94 to make sure we've not
>> mistakenly dropped anything you need? (I see that stuff has moved around
>> but HTI should have everything still from 0.94)
>
> Yea, so far so good for HTI features.
>
>> Sounds like you have experience copying tables in background in a manner
>> that minimally impinges serving given you have dev'd your own in-house
>> cluster cloning tools?
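For reference, the LazyConfigurationHolder mentioned in my notes above looks roughly like this. This is a sketch, not the exact code from our fork; java.util.Properties stands in for org.apache.hadoop.conf.Configuration so the example is self-contained.

```java
import java.util.Properties;

// Keeps a static reference to the process-wide config, but with a setter so
// our bootstrap can inject a customized config before any hbase code runs.
final class LazyConfigurationHolder {
    private static volatile Properties conf;

    private LazyConfigurationHolder() {}

    // Client bootstrap injects the supplemented configuration here.
    static void setConfiguration(Properties c) {
        conf = c;
    }

    // Call sites that used to invoke HBaseConfiguration.create() statically
    // go through this instead: the injected config wins, and only if nothing
    // was injected do we fall back to creating a fresh one.
    static Properties getConfiguration() {
        Properties c = conf;
        if (c == null) {
            synchronized (LazyConfigurationHolder.class) {
                if (conf == null) {
                    // In the real version: HBaseConfiguration.create()
                    conf = new Properties();
                }
                c = conf;
            }
        }
        return c;
    }
}
```

The lazy fallback keeps existing behavior for anyone who never calls the setter, which is what makes it a low-risk patch inside places like ProtobufUtil and ConnectionManager.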
>> You will use the time while tables are read-only to 'catch-up' the
>> difference between the last table copy and data that has come in since?
>
> Correct, we have some tools left over from our 0.92 to 0.94 upgrade, which
> we've used for cluster copies. It basically does an incremental distcp by
> comparing the file length and md5 of each table in the target and source
> cluster, then only copies the diffs. We can get very close to real time
> with this, then switch to read-only, do some flushes, and do one final copy
> to catch up. We have done this many times for various cluster moves.
>
>> CDH4 has pb2.4.1 in it as opposed to pb2.5.0 in cdh5?
>
> Good to know, will keep this in mind! We already shade some of the
> dependencies of hbase such as guava, apache commons http, and joda. We
> will do the same for protobuf.
>
>> Can you 'talk out loud' as you try stuff Bryan and if we can't
>> help highlevel, perhaps we can help on specifics.
>
> Gladly! I feel like I have a leg up since I've already survived the 0.92
> to 0.94 migration, so I'm glad to share my experiences with this migration as
> well. I'll update this thread as I move along. I also plan to release a
> blog post on the ordeal once it's all said and done.
>
> We just created our initial shade of hbase. I'm leaving tomorrow for
> HBaseCon, but plan on tackling and testing all of this next week once I'm
> back from SF. If anyone is facing similar upgrade challenges I'd be happy
> to compare notes.
>
>> If your clients are interacting with HDFS then you need to go the route of
>> shading around PB and it's hard, but HBase-wise only HBase 0.98 and 1.0 use
>> PBs in the RPC protocol and it shouldn't be any problem as long as you
>> don't need security
>
> Thankfully we don't interact directly with the HDFS of hbase. There is
> some interaction with the HDFS of our CDH4 hadoop clusters though. I'll be
> experimenting with these incompatibilities soon and will post here.
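For anyone curious, the incremental comparison described above boils down to something like this. Purely an illustrative sketch: the real tool walks both clusters' table directories with the Hadoop FileSystem API and real checksums, while here a map of file path to a "length:md5" fingerprint stands in for each cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class IncrementalCopyPlanner {
    // Returns the source files that are missing from the target, or whose
    // length/md5 fingerprint differs -- i.e. the only files worth copying
    // on this pass. Unchanged files are skipped entirely.
    static List<String> filesToCopy(Map<String, String> source,
                                    Map<String, String> target) {
        List<String> toCopy = new ArrayList<>();
        for (Map.Entry<String, String> e : source.entrySet()) {
            String targetFingerprint = target.get(e.getKey());
            if (targetFingerprint == null || !targetFingerprint.equals(e.getValue())) {
                toCopy.add(e.getKey());
            }
        }
        return toCopy;
    }
}
```

Run repeatedly, each pass shrinks the diff, which is why the final read-only catch-up copy is small enough to keep the window to a few minutes.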
> Hopefully I'll be able to separate them enough to not cause an issue.
> Thankfully we have not moved to secure HBase yet. That's actually on the
> to-do list, but hoping to do it *after* the CDH upgrade.
>
> ---
>
> Thanks again guys. I'm expecting this will be a drawn out process
> considering our scope, but will be happy to keep updates here as I proceed.
>
> On Tue, May 5, 2015 at 10:31 PM, Esteban Gutierrez <este...@cloudera.com> wrote:
>
>> Just to add a little bit to what St.Ack said:
>>
>> --
>> Cloudera, Inc.
>>
>> On Tue, May 5, 2015 at 3:53 PM, Stack <st...@duboce.net> wrote:
>>
>> > On Tue, May 5, 2015 at 8:58 AM, Bryan Beaudreault <bbeaudrea...@hubspot.com> wrote:
>> >
>> > > Hello,
>> > >
>> > > I'm about to start tackling our upgrade path for 0.94 to 1.0+. We have 6
>> > > production hbase clusters, 2 hadoop clusters, and hundreds of
>> > > APIs/daemons/crons/etc hitting all of these things. Many of these clients
>> > > hit multiple clusters in the same process. Daunting to say the least.
>> >
>> > Nod.
>> >
>> > > We can't take full downtime on any of these, though we can take read-only.
>> > > And ideally we could take read-only on each cluster in a staggered fashion.
>> > >
>> > > From a client perspective, all of our code currently assumes an
>> > > HTableInterface, which gives me some wiggle room I think. With that in
>> > > mind, here's my current plan:
>> >
>> > You've done a review of HTI in 1.0 vs 0.94 to make sure we've not
>> > mistakenly dropped anything you need? (I see that stuff has moved around
>> > but HTI should have everything still from 0.94)
>> >
>> > > - Shade CDH5 to something like org.apache.hadoop.cdh5.hbase.
>> > > - Create a shim implementation of HTableInterface. This shim would
>> > > delegate to either the old cdh4 APIs or the new shaded CDH5 classes,
>> > > depending on the cluster being talked to.
>> > > - Once the shim is in place across all clients, I will put each cluster
>> > > into read-only (a client side config of ours), migrate data to a new CDH5
>> > > cluster, then bounce affected services so they look there instead. I will
>> > > do this for each cluster in sequence.
>> >
>> > Sounds like you have experience copying tables in background in a manner
>> > that minimally impinges serving given you have dev'd your own in-house
>> > cluster cloning tools?
>> >
>> > You will use the time while tables are read-only to 'catch-up' the
>> > difference between the last table copy and data that has come in since?
>> >
>> > > This provides a great rollback strategy, and with our existing in-house
>> > > cluster cloning tools we can minimize the read-only window to a few
>> > > minutes if all goes well.
>> > >
>> > > There are a couple gotchas I can think of with the shim, which I'm hoping
>> > > some of you might have ideas/opinions on:
>> > >
>> > > 1) Since protobufs are used for communication, we will have to avoid
>> > > shading those particular classes as they need to match the
>> > > package/classnames on the server side. I think this should be fine, as
>> > > these are net-new, not conflicting with CDH4 artifacts. Any
>> > > additions/concerns here?
>> >
>> > CDH4 has pb2.4.1 in it as opposed to pb2.5.0 in cdh5?
>>
>> If your clients are interacting with HDFS then you need to go the route of
>> shading around PB and it's hard, but HBase-wise only HBase 0.98 and 1.0 use
>> PBs in the RPC protocol and it shouldn't be any problem as long as you
>> don't need security (this is mostly because the client does a UGI in the
>> client and it's easy to patch on both 0.94 and 1.0 to avoid the call to UGI).
>> Another option is to move your application to asynchbase and it should be
>> clever enough to handle both HBase versions.
>>
>> > I myself have little experience going a shading route so have little to
>> > contribute. Can you 'talk out loud' as you try stuff Bryan and if we can't
>> > help highlevel, perhaps we can help on specifics.
>> >
>> > St.Ack
>>
>> cheers,
>> esteban.