+1 on moving the MR-related code to hbase-client, or we could have a separate artifact called hbase-mapreduce. I also have to include hbase-server along with hbase-client in my project for exactly this reason, and once we pull in hbase-server and build an uber jar, hbase-server drags in a lot of unnecessary stuff. Note: my project is not related to the migration from 0.94 to 1.0, but I am supporting the argument for moving the MR code into the client or a separate artifact.
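To make the artifact problem concrete, below is a minimal, hypothetical scan-only job against HBase 1.0. Everything except the two mapreduce classes resolves from hbase-client; TableMapper and TableMapReduceUtil ship in hbase-server, so even this toy job pulls the full server artifact (and its transitive dependencies) into an uber jar. The table name and job details are invented for illustration.

    import java.io.IOException;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;               // hbase-client
    import org.apache.hadoop.hbase.client.Scan;                 // hbase-client
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil; // hbase-server!
    import org.apache.hadoop.hbase.mapreduce.TableMapper;        // hbase-server!
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

    public class RowCount {

      // Counts rows via a counter; writes no output.
      static class CountMapper extends TableMapper<NullWritable, NullWritable> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context ctx)
            throws IOException, InterruptedException {
          ctx.getCounter("rowcount", "rows").increment(1L);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(HBaseConfiguration.create(), "rowcount");
        job.setJarByClass(RowCount.class);
        // This one call forces the hbase-server dependency into an
        // otherwise client-only program.
        TableMapReduceUtil.initTableMapperJob("some_table", new Scan(),
            CountMapper.class, NullWritable.class, NullWritable.class, job);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }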
On Thu, May 14, 2015 at 9:43 AM, Bryan Beaudreault <bbeaudrea...@hubspot.com> wrote:

> Just an update here. I've got something working locally that can run
> against either a 0.94.17 hbase or a 1.0 hbase transparently. I implemented
> it as laid out above, but there were a bunch of gotchas. It helps that we
> maintain our own fork of each version, as I needed to make some
> supplemental changes in each version to make things easier. I will do a
> writeup with all of the gotchas later in the process.
>
> Next steps:
>
> - Convert server-side coprocessors
> - Apply the same or similar shim logic to our TableInputFormat and other
>   mapreduce interfaces
>
> A couple notes for the devs:
>
> - I love that 1.0 has a separate hbase-client artifact. Unfortunately the
>   TableInputFormat and other mapreduce classes live in hbase-server for
>   some reason, so the end result is that I basically need to pull the
>   entire hbase super-artifact into my clients. I may move these to
>   hbase-client in my local fork if that is possible.
>
> - There are a few places where you are statically calling
>   HBaseConfiguration.create(). This makes things hard for people like us
>   who have a lot of libraries built around HBase. In our clients we inject
>   configuration properties from our own configuration servers to
>   supplement hbase-site/hbase-default.xml. When HBaseConfiguration.create()
>   is called, it disregards these changes. In my local fork I hacked in a
>   LazyConfigurationHolder, which just keeps a static reference to a
>   Configuration but has a setter. This allows me to inject my customized
>   Configuration object into the hbase stack.
>
>   -- (For reference, the places you do this are, at least, ProtobufUtil
>   and ConnectionManager.)
>   -- Hadoop also does something like this in their UserGroupInformation
>   class, but they do provide a setConfiguration method. Ideally there are
>   no static calls to create a Configuration, but this is an ok compromise
>   where necessary.
>
> I can put JIRAs in for these if it makes sense.
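Bryan's patch isn't attached, but from his description a LazyConfigurationHolder is presumably something close to the following sketch (the class body is a guess at the shape, not his actual change): internal call sites that currently invoke HBaseConfiguration.create() would ask the holder instead, so an application can inject its supplemented Configuration before the client stack spins up.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public final class LazyConfigurationHolder {

      private static volatile Configuration conf;

      private LazyConfigurationHolder() {}

      /**
       * Inject a customized Configuration (e.g. one supplemented from an
       * external configuration service) before any HBase client code runs.
       */
      public static void setConfiguration(Configuration c) {
        conf = c;
      }

      /**
       * Lazily fall back to the standard hbase-site/hbase-default loading
       * when nothing has been injected.
       */
      public static Configuration getConfiguration() {
        Configuration c = conf;
        if (c == null) {
          synchronized (LazyConfigurationHolder.class) {
            if (conf == null) {
              conf = HBaseConfiguration.create();
            }
            c = conf;
          }
        }
        return c;
      }
    }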
> On Tue, May 5, 2015 at 10:48 PM, Bryan Beaudreault <bbeaudrea...@hubspot.com> wrote:
>
> > Thanks for the response, guys!
> >
> > > You've done a review of HTI in 1.0 vs 0.94 to make sure we've not
> > > mistakenly dropped anything you need? (I see that stuff has moved
> > > around but HTI should have everything still from 0.94)
> >
> > Yea, so far so good for HTI features.
> >
> > > Sounds like you have experience copying tables in background in a
> > > manner that minimally impinges serving given you have dev'd your own
> > > in-house cluster cloning tools? You will use the time while tables are
> > > read-only to 'catch-up' the difference between the last table copy and
> > > data that has come in since?
> >
> > Correct, we have some tools left over from our 0.92 to 0.94 upgrade,
> > which we've used for cluster copies. It basically does an incremental
> > distcp by comparing the file length and md5 of each table in the target
> > and source clusters, then only copies the diffs. We can get very close
> > to real time with this, then switch to read-only, do some flushes, and
> > do one final copy to catch up. We have done this many times for various
> > cluster moves.
> >
> > > CDH4 has pb2.4.1 in it as opposed to pb2.5.0 in cdh5?
> >
> > Good to know, will keep this in mind! We already shade some of the
> > dependencies of hbase such as guava, apache commons http, and joda. We
> > will do the same for protobuf.
> >
> > > Can you 'talk out loud' as you try stuff Bryan and if we can't help
> > > highlevel, perhaps we can help on specifics.
> >
> > Gladly! I feel like I have a leg up since I've already survived the 0.92
> > to 0.94 migration, so I'm glad to share my experiences with this
> > migration as well. I'll update this thread as I move along. I also plan
> > to release a blog post on the ordeal once it's all said and done.
> >
> > We just created our initial shade of hbase. I'm leaving tomorrow for
> > HBaseCon, but plan on tackling and testing all of this next week once
> > I'm back from SF. If anyone is facing similar upgrade challenges I'd be
> > happy to compare notes.
> >
> > > If your clients are interacting with HDFS then you need to go the
> > > route of shading around PB and it's hard, but HBase-wise only HBase
> > > 0.98 and 1.0 use PBs in the RPC protocol and it shouldn't be any
> > > problem as long as you don't need security
> >
> > Thankfully we don't interact directly with the HDFS of hbase. There is
> > some interaction with the HDFS of our CDH4 hadoop clusters though. I'll
> > be experimenting with these incompatibilities soon and will post here.
> > Hopefully I'll be able to separate them enough to not cause an issue.
> > Thankfully we have not moved to secure HBase yet. That's actually on the
> > to-do list, but we're hoping to do it *after* the CDH upgrade.
> >
> > ---
> >
> > Thanks again, guys. I'm expecting this will be a drawn-out process
> > considering our scope, but will be happy to keep posting updates here as
> > I proceed.
> >
> > On Tue, May 5, 2015 at 10:31 PM, Esteban Gutierrez <este...@cloudera.com> wrote:
> >
> > > Just to add a little bit to what StAck said:
> > >
> > > --
> > > Cloudera, Inc.
> > >
> > > On Tue, May 5, 2015 at 3:53 PM, Stack <st...@duboce.net> wrote:
> > >
> > > > On Tue, May 5, 2015 at 8:58 AM, Bryan Beaudreault <bbeaudrea...@hubspot.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I'm about to start tackling our upgrade path for 0.94 to 1.0+. We
> > > > > have 6 production hbase clusters, 2 hadoop clusters, and hundreds
> > > > > of APIs/daemons/crons/etc hitting all of these things. Many of
> > > > > these clients hit multiple clusters in the same process. Daunting
> > > > > to say the least.
> > > >
> > > > Nod.
> > > >
> > > > > We can't take full downtime on any of these, though we can take
> > > > > read-only. And ideally we could take read-only on each cluster in
> > > > > a staggered fashion.
> > > > >
> > > > > From a client perspective, all of our code currently assumes an
> > > > > HTableInterface, which gives me some wiggle room, I think. With
> > > > > that in mind, here's my current plan:
> > > >
> > > > You've done a review of HTI in 1.0 vs 0.94 to make sure we've not
> > > > mistakenly dropped anything you need? (I see that stuff has moved
> > > > around but HTI should have everything still from 0.94)
> > > >
> > > > > - Shade CDH5 to something like org.apache.hadoop.cdh5.hbase.
> > > > > - Create a shim implementation of HTableInterface. This shim
> > > > > would delegate to either the old cdh4 APIs or the new shaded CDH5
> > > > > classes, depending on the cluster being talked to.
> > > > > - Once the shim is in place across all clients, I will put each
> > > > > cluster into read-only (a client-side config of ours), migrate
> > > > > data to a new CDH5 cluster, then bounce affected services so they
> > > > > look there instead. I will do this for each cluster in sequence.
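The shim dispatch described in the bullets above might look roughly like the sketch below. The class name, the per-cluster property, and the Cdh5ShimTable wrapper are all invented for illustration; the wrapper itself, which would implement the CDH4 HTableInterface by delegating every call to the relocated org.apache.hadoop.cdh5.hbase.* classes, is elided.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.client.HTableInterface;

    public final class ShimTableFactory {

      // Hypothetical per-cluster flag flipped as each cluster is migrated;
      // not a real HBase property.
      private static final String ON_CDH5 = "myco.hbase.cluster.on-cdh5";

      public static HTableInterface getTable(Configuration clusterConf,
                                             String tableName) throws IOException {
        if (clusterConf.getBoolean(ON_CDH5, false)) {
          // Migrated cluster: hand back the shim, which implements the CDH4
          // HTableInterface by delegating to the shaded CDH5 client.
          return newCdh5ShimTable(clusterConf, tableName);
        }
        // Not yet migrated: plain 0.94 client under its original coordinates.
        return new org.apache.hadoop.hbase.client.HTable(clusterConf, tableName);
      }

      private static HTableInterface newCdh5ShimTable(Configuration conf,
                                                      String table) {
        // The real wrapper is pages of mechanical delegation, omitted here.
        throw new UnsupportedOperationException("elided in this sketch");
      }
    }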
> > > > Sounds like you have experience copying tables in background in a
> > > > manner that minimally impinges serving given you have dev'd your own
> > > > in-house cluster cloning tools?
> > > >
> > > > You will use the time while tables are read-only to 'catch-up' the
> > > > difference between the last table copy and data that has come in
> > > > since?
> > > >
> > > > > This provides a great rollback strategy, and with our existing
> > > > > in-house cluster cloning tools we can minimize the read-only
> > > > > window to a few minutes if all goes well.
> > > > >
> > > > > There are a couple gotchas I can think of with the shim, which I'm
> > > > > hoping some of you might have ideas/opinions on:
> > > > >
> > > > > 1) Since protobufs are used for communication, we will have to
> > > > > avoid shading those particular classes, as they need to match the
> > > > > package/classnames on the server side. I think this should be
> > > > > fine, as these are net-new, not conflicting with CDH4 artifacts.
> > > > > Any additions/concerns here?
> > > >
> > > > CDH4 has pb2.4.1 in it as opposed to pb2.5.0 in cdh5?
> > >
> > > If your clients are interacting with HDFS then you need to go the
> > > route of shading around PB and it's hard, but HBase-wise only HBase
> > > 0.98 and 1.0 use PBs in the RPC protocol and it shouldn't be any
> > > problem as long as you don't need security (this is mostly because the
> > > client does a UGI call in the client and it's easy to patch both 0.94
> > > and 1.0 to avoid that call). Another option is to move your
> > > application to asynchbase, which should be clever enough to handle
> > > both HBase versions.
> > >
> > > > I myself have little experience going a shading route so have little
> > > > to contribute. Can you 'talk out loud' as you try stuff Bryan and if
> > > > we can't help highlevel, perhaps we can help on specifics.
> > > >
> > > > St.Ack
> > >
> > > cheers,
> > > esteban.

--
Thanks & Regards,
Anil Gupta