Hey Joe,

We have tried a few different things wrt the C++ clients and Thrift. Just putting out some of our thoughts here.

First, we used the existing Thrift proxy as a separate tier (Thrift proxy tier). The issue there was that we just didn't get enough throughput (for various reasons). Independently, adoption of HBase from C++ was increasing - so we thought it made sense to write a native client.

So we wrote the native C++ client and embedded the Thrift proxy into the region server (embedded Thrift proxy). Cutting the redirect from the client was one gain (as the native client is a smart client), but the real advantage came from short-circuiting the flow. In the Thrift proxy tier case, the Thrift client would talk to the proxy using Thrift serialization, the proxy would deserialize the Thrift call and re-serialize it into the Java client format, then send it to the region server, which would deserialize the Java-formatted buffers again. But with the embedded proxy + native client, we can short-circuit at the embedded proxy and make a function call to the region server, which is running in the same JVM (this cuts one round of serialization and deserialization).

The issues, however, with the Thrift-based approach are that the Java objects (HTable, Scan, Get, Put, etc.) are not Thrift definitions, so the Thrift definitions have to be maintained as a separate (and often very different) set of APIs and updated every time there is an enhancement to the Java side of things. The proxy tier has to be separately configured/tuned/bug-fixed from the region server to make sure it is as performant as the region server - as the overall system will perform like the slowest component in the stack.

The ideal solution (IMHO) is to have a C++ client which has a compatible protocol with the Java client, so that there are no significant perf differences between the two approaches, and there is no separate proxy to tune. Just a thought of course, might be hard to achieve. Of course we have just talked about this :) but with the move to protocol buffers in trunk, this should be easier.
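
To make the serialization argument concrete, here is a rough sketch that just counts the encode/decode rounds on the two request paths described above. It is a toy model, not code from either client: `Step`, `kProxyTierPath`, `kEmbeddedPath` and `codecRounds` are names made up for illustration, and each path is reduced to the steps named in the text.

```cpp
#include <array>
#include <cstddef>

// Hypothetical model: every hop on a request path is one of these steps.
enum class Step { Serialize, Deserialize, LocalCall };

// Thrift proxy tier: Thrift client -> separate proxy -> region server.
constexpr std::array<Step, 4> kProxyTierPath = {
    Step::Serialize,    // client encodes the call with Thrift
    Step::Deserialize,  // proxy decodes the Thrift call
    Step::Serialize,    // proxy re-encodes it in the Java client format
    Step::Deserialize,  // region server decodes the Java-format buffers
};

// Embedded proxy: native client -> proxy inside the region server JVM.
constexpr std::array<Step, 3> kEmbeddedPath = {
    Step::Serialize,    // native client encodes the call with Thrift
    Step::Deserialize,  // embedded proxy decodes the Thrift call
    Step::LocalCall,    // plain function call into the region server
};

// Counts serialize/deserialize rounds on a path. Each round ends with
// exactly one Deserialize step, so counting those is enough.
template <std::size_t N>
constexpr std::size_t codecRounds(const std::array<Step, N>& path) {
    std::size_t rounds = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (path[i] == Step::Deserialize) {
            ++rounds;
        }
    }
    return rounds;
}
```

In this model `codecRounds(kProxyTierPath)` comes out one higher than `codecRounds(kEmbeddedPath)`, which is the round the embedded proxy saves. It says nothing about the actual cost of each round.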

Out of curiosity, why thrift2 - do you specifically need Thrift APIs to region servers? Why not an "efficient C/C++ client for HBase"?

Thanks
Karthik

On 8/22/12 4:06 PM, "Joe Pallas" <joseph.pal...@oracle.com> wrote:

>
>On Aug 21, 2012, at 9:29 AM, Stack wrote:
>
>> On Mon, Aug 20, 2012 at 6:18 PM, Joe Pallas <joseph.pal...@oracle.com>
>>wrote:
>>> Anyone out there actively using the thrift2 interface in 0.94? Thrift
>>>bindings for C++ don't seem to handle optional arguments too well (that
>>>is to say, it seems that optional arguments are not optional).
>>>Unfortunately, checkAndPut uses an optional argument for value to
>>>distinguish between the two cases (value must match vs no cell with
>>>that column qualifier). Any clues on how to work around that
>>>difficulty would be welcome.
>>>
>>
>> If you make a patch, we'll commit it Joe.
>
>Well, I think the patch really needs to be in Thrift; the only workaround
>I can see is to restructure the hbase.thrift interface file to avoid
>having routines with optional arguments. It seems a shame to break
>compatibility with existing clients for that, and I am not sure if there
>is a way to do it without breaking compatibility. (On the other hand,
>we're talking about thrift2, so it isn't like there are many existing
>clients.)
>
>The state of Thrift documentation is lamentable. The original white
>paper is the most detailed information I can find about compatibility
>rules. It has enough information to tell me that Thrift doesn't support
>overloading of routine names within a service, because the names are the
>identifiers used to identify the routines. I think that means it isn't
>possible to make a compatible change that would only affect the client
>side.
>
>> Have you seen this?
>> https://github.com/facebook/native-cpp-hbase-client Would it help?
>
>The native client stuff is certainly interesting, but, as near as I can
>tell, it expects the in-region-server Thrift server, which I would like
>to give a chance to mature a bit before playing with. I'm also puzzled
>by the hbase.thrift file in that repository. It seems to be based on the
>older HBase Thrift interface, but it adds some functions. I can't see
>how a client could use them, though, since there are no HBase-side
>patches.
>
>Anyone involved with FB's native client efforts care to enlighten me?
>
>joe
>