Nick,

While I believe having an order-preserving canonical serialization is a good idea, from a read of the mail and a skim of the jira it is not clear to me why this is inside hbase as part of hbase-common.
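
To make "order-preserving" concrete, here is a minimal sketch of the problem for a single type. It is not Orderly's actual format, and the class and method names are hypothetical. It assumes keys are compared as unsigned bytes, lexicographically, which is how HBase orders row keys, and that the plain encoding is 4-byte big-endian two's complement, which is my understanding of what Bytes.toBytes(int) produces.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not part of HBase or Orderly.
class OrderPreservingSketch {

    // Plain big-endian two's-complement encoding of an int.
    static byte[] plain(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Same encoding with the sign bit flipped, so negative values get a
    // leading 0 bit and non-negative values get a leading 1 bit.
    static byte[] ordered(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareUnsigned(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xff) - (y[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        // Plain encoding: -1 is FF FF FF FF, so it sorts after 1.
        System.out.println(compareUnsigned(plain(-1), plain(1)) > 0);     // prints "true"
        // Sign-flipped encoding: byte order now matches numeric order.
        System.out.println(compareUnsigned(ordered(-1), ordered(1)) < 0); // prints "true"
    }
}
```

The two encodings in the sketch are mutually incompatible: bytes written with one cannot be read back correctly with the other.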
Why isn't this part of a library on top of hbase (a dependency for Pig/Hive) instead of "inside" hbase? Can't this functionality be done just from the client level?

What's the end goal here? Is the goal to replace the Bytes.toBytes(*) methods to enforce the ordering? If HBase has two mutually incompatible encodings "built-in", how does a dev know to use one or the other later on?

If this is essentially a mega import of a library (300k.. yikes), why not make it a separate module instead of part of common?

Jon.

On Thu, Feb 21, 2013 at 10:35 AM, Nick Dimiduk <ndimi...@gmail.com> wrote:

> Hi everyone,
>
> I'm of the opinion that HBase should provide a mechanism for serializing
> common java types such that the serialized format sorts according to the
> natural ordering of the type. I think many application efforts end up
> building a custom, partial implementation of this kind of functionality on
> their own. I think HBase should provide a canonical implementation of such
> a serialization format so that third-parties can reliably build on top of
> HBase. Not just user applications, but other tools like Pig and Hive are
> also enabled. Implementations for
> HIVE-3634 <https://issues.apache.org/jira/browse/HIVE-3634>,
> HIVE-2599 <https://issues.apache.org/jira/browse/HIVE-2599>, or
> HIVE-2903 <https://issues.apache.org/jira/browse/HIVE-2903> could be
> compatible with similar features in Pig.
>
> After implementing something similar on multiple occasions, I stumbled
> across the Orderly <https://github.com/ndimiduk/orderly> library. It also
> appears to have been adopted by other large projects, including
> Lily <https://github.com/NGDATA/orderly>.
> I've engaged the library's author for some improvements only to find out
> he's now at Google and will no longer be maintaining it. Thus, I propose we
> take it into HBase.
>
> HBASE-7692 <https://issues.apache.org/jira/browse/HBASE-7692> includes a
> patch that introduces Orderly into hbase-common under the orderly
> namespace. I have an associated branch on
> github <https://github.com/ndimiduk/hbase/commits/7692-ordered-serialization>
> wherein I've broken the patch out into multiple commits to ease review.
> Please take a few minutes to give it a look.
>
> Thanks,
> Nick
>

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// j...@cloudera.com