So I buy the argument about this being included in hbase, but several of the questions still stand --
Why is this part of hbase-common? Shouldn't this be just a dependency of
the hbase-client module? Does the hbase-server side need to depend on this?

Since this is a large import of a currently isolated library, why not make
it a separate module instead of part of hbase-common? This would enforce a
boundary that will prevent pollution from circular dependencies.

Jon.

On Thu, Feb 21, 2013 at 7:23 PM, Enis Söztutar <e...@apache.org> wrote:

> I think this belongs in core HBase, as a replacement to Bytes, which should
> be deprecated eventually. We have a Bytes utility which is supposed to
> convert basic java types to byte[]'s, but it does not work for signed
> numbers.
>
> We already know that all of the clients, Hive, Pig, Phoenix, have to have
> at least java type -> byte[] conversion utilities, and I think it is
> HBase's job to supply one so that different clients can interoperate. Since
> internally we are also relying on serializing java types, we need that
> library in the core.
>
> BTW, I also think that we need to have a SQL-type to java type to byte[]
> layer, but that is another discussion.
>
> Enis
>
>
> On Thu, Feb 21, 2013 at 3:04 PM, Jonathan Hsieh <j...@cloudera.com> wrote:
>
> > Nick,
> >
> > While I believe having an order-preserving canonical serialization is a
> > good idea, from doing a read of the mail and a skim of the jira it is not
> > clear to me why this is inside hbase as part of hbase-common.
> >
> > Why isn't this part of a library on top of hbase (a dependency for
> > Pig/Hive) instead of "inside" hbase?
> > Can't this functionality be done just from the client level?
> > What's the end goal here? Is the goal here to replace the Bytes.toBytes(*)
> > methods to enforce the ordering?
> > If HBase has two mutually incompatible encodings "built-in", how does a
> > dev know to use one or the other later on?
> > If this is essentially a mega import of a library (300k.. yikes), why not
> > make it a separate module instead of part of common?
> >
> > Jon.
> >
> > On Thu, Feb 21, 2013 at 10:35 AM, Nick Dimiduk <ndimi...@gmail.com> wrote:
> >
> > > Hi everyone,
> > >
> > > I'm of the opinion that HBase should provide a mechanism for serializing
> > > common java types such that the serialized format sorts according to
> > > the natural ordering of the type. I think many application efforts end up
> > > building a custom, partial implementation of this kind of functionality
> > > on their own. I think HBase should provide a canonical implementation of
> > > such a serialization format so that third-parties can reliably build on
> > > top of HBase. Not just user applications, but other tools like Pig and
> > > Hive are also enabled. Implementations for
> > > HIVE-3634 <https://issues.apache.org/jira/browse/HIVE-3634>,
> > > HIVE-2599 <https://issues.apache.org/jira/browse/HIVE-2599>, or
> > > HIVE-2903 <https://issues.apache.org/jira/browse/HIVE-2903> could be
> > > compatible with similar features in Pig.
> > >
> > > After implementing something similar on multiple occasions, I stumbled
> > > across the Orderly <https://github.com/ndimiduk/orderly> library. It
> > > also appears to have been adopted by other large projects, including
> > > Lily <https://github.com/NGDATA/orderly>.
> > > I've engaged the library's author for some improvements only to find out
> > > he's now at Google and will no longer be maintaining it. Thus, I propose
> > > we take it into HBase.
> > >
> > > HBASE-7692 <https://issues.apache.org/jira/browse/HBASE-7692> includes a
> > > patch that introduces Orderly into hbase-common under the orderly
> > > namespace. I have an associated branch on GitHub
> > > <https://github.com/ndimiduk/hbase/commits/7692-ordered-serialization>
> > > wherein I've broken the patch out into multiple commits to ease review.
> > > Please take a few minutes to give it a look.
> > >
> > > Thanks,
> > > Nick
> > >
> >
> > --
> > // Jonathan Hsieh (shay)
> > // Software Engineer, Cloudera
> > // j...@cloudera.com
>

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// j...@cloudera.com
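
For anyone following along, Enis's remark that Bytes "does not work for signed numbers" can be illustrated with a minimal sketch. It assumes that Bytes.toBytes(int) writes plain big-endian two's complement (reproduced here with java.nio.ByteBuffer rather than the HBase class) and that keys are compared lexicographically as unsigned bytes. The class SignedOrderSketch and its methods are hypothetical names for illustration. Flipping the sign bit is one common order-preserving technique; it is not claimed to be the encoding Orderly actually uses.

```java
import java.nio.ByteBuffer;

public class SignedOrderSketch {
    // Plain big-endian two's complement, the layout ByteBuffer.putInt produces
    // (assumed here to match what Bytes.toBytes(int) emits).
    static byte[] plain(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Hypothetical order-preserving variant: flip the sign bit before encoding,
    // so negative values get a leading 0 bit and non-negative values a leading 1.
    static byte[] orderPreserving(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Lexicographic comparison that treats every byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Plain encoding: -1 (FF FF FF FF) sorts AFTER 1 (00 00 00 01).
        System.out.println(compareUnsigned(plain(-1), plain(1)) > 0);
        // Sign-flipped encoding: -1 (7F FF FF FF) sorts BEFORE 1 (80 00 00 01).
        System.out.println(compareUnsigned(orderPreserving(-1), orderPreserving(1)) < 0);
    }
}
```

Both lines print true: the plain encoding misorders negative and positive ints under byte-wise comparison, while the sign-flipped encoding sorts them in their natural order.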