On Fri, Feb 22, 2013 at 10:14 AM, Matt Corgan <mcor...@hotpads.com> wrote:
> > Not quite true. It makes use of Bytes and ImmutableBytesWritable from hbase-common.
>
> Oh, interesting. Could we inline the code from Bytes.java and somehow get rid of the ImmutableBytesWritable? Like calling packages can add ImmutableBytesWritable functionality on top if they want to?

I'll need to do a more thorough evaluation, but a cursory glance indicates use of Bytes could be replaced by arraycopy. ImmutableBytesWritable is used mostly as a convenient wrapper over byte[], and may well be replaceable with Hadoop's BytesWritable.

> Seems like something as low level as rearranging bytes should be dependency free.

The implementation makes heavy use of Hadoop Writables, but the dependencies on HBase instances are mostly convenience.

> On Fri, Feb 22, 2013 at 10:04 AM, Nick Dimiduk <ndimi...@gmail.com> wrote:
>
> > Inline.
> >
> > On Fri, Feb 22, 2013 at 10:00 AM, Matt Corgan <mcor...@hotpads.com> wrote:
> >
> > > To nitpick a little it wouldn't quite be a sibling of hbase-client because hbase-client depends on hbase-common and hbase-protocol while this new one will not depend on anything. Would hbase-server be able to see it? Would it basically be a standalone module being maintained by HBase?
> >
> > Not quite true. It makes use of Bytes and ImmutableBytesWritable from hbase-common.
> >
> > > Also, assuming the original Orderly library goes unmaintained and we want people to use it, this will be the primary place to get it. Having no dependencies on other hbase modules is important for people who want to use the Orderly library for something unrelated to hbase. For example, a web application that logs data in this format but not directly to hbase.
> >
> > Orderly has gone unmaintained. The only fork with any activity that I'm aware of is my own. I'd much rather see it gain the publicity, additional scrutiny, and wider adoption than continue as a pet-project.
> >
> > > On Fri, Feb 22, 2013 at 9:32 AM, Elliott Clark <ecl...@apache.org> wrote:
> > >
> > > > Yep the client will be fully separated as soon as rpc changes are stabilized. Until then keeping up the move patch was just too onerous.
> > > >
> > > > On Fri, Feb 22, 2013 at 6:31 AM, Jonathan Hsieh <j...@cloudera.com> wrote:
> > > >
> > > > > Nick,
> > > > >
> > > > > I'm +1 for it having its own module, and being a sibling of hbase-client. I'm assuming the client stuff will happen before we release 0.96 since it has been started.
> > > > >
> > > > > Jon.
> > > > >
> > > > > On Fri, Feb 22, 2013 at 6:13 AM, Nick Dimiduk <ndimi...@gmail.com> wrote:
> > > > >
> > > > > > You're absolutely correct: this library introduces client-side conventions and is not needed from within the HMaster or RegionServer. Is the consensus that it should reside in its own module or be a sibling to the o.a.h.hbase.client source tree? I'm a little confused by the current state of the modules; hbase-client looks empty while o.a.h.hbase.client sits under hbase-server.
> > > > > >
> > > > > > Thanks,
> > > > > > Nick
> > > > > >
> > > > > > On Thu, Feb 21, 2013 at 11:56 PM, Jonathan Hsieh <j...@cloudera.com> wrote:
> > > > > >
> > > > > > > So I buy the argument about this being included in hbase, but several of the questions still stand --
> > > > > > >
> > > > > > > Why is this part of hbase-common? Shouldn't this be just a dependency of the hbase-client module? Does the hbase-server side need to depend on this?
> > > > > > >
> > > > > > > Since this is a large import of a currently isolated library, why not make it a separate module instead of part of hbase-common?
> > > > > > > This would enforce a boundary that will prevent pollution from circular dependencies.
> > > > > > >
> > > > > > > Jon.
> > > > > > >
> > > > > > > On Thu, Feb 21, 2013 at 7:23 PM, Enis Söztutar <e...@apache.org> wrote:
> > > > > > >
> > > > > > > > I think this belongs in core HBase, as a replacement to Bytes, which should be deprecated eventually. We have a Bytes utility which is supposed to convert basic java types to byte[]'s, but it does not work for signed numbers.
> > > > > > > >
> > > > > > > > We already know that all of the clients, Hive, Pig, Phoenix, have to have at least java type -> byte[] conversion utilities, and I think it is HBase's job to supply one so that different clients can interoperate. Since internally we are also relying on serializing java types, we need that library in the core.
> > > > > > > >
> > > > > > > > BTW, I also think that we need to have a SQL-type to java type to byte[] layer, but that is another discussion.
> > > > > > > >
> > > > > > > > Enis
> > > > > > > >
> > > > > > > > On Thu, Feb 21, 2013 at 3:04 PM, Jonathan Hsieh <j...@cloudera.com> wrote:
> > > > > > > >
> > > > > > > > > Nick,
> > > > > > > > >
> > > > > > > > > While I believe having an order-preserving canonical serialization is a good idea, from doing a read of the mail and a skim of the jira it is not clear to me why this is inside hbase as part of hbase-common.
> > > > > > > > >
> > > > > > > > > Why isn't this part of a library on top of hbase (a dependency for Pig/Hive) instead of "inside" hbase?
> > > > > > > > > Can't this functionality be done just from the client level?
> > > > > > > > > What's the end goal here? Is the goal to replace the Bytes.toBytes(*) methods to enforce the ordering?
> > > > > > > > > If HBase has two mutually incompatible encodings "built-in", how does a dev know to use one or the other later on?
> > > > > > > > > If this is essentially a mega import of a library (300k.. yikes), why not make it a separate module instead of part of common?
> > > > > > > > >
> > > > > > > > > Jon.
> > > > > > > > >
> > > > > > > > > On Thu, Feb 21, 2013 at 10:35 AM, Nick Dimiduk <ndimi...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi everyone,
> > > > > > > > > >
> > > > > > > > > > I'm of the opinion that HBase should provide a mechanism for serializing common java types such that the serialized format sorts according to the natural ordering of the type. I think many application efforts end up building a custom, partial implementation of this kind of functionality on their own. I think HBase should provide a canonical implementation of such a serialization format so that third-parties can reliably build on top of HBase. Not just user applications, but other tools like Pig and Hive are also enabled.
> > > > > > > > > > Implementations for HIVE-3634 <https://issues.apache.org/jira/browse/HIVE-3634>, HIVE-2599 <https://issues.apache.org/jira/browse/HIVE-2599>, or HIVE-2903 <https://issues.apache.org/jira/browse/HIVE-2903> could be compatible with similar features in Pig.
> > > > > > > > > >
> > > > > > > > > > After implementing something similar on multiple occasions, I stumbled across the Orderly <https://github.com/ndimiduk/orderly> library. It also appears to have been adopted by other large projects, including Lily <https://github.com/NGDATA/orderly>. I've engaged the library's author for some improvements only to find out he's now at Google and will no longer be maintaining it. Thus, I propose we take it into HBase.
> > > > > > > > > >
> > > > > > > > > > HBASE-7692 <https://issues.apache.org/jira/browse/HBASE-7692> includes a patch that introduces Orderly into hbase-common under the orderly namespace. I have an associated branch on GitHub <https://github.com/ndimiduk/hbase/commits/7692-ordered-serialization> wherein I've broken the patch out into multiple commits to ease review. Please take a few minutes to give it a look.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Nick
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > // Jonathan Hsieh (shay)
> > > > > > > > > // Software Engineer, Cloudera
> > > > > > > > > // j...@cloudera.com
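
For readers following the thread, here is a minimal Java sketch of the property under discussion: a serialized form whose unsigned byte-wise sort order matches the natural order of the Java type. It contrasts a plain big-endian two's complement encoding of an int, where negative values sort after positive ones, with a variant that flips the sign bit before writing. This is only an illustration of the general technique. It is not Orderly's wire format and not the code in HBASE-7692, and the class and method names (OrderedIntSketch, encodePlain, encodeOrdered, decodeOrdered, compareBytes) are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not Orderly's format or the HBASE-7692 patch.
final class OrderedIntSketch {

    /** Plain big-endian two's complement (ByteBuffer's default byte order). */
    static byte[] encodePlain(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    /** Flip the sign bit so unsigned byte order matches signed int order. */
    static byte[] encodeOrdered(int value) {
        return ByteBuffer.allocate(4).putInt(value ^ Integer.MIN_VALUE).array();
    }

    /** Inverse of encodeOrdered: read the int and flip the sign bit back. */
    static int decodeOrdered(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt() ^ Integer.MIN_VALUE;
    }

    /** Unsigned lexicographic comparison, as a byte-sorted store would order keys. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        boolean plainOk = compareBytes(encodePlain(-1), encodePlain(1)) < 0;
        boolean orderedOk = compareBytes(encodeOrdered(-1), encodeOrdered(1)) < 0;
        System.out.println("plain:   -1 sorts before 1? " + plainOk);
        System.out.println("ordered: -1 sorts before 1? " + orderedOk);
        System.out.println("round trip of -42: " + decodeOrdered(encodeOrdered(-42)));
    }
}
```

The sign-bit flip is the simplest case. Variable-length values, floating point, nulls, and descending order need additional conventions, which is part of why a shared canonical implementation is being proposed rather than per-application code.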