Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-06-09 Thread Ryan Blue
On 06/08/2014 05:13 PM, James Taylor wrote:
> Couple items I didn't see mentioned, but I think would be good to get clarity on:
> * variable length DECIMAL (Phoenix relies on this)
Did you send a description of Phoenix's current implementation? I can't find it in my inbox.
> * ARRAY type (Phoenix

Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-06-08 Thread James Taylor
Couple items I didn't see mentioned, but I think would be good to get clarity on:
* variable length DECIMAL (Phoenix relies on this)
* ARRAY type (Phoenix supports this - arrays of fixed width data are just concatenated together, while arrays of variable length data are run-length-encoded with a doub

Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-05-20 Thread Nick Dimiduk
That's correct, Andy. We're locking down the "default" primitive type implementations going forward, while maintaining a flexible API so that we can support existing users who want to migrate to the applicable new features without rewriting existing data. Obviously some of those features will depe

Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-05-19 Thread Jonathan Hsieh
On Mon, May 19, 2014 at 6:31 AM, Andrew Purtell wrote:
> So if I can summarize this thread so far, we are going to try and hammer out a types encoding spec agreeable to HBase, Phoenix, and Kite alike? As opposed to select a particular implementation today as both spec and reference implement

Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-05-19 Thread Andrew Purtell
So if I can summarize this thread so far, we are going to try and hammer out a types encoding spec agreeable to HBase, Phoenix, and Kite alike? As opposed to selecting a particular implementation today as both spec and reference implementation. Is that correct? If so, that sounds like a promising

Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-05-19 Thread Nick Dimiduk
On Thu, May 15, 2014 at 9:32 AM, James Taylor wrote:
> @Nick - I like the abstraction of the DataType, but that doesn't solve the problem for non Java usage.
That's true. It's very much a Java construct. Likewise, Struct only codes for semantics; there's no encoding defined there. For correct

Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-05-16 Thread James Taylor
@Ryan & Jon - thanks again for pursuing this - I think it'll be a big improvement. IMHO, it'd be good to add a Requirements section to the doc. If the current Phoenix type system meets those requirements, then why not just go with that? I think we need a binary serialization spec that includes co

Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-05-16 Thread Ryan Blue
On 05/15/2014 09:32 AM, James Taylor wrote:
> @Ryan & Jon - thanks again for pursuing this - I think it'll be a big improvement. IMHO, it'd be good to add a Requirements section to the doc. If the current Phoenix type system meets those requirements, then why not just go with that?
Good idea. Pa

Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-05-15 Thread Nick Dimiduk
On Tue, May 13, 2014 at 3:35 PM, Ryan Blue wrote:
> I think there's a little confusion in what we are trying to accomplish. What I want to do is to write a minimal specification for how to store a set of types. I'm not trying to leave much flexibility, what I want is clarity and simplicity

Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-05-13 Thread Nick Dimiduk
Breaking off hackathon thread. The conversation around HBASE-8089 concluded with two points:
- HBase should provide support for order-preserving encodings while not dropping support for the existing encoding formats.
- HBase is not in the business of schema management; that is a responsibility l

Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-05-13 Thread Stack
On Tue, May 13, 2014 at 3:58 PM, Ryan Blue wrote:
> Here are a few more specific responses.
>
> ...
>
>> OrderedBytes implements a bit-shifting strategy for this.
>> {FixedLength,Terminated}Wrapper are provided to add flexibility. Ryan has suggested a variation of run-length encoding as anot

Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-05-13 Thread Ryan Blue
Here are a few more specific responses. Hopefully this clears up some remaining points in the context of my last post.
Why not use protobuf directly instead of reimplementing a slight variation of their format?
I intend to use protobuf directly for compound values. It isn't practical right

Re: [common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-05-13 Thread Ryan Blue
Hi Nick, Thanks for taking the time for a close look at this; it's great to see this discussion happening in depth. I think there's a little confusion in what we are trying to accomplish. What I want to do is to write a minimal specification for how to store a set of types. I'm not trying to leave much flexibility; what I want to

[common type encoding breakout] Re: HBase Hackathon @ Salesforce 05/06/2014 notes

2014-05-12 Thread Jonathan Hsieh
Hey all, There was a pow-wow at 5/6/14's HBase Hackathon after HBaseCon about coalescing on a common set of type encodings for simplified multi-system interop. Systems mentioned that could move to it include Apache Phoenix, Hive, Kite, and Spark. Present in the conversation were folks who have written thr