Re: Some questions/proposals for the spec (Layout.md)

2016-04-23 Thread Wes McKinney
> So in summary, it's good to mention this alignment consideration in the >>>> spec, also saying the fixed 64 byte alignment address is used by default; >>>> and hard-code the fixed value in source codes (for example, when >>>> allocating buffers for primitive a

Re: Some questions/proposals for the spec (Layout.md)

2016-04-22 Thread Micah Kornfield
by default; >>> and hard-code the fixed value in source codes (for example, when allocating >>> buffers for primitive arrays, chunk array buffers, and null bitmap buffers). >>> >>> Please help clarify if I'm not getting you right. Thanks. >>> >

Re: Some questions/proposals for the spec (Layout.md)

2016-04-22 Thread Wes McKinney
>> >> Please help clarify if I'm not getting you right. Thanks. >> >> Regards, >> Kai >> >> -Original Message- >> From: Micah Kornfield [mailto:emkornfi...@gmail.com] >> Sent: Saturday, April 09, 2016 12:56 PM >> To: dev@arrow.a

Re: Some questions/proposals for the spec (Layout.md)

2016-04-09 Thread Micah Kornfield
nfield [mailto:emkornfi...@gmail.com] > Sent: Saturday, April 09, 2016 12:56 PM > To: dev@arrow.apache.org > Subject: Re: Some questions/proposals for the spec (Layout.md) > > Hi Kai, > Are you proposing making alignment and width part of the RPC metadata? > I think this is a g

RE: Some questions/proposals for the spec (Layout.md)

2016-04-09 Thread Zheng, Kai
ht. Thanks. Regards, Kai -Original Message- From: Micah Kornfield [mailto:emkornfi...@gmail.com] Sent: Saturday, April 09, 2016 12:56 PM To: dev@arrow.apache.org Subject: Re: Some questions/proposals for the spec (Layout.md) Hi Kai, Are you proposing making alignment and width part of th

Re: Some questions/proposals for the spec (Layout.md)

2016-04-09 Thread Wes McKinney
My two bits: 1) I support making 64-byte alignment the default. We can always retrofit the metadata later with a different alignment type, but in the absence of such metadata, 512 bits can be assumed. I realize this will have bad optics with small arrays (a lot of unused bytes), but that's okay.
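
As a rough illustration of the trade-off described in this message (64-byte, i.e. 512-bit, alignment versus unused bytes in small arrays), the sketch below rounds a buffer size up to the next 64-byte boundary. The names `ALIGNMENT`, `padded_size`, and `wasted_bytes` are hypothetical and not taken from any Arrow implementation; the sketch assumes the alignment is a power of two.

```python
ALIGNMENT = 64  # bytes; 64 * 8 = 512 bits (assumed default from the thread)


def padded_size(nbytes: int, alignment: int = ALIGNMENT) -> int:
    """Round nbytes up to the next multiple of alignment.

    Assumes alignment is a power of two, so a bit mask can be used.
    """
    if nbytes < 0:
        raise ValueError("nbytes must be non-negative")
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (nbytes + alignment - 1) & ~(alignment - 1)


def wasted_bytes(nbytes: int, alignment: int = ALIGNMENT) -> int:
    """Number of padding bytes added to reach the aligned size."""
    return padded_size(nbytes, alignment) - nbytes


if __name__ == "__main__":
    for n in (0, 4, 64, 65):
        print(n, padded_size(n), wasted_bytes(n))
```

For a 4-byte array (a single 32-bit value), the padded buffer is 64 bytes, of which 60 are padding; this is the "bad optics" case for small arrays.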

Re: Some questions/proposals for the spec (Layout.md)

2016-04-09 Thread Micah Kornfield
An additional data point: it looks like Apache Hive also uses one byte for unions: https://github.com/apache/hive/blob/26b5c7b56a4f28ce3eabc0207566cce46b29b558/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryUnion.java On Fri, Apr 8, 2016 at 8:21 PM, Micah Kornfield wrote: > I

Re: Some questions/proposals for the spec (Layout.md)

2016-04-08 Thread Micah Kornfield
om: Wes McKinney [mailto:w...@cloudera.com] > Sent: Friday, April 08, 2016 11:40 PM > To: dev@arrow.apache.org > Subject: Re: Some questions/proposals for the spec (Layout.md) > > On the SIMD question, it seems AVX is going to 512 bits, so one could even > argue for 64-byte alignment a

Re: Some questions/proposals for the spec (Layout.md)

2016-04-08 Thread Micah Kornfield
I think one of Arrow's initial design goals should be simplicity of implementation of the spec. We can always make things more complicated in the future. This leads me to prefer a fixed size. Wes (or anyone else), in practice have you seen a union of structs with more than 127 members? I would

RE: Some questions/proposals for the spec (Layout.md)

2016-04-08 Thread Zheng, Kai
rg Subject: Re: Some questions/proposals for the spec (Layout.md) On the SIMD question, it seems AVX is going to 512 bits, so one could even argue for 64-byte alignment as a matter of future-proofing. AVX2 / 256-bit seems fairly widely available nowadays, but it would be great if Todd or any of the

RE: Some questions/proposals for the spec (Layout.md)

2016-04-08 Thread Wang, Yanping
rom: Wes McKinney [mailto:w...@cloudera.com] Sent: Friday, April 08, 2016 8:40 AM To: dev@arrow.apache.org Subject: Re: Some questions/proposals for the spec (Layout.md) On the SIMD question, it seems AVX is going to 512 bits, so one could even argue for 64-byte alignment as a matter of future-

Re: Some questions/proposals for the spec (Layout.md)

2016-04-08 Thread Wes McKinney
On the SIMD question, it seems AVX is going to 512 bits, so one could even argue for 64-byte alignment as a matter of future-proofing. AVX2 / 256-bit seems fairly widely available nowadays, but it would be great if Todd or any of the hardware folks (e.g. from Intel) on the list could weigh in with

Re: Some questions/proposals for the spec (Layout.md)

2016-04-08 Thread Wes McKinney
On Fri, Apr 8, 2016 at 8:07 AM, Jacques Nadeau wrote: >> >> >> > I believe this choice was primarily about simplifying the code (similar >> to why we have a n+1 >> > offsets instead of just n in the list/varchar representations (even >> though n=0 is always 0)). In both >> > situations, you don't

Re: Some questions/proposals for the spec (Layout.md)

2016-04-08 Thread Jacques Nadeau
> > > > I believe this choice was primarily about simplifying the code (similar > to why we have a n+1 > > offsets instead of just n in the list/varchar representations (even > though n=0 is always 0)). In both > > situations, you don't have to worry about writing special code (and a > condition) f
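
The point about storing n+1 offsets rather than n can be sketched as follows. This is only an illustration of the idea, not Arrow code; `build_offsets` and `slot` are hypothetical helper names. With the extra trailing offset, every slot is read as `data[offsets[i]:offsets[i + 1]]`, with no special case for the first or last element.

```python
def build_offsets(values):
    """Return n+1 offsets for n variable-length values.

    offsets[0] is always 0 and offsets[-1] is the total data length.
    """
    offsets = [0]
    for value in values:
        offsets.append(offsets[-1] + len(value))
    return offsets


def slot(data, offsets, i):
    """Read the i-th value; the same expression works for every slot."""
    return data[offsets[i]:offsets[i + 1]]


if __name__ == "__main__":
    values = [b"ab", b"", b"cde"]
    data = b"".join(values)
    offsets = build_offsets(values)
    print(offsets)
    print([slot(data, offsets, i) for i in range(len(values))])
```

With only n offsets, reading the last slot would need a separate branch to find where the data ends.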

Re: Some questions/proposals for the spec (Layout.md)

2016-04-08 Thread Micah Kornfield
Thanks for the quick and thorough reply. Snipping out some segments for follow-up (everything else sounds reasonable): >> >> 2. The document specifies null bitmaps should have a length padded to >> a multiple of 8 bytes to avoid word alignment concerns. I assume the >> concern is having a subs
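
For reference, the padding rule quoted in item 2 (null bitmap length padded to a multiple of 8 bytes) amounts to the arithmetic below. The function name `bitmap_nbytes` is hypothetical, and the sketch assumes one validity bit per slot.

```python
def bitmap_nbytes(num_slots: int, pad_to: int = 8) -> int:
    """Size in bytes of a null bitmap for num_slots values.

    Assumes one bit per slot, then pads the byte length up to a
    multiple of pad_to (8 bytes, per the rule quoted above).
    """
    if num_slots < 0:
        raise ValueError("num_slots must be non-negative")
    raw_bytes = (num_slots + 7) // 8            # bits -> whole bytes
    return (raw_bytes + pad_to - 1) // pad_to * pad_to


if __name__ == "__main__":
    for n in (0, 1, 64, 65):
        print(n, bitmap_nbytes(n))
```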

Re: Some questions/proposals for the spec (Layout.md)

2016-04-07 Thread Jacques Nadeau
Inline... > 1. For completeness it might be useful to add a statement that the > byte order (endianness) is platform native. > Actually, Arrow is little-endian. It is an oversight that we haven't documented it as such. One of the key capabilities is to push it across the wire between separate s
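
To make the little-endian point concrete, here is a small sketch using Python's standard `struct` module. It only illustrates fixed little-endian byte order for 32-bit integers; the helper names are hypothetical and nothing here reflects Arrow's actual buffer code.

```python
import struct


def encode_int32_le(values):
    """Pack integers as 32-bit little-endian, regardless of host byte order."""
    return struct.pack(f"<{len(values)}i", *values)


def decode_int32_le(buf):
    """Unpack a buffer of 32-bit little-endian integers."""
    count = len(buf) // 4
    return list(struct.unpack(f"<{count}i", buf))


if __name__ == "__main__":
    buf = encode_int32_le([1, 256])
    print(buf.hex())
    print(decode_int32_le(buf))
```

Because the `<` prefix fixes the byte order, the encoded bytes are the same on any host, which is what allows a buffer to be pushed across the wire between systems without conversion.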