Re: [DISCUSS] Proposal to turn ValueVectors into separate reusable library & project

2015-11-09 Thread Steven Phillips
+1 on merging this soon. Going forward, I agree it makes sense to break the RPC module into a stand-alone module that is not specific to Drill. But whether it is better for it to live in the Drill project or in the new Vector project, I am not sure. On Sun, Nov 8, 2015 at 6:42 PM, Jacques Nadeau

Re: [DISCUSS] Proposal to turn ValueVectors into separate reusable library & project

2015-11-08 Thread Jacques Nadeau
FYI, the patch also just successfully completed the extended regression suite. -- Jacques Nadeau CTO and Co-Founder, Dremio On Sun, Nov 8, 2015 at 5:09 PM, Jacques Nadeau wrote: > Ok guys, > > I took the quiet time directly after the release candidate went out to do > the

Re: [DISCUSS] Proposal to turn ValueVectors into separate reusable library & project

2015-11-08 Thread Jacques Nadeau
Ok guys, I took the quiet time directly after the release candidate went out to do the first phase of componentization. You can see my work at [1]. This set of commits has little functional impact. I've also done my best to avoid package or file renaming, rather keeping things in their same

Re: [DISCUSS] Proposal to turn ValueVectors into separate reusable library & project

2015-10-26 Thread Ted Dunning
This sounds like a really good idea to me. On Mon, Oct 26, 2015 at 2:50 PM, Julien Le Dem wrote: > +1, looking forward to vectorized Parquet Readers/Writers in Drill. > Making VV a standalone standard sounds great to me. > > On Mon, Oct 26, 2015 at 2:46 PM, Parth Chandra

Re: [DISCUSS] Proposal to turn ValueVectors into separate reusable library & project

2015-10-26 Thread Hanifi Gunes
I was hoping to see this discussion happening sooner :) VVs have helped Drill represent and move data around so flexibly that it would not be hard to prove their usefulness to the community as a standalone library. I am in support of this proposal. -Hanifi On Mon, Oct 26, 2015 at 2:19 PM,

Re: [DISCUSS] Proposal to turn ValueVectors into separate reusable library & project

2015-10-26 Thread Parth Chandra
+1. Agree with Hanifi that we probably should have done this sooner :). Jason and I faced this need when trying to get a stand alone vectorized parquet reader out of the Drill code last year. On Mon, Oct 26, 2015 at 2:37 PM, Hanifi Gunes wrote: > I was hoping to see this

[DISCUSS] Proposal to turn ValueVectors into separate reusable library & project

2015-10-26 Thread Jacques Nadeau
Drillers, A number of people have approached me recently about the possibility of collaborating on a shared columnar in-memory representation of data. This shared representation of data could be operated on efficiently with modern CPUs as well as shared efficiently via shared memory, IPC and

Re: [DISCUSS] Proposal to turn ValueVectors into separate reusable library & project

2015-10-26 Thread Julien Le Dem
+1, looking forward to vectorized Parquet Readers/Writers in Drill. Making VV a standalone standard sounds great to me. On Mon, Oct 26, 2015 at 2:46 PM, Parth Chandra wrote: > +1. Agree with Hanifi that we probably should have done this sooner :). > Jason and I faced this

Re: [DISCUSS] Proposal to turn ValueVectors into separate reusable library & project

2015-10-26 Thread Julian Hyde
+100 Thanks for spearheading this, Jacques. They say memory is the new disk. So, it’s no longer sufficient to use the same on-disk data format if we want our tools to interoperate. The idea of engines interoperating by reading the same in-memory temporary tables, and passing data from one

Re: [DISCUSS] Proposal to turn ValueVectors into separate reusable library & project

2015-10-26 Thread Wes McKinney
hi all, I am excited about this initiative and I personally am looking forward to seeing a standard in-memory columnar representation made available to data science languages like Python, R, and Julia, and it's also the ideal place to build out a reference vectorized Parquet implementation for