Re: [Rust] [DISCUSS] Donate DataFusion to Arrow project

Wes McKinney Mon, 14 Jan 2019 12:17:25 -0800

Getting the 0.12 release out is my priority right now, but it seems
that there are no major objections to this code donation.


@Andy -- I can kick off the vote to accept the code donation in the
next few days if you'd like to proceed with that. How much time do you
think it would take for you to ready the merge?

Thanks,
Wes

On Wed, Jan 9, 2019 at 8:28 AM Andy Grove <[email protected]> wrote:
>
> Wes,
>
> Thanks. This sounds great.
>
> Andy.
>
> On Tue, Jan 8, 2019 at 8:28 AM Wes McKinney <[email protected]> wrote:
>
> > hi Andy -- I'm supportive of the code donation. I see building
> > in-memory, embeddable analytics and query processing as the natural
> > next stage of this project. As I have described on this mailing list,
> > I intend to work on this with my colleagues in C++ with the goal of
> > making such functionality available at least in C, Python, R, and
> > Ruby. I see no reason why such work should be exclusive to C++.
> >
> > Rust seems like a reasonable implementation language for this, and
> > given growing interest in the language, I think it will help grow the
> > Arrow community.
> >
> > I'd like to wait a few more days to allow others to weigh in, but we
> > could conduct a vote about accepting the code donation as early as
> > next week. We would need to go through the ASF IP Clearance process
> > after that. So the entire procedural process would take about 2 weeks,
> > assuming that there are no licensing issues and the code will be ready
> > to merge into the Arrow codebase.
> >
> > Thanks
> > Wes
> >
> > On Tue, Jan 8, 2019 at 9:07 AM Neville Dipale <[email protected]> wrote:
> > >
> > > Hi Andy,
> > >
> > > I can't comment on the voting process, but regarding the addition of
> > > DataFusion:
> > >
> > > I support the idea to donate the code, mainly as I think that will help us
> > > accelerate some work on Rust. Out of curiosity, I've been prototyping a
> > > 'Rust dataframe' abstraction which (can/will) have various scalar,
> > > aggregation, array and window functions.
> > >
> > > I'm doing this trying to put on the hat of someone wanting to use Rust in
> > > their binary or library. I'm already finding some things that might be
> > > *core* but are still not yet implemented. The presence of array_ops is also
> > > helpful because in addition to an efficient in-memory rep of data, they
> > > enable one to do some basic data manipulation on such data.
> > >
> > > Having DataFusion added to Arrow could help fill some gaps in our codebase;
> > > and I'm willing to work there.
> > >
> > > Regards
> > > Neville
> > >
> > > On Tue, 8 Jan 2019 at 16:14, Andy Grove <[email protected]> wrote:
> > >
> > > > Bumping this thread ... I know everyone is busy with getting the 0.12
> > > > release out, but would be good to know the process for raising this for a
> > > > vote. However, given the lack of comments on this thread I'm starting to
> > > > suspect that maybe there isn't much of an appetite for this, which is fine,
> > > > but would be good to find out for sure.
> > > >
> > > > Thanks,
> > > >
> > > > Andy.
> > > >
> > > > On Mon, Jan 7, 2019 at 1:03 PM Andy Grove <[email protected]> wrote:
> > > >
> > > > > Thanks, Ted!
> > > > >
> > > > > I wish I'd been a bit more specific about my ask in the original email...
> > > > > I guess my question (for Wes?) is what is the process to raise this for a
> > > > > vote?
> > > > >
> > > > > Andy.
> > > > >
> > > > >
> > > > >
> > > > > On Sun, Jan 6, 2019 at 2:59 PM Ted Dunning <[email protected]> wrote:
> > > > >
> > > > >> Cool!
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Sun, Jan 6, 2019 at 1:52 PM Andy Grove <[email protected]> wrote:
> > > > >>
> > > > >> > I'm starting a new thread for this discussion (this was previously
> > > > >> > discussed in the Rust Roadmap thread).
> > > > >> >
> > > > >> > The reason I got involved with Arrow is that I have been working on
> > > > >> > DataFusion[1] which is currently an in-process SQL query engine on top of
> > > > >> > Arrow. It allows queries to be executed against the Arrow CSV reader (and
> > > > >> > will shortly support the Arrow Parquet reader too) and presents results as
> > > > >> > a sequence of RecordBatch instances.
> > > > >> >
> > > > >> > I would like to donate this code to the Arrow project so that Arrow has a
> > > > >> > Rust-native query execution engine built in and to accelerate development
> > > > >> > of this capability.
> > > > >> >
> > > > >> > I have a fairly detailed roadmap[2] in mind for the project and it could
> > > > >> > eventually become a standalone project potentially (under ASF still).
> > > > >> >
> > > > >> > I don't know what the process is to vote on this, so wanted to discuss that
> > > > >> > in this thread first.
> > > > >> >
> > > > >> > References:
> > > > >> >
> > > > >> > [1] DataFusion: https://github.com/andygrove/datafusion
> > > > >> > [2] Roadmap: https://github.com/andygrove/datafusion/blob/master/ROADMAP.md
> > > > >> >
> > > > >> > Thanks,
> > > > >> >
> > > > >> > Andy.
> > > > >> >
> > > > >>
> > > > >
> > > >
> >
