Hi all,

In light of

https://github.com/apache/arrow/pull/9621

and some related issues, we're planning to start building out some
query execution machinery in C++, with the goal of plumbing together
the existing Datasets API with essential relational algorithms
(projection, aggregation, filter, eventually join, sorting, etc.). We
don't have SQL support planned for the time being, so I would say this
work is a good deal narrower in scope than the Rust DataFusion work.
Still, we would like to have some high-quality, reusable algorithms in
the C++ core library and a lightweight multithreaded runtime to run
them.

Here's a document to help organize some of the initial labor and point
out areas where we may need to build some new abstractions or refactor
existing code related to Datasets:

https://docs.google.com/document/d/1AyTdLU-RxA-Gsb9EsYnrQrmqPMOYMfPlWwxRi1Is1tQ/edit#heading=h.t89hffc3t7si

Comments welcome.

Thanks,
Wes
