Hi,

If there is time available, I would like to present the status of the experimental arrow2 <https://github.com/jorgecarleitao/arrow2> repo and gather feedback on the best way to proceed. Would 10-15 minutes work?
Best,
Jorge

On Wed, Mar 10, 2021 at 1:57 PM Andrew Lamb <al...@influxdata.com> wrote:

> Also:
> * semantics for CAST and what to do on failure (return NULL or error) [Mike S]
>
> On Wed, Mar 10, 2021 at 7:38 AM Andrew Lamb <al...@influxdata.com> wrote:
>
> > Reminder that today is the next Rust sync call
> >
> > Potential topics for discussion:
> > * Ballista / DataFusion / etc
> > * I remember that someone else was going to demo the use of Arrow but I can't remember exactly what that was now
> >
> > On Tue, Feb 16, 2021 at 10:59 AM Dominik Moritz <domor...@cmu.edu> wrote:
> >
> >> Somewhat related, I tried to compile DataFusion to WASM and it didn’t work because of some dependencies: https://issues.apache.org/jira/projects/ARROW/issues/ARROW-11615. I wonder whether DataFusion could have a feature flag for only shipping what is WASM compatible?
> >>
> >> On Feb 15, 2021 at 12:13:04, Andrew Lamb <al...@influxdata.com> wrote:
> >>
> >> > Also, unrelated, is there a schedule for the sync calls? Will try and
> >> > carve out some free time for the next one :)
> >> >
> >> > It is every other Wednesday at noon EST. Here is the original announcement with more details:
> >> >
> >> > https://lists.apache.org/thread.html/raa72e1a8a3ad5dbb8366e9609a041eccca87f85545c3bc3d85170cfc%40%3Cdev.arrow.apache.org%3E
> >> >
> >> > On Sun, Feb 14, 2021 at 8:29 AM Ruan Pearce-Authers <r...@reservoirdb.com> wrote:
> >> >
> >> > I'd be interested in helping spec this out, it's especially tricky atm to track down issues when integrating DataFusion into the same binary as other medium/large dependencies.
> >> >
> >> > Recently hit a really specific issue where DataFusion depends on Parquet, which supports various compression algs, including Brotli, and actix-web also depends on a slightly different Rust implementation of Brotli. Both of these Brotli libs package the same underlying C lib separately, resulting in multiply-defined symbols compiling using msvc (and maybe on other platforms? didn't test in CI in the end).
> >> >
> >> > Got a quick interim hack [1] in place for my use case which doesn't really use Parquet, so it's not pressing, but would be awesome to sort this properly upstream.
> >> >
> >> > I guess the only major tradeoff of having a comprehensive feature setup is that it could make testing slightly harder, in terms of making sure no-one breaks the build for specific feature combinations; this can always be mitigated with more CI though (yay, unlimited Actions minutes for public repos).
> >> >
> >> > Also, unrelated, is there a schedule for the sync calls?
> >> > Will try and carve out some free time for the next one :)
> >> >
> >> > [1] https://github.com/reservoirdb/arrow/commit/e63e157927a552ecf1a6f63ec401f0b6157b5468
> >> >
> >> > -----Original Message-----
> >> > From: Andrew Lamb <al...@influxdata.com>
> >> > Sent: 14 February 2021 11:14
> >> > To: dev <dev@arrow.apache.org>
> >> > Subject: [Rust] [DataFusion] Topic for next Rust Sync Call
> >> >
> >> > I would like to add the following item to the agenda call for the next Rust sync call:
> >> >
> >> > Dependencies
> >> >
> >> > Background: As the dependency stack gets larger, it will be harder to use DataFusion as an embedded query engine and the compile / dev times will get higher.
> >> >
> >> > As we expand the supported functions of DataFusion this problem is likely to get worse. For example https://github.com/apache/arrow/pull/9243#discussion_r575716759 and https://github.com/apache/arrow/pull/9139
> >> >
> >> > Proposal: Add Rust "features" to the datafusion crate and make many of the new dependencies optional (so that we had features like regex and unicode and hash which would only pull in the dependencies / have those functions if the features were enabled.) This approach has worked well for Arrow (which has only chrono and num as required dependencies)