RE: [ANNOUNCE] New Arrow committer: Daniël Heres

2021-04-28 Thread Ruan Pearce-Authers
Awesome stuff Daniël, well-deserved :D
> -Original Message-
> From: Daniël Heres
> Sent: 28 April 2021 17:26
> To: dev@arrow.apache.org
> Subject: Re: [ANNOUNCE] New Arrow committer: Daniël Heres
>
> Thank you all!
>
> It has been an amazing experience working with you! Looking forward

RE: [Rust] [DataFusion] Proposal: support catalogs and schemas for table namespacing

2021-03-21 Thread Ruan Pearce-Authers
For anyone interested in this, I've also put together a draft PR with a supporting implementation: https://github.com/apache/arrow/pull/9762
> -Original Message-
> From: Ruan Pearce-Authers
> Sent: 20 March 2021 11:43
> To: dev@arrow.apache.org
> Subject: [Rust] [

[Rust] [DataFusion] Proposal: support catalogs and schemas for table namespacing

2021-03-20 Thread Ruan Pearce-Authers
Hey all, I've put together a short proposal for how we might augment DataFusion with full support for catalog/schema-based namespacing of tables, in line with the SQL standard. The doc lives here, should allow comments from everyone: https://docs.google.com/document/d/1_bCP_tjVRLJyOrMBOezSFNpF

RE: [DataFusion] Promoting GroupByScalar to public API

2021-02-23 Thread Ruan Pearce-Authers
run OOM there > - for ~0.5GB of input, with 32GB of RAM). Much more efficient would be to > store the accumulated data in (typed) arrays, keep offsets to values in those > arrays and get rid of using per-row scalar values in those cases. > > Best, > > Daniël > > Op di 2

[DataFusion] Promoting GroupByScalar to public API

2021-02-23 Thread Ruan Pearce-Authers
Hey all, Whilst working on some UDAFs, I noticed I essentially had to reimplement GroupByScalar to use scalars as HashMap keys inside accumulator struct state, as ScalarValue (correctly!) doesn't implement Eq/Hash. A simple fix to ease this process would be to remove the crate-only access qual
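
To illustrate the problem described in this message, here is a minimal sketch of why a scalar type holding floats cannot derive Eq/Hash, and how a separate key type can stand in for it inside accumulator state. The `Scalar` and `GroupKey` enums and the `count_by_key` function are hypothetical stand-ins written for this example, not DataFusion's actual `ScalarValue` or `GroupByScalar` definitions; storing floats as raw bit patterns is just one possible way to obtain a hashable key.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a dynamically typed scalar value.
/// The f64 variant means Eq and Hash cannot be derived for this type.
#[derive(Debug, Clone, PartialEq)]
pub enum Scalar {
    Int64(i64),
    Float64(f64),
    Utf8(String),
}

/// Hypothetical hashable key type. Floats are stored as their raw bit
/// pattern, so Eq and Hash can be derived. Caveat: 0.0 and -0.0 become
/// distinct keys, as do NaNs with different bit patterns.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum GroupKey {
    Int64(i64),
    Float64Bits(u64),
    Utf8(String),
}

impl From<&Scalar> for GroupKey {
    fn from(scalar: &Scalar) -> Self {
        match scalar {
            Scalar::Int64(v) => GroupKey::Int64(*v),
            Scalar::Float64(v) => GroupKey::Float64Bits(v.to_bits()),
            Scalar::Utf8(v) => GroupKey::Utf8(v.clone()),
        }
    }
}

/// Counts occurrences of each scalar, roughly as an accumulator might
/// when it keeps per-key state in a HashMap.
pub fn count_by_key(values: &[Scalar]) -> HashMap<GroupKey, usize> {
    let mut counts = HashMap::new();
    for value in values {
        *counts.entry(GroupKey::from(value)).or_insert(0) += 1;
    }
    counts
}
```

Every accumulator that wants scalar keys would need a conversion like this, which is the kind of reimplementation the message describes.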

RE: [Rust] [DataFusion] Topic for next Rust Sync Call

2021-02-14 Thread Ruan Pearce-Authers
I'd be interested in helping spec this out, it's especially tricky atm to track down issues when integrating DataFusion into the same binary as other medium/large dependencies. Recently hit a really specific issue where DataFusion depends on Parquet, which supports various compression algs, inc

[Rust] [DataFusion] Target-typing for string literal scalars in queries

2021-02-09 Thread Ruan Pearce-Authers
Hey all, I'm currently running some UX testing for a prototype DB engine integrating DataFusion, and one recurring point that crops up is that specifying literal timestamps, e.g. as gt/lt predicates in a where clause, is a bit awkward right now. Most of the testing is borrowing existing queries