Awesome stuff Daniël, well-deserved :D
> -----Original Message-----
> From: Daniël Heres
> Sent: 28 April 2021 17:26
> To: dev@arrow.apache.org
> Subject: Re: [ANNOUNCE] New Arrow committer: Daniël Heres
>
> Thank you all!
>
> It has been an amazing experience working with you! Looking forward
For anyone interested in this, I've also put together a draft PR with a
supporting implementation: https://github.com/apache/arrow/pull/9762
> -----Original Message-----
> From: Ruan Pearce-Authers
> Sent: 20 March 2021 11:43
> To: dev@arrow.apache.org
> Subject: [Rust] [
Hey all,
I've put together a short proposal for how we might augment DataFusion with
full support for catalog/schema-based namespacing of tables, in line with the
SQL standard.
The doc lives here and should allow comments from everyone:
https://docs.google.com/document/d/1_bCP_tjVRLJyOrMBOezSFNpF
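
To make the namespacing idea concrete, here is a minimal, standalone sketch of how a one-, two- or three-part table name could be resolved against a default catalog and schema. It is not taken from the proposal or the draft PR: `ResolvedTable`, `resolve_table` and the default names used below are hypothetical, and quoted identifiers containing dots are deliberately not handled.

```rust
/// Hypothetical fully qualified table reference (catalog.schema.table).
#[derive(Debug, Clone, PartialEq, Eq)]
struct ResolvedTable {
    catalog: String,
    schema: String,
    table: String,
}

/// Resolve `name` against the given defaults.
/// Accepts "table", "schema.table" or "catalog.schema.table".
/// Returns None for empty parts or more than three parts.
/// Assumption: identifiers are unquoted and contain no dots.
fn resolve_table(
    name: &str,
    default_catalog: &str,
    default_schema: &str,
) -> Option<ResolvedTable> {
    let parts: Vec<&str> = name.split('.').collect();
    if parts.iter().any(|p| p.is_empty()) {
        return None;
    }
    let (catalog, schema, table) = match parts.as_slice() {
        [t] => (default_catalog, default_schema, *t),
        [s, t] => (default_catalog, *s, *t),
        [c, s, t] => (*c, *s, *t),
        _ => return None,
    };
    Some(ResolvedTable {
        catalog: catalog.to_string(),
        schema: schema.to_string(),
        table: table.to_string(),
    })
}
```

With hypothetical defaults `my_catalog` and `my_schema`, the bare name `orders` would resolve to `my_catalog.my_schema.orders`, while `sales.orders` would only fill in the catalog.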
run OOM there
> - for ~0.5GB of input, with 32GB of RAM). Much more efficient would be to
> store the accumulated data in (typed) arrays, keep offsets to values in those
> arrays and get rid of using per-row scalar values in those cases.
>
> Best,
>
> Daniël
>
> Op di 2
Hey all,
Whilst working on some UDAFs, I noticed I essentially had to reimplement
GroupByScalar to use scalars as HashMap keys inside accumulator struct state,
as ScalarValue (correctly!) doesn't implement Eq/Hash.
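
For readers unfamiliar with the problem, the workaround looks roughly like the standalone sketch below. It does not use DataFusion's actual types: `Scalar` is a hypothetical stand-in for a scalar value that contains floats (and so cannot derive Eq/Hash), and `GroupKey` and `count_by_key` are made-up names for the hashable mirror type and an accumulator-style use of it.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a scalar value. Because it holds an f64,
/// it can derive PartialEq but not Eq or Hash.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Int64(Option<i64>),
    Float64(Option<f64>),
    Utf8(Option<String>),
}

/// Hashable mirror of `Scalar`, usable as a HashMap key.
/// Floats are stored as raw bits, so 0.0 and -0.0 are distinct keys
/// and each NaN bit pattern is its own key.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum GroupKey {
    Null,
    Int64(i64),
    Float64Bits(u64),
    Utf8(String),
}

impl From<&Scalar> for GroupKey {
    fn from(value: &Scalar) -> Self {
        match value {
            Scalar::Int64(Some(i)) => GroupKey::Int64(*i),
            Scalar::Float64(Some(f)) => GroupKey::Float64Bits(f.to_bits()),
            Scalar::Utf8(Some(s)) => GroupKey::Utf8(s.clone()),
            // Simplification: nulls of every type collapse into one key.
            _ => GroupKey::Null,
        }
    }
}

/// Example accumulator-style state: count occurrences per distinct value.
fn count_by_key(values: &[Scalar]) -> HashMap<GroupKey, u64> {
    let mut counts = HashMap::new();
    for value in values {
        *counts.entry(GroupKey::from(value)).or_insert(0) += 1;
    }
    counts
}
```

Having to write and maintain a mirror type like this in every UDAF crate is the duplication being described.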
A simple fix to ease this process would be to remove the crate-only access
qual
I'd be interested in helping spec this out; it's especially tricky at the moment to track
down issues when integrating DataFusion into the same binary as other
medium/large dependencies.
Recently hit a really specific issue where DataFusion depends on Parquet, which
supports various compression algs, inc
Hey all,
I'm currently running some UX testing for a prototype DB engine integrating
DataFusion, and one recurring point that crops up is that specifying literal
timestamps, e.g. as gt/lt predicates in a where clause, is a bit awkward right
now. Most of the testing is borrowing existing queries