Attendees:
* Bryan Cutler
* Ben Kietsman
* Praveen Kumar
* Wes McKinney
* John Muehlhausen
* Neal Richardson

Bryan will start on the Map type in Java; he is interested in the
Spark-to-Arrow connection. Ben volunteered to look into the C++
implementation.

Wes raised the issue of the large number of open Java PRs.

Things enabled by Wes's C++ variable dictionary work:
* A parallel CSV reader (because dictionaries no longer need to be the same
  across chunks)
* Schema negotiation in Flight, which led to sending potentially very large
  chunks of (meta)data
* Reading Parquet files a lot faster
* Delta dictionaries, which the binary protocol already supports and which
  are implemented in JavaScript but not in C++

No changes to the format, only to the C++ implementation of it.
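
For readers unfamiliar with delta dictionaries, the idea can be sketched in
plain Python. This is only an illustration, not the Arrow binary protocol or
the C++ implementation: the function names and the (indices, delta) message
shape are hypothetical. Each chunk sends only the dictionary values that
earlier chunks have not already introduced, so chunks do not need to agree
on one dictionary up front.

```python
def encode_chunk(values, dictionary, index_of):
    """Dictionary-encode one chunk against a growing dictionary.

    Returns (indices, delta), where delta holds only the values that
    were not in the dictionary before this chunk. Hypothetical helper,
    not an Arrow API.
    """
    indices, delta = [], []
    for v in values:
        if v not in index_of:
            index_of[v] = len(dictionary)
            dictionary.append(v)
            delta.append(v)
        indices.append(index_of[v])
    return indices, delta


def decode_stream(messages):
    """Rebuild the original values from a sequence of (indices, delta)."""
    dictionary, out = [], []
    for indices, delta in messages:
        dictionary.extend(delta)  # apply the delta before reading indices
        out.extend(dictionary[i] for i in indices)
    return out


chunks = [["a", "b", "a"], ["b", "c"], ["c", "d", "a"]]
dictionary, index_of = [], {}
messages = [encode_chunk(c, dictionary, index_of) for c in chunks]
```

Here the second chunk ships only "c" and the third only "d" as new
dictionary entries; the indices refer to the accumulated dictionary.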

John is interested in streaming for financial data. He wants to preallocate
Arrow memory in order to insert data row-wise into a data structure that
people can analyze in columnar format. Wes recommended checking out the
ArrayBuilder and RecordBatchBuilder classes. We ended up back on the
discussion from the mailing list about how to signal to other readers how
much of the array has been populated. John restated that he'd put together
a proposal and circulate it.
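
To make the use case concrete, here is a minimal plain-Python sketch of
row-wise inserts into preallocated columnar buffers, with a length field
telling readers how much has been populated. It does not use the Arrow
builder classes and is not John's proposal: the class name, the two columns,
and the `length` field are all assumptions made for illustration.

```python
from array import array


class PreallocatedBatch:
    """Fixed-capacity columnar storage filled one row at a time.

    Hypothetical illustration only; not an Arrow class.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        # Rows populated so far; readers must not look past this.
        self.length = 0
        # One preallocated, zero-filled buffer per column.
        self.timestamps = array("q", [0]) * capacity  # signed 64-bit ints
        self.prices = array("d", [0.0]) * capacity    # 64-bit floats

    def append_row(self, timestamp, price):
        if self.length == self.capacity:
            raise OverflowError("batch is full")
        i = self.length
        self.timestamps[i] = timestamp
        self.prices[i] = price
        # Publish the row only after every column has been written.
        self.length = i + 1

    def columns(self):
        """Return views limited to the populated prefix of each column."""
        n = self.length
        return memoryview(self.timestamps)[:n], memoryview(self.prices)[:n]
```

A writer calls `append_row` per incoming tick, while a reader calls
`columns()` and sees only the populated prefix. A real multi-process or
multi-threaded version would also need a safe way to publish `length`, which
is the open question from the mailing list thread.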

On Tue, May 14, 2019 at 11:18 AM Neal Richardson <
neal.p.richard...@gmail.com> wrote:

> Hi all,
> Just a reminder that the biweekly Arrow call is tomorrow at
> https://meet.google.com/vtm-teks-phx. All are welcome to join. Notes will
> be sent out to the mailing list afterwards.
>
> Neal
>
