Attendees

Nic Crane, Micah Kornfield, Eduardo Ponce, Will Jones, Rok Mihevc,
David Li, Niranda Perera, Benson Muite


Agenda

- Discussion about the new columnar memory layout
- Preparing for 7.0.0 release - 2nd or 3rd week of January
- Documentation improvement
- Support for table-like structures (Apache Iceberg, Delta Lake)


Minutes

- Not enough stakeholders were on the call to discuss the new layout
proposal [1]. Micah might chime in on the mailing list.

- The 7.0.0 release is scheduled for the 2nd or 3rd week of January.
Please plan to complete PRs for 7.0.0 in time, or bump the Fix Version
from 7.0.0 to 8.0.0 on any Jira issues you do not expect to be
resolved in time [2]. See [3] to track the progress of the release.

- Eduardo proposed a discussion of documentation improvements. The
main pain point is the sparse documentation of the C++ compute kernels
available to users: the Cookbook is not very extensive yet, and there
are few public examples of usage. Just browsing the public API shows
many undocumented functions. Functions are documented in code with
docstrings, but these are not used for documentation (?). There is a
table of kernels [4], but it could be more detailed. Could we use the
docstrings?
Jon says the R wrapper can pull C++ docstrings for its documentation,
but the mapping of functionality is not always one-to-one.
Eduardo: Another pain point is that internal abstractions are not well
documented, which stalls new committers. Eduardo will open a PR for
this. There are already two PRs in review to improve kernel docs: [5],
[6].

- Support for table-like structures: Micah asked whether there is any
progress in this area. Will looked into this and opened two Jiras, one
for Delta Lake [7] and one for Iceberg [8]. Technically there are no
issues implementing readers for either option, but there are some
worries about governance/maintenance/licensing. We don’t have a reader
for Avro, hence Will first looked into Delta Lake via the Rust reader.


[1] https://lists.apache.org/thread/49qzofswg1r5z7zh39pjvd1m2ggz2kdq
[2] https://lists.apache.org/thread/ng11x17yhvdfo8b3wgmd1qn40hy50g13
[3] https://cwiki.apache.org/confluence/display/ARROW/Arrow+7.0.0+Release
[4] https://arrow.apache.org/docs/cpp/compute.html
[5] https://github.com/apache/arrow/pull/10296 - ARROW-12724: [C++]
Add documentation for authoring compute kernels
[6] https://github.com/apache/arrow/pull/12076 - ARROW-10317: [Python]
Document compute function options
[7] https://issues.apache.org/jira/browse/ARROW-14730
[8] https://issues.apache.org/jira/browse/ARROW-15135
