Below is a summary of the notes from today's meeting:

Attendees:
- Ian Cook
- Raúl Cumplido
- Xuwei Fu
- Will Jones
- Bryce Mecum
- Rok Mihevc
- Sri Nadukudy
- Ashish Paliwal
- Dane Pitkin
- David Dali Susanibar Arce
- Matthew Topol
- Joris Van den Bossche
- Jacob Wujciak

Discussion:

12.0.0 release
- Code freeze is scheduled for later today, April 12
- There are many nightly failures currently on main; Raúl and Jacob have opened several blocker issues and we might need to create more
- Discussion of several current issues that might affect the release
  - C# tests not finding Python
  - PyArrow tests slowness on Windows [1]
  - PyArrow wheels on Windows not uploading to Gemfury
- Important items to mention in release changelog, release blog, etc.
  - Drop support for Ubuntu 18.04 [2]
  - Acero refactor (splitting Acero out from core Arrow library) [3]
  - Fixed shape tensor extension type [4]
  - Run-end encoded layout [5]
  - Plasma removal [6] and suggested alternatives [7]
  - Reminder about Jira to GitHub move (which happened just before the 11.0.0 release)
  - Initial Swift implementation [8]
  - nanoarrow (not technically a part of this release, but worth drawing attention to) [9]
  - Also see ASF board report

Parquet tickets are still tracked in the ASF Jira
- We have to maintain a lot of code in Archery, etc. to automate the tracking of Parquet C++ issues which are still in Jira, even though there are only a few Parquet issues in each release (4 for 12.0.0)
  - PARQUET-2201 Add stress test for RecordReader ReadRecords and SkipRecords. (#14879)
  - PARQUET-2225 Allow reading dense with RecordReader (#17877)
  - PARQUET-2232 Add an api to ColumnChunkMetaData to indicate if the column chunk uses a bloom filter (#33736)
  - PARQUET-2250 Expose column descriptor through RecordReader (#34318)
- Can we move the Parquet C++ issues from the ASF Jira to GitHub?
  - Joris believes we can go ahead and do this; the Parquet Rust implementation did something similar
  - There are already some Parquet issues that were reported and resolved in the Arrow monorepo in this release without ever being opened as Parquet Jira issues [10]
  - Check with Micah Kornfield, Fatemah Panah
  - There was a related Parquet mailing list discussion about this in February [11]

[1] https://github.com/apache/arrow/issues/35078
[2] https://github.com/apache/arrow/issues/33800
[3] https://lists.apache.org/thread/5h5g9k9lvbybzl8fnbg4fppxczm42g6r
[4] https://arrow.apache.org/docs/dev/format/CanonicalExtensions.html#fixed-shape-tensor
[5] https://arrow.apache.org/docs/format/Columnar.html#run-end-encoded-layout
[6] https://github.com/apache/arrow/pull/34718
[7] https://lists.apache.org/thread/lk277x3b9gjol42sjg27bst2ggm5s0j2
[8] https://github.com/apache/arrow/issues/20484
[9] https://arrow.apache.org/blog/2023/03/07/nanoarrow-0.1.0-release/
[10] https://github.com/apache/arrow/issues?q=is%3Aissue+label%3A%22Component%3A+Parquet%22+is%3Aclosed
[11] https://lists.apache.org/thread/jf9wos3t6xxk6xdyx2dof1jlkbpkr56p

On Tue, Apr 11, 2023 at 5:35 PM Ian Cook <i...@ursacomputing.com> wrote:
>
> Hi all,
>
> Our biweekly Arrow community meeting is tomorrow at 16:00 UTC / 12:00 EDT.
>
> Zoom meeting URL:
> https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09
> Meeting ID: 876 4903 3008
> Passcode: 958092
>
> The notes for this and future instances of this meeting will be
> captured in this Google Doc:
> https://docs.google.com/document/d/1xrji8fc6_24TVmKiHJB4ECX1Zy2sy2eRbBjpVJMnPmk/
> If you plan to attend this meeting, you are welcome to edit the
> document to add the topics that you would like to discuss.
>
> Thanks,
> Ian