> Rust Parquet was donated directly to the Arrow project and developed under its auspices after donation.
Yes, this is my recollection as well -- the original implementation I
believe is [1]

Andrew

[1] https://github.com/sunchao/parquet-rs

On Fri, Apr 14, 2023 at 10:59 PM Micah Kornfield <emkornfi...@gmail.com> wrote:

> >
> > - Joris believes we can go ahead and do this; the Parquet Rust
> > implementation did something similar
>
> Small note here, IIRC the origins of the code in Rust and Parquet are
> different. Rust Parquet was donated directly to the Arrow project and
> developed under its auspices after donation. Parquet-cpp integration at
> the time was done with the agreement that it would still live under
> governance of the Parquet PMC (with the hope of it getting split out again
> at some point). I think there has been enough code creep here that without
> a significant amount of work separating out parquet C++ back out of Arrow
> is likely not tenable.
>
> I pinged the thread again to see if we can get the parquet PMC to weigh in
> here.
>
> On Wed, Apr 12, 2023 at 12:39 PM Ian Cook <i...@ursacomputing.com> wrote:
>
> > Below is a summary of the notes from today's meeting:
> >
> > Attendees:
> >
> > - Ian Cook
> > - Raúl Cumplido
> > - Xuwei Fu
> > - Will Jones
> > - Bryce Mecum
> > - Rok Mihevc
> > - Sri Nadukudy
> > - Ashish Paliwal
> > - Dane Pitkin
> > - David Dali Susanibar Arce
> > - Matthew Topol
> > - Joris Van den Bossche
> > - Jacob Wujciak
> >
> >
> > Discussion:
> >
> > 12.0.0 release
> >
> > - Code freeze is scheduled for later today, April 12
> > - There are many nightly failures currently on main; Raúl and Jacob
> >   have opened several blocker issues and we might need to create more
> > - Discussion of several current issues that might affect the release
> >   - C# tests not finding Python
> >   - PyArrow tests slowness on Windows [1]
> >   - PyArrow wheels on Windows not uploading to Gemfury
> > - Important items to mention in release changelog, release blog, etc.
> >   - Drop support for Ubuntu 18.04 [2]
> >   - Acero refactor (splitting Acero out from core Arrow library) [3]
> >   - Fixed shape tensor extension type [4]
> >   - Run-end encoded layout [5]
> >   - Plasma removal [6] and suggested alternatives [7]
> >   - Reminder about Jira to GitHub move (which happened just before the
> >     11.0.0 release)
> >   - Initial Swift implementation [8]
> >   - nanoarrow (not technically a part of this release, but worth
> >     drawing attention to) [9]
> >   - Also see ASF board report
> >
> >
> > Parquet tickets are still tracked in the ASF Jira
> >
> > - We have to maintain a lot of code in Archery, etc. to automate the
> >   tracking of Parquet C++ issues which are still in Jira, even though
> >   there are only a few Parquet issues in each release (4 for 12.0.0)
> >   - PARQUET-2201 Add stress test for RecordReader ReadRecords and
> >     SkipRecords. (#14879)
> >   - PARQUET-2225 Allow reading dense with RecordReader (#17877)
> >   - PARQUET-2232 Add an api to ColumnChunkMetaData to indicate if the
> >     column chunk uses a bloom filter (#33736)
> >   - PARQUET-2250 Expose column descriptor through RecordReader (#34318)
> > - Can we move the Parquet C++ issues from the ASF Jira to GitHub?
> >   - Joris believes we can go ahead and do this; the Parquet Rust
> >     implementation did something similar
> >   - There are already some Parquet issues that were reported and
> >     resolved in the Arrow monorepo in this release without ever being
> >     opened as Parquet Jira issues [10]
> >   - Check with Micah Kornfield, Fatemah Panah
> >   - There was a related Parquet mailing list discussion about this in
> >     February [11]
> >
> >
> > [1] https://github.com/apache/arrow/issues/35078
> > [2] https://github.com/apache/arrow/issues/33800
> > [3] https://lists.apache.org/thread/5h5g9k9lvbybzl8fnbg4fppxczm42g6r
> > [4] https://arrow.apache.org/docs/dev/format/CanonicalExtensions.html#fixed-shape-tensor
> > [5] https://arrow.apache.org/docs/format/Columnar.html#run-end-encoded-layout
> > [6] https://github.com/apache/arrow/pull/34718
> > [7] https://lists.apache.org/thread/lk277x3b9gjol42sjg27bst2ggm5s0j2
> > [8] https://github.com/apache/arrow/issues/20484
> > [9] https://arrow.apache.org/blog/2023/03/07/nanoarrow-0.1.0-release/
> > [10] https://github.com/apache/arrow/issues?q=is%3Aissue+label%3A%22Component%3A+Parquet%22+is%3Aclosed
> > [11] https://lists.apache.org/thread/jf9wos3t6xxk6xdyx2dof1jlkbpkr56p
> >
> >
> > On Tue, Apr 11, 2023 at 5:35 PM Ian Cook <i...@ursacomputing.com> wrote:
> > >
> > > Hi all,
> > >
> > > Our biweekly Arrow community meeting is tomorrow at 16:00 UTC / 12:00 EDT.
> > >
> > > Zoom meeting URL:
> > > https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09
> > > Meeting ID: 876 4903 3008
> > > Passcode: 958092
> > >
> > > The notes for this and future instances of this meeting will be
> > > captured in this Google Doc:
> > >
> > > https://docs.google.com/document/d/1xrji8fc6_24TVmKiHJB4ECX1Zy2sy2eRbBjpVJMnPmk/
> > > If you plan to attend this meeting, you are welcome to edit the
> > > document to add the topics that you would like to discuss.
> > >
> > > Thanks,
> > > Ian
> > >