> Rust Parquet was donated directly to the Arrow project and
> developed under its auspices after donation.

Yes, this is my recollection as well. I believe the original
implementation is [1].

Andrew

[1] https://github.com/sunchao/parquet-rs

On Fri, Apr 14, 2023 at 10:59 PM Micah Kornfield <emkornfi...@gmail.com>
wrote:

> >
> > - Joris believes we can go ahead and do this; the Parquet Rust
> > implementation did something similar
>
> Small note here, IIRC the origins of the code in Rust and Parquet are
> different.  Rust Parquet was donated directly to the Arrow project and
> developed under its auspices after donation.  Parquet-cpp integration at
> the time was done with the agreement that it would still live under
> governance of the Parquet PMC (with the hope of it getting split out again
> at some point).  I think there has been enough code creep here that, without
> a significant amount of work, separating Parquet C++ back out of Arrow
> is likely not tenable.
>
> I pinged the thread again to see if we can get the parquet PMC to weigh in
> here.
>
>
>
> On Wed, Apr 12, 2023 at 12:39 PM Ian Cook <i...@ursacomputing.com> wrote:
>
> > Below is a summary of the notes from today's meeting:
> >
> > Attendees:
> >
> > - Ian Cook
> > - Raúl Cumplido
> > - Xuwei Fu
> > - Will Jones
> > - Bryce Mecum
> > - Rok Mihevc
> > - Sri Nadukudy
> > - Ashish Paliwal
> > - Dane Pitkin
> > - David Dali Susanibar Arce
> > - Matthew Topol
> > - Joris Van den Bossche
> > - Jacob Wujciak
> >
> >
> > Discussion:
> >
> > 12.0.0 release
> >
> > - Code freeze is scheduled for later today, April 12
> > - There are many nightly failures currently on main; Raúl and Jacob
> > have opened several blocker issues and we might need to create more
> > - Discussion of several current issues that might affect the release
> >   - C# tests not finding Python
> >   - PyArrow tests slowness on Windows [1]
> >   - PyArrow wheels on Windows not uploading to Gemfury
> > - Important items to mention in release changelog, release blog, etc.
> >   - Drop support for Ubuntu 18.04 [2]
> >   - Acero refactor (splitting Acero out from core Arrow library) [3]
> >   - Fixed shape tensor extension type [4]
> >   - Run-end encoded layout [5]
> >   - Plasma removal [6] and suggested alternatives [7]
> >   - Reminder about Jira to GitHub move (which happened just before the
> > 11.0.0 release)
> >   - Initial Swift implementation [8]
> >   - nanoarrow (not technically a part of this release, but worth
> > drawing attention to) [9]
> >   - Also see ASF board report
> >
> >
> > Parquet tickets are still tracked in the ASF Jira
> >
> > - We have to maintain a lot of code in Archery, etc. to automate the
> > tracking of Parquet C++ issues which are still in Jira, even though
> > there are only a few Parquet issues in each release (4 for 12.0.0)
> >   - PARQUET-2201 Add stress test for RecordReader ReadRecords and
> > SkipRecords. (#14879)
> >   - PARQUET-2225 Allow reading dense with RecordReader (#17877)
> >   - PARQUET-2232 Add an api to ColumnChunkMetaData to indicate if the
> > column chunk uses a bloom filter (#33736)
> >   - PARQUET-2250 Expose column descriptor through RecordReader (#34318)
> > - Can we move the Parquet C++ issues from the ASF Jira to GitHub?
> > - Joris believes we can go ahead and do this; the Parquet Rust
> > implementation did something similar
> > - There are already some Parquet issues that were reported and
> > resolved in the Arrow monorepo in this release without ever being
> > opened as Parquet Jira issues [10]
> > - Check with Micah Kornfield, Fatemah Panah
> > - There was a related Parquet mailing list discussion about this in
> > February [11]
> >
> >
> > [1] https://github.com/apache/arrow/issues/35078
> > [2] https://github.com/apache/arrow/issues/33800
> > [3] https://lists.apache.org/thread/5h5g9k9lvbybzl8fnbg4fppxczm42g6r
> > [4] https://arrow.apache.org/docs/dev/format/CanonicalExtensions.html#fixed-shape-tensor
> > [5] https://arrow.apache.org/docs/format/Columnar.html#run-end-encoded-layout
> > [6] https://github.com/apache/arrow/pull/34718
> > [7] https://lists.apache.org/thread/lk277x3b9gjol42sjg27bst2ggm5s0j2
> > [8] https://github.com/apache/arrow/issues/20484
> > [9] https://arrow.apache.org/blog/2023/03/07/nanoarrow-0.1.0-release/
> > [10] https://github.com/apache/arrow/issues?q=is%3Aissue+label%3A%22Component%3A+Parquet%22+is%3Aclosed
> > [11] https://lists.apache.org/thread/jf9wos3t6xxk6xdyx2dof1jlkbpkr56p
> >
> >
> > On Tue, Apr 11, 2023 at 5:35 PM Ian Cook <i...@ursacomputing.com> wrote:
> > >
> > > Hi all,
> > >
> > > Our biweekly Arrow community meeting is tomorrow at 16:00 UTC / 12:00 EDT.
> > >
> > > Zoom meeting URL:
> > > https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09
> > > Meeting ID: 876 4903 3008
> > > Passcode: 958092
> > >
> > > The notes for this and future instances of this meeting will be
> > > captured in this Google Doc:
> > > https://docs.google.com/document/d/1xrji8fc6_24TVmKiHJB4ECX1Zy2sy2eRbBjpVJMnPmk/
> > > If you plan to attend this meeting, you are welcome to edit the
> > > document to add the topics that you would like to discuss.
> > >
> > > Thanks,
> > > Ian
> >
>
