Re: [VOTE] Release Apache Arrow ADBC 15 - RC1

2024-11-08 Thread Bryce Mecum
+1 (non-binding)

I successfully verified on macOS 15.1 (aarch64) by running:

DOCKER_DEFAULT_PLATFORM=linux/amd64 \
USE_CONDA=1 \
./dev/release/verify-release-candidate.sh 15 1

PS: I ran into rate limits when downloading artifacts from GitHub.
Dewey pointed me to a recent PR [1] which helps alleviate the issue if
you set GH_TOKEN. With the GitHub CLI [2], you can get a token with
`gh auth token`.

[1] https://github.com/apache/arrow/pull/44666
[2] https://cli.github.com/
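
The GH_TOKEN tip above can be sketched in Python as follows. This is only an illustration: the helper name `env_with_gh_token` is hypothetical, and it assumes the GitHub CLI is installed and already authenticated. If `gh` is not found, the environment is returned unchanged.

```python
import os
import shutil
import subprocess


def env_with_gh_token(base_env=None):
    """Return a copy of the environment with GH_TOKEN set when possible.

    Hypothetical helper: an existing GH_TOKEN is kept as-is; otherwise the
    token is read from `gh auth token` if the GitHub CLI is available.
    """
    env = dict(os.environ if base_env is None else base_env)
    if env.get("GH_TOKEN"):
        return env
    if shutil.which("gh") is None:
        return env  # GitHub CLI not installed; leave the environment alone
    result = subprocess.run(
        ["gh", "auth", "token"], capture_output=True, text=True, check=False
    )
    token = result.stdout.strip()
    if result.returncode == 0 and token:
        env["GH_TOKEN"] = token
    return env


if __name__ == "__main__":
    env = env_with_gh_token()
    # Same settings as the verification command above.
    env["DOCKER_DEFAULT_PLATFORM"] = "linux/amd64"
    env["USE_CONDA"] = "1"
    print("GH_TOKEN set:", bool(env.get("GH_TOKEN")))
    # To run the verification with this environment (from a source checkout):
    # subprocess.run(
    #     ["./dev/release/verify-release-candidate.sh", "15", "1"],
    #     env=env, check=True,
    # )
```

The token is passed only through the child process environment, so it is not written to disk or to the shell history.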


On Thu, Nov 7, 2024 at 6:21 PM David Li  wrote:
>
> Hello,
>
> I would like to propose the following release candidate (RC1) of Apache Arrow 
> ADBC version 15. This is a release consisting of 29 resolved GitHub issues 
> [1].
>
> The subcomponents are versioned independently:
>
> - C/C++/GLib/Go/Python/Ruby: 1.3.0
> - C#: 0.15.0
> - Java: 0.15.0
> - R: 0.15.0
> - Rust: 0.15.0
>
> This release candidate is based on commit: 
> 4bd94e2f9b56ebed3aa2639ff515e758930c8c9b [2]
>
> The source release rc1 is hosted at [3].
> The binary artifacts are hosted at [4][5][6][7][8].
> The changelog is located at [9].
>
> Please download, verify checksums and signatures, run the unit tests, and 
> vote on the release. See [10] for how to validate a release candidate.
>
> See also a verification result on GitHub Actions [11].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow ADBC 15
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow ADBC 15 because...
>
> Note: to verify APT/YUM packages on macOS/AArch64, you must `export 
> DOCKER_DEFAULT_PLATFORM=linux/amd64`. (Or skip this step by `export 
> TEST_APT=0 TEST_YUM=0`.)
>
> [1]: 
> https://github.com/apache/arrow-adbc/issues?q=is%3Aissue+milestone%3A%22ADBC+Libraries+15%22+is%3Aclosed
> [2]: 
> https://github.com/apache/arrow-adbc/commit/4bd94e2f9b56ebed3aa2639ff515e758930c8c9b
> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-adbc-15-rc1/
> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> [5]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> [6]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> [7]: 
> https://repository.apache.org/content/repositories/staging/org/apache/arrow/adbc/
> [8]: 
> https://github.com/apache/arrow-adbc/releases/tag/apache-arrow-adbc-15-rc1
> [9]: 
> https://github.com/apache/arrow-adbc/blob/apache-arrow-adbc-15-rc1/CHANGELOG.md
> [10]: 
> https://arrow.apache.org/adbc/main/development/releasing.html#how-to-verify-release-candidates
> [11]: https://github.com/apache/arrow-adbc/actions/runs/11734486506
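
The "verify checksums" step in the quoted message can be illustrated with a short Python sketch. It assumes a checksum file whose first whitespace-separated field is the SHA-512 hex digest (the layout written by `shasum -a 512`); the function names are hypothetical, and signature verification with GPG is not covered. The verification guide linked as [10] remains the authoritative procedure.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file without loading it whole."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact_path, checksum_path):
    """Compare an artifact against the digest stored in a checksum file."""
    with open(checksum_path, "r", encoding="utf-8") as f:
        # Assumed format: "<hex digest>  <file name>"; only the digest is used.
        expected = f.read().split()[0].lower()
    return sha512_of_file(artifact_path) == expected
```

Reading in chunks keeps memory use flat even for large source tarballs.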


Re: 18.0.1 release manager

2024-11-07 Thread Bryce Mecum
That plan sounds good to me, Kou. I'll let Jacob reply in the
affirmative and then we'll get started ASAP.

On Wed, Nov 6, 2024 at 11:27 PM Sutou Kouhei  wrote:
>
> Hi,
>
> Thank you Jacob and Bryce for volunteering.
>
> How about the following plan?
>
>
> Jacob, could you become a release manager for the next
> release? I think that you have never been a release
> manager. Would you like to gain experience as a release manager?
>
> Bryce, could you work on release tasks as much as possible
> with Jacob and my help? Release tasks are listed at
> https://arrow.apache.org/docs/developers/release.html .
> There are some tasks you can't do such as signing. I'll do
> these tasks.
>
> Jacob, could you focus on managing release process instead
> of working on release tasks as a release manager? For
> example, deciding which PRs should be merged or not with
> discussion, scheduling the next release, starting a vote and
> so on.
>
>
> Thanks,
> --
> kou
>
> In 
>   "Re: 18.0.1 release manager" on Tue, 5 Nov 2024 10:09:24 -0800,
>   Bryce Mecum  wrote:
>
> >> > We're still waiting for one or more volunteers for release manager.
> >> Yes makes sense to spread the knowledge!
> >
> > I'm happy to help even more with the follow-up release but I think I'd
> > feel most comfortable with someone more familiar with the release
> > management process to lead this one (Jacob? Kou?).
> >
> > On Sun, Nov 3, 2024 at 6:57 AM Jacob Wujciak  wrote:
> >>
> >> Hi,
> >>
> >> Kou, I am not sure what you mean by changing the location of the binaries;
> >> in the links you just replaced the version number. Do you want to publish
> >> them as GitHub release artifacts, similar to MATLAB?
> >> CRAN does not like github as a download source (because of purported
> >> 'flakiness'...) so it would be better to keep them in the artifactory.
> >>
> >> Though we do also have the yearly jfrog outage coming up and we did
> >> discuss finding other ways to host our artifacts, maybe we should resume
> >> exploring our options.
> >>
> >> > We're still waiting for one or more volunteers for release manager.
> >> Yes makes sense to spread the knowledge!
> >>
> >> Best
> >> Jacob
> >>
> >>
> >> Am So., 3. Nov. 2024 um 10:21 Uhr schrieb Nic Crane :
> >>
> >> > Good to know, thanks for clarifying the reasoning Kou! :)
> >> >
> >> > On Sun, 3 Nov 2024, 02:25 Sutou Kouhei,  wrote:
> >> >
> >> > > Hi Nic,
> >> > >
> >> > > I think that replacing the C++ binaries for 18.0.0 doesn't
> >> > > affect 18.0.1 release.
> >> > >
> >> > > The current wrong 18.0.0 C++ binaries are "official"
> >> > > release. (We voted them.) The correct 18.0.0 C++ binaries
> >> > > are "unofficial" because we don't vote them. So we can't upload the
> >> > > correct ones to
> >> > > https://apache.jfrog.io/ui/native/arrow/r/X.Y.Z/ . We need
> >> > > to vote the correct 18.0.0 C++ binaries or 18.0.1 release.
> >> > > The former is an irregular process. So the latter may be
> >> > > easier.
> >> > >
> >> > >
> >> > > It may be better that the Arrow R package may wait for
> >> > > 18.0.1 (or 18.1.0) if we haven't submitted it to CRAN yet.
> >> > >
> >> > >
> >> > > BTW, can we change the C++ binaries location to
> >> > > https://github.com/apache/arrow/releases/tag/apache-arrow-X.Y.Z
> >> > > something like
> >> > > https://github.com/apache/arrow/releases/tag/apache-arrow-18.0.0
> >> > > from
> >> > > https://apache.jfrog.io/ui/native/arrow/r/X.Y.Z/
> >> > > something like
> >> > > https://apache.jfrog.io/ui/native/arrow/r/18.0.0/ ?
> >> > >
> >> > > Our Artifactory space uses 75% of quota. So it sends
> >> > > notification e-mails periodically...
> >> > >
> >> > >
> >> > > Thanks,
> >> > > --
> >> > > kou
> >> > >
> >> > > In 
> >> > >   "Re: 18.0.1 release manager" on Fri, 1 Nov 2024 06:11:26 +,
> >> > >   Nic Crane  wrote:
> >> > >
> >> > > > Hi Kou,

Re: 18.0.1 release manager

2024-11-05 Thread Bryce Mecum
> > We're still waiting for one or more volunteers for release manager.
> Yes makes sense to spread the knowledge!

I'm happy to help even more with the follow-up release but I think I'd
feel most comfortable with someone more familiar with the release
management process to lead this one (Jacob? Kou?).

On Sun, Nov 3, 2024 at 6:57 AM Jacob Wujciak  wrote:
>
> Hi,
>
> Kou, I am not sure what you mean by changing the location of the binaries;
> in the links you just replaced the version number. Do you want to publish
> them as GitHub release artifacts, similar to MATLAB?
> CRAN does not like github as a download source (because of purported
> 'flakiness'...) so it would be better to keep them in the artifactory.
>
> Though we do also have the yearly jfrog outage coming up and we did
> discuss finding other ways to host our artifacts, maybe we should resume
> exploring our options.
>
> > We're still waiting for one or more volunteers for release manager.
> Yes makes sense to spread the knowledge!
>
> Best
> Jacob
>
>
> Am So., 3. Nov. 2024 um 10:21 Uhr schrieb Nic Crane :
>
> > Good to know, thanks for clarifying the reasoning Kou! :)
> >
> > On Sun, 3 Nov 2024, 02:25 Sutou Kouhei,  wrote:
> >
> > > Hi Nic,
> > >
> > > I think that replacing the C++ binaries for 18.0.0 doesn't
> > > affect 18.0.1 release.
> > >
> > > The current wrong 18.0.0 C++ binaries are "official"
> > > release. (We voted them.) The correct 18.0.0 C++ binaries
> > > > are "unofficial" because we don't vote them. So we can't upload the
> > > > correct ones to
> > > https://apache.jfrog.io/ui/native/arrow/r/X.Y.Z/ . We need
> > > to vote the correct 18.0.0 C++ binaries or 18.0.1 release.
> > > The former is an irregular process. So the latter may be
> > > easier.
> > >
> > >
> > > It may be better that the Arrow R package may wait for
> > > 18.0.1 (or 18.1.0) if we haven't submitted it to CRAN yet.
> > >
> > >
> > > BTW, can we change the C++ binaries location to
> > > https://github.com/apache/arrow/releases/tag/apache-arrow-X.Y.Z
> > > something like
> > > https://github.com/apache/arrow/releases/tag/apache-arrow-18.0.0
> > > from
> > > https://apache.jfrog.io/ui/native/arrow/r/X.Y.Z/
> > > something like
> > > https://apache.jfrog.io/ui/native/arrow/r/18.0.0/ ?
> > >
> > > Our Artifactory space uses 75% of quota. So it sends
> > > notification e-mails periodically...
> > >
> > >
> > > Thanks,
> > > --
> > > kou
> > >
> > > In 
> > >   "Re: 18.0.1 release manager" on Fri, 1 Nov 2024 06:11:26 +,
> > >   Nic Crane  wrote:
> > >
> > > > Hi Kou,
> > > >
> > > > > The Arrow R package is still in the process of being submitted to CRAN, so
> > > > > if it helps and could reduce the work needed for 18.0.1 could we replace
> > > > > the C++ binaries with the correct ones?
> > > >
> > > > Thanks,
> > > >
> > > > Nic
> > > >
> > > > On Fri, 1 Nov 2024, 00:26 Sutou Kouhei,  wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> We built 18.0.0 RC0 binaries with wrong source. The 18.0.0
> > > >> RC0 source archive was generated from the
> > > >> apache-arrow-18.0.0-rc0 tag[0] but the binaries were built
> > > >> from d0e7d07[1] not the apache-arrow-18.0.0-rc0 tag. It's
> > > >> my fault. Sorry.
> > > >>
> > > >> [0]
> > > https://github.com/apache/arrow/releases/tag/apache-arrow-18.0.0-rc0
> > > >> [1]
> > https://github.com/apache/arrow/pull/0#issuecomment-2416557688
> > > >>
> > > >> I think that we should release 18.0.1. Is there any
> > > > >> committer or PMC member who wants to volunteer as 18.0.1 release
> > > > >> manager? I'll help you. (I think we can have multiple
> > > >> release managers for one release to reduce each release
> > > >> manager's cost.)
> > > >>
> > > >> See also
> > > >> https://arrow.apache.org/docs/developers/release.html for
> > > >> release related tasks.
> > > >>
> > > >> In recent years, only some limited contributors worked on
> > > >> our releases. It may be better that we have more
> > > >> contributors working on our releases to keep releasing
> > > >> continuously.
> > > >>
> > > >>
> > > >> FYI: Here are problems of 18.0.0 RC0 binaries:
> > > >>
> > > >> Affected binaries:
> > > >>
> > > >> * Python wheels
> > > >> * Apache Arrow C++ binaries used by the R packages
> > > >> * Java jars
> > > >> * C# NuGets
> > > >> * MATLAB MLTBX
> > > >>
> > > >> Not affected binaries:
> > > >>
> > > >> * deb
> > > >> * RPM
> > > >>
> > > >> Affected details:
> > > >>
> > > >> Python wheels:
> > > >>
> > > >> * pyarrow.cpp_build_info.version is "18.0.0-SNAPSHOT" not "18.0.0"
> > > >>
> > > >> Apache Arrow C++ binaries used by the R packages:
> > > >>
> > > >> * ARROW_VERSION_STRING is "18.0.0-SNAPSHOT" not "18.0.0"
> > > >>   (I'm not sure the R bindings have API something like
> > > >>   pyarrow.cpp_build_info.version.)
> > > >>
> > > >> Java jars:
> > > >>
> > > >> * Version is "18.0.0-SNAPSHOT" not "18.0.0"
> > > >> * Bump logback.version from 1.5.8 to 1.5.10
> > > >> * Bump checker.framework.version from 3.48.0 to 3.48.1
> > > >>   (This may n

Re: 18.0.1 release manager

2024-11-05 Thread Bryce Mecum
> It may be better that the Arrow R package may wait for 18.0.1 (or 18.1.0) if 
> we haven't submitted it to CRAN yet.

I think so too. For the R package, we'll wait for 18.0.1/18.1.0 to
submit to CRAN [1].

[1] https://github.com/apache/arrow/issues/44496#issuecomment-2451831654


On Sat, Nov 2, 2024 at 7:25 PM Sutou Kouhei  wrote:
>
> Hi Nic,
>
> I think that replacing the C++ binaries for 18.0.0 doesn't
> affect 18.0.1 release.
>
> The current wrong 18.0.0 C++ binaries are "official"
> release. (We voted them.) The correct 18.0.0 C++ binaries
> are "unofficial" because we don't vote them. So we can't upload the
> correct ones to
> https://apache.jfrog.io/ui/native/arrow/r/X.Y.Z/ . We need
> to vote the correct 18.0.0 C++ binaries or 18.0.1 release.
> The former is an irregular process. So the latter may be
> easier.
>
>
> It may be better that the Arrow R package may wait for
> 18.0.1 (or 18.1.0) if we haven't submitted it to CRAN yet.
>
>
> BTW, can we change the C++ binaries location to
> https://github.com/apache/arrow/releases/tag/apache-arrow-X.Y.Z
> something like
> https://github.com/apache/arrow/releases/tag/apache-arrow-18.0.0
> from
> https://apache.jfrog.io/ui/native/arrow/r/X.Y.Z/
> something like
> https://apache.jfrog.io/ui/native/arrow/r/18.0.0/ ?
>
> Our Artifactory space uses 75% of quota. So it sends
> notification e-mails periodically...
>
>
> Thanks,
> --
> kou
>
> In 
>   "Re: 18.0.1 release manager" on Fri, 1 Nov 2024 06:11:26 +,
>   Nic Crane  wrote:
>
> > Hi Kou,
> >
> > The Arrow R package is still in the process of being submitted to CRAN, so
> > if it helps and could reduce the work needed for 18.0.1 could we replace
> > the C++ binaries with the correct ones?
> >
> > Thanks,
> >
> > Nic
> >
> > On Fri, 1 Nov 2024, 00:26 Sutou Kouhei,  wrote:
> >
> >> Hi,
> >>
> >> We built 18.0.0 RC0 binaries with wrong source. The 18.0.0
> >> RC0 source archive was generated from the
> >> apache-arrow-18.0.0-rc0 tag[0] but the binaries were built
> >> from d0e7d07[1] not the apache-arrow-18.0.0-rc0 tag. It's
> >> my fault. Sorry.
> >>
> >> [0] https://github.com/apache/arrow/releases/tag/apache-arrow-18.0.0-rc0
> >> [1] https://github.com/apache/arrow/pull/0#issuecomment-2416557688
> >>
> >> I think that we should release 18.0.1. Is there any
> >> committer or PMC member who wants to volunteer as 18.0.1 release
> >> manager? I'll help you. (I think we can have multiple
> >> release managers for one release to reduce each release
> >> manager's cost.)
> >>
> >> See also
> >> https://arrow.apache.org/docs/developers/release.html for
> >> release related tasks.
> >>
> >> In recent years, only some limited contributors worked on
> >> our releases. It may be better that we have more
> >> contributors working on our releases to keep releasing
> >> continuously.
> >>
> >>
> >> FYI: Here are problems of 18.0.0 RC0 binaries:
> >>
> >> Affected binaries:
> >>
> >> * Python wheels
> >> * Apache Arrow C++ binaries used by the R packages
> >> * Java jars
> >> * C# NuGets
> >> * MATLAB MLTBX
> >>
> >> Not affected binaries:
> >>
> >> * deb
> >> * RPM
> >>
> >> Affected details:
> >>
> >> Python wheels:
> >>
> >> * pyarrow.cpp_build_info.version is "18.0.0-SNAPSHOT" not "18.0.0"
> >>
> >> Apache Arrow C++ binaries used by the R packages:
> >>
> >> * ARROW_VERSION_STRING is "18.0.0-SNAPSHOT" not "18.0.0"
> >>   (I'm not sure the R bindings have API something like
> >>   pyarrow.cpp_build_info.version.)
> >>
> >> Java jars:
> >>
> >> * Version is "18.0.0-SNAPSHOT" not "18.0.0"
> >> * Bump logback.version from 1.5.8 to 1.5.10
> >> * Bump checker.framework.version from 3.48.0 to 3.48.1
> >>   (This may not be affected.)
> >> * Bump org.cyclonedx:cyclonedx-maven-plugin from 2.8.2 to 2.9.0
> >>   (This may not be affected.)
> >> * UnionMapWriter supports map()
> >>
> >> https://github.com/apache/arrow/commit/7df47483197186b564a9882ac6d1b5f32b2e3d51
> >>
> >> C# NuGets:
> >>
> >> * Version is "18.0.0-SNAPSHOT" not "18.0.0"
> >> * Bump Grpc.Tools from 2.66.0 to 2.67.0
> >> * Have a fix of Flight DoExchange incompatibility
> >>
> >> https://github.com/apache/arrow/commit/da5a2957accee3ec082199cfa2c305d19d9ada20
> >>
> >> MATLAB MLTBX:
> >>
> >> * Version is "18.0.0-SNAPSHOT" not "18.0.0"
> >>
> >>
> >> Thanks,
> >> --
> >> kou
> >>
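
The version problem described in the quoted list can be checked from Python. The sketch below is an illustration only: `is_snapshot_build` is a hypothetical helper name, while `pyarrow.cpp_build_info.version` is the attribute named in the message above. The pyarrow import is optional so the helper can be used on any version string.

```python
def is_snapshot_build(version):
    """Return True if a version string carries the -SNAPSHOT suffix."""
    return version.endswith("-SNAPSHOT")


if __name__ == "__main__":
    try:
        import pyarrow
    except ImportError:
        print("pyarrow is not installed; nothing to check")
    else:
        # Version of the Arrow C++ library that this pyarrow build links to.
        cpp_version = pyarrow.cpp_build_info.version
        kind = "snapshot" if is_snapshot_build(cpp_version) else "release"
        print(f"Arrow C++ {cpp_version} ({kind} build)")
```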


Re: [ANNOUNCE] New Arrow PMC member: Curt Hagenlocher

2024-10-30 Thread Bryce Mecum
Congrats Curt!

On Wed, Oct 30, 2024 at 2:56 PM Sutou Kouhei  wrote:
>
> The Project Management Committee (PMC) for Apache Arrow has
> invited Curt Hagenlocher to become a PMC member and we are
> pleased to announce that Curt Hagenlocher has accepted.
>
> Congratulations and welcome!


Re: [ANNOUNCE] New Arrow PMC chair: Neal Richardson

2024-10-30 Thread Bryce Mecum
Congratulations Neal!

On Wed, Oct 30, 2024 at 4:28 AM Andrew Lamb  wrote:
>
> I am pleased to announce that the Arrow Project has a new PMC chair and VP
> as per our tradition of rotating the chair once a year. Andy Grove has
> resigned and
> Neal Richardson was duly elected by the PMC and approved unanimously by the
> board.
>
> Please join me in congratulating Neal Richardson!
>
> Thanks,
> Andrew


Re: [ANNOUNCE] New Arrow committer: Rossi Sun

2024-10-22 Thread Bryce Mecum
Congrats, welcome, and thanks for all the great work so far!

On Tue, Oct 22, 2024 at 12:02 PM Weston Pace  wrote:
>
> On behalf of the Arrow PMC, I'm happy to announce that Rossi Sun has
> accepted an invitation to become a committer on Apache Arrow. Welcome,
> and thank you for your contributions!


Re: [VOTE] Release Apache Arrow 18.0.0 - RC0

2024-10-21 Thread Bryce Mecum
+1 (non-binding)

I ran `dev/release/verify-release-candidate.sh 18.0.0 0` on macOS 14.7
(aarch64).


On Fri, Oct 18, 2024 at 3:26 AM Raúl Cumplido  wrote:
>
> Hi,
>
> I would like to propose the following release candidate (RC0) of Apache
> Arrow version 18.0.0. This is a release consisting of 327
> resolved GitHub issues[1].
>
> This release candidate is based on commit:
> 9105a4109a80a1c01eabb24ee4b9f7c94ee942cb [2]
>
> The source release rc0 is hosted at [3].
> The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
> The changelog is located at [12].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [13] for how to validate a release candidate.
>
> See also a verification result on GitHub pull request [14].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow 18.0.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow 18.0.0 because...
>
> [1]: 
> https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A18.0.0+is%3Aclosed
> [2]: 
> https://github.com/apache/arrow/tree/9105a4109a80a1c01eabb24ee4b9f7c94ee942cb
> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-18.0.0-rc0
> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
> [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
> [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/18.0.0-rc0
> [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/18.0.0-rc0
> [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/18.0.0-rc0
> [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> [12]: 
> https://github.com/apache/arrow/blob/9105a4109a80a1c01eabb24ee4b9f7c94ee942cb/CHANGELOG.md
> [13]: https://arrow.apache.org/docs/developers/release_verification.html
> [14]: https://github.com/apache/arrow/pull/0


Re: [VOTE][RUST] Release Apache Arrow Rust 53.2.0 RC1

2024-10-21 Thread Bryce Mecum
+1 (non-binding)

I ran `dev/release/verify-release-candidate.sh 53.2.0 1` on macOS 14.7
(aarch64) using the 1.82.0 toolchain.

On Mon, Oct 21, 2024 at 11:27 AM Andrew Lamb  wrote:
>
> Hi,
>
> I would like to propose a release of Apache Arrow Rust Implementation,
> version 53.2.0. Note this is ahead of schedule for reasons explained on
> [5]. I expect to propose 53.3.0 in a few weeks' time.
>
> This release candidate is based on commit:
> 10c4059b40f838bb8f7bac5259cb499e6eceec88 [1]
>
> The proposed release tarball and signatures are hosted at [2].
>
> The changelog is located at [3].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. There is a script [4] that automates some of
> the verification.
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow Rust
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow Rust  because...
>
> [1]:
> https://github.com/apache/arrow-rs/tree/10c4059b40f838bb8f7bac5259cb499e6eceec88
> [2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-rs-53.2.0-rc1
> [3]:
> https://github.com/apache/arrow-rs/blob/10c4059b40f838bb8f7bac5259cb499e6eceec88/CHANGELOG.md
> [4]:
> https://github.com/apache/arrow-rs/blob/master/dev/release/verify-release-candidate.sh
> [5]: https://github.com/apache/arrow-rs/issues/6341


Re: [VOTE][Go] Release Apache Arrow Go 18.0.0 RC0

2024-10-16 Thread Bryce Mecum
+1 (non-binding)

Verified on macOS 14.7 aarch64

On Wed, Oct 16, 2024 at 12:10 PM Matt Topol  wrote:
>
> Hi,
>
> I would like to propose the following release candidate (RC0) of
> Apache Arrow Go version 18.0.0.
>
> This release candidate is based on commit:
> c124ae4449d8cb249bb870cd7a3c533f6ca17434 [1]
>
> The source release rc0 is hosted at [2].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [3] for how to validate a release candidate.
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow Go 18.0.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow Go 18.0.0 because...
>
> [1]:
> https://github.com/apache/arrow-go/tree/c124ae4449d8cb249bb870cd7a3c533f6ca17434
> [2]: https://github.com/apache/arrow-go/releases/v18.0.0-rc0
> [3]:
> https://github.com/apache/arrow-go/blob/main/dev/release/README.md#verify


Re: [VOTE] Release Apache Arrow nanoarrow 0.6.0

2024-10-09 Thread Bryce Mecum
Hi Dewey,

After seeing the test failure Kou reported, I attempted verification
on my x86 debian sid machine and got a Python test failure in
tests/test_iterator.py::test_get_tzinfo. I put the detailed output in
a gist [1]. I can reproduce it outside the verification script.

[1] https://gist.github.com/amoeba/416de429633108d15e610567937e3588


Re: [VOTE] Release Apache Arrow nanoarrow 0.6.0

2024-10-08 Thread Bryce Mecum
+1 (non-binding)

Verified on macOS 15.0.1 (aarch64) w/ Homebrew

On Tue, Oct 8, 2024 at 8:05 PM Dewey Dunnington
 wrote:
>
> Hello,
>
> I would like to propose the following release candidate (rc0) of
> Apache Arrow nanoarrow [0] version 0.6.0. This release consists of 114
> resolved GitHub issues from 10 contributors [1].
>
> This release candidate is based on commit:
> 33d2c8b973d8f8f424e02ac92ddeaace2a92f8dd [2]
>
> The source release rc0 is hosted at [3].
> The changelog is located at [4].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [5] for how to validate a release
> candidate, [6] for a suite of successful verification runs, and [7]
> for a (preliminary) draft release post.
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow nanoarrow 0.6.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow nanoarrow 0.6.0 because...
>
> [0] https://github.com/apache/arrow-nanoarrow
> [1] https://github.com/apache/arrow-nanoarrow/milestone/6?closed=1
> [2] 
> https://github.com/apache/arrow-nanoarrow/tree/apache-arrow-nanoarrow-0.6.0-rc0
> [3] 
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-nanoarrow-0.6.0-rc0/
> [4] 
> https://github.com/apache/arrow-nanoarrow/blob/apache-arrow-nanoarrow-0.6.0-rc0/CHANGELOG.md
> [5] https://github.com/apache/arrow-nanoarrow/blob/main/dev/release/README.md
> [6] https://github.com/apache/arrow-nanoarrow/actions/runs/11243985397
> [7] https://github.com/apache/arrow-site/pull/545


Re: [ANNOUNCE] New Arrow committer: Will Ayd

2024-10-01 Thread Bryce Mecum
Congrats Will!

On Tue, Oct 1, 2024 at 9:55 AM Dewey Dunnington
 wrote:
>
> On behalf of the Arrow PMC, I'm happy to announce that Will Ayd has
> accepted an invitation to become a committer on Apache Arrow. Welcome,
> and thank you for your contributions!
>
> -dewey


Re: [DISCUSS][Acero] Upgrading to 64-bit row offsets in row table

2024-08-01 Thread Bryce Mecum
Thanks for driving this forward. I didn't see the links in my email client
so I'm adding those below in case it helps others:

Issue: https://github.com/apache/arrow/issues/43495
PR: https://github.com/apache/arrow/pull/43389

On Thu, Aug 1, 2024 at 4:06 AM Ruoxi Sun  wrote:

> Hello everyone,
>
> We've identified an issue with Acero's hash join/aggregation, which is
> currently limited to processing only up to 4GB data due to the use of
> `uint32_t` for row offsets. This limitation not only impacts our ability to
> handle large datasets but also makes typical solutions like splitting the
> data into smaller batches ineffective.
>
> * Proposed solution
> We are considering upgrading the row offsets from 32-bit to 64-bit. This
> change would allow us to process larger datasets and expand Arrow's
> application possibilities.
>
> * Trade-offs to consider
> ** Pros: Allows handling of larger datasets, breaking the current 4GB
> limit.
> ** Cons: Each row would consume an additional 4 bytes of memory, and there
> might be slightly more CPU instructions involved in processing.
>
> Preliminary benchmarks indicate that the impact on CPU performance is
> minimal, so the main consideration is the increased memory consumption.
>
> * We need your feedback
> ** How would this change affect your current usage of Arrow, especially in
> terms of memory consumption?
> ** Do you have any concerns or thoughts about this proposal?
>
> Please review the detailed information in [1] and [2] and share your
> feedback. Your input is crucial as we gather community insights to decide
> whether or not to proceed with this change.
>
> Looking forward to your feedback and working together to enhance Arrow.
> Thank you!
>
> *Regards,*
> *Rossi SUN*
>
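
To make the trade-off in the quoted proposal concrete, here is a simplified model in Python. It assumes fixed-width rows and one stored offset per row, which is a simplification and not a description of how Acero's row table is actually laid out; the function names are hypothetical.

```python
UINT32_MAX = 2**32 - 1  # largest byte offset a uint32_t can hold (just under 4GB)


def fits_in_uint32_offsets(num_rows, row_width_bytes):
    """Return True if the end offset of the last row fits in a uint32."""
    return num_rows * row_width_bytes <= UINT32_MAX


def extra_offset_bytes(num_rows):
    """Extra memory needed when each offset grows from 4 to 8 bytes."""
    return num_rows * (8 - 4)


if __name__ == "__main__":
    rows, width = 100_000_000, 64
    print("fits in 32-bit offsets:", fits_in_uint32_offsets(rows, width))
    print("extra bytes with 64-bit offsets:", extra_offset_bytes(rows))
```

In this model, 100 million rows of 64 bytes each need 6.4 billion bytes, which exceeds the 32-bit range, while widening the offsets costs 400 million additional bytes.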


Re: [VOTE] Release Apache Arrow nanoarrow 0.5.0

2024-05-22 Thread Bryce Mecum
+1 (non-binding)

Verified on:

- macOS aarch64
- Debian 12 x86_64 inside a conda environment (note I had to install
Python 3.11 separately from the instructions, not sure if I missed a
step)

On Wed, May 22, 2024 at 10:18 AM Dewey Dunnington
 wrote:
>
> Hello,
>
> I would like to propose the following release candidate (rc0) of
> Apache Arrow nanoarrow [0] version 0.5.0. This is an initial release
> consisting of 79 resolved GitHub issues from 9 contributors [1].
>
> This release candidate is based on commit:
> c5fb10035c17b598e6fd688ad9eb7b874c7c631b [2]
>
> The source release rc0 is hosted at [3].
> The changelog is located at [4].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [5] for how to validate a release
> candidate.
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow nanoarrow 0.5.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow nanoarrow 0.5.0 because...
>
> [0] https://github.com/apache/arrow-nanoarrow
> [1] https://github.com/apache/arrow-nanoarrow/milestone/5?closed=1
> [2] 
> https://github.com/apache/arrow-nanoarrow/tree/apache-arrow-nanoarrow-0.5.0-rc0
> [3] 
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-nanoarrow-0.5.0-rc0/
> [4] 
> https://github.com/apache/arrow-nanoarrow/blob/apache-arrow-nanoarrow-0.5.0-rc0/CHANGELOG.md
> [5] https://github.com/apache/arrow-nanoarrow/blob/main/dev/release/README.md


Re: Flight Python EC2 Server for parquet on S3

2024-05-10 Thread Bryce Mecum
Hi Christian, welcome.

Your code looks reasonable to me at first glance. It does seem
possible you're resource-constrained with that t2.micro instance. You
might try using a larger instance or reducing the batch size in your
call to iter_batches [1] to some very small number.

[1] 
https://arrow.apache.org/docs/python/generated/pyarrow.parquet.ParquetFile.html#pyarrow.parquet.ParquetFile.iter_batches

On Fri, May 10, 2024 at 7:30 AM Christian Casazza
 wrote:
>
> Hello everyone,
>
> This is my first time emailing this mailing list, so I hope I am explaining
> things correctly below.
>
> I am attempting to get started with Arrow Flight. I am storing parquet
> files and Iceberg tables on S3. I would like to use arrow flight as the
> interface data consumers use to access my data so they always receive Arrow
> back, where they can then continue to iterate locally with DuckDB, polars,
> etc.
>
> I am first attempting to get it working with a single parquet file in a
> private bucket on S3. For this test, I am just putting the credentials and
> paths directly in the server code; once it works, I can move them to
> environment variables before production.
>
> The parquet file is about 0.6GB. I am running the server on a t2.micro
> EC2 instance.
>
> I was originally running into an ACCESS_DENIED during HeadObject operation
> AWS error when attempting to get the flight_info metadata about the file.
> Based on this issue, I added s3fs, and I was able to avoid the HeadObject
> error. So, the client is
> able to successfully see the available datasets, and return the schema.
>
> When I attempt to actually download the data itself, it is causing my EC2
> instance to break down and my SSH connection to drop. Is this likely a
> memory issue, or something with my code?
>
>
> The goal is to provide users with a common interface to access my data.
> After getting this working, I would add more datasets, data sources,
> introduce auth and RBAC, etc. For now, I thought this was a good base
> starting point. For now, the user simply downloads the entire
> dataset. In the future, I hope to figure out an easy interface to support
> more fine-grained data/table scans, or to support running a query first so
> that only the desired data is returned.
>
> To keep things simple, I just added my code here
> (https://github.com/ChristianCasazza/arrowflights3example).
> When I was actually testing, I connected to the EC2 instance through VScode
> for the server, and I was running the client code locally in a different
> window. I removed my actual parquet file path and credentials.
>
>
> This is my first time working with Arrow Flight, so I apologize if I am
> overlooking something simple or if the answer was in the docs.
>
> Any suggestions for changes I can make to get the data download working, or
> clear errors I am making?
>
> Thank you!
>
> Best,
> Christian Casazza


Re: [VOTE] Release Apache Arrow 16.1.0 - RC1

2024-05-09 Thread Bryce Mecum
+1 (non-binding)

I ran TEST_DEFAULT=0 TEST_CPP=1
./dev/release/verify-release-candidate.sh 16.1.0 1 on aarch64 macOS
14.4.1 with Homebrew. I did run into one failing test which I've filed
as [1].

[1] https://github.com/apache/arrow/issues/41605

On Thu, May 9, 2024 at 5:05 AM Raúl Cumplido  wrote:
>
> Hi,
>
> I would like to propose the following release candidate (RC1) of Apache
> Arrow version 16.1.0. This is a release consisting of 35
> resolved GitHub issues[1].
>
> This release candidate is based on commit:
> 7dd1d34074af176d9e861a360e135ae57b21cf96 [2]
>
> The source release rc1 is hosted at [3].
> The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
> The changelog is located at [12].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [13] for how to validate a release candidate.
>
> See also a verification result on GitHub pull request [14].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow 16.1.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow 16.1.0 because...
>
> [1]: 
> https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A16.1.0+is%3Aclosed
> [2]: 
> https://github.com/apache/arrow/tree/7dd1d34074af176d9e861a360e135ae57b21cf96
> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-16.1.0-rc1
> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
> [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
> [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/16.1.0-rc1
> [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/16.1.0-rc1
> [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/16.1.0-rc1
> [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> [12]: 
> https://github.com/apache/arrow/blob/7dd1d34074af176d9e861a360e135ae57b21cf96/CHANGELOG.md
> [13]: https://arrow.apache.org/docs/developers/release_verification.html
> [14]: https://github.com/apache/arrow/pull/41600


Re: [ANNOUNCE] New Arrow committer: Dane Pitkin

2024-05-07 Thread Bryce Mecum
Congrats Dane!

On Tue, May 7, 2024 at 5:53 AM Joris Van den Bossche
 wrote:
>
> On behalf of the Arrow PMC, I'm happy to announce that Dane Pitkin has
> accepted an invitation to become a committer on Apache Arrow. Welcome,
> and thank you for your contributions!
>
> Joris


Re: [DISCUSS] Status and future of @ApacheArrow Twitter account

2024-04-15 Thread Bryce Mecum
Apologies for letting this thread go cold. I did want to post the resolution.

The Arrow PMC does have access to the Twitter account and I believe
it's Raúl who has resumed tweeting about blogs/releases starting with
15.0.1. Thanks Raúl and the other PMC members for the help.

No additional social media platforms have been adopted at this time
and I appreciate all comments others have shared.

Thanks all.

On Sat, Jan 27, 2024 at 1:06 PM Bryce Mecum  wrote:
>
> I noticed that the @ApacheArrow Twitter account [1] hasn't posted
> since June 2023 which is around the time of the Arrow 12 release. When
> I asked on Zulip [2] about who runs or has access to post as that
> account, Kou indicated the account was managed using TweetDeck [3] and
> that this may no longer be an option due to subscription changes.
>
> I'm writing to get a sense of who currently has access and how the
> community would like to move forward with using the account. I'm also
> volunteering to help manage it.
>
> My questions are:
>
> - Who has access to @ApacheArrow [1]?
> - Is the community still interested in engaging on Twitter?
> - Is the community interested in other platforms, potentially just
> engaging with them through cross-posting?
>
> Thanks,
> Bryce
>
> [1] https://twitter.com/ApacheArrow
> [2] 
> https://ursalabs.zulipchat.com/#narrow/stream/180245-dev/topic/ApacheArrow.20Twitter.20account/near/418346643
> [3] https://en.wikipedia.org/wiki/Tweetdeck


Re: [ANNOUNCE] New Arrow committer: Sarah Gilmore

2024-04-11 Thread Bryce Mecum
Congratulations!

On Thu, Apr 11, 2024 at 3:13 AM Sutou Kouhei  wrote:
>
> Hi,
>
> On behalf of the Arrow PMC, I'm happy to announce that Sarah
> Gilmore has accepted an invitation to become a committer on
> Apache Arrow. Welcome, and thank you for your contributions!
>
> Thanks,
> --
> kou


Re: [ANNOUNCE] New Committer Joel Lubinitsky

2024-04-01 Thread Bryce Mecum
Congrats, Joel!

On Mon, Apr 1, 2024 at 6:59 AM Matt Topol  wrote:
>
> On behalf of the Arrow PMC, I'm happy to announce that Joel Lubinitsky has
> accepted an invitation to become a committer on Apache Arrow. Welcome, and
> thank you for your contributions!
>
> --Matt


Re: [VOTE] Release Apache Arrow ADBC 0.11.0 - RC0

2024-03-28 Thread Bryce Mecum
+1 (non-binding)

Verified on Windows x86_64 with USE_CONDA=1. I ran into two small
issues and filed an issue [1] and sent in a PR containing the changes
I made.

[1] https://github.com/apache/arrow-adbc/issues/1683

On Thu, Mar 28, 2024 at 7:07 AM David Li  wrote:
>
> Hello,
>
> I would like to propose the following release candidate (RC0) of Apache Arrow 
> ADBC version 0.11.0. This is a release consisting of 36 resolved GitHub 
> issues [1].
>
> This release candidate is based on commit: 
> 3cb5825bf551ae93d0e9ed2f64be226b569b27a7 [2]
>
> The source release rc0 is hosted at [3].
> The binary artifacts are hosted at [4][5][6][7][8].
> The changelog is located at [9].
>
> Please download, verify checksums and signatures, run the unit tests, and 
> vote on the release. See [10] for how to validate a release candidate.
>
> See also a verification result on GitHub Actions [11].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow ADBC 0.11.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow ADBC 0.11.0 because...
>
> Note: to verify APT/YUM packages on macOS/AArch64, you must `export 
> DOCKER_DEFAULT_PLATFORM=linux/amd64`. (Or skip this step by `export 
> TEST_APT=0 TEST_YUM=0`.)
>
> [1]: 
> https://github.com/apache/arrow-adbc/issues?q=is%3Aissue+milestone%3A%22ADBC+Libraries+0.11.0%22+is%3Aclosed
> [2]: 
> https://github.com/apache/arrow-adbc/commit/3cb5825bf551ae93d0e9ed2f64be226b569b27a7
> [3]: 
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-adbc-0.11.0-rc0/
> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> [5]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> [6]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> [7]: 
> https://repository.apache.org/content/repositories/staging/org/apache/arrow/adbc/
> [8]: 
> https://github.com/apache/arrow-adbc/releases/tag/apache-arrow-adbc-0.11.0-rc0
> [9]: 
> https://github.com/apache/arrow-adbc/blob/apache-arrow-adbc-0.11.0-rc0/CHANGELOG.md
> [10]: 
> https://arrow.apache.org/adbc/main/development/releasing.html#how-to-verify-release-candidates
> [11]: https://github.com/apache/arrow-adbc/actions/runs/8468352632


Re: [DISCUSS] Conventions for transporting Arrow data over HTTP

2024-03-11 Thread Bryce Mecum
I'd be happy to contribute C# and Ruby examples. I'll work on those this week.
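
As a rough sketch of the separation Ian describes in the quoted discussion below (application-specific fields in the query string and headers, Arrow bytes alone in the body), a client request could be assembled along these lines. The `/upload` path, the `table` parameter, and the `X-Request-Id` header are hypothetical, and the payload is a placeholder rather than a real Arrow IPC stream. The media type is the one registered for the Arrow IPC streaming format.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Media type for the Arrow IPC streaming format.
ARROW_STREAM_TYPE = "application/vnd.apache.arrow.stream"


def build_upload_request(base_url, table_name, request_id, arrow_bytes):
    """Build (but do not send) a POST whose body holds only Arrow data."""
    # Application-specific fields travel in the query string and headers
    # (hypothetical names), so the body stays a pure Arrow payload.
    query = urlencode({"table": table_name})
    return Request(
        f"{base_url}/upload?{query}",
        data=arrow_bytes,
        method="POST",
        headers={
            "Content-Type": ARROW_STREAM_TYPE,
            "X-Request-Id": request_id,
        },
    )


if __name__ == "__main__":
    placeholder = b"<arrow ipc stream bytes>"  # stand-in, not real Arrow data
    req = build_upload_request(
        "http://localhost:8000", "my table", "abc-123", placeholder
    )
    print(req.get_method(), req.full_url)
```

Nothing is sent over the network here; the point is only that no framing layer is needed when the metadata lives outside the body.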

On Tue, Mar 5, 2024 at 7:03 PM Ian Cook  wrote:
>
> Update on recent progress in this Arrow-over-HTTP project:
>
> I cleaned up the minimal examples of HTTP clients and servers and
> moved them into a directory in the Arrow Experiments repo:
> https://github.com/apache/arrow-experiments/tree/main/http
>
> So far there are client examples in six languages and server examples
> in two languages (Python and Go). They all have READMEs describing how
> to use them.
>
> I have an open PR that adds a third server example in Java. Reviews 
> appreciated:
> https://github.com/apache/arrow-experiments/pull/4
>
> I would like to see minimal client and server examples in a few more
> languages (especially Rust) before we move on to developing richer
> types of examples. Is anyone interested in contributing additional
> minimal examples?
>
> Thanks,
> Ian
>
> On Wed, Dec 6, 2023 at 2:29 PM Ian Cook  wrote:
> >
> > I just remembered that there is an unused "Arrow Experiments" repo [1]
> > which Wes created a few years ago [2]. That seems like a more
> > appropriate place to open PRs like this one. If there are no
> > objections, I will start using that repo for these Arrow-over-HTTP
> > PRs.
> >
> > [1] https://github.com/apache/arrow-experiments
> > [2] https://lists.apache.org/thread/cw14s874pwplzf9ycnvfwtwq0xq17npg
> >
> > Ian
> >
> > On Wed, Dec 6, 2023 at 1:45 PM Ian Cook  wrote:
> > >
> > > Antoine,
> > >
> > > Thank you for taking a look. I agree—these are basic examples intended
> > > to prove the concept and answer fundamental questions. Next I intend
> > > to expand the set of examples to cover more complex cases.
> > >
> > > > This might necessitate some kind of framing layer, or a
> > > > standardized delimiter.
> > >
> > > I am interested to hear more perspectives on this. My perspective is
> > > that we should recommend using HTTP conventions to keep clean
> > > separation between the Arrow-formatted binary data payloads and the
> > > various application-specific fields. This can be achieved by encoding
> > > application-specific fields in URI paths, query parameters, headers,
> > > or separate parts of multipart/form-data messages.
> > >
> > > Ian
> > >
> > > On Wed, Dec 6, 2023 at 1:24 PM Antoine Pitrou  wrote:
> > > >
> > > >
> > > > Hi,
> > > >
> > > > While this looks like a nice start, I would expect more precise
> > > > recommendations for writing non-trivial services. Especially, one
> > > > question is how to send both an application-specific POST request and an
> > > > Arrow stream, or an application-specific GET response and an Arrow
> > > > stream. This might necessitate some kind of framing layer, or a
> > > > standardized delimiter.
> > > >
> > > > Regards
> > > >
> > > > Antoine.
> > > >
> > > >
> > > >
> > > > Le 05/12/2023 à 21:10, Ian Cook a écrit :
> > > > > This is a continuation of the discussion entitled "[DISCUSS] Protocol 
> > > > > for
> > > > > exchanging Arrow data over REST APIs". See the previous messages at
> > > > > https://lists.apache.org/thread/vfz74gv1knnhjdkro47shzd1z5g5ggnf.
> > > > >
> > > > > To inform this discussion, I created some basic Arrow-over-HTTP 
> > > > > client and
> > > > > server examples here:
> > > > > https://github.com/apache/arrow/pull/39081
> > > > >
> > > > > My intention is to expand and improve this set of examples (with your 
> > > > > help)
> > > > > until they reflect a set of conventions that we are comfortable 
> > > > > documenting
> > > > > as recommendations.
> > > > >
> > > > > Please take a look and add comments / suggestions in the PR.
> > > > >
> > > > > Thanks,
> > > > > Ian
> > > > >
> > > > > On Tue, Nov 21, 2023 at 1:35 PM Dewey Dunnington
> > > > >  wrote:
> > > > >
> > > > >> I also think a set of best practices for Arrow over HTTP would be a
> > > > >> valuable resource for the community...even if it never becomes a
> > > > >> specification of its own, it will be beneficial for API developers 
> > > > >> and
> > > > >> consumers of those APIs to have a place to look to understand how
> > > > >> Arrow can help improve throughput/latency/maybe other things. 
> > > > >> Possibly
> > > > >> something like httpbin.org but for requests/responses that use Arrow
> > > > >> would be helpful as well. Thank you Ian for leading this effort!
> > > > >>
> > > > >> It has mostly been covered already, but in the (ubiquitous) situation
> > > > >> where a response contains some schema/table and some non-schema/table
> > > > >> information there is some tension between throughput (best served by 
> > > > >> a
> > > > >> JSON response plus one or more IPC stream responses) and latency 
> > > > >> (best
> > > > >> served by a single HTTP response? JSON? IPC with metadata/header?). 
> > > > >> In
> > > > >> addition to Antoine's list, I would add:
> > > > >>
> > > > >> - How to serve the same table in multiple requests (e.g., to saturate
> > > > >> a network connection,

Re: [VOTE] Release Apache Arrow 15.0.1 - RC0

2024-03-05 Thread Bryce Mecum
+1 (non-binding)

Verified C++ on Windows 11, VS 2019, conda. I ran into an issue
building PyArrow [1] which I don't think is a problem for release.

[1] https://github.com/apache/arrow/issues/40375

On Mon, Mar 4, 2024 at 12:05 AM Raúl Cumplido  wrote:
>
> Hi,
>
> I would like to propose the following release candidate (RC0) of Apache
> Arrow version 15.0.1. This is a release consisting of 37
> resolved GitHub issues[1].
>
> This release candidate is based on commit:
> 5ce6ff434c1e7daaa2d7f134349f3ce4c22683da [2]
>
> The source release rc0 is hosted at [3].
> The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
> The changelog is located at [12].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [13] for how to validate a release candidate.
>
> See also a verification result on GitHub pull request [14].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow 15.0.1
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow 15.0.1 because...
>
> [1]: 
> https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A15.0.1+is%3Aclosed
> [2]: 
> https://github.com/apache/arrow/tree/5ce6ff434c1e7daaa2d7f134349f3ce4c22683da
> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-15.0.1-rc0
> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
> [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
> [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/15.0.1-rc0
> [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/15.0.1-rc0
> [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/15.0.1-rc0
> [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> [12]: 
> https://github.com/apache/arrow/blob/5ce6ff434c1e7daaa2d7f134349f3ce4c22683da/CHANGELOG.md
> [13]: 
> https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
> [14]: https://github.com/apache/arrow/pull/40211


Re: R Date class lost when column used for partitioning

2024-03-01 Thread Bryce Mecum
Hi Andrew, thanks for the question.

Try specifying a schema to `open_dataset` with d1 specified as
date32[day]. When I do that, I get the correct type for that field and
the values look correct too.

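# Start from the schema Arrow inferred, where d1 comes back as string
schm <- schema(bb)
# Replace the field at 0-based index 6 (d1) with a date32 field
new_schm <- schm$SetField(6, arrow::field("d1", arrow::date32()))
# Re-open the dataset with the corrected schema
bb <- arrow::open_dataset(..., schema = new_schm)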
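
For context on why this cast is well defined: the partition directory names are ISO date strings, while Arrow's date32[day] stores a count of days since 1970-01-01. A minimal Python sketch of that correspondence, using only the standard library (the helper names are made up for illustration and are not part of the arrow package):

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def partition_name_to_date32(name):
    """Parse a directory name like '2024-01-01' into days since the epoch."""
    return (date.fromisoformat(name) - EPOCH).days


def date32_to_partition_name(days):
    """Turn a day count back into the ISO string used as a directory name."""
    return (EPOCH + timedelta(days=days)).isoformat()
```

The mapping is lossless in both directions, which is why supplying the field type in the schema is enough for the values to come back as dates.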

On Tue, Feb 27, 2024 at 7:59 PM Andrew Piskorski  wrote:
>
> Hi, using the R arrow package version 14.0.2.1, I'm stumped by
> something seemingly simple.  For date columns, I like to use R's Date
> class, which is stored internally as a number but prints as a
> YYYY-MM-DD string.
>
> In most cases arrow handles these Date columns nicely.  The exception
> is when I partition on a Date column, as in column "d1" in my example
> below.  When I read my data back in with open_dataset(), the d1 column
> is now a string instead of Date.  In contrast, the types of all the
> other columns are preserved, including my "d2" Date column, because I
> did not partition on that one.
>
> It sort of makes sense that d1 is now a string, because the directory
> names on disk really are strings like "2024-01-01".  But I'd really
> like to convert it back to the Date class format!  In plain R that's
> easy, but with the Dataset mmap-ed on disk, I don't know how to do it.
>
> What should I do to get arrow to convert the partitioned d1 column to
> Arrow's date32[day] type, and thus back to R's Date class?  Can I
> somehow do this directly on the Dataset object itself, WITHOUT first
> converting it to ArrowTabular or data.frame?
>
> Thanks for your help!
>
>
> Example follows:
> --
> require("arrow")
> my.dir <- "/tmp/arrow"
> # Example data with some Date-class columns:
> aa <- do.call("rbind" ,lapply(split(iris ,iris$Species) ,function(xx){
>    cbind(head(xx ,5)
>      ,d1=(as.Date('2024-01-01') + 0:4)
>      ,d2=(as.Date('1980-01-01') + 0:4))
> })); rownames(aa) <- NULL
> arrow::write_dataset(aa ,my.dir ,partitioning=c('d1') ,hive_style=FALSE 
> ,format="feather" ,codec=Codec$create("LZ4_FRAME"))
> bb <- arrow::open_dataset(my.dir ,format="feather" ,unify_schemas=TRUE 
> ,partitioning=c('d1'))
> # Unfortunately the "d1" column is now a string.
>
> > dim(aa)
> [1] 15  7
> > class(aa)
> [1] "data.frame"
>
> > sapply(aa ,class)
> Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species           d1           d2
>    "numeric"    "numeric"    "numeric"    "numeric"     "factor"       "Date"       "Date"
> > sapply(aa ,storage.mode)
> Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species           d1           d2
>     "double"     "double"     "double"     "double"    "integer"     "double"     "double"
>
> > dim(bb)
> [1] 15  7
> > class(bb)
> [1] "FileSystemDataset" "Dataset"   "ArrowObject"   "R6"
>
> > bb$schema$d1
> Field
> d1: string
> > bb$schema$d2
> Field
> d2: date32[day]
>
> > bb
> FileSystemDataset with 5 Feather files
> Sepal.Length: double
> Sepal.Width: double
> Petal.Length: double
> Petal.Width: double
> Species: dictionary
> d2: date32[day]
> d1: string
>
> See $metadata for additional Schema metadata
>
> > sapply(arrow:::as.data.frame.ArrowTabular(bb$NewScan()$Finish()$ToTable()) ,class)
> Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species           d2           d1
>    "numeric"    "numeric"    "numeric"    "numeric"     "factor"       "Date"  "character"
> --
>
> --
> Andrew Piskorski 


Re: [C#][Flight RPC] DoExchange method support

2024-02-16 Thread Bryce Mecum
Hi Sujud. It looks like there was some mention of the reason for not
implementing DoExchange in the PR [1] that created the initial
implementation. I'm not familiar with the code but I hope that helps.

[1] https://github.com/apache/arrow/pull/8694

On Tue, Feb 13, 2024 at 11:59 PM Sujud Abu Atta
 wrote:
>
> Hi,
>
> Currently .NET nuget Apache.Arrow.Flight does not have support for DoExchange 
> method.
> This is also documented here: https://arrow.apache.org/docs/status.html
>
> I was wondering if there are any specific reasons why it's not
> implemented. Has this been discussed before? I would love to gain insights
> before contributing to it.
>
> Thanks,
> Sujud
>


Re: [DISCUSS] Status and future of @ApacheArrow Twitter account

2024-01-31 Thread Bryce Mecum
Thanks all for the replies so far. To summarize, the answers to my
original questions are:

> 1. Who has access to @ApacheArrow [1]?

Raúl messaged the PMC list (thanks!) and some members of the PMC still
have the original credentials for the @ApacheArrow Twitter account.

> 2. Is the community still interested in engaging on Twitter?

Generally yes.

> 3. Is the community interested in other platforms, potentially just engaging 
> with them through cross-posting?

Following Wes' question above, how would others feel if this project
created LinkedIn and Mastodon accounts and cross-posted on all three
platforms?

If there isn't sufficient interest in additional platforms, I think it
would be best to obtain an X Premium account ($8/mo) to make sharing
the account easier and less likely to be considered a violation of the
X ToS.

If there _is_ sufficient interest in additional platforms,
cross-posting using a tool like Buffer seems like the best option even
if the project doesn't create any new accounts at this time.




On Sat, Jan 27, 2024 at 1:06 PM Bryce Mecum  wrote:
>
> I noticed that the @ApacheArrow Twitter account [1] hasn't posted
> since June 2023 which is around the time of the Arrow 12 release. When
> I asked on Zulip [2] about who runs or has access to post as that
> account, Kou indicated the account was managed using TweetDeck [3] and
> that this may no longer be an option due to subscription changes.
>
> I'm writing to get a sense of who currently has access and how the
> community would like to move forward with using the account. I'm also
> volunteering to help manage it.
>
> My questions are:
>
> - Who has access to @ApacheArrow [1]?
> - Is the community still interested in engaging on Twitter?
> - Is the community interested in other platforms, potentially just
> engaging with them through cross-posting?
>
> Thanks,
> Bryce
>
> [1] https://twitter.com/ApacheArrow
> [2] 
> https://ursalabs.zulipchat.com/#narrow/stream/180245-dev/topic/ApacheArrow.20Twitter.20account/near/418346643
> [3] https://en.wikipedia.org/wiki/Tweetdeck


Re: [DISCUSS] Status and future of @ApacheArrow Twitter account

2024-01-30 Thread Bryce Mecum
Buffer [1] looks like a reasonably similar, cheaper alternative to
Hootsuite. For the scale of this project, what costs at least $249/mo
with Hootsuite costs $36/mo with Buffer. X Premium [2] is $8/mo and
would give the project back the ability for multiple people to post to
@ApacheArrow from their personal accounts.

[1] https://buffer.com
[2] https://help.twitter.com/en/using-x/x-premium

On Mon, Jan 29, 2024 at 9:50 AM Wes McKinney  wrote:
>
> Is there a different tool other than TweetDeck available that can
> synchronize posts that go out on different social channels (LinkedIn,
> Twitter, Mastodon, etc.)? I've heard of things like Hootsuite but that's
> pretty expensive and definitely overkill for an open source project, but
> perhaps there is a more modest tool that would help with mirroring content
> onto different platforms.
>
> On Sat, Jan 27, 2024 at 5:39 PM Antoine Pitrou  wrote:
>
> >
> > My 2 cents : I don't understand what an open source project gains by
> > publishing on a microblogging platform.
> >
> > As for Twitter specifically, its recent governance changes would be good
> > reason for terminating the @ApacheArrow account, IMHO.
> >
> > Regards
> >
> > Antoine.
> >
> >
> > Le 27/01/2024 à 23:06, Bryce Mecum a écrit :
> > > I noticed that the @ApacheArrow Twitter account [1] hasn't posted
> > > since June 2023 which is around the time of the Arrow 12 release. When
> > > I asked on Zulip [2] about who runs or has access to post as that
> > > account, Kou indicated the account was managed using TweetDeck [3] and
> > > that this may no longer be an option due to subscription changes.
> > >
> > > I'm writing to get a sense of who currently has access and how the
> > > community would like to move forward with using the account. I'm also
> > > volunteering to help manage it.
> > >
> > > My questions are:
> > >
> > > - Who has access to @ApacheArrow [1]?
> > > - Is the community still interested in engaging on Twitter?
> > > - Is the community interested in other platforms, potentially just
> > > engaging with them through cross-posting?
> > >
> > > Thanks,
> > > Bryce
> > >
> > > [1] https://twitter.com/ApacheArrow
> > > [2]
> > https://ursalabs.zulipchat.com/#narrow/stream/180245-dev/topic/ApacheArrow.20Twitter.20account/near/418346643
> > > [3] https://en.wikipedia.org/wiki/Tweetdeck
> >


Re: [DISCUSS] Status and future of @ApacheArrow Twitter account

2024-01-30 Thread Bryce Mecum
On Sat, Jan 27, 2024 at 2:39 PM Antoine Pitrou  wrote:
>
>
> My 2 cents : I don't understand what an open source project gains by
> publishing on a microblogging platform.
>
> As for Twitter specifically, its recent governance changes would be good
> reason for terminating the @ApacheArrow account, IMHO.

Thanks for sharing this perspective, Antoine. I suspect you're not
alone in it but, based in part on other responses in this thread so
far, I suspect the community thinks the positives outweigh the
negatives at this point.

>
> Regards
>
> Antoine.
>
>
> Le 27/01/2024 à 23:06, Bryce Mecum a écrit :
> > I noticed that the @ApacheArrow Twitter account [1] hasn't posted
> > since June 2023 which is around the time of the Arrow 12 release. When
> > I asked on Zulip [2] about who runs or has access to post as that
> > account, Kou indicated the account was managed using TweetDeck [3] and
> > that this may no longer be an option due to subscription changes.
> >
> > I'm writing to get a sense of who currently has access and how the
> > community would like to move forward with using the account. I'm also
> > volunteering to help manage it.
> >
> > My questions are:
> >
> > - Who has access to @ApacheArrow [1]?
> > - Is the community still interested in engaging on Twitter?
> > - Is the community interested in other platforms, potentially just
> > engaging with them through cross-posting?
> >
> > Thanks,
> > Bryce
> >
> > [1] https://twitter.com/ApacheArrow
> > [2] 
> > https://ursalabs.zulipchat.com/#narrow/stream/180245-dev/topic/ApacheArrow.20Twitter.20account/near/418346643
> > [3] https://en.wikipedia.org/wiki/Tweetdeck


[DISCUSS] Status and future of @ApacheArrow Twitter account

2024-01-27 Thread Bryce Mecum
I noticed that the @ApacheArrow Twitter account [1] hasn't posted
since June 2023 which is around the time of the Arrow 12 release. When
I asked on Zulip [2] about who runs or has access to post as that
account, Kou indicated the account was managed using TweetDeck [3] and
that this may no longer be an option due to subscription changes.

I'm writing to get a sense of who currently has access and how the
community would like to move forward with using the account. I'm also
volunteering to help manage it.

My questions are:

- Who has access to @ApacheArrow [1]?
- Is the community still interested in engaging on Twitter?
- Is the community interested in other platforms, potentially just
engaging with them through cross-posting?

Thanks,
Bryce

[1] https://twitter.com/ApacheArrow
[2] 
https://ursalabs.zulipchat.com/#narrow/stream/180245-dev/topic/ApacheArrow.20Twitter.20account/near/418346643
[3] https://en.wikipedia.org/wiki/Tweetdeck


Re: What's wrong with my TLS reasoning and FlightServerBase ?

2023-12-30 Thread Bryce Mecum
Thanks David, and apologies to Rick. I missed that you were starting
your server without TLS (as well as the client) and confused things
here.
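
To restate David's point (quoted below) in code form: whether a Flight location means TLS or plaintext follows from the URI scheme alone, not from whether certificates were supplied. The sketch below only inspects URI strings with the standard library. It does not call PyArrow, and `location_uses_tls` is a hypothetical helper.

```python
from urllib.parse import urlparse


def location_uses_tls(uri):
    """Return True only for the grpc+tls scheme; other schemes are not TLS."""
    return urlparse(uri).scheme == "grpc+tls"


if __name__ == "__main__":
    for uri in ("grpc://localhost:8081", "grpc+tls://localhost:8081"):
        mode = "TLS" if location_uses_tls(uri) else "plaintext"
        print(f"{uri} -> {mode}")
```

In Rick's example the server location comes from Location.for_grpc_tcp and the client URI uses grpc://, so both sides end up plaintext under this rule and the garbage certificates are never used.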


On Sat, Dec 30, 2023 at 3:03 PM David Li  wrote:
>
> Just to be clear - the server never supports both TLS and plaintext 
> connections at the same time. (I don't believe this is possible in gRPC.) The 
> URI scheme determines how the server listens so if you don't use grpc+tls:// 
> it will use plaintext regardless of if you pass certificates or not. The code 
> could do more input validation in this case but it was never listening using 
> TLS in the first place.
>
> On Sat, Dec 30, 2023, at 18:57, Bryce Mecum wrote:
> > Hi Rick,
> >
> > You're right that TLS support is built into PyArrow Flight [1]. I
> > think the issue with your code is that your client isn't attempting to
> > connect over TLS and that the default behavior of the FlightServerBase
> > must be to allow both TLS and non-TLS connections. This seems to be
> > similar to how web servers might choose to accept connections over
> > HTTP and HTTPS (though many may not).
> >
> > To make your code fail as you expect, see [1] and, in your client
> > code, either change server_location to use
> > pyarrow.flight.Location.for_grpc_tls to construct the Location object
> > or change your URI to "grpc+tls://localhost:8081" instead of just
> > "grpc://localhost:8081". Once you change this, your client should fail
> > with an SSL handshake error.
> >
> > [1] https://arrow.apache.org/docs/python/flight.html#enabling-tls
> >
> > On Sat, Dec 30, 2023 at 2:20 PM Rick Spencer
> >  wrote:
> >>
> >> I am working on supporting TLS, and it looks like everything that I need is
> >> built into FlightServerBase.
> >>
> >> However, I am struggling to understand how it works, or how to test that it
> >> is working. For example, I don't understand why I can pass garbage in for
> >> the tls_certs, and still get results when called from a client. Here is a
> >> minimal example I put together to show where I am confused.
> >>
> >> Server that I think should not work:
> >> ```python
> >> from pyarrow import flight, Table
> >>
> >> class SampleServer(flight.FlightServerBase):
> >>     def __init__(self, *args, **kwargs):
> >>         tls_certificates = [("garbage", "garbage")]
> >>         location = flight.Location.for_grpc_tcp("localhost", 8081)
> >>         super(SampleServer, self).__init__(location,
> >>                                            None,
> >>                                            tls_certificates,
> >>                                            False,
> >>                                            None,
> >>                                            *args, **kwargs)
> >>
> >>     def do_get(self, context, ticket):
> >>         data = {'col': [1]}
> >>         table = Table.from_pydict(data)
> >>         return flight.RecordBatchStream(table)
> >>
> >> if __name__ == "__main__":
> >>     server = SampleServer()
> >>     server.serve()
> >> ```
> >>
> >> Client code that I think should not work:
> >> ```python
> >> import pyarrow.flight as fl
> >> import json
> >> def main():
> >>     server_location = "grpc://localhost:8081"
> >>
> >>     client = fl.FlightClient(server_location)
> >>     ticket = fl.Ticket(json.dumps({}))
> >>     reader = client.do_get(ticket)
> >>     print(reader.read_all().to_pandas())
> >>
> >> if __name__ == "__main__":
> >>     main()
> >> ```
> >>
> >> But when I run the server, and then the client, I get a result:
> >> ```
> >> % python3 client.py
> >>    col
> >> 0    1
> >> ```
> >> I would expect some kind of TLS error.
> >> I am sure that I am confused about something, but if someone could help me
> >> with my reasoning, I would appreciate it.


Re: What's wrong with my TLS reasoning and FlightServerBase ?

2023-12-30 Thread Bryce Mecum
Hi Rick,

You're right that TLS support is built into PyArrow Flight [1]. I
think the issue with your code is that your client isn't attempting to
connect over TLS and that the default behavior of the FlightServerBase
must be to allow both TLS and non-TLS connections. This seems to be
similar to how web servers might choose to accept connections over
HTTP and HTTPS (though many may not).

To make your code fail as you expect, see [1] and, in your client
code, either change server_location to use
pyarrow.flight.Location.for_grpc_tls to construct the Location object
or change your URI to "grpc+tls://localhost:8081" instead of just
"grpc://localhost:8081". Once you change this, your client should fail
with an SSL handshake error.

[1] https://arrow.apache.org/docs/python/flight.html#enabling-tls

On Sat, Dec 30, 2023 at 2:20 PM Rick Spencer
 wrote:
>
> I am working on supporting TLS, and it looks like everything that I need is
> built into FlightServerBase.
>
> However, I am struggling to understand how it works, or how to test that it
> is working. For example, I don't understand why I can pass garbage in for
> the tls_certs, and still get results when called from a client. Here is a
> minimal example I put together to show where I am confused.
>
> Server that I think should not work:
> ```python
> from pyarrow import flight, Table
>
> class SampleServer(flight.FlightServerBase):
>     def __init__(self, *args, **kwargs):
>         tls_certificates = [("garbage", "garbage")]
>         location = flight.Location.for_grpc_tcp("localhost", 8081)
>         super(SampleServer, self).__init__(location,
>                                            None,
>                                            tls_certificates,
>                                            False,
>                                            None,
>                                            *args, **kwargs)
>
>     def do_get(self, context, ticket):
>         data = {'col': [1]}
>         table = Table.from_pydict(data)
>         return flight.RecordBatchStream(table)
>
> if __name__ == "__main__":
>     server = SampleServer()
>     server.serve()
> ```
>
> Client code that I think should not work:
> ```python
> import pyarrow.flight as fl
> import json
> def main():
>     server_location = "grpc://localhost:8081"
>
>     client = fl.FlightClient(server_location)
>     ticket = fl.Ticket(json.dumps({}))
>     reader = client.do_get(ticket)
>     print(reader.read_all().to_pandas())
>
> if __name__ == "__main__":
>     main()
> ```
>
> But when I run the server, and then the client, I get a result:
> ```
> % python3 client.py
>    col
> 0    1
> ```
> I would expect some kind of TLS error.
>
> I am sure that I am confused about something, but if someone could help me
> with my reasoning, I would appreciate it.


Re: Documentation of Breaking Changes

2023-11-21 Thread Bryce Mecum
Hi Chris, this is very much the place to ask a question like this and
thanks for doing so.

Could we get a little more information on the specific change you were
affected by just so we're all on the same page? Was this the bump from
Parquet 2.4 to 2.6 [1] that happened in the PyArrow 13 release [2] or
something else?

Currently, breaking changes are communicated in the release blog post
[2] and the corresponding GitHub Issue gets a Breaking Change label
[3], as documented in our development guide [4]. When you say
"change-logs", are you referring to the changelogs on our release page
[5]?

[1] https://github.com/apache/arrow/issues/35746
[2] https://arrow.apache.org/blog/2023/08/24/13.0.0-release/
[3] https://github.com/apache/arrow/labels
[4] https://arrow.apache.org/docs/developers/reviewing.html#labelling
[5] https://arrow.apache.org/release/13.0.0.html#changelog

On Tue, Nov 21, 2023 at 1:00 PM Chris Thomas  wrote:
>
> Evening folks,
>
> I apologize if this is not the appropriate venue for this request; if
> that's the case, please let me know where I should be asking:
>
> Earlier this month Dependabot flagged a security vulnerability with PyArrow
> which prompted us to do an upgrade from v10 to v14.1 of the software.
> Obviously this is a lot of major versions so the upgrade was subjected to a
> bunch of tests but, alas, there was a breaking change to the way PyArrow
> handled time precision that slipped through the cracks.
>
> Upon review I'm not sure how that change could possibly have been caught.
> The change-logs for the package are a verbose dump of all of the PRs
> included in the release.  Working out which of them constitute a breaking
> change and what the implications are of that change is difficult.
>
> Is this something that could be addressed in the project?
>
> --
>
> Best,
>
> Chris Thomas
> Engineering Manager - Feature Team
> 540.808.2782


Re: C++: Code that read parquet into Arrow Arrays?

2023-11-17 Thread Bryce Mecum
Hi Li, I think what you're after is ColumnReaderImpl::NextBatch [1]
which looks like it eventually calls TransferZeroCopy [2] in the case
of primitive types like float/double (amongst others).

[1] 
https://github.com/apache/arrow/blob/main/cpp/src/parquet/arrow/reader.cc#L107
[2] 
https://github.com/apache/arrow/blob/main/cpp/src/parquet/arrow/reader_internal.cc#L345

On Fri, Nov 17, 2023 at 12:27 PM Li Jin wrote:
>
> Hi,
>
> I have recently been investigating a null/NaN issue with Parquet and Arrow
> and wonder if someone can give me a pointer to the code that decodes a
> Parquet row group into Arrow float/double arrays?
>
> Thanks,
> Li


Re: [ANNOUNCE] New Arrow PMC member: Raúl Cumplido

2023-11-13 Thread Bryce Mecum
Congrats, Raúl!

On Mon, Nov 13, 2023 at 10:28 AM Andrew Lamb wrote:
>
> The Project Management Committee (PMC) for Apache Arrow has invited
> Raúl Cumplido to become a PMC member and we are pleased to announce
> that Raúl Cumplido has accepted.
>
> Please join me in congratulating them.
>
> Andrew


Re: Gandiva List Type feature development

2023-11-04 Thread Bryce Mecum
Submitting a PR and marking it as a "Draft" would be fine and so would
writing up a design doc and soliciting feedback on this mailing list.
It really depends on what you think would help you the most at this
point. If you submit a draft PR, reply to this email thread with a
link to increase its visibility.

Is the original issue [1] sufficiently descriptive of the work you're
looking to do? If so, comment "take" on it which will assign the issue
to you, letting others know you're working on it. If it's not, you
might open a new issue and we can close the old one.

[1] https://github.com/apache/arrow/issues/27001

On Fri, Nov 3, 2023 at 3:47 PM Logan Riggs
 wrote:
>
> Hi,
>
> I'm working on adding List support to Gandiva and it's turning into a fairly
> complicated feature. I have code that works for a few list types (i.e.
> list etc.) but I'm not at the stage yet for a formal PR (it needs more
> tests, some tests are broken, and the code needs to be cleaned up). Is
> there a general process for getting early feedback on a proposed change?
> For example, a work-in-progress PR or a design doc review?
>
> For reference, my change is an expansion/continuation of this abandoned PR:
> https://github.com/apache/arrow/pull/9060
>
> Thanks
> Logan


Re: [VOTE] Release Apache Arrow 14.0.0 - RC2

2023-10-24 Thread Bryce Mecum
I've failed to verify this release candidate on macOS M1, running
"dev/release/verify-release-candidate.sh 14.0.0 2" [1]. The failure
looks related to the Go implementation's "parquet-encryption-test".
Can anyone on a similar machine verify?

[1] https://gist.github.com/amoeba/f47534bea44d78a7ee79e4b44ed0e4ff


On Mon, Oct 23, 2023 at 11:19 PM Raúl Cumplido wrote:
>
> Hi,
>
> I would like to propose the following release candidate (RC2) of Apache
> Arrow version 14.0.0. This is a release consisting of 461
> resolved GitHub issues[1].
>
> This release candidate is based on commit:
> 2dcee3f82c6cf54b53a64729fd81840efa583244 [2]
>
> The source release rc2 is hosted at [3].
> The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
> The changelog is located at [12].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [13] for how to validate a release candidate.
>
> See also a verification result on GitHub pull request [14].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow 14.0.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow 14.0.0 because...
>
> [1]: 
> https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A14.0.0+is%3Aclosed
> [2]: 
> https://github.com/apache/arrow/tree/2dcee3f82c6cf54b53a64729fd81840efa583244
> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-14.0.0-rc2
> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
> [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
> [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/14.0.0-rc2
> [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/14.0.0-rc2
> [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/14.0.0-rc2
> [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> [12]: 
> https://github.com/apache/arrow/blob/2dcee3f82c6cf54b53a64729fd81840efa583244/CHANGELOG.md
> [13]: 
> https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
> [14]: https://github.com/apache/arrow/pull/38343


Re: Help regarding setting up the r package in arrow apache

2023-10-23 Thread Bryce Mecum
Is the `docker-compose run r` output before or after you changed the
line endings in build_arrow_static.sh? This error:

inst/build_arrow_static.sh: line 38: $'\r': command not found

makes it look like you may have written the file out with
Windows-style line endings which I suspect MinGW is having a hard time
with. If that's the case, do you get a different output from
`docker-compose run r` before you rewrite the file? Are you running
this with (1) a recent checkout and (2) a completely clean working
copy? If you're not sure, include the output of `git rev-parse HEAD`
and `git status` in an updated gist.
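
If it helps to confirm the line-ending theory, here is a minimal sketch
that checks a file for Windows-style (CRLF) endings and rewrites it with
Unix-style (LF) endings. The helper names are made up for illustration,
and the path is simply the one from the error message, so adjust it to
wherever the file lives in your checkout.

```python
from pathlib import Path


def has_crlf(data: bytes) -> bool:
    """Return True if the data contains any Windows-style (CRLF) line ending."""
    return b"\r\n" in data


def to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to LF, leaving all other bytes untouched."""
    return data.replace(b"\r\n", b"\n")


def fix_script(path: Path) -> bool:
    """Rewrite `path` with LF endings; return True if the file was changed."""
    data = path.read_bytes()
    if not has_crlf(data):
        return False
    path.write_bytes(to_lf(data))
    return True


if __name__ == "__main__":
    # Assumed location, taken from the error message above.
    script = Path("inst/build_arrow_static.sh")
    if script.exists():
        print("converted to LF" if fix_script(script) else "already LF")
    else:
        print(f"{script} not found; run this from the package directory")
```

Reading and writing bytes rather than text avoids Python's own newline
translation, which would otherwise hide the very characters being looked
for.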

On Sun, Oct 22, 2023 at 11:25 PM Divyansh Khatri
 wrote:
>
> Hi Jonathan,
> First of all, regarding the previous message I sent: I thought I had set up
> R correctly, but when I ran the container it exited immediately, so I
> repeated the process. I followed the developer documentation [1]
> (https://arrow.apache.org/docs/r/articles/developers/docker.html#example---the-manual-way)
> but I am encountering an error (I think it's related to the line endings
> of inst/build_arrow_static.sh). Even after changing the line endings I am
> still encountering the same error. Can you tell me what the problem is?
>
> https://gist.github.com/Divyansh200102/dd7fc370e39818796c58df44badcc1b8
>
> I have attached the terminal output from following the documentation [1]
> and the 4 files containing "inst/build_arrow_static.sh" (if I need to
> change the line endings of any of them, please tell me which one).
>
> Thanks
>
> On Mon, 23 Oct 2023 at 00:59, Divyansh Khatri
> wrote:
>
> > Thanks for the help Jonathan, Nic, and Bryce. I was able to set up R and
> > the docs.
> >
> >


Re: Help regarding setting up the r package in arrow apache

2023-10-16 Thread Bryce Mecum
That error makes it look like you're running `docker compose up` from
the root of the Arrow source tree which is likely not what you want.
Are you trying to use the Arrow R package in a Docker container or are
you trying to contribute to it by developing inside of a Docker
container? Nic's link [1] is a good starting point.

[1] https://arrow.apache.org/docs/r/articles/developers/docker.html

On Mon, Oct 16, 2023 at 4:31 AM Divyansh Khatri
 wrote:
>
> Hi, I am running the Docker command 'docker compose up -d' against the
> docker-compose.yml but I am encountering this error (Error response from
> daemon: manifest for amd64/maven:3.5.4-eclipse-temurin-8 not found:
> manifest unknown: manifest unknown), so I am not sure how to proceed from
> here.
>
> On Mon, 16 Oct 2023 at 14:17, Nic Crane wrote:
>
> > Hi Divyansh,
> >
> > There are instructions for creating a R package dev setup here:
> > https://arrow.apache.org/docs/r/articles/developers/setup.html
> >
> > If you can explain a bit more about what you've tried so far and what's not
> > working, we may be able to advise.
> >
> > Best wishes,
> >
> > Nic
> >
> > On Mon, 16 Oct 2023 at 06:02, Divyansh Khatri
> > wrote:
> >
> > > I am having problems setting up the Apache Arrow R package using
> > > Docker. Can you give me the step-by-step process for setting up the
> > > R package in my VS Code system using Docker?
> > >
> >


Re: [ANNOUNCE] New Arrow PMC member: Jonathan Keane

2023-10-15 Thread Bryce Mecum
Congratulations, Jon!

On Sat, Oct 14, 2023 at 9:24 AM Andrew Lamb wrote:
>
> The Project Management Committee (PMC) for Apache Arrow has invited
> Jonathan Keane to become a PMC member and we are pleased to announce
> that Jonathan Keane has accepted.
>
> Congratulations and welcome!
>
> Andrew


Re: [VOTE] Release Apache Arrow nanoarrow 0.3.0 - RC0

2023-09-26 Thread Bryce Mecum
+1 (non-binding)

Verified with `./verify-release-candidate.sh 0.3.0 0` on:
- Windows 10, x86_64, libarrow-main, MSVC 17 2022, R 4.3.1, Rtools 43
- macOS 13.6, aarch64, libarrow 13.0.0, R 4.3.1
- Ubuntu 23.04, aarch64, libarrow 13.0.0, R 4.2.2


Re: [QUESTION] Syndication site(s) for Apache Arrow related content?

2023-07-21 Thread Bryce Mecum
I'm not aware of one but I'd love to see one get started and would be
happy to contribute.

Related, I'm aware that Nic Crane and Marlene Mhangami have put
together resources in the "awesome-x" style for R [1] and Python [2],
respectively.

[1] https://github.com/thisisnic/awesome-arrow-r
[2] https://github.com/marlenezw/awesome-arrow-python


On Fri, Jul 21, 2023 at 7:27 AM Andrew Lamb wrote:
>
> Hi,
>
> Does anyone know a location that collects / syndicates Apache Arrow related
> content?
>
> Some examples of such a thing are [1] for Python and [2] for Rust.
>
> Andrew
>
> [1]: https://planetpython.org/
> [2]: https://this-week-in-rust.org/


Re: [ANNOUNCE] New Arrow PMC member: Will Jones

2023-03-13 Thread Bryce Mecum
Congratulations, Will!


Re: [VOTE] Release Apache Arrow nanoarrow 0.1.0 - RC1

2023-03-01 Thread Bryce Mecum
+1 (non-binding)

Verified on:

- Fedora 37 (amd64)
- Debian 11 (amd64)

On Wed, Mar 1, 2023 at 8:04 AM Dewey Dunnington
 wrote:

> Hello,
>
> I would like to propose the following release candidate (RC1) of Apache
> Arrow nanoarrow [0] version 0.1.0. This is an initial release consisting of
> 31 resolved GitHub issues [1].
>
> Special thanks to David Li for his reviews and support during the
> preparation of this initial release candidate!
>
> This release candidate is based on commit:
> 341279af1b2fdede36871d212f339083ffbd75eb [2]
>
> The source release rc1 is hosted at [3].
> The changelog is located at [4].
>
> Please download, verify checksums and signatures, run the unit tests, and
> vote on the release. See [5] for how to validate a release candidate.
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow nanoarrow 0.1.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow nanoarrow 0.1.0 because...
>
> [0] https://github.com/apache/arrow-nanoarrow
> [1] https://github.com/apache/arrow-nanoarrow/milestone/1?closed=1
> [2]
> https://github.com/apache/arrow-nanoarrow/tree/apache-arrow-nanoarrow-0.1.0-rc1
> [3]
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-nanoarrow-0.1.0-rc1/
> [4]
> https://github.com/apache/arrow-nanoarrow/blob/apache-arrow-nanoarrow-0.1.0-rc1/CHANGELOG.md
> [5]
> https://github.com/apache/arrow-nanoarrow/blob/main/dev/release/README.md
>


Re: Getting issues in cpp build

2023-02-15 Thread Bryce Mecum
Hi Shaheer, welcome! I think the mailing list may have had an issue with
your attachment. If the output is short, could you reply with it here? If
it's more than 10-20 lines, you might put it in a Gist [1] or similar type
of pastebin and reply with a link.

[1] https://gist.github.com/


Re: R package arrow

2022-10-12 Thread Bryce Mecum
Hi Jean-Luc,

I was able to run your code successfully on my machine but I found it used
considerably more memory (~30GB) and took longer to execute (~30s) than
expected. Could you please file a JIRA ticket [1] and someone can look into
it? The docs have a bug report guide [2] which might be helpful. The
discrepancy in behavior between letting arrow handle the join versus DuckDB
isn't ideal and can be investigated in the ticket.

Thanks,
Bryce

[1] https://issues.apache.org/jira/projects/ARROW/issues
[2] https://arrow.apache.org/docs/developers/bug_reports.html#bug-reports