Re: [DISCUSS] Drop Java 11 support in Arrow

2024-08-23 Thread Dane Pitkin
Hi Laurent,

These are good ideas worth evaluating. I am also on board with waiting for
user feedback after the Java 8 deprecation is released. It is important to
strike the right balance between supporting the broader community and
allowing developers to improve the project with newer technologies to stay
relevant. The quick push to Java 17 mainly originates from the general
consensus among other related Java projects (e.g. Spark, Iceberg, Avro)
that have discussed migrating directly from Java 8 to Java 17. Perhaps we
revisit this discussion in 3-6 months to see how the Java ecosystem has
taken to these changes?

Thanks,
Dane

On Wed, Aug 21, 2024 at 6:31 PM Laurent Goujon 
wrote:

> There's another path which is to have arrow core still targeting Java 11
> but having the arrow orc module targeting Java 17 for example. Or like I
> also mentioned in the other thread, leverage MRJAR capability to offer
> support for recent java APIs while maintaining compatibility for those
> users who cannot migrate out of Java 11 yet. I even contributed some of the
> work to support it but we postponed it due to the lack of immediate use for
> it (if I interpret correctly the feedback I received).
>
> I guess I'm a bit surprised by this push to drop support for older Java
> versions which are still supported and used (even if less and less) while
> we are also trying very hard not to ditch support for RHEL7/CentOS 7, for
> example, something I discovered when we realized that the grpc proto plugin
> was not compatible with this OS anymore (discussed at
> https://github.com/apache/arrow/pull/43264, and mentioned in
> https://github.com/apache/arrow/issues/40735, but at the same time we
> upgraded grpc in https://github.com/apache/arrow/pull/43657?). In
> comparison, even if it's kind of true that Java 8, 11 and soon 17 may not
> see new features, they are still receiving fixes and security updates from
> multiple vendors, including Adoptium, which is planning new releases for
> Java 8 and 11 in October (with availability until 2026 for Java 8 and 2027
> for Java 11).
> I don't want to tone down the desire to try and adopt the latest technology,
> but I'm kindly asking for the whole community to be considered and to
> discuss possible alternatives to achieve the same goals.
>
> Laurent
>
> On Wed, Aug 21, 2024 at 9:43 AM Dane Pitkin  wrote:
>
> > I realized I'm repeating myself a bit here with my reasoning, my
> > apologies. I'll emphasize that the dependencies are the strongest factor,
> > in that over time we have to pin more and more dependencies to older
> > versions. I think code/stack modernization would be nice and the biggest
> > feature I'm looking forward to is Arrow integration with the new FFM APIs
> > in Java 22. However, modernization hasn't been the biggest priority yet
> > mainly due to contributor capacity. I imagine if we start filing GitHub
> > issues with improvement ideas, contributors might start picking them up
> > over time.
> >
> > On Wed, Aug 21, 2024 at 12:01 PM Dane Pitkin  wrote:
> >
> > > Hi Laurent,
> > >
> > > Kudos again for doing the migration from Java 8 -> 11 in Arrow. That
> > > was a large contribution, thank you!
> > >
> > > I see a few reasons to drop Java 11:
> > > - Java 11 official support ended September 2023, but it will receive
> > > extended support until 2032.
> > > - Dependencies are upgrading at a faster pace. For example, ORC v2.0
> > > supports Java 17+.
> > >
> > > Thanks,
> > > Dane
> > >
> > >
> > > On Fri, Aug 16, 2024 at 11:51 AM Laurent Goujon wrote:
> > >
> > >> I think like last time we should see where the community is as a whole
> > >> on JDK version support? Just looking at Spark may not be enough to get
> > >> a sense that it is okay to drop Java 11. Overall I think we should at
> > >> least give 2 major versions before removing support.
> > >>
> > >> When I did the migration from Java 8 to Java 11, most of the updates
> > >> were on the toolchain but I haven't seen a lot of code changes which
> > >> would not compile with Java 8. And so I'm not sure what we expect from
> > >> Java 17 in terms of code changes and or stack/modernization. @Dane,
> > >> could you elaborate maybe?
> > >>
> > >> Cheers,
> > >>
> > >> Laurent
> > >>
> > >> On Tue, Aug 6, 2024 at 5:30 PM Vibhatha Abeykoon 
> > >> wr

Re: [DISCUSS] Drop Java 11 support in Arrow

2024-08-21 Thread Dane Pitkin
I realized I'm repeating myself a bit here with my reasoning, my apologies.
I'll emphasize that the dependencies are the strongest factor, in that over
time we have to pin more and more dependencies to older versions. I think
code/stack modernization would be nice and the biggest feature I'm looking
forward to is Arrow integration with the new FFM APIs in Java 22. However,
modernization hasn't been the biggest priority yet mainly due to
contributor capacity. I imagine if we start filing GitHub issues with
improvement ideas, contributors might start picking them up over time.

On Wed, Aug 21, 2024 at 12:01 PM Dane Pitkin  wrote:

> Hi Laurent,
>
> Kudos again for doing the migration from Java 8 -> 11 in Arrow. That was a
> large contribution, thank you!
>
> I see a few reasons to drop Java 11:
> - Java 11 official support ended September 2023, but it will receive
> extended support until 2032.
> - Dependencies are upgrading at a faster pace. For example, ORC v2.0
> supports Java 17+.
>
> Thanks,
> Dane
>
>
> On Fri, Aug 16, 2024 at 11:51 AM Laurent Goujon 
> wrote:
>
>> I think like last time we should see where the community is as a whole on
>> JDK version support? Just looking at Spark may not be enough to get a
>> sense that it is okay to drop Java 11. Overall I think we should at least
>> give 2 major versions before removing support.
>>
>> When I did the migration from Java 8 to Java 11, most of the updates were
>> on the toolchain but I haven't seen a lot of code changes which would not
>> compile with Java 8. And so I'm not sure what we expect from Java 17 in
>> terms of code changes and or stack/modernization. @Dane, could you
>> elaborate maybe?
>>
>> Cheers,
>>
>> Laurent
>>
>> On Tue, Aug 6, 2024 at 5:30 PM Vibhatha Abeykoon 
>> wrote:
>>
>> > Thanks Dane for once again pushing the topic on Java language support.
>> > In terms of project maintenance and long term growth, I am happy with
>> > this change.
>> >
>> > Regarding the usage of `--add-opens`, it would still be okay, given the
>> > fact that we provided that option either way.
>> > But in future, I think we should figure out a way to do this better.
>> >
>> > Although as you suggested, it would be best to gather feedback from the
>> > community, specifically given that we have already done a minimum
>> > version upgrade very recently.
>> > Also, it would be better to do this around v20 or v21 if we agree to
>> > move forward. The reason is it would at least give enough time for some
>> > users to get ready for a change. Again this may require consensus in the
>> > community. Also we can take another cue from the GitHub dependabot PRs,
>> > which provide a hint on how much technical burden and how many
>> > vulnerabilities we have to keep up with when we don't upgrade the
>> > minimal supported Java version. I think we had a good experience in
>> > gathering those details for Java 8 [1] (citing once more).
>> >
>> > +1 for this proposed change.
>> >
>> > [1] https://github.com/apache/arrow/issues/38051
>> >
>> > On Thu, Aug 1, 2024 at 2:17 AM Jacob Wujciak 
>> > wrote:
>> >
>> > > Thanks Dane for starting the discussion!
>> > > I would be +1 but I am neither a Java user nor familiar with the space
>> > > but seeing spark go 17+ is encouraging.
>> > >
>> > > Also worth mentioning is that people that can't drop 11 can always
>> > > continue using the versions that still support it.
>> > >
>> > > Best
>> > > Jacob
>> > >
>> > > On Wed, Jul 31, 2024 at 19:00, Dane Pitkin <dpit...@apache.org> wrote:
>> > >
>> > > > Hi all,
>> > > >
>> > > > I'd like to bring up for discussion dropping Java 11 and supporting
>> > > > Java 17 as the minimum version[1]. Earlier this year we agreed to
>> > > > drop Java 8 and support Java 11 as the min version[2]. That has now
>> > > > been completed and will be released in Arrow v18 [3].
>> > > >
>> > > > My suggestion would be to drop Java 11 in Arrow v19 (~Jan 2025). If
>> > > > we want to wait for feedback from users after we release removal of
>> > > > Java 8, then perhaps Arrow v20 (~Apr 2025).
>> > > >
>> > > >
>> > > > Some reasonings:
>> > > > - Java 11 is now in Extended Support for the remainder of its
>> > > > lifecycle
>> > > > - Apache Spark only supports Java 17+ in v4.X
>> > > >
>> > > > Some drawbacks:
>> > > > - Users will be required to add java command line arguments
>> > > > (--add-opens)[4].
>> > > >
>> > > >
>> > > > Overall, this could be a big step towards modernizing the Arrow Java
>> > > > project.
>> > > >
>> > > > Thanks,
>> > > > Dane
>> > > >
>> > > >
>> > > > [1]https://github.com/apache/arrow/issues/43307
>> > > > [2]https://lists.apache.org/thread/65vqpmrrtpshxo53572zcv91j1lb2y8g
>> > > > [3]https://github.com/apache/arrow/issues/38051
>> > > > [4]https://arrow.apache.org/docs/java/install.html#id3
>> > > >
>> > >
>> >
>>
>


Re: [DISCUSS] Drop Java 11 support in Arrow

2024-08-21 Thread Dane Pitkin
Hi Laurent,

Kudos again for doing the migration from Java 8 -> 11 in Arrow. That was a
large contribution, thank you!

I see a few reasons to drop Java 11:
- Java 11 official support ended September 2023, but it will receive
extended support until 2032.
- Dependencies are upgrading at a faster pace. For example, ORC v2.0
supports Java 17+.

Thanks,
Dane


On Fri, Aug 16, 2024 at 11:51 AM Laurent Goujon 
wrote:

> I think like last time we should see where the community is as a whole on
> JDK version support? Just looking at Spark may not be enough to get a sense
> that it is okay to drop Java 11. Overall I think we should at least give 2
> major versions before removing support.
>
> When I did the migration from Java 8 to Java 11, most of the updates were
> on the toolchain but I haven't seen a lot of code changes which would not
> compile with Java 8. And so I'm not sure what we expect from Java 17 in
> terms of code changes and or stack/modernization. @Dane, could you
> elaborate maybe?
>
> Cheers,
>
> Laurent
>
> On Tue, Aug 6, 2024 at 5:30 PM Vibhatha Abeykoon 
> wrote:
>
> > Thanks Dane for once again pushing the topic on Java language support.
> > In terms of project maintenance and long term growth, I am happy with
> > this change.
> >
> > Regarding the usage of `--add-opens`, it would still be okay, given the
> > fact that we provided that option either way.
> > But in future, I think we should figure out a way to do this better.
> >
> > Although as you suggested, it would be best to gather feedback from the
> > community, specifically given that we have already done a minimum
> > version upgrade very recently.
> > Also, it would be better to do this around v20 or v21 if we agree to
> > move forward. The reason is it would at least give enough time for some
> > users to get ready for a change. Again this may require consensus in the
> > community. Also we can take another cue from the GitHub dependabot PRs,
> > which provide a hint on how much technical burden and how many
> > vulnerabilities we have to keep up with when we don't upgrade the
> > minimal supported Java version. I think we had a good experience in
> > gathering those details for Java 8 [1] (citing once more).
> >
> > +1 for this proposed change.
> >
> > [1] https://github.com/apache/arrow/issues/38051
> >
> > On Thu, Aug 1, 2024 at 2:17 AM Jacob Wujciak 
> > wrote:
> >
> > > Thanks Dane for starting the discussion!
> > > I would be +1 but I am neither a Java user nor familiar with the space
> > > but seeing spark go 17+ is encouraging.
> > >
> > > Also worth mentioning is that people that can't drop 11 can always
> > > continue using the versions that still support it.
> > >
> > > Best
> > > Jacob
> > >
> > > On Wed, Jul 31, 2024 at 19:00, Dane Pitkin <dpit...@apache.org> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'd like to bring up for discussion dropping Java 11 and supporting
> > > > Java 17 as the minimum version[1]. Earlier this year we agreed to drop
> > > > Java 8 and support Java 11 as the min version[2]. That has now been
> > > > completed and will be released in Arrow v18 [3].
> > > >
> > > > My suggestion would be to drop Java 11 in Arrow v19 (~Jan 2025). If we
> > > > want to wait for feedback from users after we release removal of Java
> > > > 8, then perhaps Arrow v20 (~Apr 2025).
> > > >
> > > >
> > > > Some reasonings:
> > > > - Java 11 is now in Extended Support for the remainder of its
> > > > lifecycle
> > > > - Apache Spark only supports Java 17+ in v4.X
> > > >
> > > > Some drawbacks:
> > > > - Users will be required to add java command line arguments
> > > > (--add-opens)[4].
> > > >
> > > >
> > > > Overall, this could be a big step towards modernizing the Arrow Java
> > > > project.
> > > >
> > > > Thanks,
> > > > Dane
> > > >
> > > >
> > > > [1]https://github.com/apache/arrow/issues/43307
> > > > [2]https://lists.apache.org/thread/65vqpmrrtpshxo53572zcv91j1lb2y8g
> > > > [3]https://github.com/apache/arrow/issues/38051
> > > > [4]https://arrow.apache.org/docs/java/install.html#id3
> > > >
> > >
> >
>


Re: [VOTE][Format] Bool8 Canonical Extension Type

2024-08-06 Thread Dane Pitkin
+1 (non-binding)

Nice work!

On Tue, Aug 6, 2024 at 12:22 PM Joris Van den Bossche <
jorisvandenboss...@gmail.com> wrote:

> +1 (binding)
>
> On Tue, 6 Aug 2024 at 17:41, Matt Topol  wrote:
> >
> > +1 (binding)
> >
> > On Tue, Aug 6, 2024 at 11:40 AM Felipe Oliveira Carvalho <
> > felipe...@gmail.com> wrote:
> >
> > > +1 (non-binding)
> > >
> > > --
> > > Felipe
> > >
> > > On Tue, Aug 6, 2024 at 6:24 AM Gang Wu  wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > Looked through the spec and C++ impl.
> > > >
> > > > Best,
> > > > Gang
> > > >
> > > > On Tue, Aug 6, 2024 at 11:55 AM wish maple 
> > > wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Best,
> > > > > Xuwei Fu
> > > > >
> > > > > On Tue, Aug 6, 2024 at 10:20, David Li wrote:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > On Tue, Aug 6, 2024, at 10:17, Sutou Kouhei wrote:
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > > In xbr9m_tfzz4-...@mail.gmail.com
> > > > > > >   "[VOTE][Format] Bool8 Canonical Extension Type" on Mon, 5 Aug 2024
> > > > > > > 08:59:42 -0400,
> > > > > > >   Joel Lubinitsky  wrote:
> > > > > > >
> > > > > > >> Hello Devs,
> > > > > > >>
> > > > > > >> I would like to propose a new canonical extension type: Bool8
> > > > > > >>
> > > > > > > >> The prior mailing list discussion thread can be found at [1].
> > > > > > > >> The format documentation change can be found at [2]. A copy of the
> > > > > > > >> text is included in this email.
> > > > > > > >> A Go implementation can be found at [3].
> > > > > > > >> A C++/Python implementation can be found at [4].
> > > > > > > >>
> > > > > > > >> Thank you for your time and attention in reviewing this proposal.
> > > > > > >>
> > > > > > >> The vote will be open for at least 72 hours.
> > > > > > >>
> > > > > > >> [ ] +1 Accept this proposal
> > > > > > >> [ ] +0
> > > > > > >> [ ] -1 Do not accept this proposal because...
> > > > > > >>
> > > > > > >> [1]:
> > > > https://lists.apache.org/thread/nz44qllq53h6kjl3rhy0531n2n2tpfr0
> > > > > > >> [2]: https://github.com/apache/arrow/pull/43234
> > > > > > >> [3]: https://github.com/apache/arrow/pull/43323
> > > > > > >> [4]: https://github.com/apache/arrow/pull/43488
> > > > > > >>
> > > > > > >> ---
> > > > > > >>
> > > > > > > >> 8-bit Boolean
> > > > > > > >> =============
> > > > > > > >>
> > > > > > > >> Bool8 represents a boolean value using 1 byte (8 bits) to store each
> > > > > > > >> value instead of only 1 bit as in the original Arrow Boolean type.
> > > > > > > >> Although less compact than the original representation, Bool8 may have
> > > > > > > >> better zero-copy compatibility with various systems that also store
> > > > > > > >> booleans using 1 byte.
> > > > > > > >>
> > > > > > > >> * Extension name: ``arrow.bool8``.
> > > > > > > >>
> > > > > > > >> * The storage type of this extension is ``Int8`` where:
> > > > > > > >>
> > > > > > > >>   * **false** is denoted by the value ``0``.
> > > > > > > >>   * **true** can be specified using any non-zero value. Preferably ``1``.
> > > > > > > >>
> > > > > > > >> * Extension type parameters:
> > > > > > > >>
> > > > > > > >>   This type does not have any parameters.
> > > > > > > >>
> > > > > > > >> * Description of the serialization:
> > > > > > > >>
> > > > > > > >>   No metadata is required to interpret the type. Any metadata present
> > > > > > > >>   should be ignored.
> > > > > >
> > > > >
> > > >
> > >
>
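
As an illustration of the storage rule quoted in the proposal above (0 is false, any non-zero Int8 value is true, 1 is the preferred encoding of true), here is a minimal sketch in plain Python. It does not use any Arrow library; the helper names bool8_to_bools and bools_to_bool8 are hypothetical and exist only for this example, and null handling is deliberately left out.

```python
from array import array


def bool8_to_bools(storage):
    """Interpret Int8 storage values: 0 is false, any non-zero value is true."""
    return [value != 0 for value in storage]


def bools_to_bool8(values):
    """Encode booleans as Int8 storage, writing the preferred value 1 for true."""
    return array("b", (1 if value else 0 for value in values))


# Int8 storage as it might arrive from another system (hypothetical data).
storage = array("b", [0, 1, -1, 127, 0])
print(bool8_to_bools(storage))                 # [False, True, True, True, False]
print(list(bools_to_bool8([True, False])))     # [1, 0]
```

Because any non-zero byte reads as true, a consumer can accept one-byte booleans from another system without rewriting the buffer, which is the zero-copy benefit the proposal describes.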


[DISCUSS] Drop Java 11 support in Arrow

2024-07-31 Thread Dane Pitkin
Hi all,

I'd like to bring up for discussion dropping Java 11 and supporting Java 17
as the minimum version[1]. Earlier this year we agreed to drop Java 8 and
support Java 11 as the min version[2]. That has now been completed and will
be released in Arrow v18 [3].

My suggestion would be to drop Java 11 in Arrow v19 (~Jan 2025). If we want
to wait for feedback from users after we release removal of Java 8, then
perhaps Arrow v20 (~Apr 2025).


Some reasonings:
- Java 11 is now in Extended Support for the remainder of its lifecycle
- Apache Spark only supports Java 17+ in v4.X

Some drawbacks:
- Users will be required to add java command line arguments
(--add-opens)[4].


Overall, this could be a big step towards modernizing the Arrow Java
project.

Thanks,
Dane


[1]https://github.com/apache/arrow/issues/43307
[2]https://lists.apache.org/thread/65vqpmrrtpshxo53572zcv91j1lb2y8g
[3]https://github.com/apache/arrow/issues/38051
[4]https://arrow.apache.org/docs/java/install.html#id3


Re: [VOTE][Format] Opaque canonical extension type

2024-07-24 Thread Dane Pitkin
+1 (non-binding)

I reviewed the spec and Java implementation.

On Wed, Jul 24, 2024 at 10:37 AM Ian Cook  wrote:

> +1 (non-binding)
>
> I reviewed the spec additions.
>
> Ian
>
> On Wed, Jul 24, 2024 at 10:27 AM Jacob Wujciak-Jens 
> wrote:
>
> > +1 (non binding)
> >
> > On Wed, Jul 24, 2024 at 15:12, wish maple wrote:
> >
> > > +1 (non-binding)
> > >
> > > Checked spec change and C++ impl.
> > >
> > > Best,
> > > Xuwei Fu
> > >
> > > On Wed, Jul 24, 2024 at 20:51, Gang Wu wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > Checked spec change and C++ impl.
> > > >
> > > > On Wed, Jul 24, 2024 at 6:52 PM Joel Lubinitsky 
> > > > wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Go implementation LGTM
> > > > >
> > > > > On Wed, Jul 24, 2024 at 5:12 AM Raúl Cumplido 
> > > wrote:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > Format change looks good to me. I haven't reviewed the individual
> > > > > > implementations.
> > > > > >
> > > > > > Thanks David for leading this.
> > > > > >
> > > > > > On Wed, Jul 24, 2024 at 10:51, Joris Van den Bossche wrote:
> > > > > > >
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > > On Wed, 24 Jul 2024 at 07:34, David Li 
> > > wrote:
> > > > > > > >
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > I'd like to propose the 'Opaque' canonical extension type. Prior
> > > > > > > > discussion can be found at [1] and the proposal and implementations
> > > > > > > > for C++, Go, Java, and Python can be found at [2]. The proposal is
> > > > > > > > additionally reproduced below.
> > > > > > > >
> > > > > > > > The vote will be open for at least 72 hours.
> > > > > > > >
> > > > > > > > [ ] +1 Accept this proposal
> > > > > > > > [ ] +0
> > > > > > > > [ ] -1 Do not accept this proposal because...
> > > > > > > >
> > > > > > > > [1]: https://lists.apache.org/thread/8d5ldl5cb7mms21rd15lhpfrv4j9no4n
> > > > > > > > [2]: https://github.com/apache/arrow/pull/41823
> > > > > > > >
> > > > > > > > ---
> > > > > > > >
> > > > > > > > Opaque represents a type that an Arrow-based system received from an
> > > > > > > > external (often non-Arrow) system, but that it cannot interpret.  In
> > > > > > > > this case, it can pass on Opaque to its clients to at least show that
> > > > > > > > a field exists and preserve metadata about the type from the other
> > > > > > > > system.
> > > > > > > >
> > > > > > > > Extension parameters:
> > > > > > > >
> > > > > > > > * Extension name: ``arrow.opaque``.
> > > > > > > >
> > > > > > > > * The storage type of this extension is any type.  If there is no
> > > > > > > >   underlying data, the storage type should be Null.
> > > > > > > >
> > > > > > > > * Extension type parameters:
> > > > > > > >
> > > > > > > >   * **type_name** = the name of the unknown type in the external system.
> > > > > > > >   * **vendor_name** = the name of the external system.
> > > > > > > >
> > > > > > > > * Description of the serialization:
> > > > > > > >
> > > > > > > >   A valid JSON object containing the parameters as fields.  In the
> > > > > > > >   future, additional fields may be added, but all fields current and
> > > > > > > >   future are never required to interpret the array.
> > > > > > > >
> > > > > > > >   Developers **should not** attempt to enable public semantic
> > > > > > > >   interoperability of Opaque by canonicalizing specific values of these
> > > > > > > >   parameters.
> > > > > >
> > > > >
> > > >
> > >
> >
>
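
To make the serialization described in the proposal above concrete (a JSON object carrying type_name and vendor_name, with possible future fields that a reader never needs), here is a minimal sketch using only the Python standard library. The function names and the example values "geometry" and "example_db" are hypothetical; this is not the API of any Arrow implementation.

```python
import json


def serialize_opaque(type_name, vendor_name):
    """Build the JSON object holding the two extension type parameters."""
    return json.dumps({"type_name": type_name, "vendor_name": vendor_name})


def deserialize_opaque(serialized):
    """Read the parameters back, ignoring any additional (future) fields."""
    fields = json.loads(serialized)
    if not isinstance(fields, dict):
        raise ValueError("Opaque metadata must be a JSON object")
    return fields.get("type_name"), fields.get("vendor_name")


metadata = serialize_opaque("geometry", "example_db")
print(metadata)  # {"type_name": "geometry", "vendor_name": "example_db"}

# A payload with an unknown extra field is read the same way.
extended = '{"type_name": "geometry", "vendor_name": "example_db", "future_field": 1}'
print(deserialize_opaque(extended))  # ('geometry', 'example_db')
```

The reader only looks up the two known keys, so unknown fields pass through harmlessly, in line with the statement that no field is required to interpret the array.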


Re: [VOTE] Release Apache Arrow ADBC 13 - RC0

2024-07-03 Thread Dane Pitkin
+1 (non-binding)

Verified on MacOS 14 aarch64 with:

DOCKER_DEFAULT_PLATFORM=linux/amd64 USE_CONDA=1 ./dev/release/verify-release-candidate.sh 13 0

I also had to install arrow-glib-devel in ./dev/release/verify-yum.sh to
fully verify the release:

+${install_command} --enablerepo=epel arrow-glib-devel
 ${install_command} --enablerepo=epel adbc-arrow-glib-devel-${package_version}
 ${install_command} --enablerepo=epel adbc-arrow-glib-doc-${package_version}


On Mon, Jul 1, 2024 at 8:49 PM Matt Topol  wrote:

> +1 (binding)
>
> Release candidate validated successfully with:
> USE_CONDA=0 dev/release/verify-release-candidate.sh 13 0
>
> using Pop_OS! 22.04
>
> Same issue as Kou: I needed to install arrow-glib-devel manually to get
> verification to work.
>
> On Mon, Jul 1, 2024 at 8:31 PM Sutou Kouhei  wrote:
>
> > +1 (binding)
> >
> > I ran the following on Debian GNU/Linux sid:
> >
> >   TEST_DEFAULT=0 \
> > TEST_SOURCE=1 \
> > LANG=C \
> > TZ=UTC \
> > JAVA_HOME=/usr/lib/jvm/default-java \
> > dev/release/verify-release-candidate.sh 13 0
> >
> >   TEST_DEFAULT=0 \
> > TEST_APT=1 \
> > LANG=C \
> > dev/release/verify-release-candidate.sh 13 0
> >
> >   TEST_DEFAULT=0 \
> > TEST_BINARY=1 \
> > LANG=C \
> > dev/release/verify-release-candidate.sh 13 0
> >
> >   TEST_DEFAULT=0 \
> > TEST_JARS=1 \
> > LANG=C \
> > dev/release/verify-release-candidate.sh 13 0
> >
> >   TEST_DEFAULT=0 \
> > TEST_WHEELS=1 \
> > TEST_PYTHON_VERSIONS=3.11 \
> > LANG=C \
> > TZ=UTC \
> > dev/release/verify-release-candidate.sh 13 0
> >
> >   TEST_DEFAULT=0 \
> > TEST_YUM=1 \
> > LANG=C \
> > dev/release/verify-release-candidate.sh 13 0
> >
> > with:
> >
> >   * g++ (Debian 13.3.0-1) 13.3.0
> >   * go version go1.22.4 linux/amd64
> >   * openjdk version "17.0.11" 2024-04-16
> >   * Python 3.11.9
> >   * ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux-gnu]
> >   * R version 4.4.1 (2024-06-14) -- "Race for Your Life"
> >   * Apache Arrow 17.0.0-SNAPSHOT
> >
> > Note:
> >
> > I needed to install arrow-glib-devel explicitly to verify
> > Yum repository like I did for ADBC 12:
> >
> > 
> > diff --git a/dev/release/verify-yum.sh b/dev/release/verify-yum.sh
> > index f7f023611..ff30176f1 100755
> > --- a/dev/release/verify-yum.sh
> > +++ b/dev/release/verify-yum.sh
> > @@ -170,6 +170,7 @@ echo "::endgroup::"
> >
> >  echo "::group::Test ADBC Arrow GLib"
> >
> > +${install_command} --enablerepo=epel arrow-glib-devel
> >  ${install_command} --enablerepo=epel adbc-arrow-glib-devel-${package_version}
> >  ${install_command} --enablerepo=epel adbc-arrow-glib-doc-${package_version}
> >
> > 
> >
> > This is not a blocker for 13 either. I want to find the
> > solution for this but I don't have any idea yet...
> >
> >
> > Thanks,
> > --
> > kou
> >
> > In 
> >   "[VOTE] Release Apache Arrow ADBC 13 - RC0" on Mon, 01 Jul 2024 17:01:02
> > +0900,
> >   "David Li"  wrote:
> >
> > > Hello,
> > >
> > > I would like to propose the following release candidate (RC0) of Apache
> > > Arrow ADBC version 13. This is a release consisting of 24 resolved GitHub
> > > issues [1].
> > >
> > > The subcomponents are versioned independently:
> > >
> > > - C/C++/GLib/Go/Python/Ruby: 1.1.0
> > > - C#: 0.13.0
> > > - Java: 0.13.0
> > > - R: 0.13.0
> > > - Rust: 0.13.0
> > >
> > > This release candidate is based on commit:
> > > 37f79efbcd1641e6906a36e76df57cb896f2bc68 [2]
> > >
> > > The source release rc0 is hosted at [3].
> > > The binary artifacts are hosted at [4][5][6][7][8].
> > > The changelog is located at [9].
> > >
> > > Please download, verify checksums and signatures, run the unit tests,
> > > and vote on the release. See [10] for how to validate a release
> > > candidate.
> > >
> > > See also a verification result on GitHub Actions [11].
> > >
> > > The vote will be open for at least 72 hours.
> > >
> > > [ ] +1 Release this as Apache Arrow ADBC 13
> > > [ ] +0
> > > [ ] -1 Do not release this as Apache Arrow ADBC 13 because...
> > >
> > > Note: to verify APT/YUM packages on macOS/AArch64, you must `export
> > > DOCKER_DEFAULT_PLATFORM=linux/amd64`. (Or skip this step by `export
> > > TEST_APT=0 TEST_YUM=0`.)
> > >
> > > [1]: https://github.com/apache/arrow-adbc/issues?q=is%3Aissue+milestone%3A%22ADBC+Libraries+13%22+is%3Aclosed
> > > [2]: https://github.com/apache/arrow-adbc/commit/37f79efbcd1641e6906a36e76df57cb896f2bc68
> > > [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-adbc-13-rc0/
> > > [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> > > [5]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> > > [6]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> > > [7]: https://repository.apache.org/content/repositories/staging/org/apache/arrow/adbc/
> > > [8]: https://github.com/apache/arrow-adbc/releases/tag/apache-arrow-adbc-13-rc0
> > > [9]:
> >
> https://github.com/apache/arrow-adbc/blob/apache

Re: [VOTE] Release Apache Arrow nanoarrow 0.5.0

2024-05-22 Thread Dane Pitkin
+1 (non-binding)

Verified on MacOS 14 aarch64.

On Wed, May 22, 2024 at 2:55 PM Bryce Mecum  wrote:

> +1 (non-binding)
>
> Verified on:
>
> - macOS aarch64
> - Debian 12 x86_64 inside a conda environment (note I had to install
> Python 3.11 separately from the instructions, not sure if I missed a
> step)
>
> On Wed, May 22, 2024 at 10:18 AM Dewey Dunnington wrote:
> >
> > Hello,
> >
> > I would like to propose the following release candidate (rc0) of
> > Apache Arrow nanoarrow [0] version 0.5.0. This is an initial release
> > consisting of 79 resolved GitHub issues from 9 contributors [1].
> >
> > This release candidate is based on commit:
> > c5fb10035c17b598e6fd688ad9eb7b874c7c631b [2]
> >
> > The source release rc0 is hosted at [3].
> > The changelog is located at [4].
> >
> > Please download, verify checksums and signatures, run the unit tests,
> > and vote on the release. See [5] for how to validate a release
> > candidate.
> >
> > The vote will be open for at least 72 hours.
> >
> > [ ] +1 Release this as Apache Arrow nanoarrow 0.5.0
> > [ ] +0
> > [ ] -1 Do not release this as Apache Arrow nanoarrow 0.5.0 because...
> >
> > [0] https://github.com/apache/arrow-nanoarrow
> > [1] https://github.com/apache/arrow-nanoarrow/milestone/5?closed=1
> > [2] https://github.com/apache/arrow-nanoarrow/tree/apache-arrow-nanoarrow-0.5.0-rc0
> > [3] https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-nanoarrow-0.5.0-rc0/
> > [4] https://github.com/apache/arrow-nanoarrow/blob/apache-arrow-nanoarrow-0.5.0-rc0/CHANGELOG.md
> > [5] https://github.com/apache/arrow-nanoarrow/blob/main/dev/release/README.md
>


Re: [DISCUSS] Drop Java 8 support

2024-05-21 Thread Dane Pitkin
I haven't been active in Apache Parquet, but I did not see any prior
discussions on this topic in their Jira or dev mailing list.

Do we think a vote is needed before officially moving forward with Java 8
deprecation?

On Mon, May 20, 2024 at 12:50 PM Laurent Goujon 
wrote:

> I also mentioned Apache Parquet and haven't seen anyone mention if/when
> Apache Parquet would transition.
>
>
>
> On Fri, May 17, 2024 at 9:07 AM Dane Pitkin  wrote:
>
> > Fokko, thank you for these datapoints! It's great to see how other low
> > level Java OSS projects are approaching this.
> >
> > JB, I believe yes we have formal consensus to drop Java 8 in Arrow. There
> > was no contention in current discussions across [GitHub issues | Arrow
> > Mailing List | Community Syncs].
> >
> > We can save Java 11 deprecation for a future discussion. For users on
> > Java 11, I do anticipate this discussion to come shortly after Java 8
> > deprecation is released.
> >
> > On Fri, May 17, 2024 at 10:02 AM Fokko Driesprong 
> > wrote:
> >
> > > I was traveling the last few weeks, so just a follow-up from my end.
> > >
> > >> Fokko, can you elaborate on the discussions held in other OSS projects
> > >> to drop Java <17? How did they weigh the benefits/drawbacks for
> > >> dropping both Java 8 and 11 LTS versions? I'd also be curious if other
> > >> projects plan to support older branches with security patches.
> > >
> > >
> > > So, the ones that I'm involved with (including a TLDR):
> > >
> > >    - Avro:
> > >       - (April 2024: Consensus on moving to 11+, +1 for moving to 17+)
> > >       https://lists.apache.org/thread/6vbd3w5qk7mpb5lyrfyf2s0z1cymjt5w
> > >       - (Jan 2024: Consensus on dropping 8)
> > >       https://lists.apache.org/thread/bd39zhk655pgzfctq763vp3z4xrjpx58
> > >    - Iceberg:
> > >       - (Jan 2023: Concerns about Hive):
> > >       https://lists.apache.org/thread/hr7rdxvddw3fklfyg3dfbqbsy81hzhyk
> > >       - (Feb 2024: Consensus to drop Hadoop 2.x, and move to JDK11+,
> > >       also +1's for moving to 17+):
> > >       https://lists.apache.org/thread/ntrk2thvsg9tdccwd4flsdz9gg743368
> > >
> > > I think the most noteworthy (slow-moving in general):
> > >
> > >    - Spark 4 supports JDK 17+
> > >    - Hive 4 is still on Java 8
> > >    <https://github.com/apache/hive?tab=readme-ov-file#java>
> > >
> > > It looks like most of the projects are looking at each other. Keep in
> > > mind that projects that still support older versions of Java can still
> > > use older versions of Arrow.
> > >
> > > [image: spiderman-pointing-at-spiderman.jpeg]
> > > (in case the image doesn't come through, that's Spiderman pointing at
> > > Spiderman)
> > >
> > > Concerning the Java 11 support, some data:
> > >
> > >    - Oracle 11: support until January 2032 (extended fee has been
> > >    waived)
> > >    - Corretto 11: September 2027
> > >    - Adoptium 11: At least Oct 2027
> > >    - Zulu 11: Jan 2032
> > >    - OpenJDK11: October 2024
> > >
> > > I think it is fair to support 11 for the time being, but at some point,
> > > we also have to move on and start exploiting the new features and make
> > > sure that we keep up to date. For example, Java 8 also has extended
> > > support until 2030. Dependabot on the Iceberg project
> > > <https://github.com/apache/iceberg/pulls?q=is%3Aopen+is%3Apr+label%3Adependencies>
> > > nicely shows which projects are already at JDK11+ :)
> > >
> > > Thanks Dane for driving this!
> > >
> > > Kind regards,
> > > Fokko
> > >
> > >
> > >
> > >
> > >
> > > > On Fri, May 17, 2024 at 07:44, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> > >
> > >> Hi Dane
> > >>
> > > >> Do we have a formal consensus about the Java version with regard to the
> > > >> Arrow version?
> > > >> I agree with the plan, but I'm just wondering whether it's OK with everyone
> > > >> in the community.
> > >>
> > >> Regards
> > >> JB
> > >>
> > > >> On Thu, May 16, 2024 at 18:05, Dane Pitkin wrote:
> > >>
> > >> > To wrap up this thread on Java 8 deprecation, here i

Re: [DISCUSS] Drop Java 8 support

2024-05-17 Thread Dane Pitkin
Fokko, thank you for these data points! It's great to see how other
low-level Java OSS projects are approaching this.

JB, yes, I believe we have formal consensus to drop Java 8 in Arrow. There
was no contention in current discussions across [GitHub issues | Arrow
Mailing List | Community Syncs].

We can save Java 11 deprecation for a future discussion. For users on Java
11, I do anticipate this discussion coming shortly after the Java 8
deprecation is released.

On Fri, May 17, 2024 at 10:02 AM Fokko Driesprong  wrote:

> I was traveling the last few weeks, so just a follow-up from my end.
>
>> Fokko, can you elaborate on the discussions held in other OSS projects to
>> drop Java <17? How did they weigh the benefits/drawbacks for dropping both
>> Java 8 and 11 LTS versions? I'd also be curious if other projects plan to
>> support older branches with security patches.
>
>
> So, the ones that I'm involved with (including a TLDR):
>
>- Avro:
>   - (April 2024: Consensus on moving to 11+, +1 for moving to 17+)
>   https://lists.apache.org/thread/6vbd3w5qk7mpb5lyrfyf2s0z1cymjt5w
>   - (Jan 2024: Consensus on dropping 8)
>   https://lists.apache.org/thread/bd39zhk655pgzfctq763vp3z4xrjpx58
>- Iceberg:
>   - (Jan 2023: Concerns about Hive):
>   https://lists.apache.org/thread/hr7rdxvddw3fklfyg3dfbqbsy81hzhyk
>   - (Feb 2024: Consensus to drop Hadoop 2.x, and move to JDK11+,
>   also +1's for moving to 17+):
>   https://lists.apache.org/thread/ntrk2thvsg9tdccwd4flsdz9gg743368
>
> I think the most noteworthy (slow-moving in general):
>
>- Spark 4 supports JDK 17+
>- Hive 4 is still on Java 8
><https://github.com/apache/hive?tab=readme-ov-file#java>
>
> It looks like most of the projects are looking at each other. Keep in
> mind that projects that still support older versions of Java can still
> use older versions of Arrow.
>
> [image: spiderman-pointing-at-spiderman.jpeg]
> (in case the image doesn't come through, that's Spiderman pointing at
> Spiderman)
>
> Concerning the Java 11 support, some data:
>
>- Oracle 11: support until January 2032 (extended fee has been waived)
>- Corretto 11: September 2027
>- Adoptium 11: At least Oct 2027
>- Zulu 11: Jan 2032
>- OpenJDK11: October 2024
>
> I think it is fair to support 11 for the time being, but at some point, we
> also have to move on and start exploiting the new features and make sure
> that we keep up to date. For example, Java 8 also has extended support
> until 2030. Dependabot on the Iceberg project
> <https://github.com/apache/iceberg/pulls?q=is%3Aopen+is%3Apr+label%3Adependencies>
> nicely shows which projects are already at JDK11+ :)
>
> Thanks Dane for driving this!
>
> Kind regards,
> Fokko
>
>
>
>
>
> On Fri, May 17, 2024 at 07:44, Jean-Baptiste Onofré wrote:
>
>> Hi Dane
>>
>> Do we have a formal consensus about the Java version with regard to the
>> Arrow version?
>> I agree with the plan, but I'm just wondering whether it's OK with everyone
>> in the community.
>>
>> Regards
>> JB
>>
>> On Thu, May 16, 2024 at 18:05, Dane Pitkin wrote:
>>
>> > To wrap up this thread on Java 8 deprecation, here is my current plan of
>> > action:
>> >
>> > 1) Arrow v17 will be the last version supporting Java 8 and the release
>> > notes will warn of its impending deprecation.
>> > 2) Arrow v18 will be the first release with Java 11 as the minimum version.
>> >
>> > I have updated the GH issue[1] to reflect this.
>> >
>> > [1]https://github.com/apache/arrow/issues/38051
>> >
>> > On Wed, May 8, 2024 at 5:46 PM Dane Pitkin wrote:
>> >
>> > > Thank you all for your valuable input. The consensus from my understanding
>> > > is that dropping Java 8 is not contentious, so we will move forward here.
>> > >
>> > > We won't drop Java 11 yet, but there's a chance it will happen sooner rather
>> > > than later. I brought up Java 8 & 11 deprecation in the community sync again
>> > > today. The summary is that the ASF could be enforcing stricter security
>> > > practices in the near future. Arrow Java may be forced to drop Java 11 if
>> > > any of its dependencies no longer support Java 11. This is something we'll
>> > > have to investigate and monitor. When the time is right, we should start a
>> > > new thread on the mailing list to discuss.
>

Re: [DISCUSS] Drop Java 8 support

2024-05-16 Thread Dane Pitkin
To wrap up this thread on Java 8 deprecation, here is my current plan of
action:

1) Arrow v17 will be the last version supporting Java 8 and the release
notes will warn of its impending deprecation.
2) Arrow v18 will be the first release with Java 11 as the minimum version.

I have updated the GH issue[1] to reflect this.

[1]https://github.com/apache/arrow/issues/38051

On Wed, May 8, 2024 at 5:46 PM Dane Pitkin 
wrote:

> Thank you all for your valuable input. The consensus from my understanding
> is that dropping Java 8 is not contentious, so we will move forward here.
>
> We won't drop Java 11 yet, but there's a chance it will happen sooner rather
> than later. I brought up Java 8 & 11 deprecation in the community sync again
> today. The summary is that the ASF could be enforcing stricter security
> practices in the near future. Arrow Java may be forced to drop Java 11 if
> any of its dependencies no longer support Java 11. This is something we'll
> have to investigate and monitor. When the time is right, we should start a
> new thread on the mailing list to discuss.
>
> Thanks,
> Dane
>
> On Sat, May 4, 2024 at 2:51 AM  wrote:
>
> > Hi,
> >
> > We were originally expecting to keep Java 11 to the 2026 EOL date for
> > extended support, but now that date is moved to 2032 which feels like more
> > time than we need. The issue for us is that getting technology approved for
> > use in an enterprise can have ridiculously long lead times, so having a
> > minimum supported version that is only 2 years old, while probably ok in
> > most cases, would be a bit aggressive. We use optional dependencies where we
> > can, so e.g. the Java 17 dependency for Spark 4 would only affect clients
> > using Spark 4, and they could wait to upgrade. But we chose to use Arrow in
> > the core of our product, it is the internal format everything else goes
> > through. On the compliance side we have to keep current with security
> > updates, so there is no option to stick on an old version.
> >
> > If we were to drop Java 11 after the next LTS comes out, i.e. 2025 / 2026,
> > then the three latest LTS versions would be supported and the minimum
> > version would have been available for 4 - 5 years. I think it would be very
> > hard to argue 17 can’t be made available at that point. If Arrow forces our
> > hand then obviously we’ll have to go sooner, but it wouldn’t be ideal for
> > us.
> >
> > Lastly just on language capabilities, the only things we’re really
> > interested in are performance related, probably virtual threads and foreign
> > memory would be the main ones. Both of those could be optional
> > dependencies, in the case of FFM we’d rely on either yourselves or Netty
> > anyway to provide an allocator. So in fact there is very little benefit for
> > us to drop Java 11 early, all it costs us is one extra CI job.
> >
> > Hope some of this is helpful - apologies for the high latency, busy as
> > always!!
> >
> > Martin.
> >
> >
> > > On 1 May 2024, at 22:38, Dane Pitkin wrote:
> > >
> > > Thanks, Martin. It's great to hear of real-world use cases. Do you
> > > anticipate any timeline for dropping Java 11 for your product? If Arrow did
> > > drop Java 11, then it sounds like pinning Arrow Java to an older version
> > > wouldn't be an ideal option if security patches are not backported.
> > >
> > > Fokko, can you elaborate on the discussions held in other OSS projects to
> > > drop Java <17? How did they weigh the benefits/drawbacks for dropping both
> > > Java 8 and 11 LTS versions? I'd also be curious if other projects plan to
> > > support older branches with security patches.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Apr 30, 2024 at 4:14 PM  wrote:
> > >
> > >> Speaking for my own product we would like to see Java 11 support, we rely
> > >> heavily on Arrow and have Java 11 as our minimum supported version. We’d
> > >> like to keep doing that if possible. Our clients are big enterprises with
> > >> notoriously sluggish update cycles, so we want to offer maximum
> > >> compatibility. Once security patches are no longer available on the regular
> > >> public channels then there is a compliance issue, so we generally follow
> > >> the EOL schedule of our dependencies.
> > >>
> > >> Corretto, Adoptium and Zulu

Re: [DISCUSS] Drop Java 8 support

2024-05-08 Thread Dane Pitkin
Thank you all for your valuable input. The consensus from my understanding
is that dropping Java 8 is not contentious, so we will move forward here.

We won't drop Java 11 yet, but there's a chance it will happen sooner rather
than later. I brought up Java 8 & 11 deprecation in the community sync again
today. The summary is that the ASF could be enforcing stricter security
practices in the near future. Arrow Java may be forced to drop Java 11 if
any of its dependencies no longer support Java 11. This is something we'll
have to investigate and monitor. When the time is right, we should start a
new thread on the mailing list to discuss.

Thanks,
Dane

On Sat, May 4, 2024 at 2:51 AM  wrote:

> Hi,
>
> We were originally expecting to keep Java 11 to the 2026 EOL date for
> extended support, but now that date is moved to 2032 which feels like more
> time than we need. The issue for us is that getting technology approved for
> use in an enterprise can have ridiculously long lead times, so having a
> minimum supported version that is only 2 years old, while probably ok in
> most cases, would be a bit aggressive. We use optional dependencies where we
> can, so e.g. the Java 17 dependency for Spark 4 would only affect clients
> using Spark 4, and they could wait to upgrade. But we chose to use Arrow in
> the core of our product, it is the internal format everything else goes
> through. On the compliance side we have to keep current with security
> updates, so there is no option to stick on an old version.
>
> If we were to drop Java 11 after the next LTS comes out, i.e. 2025 / 2026,
> then the three latest LTS versions would be supported and the minimum
> version would have been available for 4 - 5 years. I think it would be very
> hard to argue 17 can’t be made available at that point. If Arrow forces our
> hand then obviously we’ll have to go sooner, but it wouldn’t be ideal for
> us.
>
> Lastly just on language capabilities, the only things we’re really
> interested in are performance related, probably virtual threads and foreign
> memory would be the main ones. Both of those could be optional
> dependencies, in the case of FFM we’d rely on either yourselves or Netty
> anyway to provide an allocator. So in fact there is very little benefit for
> us to drop Java 11 early, all it costs us is one extra CI job.
>
> Hope some of this is helpful - apologies for the high latency, busy as
> always!!
>
> Martin.
>
>
> > On 1 May 2024, at 22:38, Dane Pitkin wrote:
> >
> > Thanks, Martin. It's great to hear of real-world use cases. Do you
> > anticipate any timeline for dropping Java 11 for your product? If Arrow did
> > drop Java 11, then it sounds like pinning Arrow Java to an older version
> > wouldn't be an ideal option if security patches are not backported.
> >
> > Fokko, can you elaborate on the discussions held in other OSS projects to
> > drop Java <17? How did they weigh the benefits/drawbacks for dropping both
> > Java 8 and 11 LTS versions? I'd also be curious if other projects plan to
> > support older branches with security patches.
> >
> >
> >
> >
> >
> >
> >
> > On Tue, Apr 30, 2024 at 4:14 PM  wrote:
> >
> >> Speaking for my own product we would like to see Java 11 support, we rely
> >> heavily on Arrow and have Java 11 as our minimum supported version. We’d
> >> like to keep doing that if possible. Our clients are big enterprises with
> >> notoriously sluggish update cycles, so we want to offer maximum
> >> compatibility. Once security patches are no longer available on the regular
> >> public channels then there is a compliance issue, so we generally follow
> >> the EOL schedule of our dependencies.
> >>
> >> Corretto, Adoptium and Zulu all have recent public builds of both 8 and 11
> >> and look set to support them with public builds for many years to come.
> >> Several organisations I have worked with switched away from Oracle when
> >> they made their licensing blunder with Java 8 and although that is
> >> rectified now, the change seems to have stuck in quite a few places (at
> >> least in my anecdotal experience).
> >>
> >> A major practical difference to me in Java 17 is the strong encapsulation
> >> of internals. Since that affects the majority of serious Java applications
> >> then perhaps most people have figured out by now to add the JVM params that
> >> let Java continue working. Still, it could be a consideration if Java 17
> >> is the baseline supported version.
> >>
> >> Regards,
&
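Martin's estimate in the quoted message above (dropping Java 11 around 2025 / 2026 would leave a minimum version that has been available for 4 - 5 years) can be checked with a few lines of date arithmetic. This is only a sketch: the class name `LtsAge` is hypothetical, Java 17's release is approximated as September 2021, and the two drop dates are assumptions standing in for "2025 / 2026".

```java
import java.time.LocalDate;
import java.time.Period;

final class LtsAge {
    private LtsAge() {}

    /** Whole years elapsed between two dates (partial years are not counted). */
    static int fullYearsBetween(LocalDate from, LocalDate to) {
        return Period.between(from, to).getYears();
    }

    public static void main(String[] args) {
        // Assumption: Java 17 became generally available in September 2021.
        LocalDate java17Release = LocalDate.of(2021, 9, 1);
        // Assumption: candidate dates for dropping Java 11 ("2025 / 2026").
        LocalDate dropIn2025 = LocalDate.of(2025, 9, 1);
        LocalDate dropIn2026 = LocalDate.of(2026, 9, 1);

        System.out.println("Age of Java 17 if dropped in 2025: "
                + fullYearsBetween(java17Release, dropIn2025) + " years");
        System.out.println("Age of Java 17 if dropped in 2026: "
                + fullYearsBetween(java17Release, dropIn2026) + " years");
    }
}
```

Under those assumed dates the two results bracket the "4 - 5 years" range mentioned in the message.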

Re: [ANNOUNCE] New Arrow committer: Dane Pitkin

2024-05-08 Thread Dane Pitkin
Thanks, all! It's exhilarating to be a part of this project. I appreciate
the kind words!

On Tue, May 7, 2024 at 10:10 PM David Li  wrote:

> Congrats Dane!
>
> On Wed, May 8, 2024, at 09:46, Felipe Oliveira Carvalho wrote:
> > Great news. Congratulations Dane!
> >
> > On Tue, May 7, 2024 at 7:57 PM Vibhatha Abeykoon wrote:
> >>
> >> Congratulations Dane!!!
> >>
> >> Vibhatha Abeykoon
> >>
> >>
> >> On Wed, May 8, 2024 at 4:02 AM Jacob Wujciak wrote:
> >>
> >> > Congrats!
> >> >
> >> > On Tue, May 7, 2024 at 23:19, Bryce Mecum <bryceme...@gmail.com> wrote:
> >> >
> >> > > Congrats Dane!
> >> > >
> >> > > On Tue, May 7, 2024 at 5:53 AM Joris Van den Bossche
> >> > >  wrote:
> >> > > >
> >> > > > On behalf of the Arrow PMC, I'm happy to announce that Dane Pitkin has
> >> > > > accepted an invitation to become a committer on Apache Arrow. Welcome,
> >> > > > and thank you for your contributions!
> >> > > >
> >> > > > Joris
> >> > >
> >> >
>


Re: [DISCUSS] Drop Java 8 support

2024-05-01 Thread Dane Pitkin
Thanks, Martin. It's great to hear of real-world use cases. Do you
anticipate any timeline for dropping Java 11 for your product? If Arrow did
drop Java 11, then it sounds like pinning Arrow Java to an older version
wouldn't be an ideal option if security patches are not backported.

Fokko, can you elaborate on the discussions held in other OSS projects to
drop Java <17? How did they weigh the benefits/drawbacks for dropping both
Java 8 and 11 LTS versions? I'd also be curious if other projects plan to
support older branches with security patches.







On Tue, Apr 30, 2024 at 4:14 PM  wrote:

> Speaking for my own product we would like to see Java 11 support, we rely
> heavily on Arrow and have Java 11 as our minimum supported version. We’d
> like to keep doing that if possible. Our clients are big enterprises with
> notoriously sluggish update cycles, so we want to offer maximum
> compatibility. Once security patches are no longer available on the regular
> public channels then there is a compliance issue, so we generally follow
> the EOL schedule of our dependencies.
>
> Corretto, Adoptium and Zulu all have recent public builds of both 8 and 11
> and look set to support them with public builds for many years to come.
> Several organisations I have worked with switched away from Oracle when
> they made their licensing blunder with Java 8 and although that is
> rectified now, the change seems to have stuck in quite a few places (at
> least in my anecdotal experience).
>
> A major practical difference to me in Java 17 is the strong encapsulation
> of internals. Since that affects the majority of serious Java applications
> then perhaps most people have figured out by now to add the JVM params that
> let Java continue working. Still, it could be a consideration if Java 17
> is the baseline supported version.
>
> Regards,
> Martin.
>
> - In case anyone is curious why we don’t support Java 8 per our own
> policy, it’s because of the “var” keyword - seriously, why did Java take so
> long with that, even C++ got there sooner!
>
> > On 30 Apr 2024, at 16:20, Jacob Wujciak  wrote:
> >
> > Hello everyone!
> > Great to see this move forward!
> > +1 on dropping both 8 and 11 unless there is a very good reason to keep 11
> > around.
> > Otherwise people will just move to 11 and then have the pain of migration
> > again when we drop that (which will happen soon regardless imo).
> >
> > On Tue, Apr 30, 2024 at 16:18, Dane Pitkin wrote:
> >
> >> Thanks, JB. Are we aware of any downstream dependencies that would benefit
> >> from maintaining Java 11 support? Apache Spark jumped straight to Java 17.
> >> It seems other projects are dropping both 8 and 11 at the same time as
> >> mentioned by Fokko. From a maintenance perspective, it would be nice to
> >> drop both.
> >>
> >> On Mon, Apr 29, 2024 at 11:20 AM Jean-Baptiste Onofré 
> >> wrote:
> >>
> >>> Hi
> >>>
> >>> I think it's time to drop JDK8 support. I would say that we should
> >>> keep Java11 (jumping directly to Java17 would be problematic
> >>> potentially for some users I guess).
> >>>
> >>> Regards
> >>> JB
> >>>
> >>> On Thu, Apr 25, 2024 at 10:21 PM James Duong
> >>>  wrote:
> >>>>
> >>>> If we dropped JDK 8, we could use the JDK to compile module-info.java
> >>>> files. Then we could remove the custom Maven plugin we’re using for
> >>>> compiling module-info.java files for JPMS support and get better IDE
> >>>> integration (as what we’re doing currently somewhat shoe-horns module
> >>>> information alongside JDK8 bytecode).
> >>>>
> >>>> From: Dane Pitkin 
> >>>> Date: Thursday, April 25, 2024 at 1:02 PM
> >>>> To: dev@arrow.apache.org 
> >>>> Subject: [DISCUSS] Drop Java 8 support
> >>>> Hi all,
> >>>>
> >>>> I would like to revisit the discussion of dropping Java 8 (and maybe 11)
> >>>> from Arrow's Java implementation. See GH issue[1] below. This was also
> >>>> discussed in the last Arrow community sync meeting on 2024-04-24.
> >>>>
> >>>> For context, this was discussed[2] last year on this mailing list. We
> >>>> decided to revisit the discussion around the June 2024 release (Arrow v17).
> >>>> The timing coincides with the initial release of Apache Spark 4.0.0, which
> >>>> drops both Java 8 and 11
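
Martin's remark earlier in this message about strong encapsulation in Java 17, and the JVM parameters people add to work around it, can be illustrated with a small probe. This is a minimal sketch, not Arrow code: the class `EncapsulationProbe` and its methods are hypothetical, and the `address` field of `java.nio.Buffer` is a JDK implementation detail used here only as an example of an internal member. The `--add-opens=java.base/java.nio=ALL-UNNAMED` flag it prints is the standard JVM option for opening that package to classpath code.

```java
import java.lang.reflect.Field;
import java.nio.Buffer;

final class EncapsulationProbe {
    private EncapsulationProbe() {}

    /** Builds a JVM flag such as --add-opens=java.base/java.nio=ALL-UNNAMED. */
    static String addOpensFlag(String module, String pkg, String target) {
        return "--add-opens=" + module + "/" + pkg + "=" + target;
    }

    /** Returns true if reflective access to a java.nio.Buffer internal field is permitted. */
    static boolean canOpenBufferInternals() {
        try {
            // "address" is an internal field; its name is an implementation detail.
            Field address = Buffer.class.getDeclaredField("address");
            address.setAccessible(true);
            return true;
        } catch (NoSuchFieldException e) {
            return false;
        } catch (RuntimeException e) {
            // On newer JDKs this is InaccessibleObjectException unless the
            // package has been opened, for example with --add-opens.
            return false;
        }
    }

    public static void main(String[] args) {
        if (canOpenBufferInternals()) {
            System.out.println("java.nio internals are reachable via reflection");
        } else {
            System.out.println("Blocked; consider "
                    + addOpensFlag("java.base", "java.nio", "ALL-UNNAMED"));
        }
    }
}
```

Running it once with and once without the flag on a Java 17+ JVM should show the difference Martin describes; on older JVMs the access may simply succeed.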

Re: [DISCUSS] Drop Java 8 support

2024-04-30 Thread Dane Pitkin
Thanks, JB. Are we aware of any downstream dependencies that would benefit
from maintaining Java 11 support? Apache Spark jumped straight to Java 17.
It seems other projects are dropping both 8 and 11 at the same time as
mentioned by Fokko. From a maintenance perspective, it would be nice to
drop both.

On Mon, Apr 29, 2024 at 11:20 AM Jean-Baptiste Onofré 
wrote:

> Hi
>
> I think it's time to drop JDK8 support. I would say that we should
> keep Java11 (jumping directly to Java17 would be problematic
> potentially for some users I guess).
>
> Regards
> JB
>
> On Thu, Apr 25, 2024 at 10:21 PM James Duong
>  wrote:
> >
> > If we dropped JDK 8, we could use the JDK to compile module-info.java
> > files. Then we could remove the custom Maven plugin we’re using for
> > compiling module-info.java files for JPMS support and get better IDE
> > integration (as what we’re doing currently somewhat shoe-horns module
> > information alongside JDK8 bytecode).
> >
> > From: Dane Pitkin 
> > Date: Thursday, April 25, 2024 at 1:02 PM
> > To: dev@arrow.apache.org 
> > Subject: [DISCUSS] Drop Java 8 support
> > Hi all,
> >
> > I would like to revisit the discussion of dropping Java 8 (and maybe 11)
> > from Arrow's Java implementation. See GH issue[1] below. This was also
> > discussed in the last Arrow community sync meeting on 2024-04-24.
> >
> > For context, this was discussed[2] last year on this mailing list. We
> > decided to revisit the discussion around the June 2024 release (Arrow v17).
> > The timing coincides with the initial release of Apache Spark 4.0.0, which
> > drops both Java 8 and 11 support.
> >
> > For background, we chose not to drop Java 8 support last year because Arrow
> > is seen as a low level library that should support as many environments as
> > possible. Nowadays, we see more enthusiasm for dropping Java 8 (and 11) as
> > exemplified by Apache Spark as well as Apache Iceberg[3].
> >
> > Is it time to consider dropping Java 8? Should we drop Java 11 and skip
> > straight to Java 17 as our minimum version? What implications do we need to
> > be aware of?
> >
> > Thanks,
> > Dane
> >
> > [1]https://github.com/apache/arrow/issues/38051
> > [2]https://lists.apache.org/thread/s07jx58yw4mkl54t3bkggnyg0sftcrr8
> > [3]https://lists.apache.org/thread/ntrk2thvsg9tdccwd4flsdz9gg743368
>


Re: [DISCUSS] Drop Java 8 support

2024-04-26 Thread Dane Pitkin
Thanks James, Fokko! Sounds like we are generally in agreement to move
forward.

I'll leave the discussion open for now in case others have
questions/comments to add.

On Thu, Apr 25, 2024 at 4:25 PM Fokko Driesprong  wrote:

> Hey Dane,
>
> Thanks for bringing this up again.
>
> In the dev-list thread you referred to I hesitated to drop Java <17, but it
> is time. We see several projects that are moving past Java 8, and in
> the process are also dropping Java 11 since it is not supported anymore
> <https://www.oracle.com/java/technologies/java-se-support-roadmap.html>. I
> echoed this on the Iceberg dev list, and also suggested this with Avro 1.12
> <https://lists.apache.org/thread/6vbd3w5qk7mpb5lyrfyf2s0z1cymjt5w>.
>
> Kind regards,
> Fokko
>
>
>
> On Thu, Apr 25, 2024 at 22:02, Dane Pitkin wrote:
>
> > Hi all,
> >
> > I would like to revisit the discussion of dropping Java 8 (and maybe 11)
> > from Arrow's Java implementation. See GH issue[1] below. This was also
> > discussed in the last Arrow community sync meeting on 2024-04-24.
> >
> > For context, this was discussed[2] last year on this mailing list. We
> > decided to revisit the discussion around the June 2024 release (Arrow v17).
> > The timing coincides with the initial release of Apache Spark 4.0.0, which
> > drops both Java 8 and 11 support.
> >
> > For background, we chose not to drop Java 8 support last year because Arrow
> > is seen as a low level library that should support as many environments as
> > possible. Nowadays, we see more enthusiasm for dropping Java 8 (and 11) as
> > exemplified by Apache Spark as well as Apache Iceberg[3].
> >
> > Is it time to consider dropping Java 8? Should we drop Java 11 and skip
> > straight to Java 17 as our minimum version? What implications do we need to
> > be aware of?
> >
> > Thanks,
> > Dane
> >
> > [1]https://github.com/apache/arrow/issues/38051
> > [2]https://lists.apache.org/thread/s07jx58yw4mkl54t3bkggnyg0sftcrr8
> > [3]https://lists.apache.org/thread/ntrk2thvsg9tdccwd4flsdz9gg743368
> >
>


[DISCUSS] Drop Java 8 support

2024-04-25 Thread Dane Pitkin
Hi all,

I would like to revisit the discussion of dropping Java 8 (and maybe 11)
from Arrow's Java implementation. See GH issue[1] below. This was also
discussed in the last Arrow community sync meeting on 2024-04-24.

For context, this was discussed[2] last year on this mailing list. We
decided to revisit the discussion around the June 2024 release (Arrow v17).
The timing coincides with the initial release of Apache Spark 4.0.0, which
drops both Java 8 and 11 support.

For background, we chose not to drop Java 8 support last year because Arrow
is seen as a low level library that should support as many environments as
possible. Nowadays, we see more enthusiasm for dropping Java 8 (and 11) as
exemplified by Apache Spark as well as Apache Iceberg[3].

Is it time to consider dropping Java 8? Should we drop Java 11 and skip
straight to Java 17 as our minimum version? What implications do we need to
be aware of?

Thanks,
Dane

[1]https://github.com/apache/arrow/issues/38051
[2]https://lists.apache.org/thread/s07jx58yw4mkl54t3bkggnyg0sftcrr8
[3]https://lists.apache.org/thread/ntrk2thvsg9tdccwd4flsdz9gg743368


Re: Upgrading Java version in build toolchain

2024-04-05 Thread Dane Pitkin
I think we can revisit the discussion soon for dropping Java 8 altogether,
since Spark will release 4.0 in ~June supporting Java 17+ at runtime.

I'm curious how big of an effort it would be to get your proposal
implemented. Would you be willing to draft a PR so we can see what types of
changes are necessary?

On Wed, Apr 3, 2024 at 8:05 AM Jean-Baptiste Onofré  wrote:

> Yes, correct for language features. My point was more that we can
> decide on a major Arrow version upgrading the target language version.
> That's what I meant by "consensus".
>
> Regards
> JB
>
> On Tue, Apr 2, 2024 at 5:55 PM Laurent Goujon
>  wrote:
> >
> > At code level we need to separate language features from library features.
> > It should be possible to leverage the memory API, for example, through
> > reflection and/or multi-release jar files, but record is a language feature
> > and it would not be possible to use it without targeting Java 17 at the
> > source level.
> >
> > Laurent
> >
> > On Tue, Apr 2, 2024 at 1:40 AM Jean-Baptiste Onofré 
> wrote:
> >
> > > Hi Laurent
> > >
> > > It makes sense to me. I started this "move" (on the plugin side of the
> > > thing) as part of the reproducible build effort.
> > >
> > > At code level, I think it would be great to leverage some features
> > > from Java 17+ (I'm thinking about record, memory API, etc).
> > > I would be more than happy to help on this as soon as we have a consensus.
> > >
> > > Thanks,
> > > Regards
> > > JB
> > >
> > > On Mon, Apr 1, 2024 at 7:48 PM Laurent Goujon
> > >  wrote:
> > > >
> > > > Hello Arrow Java developers,
> > > >
> > > > I wonder if the community would be okay to change the minimum Java
> > > > version used by the build toolchain to at least Java 17 or 21 (or even 22).
> > > > This is different from changing the minimum Java version used at runtime
> > > > which would still be 8 (following the vote from last September).
> > > >
> > > > Concretely it would mean:
> > > > * Java 21 would be required to build Arrow Java
> > > > * But Arrow would still be compatible with Java 8
> > > > * Unit tests should keep running with Java 8 and higher.
> > > >
> > > > Reasons for changing the toolchain would be:
> > > > - More and more tools and plugins now require at least Java 11, forcing the
> > > > project to keep using older/unsupported versions
> > > > - We hacked our way to support Java modules with
> > > > https://github.com/apache/arrow/tree/main/java/maven/module-info-compiler-maven-plugin
> > > > - There are several new features which we could conditionally include in
> > > > the Arrow project like VarHandle (Java 9) and Foreign Function and Memory
> > > > API (Java 22) to move away from Unsafe support which requires more and more
> > > > workarounds (Apache Lucene is a project which has managed to introduce
> > > > support for multiple Java incubator and final APIs while maintaining
> > > > compatibility with previous Java versions). But doing it from Java 8
> > > > creates a higher barrier.
> > > >
> > > > Laurent
> > >
>
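
Laurent's distinction in the thread above (library features can be reached conditionally, for example through reflection, while language features such as records cannot) can be sketched as follows. This is an illustration under stated assumptions, not how Arrow or Lucene actually do it: the class `FeatureProbe`, its method names, and the backend labels "ffm", "varhandle" and "unsafe" are all hypothetical. The source itself uses only calls available on Java 8 (`System.getProperty`, `Class.forName`).

```java
final class FeatureProbe {
    private FeatureProbe() {}

    /** Parses a java.specification.version string: "1.8" gives 8, "17" gives 17. */
    static int parseFeatureVersion(String spec) {
        String[] parts = spec.split("\\.");
        // Java 8 and earlier report "1.x"; Java 9 and later report the feature number.
        if (parts[0].equals("1") && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return Integer.parseInt(parts[0]);
    }

    /** Feature version of the JVM this code is running on. */
    static int currentFeatureVersion() {
        return parseFeatureVersion(System.getProperty("java.specification.version"));
    }

    /** Checks whether a class exists at runtime without initializing it. */
    static boolean hasClass(String name) {
        try {
            Class.forName(name, false, FeatureProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Picks a (hypothetical) memory backend label based on which library APIs are present. */
    static String pickMemoryBackend() {
        // Final API as of Java 22; may also be visible as a preview API on some earlier JDKs.
        if (hasClass("java.lang.foreign.MemorySegment")) {
            return "ffm";
        }
        // Available since Java 9.
        if (hasClass("java.lang.invoke.VarHandle")) {
            return "varhandle";
        }
        return "unsafe";
    }

    public static void main(String[] args) {
        System.out.println("Java feature version: " + currentFeatureVersion());
        System.out.println("Memory backend: " + pickMemoryBackend());
    }
}
```

A multi-release jar achieves a similar effect at packaging time rather than at runtime, by shipping version-specific classes under META-INF/versions; the reflection approach above keeps everything in one class at the cost of weaker compile-time checking.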


Re: [VOTE] Release Apache Arrow ADBC 0.11.0 - RC0

2024-03-28 Thread Dane Pitkin
+1 (non-binding)

Verified on MacOS 14.4.1 aarch64 with Conda using:

DOCKER_DEFAULT_PLATFORM=linux/amd64 USE_CONDA=1
./dev/release/verify-release-candidate.sh 0.11.0 0

On Thu, Mar 28, 2024 at 11:07 AM David Li  wrote:

> Hello,
>
> I would like to propose the following release candidate (RC0) of Apache
> Arrow ADBC version 0.11.0. This is a release consisting of 36 resolved
> GitHub issues [1].
>
> This release candidate is based on commit:
> 3cb5825bf551ae93d0e9ed2f64be226b569b27a7 [2]
>
> The source release rc0 is hosted at [3].
> The binary artifacts are hosted at [4][5][6][7][8].
> The changelog is located at [9].
>
> Please download, verify checksums and signatures, run the unit tests, and
> vote on the release. See [10] for how to validate a release candidate.
>
> See also a verification result on GitHub Actions [11].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow ADBC 0.11.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow ADBC 0.11.0 because...
>
> Note: to verify APT/YUM packages on macOS/AArch64, you must `export
> DOCKER_DEFAULT_PLATFORM=linux/amd64`. (Or skip this step by `export
> TEST_APT=0 TEST_YUM=0`.)
>
> [1]:
> https://github.com/apache/arrow-adbc/issues?q=is%3Aissue+milestone%3A%22ADBC+Libraries+0.11.0%22+is%3Aclosed
> [2]:
> https://github.com/apache/arrow-adbc/commit/3cb5825bf551ae93d0e9ed2f64be226b569b27a7
> [3]:
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-adbc-0.11.0-rc0/
> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> [5]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> [6]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> [7]:
> https://repository.apache.org/content/repositories/staging/org/apache/arrow/adbc/
> [8]:
> https://github.com/apache/arrow-adbc/releases/tag/apache-arrow-adbc-0.11.0-rc0
> [9]:
> https://github.com/apache/arrow-adbc/blob/apache-arrow-adbc-0.11.0-rc0/CHANGELOG.md
> [10]:
> https://arrow.apache.org/adbc/main/development/releasing.html#how-to-verify-release-candidates
> [11]: https://github.com/apache/arrow-adbc/actions/runs/8468352632
>


Re: [ANNOUNCE] New Arrow committer: Bryce Mecum

2024-03-18 Thread Dane Pitkin
Congratulations, Bryce!!

On Mon, Mar 18, 2024 at 9:18 AM David Li  wrote:

> Congrats Bryce!
>
> On Mon, Mar 18, 2024, at 08:52, Ian Cook wrote:
> > Congratulations Bryce!
> >
> > Ian
> >
> > On Sun, Mar 17, 2024 at 22:24 Nic Crane  wrote:
> >
> >> On behalf of the Arrow PMC, I'm happy to announce that Bryce Mecum has
> >> accepted an invitation to become a committer on Apache Arrow. Welcome,
> and
> >> thank you for your contributions!
> >>
> >> Nic
> >>
>


Re: [VOTE] Release Apache Arrow ADBC 0.10.0 - RC1

2024-02-21 Thread Dane Pitkin
+1 (non-binding)

Verified on Mac M1 using conda.

On Tue, Feb 20, 2024 at 11:27 PM Dewey Dunnington
 wrote:

> +1!
>
> I ran USE_CONDA=1 dev/release/verify-release-candidate.sh 0.10.0 1 on
> MacOS Sonoma (M1).
>
> On Tue, Feb 20, 2024 at 9:43 AM Jean-Baptiste Onofré 
> wrote:
> >
> > +1 (non-binding)
> >
> > I quickly tested on MacOS arm64.
> >
> > Regards
> > JB
> >
> > On Sun, Feb 18, 2024 at 9:47 PM David Li  wrote:
> > >
> > > Hello,
> > >
> > > > I would like to propose the following release candidate (RC1) of
> > > > Apache Arrow ADBC version 0.10.0. This is a release consisting of 30
> > > > resolved GitHub issues [1].
> > > >
> > > > This release candidate is based on commit:
> > > > 9a8e44cc62f23a68ffc0d3d4c7362214b221bea0 [2]
> > > >
> > > > The source release rc1 is hosted at [3].
> > > > The binary artifacts are hosted at [4][5][6][7][8].
> > > > The changelog is located at [9].
> > > >
> > > > Please download, verify checksums and signatures, run the unit tests,
> > > > and vote on the release. See [10] for how to validate a release candidate.
> > > >
> > > > See also a verification result on GitHub Actions [11].
> > > >
> > > > The vote will be open for at least 72 hours.
> > > >
> > > > [ ] +1 Release this as Apache Arrow ADBC 0.10.0
> > > > [ ] +0
> > > > [ ] -1 Do not release this as Apache Arrow ADBC 0.10.0 because...
> > > >
> > > > Note: to verify APT/YUM packages on macOS/AArch64, you must `export
> > > > DOCKER_DEFAULT_PLATFORM=linux/amd64`. (Or skip this step by `export
> > > > TEST_APT=0 TEST_YUM=0`.)
> > > >
> > > > [1]: https://github.com/apache/arrow-adbc/issues?q=is%3Aissue+milestone%3A%22ADBC+Libraries+0.10.0%22+is%3Aclosed
> > > > [2]: https://github.com/apache/arrow-adbc/commit/9a8e44cc62f23a68ffc0d3d4c7362214b221bea0
> > > > [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-adbc-0.10.0-rc1/
> > > > [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> > > > [5]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> > > > [6]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> > > > [7]: https://repository.apache.org/content/repositories/staging/org/apache/arrow/adbc/
> > > > [8]: https://github.com/apache/arrow-adbc/releases/tag/apache-arrow-adbc-0.10.0-rc1
> > > > [9]: https://github.com/apache/arrow-adbc/blob/apache-arrow-adbc-0.10.0-rc1/CHANGELOG.md
> > > > [10]: https://arrow.apache.org/adbc/main/development/releasing.html#how-to-verify-release-candidates
> > > [11]: https://github.com/apache/arrow-adbc/actions/runs/7951302316
>


Re: Arrow 15 parquet nanosecond change

2024-02-21 Thread Dane Pitkin
It is possible to override the default Parquet format version when
instantiating PyArrow's ParquetWriter[1]. Here's the PR[2] that upgraded the
default Parquet format version from 2.4 to 2.6; version 2.6 adds nanosecond
timestamp support. The change was released in Arrow v13.

[1]
https://github.com/apache/arrow/blob/e198f309c577de9a265c04af2bc4644c33f54375/python/pyarrow/parquet/core.py#L953

[2]https://github.com/apache/arrow/pull/36137

On Wed, Feb 21, 2024 at 4:15 PM Li Jin  wrote:

> “Exponentially exposed” -> “potentially exposed”
>
> On Wed, Feb 21, 2024 at 4:13 PM Li Jin  wrote:
>
> > Thanks - since we don’t control all the invocations of pq.write_table, I
> > wonder if there is some configuration for the “default” behavior?
> >
> > Also I wonder if there are other API surfaces that are exponentially
> > exposed to this, e.g., dataset or pd.DataFrame.to_parquet?
> >
> > Thanks!
> > Li
> >
> > On Wed, Feb 21, 2024 at 3:53 PM Jacek Pliszka 
> > wrote:
> >
> >> Hi!
> >>
> >> pq.write_table(
> >> table, config.output_filename, coerce_timestamps="us",
> >> allow_truncated_timestamps=True,
> >> )
> >>
> >> allows you to write as us instead of ns.
> >>
> >> BR
> >>
> >> J
> >>
> >>
> >> On Wed, Feb 21, 2024 at 21:44 Li Jin wrote:
> >>
> >> > Hi,
> >> >
> >> > My colleague has informed me that during the Arrow 12->15 upgrade, he
> >> > found that writing a pandas DataFrame with datetime64[ns] to Parquet
> >> > results in nanosecond metadata and nanosecond values.
> >> >
> >> > I wonder if this is something configurable to the old behavior so we
> >> > can enable “nanosecond in parquet” gradually? There is code that reads
> >> > Parquet files that doesn’t handle Parquet nanoseconds yet.
> >> >
> >> > Thanks!
> >> > Li
> >> >
> >>
> >
>


[DISCUSS] Arrow 15.0.1 patch release

2024-02-13 Thread Dane Pitkin
Hi all,

Arrow Java identified an issue[1] in the 15.0.0 release. There is an
undefined symbol in the dataset module that causes a linking error at run
time. The issue is resolved[2] and I'd like to propose a patch release. We
also have an open issue to implement testing to prevent this from happening
in the future[3]. This is a major regression in the Arrow Java package, so
it would be great to get a patch released for users.

Special thanks to David Susanibar and Kou for triaging and fixing this
issue.

Thanks,
Dane

[1]https://github.com/apache/arrow/issues/39919
[2]https://github.com/apache/arrow/pull/40015
[3]https://github.com/apache/arrow/issues/40018


Re: [VOTE] Release Apache Arrow nanoarrow 0.4.0 - RC0

2024-01-29 Thread Dane Pitkin
+1 (non-binding)

Verified on MacOS 14 using conda.

On Mon, Jan 29, 2024 at 10:11 AM Dewey Dunnington
 wrote:

> Hello,
>
> I would like to propose the following release candidate (rc0) of
> Apache Arrow nanoarrow [0] version 0.4.0. This release consists of 46
> resolved GitHub issues from 5 contributors [1].
>
> This release candidate is based on commit:
> 3f83f4c48959f7a51053074672b7a330888385b1 [2]
>
> The source release rc0 is hosted at [3].
> The changelog is located at [4].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [5] for how to validate a release
> candidate. Note also a successful verification CI run at [6].
>
> This release contains experimental Python bindings to the nanoarrow C
> library. This vote is on the source tarball only; however, wheels have
> also been prepared and tested for convenience and are available from
> [7].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow nanoarrow 0.4.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow nanoarrow 0.4.0 because...
>
> [0] https://github.com/apache/arrow-nanoarrow
> [1] https://github.com/apache/arrow-nanoarrow/milestone/4?closed=1
> [2]
> https://github.com/apache/arrow-nanoarrow/tree/apache-arrow-nanoarrow-0.4.0-rc0
> [3]
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-nanoarrow-0.4.0-rc0/
> [4]
> https://github.com/apache/arrow-nanoarrow/blob/apache-arrow-nanoarrow-0.4.0-rc0/CHANGELOG.md
> [5]
> https://github.com/apache/arrow-nanoarrow/blob/main/dev/release/README.md
> [6] https://github.com/apache/arrow-nanoarrow/actions/runs/7697719271
> [7] https://github.com/apache/arrow-nanoarrow/actions/runs/7697710625
>


Re: [DISC] Improve Arrow Release verification process

2024-01-19 Thread Dane Pitkin
I agree that this is a great time to look at improving the verification
process. One solution I've seen work fairly well is to convert large bash
scripts into a lightweight ETL pipeline that caches the results/status of
each node as it executes. That way, restarting a pipeline at the right
checkpoint is trivial. Existing open source ETL platforms should be able to
do this, although I don't know which would be best. Something like Apache
Airflow would probably be overkill IMO.
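
To make the checkpointing idea concrete, here is a minimal Python sketch using
only the standard library. It is an illustration under stated assumptions, not
a proposal for specific tooling: run_pipeline, the step names, and the
state.json file are all hypothetical, and a real verification pipeline would
call the existing scripts inside each step.

```python
import json
import tempfile
from pathlib import Path


def run_pipeline(steps, state_file):
    """Run named steps in order, skipping any already recorded as done."""
    state_path = Path(state_file)
    done = set(json.loads(state_path.read_text())) if state_path.exists() else set()
    executed = []
    for name, func in steps:
        if name in done:
            continue  # finished in an earlier run; do not repeat it
        func()  # an exception here leaves earlier checkpoints on disk
        done.add(name)
        state_path.write_text(json.dumps(sorted(done)))  # checkpoint
        executed.append(name)
    return executed


# Hypothetical steps; the second one fails once to mimic a flaky job.
attempts = {"test": 0}


def download():
    pass  # stand-in for fetching release artifacts


def flaky_test():
    attempts["test"] += 1
    if attempts["test"] == 1:
        raise RuntimeError("transient failure")


steps = [("download", download), ("test", flaky_test)]
state_file = Path(tempfile.mkdtemp()) / "state.json"

try:
    run_pipeline(steps, state_file)
except RuntimeError:
    pass  # the first run stops at the flaky step

# The second run resumes after the last checkpoint and skips "download".
resumed = run_pipeline(steps, state_file)
print(resumed)
```

Keeping the state in a plain JSON file is only one possible choice; it keeps
the sketch dependency-free and makes it easy to delete the file to force a
full re-run.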

On Fri, Jan 19, 2024 at 5:44 AM Raúl Cumplido  wrote:

> Hi,
>
> One of the challenges we have when doing a release is verification and
> voting.
>
> Currently the Arrow verification process is quite long, tedious and error
> prone.
>
> I would like to use this email to get feedback and user requests in
> order to improve the process.
>
> Several things already on my mind:
>
> One thing that is quite annoying is that any flaky failure makes us
> restart the process and possibly requires downloading everything
> again. It would be great to have some kind of retry mechanism that
> allows us to keep going from where it failed and doesn't have to redo
> the previous successful jobs.
>
> We do have a bunch of flags to do specific parts but that requires
> knowledge and time to go over the different flags, etcetera so the UX
> could be improved.
>
> Based on the ASF release policy [1] in order to cast a +1 vote we have
> to validate the source code packages but it is not required to
> validate binaries locally. Several binaries are currently tested using
> docker images and they are already tested and validated on CI. Our
> documentation for release verification points to perform binary
> validation. I plan to update the documentation and move it to the
> official docs instead of the wiki [2].
>
> I would appreciate input on the topic so we can improve the current
> process.
>
> Thanks everyone,
> Raúl
>
> [1] https://www.apache.org/legal/release-policy.html#release-approval
> [2]
> https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
>


Re: [ANNOUNCE] New Arrow committer: Felipe Oliveira Carvalho

2023-12-07 Thread Dane Pitkin
Congrats, Felipe!

On Thu, Dec 7, 2023 at 11:41 AM hsseo0501  wrote:

> Congrats. Felipe :)
>
> Sent from my Galaxy
>
> Original email
> From: Ian Cook
> Date: 23/12/8 1:24 AM (GMT+09:00)
> To: dev@arrow.apache.org
> Subject: Re: [ANNOUNCE] New Arrow committer: Felipe Oliveira Carvalho
>
> Congratulations Felipe!!!
>
> On Thu, Dec 7, 2023 at 10:43 AM Benjamin Kietzman wrote:
>
> > On behalf of the Arrow PMC, I'm happy to announce that Felipe Oliveira
> > Carvalho has accepted an invitation to become a committer on Apache
> > Arrow. Welcome, and thank you for your contributions!
> >
> > Ben Kietzman


Re: [ANNOUNCE] New Arrow committer: James Duong

2023-11-16 Thread Dane Pitkin
Congrats, James!

On Thu, Nov 16, 2023 at 4:23 AM Alenka Frim 
wrote:

> Congratulations!
>
> On Thu, Nov 16, 2023 at 8:46 AM Joris Van den Bossche <
> jorisvandenboss...@gmail.com> wrote:
>
> > Congrats!
> >
> > On Thu, 16 Nov 2023 at 08:44, Sutou Kouhei  wrote:
> > >
> > > On behalf of the Arrow PMC, I'm happy to announce that James Duong
> > > has accepted an invitation to become a committer on Apache
> > > Arrow. Welcome, and thank you for your contributions!
> > >
> > > --
> > > kou
> > >
> > >
> >
>


Re: [ANNOUNCE] New Arrow PMC member: Raúl Cumplido

2023-11-13 Thread Dane Pitkin
Congrats, Raul!

On Mon, Nov 13, 2023 at 2:45 PM Kevin Gurney 
wrote:

> Congratulations, Raúl!
>
> 
> From: Nic Crane 
> Sent: Monday, November 13, 2023 2:31 PM
> To: dev@arrow.apache.org 
> Subject: Re: [ANNOUNCE] New Arrow PMC member: Raúl Cumplido
>
> Congrats Raul!
>
> On Tue, 14 Nov 2023, 03:28 Andrew Lamb,  wrote:
>
> > The Project Management Committee (PMC) for Apache Arrow has invited
> > Raúl Cumplido  to become a PMC member and we are pleased to announce
> > that  Raúl Cumplido has accepted.
> >
> > Please join me in congratulating them.
> >
> > Andrew
> >
>


Re: [VOTE] Release Apache Arrow ADBC 0.8.0 - RC0

2023-11-07 Thread Dane Pitkin
+1 (non-binding)

Verified on M1 MacOS 13 with:

USE_CONDA=1 TEST_YUM=0 TEST_APT=0 ./dev/release/verify-release-candidate.sh 0.8.0 0

On Tue, Nov 7, 2023 at 9:10 AM Dewey Dunnington
 wrote:

> +1!
>
> I ran: TEST_APT=0 TEST_YUM=0 USE_CONDA=1
> dev/release/verify-release-candidate.sh 0.8.0 0
>
> On Fri, Nov 3, 2023 at 12:18 PM David Li  wrote:
> >
> > Hello,
> >
> > I would like to propose the following release candidate (RC0) of Apache
> > Arrow ADBC version 0.8.0. This is a release consisting of 42 resolved
> > GitHub issues [1].
> >
> > This release candidate is based on commit:
> > 95f13231f49494bcf78df45de1f65aa25620981b [2]
> >
> > The source release rc0 is hosted at [3].
> > The binary artifacts are hosted at [4][5][6][7][8].
> > The changelog is located at [9].
> >
> > Please download, verify checksums and signatures, run the unit tests,
> > and vote on the release. See [10] for how to validate a release candidate.
> >
> > See also a verification result on GitHub Actions [11].
> >
> > The vote will be open for at least 72 hours.
> >
> > [ ] +1 Release this as Apache Arrow ADBC 0.8.0
> > [ ] +0
> > [ ] -1 Do not release this as Apache Arrow ADBC 0.8.0 because...
> >
> > Note: to verify APT/YUM packages on macOS/AArch64, you must `export
> > DOCKER_DEFAULT_PLATFORM=linux/amd64`. (Or skip this step by `export
> > TEST_APT=0 TEST_YUM=0`.)
> >
> > Note: it is not currently possible to verify with Conda and Python 3.12
> > (some test dependencies do not yet have a Python 3.12 build available). The
> > verification script defaults to Python 3.11. Binary artifacts are available
> > for 3.12.
> >
> > [1]: https://github.com/apache/arrow-adbc/issues?q=is%3Aissue+milestone%3A%22ADBC+Libraries+0.8.0%22+is%3Aclosed
> > [2]: https://github.com/apache/arrow-adbc/commit/95f13231f49494bcf78df45de1f65aa25620981b
> > [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-adbc-0.8.0-rc0/
> > [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> > [5]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> > [6]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> > [7]: https://repository.apache.org/content/repositories/staging/org/apache/arrow/adbc/
> > [8]: https://github.com/apache/arrow-adbc/releases/tag/apache-arrow-adbc-0.8.0-rc0
> > [9]: https://github.com/apache/arrow-adbc/blob/apache-arrow-adbc-0.8.0-rc0/CHANGELOG.md
> > [10]: https://arrow.apache.org/adbc/main/development/releasing.html#how-to-verify-release-candidates
> > [11]: https://github.com/apache/arrow-adbc/actions/runs/6746653191
>


Re: [VOTE] Release Apache Arrow 14.0.0 - RC2

2023-10-27 Thread Dane Pitkin
Has anyone fully verified on MacOS yet? Last time I tried, I was still
getting stuck on the original Go issue after syncing Kou's latest changes
(I might need to try a fresh install of arrow).

On Wed, Oct 25, 2023 at 9:34 AM David Li  wrote:

> +1 (binding)
>
> Tested on Debian 12 'bookworm' / x86_64
>
> I had some trouble with the wheels but creating a fresh ARROW_TMPDIR
> resolved it. I filed [1] as well since testing binaries with the apparent
> rate-limit is quite frustrating
>
> [1]: https://github.com/apache/arrow/issues/38442
>
> On Wed, Oct 25, 2023, at 04:58, Raúl Cumplido wrote:
> > +1 (non-binding)
> >
> > I've successfully run:
> > TEST_DEFAULT=0 TEST_SOURCE=1 dev/release/verify-release-candidate.sh 14.0.0 2
> > TEST_DEFAULT=0 TEST_APT=1 dev/release/verify-release-candidate.sh 14.0.0 2
> > TEST_DEFAULT=0 TEST_BINARY=1 dev/release/verify-release-candidate.sh 14.0.0 2
> > TEST_DEFAULT=0 TEST_JARS=1 dev/release/verify-release-candidate.sh 14.0.0 2
> > TEST_DEFAULT=0 TEST_WHEELS=1 dev/release/verify-release-candidate.sh 14.0.0 2
> > TEST_DEFAULT=0 TEST_YUM=1 dev/release/verify-release-candidate.sh 14.0.0 2
> >
> > I have been able to successfully run the amazon linux 2023 yum package
> > by removing all my old docker images locally and pulling them again.
> >
> > With:
> >   * Python 3.10.12
> >   * gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
> >   * NVIDIA CUDA Build cuda_11.5.r11.5/compiler.30672275_0
> >   * openjdk 17.0.8.1 2023-08-24
> >   * ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux-gnu]
> >   * dotnet 7.0.112
> >   * Ubuntu 22.04 LTS
> >
> > On Wed, Oct 25, 2023 at 9:24, Sutou Kouhei wrote:
> >>
> >> Hi,
> >>
> >> https://github.com/apache/arrow/pull/38450 should fix it.
> >> This is not a blocker because this is a verification script
> >> problem not source archive problem.
> >>
> >>
> >> Thanks,
> >> --
> >> kou
> >>
> >> In "Re: [VOTE] Release Apache Arrow 14.0.0 - RC2" on Tue, 24 Oct 2023 20:13:57 +0200,
> >>   Raúl Cumplido wrote:
> >>
> >> > Hi Bryce,
> >> >
> >> > This happened on the verification tasks and is related to this
> >> > issue [1].
> >> >
> >> > It should be solved if you pull the latest main and the related
> >> > testing submodules.
> >> >
> >> > Thanks,
> >> > Raúl
> >> >
> >> > [1] https://github.com/apache/arrow/issues/38345
> >> >
> >> > On Tue, Oct 24, 2023 at 20:02, Bryce Mecum wrote:
> >> >>
> >> >> I've failed to verify this release candidate on macOS M1, running
> >> >> "dev/release/verify-release-candidate.sh 14.0.0 2" [1]. The failure
> >> >> looks related to the Go implementation's "parquet-encryption-test".
> >> >> Can anyone on a similar machine verify?
> >> >>
> >> >> [1] https://gist.github.com/amoeba/f47534bea44d78a7ee79e4b44ed0e4ff
> >> >>
> >> >>
> >> >> On Mon, Oct 23, 2023 at 11:19 PM Raúl Cumplido wrote:
> >> >> >
> >> >> > Hi,
> >> >> >
> >> >> > I would like to propose the following release candidate (RC2) of
> >> >> > Apache Arrow version 14.0.0. This is a release consisting of 461
> >> >> > resolved GitHub issues[1].
> >> >> >
> >> >> > This release candidate is based on commit:
> >> >> > 2dcee3f82c6cf54b53a64729fd81840efa583244 [2]
> >> >> >
> >> >> > The source release rc2 is hosted at [3].
> >> >> > The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
> >> >> > The changelog is located at [12].
> >> >> >
> >> >> > Please download, verify checksums and signatures, run the unit
> >> >> > tests, and vote on the release. See [13] for how to validate a
> >> >> > release candidate.
> >> >> >
> >> >> > See also a verification result on GitHub pull request [14].
> >> >> >
> >> >> > The vote will be open for at least 72 hours.
> >> >> >
> >> >> > [ ] +1 Release this as Apache Arrow 14.0.0
> >> >> > [ ] +0
> >> >> > [ ] -1 Do not release this as Apache Arrow 14.0.0 because...
> >> >> >
> >> >> > [1]: https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A14.0.0+is%3Aclosed
> >> >> > [2]: https://github.com/apache/arrow/tree/2dcee3f82c6cf54b53a64729fd81840efa583244
> >> >> > [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-14.0.0-rc2
> >> >> > [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> >> >> > [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
> >> >> > [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
> >> >> > [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> >> >> > [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/14.0.0-rc2
> >> >> > [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/14.0.0-rc2
> >> >> > [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/14.0.0-rc2
> >> >> > [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> >> >> > [12]: https://github.com/apache/arrow/blob/2dcee3f82c6cf54b53a64729fd81840efa583244/CHANGELOG.md
> >> >> > [13]: https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
> >> >> > [14]: https://github.com/apache/arrow/p

Re: [ANNOUNCE] New Arrow committer: Xuwei Fu

2023-10-26 Thread Dane Pitkin
Congratulations, Xuwei!

On Thu, Oct 26, 2023 at 9:34 AM Joris Van den Bossche <
jorisvandenboss...@gmail.com> wrote:

> Congrats!
>
> On Wed, 25 Oct 2023 at 08:23, Ian Joiner  wrote:
> >
> > Congrats!
> >
> > On Mon, Oct 23, 2023 at 2:33 AM Sutou Kouhei  wrote:
> >
> > > On behalf of the Arrow PMC, I'm happy to announce that Xuwei Fu
> > > has accepted an invitation to become a committer on Apache
> > > Arrow. Welcome, and thank you for your contributions!
> > >
> > > --
> > > kou
> > >
>


Re: Request for Assistance with MATLAB CI Workflow Issue

2023-10-24 Thread Dane Pitkin
Hey Divyansh,

I suggest discussing this on the PR[1] itself. You will get the best
discussion there, where comments can be added to the code review directly.
It looks like there is a MATLAB reviewer who has already left feedback
that may be helpful.

[1]https://github.com/apache/arrow/pull/38274

On Tue, Oct 24, 2023 at 2:28 PM Divyansh Khatri 
wrote:

> I am currently working on GitHub Issue #38211, titled "[MATLAB] Add support
> for creating an empty arrow.tabular.RecordBatch by calling
> arrow.recordBatch with no input arguments." As a beginner in MATLAB
> development, I am facing challenges with the failing MATLAB CI workflow
> associated with this issue. We need to ensure that all the appropriate test
> cases pass as expected.
> If you have any insights or suggestions on how to fix this issue, please
> share them with me.
>
> -Divyansh
>


Re: [ANNOUNCE] New Arrow committer: Curt Hagenlocher

2023-10-16 Thread Dane Pitkin
Congrats Curt!

On Mon, Oct 16, 2023 at 12:00 PM Kevin Gurney 
wrote:

> Congratulations, Curt!
> 
> From: Weston Pace 
> Sent: Sunday, October 15, 2023 5:32 PM
> To: dev@arrow.apache.org 
> Subject: Re: [ANNOUNCE] New Arrow committer: Curt Hagenlocher
>
> Congratulations!
>
> On Sun, Oct 15, 2023, 8:51 AM Gang Wu  wrote:
>
> > Congrats!
> >
> > On Sun, Oct 15, 2023 at 10:49 PM David Li  wrote:
> >
> > > Congrats & welcome Curt!
> > >
> > > On Sun, Oct 15, 2023, at 09:03, wish maple wrote:
> > > > Congratulations!
> > > >
> > > > On Sun, Oct 15, 2023 at 20:48 Raúl Cumplido wrote:
> > > >
> > > >> Congratulations and welcome!
> > > >>
> > > >> On Sun, Oct 15, 2023, 13:57, Ian Cook wrote:
> > > >>
> > > >> > Congratulations Curt!
> > > >> >
> > > >> > On Sun, Oct 15, 2023 at 05:32 Andrew Lamb wrote:
> > > >> >
> > > >> > > On behalf of the Arrow PMC, I'm happy to announce that Curt
> > > >> > > Hagenlocher has accepted an invitation to become a committer on
> > > >> > > Apache Arrow. Welcome, and thank you for your contributions!
> > > >> > >
> > > >> > > Andrew
> > > >> > >
> > > >> >
> > > >>
> > >
> >
>


Re: [ANNOUNCE] New Arrow PMC member: Jonathan Keane

2023-10-16 Thread Dane Pitkin
Congrats Jon!!

On Mon, Oct 16, 2023 at 7:04 AM Krisztián Szűcs 
wrote:

> Congrats Jon!
>
> On Mon, Oct 16, 2023 at 11:20 AM Alenka Frim
>  wrote:
> >
> > Yay, congratulations Jon!!
> >
> > On Mon, Oct 16, 2023 at 10:27 AM vin jake  wrote:
> >
> > > Congrats Jon!
> > >
> > > On Sun, Oct 15, 2023 at 1:25 AM Andrew Lamb wrote:
> > >
> > > > The Project Management Committee (PMC) for Apache Arrow has invited
> > > > Jonathan Keane to become a PMC member and we are pleased to announce
> > > > that Jonathan Keane has accepted.
> > > >
> > > > Congratulations and welcome!
> > > >
> > > > Andrew
> > > >
> > >
>


Re: [Java][Discuss]: consensus for JDK 8 deprecation

2023-10-12 Thread Dane Pitkin
Thanks all for the input. It sounds like it is NOT feasible to maintain a
legacy branch. So when Arrow drops Java 8 (and 11?) support, we should
consider it EOL from Arrow's perspective.

+1 to Fokko's suggestion on waiting. I propose we commit to dropping Java 8
and 11 from Arrow in tandem with the Spark 4 release. Spark 4 has a planned
release date of 2024-06[1]. I also propose that we re-discuss closer to that date
as a final evaluation. This was also discussed in the Arrow bi-weekly sync
where waiting for Spark 4 was agreed upon.

There is a lot of pre-work we can do to prepare Arrow Java for new features
that will become available.

[1]https://lists.apache.org/thread/xhkgj60j361gdpywoxxz7qspp2w80ry6

-Dane

On Wed, Oct 11, 2023 at 2:50 AM Fokko Driesprong  wrote:

> Hey everyone,
>
> Great to bring this up. I can speak for the Iceberg community, and we
> expect to support Java 8 for a long time there (unfortunately). Let me go
> over some of the arguments here.
>
> > Java 8 does have a long Extended Support timeline, but a recent
> > report shows Java 11 increasing in adoption vs Java 8. "More than 56% of
> > applications are now using Java 11 in production (up from 48% in 2022 and
> > 11% in 2020). Java 8 is a close second with nearly 33% of applications
> > using it in production (down from 46% in 2022)."[2]
>
>
> I think this is skewed, mostly because it is easy to upgrade a Spring
> application to the latest version of Java, but if you're tied to the
> Hadoop ecosystem, things are moving slowly.
>
> > 2. Unblock Arrow from upgrading dependencies that no longer support Java 8.
>
>
> Following the links, I noticed the dependencies are around tests (Mockito)
> and static error checking (error-prone). Those sound like nice-to-haves to
> me.
>
> An example: Apache Thrift dropped Java 8 a while ago but added the support
> again since it was breaking the ecosystem
> <https://github.com/apache/thrift/pull/2785>. Thrift is used in Parquet,
> and in Parquet we cannot yet drop Java 8 support. I think of Arrow as a
> low-level library, like Thrift and Iceberg, and I think it makes sense to
> serve as wide an audience as possible (within reasonable bounds of course).
>
> I would at least wait until the release of Spark 4. My experience is that
> nobody is really eager to do backporting to older versions, and for me, I
> still think the gains of dropping Java 8 support are not that big.
>
> Kind regards,
> Fokko Driesprong
>
> On Wed, Oct 11, 2023 at 06:05 Jacob Wujciak-Jens wrote:
>
> > > I cannot estimate the effort to backport large features like the new
> > layouts that are currently being added (e.g. RunEndEncoding, ListView,
> > etc.).
> >
> > In my mind we are only talking about patch releases for security fixes or
> > similarly critical issues as otherwise the effort to maintain 'v14' (but
> > actually arrow-latest) would surely overshadow any gains made by
> > deprecating jdk 8?
> >
> > On Wed, Oct 11, 2023 at 3:31 AM Gang Wu  wrote:
> >
> > > I agree that we have to move on. It seems that patch release to
> > > Arrow v14 is a good idea, though I cannot estimate the effort to
> > > backport large features like the new layouts that are currently
> > > being added (e.g. RunEndEncoding, ListView, etc.).
> > >
> > > As an Arrow developer, I am always happy to drop JDK 8. My
> > > employer has leveraged Apache Arrow in the internal engine
> > > and depends on Arrow Java in the Java SDK. For end users
> > > who cannot get away from JDK 8, we might need to prepare
> > > different Java SDKs and use features that are available in
> > > Arrow v14, or let the server side choose which subset of
> > > features to offer based on the SDK version.
> > >
> > > Thanks,
> > > Gang
> > >
> > >
> > >
> > > On Wed, Oct 11, 2023 at 12:40 AM Dane Pitkin wrote:
> > >
> > > > To summarize the discussion so far:
> > > >
> > > > * Some Arrow Java users are still on JDK 8
> > > > * Arrow v14 is proposed as the final version with JDK 8 support
> > > > * Arrow v14 can support patch releases if necessary for JDK 8 users
> > > > * There is an open question to decide if JDK 11 should be dropped
> > > > simultaneously
> > > >
> > > > Gang Wu, I'm curious what are your thoughts given your initial
> > > > concerns?
> > > >
> > > > -Dane
> > > >
> > > > On Sat, Oct 7, 2023 at 12:00 AM Jacob Wujciak-Jens
> > > >  wrote:

Re: [Java][Discuss]: consensus for JDK 8 deprecation

2023-10-10 Thread Dane Pitkin
To summarize the discussion so far:

* Some Arrow Java users are still on JDK 8
* Arrow v14 is proposed as the final version with JDK 8 support
* Arrow v14 can support patch releases if necessary for JDK 8 users
* There is an open question to decide if JDK 11 should be dropped
simultaneously

Gang Wu, I'm curious what are your thoughts given your initial concerns?

-Dane

On Sat, Oct 7, 2023 at 12:00 AM Jacob Wujciak-Jens
 wrote:

> From a release engineer perspective (without java knowledge) I agree with
> Micah, I'd rather make a patch release for an older version if needed but
> modernize the codebase and simplify CI!
>
>
> On Sat, Oct 7, 2023 at 5:27 AM Micah Kornfield 
> wrote:
>
> > I think given the stability of Arrow Java, dropping support probably
> > makes sense.  If a bug comes up or consumers really need new features we
> > can always make a patch release of an older version.
> >
> > On Thu, Oct 5, 2023 at 3:13 PM Dane Pitkin wrote:
> >
> > > I also learned today that Apache Spark has dropped support for Java 8
> > > and 11 for their next release (v4.0)[1]. Should we consider dropping
> > > Java 11 as well?
> > >
> > > [1]https://github.com/apache/spark/pull/43005
> > >
> > > -Dane
> > >
> > > On Thu, Oct 5, 2023 at 3:30 PM Dane Pitkin wrote:
> > >
> > > > I created a GH issue[1] proposing the removal of Java 8 support. It
> > > > would target the Arrow v15 release (~Jan 2024).
> > > >
> > > > IMO it would be in the best interest of the project for two major
> > > > reasons:
> > > > 1. Unblock the Java Platform Module System (JPMS)[2] implementation.
> > > > 2. Unblock Arrow from upgrading dependencies that no longer support
> > > > Java 8. (See [1] for examples)
> > > >
> > > > Since Arrow Java has been quite stable, will Java 8 users be okay with
> > > > pinning Arrow to the last supported release (v14) if the Arrow project
> > > > ultimately decides to remove Java 8 support?
> > > >
> > > >
> > > > [1]https://github.com/apache/arrow/issues/38051
> > > > [2]https://en.wikipedia.org/wiki/Java_Platform_Module_System
> > > >
> > > > -Dane
> > > >
> > > > On Fri, Sep 15, 2023 at 12:26 PM Dane Pitkin wrote:
> > > >
> > > > >>> - As a low level library, users have to add specific flags to use
> > > > >>>  Java 9 and up with Arrow to resolve issues with java.nio. This has
> > > > >>>  been annoying for our customers constantly. If this is not resolved,
> > > > >>>  I would say we may see a lot of complaints in the future.
> > > > >>>
> > > > >> I filed issue 37739[1] to track this, but it sounds like this can't be
> > > > >> changed until Java 21 or 24.
> > > > >>
> > > > >>> - It seems that the EOL of Java 8 from Oracle is Dec 2030 [2]. A lot of
> > > > >>>  users will still stay on it for a long time. At least this is true for
> > > > >>>  our customers. So I am afraid we may not upgrade to newer versions
> > > > >>>  of Arrow if it no longer supports Java 8.
> > > > >>>
> > > > >> Java 8 does have a long Extended Support timeline, but a recent
> > > > >> report shows Java 11 increasing in adoption vs Java 8. "More than 56% of
> > > > >> applications are now using Java 11 in production (up from 48% in 2022 and
> > > > >> 11% in 2020). Java 8 is a close second with nearly 33% of applications
> > > > >> using it in production (down from 46% in 2022)."[2]
> > > > >> I expect the Java ecosystem will find a way to move on from Java 8 much
> > > > >> sooner than 2030, meaning many of Arrow's dependencies could drop support
> > > > >> for Java 8 before then. At this point, Arrow may be forced to support a
> > > > >> higher minimum Java version.
> > > > >>
> > > > >> That being said, it's hard to argue against real use cases. I'd be
> > > > >> curious to hear what Java version other users of Arrow are using (and if
> > > > >> there is a timeline to upgrade if on Java 8).
> > > >>
> > > >>
> > > >> [1]https://github.com/apache/arrow/issues/37739
> >

Re: [Java][Discuss]: consensus for JDK 8 deprecation

2023-10-05 Thread Dane Pitkin
I also learned today that Apache Spark has dropped support for Java 8 and
11 for their next release (v4.0)[1]. Should we consider dropping Java 11 as
well?

[1]https://github.com/apache/spark/pull/43005

-Dane

On Thu, Oct 5, 2023 at 3:30 PM Dane Pitkin  wrote:

> I created a GH issue[1] proposing the removal of Java 8 support. It
> would target the Arrow v15 release (~Jan 2024).
>
> IMO it would be in the best interest of the project for two major reasons:
> 1. Unblock the Java Platform Module System (JPMS)[2] implementation.
> 2. Unblock Arrow from upgrading dependencies that no longer support Java
> 8. (See [1] for examples)
>
> Since Arrow Java has been quite stable, will Java 8 users be okay with
> pinning Arrow to the last supported release (v14) if the Arrow project
> ultimately decides to remove Java 8 support?
>
>
> [1]https://github.com/apache/arrow/issues/38051
> [2]https://en.wikipedia.org/wiki/Java_Platform_Module_System
>
> -Dane
>
> On Fri, Sep 15, 2023 at 12:26 PM Dane Pitkin  wrote:
>
> >>> - As a low level library, users have to add specific flags to use
> >>>  Java 9 and up with Arrow to resolve issues with java.nio. This has
> >>>  been annoying for our customers constantly. If this is not resolved,
> >>>  I would say we may see a lot of complaints in the future.
>>>
>> I filed issue 37739[1] to track this, but it sounds like this can't be
>> changed until Java 21 or 24.
>>
> >>> - It seems that the EOL of Java 8 from Oracle is Dec 2030 [2]. A lot of
> >>>  users will still stay on it for a long time. At least this is true for
> >>>  our customers. So I am afraid we may not upgrade to newer versions
> >>>  of Arrow if it no longer supports Java 8.
>>>
>> Java 8 does have a long Extended Support timeline, but a recent
>> report shows Java 11 increasing in adoption vs Java 8. "More than 56% of
>> applications are now using Java 11 in production (up from 48% in 2022 and
>> 11% in 2020). Java 8 is a close second with nearly 33% of applications
>> using it in production (down from 46% in 2022)."[2]
>> I expect the Java ecosystem will find a way to move on from Java 8 much
>> sooner than 2030, meaning many of Arrow's dependencies could drop support
>> for Java 8 before then. At this point, Arrow may be forced to support a
>> higher minimum Java version.
>>
>> That being said, it's hard to argue against real use cases. I'd be
>> curious to hear what Java version other users of Arrow are using (and if
>> there is a timeline to upgrade if on Java 8).
>>
>>
>> [1]https://github.com/apache/arrow/issues/37739
>> [2]
>> https://newrelic.com/sites/default/files/2023-04/new-relic-2023-state-of-the-java-ecosystem-2023-04-20.pdf
>>
>>
>> -Dane
>>
>>
>> On Thu, Sep 14, 2023 at 11:45 AM Gang Wu  wrote:
>>
>>> Thanks for bringing this up!
>>>
> >>> I have two concerns about dropping Java 8 support:
>>> - As a low level library, users have to add specific flags [1] to use
>>>  Java 9 and up with Arrow to resolve issues with java.nio. This has
>>>  been annoying for our customers constantly. If this is not resolved,
>>>  I would say we may see a lot of complaints in the future.
> >>> - It seems that the EOL of Java 8 from Oracle is Dec 2030 [2]. A lot of
>>>  users will still stay on it for a long time. At least this is true for
>>> our
>>>  customers. So I am afraid we may not upgrade to newer versions
>>>  of Arrow if it no longer supports Java 8.
>>>
>>> [1] https://arrow.apache.org/docs/java/install.html#java-compatibility
>>> [2]
>>> https://www.oracle.com/java/technologies/java-se-support-roadmap.html
>>>
>>> Best,
>>> Gang
>>>
>>>
>>>
>>> On Thu, Sep 14, 2023 at 11:14 PM David Dali Susanibar Arce <
>>> davi.sar...@gmail.com> wrote:
>>>
>>> > Hi Arrow Java developers,
>>> >
>>> > I would like to propose a timeline for dropping support for Java 8:
>>> > - Propose to drop JDK8 in Arrow v15 (2 releases from now)
>>> > - JDK 21 support will be added before removal of JDK8
>>> >
>>> > Why?
>>> > - Java 8 no longer receives Premier Support (1)
>>> > - Some Arrow Java (test) dependencies have already started to drop
> >>> > Java 8 support, forcing us to pin to older package versions
>>> >
>>> > Also note:
>>> > - gRPC Java may drop support for a JDK version when that version is no
>>> > l

Re: [Java][Discuss]: consensus for JDK 8 deprecation

2023-10-05 Thread Dane Pitkin
I created a GH issue[1] proposing the removal of Java 8 support. It
would target the Arrow v15 release (~Jan 2024).

IMO it would be in the best interest of the project for two major reasons:
1. Unblock the Java Platform Module System (JPMS)[2] implementation.
2. Unblock Arrow from upgrading dependencies that no longer support Java 8.
(See [1] for examples)

Since Arrow Java has been quite stable, will Java 8 users be okay with
pinning Arrow to the last supported release (v14) if the Arrow project
ultimately decides to remove Java 8 support?


[1]https://github.com/apache/arrow/issues/38051
[2]https://en.wikipedia.org/wiki/Java_Platform_Module_System

-Dane

On Fri, Sep 15, 2023 at 12:26 PM Dane Pitkin  wrote:

>> - As a low level library, users have to add specific flags to use
>>  Java 9 and up with Arrow to resolve issues with java.nio. This has
>>  been annoying for our customers constantly. If this is not resolved,
>>  I would say we may see a lot of complaints in the future.
>>
> I filed issue 37739[1] to track this, but it sounds like this can't be
> changed until Java 21 or 24.
>
>> - It seems that the EOL of Java 8 from Oracle is Dec 2030 [2]. A lot of
>>  users will still stay on it for a long time. At least this is true for
>> our
>>  customers. So I am afraid we may not upgrade to newer versions
>>  of Arrow if it no longer supports Java 8.
>>
> Java 8 does have a long Extended Support timeline, but a recent
> report shows Java 11 increasing in adoption vs Java 8. "More than 56% of
> applications are now using Java 11 in production (up from 48% in 2022 and
> 11% in 2020). Java 8 is a close second with nearly 33% of applications
> using it in production (down from 46% in 2022)."[2]
> I expect the Java ecosystem will find a way to move on from Java 8 much
> sooner than 2030, meaning many of Arrow's dependencies could drop support
> for Java 8 before then. At this point, Arrow may be forced to support a
> higher minimum Java version.
>
> That being said, it's hard to argue against real use cases. I'd be curious
> to hear what Java version other users of Arrow are using (and if there is a
> timeline to upgrade if on Java 8).
>
>
> [1]https://github.com/apache/arrow/issues/37739
> [2]
> https://newrelic.com/sites/default/files/2023-04/new-relic-2023-state-of-the-java-ecosystem-2023-04-20.pdf
>
>
> -Dane
>
>
> On Thu, Sep 14, 2023 at 11:45 AM Gang Wu  wrote:
>
>> Thanks for bringing this up!
>>
>> I have two concerns about dropping Java 8 support:
>> - As a low level library, users have to add specific flags [1] to use
>>  Java 9 and up with Arrow to resolve issues with java.nio. This has
>>  been annoying for our customers constantly. If this is not resolved,
>>  I would say we may see a lot of complaints in the future.
>> - It seems that the EOL of Java 8 from Oracle is Dec 2030 [2]. A lot of
>>  users will still stay on it for a long time. At least this is true for
>> our
>>  customers. So I am afraid we may not upgrade to newer versions
>>  of Arrow if it no longer supports Java 8.
>>
>> [1] https://arrow.apache.org/docs/java/install.html#java-compatibility
>> [2] https://www.oracle.com/java/technologies/java-se-support-roadmap.html
>>
>> Best,
>> Gang
>>
>>
>>
>> On Thu, Sep 14, 2023 at 11:14 PM David Dali Susanibar Arce <
>> davi.sar...@gmail.com> wrote:
>>
>> > Hi Arrow Java developers,
>> >
>> > I would like to propose a timeline for dropping support for Java 8:
>> > - Propose to drop JDK8 in Arrow v15 (2 releases from now)
>> > - JDK 21 support will be added before removal of JDK8
>> >
>> > Why?
>> > - Java 8 no longer receives Premier Support (1)
>> > - Some Arrow Java (test) dependencies have already started to drop
>> > Java 8 support, forcing us to pin to older package versions
>> >
>> > Also note:
>> > - gRPC Java may drop support for a JDK version when that version is no
>> > longer receiving Premier Support from Oracle (2), more detail at Java
>> > 8 / Java 11 support timeline in gRPC here (3)
>> > - Spark plans to tentatively drop JDK 8 support in Spark 4.0 (4),
>> > which has a release timeline of approximately 2024-06 (5). Is it fine
>> > for us to drop JDK 8 support before Spark?
>> >
>> > (1)
>> https://www.oracle.com/java/technologies/java-se-support-roadmap.html
>> > (2)
>> >
>> https://github.com/grpc/proposal/pull/283/files#:~:text=gRPC%20Java%20may,support%5D
>> > .
>> > (3) https://groups.google.com/g/grpc-io/c/-XK6Kd_19YQ/m/-4s07TzdAgAJ
>> > (4) https://issues.apache.org/jira/browse/SPARK-44112
>> > (5) https://www.mail-archive.com/dev@spark.apache.org/msg30460.html
>> >
>> > Consider:
>> > - JDK8 deprecation is currently not mandatory. We simply want to
>> > devote more time to development of Java LTS versions 11, 17 and 21.
>> > - Java 11 is dropping Premier Support this month.
>> >
>> > Best regards,
>> >
>> > --
>> > David Susanibar
>> >
>>
>


Re: [VOTE] Release Apache Arrow nanoarrow 0.3.0 - RC0

2023-09-26 Thread Dane Pitkin
+1 (non-binding)

I verified successfully on macOS 13.5 (aarch64) with:

cd dev/release && ./verify-release-candidate.sh 0.3.0 0
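
The script covers the whole verification, but the checksum step that the vote email asks for can also be sketched by hand. This is a minimal Python sketch, not the project's verification logic: it assumes the `.sha512` file uses the common `<hex digest>  <filename>` layout, the function names are hypothetical, and it does not check the GPG signature.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large tarballs are not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact, checksum_file):
    """Compare an artifact against the first token of its checksum file.

    Assumption: the checksum file starts with the hex digest, optionally
    followed by whitespace and the artifact's file name.
    """
    expected = Path(checksum_file).read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

Calling `checksum_matches("some-release.tar.gz", "some-release.tar.gz.sha512")` (placeholder file names) returns `True` only when the digests agree.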



On Tue, Sep 26, 2023 at 5:30 PM Sutou Kouhei  wrote:

> +1
>
> I ran the following command line on Debian GNU/Linux sid:
>
>   CMAKE_PREFIX_PATH=/tmp/local \
> dev/release/verify-release-candidate.sh 0.3.0 0
>
> with:
>
>   * Apache Arrow C++ main
>   * gcc (Debian 13.2.0-4) 13.2.0
>   * R version 4.3.1 (2023-06-16) -- "Beagle Scouts"
>
> Thanks,
> --
> kou
>
> In 
>   "[VOTE] Release Apace Arrow nanoarrow 0.3.0 - RC0" on Tue, 26 Sep 2023
> 12:23:52 -0300,
>   Dewey Dunnington  wrote:
>
> > Hello,
> >
> > I would like to propose the following release candidate (rc0) of
> > Apache Arrow nanoarrow [0] version 0.3.0. This is an initial release
> > consisting of 42 resolved GitHub issues from 4 contributors [1].
> >
> > This release candidate is based on commit:
> > c00cd7707bcddb4dab9a7d19bf63e87c06d36c63 [2]
> >
> > The source release rc0 is hosted at [3].
> > The changelog is located at [4].
> >
> > Please download, verify checksums and signatures, run the unit tests,
> > and vote on the release. See [5] for how to validate a release
> > candidate.
> >
> > See also a successful suite of verification runs at [6].
> >
> > The vote will be open for at least 72 hours.
> >
> > [ ] +1 Release this as Apache Arrow nanoarrow 0.3.0
> > [ ] +0
> > [ ] -1 Do not release this as Apache Arrow nanoarrow 0.3.0 because...
> >
> > [0] https://github.com/apache/arrow-nanoarrow
> > [1] https://github.com/apache/arrow-nanoarrow/milestone/3?closed=1
> > [2]
> https://github.com/apache/arrow-nanoarrow/tree/apache-arrow-nanoarrow-0.3.0-rc0
> > [3]
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-nanoarrow-0.3.0-rc0/
> > [4]
> https://github.com/apache/arrow-nanoarrow/blob/apache-arrow-nanoarrow-0.3.0-rc0/CHANGELOG.md
> > [5]
> https://github.com/apache/arrow-nanoarrow/blob/main/dev/release/README.md
> > [6] https://github.com/apache/arrow-nanoarrow/actions/runs/6314579940
>


Re: [VOTE] Release Apache Arrow ADBC 0.7.0 - RC0

2023-09-20 Thread Dane Pitkin
+1 (non-binding)

I verified it successfully on macOS/aarch64 with:

$ export DOCKER_DEFAULT_PLATFORM=linux/amd64 \
    && ARROW_TMPDIR=/tmp/adbc-0.7.0 USE_CONDA=1 \
    ./dev/release/verify-release-candidate.sh 0.7.0 0


On Wed, Sep 20, 2023 at 1:04 PM David Li  wrote:

> Hello,
>
> I would like to propose the following release candidate (RC0) of Apache
> Arrow ADBC version 0.7.0. This is a release consisting of 50 resolved
> GitHub issues [1].
>
> This release candidate is based on commit:
> efb72b4729e0f99c7d1f6723c1a966e011fa478f [2]
> This is the first release using API specification 1.1.0.
>
> The source release rc0 is hosted at [3].
> The binary artifacts are hosted at [4][5][6][7][8].
> The changelog is located at [9].
>
> Please download, verify checksums and signatures, run the unit tests, and
> vote on the release. See [10] for how to validate a release candidate.
>
> See also a verification result on GitHub Actions [11].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow ADBC 0.7.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow ADBC 0.7.0 because...
>
> Note: to verify APT/YUM packages on macOS/AArch64, you must `export
> DOCKER_DEFAULT_PLATFORM=linux/amd64`. (Or skip this step by `export
> TEST_APT=0 TEST_YUM=0`.)
>
> [1]:
> https://github.com/apache/arrow-adbc/issues?q=is%3Aissue+milestone%3A%22ADBC+Libraries+0.7.0%22+is%3Aclosed
> [2]:
> https://github.com/apache/arrow-adbc/commit/efb72b4729e0f99c7d1f6723c1a966e011fa478f
> [3]:
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-adbc-0.7.0-rc0/
> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> [5]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> [6]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> [7]:
> https://repository.apache.org/content/repositories/staging/org/apache/arrow/adbc/
> [8]:
> https://github.com/apache/arrow-adbc/releases/tag/apache-arrow-adbc-0.7.0-rc0
> [9]:
> https://github.com/apache/arrow-adbc/blob/apache-arrow-adbc-0.7.0-rc0/CHANGELOG.md
> [10]:
> https://arrow.apache.org/adbc/main/development/releasing.html#how-to-verify-release-candidates
> [11]: https://github.com/apache/arrow-adbc/actions/runs/6251522630
>


Re: [Java][Discuss]: consensus for JDK 8 deprecation

2023-09-15 Thread Dane Pitkin
>
> - As a low level library, users have to add specific flags to use
>  Java 9 and up with Arrow to resolve issues with java.nio. This has
>  been annoying for our customers constantly. If this is not resolved,
>  I would say we may see a lot of complaints in the future.
>
I filed issue 37739[1] to track this, but it sounds like this can't be
changed until Java 21 or 24.
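
For anyone launching the JVM from a wrapper script in the meantime, here is a minimal Python sketch of deciding when the extra flag is needed. It assumes the `--add-opens` flag described in the Arrow Java install docs that Gang linked; the helper names are hypothetical, and the exact flag may differ between Arrow releases, so check the docs for the version you use.

```python
import re
from typing import List

# Assumption: flag as described in the Arrow Java install docs for Java 9+.
ADD_OPENS_NIO = "--add-opens=java.base/java.nio=ALL-UNNAMED"


def parse_java_major(version: str) -> int:
    """Extract the major version from strings like '1.8.0_382' or '17.0.8'."""
    numbers = [int(n) for n in re.findall(r"\d+", version)]
    if not numbers:
        raise ValueError(f"unrecognized Java version: {version!r}")
    # Java 8 and earlier report themselves as 1.x (e.g. 1.8.0_382).
    if numbers[0] == 1 and len(numbers) > 1:
        return numbers[1]
    return numbers[0]


def arrow_jvm_flags(version: str) -> List[str]:
    """Return the extra JVM flags to pass when running Arrow on this Java."""
    return [ADD_OPENS_NIO] if parse_java_major(version) >= 9 else []
```

A launcher could then build its command line as `["java", *arrow_jvm_flags(version), "-jar", "app.jar"]`, where `app.jar` is a placeholder.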

> - It seems that the EOL of Java 8 from Oracle is Dec 2030 [2]. A lot of
>  users will still stay on it for a long time. At least this is true for our
>  customers. So I am afraid we may not upgrade to newer versions
>  of Arrow if it no longer supports Java 8.
>
Java 8 does have a long Extended Support timeline, but a recent
report shows Java 11 increasing in adoption vs Java 8. "More than 56% of
applications are now using Java 11 in production (up from 48% in 2022 and
11% in 2020). Java 8 is a close second with nearly 33% of applications
using it in production (down from 46% in 2022)."[2]
I expect the Java ecosystem will find a way to move on from Java 8 much
sooner than 2030, meaning many of Arrow's dependencies could drop support
for Java 8 before then. At that point, Arrow may be forced to support a
higher minimum Java version.

That being said, it's hard to argue against real use cases. I'd be curious
to hear what Java version other users of Arrow are using (and if there is a
timeline to upgrade if on Java 8).


[1]https://github.com/apache/arrow/issues/37739
[2]
https://newrelic.com/sites/default/files/2023-04/new-relic-2023-state-of-the-java-ecosystem-2023-04-20.pdf


-Dane


On Thu, Sep 14, 2023 at 11:45 AM Gang Wu  wrote:

> Thanks for bringing this up!
>
> I have two concerns about dropping Java 8 support:
> - As a low level library, users have to add specific flags [1] to use
>  Java 9 and up with Arrow to resolve issues with java.nio. This has
>  been annoying for our customers constantly. If this is not resolved,
>  I would say we may see a lot of complaints in the future.
> - It seems that the EOL of Java 8 from Oracle is Dec 2030 [2]. A lot of
>  users will still stay on it for a long time. At least this is true for our
>  customers. So I am afraid we may not upgrade to newer versions
>  of Arrow if it no longer supports Java 8.
>
> [1] https://arrow.apache.org/docs/java/install.html#java-compatibility
> [2] https://www.oracle.com/java/technologies/java-se-support-roadmap.html
>
> Best,
> Gang
>
>
>
> On Thu, Sep 14, 2023 at 11:14 PM David Dali Susanibar Arce <
> davi.sar...@gmail.com> wrote:
>
> > Hi Arrow Java developers,
> >
> > I would like to propose a timeline for dropping support for Java 8:
> > - Propose to drop JDK8 in Arrow v15 (2 releases from now)
> > - JDK 21 support will be added before removal of JDK8
> >
> > Why?
> > - Java 8 no longer receives Premier Support (1)
> > - Some Arrow Java (test) dependencies have already started to drop
> > Java 8 support, forcing us to pin to older package versions
> >
> > Also note:
> > - gRPC Java may drop support for a JDK version when that version is no
> > longer receiving Premier Support from Oracle (2), more detail at Java
> > 8 / Java 11 support timeline in gRPC here (3)
> > - Spark plans to tentatively drop JDK 8 support in Spark 4.0 (4),
> > which has a release timeline of approximately 2024-06 (5). Is it fine
> > for us to drop JDK 8 support before Spark?
> >
> > (1)
> https://www.oracle.com/java/technologies/java-se-support-roadmap.html
> > (2)
> >
> https://github.com/grpc/proposal/pull/283/files#:~:text=gRPC%20Java%20may,support%5D
> > .
> > (3) https://groups.google.com/g/grpc-io/c/-XK6Kd_19YQ/m/-4s07TzdAgAJ
> > (4) https://issues.apache.org/jira/browse/SPARK-44112
> > (5) https://www.mail-archive.com/dev@spark.apache.org/msg30460.html
> >
> > Consider:
> > - JDK8 deprecation is currently not mandatory. We simply want to
> > devote more time to development of Java LTS versions 11, 17 and 21.
> > - Java 11 is dropping Premier Support this month.
> >
> > Best regards,
> >
> > --
> > David Susanibar
> >
>


Re: [Python][Discuss] PyArrow Dataset as a Python protocol

2023-09-01 Thread Dane Pitkin
The Python Substrait package[1] is on PyPI[2] and currently has Python
wrappers for the Substrait protobuf objects. I think this will be a great
opportunity to identify helper features that users of this protocol would
like to see. I'll be keeping an eye out as this develops, but also feel
free to file feature requests in the project!


[1]https://github.com/substrait-io/substrait-python
[2]https://pypi.org/project/substrait/


On Thu, Aug 31, 2023 at 10:05 PM Will Jones  wrote:

> Hello Arrow devs,
>
> We discussed this further in the Arrow community call on 2023-08-30 [1],
> and concluded we should create an entirely new protocol that uses Substrait
> expressions. I have created an issue [2] to track this and will start a PR
> soon.
>
> It does look like we might block this on creating a PyCapsule based
> protocol for arrays, schemas, and streams. That is tracked here [3].
> Hopefully that isn't too ambitious :)
>
> Best,
>
> Will Jones
>
>
> [1]
>
> https://docs.google.com/document/d/1xrji8fc6_24TVmKiHJB4ECX1Zy2sy2eRbBjpVJMnPmk/edit
> [2] https://github.com/apache/arrow/issues/37504
> [3] https://github.com/apache/arrow/issues/35531
>
>
> On Tue, Aug 29, 2023 at 2:59 PM Ian Cook  wrote:
>
> > An update about this:
> >
> > Weston's PR https://github.com/apache/arrow/pull/34834/ merged last
> > week. This makes it possible to convert PyArrow expressions to/from
> > Substrait expressions.
> >
> > As Fokko previously noted, the PR does not change the PyArrow Dataset
> > interface at all. It simply enables a Substrait expression to be
> > converted to a PyArrow expression, which can then be used to
> > filter/project a Dataset.
> >
> > There is a basic example here demonstrating this:
> > https://gist.github.com/ianmcook/f70fc185d29ae97bdf85ffe0378c68e0
> >
> > We might now consider whether to build upon this to create a Dataset
> > protocol that is independent of the PyArrow Expression implementation
> > and that could interoperate across languages.
> >
> > Ian
> >
> > On Mon, Jul 3, 2023 at 5:48 PM Will Jones 
> wrote:
> > >
> > > Hello,
> > >
> > > After thinking about it, I think I understand the approach David Li and
> > Ian
> > > are suggesting with respect to expressions. There will be some
> arguments
> > > that only PyArrow's own datasets support, but that aren't in the
> generic
> > > protocol. Passing
> > > PyArrow expressions to the filters argument should be considered one of
> > > those. DuckDB and others are currently passing them down, so they
> aren't
> > > yet using the protocol properly. But once we add support in the
> protocol
> > > for passing filters via Substrait expressions, we'll move DuckDB and
> > others
> > > over to be fully compliant with the protocol.
> > >
> > > It's a bit of an awkward temporary state for now, but so would having
> > > PyArrow expressions in the protocol just to be deprecated in a few
> > months.
> > > One caveat is that we'll need to provide DuckDB and other consumers
> with
> > a
> > > way to tell whether the dataset supports passing filters as Substrait
> > > expression or PyArrow ones, since I doubt they'll want to lose support
> > for
> > > integrating with older PyArrow versions.
> > >
> > > I've removed filters from the protocol for now, with the intention of
> > > bringing them back as soon as we can get Substrait support. I think we
> > can
> > > do this in the 14.0.0 release.
> > >
> > > Best,
> > >
> > > Will Jones
> > >
> > >
> > > On Mon, Jul 3, 2023 at 7:45 AM Fokko Driesprong 
> > wrote:
> > >
> > > > Hey everyone,
> > > >
> > > > Chiming in here from the PyIceberg side. I would love to see the
> > protocol
> > > > as proposed in the PR. I did a small test
> > > > <
> > https://github.com/apache/arrow/pull/35568#pullrequestreview-1480259722
> >,
> > > > and it seems to be quite straightforward to implement and it brings a
> > lot
> > > > of potential. Unsurprisingly, I am leaning toward the first option:
> > > >
> > > > 1. We keep PyArrow expressions in the API initially, but once we have
> > > > > Substrait-based alternatives we deprecate the PyArrow expression
> > support.
> > > > > This is what I intended with the current design, and I think it
> > provides
> > > > > the most obvious migration paths for existing producers and
> > consumers.
> > > >
> > > >
> > > > Let me give my vision on some of the concerns raised.
> > > >
> > > > Will, I see that you've already addressed this issue to some extent
> in
> > > > > your proposal. For example, you mention that we should initially
> > > > > define this protocol to include only a minimal subset of the
> Dataset
> > > > > API. I agree, but I think there are some loose ends we should be
> > > > > careful to tie up. I strongly agree with the comments made by
> David,
> > > > > Weston, and Dewey arguing that we should avoid any use of PyArrow
> > > > > expressions in this API. Expressions are an implementation detail
> of
> > > > > PyArrow, not a part of the Arrow standard. It would be much safer
> fo

Re: [VOTE] Release Apache Arrow 13.0.0 - RC2

2023-08-02 Thread Dane Pitkin
During the Arrow community meeting today, we decided to include the fix for
the wide-dataframe performance regression due to its severity. The ETA is
end-of-the-week for the patch. You can review the meeting notes here[1].

[1]
https://docs.google.com/document/d/1xrji8fc6_24TVmKiHJB4ECX1Zy2sy2eRbBjpVJMnPmk/edit#heading=h.k1ts4kvvl8jq

On Wed, Aug 2, 2023 at 12:43 PM Austin Dickey
 wrote:

> Hi all,
>
> We have generated a new report comparing the Python and R benchmark results
> from the contender 13.0.0 RC2 to the baseline 12.0.1 RC1 on the
> ursa-i9-9960x machine:
>
>
> http://crossbow.voltrondata.com/release_reports/arrow-release-report-13.0.0-rc2.html
>
> To dive deeper into the data, here is the same comparison on Conbench:
>
>
> https://conbench.ursa.dev/compare/runs/b5871a9145e545409dd169ec30f17384...833d1302bd98477caaaf2fa428db0512/
>
> The main notable performance difference is in the "wide-dataframe
> use_legacy_dataset=false" benchmark case, which looks to exhibit a
> performance regression, as Raúl mentioned above:
>
>
> https://conbench.ursa.dev/benchmark-results/064ca542414d71f5800054929f91b4a1/
>
> Thanks,
> Austin
>
>
> On Wed, Aug 2, 2023 at 4:55 AM Raúl Cumplido  wrote:
>
> > Hi,
> >
> > As already shared on both Zulip and the Mailing list after creating RC
> > 1 we found a couple of regressions on performance. One has been fixed
> > and added as part of RC2 but the other is still in progress:
> > https://github.com/apache/arrow/issues/36892
> >
> > Based on conversations on Zulip it doesn't seem to be a blocker but
> > please raise your concerns if you think we should wait for a new RC to
> > add it.
> >
> > I have triggered a new benchmark run on the RC PR but it is still in
> > progress:
> > https://github.com/apache/arrow/pull/36985#issuecomment-1661559507
> >
> > I'll share in this discussion the comparison between the baseline
> > (12.0.1) and the current RC once it is available.
> >
> > Thanks and happy holidays for the ones that are on vacation :)
> > Raúl
> >
> > On Wed, Aug 2, 2023 at 11:46, Raúl Cumplido wrote:
> > >
> > > Hi,
> > >
> > > I would like to propose the following release candidate (RC2) of Apache
> > > Arrow version 13.0.0. This is a release consisting of 435
> > > resolved JIRA issues[1].
> > >
> > > This release candidate is based on commit:
> > > c604084e7c62747780f91d6f8419c47feb4b20fb [2]
> > >
> > > The source release rc2 is hosted at [3].
> > > The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
> > > The changelog is located at [12].
> > >
> > > Please download, verify checksums and signatures, run the unit tests,
> > > and vote on the release. See [13] for how to validate a release
> > candidate.
> > >
> > > See also a verification result on GitHub pull request [14].
> > >
> > > The vote will be open for at least 72 hours.
> > >
> > > [ ] +1 Release this as Apache Arrow 13.0.0
> > > [ ] +0
> > > [ ] -1 Do not release this as Apache Arrow 13.0.0 because...
> > >
> > > [1]:
> >
> https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A13.0.0+is%3Aclosed
> > > [2]:
> >
> https://github.com/apache/arrow/tree/c604084e7c62747780f91d6f8419c47feb4b20fb
> > > [3]:
> > https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-13.0.0-rc2
> > > [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> > > [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
> > > [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
> > > [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> > > [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/13.0.0-rc2
> > > [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/13.0.0-rc2
> > > [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/13.0.0-rc2
> > > [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> > > [12]:
> >
> https://github.com/apache/arrow/blob/c604084e7c62747780f91d6f8419c47feb4b20fb/CHANGELOG.md
> > > [13]:
> >
> https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
> > > [14]: https://github.com/apache/arrow/pull/36985
> >
>


Re: [DISCUSS] Canonical alternative layout proposal

2023-07-13 Thread Dane Pitkin
I am in favor of this proposal. IMO the Arrow project is the right place to
standardize both the interoperability *and operability* of columnar data
layouts. Data engines are a core component of the Arrow ecosystem and the
project should be able to grow with these data engines as they converge on
new layouts. Since columnar data is ubiquitous in analytical workloads, we
are seeing a natural progression into optimizing those workloads. This
includes new lossless compression schemes for columnar data that allow
engines to operate directly on the compressed data (e.g. RLE). If we can't
reliably support the growing needs of the broader data engine ecosystem in
a timely manner, then I also fear Arrow might lose relevancy over time.

On Thu, Jul 13, 2023 at 11:59 AM Ian Cook  wrote:

> Thank you Weston for proposing this solution and Neal for describing
> its context and implications. I agree with the other replies here—this
> seems like an elegant solution to a growing need that could, if left
> unaddressed, increase the fragmentation of the ecosystem and reduce
> the centrality of the Arrow format.
>
> Greater diversity of layouts is happening. Whether it happens inside
> of Arrow or outside of Arrow is up to us. I think we all would like to
> see it happen inside of Arrow. This proposal allows for that, while
> striking a balance as Raphael describes.
>
> However I think there is still some ambiguity about exactly how an
> Arrow implementation that is consuming/producing data would negotiate
> with an Arrow implementation or other component that is
> producing/consuming data to determine whether an alternative layout is
> supported. This was discussed briefly in [5] but I am interested to
> see how this negotiation would be implemented in practice in the C
> data interface, IPC, Flight, etc.
>
> Ian
>
> [5] https://lists.apache.org/thread/7x2714wookjqgkoykxpq9jtpyrgx2bx2
>
>
> On Thu, Jul 13, 2023 at 11:00 AM Raphael Taylor-Davies
>  wrote:
> >
> > I like this proposal, I think it strikes a pragmatic balance between
> > preserving interoperability whilst still allowing new ideas to be
> > incorporated into the standard. Thank you for writing this up.
> >
> > On 13/07/2023 10:22, Matt Topol wrote:
> > > I don't have much to add but I do want to second Jacob's comments. I
> agree
> > > that this is a good way to avoid the fragmentation while keeping Arrow
> > > relevant, and likely something we need to do so that we can ensure
> Arrow
> > > remains the way to do this data integration and interoperability.
> > >
> > > On Wed, Jul 12, 2023 at 9:52 PM Jacob Wujciak-Jens
> > >  wrote:
> > >
> > >> Hello Everyone,
> > >>
> > >> Thanks for this comprehensive but concise write up Neal! I think this
> > >> proposal is a good way to avoid both fragmentation of the arrow
> ecosystem
> > >> as well as its obsolescence. In my opinion of these two problems the
> > >> obsolescence is the bigger issue as (as mentioned in the proposal)
> arrow is
> > >> already (close to) being relegated to the sidelines in eco-system
> defining
> > >> projects.
> > >>
> > >> Jacob
> > >>
> > >> On Thu, Jul 13, 2023 at 12:03 AM Neal Richardson <
> > >> neal.p.richard...@gmail.com> wrote:
> > >>
> > >>> Hi all,
> > >>> As was previously raised in [1] and surfaced again in [2], there is a
> > >>> proposal for representing alternative layouts. The intent, as I
> > >> understand
> > >>> it, is to be able to support memory layouts that some (but perhaps
> not
> > >> all)
> > >>> applications of Arrow find valuable, so that these nearly Arrow
> systems
> > >> can
> > >>> be fully Arrow-native.
> > >>>
> > >>> I wanted to start a more focused discussion on it because I think
> it's
> > >>> worth being considered on its own merits, but I also think this gets
> to
> > >> the
> > >>> core of what the Arrow project is and should be, and I don't want us
> to
> > >>> lose sight of that.
> > >>>
> > >>> To restate the proposal from [1]:
> > >>>
> > >>>   * There are one or more primary layouts
> > >>> * Existing layouts are automatically considered primary layouts,
> > >>> even if they
> > >>> wouldn't have been primary layouts initially (e.g. large list)
> > >>>   * A new layout, if it is semantically equivalent to another, is
> > >>> considered an
> > >>> alternative layout
> > >>>   * An alternative layout still has the same requirements for
> adoption
> > >>> (two implementations
> > >>> and a vote)
> > >>> * An implementation should not feel pressured to rush and
> implement
> > >> the
> > >>> new
> > >>> layout. It would be good if they contribute in the discussion and
> > >> consider
> > >>> the layout and vote if they feel it would be an acceptable design.
> > >>>   * We can define and vote and approve as many canonical alternative
> > >>> layouts as
> > >>> we want:
> > >>> * A canonical alternative layout should, at a minimum, have some
> > >>> reasonable
> > >>> justification, such as improved performance for algorithm X
> > >>>   * 

Re: [ANNOUNCE] New Arrow PMC member: Dewey Dunnington

2023-06-23 Thread Dane Pitkin
Congrats Dewey!

On Fri, Jun 23, 2023 at 9:15 AM Nic Crane  wrote:

> Well-deserved Dewey, congratulations!
>
> On Fri, 23 Jun 2023 at 11:53, Vibhatha Abeykoon 
> wrote:
>
> > Congratulations Dewey!
> >
> > On Fri, Jun 23, 2023 at 4:16 PM Alenka Frim
> > wrote:
> >
> > > Congratulations Dewey!! 🎉
> > >
> > > On Fri, Jun 23, 2023 at 12:10 PM Raúl Cumplido
> > > wrote:
> > >
> > > > Congratulations Dewey!
> > > >
> > > > On Fri, Jun 23, 2023, 11:55, Andrew Lamb wrote:
> > > >
> > > > > The Project Management Committee (PMC) for Apache Arrow has invited
> > > > > Dewey Dunnington (paleolimbot) to become a PMC member and we are
> > > pleased
> > > > to
> > > > > announce
> > > > > that Dewey Dunnington has accepted.
> > > > >
> > > > > Congratulations and welcome!
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Release Apache Arrow nanoarrow 0.2.0 - RC1

2023-06-20 Thread Dane Pitkin
+1 (non-binding)

Verified on macOS (M1) using conda.

A couple of nuances:
* Had to uninstall gnupg in conda and used brew's gnupg instead (same issue
Will found).
* I initially encountered some intermittent CMake build timeouts with
gtest, but haven't been able to reproduce.

On Tue, Jun 20, 2023 at 9:55 AM Antoine Pitrou  wrote:

>
> I don't have much time to investigate and I don't think it's a blocker
> either way. Perhaps there's room for improvement on the Arrow C++ side
> as well...
>
>
> On 20/06/2023 15:40, Dewey Dunnington wrote:
> > Thanks for verifying!
> >
> > I don't *think* there is anything non-standard about the
> > `find_package(Arrow)` / `target_link_libraries(..., arrow_shared)`
> > sequence used to link the tests (although clearly they aren't working
> > as intended!). You can pass extra arguments to CMake to help it find
> > the right Arrow using export NANOARROW_CMAKE_OPTIONS="-DArrow_DIR=..."
> > but here it sounds like it's finding the .so but failing to link the
> > dependencies. There are also instructions on creating a conda
> > environment with all required dependencies at [1].
> >
> > [1]
> https://github.com/apache/arrow-nanoarrow/blob/main/dev/release/README.md#conda-linux-and-macos
> >
> > On Tue, Jun 20, 2023 at 9:32 AM Antoine Pitrou 
> wrote:
> >>
> >>
> >> Ok, now running from the right repo :-), I get linker errors against
> >> Arrow C++ dependencies:
> >>
> >> [ 44%] Linking CXX executable utils_test
> >> /home/antoine/mambaforge/envs/pyarrow/bin/../lib/gcc/x86_64-conda-linux-gnu/12.2.0/../../../../x86_64-conda-linux-gnu/bin/ld:
> >> warning: libcrypto.so.3, needed by
> >> /home/antoine/mambaforge/envs/pyarrow/lib/libarrow.so.1300.0.0, not
> >> found (try using -rpath or -rpath-link)
> >>
> >> (etc.)
> >>
> >> https://gist.github.com/pitrou/3e6e9621e3b6cc2aff932eafdafef82b
> >>
> >> Note that Arrow C++ is compiled by myself inside a conda environment
> >> (which is activated when running the verification script).
> >>
> >> Regards
> >>
> >> Antoine.
> >>
> >>
> >>
> >> On 20/06/2023 at 12:38, Raúl Cumplido wrote:
> >>> +1 (non-binding)
> >>>
> >>> I've run:
> >>> ./verify-release-candidate.sh 0.2.0 1
> >>>
> >>> on Ubuntu 22.04 with conda:
> >>> * arrow-cpp 12.0.0
> >>> * gcc (conda-forge gcc 11.4.0-0) 11.4.0
> >>> * r-base  4.2.3
> >>>
> >>> Thanks,
> >>> Raúl
> >>>
> >>> On Tue, Jun 20, 2023 at 1:55, Sutou Kouhei wrote:
> 
>  +1
> 
>  I ran the following command line on Debian GNU/Linux sid:
> 
>  CMAKE_PREFIX_PATH=/tmp/local \
>    dev/release/verify-release-candidate.sh 0.2.0 1
> 
>  with:
> 
>  * Apache Arrow C++ main
>  * gcc (Debian 12.2.0-14) 12.2.0
>  * R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
> 
> 
>  Thanks,
>  --
>  kou
> 
>  In  oy-8keyn0at47jpmaw...@mail.gmail.com>
>  "[VOTE] Release Apache Arrow nanoarrow 0.2.0 - RC1" on Mon, 19 Jun 2023 15:58:45 -0300,
>  Dewey Dunnington  wrote:
> 
> > Hello,
> >
> > I would like to propose the following release candidate (RC1) of
> > Apache Arrow nanoarrow version 0.2.0. This release consists of 17
> > resolved GitHub issues [1].
> >
> > This release candidate is based on commit:
> > f71063605e288d9a8dd73cfdd9578773519b6743 [2]
> >
> > The source release rc1 is hosted at [3].
> > The changelog is located at [4].
> > The draft release post is located at [5].
> >
> > Please download, verify checksums and signatures, run the unit tests,
> > and vote on the release. See [6] for how to validate a release
> > candidate.
> >
> > The vote will be open for at least 72 hours.
> >
> > [ ] +1 Release this as Apache Arrow nanoarrow 0.2.0
> > [ ] +0
> > [ ] -1 Do not release this as Apache Arrow nanoarrow 0.2.0 because...
> >
> > [0] https://github.com/apache/arrow-nanoarrow
> > [1] https://github.com/apache/arrow-nanoarrow/milestone/2?closed=1
> > [2] https://github.com/apache/arrow-nanoarrow/tree/apache-arrow-nanoarrow-0.2.0-rc1
> > [3] https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-nanoarrow-0.2.0-rc1/
> > [4] https://github.com/apache/arrow-nanoarrow/blob/apache-arrow-nanoarrow-0.2.0-rc1/CHANGELOG.md
> > [5] https://github.com/apache/arrow-site/pull/364
> > [6] https://github.com/apache/arrow-nanoarrow/blob/main/dev/release/README.md
>
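
For readers following the "verify checksums" step in the vote email above, here is a minimal sketch of a manual SHA-512 check. It is not the project's verify-release-candidate.sh script, which remains the documented way to validate a release candidate. The function names are hypothetical, and the sketch assumes the checksum file uses the common `sha512sum` layout (hex digest, whitespace, file name). Signature verification with gpg is a separate step and is not covered here.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact: Path, checksum_file: Path) -> bool:
    """Compare the artifact's digest with the first field of the checksum file.

    Assumption: the checksum file looks like "<hex digest>  <file name>".
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

Calling `checksum_matches(Path("release.tar.gz"), Path("release.tar.gz.sha512"))` with locally downloaded files (the file names here are placeholders) returns True only when the digests agree.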