Re: [VOTE][Julia] Release Apache Arrow Julia 2.6.2 RC1

2023-06-09 Thread Ben Baumgold
+1 (macOS m1)

On Fri, Jun 9, 2023 at 4:02 PM Jacob Quinn  wrote:

> +1 (macOS m1)
>
> -Jacob
>
> On Fri, Jun 9, 2023 at 1:41 PM Sutou Kouhei  wrote:
>
> > Hi,
> >
> > I would like to propose the following release candidate (RC1) of
> > Apache Arrow Julia version 2.6.2.
> >
> > This release candidate is based on commit:
> > 9f1d51a2c975bd83cbaf70c5f640762c6a0bccaf [1]
> >
> > The source release rc1 is hosted at [2].
> >
> > Please download, verify checksums and signatures, run the unit tests,
> > and vote on the release. See [3] for how to validate a release candidate.
> >
> > The vote will be open for at least 24 hours.
> >
> > [ ] +1 Release this as Apache Arrow Julia 2.6.2
> > [ ] +0
> > [ ] -1 Do not release this as Apache Arrow Julia 2.6.2 because...
> >
> > [1]:
> > https://github.com/apache/arrow-julia/tree/9f1d51a2c975bd83cbaf70c5f640762c6a0bccaf
> > [2]:
> > https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-julia-2.6.2-rc1/
> > [3]:
> > https://github.com/apache/arrow-julia/blob/main/dev/release/README.md#verify
> >
>


Converting Pandas DataFrame <-> Struct Array?

2023-06-09 Thread Li Jin
Hello,

I am looking for the best way to convert Pandas DataFrame <-> Struct
Array.

Currently I have:

pa.RecordBatch.from_pandas(df).to_struct_array()

and

pa.RecordBatch.from_struct_array(s_array).to_pandas()

- I wonder if there is a direct way to go from DataFrame <-> Struct Array
without going through RecordBatch?

Thanks,
Li


Re: [VOTE][Julia] Release Apache Arrow Julia 2.6.2 RC1

2023-06-09 Thread Jacob Quinn
+1 (macOS m1)

-Jacob

On Fri, Jun 9, 2023 at 1:41 PM Sutou Kouhei  wrote:

> Hi,
>
> I would like to propose the following release candidate (RC1) of
> Apache Arrow Julia version 2.6.2.
>
> This release candidate is based on commit:
> 9f1d51a2c975bd83cbaf70c5f640762c6a0bccaf [1]
>
> The source release rc1 is hosted at [2].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [3] for how to validate a release candidate.
>
> The vote will be open for at least 24 hours.
>
> [ ] +1 Release this as Apache Arrow Julia 2.6.2
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow Julia 2.6.2 because...
>
> [1]:
> https://github.com/apache/arrow-julia/tree/9f1d51a2c975bd83cbaf70c5f640762c6a0bccaf
> [2]:
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-julia-2.6.2-rc1/
> [3]:
> https://github.com/apache/arrow-julia/blob/main/dev/release/README.md#verify
>


Re: [VOTE][Julia] Release Apache Arrow Julia 2.6.2 RC1

2023-06-09 Thread Sutou Kouhei
+1

I ran the following command line on Debian GNU/Linux sid:

  VERIFY_FORCE_USE_JULIA_BINARY=1 dev/release/verify_rc.sh 2.6.2 1


Thanks,
-- 
kou

In <20230610.044039.1468288593045013710@clear-code.com>
  "[VOTE][Julia] Release Apache Arrow Julia 2.6.2 RC1" on Sat, 10 Jun 2023 
04:40:39 +0900 (JST),
  Sutou Kouhei  wrote:

> Hi,
> 
> I would like to propose the following release candidate (RC1) of
> Apache Arrow Julia version 2.6.2.
> 
> This release candidate is based on commit:
> 9f1d51a2c975bd83cbaf70c5f640762c6a0bccaf [1]
> 
> The source release rc1 is hosted at [2].
> 
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [3] for how to validate a release candidate.
> 
> The vote will be open for at least 24 hours.
> 
> [ ] +1 Release this as Apache Arrow Julia 2.6.2
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow Julia 2.6.2 because...
> 
> [1]: 
> https://github.com/apache/arrow-julia/tree/9f1d51a2c975bd83cbaf70c5f640762c6a0bccaf
> [2]: 
> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-julia-2.6.2-rc1/
> [3]: 
> https://github.com/apache/arrow-julia/blob/main/dev/release/README.md#verify


[VOTE][Julia] Release Apache Arrow Julia 2.6.2 RC1

2023-06-09 Thread Sutou Kouhei
Hi,

I would like to propose the following release candidate (RC1) of
Apache Arrow Julia version 2.6.2.

This release candidate is based on commit:
9f1d51a2c975bd83cbaf70c5f640762c6a0bccaf [1]

The source release rc1 is hosted at [2].

Please download, verify checksums and signatures, run the unit tests,
and vote on the release. See [3] for how to validate a release candidate.

The vote will be open for at least 24 hours.

[ ] +1 Release this as Apache Arrow Julia 2.6.2
[ ] +0
[ ] -1 Do not release this as Apache Arrow Julia 2.6.2 because...

[1]: 
https://github.com/apache/arrow-julia/tree/9f1d51a2c975bd83cbaf70c5f640762c6a0bccaf
[2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-julia-2.6.2-rc1/
[3]: 
https://github.com/apache/arrow-julia/blob/main/dev/release/README.md#verify


Re: [ANNOUNCE] New Arrow committer: Mehmet Ozan Kabak

2023-06-09 Thread David Li
Welcome Mehmet!

On Thu, Jun 8, 2023, at 22:54, Weston Pace wrote:
> Congratulations!
>
> On Thu, Jun 8, 2023, 5:36 PM Mehmet Ozan Kabak  wrote:
>
>> Thanks everybody. Looking to collaborate further!
>>
>> > On Jun 8, 2023, at 9:52 AM, Matt Topol  wrote:
>> >
>> > Congrats! Welcome Ozan!
>> >
>> > On Thu, Jun 8, 2023 at 8:53 AM Raúl Cumplido wrote:
>> >
>> >> Congratulations and welcome!
>> >>
>> >> On Thu, Jun 8, 2023 at 14:45, Metehan Yıldırım wrote:
>> >>>
>> >>> Congrats Ozan!
>> >>>
>> >>> On Thu, Jun 8, 2023 at 1:09 PM Andrew Lamb wrote:
>> >>>
>> >>>> On behalf of the Arrow PMC, I'm happy to announce that Mehmet Ozan Kabak
>> >>>> has accepted an invitation to become a committer on Apache
>> >>>> Arrow. Welcome, and thank you for your contributions!
>> >>>>
>> >>>> Andrew
>> >>>>
>> >>
>>
>>


Re: [VOTE] Release Apache Arrow 12.0.1 - RC1

2023-06-09 Thread David Li
+1 (Ubuntu Linux 20.04/x86_64)

Verified with Conda. I had to retry binary verification a few times due to rate 
limiting from Artifactory.

On Fri, Jun 9, 2023, at 08:41, Raúl Cumplido wrote:
> Hi,
>
> An issue has been identified with source verification of the release
> candidate on macOS when the latest Homebrew version of protobuf is
> used [1].
>
> In that case we can specify -DProtobuf_SOURCE=BUNDLED, like:
>
>   ARROW_CMAKE_OPTIONS="-DProtobuf_SOURCE=BUNDLED" dev/release/verify-release-candidate.sh ...
>
> All the verification results have been successful on the PR [2] and I
> have run the macOS source verification tasks (also tracked on that PR)
> with the following patch to use bundled protobuf:
>
> diff --git a/dev/release/verify-release-candidate.sh b/dev/release/verify-release-candidate.sh
> index 12e6d9c..9bb39b2 100755
> --- a/dev/release/verify-release-candidate.sh
> +++ b/dev/release/verify-release-candidate.sh
> @@ -602,6 +602,10 @@ test_and_install_cpp() {
>      ARROW_CMAKE_OPTIONS="${ARROW_CMAKE_OPTIONS:-} -G ${CMAKE_GENERATOR}"
>    fi
>
> +  if [ "$(uname)" == "Darwin" ]; then
> +    brew uninstall --force protobuf abseil grpc
> +  fi
> +
>    cmake \
>      -DARROW_BOOST_USE_SHARED=ON \
>      -DARROW_BUILD_EXAMPLES=OFF \
> @@ -638,6 +642,7 @@ test_and_install_cpp() {
>      -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
>      -DCMAKE_UNITY_BUILD=${CMAKE_UNITY_BUILD:-OFF} \
>      -DGTest_SOURCE=BUNDLED \
> +    -DProtobuf_SOURCE=BUNDLED \
>      -DPARQUET_BUILD_EXAMPLES=ON \
>      -DPARQUET_BUILD_EXECUTABLES=ON \
>      -DPARQUET_REQUIRE_ENCRYPTION=ON \
>
> As discussed on Zulip and in the issue, this should not be a blocker for
> the release.
>
> Regards,
> Raúl
>
>
> [1] https://github.com/apache/arrow/issues/35987
> [2] https://github.com/apache/arrow/pull/35967
>
> On Fri, Jun 9, 2023 at 14:32, Raúl Cumplido wrote:
>>
>> Hi,
>>
>> I would like to propose the following release candidate (RC1) of Apache
>> Arrow version 12.0.1. This is a release consisting of 29
>> resolved GitHub issues[1].
>>
>> This release candidate is based on commit:
>> 6af660f48472b8b45a5e01b7136b9b040b185eb1 [2]
>>
>> The source release rc1 is hosted at [3].
>> The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
>> The changelog is located at [12].
>>
>> Please download, verify checksums and signatures, run the unit tests,
>> and vote on the release. See [13] for how to validate a release candidate.
>>
>> See also a verification result on GitHub pull request [14].
>>
>> The vote will be open for at least 72 hours.
>>
>> [ ] +1 Release this as Apache Arrow 12.0.1
>> [ ] +0
>> [ ] -1 Do not release this as Apache Arrow 12.0.1 because...
>>
>> [1]: 
>> https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A12.0.1+is%3Aclosed
>> [2]: 
>> https://github.com/apache/arrow/tree/6af660f48472b8b45a5e01b7136b9b040b185eb1
>> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-12.0.1-rc1
>> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
>> [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
>> [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
>> [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
>> [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/12.0.1-rc1
>> [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/12.0.1-rc1
>> [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/12.0.1-rc1
>> [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
>> [12]: 
>> https://github.com/apache/arrow/blob/6af660f48472b8b45a5e01b7136b9b040b185eb1/CHANGELOG.md
>> [13]: 
>> https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
>> [14]: https://github.com/apache/arrow/pull/35967


Scalars in Apache Arrow Rust

2023-06-09 Thread Raphael Taylor-Davies

Hi All,

Currently the Rust implementation of arrow lacks a consistent story for 
supporting scalars. Whilst there are some binary kernels that support 
scalar values [1] [2], the way this is encoded is not consistent [3], 
requires type-dispatch logic in downstreams like DataFusion [4], and has 
no mechanism to preserve type metadata such as timestamp timezones, or 
decimal precision [5].


I would therefore like to draw attention to a proposal [6] to address 
this. As this will necessarily have downstream ramifications, and is 
likely a problem that other implementations of arrow have already 
grappled with, I would very much appreciate any feedback the community 
can give. To avoid bifurcating the discussion, please comment on the 
GitHub PR.


I look forward to hearing your thoughts.

Kind Regards,

Raphael Taylor-Davies

[1]: 
https://docs.rs/arrow-arith/latest/arrow_arith/arithmetic/fn.add_scalar.html
[2]: 
https://docs.rs/arrow-ord/latest/arrow_ord/comparison/fn.eq_dyn_scalar.html

[3]: https://github.com/apache/arrow-rs/issues/2837
[4]: 
https://github.com/apache/arrow-datafusion/blob/d9e91d187c8af7f3f8b1a11d53383826a471/datafusion/physical-expr/src/expressions/binary.rs

[5]: https://github.com/apache/arrow-rs/issues/3999
[6]: https://github.com/apache/arrow-rs/pull/4393



Re: [VOTE] Release Apache Arrow 12.0.1 - RC1

2023-06-09 Thread Raúl Cumplido
Hi,

An issue has been identified with source verification of the release
candidate on macOS when the latest Homebrew version of protobuf is
used [1].

In that case we can specify -DProtobuf_SOURCE=BUNDLED, like:

  ARROW_CMAKE_OPTIONS="-DProtobuf_SOURCE=BUNDLED" dev/release/verify-release-candidate.sh ...

All the verification results have been successful on the PR [2] and I
have run the macOS source verification tasks (also tracked on that PR)
with the following patch to use bundled protobuf:

diff --git a/dev/release/verify-release-candidate.sh b/dev/release/verify-release-candidate.sh
index 12e6d9c..9bb39b2 100755
--- a/dev/release/verify-release-candidate.sh
+++ b/dev/release/verify-release-candidate.sh
@@ -602,6 +602,10 @@ test_and_install_cpp() {
     ARROW_CMAKE_OPTIONS="${ARROW_CMAKE_OPTIONS:-} -G ${CMAKE_GENERATOR}"
   fi

+  if [ "$(uname)" == "Darwin" ]; then
+    brew uninstall --force protobuf abseil grpc
+  fi
+
   cmake \
     -DARROW_BOOST_USE_SHARED=ON \
     -DARROW_BUILD_EXAMPLES=OFF \
@@ -638,6 +642,7 @@ test_and_install_cpp() {
     -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
     -DCMAKE_UNITY_BUILD=${CMAKE_UNITY_BUILD:-OFF} \
     -DGTest_SOURCE=BUNDLED \
+    -DProtobuf_SOURCE=BUNDLED \
     -DPARQUET_BUILD_EXAMPLES=ON \
     -DPARQUET_BUILD_EXECUTABLES=ON \
     -DPARQUET_REQUIRE_ENCRYPTION=ON \

As discussed on Zulip and in the issue, this should not be a blocker for
the release.

Regards,
Raúl


[1] https://github.com/apache/arrow/issues/35987
[2] https://github.com/apache/arrow/pull/35967

On Fri, Jun 9, 2023 at 14:32, Raúl Cumplido wrote:
>
> Hi,
>
> I would like to propose the following release candidate (RC1) of Apache
> Arrow version 12.0.1. This is a release consisting of 29
> resolved GitHub issues[1].
>
> This release candidate is based on commit:
> 6af660f48472b8b45a5e01b7136b9b040b185eb1 [2]
>
> The source release rc1 is hosted at [3].
> The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
> The changelog is located at [12].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [13] for how to validate a release candidate.
>
> See also a verification result on GitHub pull request [14].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow 12.0.1
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow 12.0.1 because...
>
> [1]: 
> https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A12.0.1+is%3Aclosed
> [2]: 
> https://github.com/apache/arrow/tree/6af660f48472b8b45a5e01b7136b9b040b185eb1
> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-12.0.1-rc1
> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
> [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
> [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/12.0.1-rc1
> [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/12.0.1-rc1
> [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/12.0.1-rc1
> [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> [12]: 
> https://github.com/apache/arrow/blob/6af660f48472b8b45a5e01b7136b9b040b185eb1/CHANGELOG.md
> [13]: 
> https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
> [14]: https://github.com/apache/arrow/pull/35967


[VOTE] Release Apache Arrow 12.0.1 - RC1

2023-06-09 Thread Raúl Cumplido
Hi,

I would like to propose the following release candidate (RC1) of Apache
Arrow version 12.0.1. This is a release consisting of 29
resolved GitHub issues[1].

This release candidate is based on commit:
6af660f48472b8b45a5e01b7136b9b040b185eb1 [2]

The source release rc1 is hosted at [3].
The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
The changelog is located at [12].

Please download, verify checksums and signatures, run the unit tests,
and vote on the release. See [13] for how to validate a release candidate.

See also a verification result on GitHub pull request [14].

The vote will be open for at least 72 hours.

[ ] +1 Release this as Apache Arrow 12.0.1
[ ] +0
[ ] -1 Do not release this as Apache Arrow 12.0.1 because...

[1]: 
https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A12.0.1+is%3Aclosed
[2]: 
https://github.com/apache/arrow/tree/6af660f48472b8b45a5e01b7136b9b040b185eb1
[3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-12.0.1-rc1
[4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
[5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
[6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
[7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
[8]: https://apache.jfrog.io/artifactory/arrow/java-rc/12.0.1-rc1
[9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/12.0.1-rc1
[10]: https://apache.jfrog.io/artifactory/arrow/python-rc/12.0.1-rc1
[11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
[12]: 
https://github.com/apache/arrow/blob/6af660f48472b8b45a5e01b7136b9b040b185eb1/CHANGELOG.md
[13]: 
https://cwiki.apache.org/confluence/display/ARROW/How+to+Verify+Release+Candidates
[14]: https://github.com/apache/arrow/pull/35967
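
As a rough illustration of the "verify checksums" step requested above, the sketch below recomputes a file's digest with Python's hashlib and compares it to a checksum file. This is not the project's verification script (that is dev/release/verify-release-candidate.sh). The helper names, the file names, and the "<hex digest>  <filename>" checksum layout are assumptions made for the example, and it covers only checksums, not signature verification.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_checksum_file(artifact, checksum_file, algorithm="sha512"):
    """Compare an artifact with a '<hex digest>  <filename>' checksum file.

    The checksum file layout is an assumption; adjust the parsing if the
    actual file is formatted differently.
    """
    expected = Path(checksum_file).read_text().split()[0].lower()
    return file_digest(artifact, algorithm) == expected


# Demo on a throwaway file; a real check would point at the downloaded
# source tarball and the checksum file published next to it.
with tempfile.TemporaryDirectory() as tmp:
    payload = b"not a real release artifact"
    artifact = Path(tmp) / "example.tar.gz"
    artifact.write_bytes(payload)

    checksum = Path(tmp) / "example.tar.gz.sha512"
    checksum.write_text(f"{hashlib.sha512(payload).hexdigest()}  example.tar.gz\n")

    ok = matches_checksum_file(artifact, checksum)

    # Any change to the artifact should make the comparison fail.
    artifact.write_bytes(b"tampered")
    tampered_ok = matches_checksum_file(artifact, checksum)
```

Reading in chunks keeps memory use flat for large tarballs.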