Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Ying Zhou
Hmm, it seems that the PyArrow wheel doesn’t actually install on my Mac. Sorry I didn’t report any source testing issues since my environment is pretty messed up. (arrowvenv) (base) karlkatzen@chloes venv % python3 Python 3.8.3 (default, Jul 2 2020, 11:26:31) [Clang 10.0.0 ] :: Anaconda, Inc.

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Micah Kornfield
Oh, nice, I thought they just missed the cutoff. On Tue, Apr 27, 2021 at 8:19 PM Ying Zhou wrote: > They actually did. > > Ying > > > On Apr 27, 2021, at 11:11 PM, Micah Kornfield > wrote: > > > > Did the ORC additions actually make it into 4.0? > > > > On Tue, Apr 27, 2021 at 7:55 PM Ying

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Ying Zhou
They actually did. Ying > On Apr 27, 2021, at 11:11 PM, Micah Kornfield wrote: > > Did the ORC additions actually make it into 4.0? > > On Tue, Apr 27, 2021 at 7:55 PM Ying Zhou wrote: > >> Sure. I just added some info about the ORC writer. I think we need to >> update the documentation in

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Micah Kornfield
Did the ORC additions actually make it into 4.0? On Tue, Apr 27, 2021 at 7:55 PM Ying Zhou wrote: > Sure. I just added some info about the ORC writer. I think we need to > update the documentation in both C++ and Python as well to include ORC. I > will do it. > > Ying > > > On Apr 27, 2021, at

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Ying Zhou
Sure. I just added some info about the ORC writer. I think we need to update the documentation in both C++ and Python as well to include ORC. I will do it. Ying > On Apr 27, 2021, at 5:28 PM, Neal Richardson > wrote: > > 4.0 blog post is still pretty bare and could use some help filling in:

Arrow sync call April 28 at 12:00 US/Eastern, 16:00 UTC

2021-04-27 Thread Neal Richardson
Hi all, Our biweekly call is coming up tomorrow at https://meet.google.com/vtm-teks-phx. All are welcome to join. Notes will be shared with the mailing list afterward. Neal

Re: [Format][RFC] Introduce COMPLEX type for IntervalUnit

2021-04-27 Thread Wes McKinney
Thanks Micah, I commented in the PR. Once we've settled on the details we can come up with an implementation / vote plan. On Tue, Apr 27, 2021 at 1:12 PM Micah Kornfield wrote: > To nudge this along I opened up https://github.com/apache/arrow/pull/10177 > > Comments welcome. > > On Sun, Apr 11,

Re: [DISCUSS] experimental repos

2021-04-27 Thread Wes McKinney
This process seems pretty reasonable to me. Thanks for writing up the document. On Tue, Apr 27, 2021 at 10:56 AM Micah Kornfield wrote: > Hi Jorge, > I think especially for the second case, it might be better to keep things > on branches in the repo even if they aren't quite mergeable. Even

Re: [DISCUSS] Moving the format directory to arrow-format repository

2021-04-27 Thread Wes McKinney
I wouldn't be too excited about this. Here are my thoughts: 1. Having the format/ directory in apache/arrow be a submodule would be cumbersome and error-prone for developers. The only submodules we have right now are optional testing dependencies — not having these initialized and updated does

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Neal Richardson
4.0 blog post is still pretty bare and could use some help filling in: https://github.com/apache/arrow-site/pull/104 Thanks, Neal On Tue, Apr 27, 2021 at 1:55 PM Sutou Kouhei wrote: > The remaining tasks: > > 3. [in-pr|Kou] upload binaries > https://github.com/apache/arrow/pull/10172 >

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Sutou Kouhei
The remaining tasks: 3. [in-pr|Kou] upload binaries https://github.com/apache/arrow/pull/10172 10. [Uwe] update conda recipes 12. [in-pr|Ian] update homebrew packages https://github.com/Homebrew/homebrew-core/pull/76060 I updated versions on JIRA: *

Re: [C++][Python] Parquet INT96 overflow for arrow timestamps

2021-04-27 Thread Antoine Pitrou
Hi Karik, I answered in the JIRA itself. Feel free to ask any more questions! Regards Antoine. Le 27/04/2021 à 16:28, Karik Isichei a écrit : Hi there, I previously raised an issue regarding arrow timestamp values overflowing when reading parquet type INT96 (

Re: [Format][RFC] Introduce COMPLEX type for IntervalUnit

2021-04-27 Thread Micah Kornfield
To nudge this along I opened up https://github.com/apache/arrow/pull/10177 Comments welcome. On Sun, Apr 11, 2021 at 9:38 PM Micah Kornfield wrote: > If there are no more comments on this maybe we should update the original > RFC PR and ensure we are OK with it in principle (Dmitry do you want

Re: [DISCUSS] experimental repos

2021-04-27 Thread Micah Kornfield
Hi Jorge, I think especially for the second case, it might be better to keep things on branches in the repo even if they aren't quite mergeable. Even for the first case, I would potentially aim for the "closest possible" repo with a new branch. I think standalone repos tend to indicate a higher

[DISCUSS] Moving the format directory to arrow-format repository

2021-04-27 Thread Neville Dipale
Hi Arrow devs, Andy noticed that we carry a copy of the format directory in arrow-rs, which is bound to get outdated in the future. We would like to propose creating an arrow-format repository, similar to parquet-format, so that arrow-rs and other future separate repositories could add this as a

Re: Issue with pyarrow v4.0.0 - Write parquet files with non str datatypes

2021-04-27 Thread Joris Van den Bossche
Hi Jorge, How did you install pyarrow 4.0.0? The error you show typically points to an installation issue (e.g. built against the wrong NumPy version). Best, Joris On Tue, 27 Apr 2021 at 16:47, Jorge Alarcon wrote: > Hi everybody, > > > > Please, there is an issue with pyarrow (version 4.0.0) when you try to
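When an error looks like an installation problem, a useful first step is to report which versions of the relevant packages are actually installed in the active environment. The snippet below is a minimal sketch of that check using only the Python standard library; the helper name `installed_versions` and the default package list are illustrative assumptions, not part of pyarrow.

```python
from importlib import metadata


def installed_versions(packages=("pyarrow", "numpy", "pandas")):
    """Return a dict mapping each package name to its installed version.

    Hypothetical helper for illustration: packages that are not installed
    in the current environment map to None instead of raising.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so the output can be pasted into a bug report.
for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Including this output, together with how the packages were installed (pip, conda, or a source build), usually makes it easier to tell a packaging mismatch from a genuine bug.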

Re: [DISCUSS] [Rust] Python-datafusion

2021-04-27 Thread Micah Kornfield
Hi Jorge, This all sounds good to me. It might be nice to test against both the pinned released version of pyarrow and at head if possible. I like the idea of not causing release churn as long as all the underlying libraries are compatible. Thanks for the write up. -Micah On Mon, Apr 26, 2021

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Paul Taylor
JS packages have been uploaded. Paul On 4/27/21 9:47 AM, Neal Richardson wrote: R package has been accepted by CRAN. Neal On Tue, Apr 27, 2021 at 7:25 AM Krisztián Szűcs wrote: I've just opened a PR with the updated documentation. The remaining tasks: 3. [in-pr|Kou] upload binaries 6.

Issue with pyarrow v4.0.0 - Write parquet files with non str datatypes

2021-04-27 Thread Jorge Alarcon
Hi everybody, Please, there is an issue with pyarrow (version 4.0.0) when you try to write a Parquet file with your engine. It is not possible to write a Parquet file from a pandas DataFrame when it includes non-str columns (datetime64, float64, int64, ...). Example: df = pd.DataFrame({'A':[1, 2, 3], 'B':['a',

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Neal Richardson
R package has been accepted by CRAN. Neal On Tue, Apr 27, 2021 at 7:25 AM Krisztián Szűcs wrote: > I've just opened a PR with the updated documentation. > > The remaining tasks: > > 3. [in-pr|Kou] upload binaries > 6. [Paul] upload js packages > 10. [Uwe] update conda recipes > 12. [todo]

[C++][Python] Parquet INT96 overflow for arrow timestamps

2021-04-27 Thread Karik Isichei
Hi there, I previously raised an issue regarding arrow timestamp values overflowing when reading parquet type INT96 ( https://issues.apache.org/jira/browse/ARROW-12096). I would like to contribute a fix for this, but wanted to check the following: - If I should start
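For readers unfamiliar with the problem, the following is a rough sketch of how such an overflow can arise. It assumes the commonly used INT96 timestamp layout (12 little-endian bytes: 8 bytes of nanoseconds within the day, then a 4-byte Julian day number) and a target of nanoseconds since the Unix epoch stored in a signed 64-bit integer. The function names are hypothetical and are not Arrow or Parquet APIs, and this is not the fix discussed in the JIRA.

```python
import struct

# Assumptions for this sketch (not taken from the thread):
JULIAN_DAY_OF_UNIX_EPOCH = 2_440_588        # Julian day number of 1970-01-01
NANOS_PER_DAY = 86_400 * 1_000_000_000
INT64_MIN, INT64_MAX = -(2 ** 63), 2 ** 63 - 1


def decode_int96(raw):
    """Split a 12-byte INT96 value into (julian_day, nanos_of_day)."""
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    return julian_day, nanos_of_day


def int96_to_epoch_nanos(raw):
    """Convert to nanoseconds since the Unix epoch as an unbounded Python int."""
    julian_day, nanos_of_day = decode_int96(raw)
    return (julian_day - JULIAN_DAY_OF_UNIX_EPOCH) * NANOS_PER_DAY + nanos_of_day


def fits_in_int64(value):
    """True if the value can be stored in a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX
```

Python integers do not overflow, so `fits_in_int64` makes the limit explicit: a nanosecond count held in an int64 only covers roughly the years 1677 to 2262, so an INT96 value for a date such as the year 3000 cannot be represented at nanosecond resolution without wrapping around.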

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Krisztián Szűcs
I've just opened a PR with the updated documentation. The remaining tasks: 3. [in-pr|Kou] upload binaries 6. [Paul] upload js packages 10. [Uwe] update conda recipes 12. [todo] update homebrew packages 14. [Kou] update msys2 15. [Neal] update R packages 16. [in-pr|Krisztian] update docs On

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Krisztián Szűcs
On Tue, Apr 27, 2021 at 2:21 PM Paul Taylor wrote: > > These look like the errors resolved in > https://github.com/apache/arrow/pull/10156. Can we cherry-pick that > commit to the release branch? Great, I'll cherry-pick that commit. Could you please release the JS packages to npm? I think the

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Paul Taylor
These look like the errors resolved in https://github.com/apache/arrow/pull/10156. Can we cherry-pick that commit to the release branch? On 4/27/21 7:04 AM, Krisztián Szűcs wrote: I'd need some help to both release the JS packages using the new lerna configuration and to fix the JS

Re: [VOTE] Release Apache Arrow 4.0.0 - RC3

2021-04-27 Thread Krisztián Szűcs
I'd need some help to both release the JS packages using the new lerna configuration and to fix the JS documentation generation [1]. We should backport these changes to the release-4.0.0 branch. [1]:

[NIGHTLY] Arrow Build Report for Job nightly-2021-04-27-0

2021-04-27 Thread Crossbow
Arrow Build Report for Job nightly-2021-04-27-0 All tasks: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-27-0 Failed Tasks: - conda-linux-gcc-py36-arm64: URL: