Re: [DISCUSS] [Rust] Move Rust components to new repos and process

2021-04-09 Thread Jorge Cardoso Leitão
Hi, The major problem that we are addressing with an independent release cycle is that the large majority of our users do not use released versions, neither from ASF archives nor from Cargo crates. They use a git hash commit. This is a problem because our git hashes are *de facto* releases. This

Re: [DISCUSS] [Rust] Move Rust components to new repos and process

2021-04-09 Thread QP Hou
On Fri, Apr 9, 2021 at 4:57 PM Weston Pace wrote: > Note, these problems technically exist now with the concept that any > language can release a patch at any time. Also, since Rust isn't > directly compiling against other Arrow libs and we are only talking > about interoperability it's probably

Re: [DISCUSS] [Rust] Move Rust components to new repos and process

2021-04-09 Thread Weston Pace
> I'm assuming the idea is that the existing integration tests will remain in > apache/arrow. Will you also run the integration test suites on your rust > repository CI checks? Furthermore, against what version will these tests run? * If Arrow runs against the latest release of Rust then it

Re: [DISCUSS] [Rust] Move Rust components to new repos and process

2021-04-09 Thread Micah Kornfield
> > With this explanation do you still have a concern? There is no suggestion > of making releases that depend on GitHub hashes. No, I don't think so. IIUC you are saying the crates dependency does not imply the crate artifacts are published elsewhere. This sounds in line with policies to me.

Re: [DISCUSS] [Rust] Move Rust components to new repos and process

2021-04-09 Thread Andy Grove
Hi Micah, During development, the Rust crates have local dependencies on each other based on relative file system paths. At release time, we change these to versioned dependencies before publishing, because it isn't possible to publish a crate that depends on non-published crates. With the code

Re: [DISCUSS] [Rust] Move Rust components to new repos and process

2021-04-09 Thread Micah Kornfield
> > " Crates can depend on GitHub commit hashes between releases" This sounds like it might not align with ASF release policies [1]. [1] https://www.apache.org/legal/release-policy.html#release-definition On Fri, Apr 9, 2021 at 1:34 PM Neal Richardson wrote: > Thanks, Andy. Two areas of

Re: [DISCUSS] [Rust] Move Rust components to new repos and process

2021-04-09 Thread Andrew Lamb
I personally believe we should continue to release the rust arrow crate in such a way that the major versions match the other implementations, for precisely the reasons you mention. On Fri, Apr 9, 2021 at 4:34 PM Neal Richardson wrote: > Thanks, Andy. Two areas of concern I think we should

Re: [DISCUSS] [Rust] Move Rust components to new repos and process

2021-04-09 Thread Neal Richardson
Thanks, Andy. Two areas of concern I think we should have some answer for before going forward with this (and I make no opinions as to what the "right" answers are, just raising them for discussion): 1. Integration testing: what is our workflow for ensuring that our implementations are

Re: A bug of pyarrow in python

2021-04-09 Thread Micah Kornfield
Thank you for the report. Do you have a minimal repro to reproduce the issue you are seeing? On Friday, April 9, 2021, 谢旗旺 <1415850...@qq.com> wrote: > When I use "pd.read_parquet(path,engine='pyarrow')" to load my > dataset, the loaded dataset appears where one piece of data is copied into >

[Rust] [DataFusion] Proposal for datafusion test reorganization

2021-04-09 Thread Andrew Lamb
As Jorge points out here [1], the tests in datafusion/src/context.rs are not really unit tests. They are more like SQL integration tests. There is also a small and languishing set of sql tests in `rust/datafusion/tests/sql.rs`. These tests are critical for DataFusion's quality and I would like

Re: 4.0 release preparation

2021-04-09 Thread Neal Richardson
Looks like we aren't quite ready to release. Nightly build failures seem to have increased rather than decreased, and we're still at 66 open issues on https://cwiki.apache.org/confluence/display/ARROW/Arrow+4.0.0+Release despite our work this week (and despite my attempts to bump issues out of

A bug of pyarrow in python

2021-04-09 Thread 谢旗旺
When I use "pd.read_parquet(path,engine='pyarrow')" to load my dataset, the loaded dataset appears where one piece of data is copied into two, but fastparquet will not.

Re: [ANNOUNCE] [Rust] Ballista donation has been merged

2021-04-09 Thread Andrew Lamb
I agree -- thank you Andy for your perseverance through the process. Things are looking bright from my perspective as well. On Thu, Apr 8, 2021 at 9:14 PM Wes McKinney wrote: > Congrats Andy! I know this was a lot of work, but I think it speaks to > a bright future for the Arrow ecosystem. Once

[NIGHTLY] Arrow Build Report for Job nightly-2021-04-09-0

2021-04-09 Thread Crossbow
Arrow Build Report for Job nightly-2021-04-09-0 All tasks: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-04-09-0 Failed Tasks: - conda-linux-gcc-py36-aarch64: URL:

Re: 4.0 release preparation

2021-04-09 Thread Sutou Kouhei
Hi, One more concern: * ASF's Bintray -> Artifactory migration isn't finished yet. https://lists.apache.org/thread.html/r9200fed3fa812f8c7de07a2500425f258db3231baa8e05f288175e4a%40%3Cbuilds.apache.org%3E I think that this is not a blocker. We'll not release deb/rpm packages when we