[jira] [Created] (ARROW-7566) [CI] Use more recent Miniconda on AppVeyor
Antoine Pitrou created ARROW-7566:

Summary: [CI] Use more recent Miniconda on AppVeyor
Key: ARROW-7566
URL: https://issues.apache.org/jira/browse/ARROW-7566
Project: Apache Arrow
Issue Type: Wish
Components: Continuous Integration
Reporter: Antoine Pitrou

A newer conda might improve setup speed because of the new package format.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-7567) Bump Checkstyle from 6.19 to 8.18
Fokko Driesprong created ARROW-7567:

Summary: Bump Checkstyle from 6.19 to 8.18
Key: ARROW-7567
URL: https://issues.apache.org/jira/browse/ARROW-7567
Project: Apache Arrow
Issue Type: Improvement
Components: Java
Affects Versions: 0.15.1
Reporter: Fokko Driesprong
Fix For: 0.16.0
[jira] [Created] (ARROW-7568) Bump Apache Avro from 1.9.0 to 1.9.1
Fokko Driesprong created ARROW-7568:

Summary: Bump Apache Avro from 1.9.0 to 1.9.1
Key: ARROW-7568
URL: https://issues.apache.org/jira/browse/ARROW-7568
Project: Apache Arrow
Issue Type: Improvement
Components: Java
Affects Versions: 0.15.1
Reporter: Fokko Driesprong
Fix For: 0.16.0

Apache Avro 1.9.1 contains some bugfixes.
[jira] [Created] (ARROW-7569) [Python] Add API to map Arrow types to pandas ExtensionDtypes for to_pandas conversions
Joris Van den Bossche created ARROW-7569:

Summary: [Python] Add API to map Arrow types to pandas ExtensionDtypes for to_pandas conversions
Key: ARROW-7569
URL: https://issues.apache.org/jira/browse/ARROW-7569
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Joris Van den Bossche
Fix For: 0.16.0

ARROW-2428 was about adding such a mapping, and described three use cases (see this [comment|https://issues.apache.org/jira/browse/ARROW-2428?focusedCommentId=16914231&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16914231] for details):

* Basic roundtrip based on the pandas_metadata (in {{to_pandas}}, we check whether the pandas_metadata specifies pandas extension dtypes, and if so, use this as the target dtype for that column)
* Conversion for pyarrow extension types that can define their equivalent pandas extension dtype
* A way to override the default conversion (e.g. for the built-in types, or in the absence of pandas_metadata in the schema). This would require the user to be able to specify some mapping of pyarrow type or column name to the pandas extension dtype to use.

The PR that closed ARROW-2428 (https://github.com/apache/arrow/pull/5512) only covered the first two cases, and not the third. I think it is still interesting to also cover the third case in some way.

An example use case is the new nullable dtypes being introduced in pandas (e.g. the nullable integer dtype). Assume I want to read a parquet file into a pandas DataFrame using this nullable integer dtype. The pyarrow Table has no pandas_metadata indicating to use this dtype (unless it was created from a pandas DataFrame that was already using this dtype, but that will often not be the case), and the pyarrow.int64() type is also not an extension type that can define its equivalent pandas extension dtype.

Currently, the only solution is to first read it into a pandas DataFrame (which will use floats for the integers if there are nulls), and then afterwards convert those floats back to a nullable integer dtype.

A possible API for this could look like:

{code}
table.to_pandas(types_mapping={pa.int64(): pd.Int64Dtype()})
{code}

to indicate that you want to convert all columns of the pyarrow table with int64 type to a pandas column using the nullable Int64 dtype.
[jira] [Created] (ARROW-7570) Fix high severity issues
Fokko Driesprong created ARROW-7570:

Summary: Fix high severity issues
Key: ARROW-7570
URL: https://issues.apache.org/jira/browse/ARROW-7570
Project: Apache Arrow
Issue Type: Improvement
Components: Java
Affects Versions: 0.15.1
Reporter: Fokko Driesprong
Fix For: 0.16.0

Fixes high severity issues reported by LGTM:

[https://lgtm.com/projects/g/apache/arrow/?mode=list&lang=java&severity=error]
[jira] [Created] (ARROW-7571) Correct minimal java version on README
Fokko Driesprong created ARROW-7571:

Summary: Correct minimal java version on README
Key: ARROW-7571
URL: https://issues.apache.org/jira/browse/ARROW-7571
Project: Apache Arrow
Issue Type: Improvement
Components: Java
Affects Versions: 0.15.1
Reporter: Fokko Driesprong
Fix For: 0.16.0
[jira] [Created] (ARROW-7572) Enforce Maven 3.3+ as mentioned in README
Fokko Driesprong created ARROW-7572:

Summary: Enforce Maven 3.3+ as mentioned in README
Key: ARROW-7572
URL: https://issues.apache.org/jira/browse/ARROW-7572
Project: Apache Arrow
Issue Type: Improvement
Components: Java
Affects Versions: 0.15.1
Reporter: Fokko Driesprong
Fix For: 0.16.0
[NIGHTLY] Arrow Build Report for Job nightly-2020-01-14-0
Arrow Build Report for Job nightly-2020-01-14-0

All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0

Failed Tasks:
- centos-6:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-centos-6
- gandiva-jar-osx:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-travis-gandiva-jar-osx
- test-conda-r-3.6:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-r-3.6

Succeeded Tasks:
- centos-7:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-centos-7
- centos-8:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-centos-8
- conda-linux-gcc-py27:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-conda-linux-gcc-py27
- conda-linux-gcc-py36:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-conda-linux-gcc-py36
- conda-linux-gcc-py37:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-conda-linux-gcc-py37
- conda-linux-gcc-py38:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-conda-linux-gcc-py38
- conda-osx-clang-py27:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-conda-osx-clang-py27
- conda-osx-clang-py36:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-conda-osx-clang-py36
- conda-osx-clang-py37:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-conda-osx-clang-py37
- conda-osx-clang-py38:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-conda-osx-clang-py38
- conda-win-vs2015-py36:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-conda-win-vs2015-py36
- conda-win-vs2015-py37:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-conda-win-vs2015-py37
- conda-win-vs2015-py38:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-conda-win-vs2015-py38
- debian-buster:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-debian-buster
- debian-stretch:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-azure-debian-stretch
- gandiva-jar-trusty:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-travis-gandiva-jar-trusty
- homebrew-cpp:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-travis-homebrew-cpp
- macos-r-autobrew:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-travis-macos-r-autobrew
- test-conda-cpp:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-cpp
- test-conda-python-2.7-pandas-latest:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-python-2.7-pandas-latest
- test-conda-python-2.7:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-python-2.7
- test-conda-python-3.6:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-python-3.6
- test-conda-python-3.7-dask-latest:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-python-3.7-dask-latest
- test-conda-python-3.7-hdfs-2.9.2:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-python-3.7-hdfs-2.9.2
- test-conda-python-3.7-pandas-latest:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-python-3.7-pandas-latest
- test-conda-python-3.7-pandas-master:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-python-3.7-pandas-master
- test-conda-python-3.7-spark-master:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-python-3.7-spark-master
- test-conda-python-3.7-turbodbc-latest:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-python-3.7-turbodbc-latest
- test-conda-python-3.7-turbodbc-master:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-python-3.7-turbodbc-master
- test-conda-python-3.7:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-python-3.7
- test-conda-python-3.8-dask-master:
  URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-01-14-0-circle-test-conda-python-3.8-dask-master
- test-conda-python-3.8-pandas-
[jira] [Created] (ARROW-7573) [Rust] Reduce boxing and cleanup
Gurwinder Singh created ARROW-7573:

Summary: [Rust] Reduce boxing and cleanup
Key: ARROW-7573
URL: https://issues.apache.org/jira/browse/ARROW-7573
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Reporter: Gurwinder Singh
Assignee: Gurwinder Singh
[jira] [Created] (ARROW-7574) [Rust] FileSource read implementation is seeking for each single byte
Jörn Horstmann created ARROW-7574:

Summary: [Rust] FileSource read implementation is seeking for each single byte
Key: ARROW-7574
URL: https://issues.apache.org/jira/browse/ARROW-7574
Project: Apache Arrow
Issue Type: Bug
Affects Versions: 0.16.0
Reporter: Jörn Horstmann

on current master branch

{code:java}
$ RUST_BACKTRACE=1 strace target/debug/parquet-read tripdata.parquet
{code}
[jira] [Created] (ARROW-7575) [R] Linux binary packaging followup
Neal Richardson created ARROW-7575:

Summary: [R] Linux binary packaging followup
Key: ARROW-7575
URL: https://issues.apache.org/jira/browse/ARROW-7575
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Neal Richardson
Assignee: Neal Richardson
Fix For: 0.16.0

After ARROW-6793 merged, I set up some nightly binary building CI and need to iterate on the install script and documentation to reflect what is available there.
[jira] [Created] (ARROW-7576) [C++][Dev] Improve fuzzing setup
Antoine Pitrou created ARROW-7576:

Summary: [C++][Dev] Improve fuzzing setup
Key: ARROW-7576
URL: https://issues.apache.org/jira/browse/ARROW-7576
Project: Apache Arrow
Issue Type: Sub-task
Components: C++, Developer Tools
Reporter: Antoine Pitrou
[jira] [Created] (ARROW-7577) [C++][CI] Check fuzzer setup in CI
Antoine Pitrou created ARROW-7577:

Summary: [C++][CI] Check fuzzer setup in CI
Key: ARROW-7577
URL: https://issues.apache.org/jira/browse/ARROW-7577
Project: Apache Arrow
Issue Type: Sub-task
Components: C++, Continuous Integration
Reporter: Antoine Pitrou

Perhaps as a cron job.
[jira] [Created] (ARROW-7578) [R] Add support for datasets with IPC files and with multiple sources
Neal Richardson created ARROW-7578:

Summary: [R] Add support for datasets with IPC files and with multiple sources
Key: ARROW-7578
URL: https://issues.apache.org/jira/browse/ARROW-7578
Project: Apache Arrow
Issue Type: Improvement
Components: C++ - Dataset, R
Reporter: Neal Richardson
Assignee: Neal Richardson
Fix For: 0.16.0
[jira] [Created] (ARROW-7579) [FlightRPC] Make Handshake optional
David Li created ARROW-7579:

Summary: [FlightRPC] Make Handshake optional
Key: ARROW-7579
URL: https://issues.apache.org/jira/browse/ARROW-7579
Project: Apache Arrow
Issue Type: Bug
Components: FlightRPC
Reporter: David Li
Fix For: 1.0.0

We should make it possible to _not_ invoke Handshake for services that don't want it. Especially when using it with flight-grpc, where the standard gRPC authentication mechanisms don't know about Flight and try to authenticate the Handshake endpoint - it's easy to forget to configure this endpoint to bypass authentication.
[jira] [Created] (ARROW-7580) [Website] 0.16 release post
Neal Richardson created ARROW-7580:

Summary: [Website] 0.16 release post
Key: ARROW-7580
URL: https://issues.apache.org/jira/browse/ARROW-7580
Project: Apache Arrow
Issue Type: Improvement
Components: Website
Reporter: Neal Richardson
Re: Timeline for next major release [was Re: Looking to 1.0]
Hi all, to help us get ready, I've started a draft blog post for the 0.16 release: https://github.com/apache/arrow-site/pull/41

We'll need to fill in the sections. Feel free to push edits to my branch, or you can also email me (personally is fine) and I can paste them in.

Neal

On Thu, Jan 9, 2020 at 5:37 PM Jacques Nadeau wrote:

> Understood and appreciated. Yeah, it can become a bit of a mess.
>
> On Thu, Jan 9, 2020 at 12:22 PM Wes McKinney wrote:
>
> > Will do -- there were many C++ and Python-related issues that I think were put in 1.0.0 / 0.16.0 overly optimistically and so I removed the Fix Version entirely (some of these had been pushed off 3-4 major releases ago). I may have removed some Fix Versions from other components that should have been rolled over -- sorry about that. It's hard to judge on some issues that have been open for 6-12 months or more.
> >
> > In general I think we should try to be more conservative about what issues we pre-emptively assign fix versions -- there may be a more constructive way that we can prioritize issues and distinguish between "optimistic" / nice-to-have issues and "must do to release" issues.
> >
> > On Thu, Jan 9, 2020 at 12:42 PM Jacques Nadeau wrote:
> >
> > > It would be helpful that when something is assigned to a release and you want to push it out, you push it to the next release as opposed to removing a fix version entirely. Thanks!
> > >
> > > On Tue, Jan 7, 2020 at 10:26 AM Wes McKinney wrote:
> > >
> > > > I just renamed the 1.0.0 release version in JIRA to 0.16.0 and will work on removing issues that are not necessary to be able to release (others, please help). If we make miraculous progress with the 1.0.0 columnar format blockers (per discussion below), we can change this back, but I think either way we should put ourselves on a critical path to have an RC cut by Friday January 24. Does that seem doable?
> > > >
> > > > On Tue, Jan 7, 2020 at 10:25 AM Wes McKinney wrote:
> > > >
> > > > > We absolutely should have a list of exactly what needs to be done to put out the 1.0.0 release, but based on what we know needs to be done I am not optimistic that it can all be accomplished before the end of January. That doesn't mean that we should assume these things won't get done before March/April time frame. If they get done sooner, let's release 1.0.0 sooner.
> > > > >
> > > > > On Mon, Jan 6, 2020 at 6:03 PM Neal Richardson wrote:
> > > > >
> > > > > > I'm all for maintaining a regular cadence of releases, but before we cast aside the idea of 1.0, I'd still encourage us to do the work of enumerating what truly must happen before we call a release 1.0 so that we can get it done. Otherwise, in April we're going to be talking about doing a 0.17 release.
> > > > > >
> > > > > > I believe I've found the issues that Wes referenced and added them as "blockers" to 1.0.0. That brings the total blocker count listed on https://cwiki.apache.org/confluence/display/ARROW/Arrow+1.0.0+Release to 10 issues, though some may be overlapping/redundant. Do we think this is an exhaustive list of blockers? Should some of these be downgraded to not-blocking? If we were to resolve all 10 of these issues, would we have consensus that we're ready for 1.0?
> > > > > >
> > > > > > Would it help to update this wiki, which seems pretty stale at this point?
> > > > > > https://cwiki.apache.org/confluence/display/ARROW/Columnar+Format+1.0+Milestone
> > > > > >
> > > > > > Thanks,
> > > > > > Neal
> > > > > >
> > > > > > On Mon, Jan 6, 2020 at 11:40 AM Bryan Cutler wrote:
> > > > > >
> > > > > > > I agree on a 0.16.0 release. In the meantime I'll try to help out with getting the Java side ready for 1.0.
> > > > > > >
> > > > > > > On Sat, Jan 4, 2020 at 7:21 PM Fan Liya wrote:
> > > > > > >
> > > > > > > > Hi Jacques,
> > > > > > > >
> > > > > > > > ARROW-4526 is interesting. I would like to try to resolve it.
> > > > > > > > Thanks a lot for the information.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Liya Fan
> > > > > > > >
> > > > > > > > On Sun, Jan 5, 2020 at 6:14 AM Jacques Nadeau <jacq...@apache.org> wrote:
> > > > > > > >
> > > > > > > > > The third ticket I was commenting on was ARROW-4526.
> > > > > > > > >
> > > > > > > > > Fan, do you want to take a shot at that one?
> > > > > > > > >
> > > > > > > > > On Fri, Jan 3, 2020 at 8:16 PM Fan Liya <liya.fa...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Jacques,
[jira] [Created] (ARROW-7581) [R] Documentation/polishing for 0.16 release
Neal Richardson created ARROW-7581:

Summary: [R] Documentation/polishing for 0.16 release
Key: ARROW-7581
URL: https://issues.apache.org/jira/browse/ARROW-7581
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Neal Richardson
Assignee: Neal Richardson
Fix For: 0.16.0

Includes updating NEWS.md
[jira] [Created] (ARROW-7582) [Rust][Flight] Unable to compile arrow.flight.protocol.rs
Krisztian Szucs created ARROW-7582:

Summary: [Rust][Flight] Unable to compile arrow.flight.protocol.rs
Key: ARROW-7582
URL: https://issues.apache.org/jira/browse/ARROW-7582
Project: Apache Arrow
Issue Type: Bug
Components: Rust
Reporter: Krisztian Szucs

Not sure exactly why, perhaps it has something to do with the recently updated dependencies:

https://github.com/apache/arrow/runs/389937707

cc [~andygrove]