[jira] [Created] (ARROW-5754) [C++]Missing override for ~GrpcStreamWriter?
Kenta Murata created ARROW-5754: --- Summary: [C++]Missing override for ~GrpcStreamWriter? Key: ARROW-5754 URL: https://issues.apache.org/jira/browse/ARROW-5754 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Kenta Murata I encountered the following compile error: {{../src/arrow/flight/client.cc:244:3: error: '~GrpcStreamWriter' overrides a destructor but is not marked 'override' [-Werror,-Winconsistent-missing-destructor-override] ~GrpcStreamWriter() = default; ^ ../src/arrow/flight/client.h:86:27: note: overridden virtual function is here class ARROW_FLIGHT_EXPORT FlightStreamWriter : public ipc::RecordBatchWriter { ^}} Putting override modifier can resolve this problem. I'll make a pull-request for the change. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5753) [Rust] Fix code coverage in CI
Chao Sun created ARROW-5753: --- Summary: [Rust] Fix code coverage in CI Key: ARROW-5753 URL: https://issues.apache.org/jira/browse/ARROW-5753 Project: Apache Arrow Issue Type: Bug Components: Rust Reporter: Chao Sun Assignee: Chao Sun Rust code coverage in CI has been broken for a while now. We should fix it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5752) [Java] Improve the performance of ArrowBuf#setZero
Liya Fan created ARROW-5752: --- Summary: [Java] Improve the performance of ArrowBuf#setZero Key: ARROW-5752 URL: https://issues.apache.org/jira/browse/ARROW-5752 Project: Apache Arrow Issue Type: Improvement Components: Java Reporter: Liya Fan Assignee: Liya Fan The current implementation involves repeated calls of setLong, setInt & setByte. It is more efficient to directly call the native function. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [Discuss][Java] Shading Flatbuffer dependency
Thanks for opening the discussion and for the proposal, Micah. Looks reasonable to me. Thanks Ji Liu -- From: Micah Kornfield Sent: Wednesday, June 26, 2019, 20:35 To: dev@arrow.apache.org Subject: Re: [Discuss][Java] Shading Flatbuffer dependency There is a proposed PR [1] now that shades flatbuffers by consolidating all related code in the vector module and provides wrapping classes upstream for flight. It deletes the format module. Feedback welcome. Thanks, Micah [1] https://github.com/apache/arrow/pull/4701 On Thursday, June 20, 2019, Micah Kornfield wrote: > ARROW-5579 [1] brought to our attention that the core flat buffer library > makes no guarantees of compatibility with generated class files that were > generated with a different version. > > Properly shading the dependency seems to require using the shaded version > across all of our sub-project. Ji Liu followed this approach and created > a pull request [2] that shades the dependency in the "format" module and > then uses the shaded version in downstream modules. > > This is an invasive change and has the potential to break IDE development > workflows [3] so I figured it was worth discussing on the mailing list. > > Questions: > 1. Do we want to shade the package? > 2. Is there a better way to accomplish this that would continue to allow > full IDE support? > 3. Are we ok making the development process harder for Java developers (I > wonder if eclipse or other IDEs have similar issues?) > > Thanks, > Micah > > > [1] https://issues.apache.org/jira/browse/ARROW-5579 > [2] https://github.com/apache/arrow/pull/4629 > [3] https://youtrack.jetbrains.com/issue/IDEA-93855 >
[jira] [Created] (ARROW-5751) [Packaging][Python] Python 2.7 wheels broken on macOS: libcares.2.dylib not found
Philipp Moritz created ARROW-5751: - Summary: [Packaging][Python] Python 2.7 wheels broken on macOS: libcares.2.dylib not found Key: ARROW-5751 URL: https://issues.apache.org/jira/browse/ARROW-5751 Project: Apache Arrow Issue Type: Improvement Reporter: Philipp Moritz I'm afraid that while [https://github.com/apache/arrow/pull/4685] fixed the macOS wheels for Python 3, the Python 2.7 wheel is still broken (with a different error): {code:java} ImportError: dlopen(/Users/pcmoritz/anaconda3/lib/python3.6/site-packages/pyarrow/lib.cpython-36m-darwin.so, 2): Library not loaded: /usr/local/opt/c-ares/lib/libcares.2.dylib Referenced from: /Users/pcmoritz/anaconda3/lib/python3.6/site-packages/pyarrow/libarrow_python.14.dylib Reason: image not found{code} I tried the same hack as in [https://github.com/apache/arrow/pull/4685] for libcares, but it doesn't work (removing the .dylib makes one of the earlier build steps fail). I think the only way forward is to compile grpc ourselves. My attempt to do this in [https://github.com/apache/arrow/compare/master...pcmoritz:mac-wheels-py2] fails because OpenSSL is not found even though I'm specifying OPENSSL_ROOT_DIR (see [https://travis-ci.org/pcmoritz/crossbow/builds/550603543]). Let me know if you have any ideas on how to fix this! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5750) [Java] Java compilation failures on master
Wes McKinney created ARROW-5750: --- Summary: [Java] Java compilation failures on master Key: ARROW-5750 URL: https://issues.apache.org/jira/browse/ARROW-5750 Project: Apache Arrow Issue Type: Bug Components: Java Reporter: Wes McKinney Fix For: 0.14.0 Two Flight-related Java patches were merged today and we have compilation failures on master now: https://travis-ci.org/apache/arrow/jobs/551015006#L956 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5749) [Python] Add Python binding for Table::CombineChunks()
Zhuo Peng created ARROW-5749: Summary: [Python] Add Python binding for Table::CombineChunks() Key: ARROW-5749 URL: https://issues.apache.org/jira/browse/ARROW-5749 Project: Apache Arrow Issue Type: Improvement Components: Python Reporter: Zhuo Peng Assignee: Zhuo Peng Fix For: 0.14.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5748) [Packaging][deb] Add support for Debian GNU/Linux buster
Sutou Kouhei created ARROW-5748: --- Summary: [Packaging][deb] Add support for Debian GNU/Linux buster Key: ARROW-5748 URL: https://issues.apache.org/jira/browse/ARROW-5748 Project: Apache Arrow Issue Type: Improvement Components: Packaging Reporter: Sutou Kouhei Assignee: Sutou Kouhei Fix For: 0.14.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5747) [C++] Better column name and header support in CSV reader
Neal Richardson created ARROW-5747: -- Summary: [C++] Better column name and header support in CSV reader Key: ARROW-5747 URL: https://issues.apache.org/jira/browse/ARROW-5747 Project: Apache Arrow Issue Type: Improvement Reporter: Neal Richardson While working on ARROW-5500, I found a number of issues around the CSV parse options {{header_rows}}: * If header_rows is 0, [the reader errors|https://github.com/apache/arrow/blob/8b0318a11bba2aa2cf39bff245ff916a3283d372/cpp/src/arrow/csv/reader.cc#L150] * It's not possible to supply your own column names, as [this TODO|https://github.com/apache/arrow/blob/8b0318a11bba2aa2cf39bff245ff916a3283d372/cpp/src/arrow/csv/reader.cc#L149] notes. ARROW-4912 allows renaming columns after reading in, which _maybe_ is enough as long as header_rows == 0 doesn't error, but then you can't naturally specify column types in the convert options because that takes a map of column name to type. * If header_rows is > 1, every cell gets turned into a column name, so if header_rows == 2, you get twice the number of column names as columns. This doesn't error, but it leads to unexpected results. IMO a better interface would be to have a {{skip_rows}} argument to let you ignore a large header, and a {{column_names}} argument that, if provided, gives the column names. If not provided, the first row after {{skip_rows}} is taken to be the column names. I don't think there's value in trying to be clever about multirow headers and converting those to column names; if there's meaningful information in a tall header, let the user parse it themselves. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
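The interface proposed in ARROW-5747 can be sketched in a few lines. This is only an illustration of the suggested {{skip_rows}} / {{column_names}} semantics using Python's standard csv module, not Arrow's CSV reader; the helper name read_csv_sketch and the row-width check are assumptions made for the example.

```python
import csv
import io


def read_csv_sketch(text, skip_rows=0, column_names=None):
    """Hypothetical helper illustrating the proposed semantics; not Arrow's API."""
    rows = list(csv.reader(io.StringIO(text)))
    # Drop the leading rows the caller asked to ignore (e.g. a tall header).
    rows = rows[skip_rows:]
    if column_names is None:
        # No names supplied: the first row after skip_rows provides them.
        if not rows:
            raise ValueError("no row left to use as column names")
        column_names, rows = rows[0], rows[1:]
    # Assumption for this sketch: every data row must match the name count.
    for row in rows:
        if len(row) != len(column_names):
            raise ValueError("row width does not match column names")
    return list(column_names), rows


text = "exported 2019-06-26\na,b\n1,2\n3,4\n"
# Skip the one-line preamble; names are then taken from the "a,b" row.
names, data = read_csv_sketch(text, skip_rows=1)
# Skip the preamble and the original header; supply names explicitly.
names2, data2 = read_csv_sketch(text, skip_rows=2, column_names=["x", "y"])
```

With explicit names available before any data is parsed, a name-to-type map in the convert options could be resolved up front, which is the gap the ticket points out for rename-after-read.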
[jira] [Created] (ARROW-5746) [Website] Move website source out of apache/arrow
Neal Richardson created ARROW-5746: -- Summary: [Website] Move website source out of apache/arrow Key: ARROW-5746 URL: https://issues.apache.org/jira/browse/ARROW-5746 Project: Apache Arrow Issue Type: Improvement Components: Website Reporter: Neal Richardson Possibly to apache/arrow-site, which already exists for hosting the static built site. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [DISCUSS] Ongoing Travis CI service degradation
Also note that the situation with AppVeyor isn't much better. Any "free as in beer" CI service is probably too capacity-limited for our needs now, unless it allows private workers (which apparently Gitlab CI does). Regards Antoine. Le 26/06/2019 à 18:32, Wes McKinney a écrit : > It seems that there is intermittent Apache-wide degradation of Travis > CI services -- I was looking at https://travis-ci.org/apache today and > there appeared to be a stretch of 3-4 hours where no queued builds on > github.com/apache were running at all. I initially thought that the > issue was contention with other Apache projects but even with > round-robin allocation and a concurrency limit (e.g. no Apache project > having more than 5-6 concurrent builds) that wouldn't explain why NO > builds are running. > > This is obviously disturbing given how reliant we are on Travis CI to > validate patches to be merged. > > I've opened a support ticket with Travis CI to see if they can provide > some insight into what's going on. There is also an INFRA ticket where > other projects have reported some similar experiences > > https://issues.apache.org/jira/browse/INFRA-18533 > > As a meta-comment, at some point Apache Arrow is going to need to move > off of public CI services for patch validation so that we can have > unilateral control over scaling our build / test resources as the > community grows larger. As the most active merger of patches (I have > merged over 50% of pull requests over the project's history) this > affects me greatly as I am often monitoring builds on many open PRs so > that I can merge them as soon as possible. We are often resorting to > builds on contributor's forks (assuming they have enabled Travis CI / > Appveyor) > > As some context around Travis CI in particular, in January Travis CI > was acquired by Idera, a private equity (I think?) developer tools > conglomerate. It's likely that we're seeing some "maximize profit, > minimize costs" behavior in play, so the recent experience could > become the new normal. > > - Wes >
[DISCUSS] Ongoing Travis CI service degradation
It seems that there is intermittent Apache-wide degradation of Travis CI services -- I was looking at https://travis-ci.org/apache today and there appeared to be a stretch of 3-4 hours where no queued builds on github.com/apache were running at all. I initially thought that the issue was contention with other Apache projects but even with round-robin allocation and a concurrency limit (e.g. no Apache project having more than 5-6 concurrent builds) that wouldn't explain why NO builds are running. This is obviously disturbing given how reliant we are on Travis CI to validate patches to be merged. I've opened a support ticket with Travis CI to see if they can provide some insight into what's going on. There is also an INFRA ticket where other projects have reported some similar experiences https://issues.apache.org/jira/browse/INFRA-18533 As a meta-comment, at some point Apache Arrow is going to need to move off of public CI services for patch validation so that we can have unilateral control over scaling our build / test resources as the community grows larger. As the most active merger of patches (I have merged over 50% of pull requests over the project's history) this affects me greatly as I am often monitoring builds on many open PRs so that I can merge them as soon as possible. We are often resorting to builds on contributor's forks (assuming they have enabled Travis CI / Appveyor) As some context around Travis CI in particular, in January Travis CI was acquired by Idera, a private equity (I think?) developer tools conglomerate. It's likely that we're seeing some "maximize profit, minimize costs" behavior in play, so the recent experience could become the new normal. - Wes
[jira] [Created] (ARROW-5745) [C++] properties of Map(Array|Type) are confusingly named
Benjamin Kietzman created ARROW-5745: Summary: [C++] properties of Map(Array|Type) are confusingly named Key: ARROW-5745 URL: https://issues.apache.org/jira/browse/ARROW-5745 Project: Apache Arrow Issue Type: New Feature Components: C++ Reporter: Benjamin Kietzman Assignee: Benjamin Kietzman In the context of ListArrays, "values" indicates the elements in a slot of the ListArray. Since MapArray is a ListArray, "values" indicates the same thing, and the elements are key-item pairs. This naming scheme is not idiomatic; these *should* be called key-value pairs, but that would require propagating the renaming down to ListArray. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
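The naming clash in ARROW-5745 is easier to see with a toy model. The following is a plain-Python sketch, not Arrow's API: the variable names and the slot_values helper are illustrative assumptions. It models a list-like array as offsets into a flat child sequence, where each child element of a map is a (key, item) pair.

```python
# Plain-Python model of a list-like layout: offsets into a flat child sequence.
offsets = [0, 2, 2, 3]                    # three slots; the middle one is empty
entries = [("a", 1), ("b", 2), ("c", 3)]  # flat child: one (key, item) pair each


def slot_values(offsets, child, i):
    """Return the "values" of slot i in the ListArray sense: a slice of the child."""
    return child[offsets[i]:offsets[i + 1]]


first = slot_values(offsets, entries, 0)   # the pairs stored in slot 0
middle = slot_values(offsets, entries, 1)  # an empty map
# In the list sense, "values" are whole pairs; the second element of each
# pair is what the ticket calls the "item".
items_of_first = [item for _key, item in first]
```

In this model the list-level "values" are whole pairs, which is why the second member of each pair ends up being called the "item" rather than the "value".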
Re: Arrow sync call tomorrow (June 26) at 12:00 US/Eastern, 16:00 UTC
Attendees: Ben Kietzman, Wes McKinney, John Muehlhausen, Neal Richardson, François Saint-Jacques, Shyam Singh. Discussion: 0.14 release: * Trouble in past releases with Java Gandiva ( https://issues.apache.org/jira/browse/ARROW-4301). Shyam says that they'll follow up on email * Travis-CI backups causing delays, will discuss on mailing list Ben: naming for MapArray given that it is a subclass of ListArray, "values" is overloaded. Resolved to open a jira ticket. On Tue, Jun 25, 2019 at 5:18 PM Neal Richardson wrote: > Hi everyone, > Reminder that the biweekly Arrow call is tomorrow at > https://meet.google.com/vtm-teks-phx. All are welcome to join. I suspect > a significant topic of discussion will be the upcoming release and any > remaining blockers (of which > https://cwiki.apache.org/confluence/display/ARROW/Arrow+0.14.0+Release > still shows several), but we don't need to limit ourselves to that topic if > there are other items for the agenda. > > Notes will be sent out to the mailing list afterwards. > > Neal >
[jira] [Created] (ARROW-5744) [C++] arrow::Concatenate does not check for BinaryArray offset overflows
Wes McKinney created ARROW-5744: --- Summary: [C++] arrow::Concatenate does not check for BinaryArray offset overflows Key: ARROW-5744 URL: https://issues.apache.org/jira/browse/ARROW-5744 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Wes McKinney Fix For: 0.14.0 Discovered during ARROW-5635 code review -- This message was sent by Atlassian JIRA (v7.6.3#76005)
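For context on ARROW-5744: BinaryArray stores signed 32-bit offsets, so concatenating arrays whose combined data exceeds 2**31 - 1 bytes cannot be represented. The kind of check the ticket asks for can be sketched as follows. This is a plain-Python illustration, not arrow::Concatenate; the helper name concatenate_offsets and the choice of OverflowError are assumptions.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit offset can hold


def concatenate_offsets(offset_arrays):
    """Merge per-array offsets into one offsets list, refusing to overflow int32.

    Hypothetical helper for illustration; it is not arrow::Concatenate.
    Each input is a non-empty, non-decreasing list such as [0, 3, 5].
    """
    merged = [0]
    for offsets in offset_arrays:
        base = merged[-1]   # where this array's data starts in the output
        first = offsets[0]  # non-zero when the input is a slice
        for off in offsets[1:]:
            new_off = base + (off - first)
            if new_off > INT32_MAX:
                raise OverflowError(
                    "concatenated data does not fit in int32 offsets")
            merged.append(new_off)
    return merged


small = concatenate_offsets([[0, 3, 5], [0, 2]])
sliced = concatenate_offsets([[2, 4, 7]])
```

Python integers do not wrap, so the comparison is exact here; a C++ implementation would need to do the addition in a wider type (or check before adding) to avoid the overflow it is trying to detect.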
Re: Idiosyncratic failing builds on CI infrastructure
Thanks Antoine for figuring out the root causes, it was kind of the perfect storm of issues (corrupt conda packages, Boost CMake configuration issues) On Wed, Jun 26, 2019 at 6:57 AM Antoine Pitrou wrote: > > > Hi, > > I pushed a fix for these issues. Hopefully CI should be green again, > until the next firefighting. > > Regards > > Antoine. > > > Le 26/06/2019 à 03:27, Wes McKinney a écrit : > > We are seeing persistent failures due to a couple of issues seemingly > > unrelated to any code changes in the project > > > > * https://issues.apache.org/jira/browse/ARROW-5732 > > * https://issues.apache.org/jira/browse/ARROW-5735 > > > > The Appveyor failure is especially weird since here is a passing build > > 6 hours ago on my fork > > > > https://ci.appveyor.com/project/wesm/arrow/builds/25537882 > > > > where here is master failing after merging this patch to master > > > > https://ci.appveyor.com/project/wesm/arrow/builds/25543386 > > > > I'm out of steam to debug these issues further today, but if anyone > > has any idea it would be nice to get our CI stabilized so we can > > continue working to close out the outstanding 0.14.0 issues. > > > > Thanks, > > Wes > >
[jira] [Created] (ARROW-5743) [C++] Add CMake option to enable "large memory" unit tests
Wes McKinney created ARROW-5743: --- Summary: [C++] Add CMake option to enable "large memory" unit tests Key: ARROW-5743 URL: https://issues.apache.org/jira/browse/ARROW-5743 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Wes McKinney Fix For: 1.0.0 We have a number of unit tests that need to exercise code paths where memory in excess of 2-4GB is allocated. Some of these are marked as {{DISABLED_*}} in googletest which seems to be a recipe for bitrot. I propose instead to have a CMake option that sets a compiler definition to enable these tests at build time, so that they can be run regularly on machines that have adequate RAM (i.e. not public CI services) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5742) [CI] Add daily / weekly Valgrind build
Antoine Pitrou created ARROW-5742: - Summary: [CI] Add daily / weekly Valgrind build Key: ARROW-5742 URL: https://issues.apache.org/jira/browse/ARROW-5742 Project: Apache Arrow Issue Type: Wish Components: C++, Continuous Integration Reporter: Antoine Pitrou Fix For: 1.0.0 A daily or weekly Valgrind build on the ursa-labs machines would further check sanity of the C++ code base, though with ASAN and UBSAN builds on Travis-CI we're already well covered. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5741) [JS] Make numeric vector from functions consistent with TypedArray.from
Brian Hulette created ARROW-5741: Summary: [JS] Make numeric vector from functions consistent with TypedArray.from Key: ARROW-5741 URL: https://issues.apache.org/jira/browse/ARROW-5741 Project: Apache Arrow Issue Type: Improvement Components: JavaScript Reporter: Brian Hulette Described in https://lists.apache.org/thread.html/b648a781cba7f10d5a6072ff2e7dab6c03e2d1f12e359d9261891486@%3Cdev.arrow.apache.org%3E -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5740) [JS] Add ability to run tests in headless browsers
Brian Hulette created ARROW-5740: Summary: [JS] Add ability to run tests in headless browsers Key: ARROW-5740 URL: https://issues.apache.org/jira/browse/ARROW-5740 Project: Apache Arrow Issue Type: Task Components: JavaScript Reporter: Brian Hulette Now that we have a compatibility check that modifies behavior based on the features in a supported browser, we should really be running our tests in various browsers to exercise the various cases. For example right now we don't actually run tests on the non-BigNum code. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5739) [CI] Fix docker python build
Francois Saint-Jacques created ARROW-5739: - Summary: [CI] Fix docker python build Key: ARROW-5739 URL: https://issues.apache.org/jira/browse/ARROW-5739 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration Reporter: Francois Saint-Jacques The Python docker image fails to clean the build directory, so it installs the output of a previous invocation of `docker-compose run python`. This does not affect CI, which drops the `/build` mount; it only affects local users. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-5738) [Crossbow][Conda] OSX package builds are failing with missing intrinsics
Krisztian Szucs created ARROW-5738: -- Summary: [Crossbow][Conda] OSX package builds are failing with missing intrinsics Key: ARROW-5738 URL: https://issues.apache.org/jira/browse/ARROW-5738 Project: Apache Arrow Issue Type: Bug Components: Packaging Reporter: Krisztian Szucs Assignee: Krisztian Szucs Fix For: 0.14.0 Failing builds: https://github.com/ursa-labs/crossbow/branches/all?utf8=%E2%9C%93&query=build-653 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [Discuss][Java] Shading Flatbuffer dependency
There is a proposed PR [1] now that shades flatbuffers by consolidating all related code in the vector module and provides wrapping classes upstream for flight. It deletes the format module. Feedback welcome. Thanks, Micah [1] https://github.com/apache/arrow/pull/4701 On Thursday, June 20, 2019, Micah Kornfield wrote: > ARROW-5579 [1] brought to our attention that the core flat buffer library > makes no guarantees of compatibility with generated class files that were > generated with a different version. > > Properly shading the dependency seems to require using the shaded version > across all of our sub-project. Ji Liu followed this approach and created > a pull request [2] that shades the dependency in the "format" module and > then uses the shaded version in downstream modules. > > This is an invasive change and has the potential to break IDE development > workflows [3] so I figured it was worth discussing on the mailing list. > > Questions: > 1. Do we want to shade the package? > 2. Is there a better way to accomplish this that would continue to allow > full IDE support? > 3. Are we ok making the development process harder for Java developers (I > wonder if eclipse or other IDEs have similar issues?) > > Thanks, > Micah > > > [1] https://issues.apache.org/jira/browse/ARROW-5579 > [2] https://github.com/apache/arrow/pull/4629 > [3] https://youtrack.jetbrains.com/issue/IDEA-93855 >
Re: Idiosyncratic failing builds on CI infrastructure
Hi, I pushed a fix for these issues. Hopefully CI should be green again, until the next firefighting. Regards Antoine. Le 26/06/2019 à 03:27, Wes McKinney a écrit : > We are seeing persistent failures due to a couple of issues seemingly > unrelated to any code changes in the project > > * https://issues.apache.org/jira/browse/ARROW-5732 > * https://issues.apache.org/jira/browse/ARROW-5735 > > The Appveyor failure is especially weird since here is a passing build > 6 hours ago on my fork > > https://ci.appveyor.com/project/wesm/arrow/builds/25537882 > > where here is master failing after merging this patch to master > > https://ci.appveyor.com/project/wesm/arrow/builds/25543386 > > I'm out of steam to debug these issues further today, but if anyone > has any idea it would be nice to get our CI stabilized so we can > continue working to close out the outstanding 0.14.0 issues. > > Thanks, > Wes >
[jira] [Created] (ARROW-5737) Gandiva not building in manylinux
Praveen Kumar Desabandu created ARROW-5737: -- Summary: Gandiva not building in manylinux Key: ARROW-5737 URL: https://issues.apache.org/jira/browse/ARROW-5737 Project: Apache Arrow Issue Type: Bug Components: C++ - Gandiva Reporter: Praveen Kumar Desabandu The Gandiva manylinux builds have started failing after [https://github.com/apache/arrow/commit/0fc5bc429fbe527b1e42db4307cde8d0ce2818c6]. ninja is unable to interpret the [make_precompiled_bitcode.py|https://github.com/apache/arrow/commit/0fc5bc429fbe527b1e42db4307cde8d0ce2818c6#diff-456ea80d0a4228a2dbf98b5d47615e07] correctly. [~pitrou] - I tried to fix it but did not make much progress :) Could you please help out? The error is: File "/arrow/cpp/src/gandiva/precompiled/../make_precompiled_bitcode.py", line 22 marker = B"" SyntaxError: invalid syntax -- This message was sent by Atlassian JIRA (v7.6.3#76005)