[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921168#comment-16921168 ] Wes McKinney commented on ARROW-6417: - OK, I think to make things faster we need to b

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921164#comment-16921164 ] Wes McKinney commented on ARROW-6417: - The dreaded {{__memmove_avx_unaligned_erms}}

[jira] [Updated] (ARROW-6420) [Java] Improve the performance of UnionVector when getting underlying vectors

2019-09-02 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6420: -- Labels: pull-request-available (was: ) > [Java] Improve the performance of UnionVector when ge

[jira] [Created] (ARROW-6420) [Java] Improve the performance of UnionVector when getting underlying vectors

2019-09-02 Thread Liya Fan (Jira)
Liya Fan created ARROW-6420: --- Summary: [Java] Improve the performance of UnionVector when getting underlying vectors Key: ARROW-6420 URL: https://issues.apache.org/jira/browse/ARROW-6420 Project: Apache Arr

[jira] [Closed] (ARROW-6221) [Java] Improve the performance of RangeEqualVisitor for comparing variable-width vectors

2019-09-02 Thread Liya Fan (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liya Fan closed ARROW-6221. --- Resolution: Won't Fix According to the discussion in https://lists.apache.org/thread.html/8de66a11b0256d2c81

[jira] [Updated] (ARROW-6251) [Developer] Add PR merge tool to apache/arrow-site

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6251: Fix Version/s: (was: 0.15.0) 1.0.0 > [Developer] Add PR merge tool to apache

[jira] [Updated] (ARROW-6405) [Python] Add std::move wrapper for use in Cython

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6405: Fix Version/s: (was: 0.15.0) 1.0.0 > [Python] Add std::move wrapper for use

[jira] [Updated] (ARROW-6414) [Python] pyarrow cannot (de)serialise an empty MultiIndex-ed column DataFrame

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6414: Summary: [Python] pyarrow cannot (de)serialise an empty MultiIndex-ed column DataFrame (was: pyarr

[jira] [Updated] (ARROW-6418) [C++] Plasma cmake targets are not exported

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6418: Summary: [C++] Plasma cmake targets are not exported (was: Plasma cmake targets are not exported)

[jira] [Updated] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6417: Attachment: (was: 20190903_parquet_read_perf.png) > [C++][Parquet] Non-dictionary BinaryArray r

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921131#comment-16921131 ] Wes McKinney commented on ARROW-6417: - I updated the results plot to use gcc 8.3 in b

[jira] [Updated] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6417: Description: In doing some benchmarking, I have found that binary reads seem to be slower from Arro

[jira] [Updated] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6417: Attachment: 20190903_parquet_read_perf.png > [C++][Parquet] Non-dictionary BinaryArray reads from P

[jira] [Updated] (ARROW-6419) [Website] Blog post about Parquet dictionary performance work coming in 0.15.x release

2019-09-02 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6419: -- Labels: pull-request-available (was: ) > [Website] Blog post about Parquet dictionary performa

[jira] [Created] (ARROW-6419) [Website] Blog post about Parquet dictionary performance work coming in 0.15.x release

2019-09-02 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6419: --- Summary: [Website] Blog post about Parquet dictionary performance work coming in 0.15.x release Key: ARROW-6419 URL: https://issues.apache.org/jira/browse/ARROW-6419 Pr

[jira] [Resolved] (ARROW-6411) [C++][Parquet] DictEncoderImpl::PutIndicesTyped has bad performance on some systems

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-6411. - Resolution: Fixed Issue resolved by pull request 5248 [https://github.com/apache/arrow/pull/5248]

[jira] [Assigned] (ARROW-5494) [Python] Create FileSystem bindings

2019-09-02 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs reassigned ARROW-5494: -- Assignee: Krisztian Szucs > [Python] Create FileSystem bindings >

[jira] [Updated] (ARROW-6418) Plasma cmake targets are not exported

2019-09-02 Thread Tobias Mayer (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tobias Mayer updated ARROW-6418: Description: The generated arrowTargets.cmake files in the build and install directories don't cont

[jira] [Created] (ARROW-6418) Plasma cmake targets are not exported

2019-09-02 Thread Tobias Mayer (Jira)
Tobias Mayer created ARROW-6418: --- Summary: Plasma cmake targets are not exported Key: ARROW-6418 URL: https://issues.apache.org/jira/browse/ARROW-6418 Project: Apache Arrow Issue Type: Bug

[jira] [Commented] (ARROW-6404) [C++] CMake build of arrow libraries fails on Windows

2019-09-02 Thread Sutou Kouhei (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921047#comment-16921047 ] Sutou Kouhei commented on ARROW-6404: - [~ARF1] Thanks for the information. It seems t

[jira] [Updated] (ARROW-5471) [C++][Gandiva] Array offset is ignored in Gandiva projector

2019-09-02 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5471: -- Labels: pull-request-available (was: ) > [C++][Gandiva] Array offset is ignored in Gandiva proj

[jira] [Updated] (ARROW-6416) [Python] Confusing API & documentation regarding chunksizes

2019-09-02 Thread Arik Funke (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arik Funke updated ARROW-6416: -- Description: The python API and documentation regarding chunksizes is confusing in my opinion. Exampl

[jira] [Updated] (ARROW-6416) [Python] Confusing API & documentation regarding chunksizes

2019-09-02 Thread Arik Funke (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arik Funke updated ARROW-6416: -- Description: The python API and documentation regarding chunksizes is confusing in my opinion. Exampl

[jira] [Updated] (ARROW-6416) [Python] Confusing API & documentation regarding chunksizes

2019-09-02 Thread Arik Funke (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arik Funke updated ARROW-6416: -- Description: The python API and documentation regarding chunksizes is confusing in my opinion. Exampl

[jira] [Updated] (ARROW-6416) [Python] Confusing API & documentation regarding chunksizes

2019-09-02 Thread Arik Funke (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arik Funke updated ARROW-6416: -- Description: The python API and documentation regarding chunksizes is confusing in my opinion. Exampl

[jira] [Updated] (ARROW-6416) [Python] Confusing API & documentation regarding chunksizes

2019-09-02 Thread Arik Funke (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arik Funke updated ARROW-6416: -- External issue URL: https://github.com/apache/arrow/pull/5254 > [Python] Confusing API & documentation

[jira] [Updated] (ARROW-6416) [Python] Confusing API & documentation regarding chunksizes

2019-09-02 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6416: -- Labels: pull-request-available (was: ) > [Python] Confusing API & documentation regarding chun

[jira] [Created] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-02 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6417: --- Summary: [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x Key: ARROW-6417 URL: https://issues.apache.org/jira/browse/ARROW-6417

[jira] [Created] (ARROW-6416) [Python] Confusing API & documentation regarding chunksizes

2019-09-02 Thread Arik Funke (Jira)
Arik Funke created ARROW-6416: - Summary: [Python] Confusing API & documentation regarding chunksizes Key: ARROW-6416 URL: https://issues.apache.org/jira/browse/ARROW-6416 Project: Apache Arrow I

[jira] [Commented] (ARROW-5494) [Python] Create FileSystem bindings

2019-09-02 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920956#comment-16920956 ] Krisztian Szucs commented on ARROW-5494: Yep, I might start to work on it because

[jira] [Commented] (ARROW-5494) [Python] Create FileSystem bindings

2019-09-02 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920953#comment-16920953 ] Antoine Pitrou commented on ARROW-5494: --- [~kszucs] You may be interested too. > [P

[jira] [Assigned] (ARROW-6141) [C++] Enable memory-mapping a file region that is offset from the beginning of the file

2019-09-02 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-6141: - Assignee: Yuan Zhou > [C++] Enable memory-mapping a file region that is offset from the

[jira] [Resolved] (ARROW-6141) [C++] Enable memory-mapping a file region that is offset from the beginning of the file

2019-09-02 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-6141. --- Fix Version/s: 0.15.0 Resolution: Fixed Issue resolved by pull request 5101 [https://g

[jira] [Assigned] (ARROW-6063) [FlightRPC] Implement "half-closed" semantics for DoPut

2019-09-02 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-6063: - Assignee: lidavidm > [FlightRPC] Implement "half-closed" semantics for DoPut > -

[jira] [Resolved] (ARROW-6063) [FlightRPC] Implement "half-closed" semantics for DoPut

2019-09-02 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-6063. --- Fix Version/s: 0.15.0 Resolution: Fixed Issue resolved by pull request 5196 [https://g

[jira] [Updated] (ARROW-6412) [C++] arrow-flight-test can crash because of port allocation

2019-09-02 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6412: -- Labels: pull-request-available (was: ) > [C++] arrow-flight-test can crash because of port all

[jira] [Created] (ARROW-6415) [R] Remove usage of R CMD config CXXCPP

2019-09-02 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-6415: -- Summary: [R] Remove usage of R CMD config CXXCPP Key: ARROW-6415 URL: https://issues.apache.org/jira/browse/ARROW-6415 Project: Apache Arrow Issue Type:

[jira] [Updated] (ARROW-6413) [R] Support autogenerating column names

2019-09-02 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6413: -- Labels: pull-request-available (was: ) > [R] Support autogenerating column names > ---

[jira] [Updated] (ARROW-6414) pyarrow cannot (de)serialise an empty MultiIndex-ed column DataFrame

2019-09-02 Thread Stpehen Gowdy (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stpehen Gowdy updated ARROW-6414: - Description: If you have an empty multiindex columns in a pandas dataframe pyarrow cannot serial

[jira] [Created] (ARROW-6414) pyarrow cannot (de)serialise an empty MultiIndex-ed column DataFrame

2019-09-02 Thread Stpehen Gowdy (Jira)
Stpehen Gowdy created ARROW-6414: Summary: pyarrow cannot (de)serialise an empty MultiIndex-ed column DataFrame Key: ARROW-6414 URL: https://issues.apache.org/jira/browse/ARROW-6414 Project: Apache Ar

[jira] [Created] (ARROW-6413) [R] Support autogenerating column names

2019-09-02 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-6413: -- Summary: [R] Support autogenerating column names Key: ARROW-6413 URL: https://issues.apache.org/jira/browse/ARROW-6413 Project: Apache Arrow Issue Type:

[jira] [Updated] (ARROW-6231) [C++][Python] Consider assigning default column names when reading CSV file and header_rows=0

2019-09-02 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-6231: --- Component/s: C++ > [C++][Python] Consider assigning default column names when reading CSV fil

[jira] [Updated] (ARROW-6231) [C++][Python] Consider assigning default column names when reading CSV file and header_rows=0

2019-09-02 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-6231: --- Summary: [C++][Python] Consider assigning default column names when reading CSV file and head

[jira] [Assigned] (ARROW-6412) [C++] arrow-flight-test can crash because of port allocation

2019-09-02 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-6412: - Assignee: Antoine Pitrou > [C++] arrow-flight-test can crash because of port allocation

[jira] [Created] (ARROW-6412) [C++] arrow-flight-test can crash because of port allocation

2019-09-02 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-6412: - Summary: [C++] arrow-flight-test can crash because of port allocation Key: ARROW-6412 URL: https://issues.apache.org/jira/browse/ARROW-6412 Project: Apache Arrow

[jira] [Updated] (ARROW-6316) [Go] Make change to ensure flatbuffer reads are aligned

2019-09-02 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6316: -- Labels: pull-request-available (was: ) > [Go] Make change to ensure flatbuffer reads are alig

[jira] [Resolved] (ARROW-6383) [Java] report outstanding child allocators on parent allocator close

2019-09-02 Thread Pindikura Ravindra (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pindikura Ravindra resolved ARROW-6383. --- Fix Version/s: 0.15.0 Resolution: Fixed Issue resolved by pull request 5227 [h

[jira] [Comment Edited] (ARROW-6404) [C++] CMake build of arrow libraries fails on Windows

2019-09-02 Thread Arik Funke (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920715#comment-16920715 ] Arik Funke edited comment on ARROW-6404 at 9/2/19 9:09 AM: --- [~k

[jira] [Comment Edited] (ARROW-6404) [C++] CMake build of arrow libraries fails on Windows

2019-09-02 Thread Arik Funke (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920715#comment-16920715 ] Arik Funke edited comment on ARROW-6404 at 9/2/19 9:08 AM: --- [~k

[jira] [Comment Edited] (ARROW-6404) [C++] CMake build of arrow libraries fails on Windows

2019-09-02 Thread Arik Funke (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920715#comment-16920715 ] Arik Funke edited comment on ARROW-6404 at 9/2/19 9:08 AM: --- [~k

[jira] [Commented] (ARROW-6404) [C++] CMake build of arrow libraries fails on Windows

2019-09-02 Thread Arik Funke (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920715#comment-16920715 ] Arik Funke commented on ARROW-6404: --- [~kou] Thanks for letting me know. If you are upd