[jira] [Updated] (ARROW-6592) [Java] Add support for skipping decoding of columns/field in Avro converter
[ https://issues.apache.org/jira/browse/ARROW-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6592: -- Labels: avro pull-request-available (was: avro) > [Java] Add support for skipping decoding of columns/field in Avro converter > --- > > Key: ARROW-6592 > URL: https://issues.apache.org/jira/browse/ARROW-6592 > Project: Apache Arrow > Issue Type: Sub-task > Components: Java >Reporter: Micah Kornfield >Assignee: Ji Liu >Priority: Major > Labels: avro, pull-request-available > > Users should be able to pass in a set of fields they wish to decode from Avro > and the converter should avoid creating Vectors in the returned > ArrowSchemaRoot. This would ideally support nested columns so if there was: > > Struct A { > int B; > int C; > } > > The user could choose to only read A.B or A.C or both. -- This message was sent by Atlassian Jira (v8.3.4#803005)
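The nested selection described in ARROW-6592 (read only A.B, only A.C, or both) can be illustrated with a small sketch. This is not the Avro converter's API: the dict-based schema model, the dotted-path convention, and the name `select_fields` are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical schema model: a leaf maps to a type name, a struct maps to a dict.
Schema = Dict[str, Any]


def select_fields(schema: Schema, wanted: Iterable[str]) -> Schema:
    """Return a pruned schema containing only the dotted paths in `wanted`.

    Selecting a struct by name (e.g. "A") keeps all of its children;
    selecting "A.B" keeps only child B of struct A.
    """
    # Group the requested paths by their first component.
    by_head: Dict[str, List[str]] = {}
    for path in wanted:
        head, _, rest = path.partition(".")
        by_head.setdefault(head, []).append(rest)

    pruned: Schema = {}
    for name, node in schema.items():
        if name not in by_head:
            continue  # not requested: no vector would be created for it
        rests = by_head[name]
        if "" in rests or not isinstance(node, dict):
            pruned[name] = node  # whole field requested, or a leaf
        else:
            pruned[name] = select_fields(node, rests)
    return pruned


schema = {"A": {"B": "int", "C": "int"}, "D": "string"}
only_b = select_fields(schema, {"A.B"})
both = select_fields(schema, {"A.B", "A.C"})
```

A real converter would still have to skip over the bytes of unselected fields while decoding. The sketch only covers deciding which vectors to create.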
[jira] [Resolved] (ARROW-6601) [Java] Improve JDBC adapter performance & add benchmark
[ https://issues.apache.org/jira/browse/ARROW-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield resolved ARROW-6601. Fix Version/s: 0.15.0 Resolution: Fixed Issue resolved by pull request 5472 [https://github.com/apache/arrow/pull/5472] > [Java] Improve JDBC adapter performance & add benchmark > --- > > Key: ARROW-6601 > URL: https://issues.apache.org/jira/browse/ARROW-6601 > Project: Apache Arrow > Issue Type: Task > Components: Java >Reporter: Ji Liu >Assignee: Ji Liu >Priority: Critical > Labels: pull-request-available > Fix For: 0.15.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Add a performance test as well to get a baseline number, to avoid performance > regression when we change related code. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (ARROW-5580) [C++][Gandiva] Correct definitions of timestamp functions in Gandiva
[ https://issues.apache.org/jira/browse/ARROW-5580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prudhvi Porandla closed ARROW-5580. --- Resolution: Fixed > [C++][Gandiva] Correct definitions of timestamp functions in Gandiva > > > Key: ARROW-5580 > URL: https://issues.apache.org/jira/browse/ARROW-5580 > Project: Apache Arrow > Issue Type: Task > Components: C++ - Gandiva >Reporter: Prudhvi Porandla >Assignee: Prudhvi Porandla >Priority: Minor > Labels: pull-request-available > Time Spent: 2h 40m > Remaining Estimate: 0h > > Timestamp functions are unsupported in Gandiva due to definition mismatch. > For example, Gandiva supports timestampAddMonth(timestamp, int32) but the > expected signature is timestampAddMonth(int32, timestamp). > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-4930) [Python] Remove LIBDIR assumptions in Python build
[ https://issues.apache.org/jira/browse/ARROW-4930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936383#comment-16936383 ] Suvayu Ali commented on ARROW-4930: ---
Hi [~apitrou], I have had limited success so far. [I was working off of master, {{git describe}} says: {{apache-arrow-0.14.0-584-g176adf5a0}}] This is what I found:
1. {{setup.py}} assumes the library directory is {{$ARROW_HOME/lib}} when setting {{PKG_CONFIG_PATH}} in the environment (line 253). I believe this is a bit of a hack, which is also mentioned by the author in ARROW-1090, the issue that tracked that change. The resolution should be somewhere in the cmake scripts.
2. I successfully detected {{libarrow}} with the attached patch [^FindArrow.cmake.patch].
3. However, I then failed to detect {{libparquet}}. On further investigation I found (AFAIU) that even though {{FindParquet.cmake}} sets {{ARROW_HOME}}, it is not used. However, it does use {{PARQUET_HOME}}. Since my CMake-fu is a bit weak, I worked up a similar patch [^FindParquet.cmake.patch] as before and set {{export PARQUET_HOME=$ARROW_HOME}} in the terminal. This allowed the compilation to succeed.
The compilation commands I used for C++ and Python are:
{code:java}
$ cmake -G Ninja -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
    -DARROW_FLIGHT=ON -DARROW_GANDIVA=ON -DARROW_ORC=ON \
    -DARROW_PARQUET=ON -DPYTHON_EXECUTABLE=/usr/bin/python3.7m \
    -DARROW_PYTHON=ON -DARROW_PLASMA=ON \
    -DARROW_BUILD_TESTS=ON -DLLVM_DIR=/usr/lib64/llvm7.0 ..
$ python3 setup.py build_ext --cmake-generator Ninja --inplace
{code}
I then tried to run the Python tests with {{pytest-3 pyarrow}}. The summary was:
{quote}5 failed, 1411 passed, 59 skipped, 4 xfailed, 29 warnings in 28.30 seconds {quote}
The failures are all setup-related issues of some kind: not being able to import, not being able to start plasma, etc.
I'll investigate this further, but my take is the cmake scripts don't actually have _one way_ of detecting the libraries, making it very difficult to configure it properly from setup.py. > [Python] Remove LIBDIR assumptions in Python build > -- > > Key: ARROW-4930 > URL: https://issues.apache.org/jira/browse/ARROW-4930 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.12.1 >Reporter: Suvayu Ali >Priority: Minor > Labels: setup.py > Fix For: 2.0.0 > > Attachments: FindArrow.cmake.patch, FindParquet.cmake.patch > > > This is in reference to (4) in > [this|http://mail-archives.apache.org/mod_mbox/arrow-dev/201903.mbox/%3C0AF328A1-ED2A-457F-B72D-3B49C8614850%40xhochy.com%3E] > mailing list discussion. > Certain sections of setup.py assume a specific location of the C++ libraries. > Removing this hard assumption will simplify PyArrow builds significantly. As > far as I could tell these assumptions are made in the > {{build_ext._run_cmake()}} method (wherever bundling of C++ libraries is > handled). > # The first occurrence is before invoking cmake (see line 237). > # The second occurrence is when the C++ libraries are moved from their build > directory to the Python tree (see line 347). The actual implementation is in > the function {{_move_shared_libs_unix(..)}} (see line 468). > Hope this helps. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-4930) [Python] Remove LIBDIR assumptions in Python build
[ https://issues.apache.org/jira/browse/ARROW-4930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suvayu Ali updated ARROW-4930: -- Attachment: FindParquet.cmake.patch FindArrow.cmake.patch > [Python] Remove LIBDIR assumptions in Python build > -- > > Key: ARROW-4930 > URL: https://issues.apache.org/jira/browse/ARROW-4930 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.12.1 >Reporter: Suvayu Ali >Priority: Minor > Labels: setup.py > Fix For: 2.0.0 > > Attachments: FindArrow.cmake.patch, FindParquet.cmake.patch > > > This is in reference to (4) in > [this|http://mail-archives.apache.org/mod_mbox/arrow-dev/201903.mbox/%3C0AF328A1-ED2A-457F-B72D-3B49C8614850%40xhochy.com%3E] > mailing list discussion. > Certain sections of setup.py assume a specific location of the C++ libraries. > Removing this hard assumption will simplify PyArrow builds significantly. As > far as I could tell these assumptions are made in the > {{build_ext._run_cmake()}} method (wherever bundling of C++ libraries is > handled). > # The first occurrence is before invoking cmake (see line 237). > # The second occurrence is when the C++ libraries are moved from their build > directory to the Python tree (see line 347). The actual implementation is in > the function {{_move_shared_libs_unix(..)}} (see line 468). > Hope this helps. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6668) [Rust] [DataFusion] Implement CAST expression
[ https://issues.apache.org/jira/browse/ARROW-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paddy Horan resolved ARROW-6668. Fix Version/s: (was: 1.0.0) 0.15.0 Resolution: Fixed Issue resolved by pull request 5477 [https://github.com/apache/arrow/pull/5477] > [Rust] [DataFusion] Implement CAST expression > - > > Key: ARROW-6668 > URL: https://issues.apache.org/jira/browse/ARROW-6668 > Project: Apache Arrow > Issue Type: Sub-task > Components: Rust, Rust - DataFusion >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 0.15.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Implement CAST expression -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6664) [C++] Add option to build without SSE4.2
[ https://issues.apache.org/jira/browse/ARROW-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-6664. - Resolution: Fixed Issue resolved by pull request 5468 [https://github.com/apache/arrow/pull/5468] > [C++] Add option to build without SSE4.2 > > > Key: ARROW-6664 > URL: https://issues.apache.org/jira/browse/ARROW-6664 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Assignee: Wes McKinney >Priority: Major > Labels: pull-request-available > Fix For: 0.15.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Child task of ARROW-5381 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6664) [C++] Add option to build without SSE4.2
[ https://issues.apache.org/jira/browse/ARROW-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-6664: --- Assignee: Wes McKinney > [C++] Add option to build without SSE4.2 > > > Key: ARROW-6664 > URL: https://issues.apache.org/jira/browse/ARROW-6664 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Assignee: Wes McKinney >Priority: Major > Labels: pull-request-available > Fix For: 0.15.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Child task of ARROW-5381 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6532) [R] Write parquet files with compression
[ https://issues.apache.org/jira/browse/ARROW-6532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-6532: -- Assignee: Romain François (was: Neal Richardson) > [R] Write parquet files with compression > > > Key: ARROW-6532 > URL: https://issues.apache.org/jira/browse/ARROW-6532 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Neal Richardson >Assignee: Romain François >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Followup to ARROW-6360. See ARROW-6216 for the C++ side. `write_parquet()` > should be able to write compressed files, including with a specified > compression level. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6532) [R] Write parquet files with compression
[ https://issues.apache.org/jira/browse/ARROW-6532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-6532: -- Assignee: Neal Richardson > [R] Write parquet files with compression > > > Key: ARROW-6532 > URL: https://issues.apache.org/jira/browse/ARROW-6532 > Project: Apache Arrow > Issue Type: Improvement > Components: R >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Followup to ARROW-6360. See ARROW-6216 for the C++ side. `write_parquet()` > should be able to write compressed files, including with a specified > compression level. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-3817) [R] $ method for RecordBatch
[ https://issues.apache.org/jira/browse/ARROW-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-3817. Fix Version/s: (was: 1.0.0) 0.15.0 Resolution: Fixed Issue resolved by pull request 5459 [https://github.com/apache/arrow/pull/5459] > [R] $ method for RecordBatch > > > Key: ARROW-3817 > URL: https://issues.apache.org/jira/browse/ARROW-3817 > Project: Apache Arrow > Issue Type: New Feature > Components: R >Reporter: Romain François >Assignee: Neal Richardson >Priority: Major > Labels: pull-request-available > Fix For: 0.15.0 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6670) [CI][R] Fix fix for R nightly jobs
[ https://issues.apache.org/jira/browse/ARROW-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-6670. Resolution: Fixed Issue resolved by pull request 5479 [https://github.com/apache/arrow/pull/5479] > [CI][R] Fix fix for R nightly jobs > -- > > Key: ARROW-6670 > URL: https://issues.apache.org/jira/browse/ARROW-6670 > Project: Apache Arrow > Issue Type: Bug > Components: Continuous Integration, R >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Minor > Labels: pull-request-available > Fix For: 0.15.0 > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6665) [Rust] [DataFusion] Implement numeric literal expressions
[ https://issues.apache.org/jira/browse/ARROW-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-6665. --- Fix Version/s: (was: 1.0.0) 0.15.0 Resolution: Fixed Issue resolved by pull request 5474 [https://github.com/apache/arrow/pull/5474] > [Rust] [DataFusion] Implement numeric literal expressions > - > > Key: ARROW-6665 > URL: https://issues.apache.org/jira/browse/ARROW-6665 > Project: Apache Arrow > Issue Type: Sub-task > Components: Rust, Rust - DataFusion >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 0.15.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Implement numeric literal expressions -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6670) [CI][R] Fix fix for R nightly jobs
[ https://issues.apache.org/jira/browse/ARROW-6670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6670: -- Labels: pull-request-available (was: ) > [CI][R] Fix fix for R nightly jobs > -- > > Key: ARROW-6670 > URL: https://issues.apache.org/jira/browse/ARROW-6670 > Project: Apache Arrow > Issue Type: Bug > Components: Continuous Integration, R >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Minor > Labels: pull-request-available > Fix For: 0.15.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6669) [Rust] [DataFusion] Implement physical expression for binary expressions
[ https://issues.apache.org/jira/browse/ARROW-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6669: -- Labels: pull-request-available (was: ) > [Rust] [DataFusion] Implement physical expression for binary expressions > > > Key: ARROW-6669 > URL: https://issues.apache.org/jira/browse/ARROW-6669 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > > Implement comparison operators (<, <=, >, >=, =, !=) as well as binary > operators AND and OR. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6670) [CI][R] Fix fix for R nightly jobs
Neal Richardson created ARROW-6670: -- Summary: [CI][R] Fix fix for R nightly jobs Key: ARROW-6670 URL: https://issues.apache.org/jira/browse/ARROW-6670 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration, R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.15.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6668) [Rust] [DataFusion] Implement CAST expression
[ https://issues.apache.org/jira/browse/ARROW-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove reassigned ARROW-6668: - Assignee: Andy Grove > [Rust] [DataFusion] Implement CAST expression > - > > Key: ARROW-6668 > URL: https://issues.apache.org/jira/browse/ARROW-6668 > Project: Apache Arrow > Issue Type: Sub-task > Components: Rust, Rust - DataFusion >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Implement CAST expression -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6669) [Rust] [DataFusion] Implement physical expression for binary expressions
[ https://issues.apache.org/jira/browse/ARROW-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove reassigned ARROW-6669: - Assignee: Andy Grove > [Rust] [DataFusion] Implement physical expression for binary expressions > > > Key: ARROW-6669 > URL: https://issues.apache.org/jira/browse/ARROW-6669 > Project: Apache Arrow > Issue Type: Sub-task >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > > Implement comparison operators (<, <=, >, >=, =, !=) as well as binary > operators AND and OR. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6669) [Rust] [DataFusion] Implement physical expression for binary expressions
Andy Grove created ARROW-6669: - Summary: [Rust] [DataFusion] Implement physical expression for binary expressions Key: ARROW-6669 URL: https://issues.apache.org/jira/browse/ARROW-6669 Project: Apache Arrow Issue Type: Sub-task Reporter: Andy Grove Implement comparison operators (<, <=, >, >=, =, !=) as well as binary operators AND and OR. -- This message was sent by Atlassian Jira (v8.3.4#803005)
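As a rough illustration of what ARROW-6669 asks for, the sketch below evaluates the listed comparison operators plus AND and OR element-wise over columns. It is written in Python rather than Rust and does not reflect DataFusion's actual types: `Column`, `Literal`, `BinaryExpr`, and the dict-of-lists batch are hypothetical stand-ins, and null handling is ignored.

```python
import operator
from dataclasses import dataclass
from typing import Any, Dict, List

Batch = Dict[str, List[Any]]  # hypothetical stand-in for a record batch

# The operators named in the issue, mapped to element-wise functions.
OPS = {
    "<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge,
    "=": operator.eq, "!=": operator.ne,
    "AND": lambda a, b: a and b, "OR": lambda a, b: a or b,
}


@dataclass
class Column:
    name: str

    def evaluate(self, batch: Batch) -> List[Any]:
        return batch[self.name]


@dataclass
class Literal:
    value: Any

    def evaluate(self, batch: Batch) -> List[Any]:
        # Broadcast the literal to the batch length (assumes >= 1 column).
        num_rows = len(next(iter(batch.values())))
        return [self.value] * num_rows


@dataclass
class BinaryExpr:
    left: Any
    op: str
    right: Any

    def evaluate(self, batch: Batch) -> List[Any]:
        fn = OPS[self.op]
        lhs = self.left.evaluate(batch)
        rhs = self.right.evaluate(batch)
        return [fn(a, b) for a, b in zip(lhs, rhs)]


batch = {"a": [1, 5, 10], "b": [3, 5, 7]}
# (a < b) OR (a = 10)
expr = BinaryExpr(
    BinaryExpr(Column("a"), "<", Column("b")),
    "OR",
    BinaryExpr(Column("a"), "=", Literal(10)),
)
result = expr.evaluate(batch)  # [True, False, True]
```

Each node produces a full column of results, so nested expressions compose by passing arrays upward. A production implementation would operate on typed Arrow arrays rather than Python lists.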
[jira] [Updated] (ARROW-6667) [Python] Avoid Reference Cycles in pyarrow.parquet
[ https://issues.apache.org/jira/browse/ARROW-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-6667: -- Component/s: Python > [Python] Avoid Reference Cycles in pyarrow.parquet > -- > > Key: ARROW-6667 > URL: https://issues.apache.org/jira/browse/ARROW-6667 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Reporter: Aaron Opfer >Priority: Minor > Labels: pull-request-available > Attachments: cycle1_build_nested_path.PNG, cycle2_open_dataset.PNG > > Time Spent: 40m > Remaining Estimate: 0h > > Reference cycles appear in two places inside pyarrow.parquet which causes > these objects to have much longer lifetimes than necessary: > > {{_build_nested_path}} has a reference cycle because the closured function > refers to the parent cell which also refers to the closured function again > (objgraph shown in attachment) > {{open_dataset_file}} is partialed with self inside the {{ParquetFile}} class > (objgraph shown in attachment). -- This message was sent by Atlassian Jira (v8.3.4#803005)
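The second pattern in ARROW-6667, a callable partialed with self and stored on the instance, can be reproduced in a few lines. This is a minimal sketch and not the pyarrow.parquet code: the class names and `_open` are hypothetical, and the check relies on CPython's reference counting.

```python
import functools
import gc
import weakref


class CyclicReader:
    """Stores a partial bound to self: self -> partial -> bound method -> self."""

    def __init__(self):
        self.opener = functools.partial(self._open, mode="rb")

    def _open(self, path, mode):
        return (path, mode)


class AcyclicReader:
    """Uses a plain method instead, so no attribute points back at self."""

    def opener(self, path):
        return self._open(path, mode="rb")

    def _open(self, path, mode):
        return (path, mode)


def freed_without_gc(cls) -> bool:
    """True if an instance is reclaimed by reference counting alone (CPython)."""
    gc.disable()  # keep the cycle collector from running during the check
    try:
        obj = cls()
        ref = weakref.ref(obj)
        del obj
        return ref() is None
    finally:
        gc.enable()
        gc.collect()  # clean up any cycle left behind
```

Under these assumptions, `freed_without_gc(CyclicReader)` is False because the instance survives until the cycle collector runs, while `freed_without_gc(AcyclicReader)` is True. That difference is the longer lifetime the issue describes.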
[jira] [Updated] (ARROW-6655) [Python] Filesystem bindings for S3
[ https://issues.apache.org/jira/browse/ARROW-6655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6655: -- Labels: pull-request-available (was: ) > [Python] Filesystem bindings for S3 > --- > > Key: ARROW-6655 > URL: https://issues.apache.org/jira/browse/ARROW-6655 > Project: Apache Arrow > Issue Type: Improvement >Reporter: Krisztian Szucs >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > > Follow-up work of ARROW-5494: [Python] Create FileSystem bindings -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6668) [Rust] [DataFusion] Implement CAST expression
[ https://issues.apache.org/jira/browse/ARROW-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6668: -- Labels: pull-request-available (was: ) > [Rust] [DataFusion] Implement CAST expression > - > > Key: ARROW-6668 > URL: https://issues.apache.org/jira/browse/ARROW-6668 > Project: Apache Arrow > Issue Type: Sub-task > Components: Rust, Rust - DataFusion >Reporter: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > > Implement CAST expression -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6668) [Rust] [DataFusion] Implement CAST expression
Andy Grove created ARROW-6668: - Summary: [Rust] [DataFusion] Implement CAST expression Key: ARROW-6668 URL: https://issues.apache.org/jira/browse/ARROW-6668 Project: Apache Arrow Issue Type: Sub-task Components: Rust, Rust - DataFusion Reporter: Andy Grove Fix For: 1.0.0 Implement CAST expression -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6664) [C++] Add option to build without SSE4.2
[ https://issues.apache.org/jira/browse/ARROW-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6664: -- Labels: pull-request-available (was: ) > [C++] Add option to build without SSE4.2 > > > Key: ARROW-6664 > URL: https://issues.apache.org/jira/browse/ARROW-6664 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Wes McKinney >Priority: Major > Labels: pull-request-available > Fix For: 0.15.0 > > > Child task of ARROW-5381 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6213) [C++] tests fail for AVX512
[ https://issues.apache.org/jira/browse/ARROW-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936012#comment-16936012 ] Antoine Pitrou commented on ARROW-6213: --- Well, let's create that account :-) Does it have a stable IP address? > [C++] tests fail for AVX512 > --- > > Key: ARROW-6213 > URL: https://issues.apache.org/jira/browse/ARROW-6213 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 0.14.1 > Environment: CentOS 7.6.1810, Intel Xeon Processor (Skylake, IBRS) > avx512 >Reporter: Charles Coulombe >Priority: Minor > Fix For: 2.0.0 > > Attachments: arrow-0.14.1-c++-failed-tests-cmake-conf.txt, > arrow-0.14.1-c++-failed-tests.txt > > > When building libraries for avx512 with GCC 7.3.0, two C++ tests fail. > {noformat} > The following tests FAILED: > 28 - arrow-compute-compare-test (Failed) > 30 - arrow-compute-filter-test (Failed) > Errors while running CTest{noformat} > while for avx2 they pass. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6667) [Python] Avoid Reference Cycles in pyarrow.parquet
[ https://issues.apache.org/jira/browse/ARROW-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6667: -- Labels: pull-request-available (was: ) > [Python] Avoid Reference Cycles in pyarrow.parquet > -- > > Key: ARROW-6667 > URL: https://issues.apache.org/jira/browse/ARROW-6667 > Project: Apache Arrow > Issue Type: Improvement >Reporter: Aaron Opfer >Priority: Minor > Labels: pull-request-available > Attachments: cycle1_build_nested_path.PNG, cycle2_open_dataset.PNG > > > Reference cycles appear in two places inside pyarrow.parquet which causes > these objects to have much longer lifetimes than necessary: > > {{_build_nested_path}} has a reference cycle because the closured function > refers to the parent cell which also refers to the closured function again > (objgraph shown in attachment) > {{open_dataset_file}} is partialed with self inside the {{ParquetFile}} class > (objgraph shown in attachment). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6667) [Python] Avoid Reference Cycles in pyarrow.parquet
Aaron Opfer created ARROW-6667: -- Summary: [Python] Avoid Reference Cycles in pyarrow.parquet Key: ARROW-6667 URL: https://issues.apache.org/jira/browse/ARROW-6667 Project: Apache Arrow Issue Type: Improvement Reporter: Aaron Opfer Attachments: cycle1_build_nested_path.PNG, cycle2_open_dataset.PNG Reference cycles appear in two places inside pyarrow.parquet which causes these objects to have much longer lifetimes than necessary: {{_build_nested_path}} has a reference cycle because the closured function refers to the parent cell which also refers to the closured function again (objgraph shown in attachment) {{open_dataset_file}} is partialed with self inside the {{ParquetFile}} class (objgraph shown in attachment). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6089) [Rust] [DataFusion] Implement parallel execution for selection
[ https://issues.apache.org/jira/browse/ARROW-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated ARROW-6089: -- Fix Version/s: (was: 0.15.0) 1.0.0 > [Rust] [DataFusion] Implement parallel execution for selection > -- > > Key: ARROW-6089 > URL: https://issues.apache.org/jira/browse/ARROW-6089 > Project: Apache Arrow > Issue Type: Sub-task > Components: Rust - DataFusion >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Implement physical plan for selection operator. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6665) [Rust] [DataFusion] Implement numeric literal expressions
[ https://issues.apache.org/jira/browse/ARROW-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6665: -- Labels: pull-request-available (was: ) > [Rust] [DataFusion] Implement numeric literal expressions > - > > Key: ARROW-6665 > URL: https://issues.apache.org/jira/browse/ARROW-6665 > Project: Apache Arrow > Issue Type: Sub-task > Components: Rust, Rust - DataFusion >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > > Implement numeric literal expressions -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6666) [Rust] [DataFusion] Implement string literal expression
Andy Grove created ARROW-6666: - Summary: [Rust] [DataFusion] Implement string literal expression Key: ARROW-6666 URL: https://issues.apache.org/jira/browse/ARROW-6666 Project: Apache Arrow Issue Type: Sub-task Components: Rust, Rust - DataFusion Reporter: Andy Grove Fix For: 1.0.0 Implement string literal expression -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6665) [Rust] [DataFusion] Implement numeric literal expressions
[ https://issues.apache.org/jira/browse/ARROW-6665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove reassigned ARROW-6665: - Assignee: Andy Grove > [Rust] [DataFusion] Implement numeric literal expressions > - > > Key: ARROW-6665 > URL: https://issues.apache.org/jira/browse/ARROW-6665 > Project: Apache Arrow > Issue Type: Sub-task > Components: Rust, Rust - DataFusion >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Fix For: 1.0.0 > > > Implement numeric literal expressions -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6665) [Rust] [DataFusion] Implement numeric literal expressions
Andy Grove created ARROW-6665: - Summary: [Rust] [DataFusion] Implement numeric literal expressions Key: ARROW-6665 URL: https://issues.apache.org/jira/browse/ARROW-6665 Project: Apache Arrow Issue Type: Sub-task Components: Rust, Rust - DataFusion Reporter: Andy Grove Fix For: 1.0.0 Implement numeric literal expressions -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6605) [C++] Add recursion depth control to fs::Selector
[ https://issues.apache.org/jira/browse/ARROW-6605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-6605. --- Fix Version/s: (was: 1.0.0) 0.15.0 Resolution: Fixed Issue resolved by pull request 5429 [https://github.com/apache/arrow/pull/5429] > [C++] Add recursion depth control to fs::Selector > - > > Key: ARROW-6605 > URL: https://issues.apache.org/jira/browse/ARROW-6605 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Minor > Labels: pull-request-available > Fix For: 0.15.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > This is similar to the recursive option, but also controls the depth. -- This message was sent by Atlassian Jira (v8.3.4#803005)
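The idea in ARROW-6605, a recursive listing with an additional depth limit, can be sketched as follows. This is a Python illustration and not the C++ fs::Selector itself: the function `list_entries` and the parameter names `recursive` and `max_recursion` are assumptions made for the example.

```python
import os
from typing import Iterator, Optional


def list_entries(base_dir: str, recursive: bool = False,
                 max_recursion: Optional[int] = None,
                 _depth: int = 0) -> Iterator[str]:
    """Yield paths under base_dir, descending at most max_recursion levels.

    recursive=False lists only the direct children of base_dir.
    max_recursion=None means no depth limit; 0 means no descent at all.
    """
    with os.scandir(base_dir) as it:
        entries = sorted(it, key=lambda e: e.name)  # deterministic order
    for entry in entries:
        yield entry.path
        can_descend = recursive and (
            max_recursion is None or _depth < max_recursion)
        if can_descend and entry.is_dir(follow_symlinks=False):
            yield from list_entries(
                entry.path, recursive, max_recursion, _depth + 1)
```

For a tree `a/b/c.txt`, `recursive=True, max_recursion=1` yields `a` and `a/b` but stops before `a/b/c.txt`. The depth counter is threaded through the recursion as an argument, so a single flag plus one integer covers both the plain recursive case and the bounded one.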
[jira] [Assigned] (ARROW-6605) [C++] Add recursion depth control to fs::Selector
[ https://issues.apache.org/jira/browse/ARROW-6605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-6605: - Assignee: Francois Saint-Jacques > [C++] Add recursion depth control to fs::Selector > - > > Key: ARROW-6605 > URL: https://issues.apache.org/jira/browse/ARROW-6605 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Minor > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > This is similar to the recursive option, but also controls the depth. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6352) [Java] Add implementation of DenseUnionVector.
[ https://issues.apache.org/jira/browse/ARROW-6352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6352: -- Labels: pull-request-available (was: ) > [Java] Add implementation of DenseUnionVector. > -- > > Key: ARROW-6352 > URL: https://issues.apache.org/jira/browse/ARROW-6352 > Project: Apache Arrow > Issue Type: Improvement >Reporter: Micah Kornfield >Assignee: Liya Fan >Priority: Major > Labels: pull-request-available > > Today only Sparse unions are supported. We should have a dense union > implementation vector that conforms to the IPC protocol (the current sparse > union vector doesn't do this and there are other JIRAs covering making it > compatible). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6601) [Java] Improve JDBC adapter performance & add benchmark
[ https://issues.apache.org/jira/browse/ARROW-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6601: -- Labels: pull-request-available (was: ) > [Java] Improve JDBC adapter performance & add benchmark > --- > > Key: ARROW-6601 > URL: https://issues.apache.org/jira/browse/ARROW-6601 > Project: Apache Arrow > Issue Type: Task > Components: Java >Reporter: Ji Liu >Assignee: Ji Liu >Priority: Critical > Labels: pull-request-available > > Add a performance test as well to get a baseline number, to avoid performance > regression when we change related code. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6601) [Java] Improve JDBC adapter performance & add benchmark
[ https://issues.apache.org/jira/browse/ARROW-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935613#comment-16935613 ] Ji Liu commented on ARROW-6601: --- While working on the JDBC adapter benchmark, I found that the JMH result was very poor (about 168 ns/op). I eventually found that initializing a VectorSchemaRoot calls JdbcToArrowUtils#allocateVectors, which is time-consuming and unnecessary because the consumers use the setSafe API. After removing this call, the JMH result is about 2000 ns/op (3 columns with valueCount = 3000). I think this should be merged into the 0.15 release. > [Java] Improve JDBC adapter performance & add benchmark > --- > > Key: ARROW-6601 > URL: https://issues.apache.org/jira/browse/ARROW-6601 > Project: Apache Arrow > Issue Type: Task > Components: Java >Reporter: Ji Liu >Assignee: Ji Liu >Priority: Critical > > Add a performance test as well to get a baseline number, to avoid performance > regression when we change related code. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6601) [Java] Improve JDBC adapter performance & add benchmark
[ https://issues.apache.org/jira/browse/ARROW-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ji Liu updated ARROW-6601: -- Summary: [Java] Improve JDBC adapter performance & add benchmark (was: [Java] Add benchmark for JDBC adapter to avoid potential regression) > [Java] Improve JDBC adapter performance & add benchmark > --- > > Key: ARROW-6601 > URL: https://issues.apache.org/jira/browse/ARROW-6601 > Project: Apache Arrow > Issue Type: Task > Components: Java >Reporter: Ji Liu >Assignee: Ji Liu >Priority: Critical > > Add a performance test as well to get a baseline number, to avoid performance > regression when we change related code. > -- This message was sent by Atlassian Jira (v8.3.4#803005)