[jira] [Updated] (ARROW-11455) [R] Improve handling of -2^31 in 32-bit integer fields

2021-02-01 Thread Ian Cook (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Cook updated ARROW-11455: - Description: R’s {{integer}} range is 1 smaller than the normal 32-bit integer range of C++, Java,

[jira] [Updated] (ARROW-11463) Allow configuration of IpcWriterOptions 64Bit from PyArrow

2021-02-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11463: --- Labels: pull-request-available (was: ) > Allow configuration of IpcWriterOptions 64Bit

[jira] [Created] (ARROW-11468) [R] Allow user to pass schema to read_json_arrow()

2021-02-01 Thread Ian Cook (Jira)
Ian Cook created ARROW-11468: Summary: [R] Allow user to pass schema to read_json_arrow() Key: ARROW-11468 URL: https://issues.apache.org/jira/browse/ARROW-11468 Project: Apache Arrow Issue

[jira] [Assigned] (ARROW-11463) Allow configuration of IpcWriterOptions 64Bit from PyArrow

2021-02-01 Thread Tao He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao He reassigned ARROW-11463: -- Assignee: Tao He > Allow configuration of IpcWriterOptions 64Bit from PyArrow >

[jira] [Commented] (ARROW-11463) Allow configuration of IpcWriterOptions 64Bit from PyArrow

2021-02-01 Thread Tao He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276819#comment-17276819 ] Tao He commented on ARROW-11463: The `pa.ipc.new_stream` accepts an option:

[jira] [Updated] (ARROW-11467) [R] Fix reference to json_table_reader() in R docs

2021-02-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11467: --- Labels: pull-request-available (was: ) > [R] Fix reference to json_table_reader() in R

[jira] [Created] (ARROW-11467) [R] Fix reference to json_table_reader() in R docs

2021-02-01 Thread Ian Cook (Jira)
Ian Cook created ARROW-11467: Summary: [R] Fix reference to json_table_reader() in R docs Key: ARROW-11467 URL: https://issues.apache.org/jira/browse/ARROW-11467 Project: Apache Arrow Issue

[jira] [Commented] (ARROW-10438) [C++][Dataset] Partitioning::Format on nulls

2021-02-01 Thread Weston Pace (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276774#comment-17276774 ] Weston Pace commented on ARROW-10438: - [~jorisvandenbossche] , I spoke with [~bkietz] a bit on this

[jira] [Updated] (ARROW-11464) [Python] pyarrow.parquet.read_pandas doesn't conform to its docs

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11464: -- Description: The {{*pyarrow.parquet.read_pandas*}} 

[jira] [Updated] (ARROW-11466) [Flight][Go] Add BasicAuth and BearerToken handlers for Go

2021-02-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11466: --- Labels: pull-request-available (was: ) > [Flight][Go] Add BasicAuth and BearerToken

[jira] [Resolved] (ARROW-11437) [Rust] Simplify benches

2021-02-01 Thread Neville Dipale (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neville Dipale resolved ARROW-11437. Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 9371

[jira] [Updated] (ARROW-11465) Parquet file writer snapshot API and proper ColumnChunk.file_path utilization

2021-02-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11465: --- Labels: pull-request-available (was: ) > Parquet file writer snapshot API and proper

[jira] [Created] (ARROW-11466) [Flight][Go] Add BasicAuth and BearerToken handlers for Go

2021-02-01 Thread Matt Topol (Jira)
Matt Topol created ARROW-11466: -- Summary: [Flight][Go] Add BasicAuth and BearerToken handlers for Go Key: ARROW-11466 URL: https://issues.apache.org/jira/browse/ARROW-11466 Project: Apache Arrow

[jira] [Created] (ARROW-11465) Parquet file writer snapshot API and proper ColumnChunk.file_path utilization

2021-02-01 Thread Radu Teodorescu (Jira)
Radu Teodorescu created ARROW-11465: --- Summary: Parquet file writer snapshot API and proper ColumnChunk.file_path utilization Key: ARROW-11465 URL: https://issues.apache.org/jira/browse/ARROW-11465

[jira] [Created] (ARROW-11464) [Python] pyarrow.parquet.read_pandas doesn't conform to its docs

2021-02-01 Thread Pac A. He (Jira)
Pac A. He created ARROW-11464: - Summary: [Python] pyarrow.parquet.read_pandas doesn't conform to its docs Key: ARROW-11464 URL: https://issues.apache.org/jira/browse/ARROW-11464 Project: Apache Arrow

[jira] [Updated] (ARROW-11464) [Python] pyarrow.parquet.read_pandas doesn't conform to its docs

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11464: -- Description: The {{*pyarrow.parquet.read_pandas*}} 

[jira] [Updated] (ARROW-11463) Allow configuration of IpcWriterOptions 64Bit from PyArrow

2021-02-01 Thread Leonard Lausen (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leonard Lausen updated ARROW-11463: --- Description: For tables with many chunks (2M+ rows, 20k+ chunks), `pyarrow.Table.take` will

[jira] [Created] (ARROW-11463) Allow configuration of IpcWriterOptions 64Bit from PyArrow

2021-02-01 Thread Leonard Lausen (Jira)
Leonard Lausen created ARROW-11463: -- Summary: Allow configuration of IpcWriterOptions 64Bit from PyArrow Key: ARROW-11463 URL: https://issues.apache.org/jira/browse/ARROW-11463 Project: Apache Arrow

[jira] [Created] (ARROW-11462) [Developer] Remove needless quote from the default DOCKER_VOLUME_PREFIX

2021-02-01 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-11462: Summary: [Developer] Remove needless quote from the default DOCKER_VOLUME_PREFIX Key: ARROW-11462 URL: https://issues.apache.org/jira/browse/ARROW-11462 Project:

[jira] [Updated] (ARROW-11462) [Developer] Remove needless quote from the default DOCKER_VOLUME_PREFIX

2021-02-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11462: --- Labels: pull-request-available (was: ) > [Developer] Remove needless quote from the

[jira] [Updated] (ARROW-11452) [Rust] Parquet reader cannot read file where a struct column has the same name as struct member columns

2021-02-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11452: --- Labels: pull-request-available (was: ) > [Rust] Parquet reader cannot read file where a

[jira] [Commented] (ARROW-11427) [Python] Windows Server 2012 w/ Xeon Platinum 8171M crashes after upgrading to pyarrow 3.0

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276639#comment-17276639 ] Antoine Pitrou commented on ARROW-11427: Yes, {{ARROW_USER_SIMD_LEVEL=avx2}} is a good fallback.

[jira] [Commented] (ARROW-11427) [Python] Windows Server 2012 w/ Xeon Platinum 8171M crashes after upgrading to pyarrow 3.0

2021-02-01 Thread Ali Cetin (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276628#comment-17276628 ] Ali Cetin commented on ARROW-11427: --- Yeah, that would explain why Broadwell CPUs work fine, which

[jira] [Resolved] (ARROW-10520) [C++][R] Implement add/remove/replace for RecordBatch

2021-02-01 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-10520. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 9365

[jira] [Commented] (ARROW-11427) [Python] Windows Server 2012 w/ Xeon Platinum 8171M crashes after upgrading to pyarrow 3.0

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276613#comment-17276613 ] Antoine Pitrou commented on ARROW-11427: Thank you very much for the investigation [~ali.cetin]

[jira] [Comment Edited] (ARROW-11427) [Python] Windows Server 2012 w/ Xeon Platinum 8171M crashes after upgrading to pyarrow 3.0

2021-02-01 Thread Ali Cetin (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276608#comment-17276608 ] Ali Cetin edited comment on ARROW-11427 at 2/1/21, 7:48 PM: As you can see

[jira] [Commented] (ARROW-11427) [Python] Windows Server 2012 w/ Xeon Platinum 8171M crashes after upgrading to pyarrow 3.0

2021-02-01 Thread Ali Cetin (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276608#comment-17276608 ] Ali Cetin commented on ARROW-11427: --- As you can see above, I have also tried it on Windows Server

[jira] [Updated] (ARROW-11427) [Python] Windows Server 2012 w/ Xeon Platinum 8171M crashes after upgrading to pyarrow 3.0

2021-02-01 Thread Ali Cetin (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ali Cetin updated ARROW-11427: -- Description: *Update*: Azure (D2_v2) VM no longer spins-up with Xeon Platinum 8171m, so I'm unable

[jira] [Commented] (ARROW-11455) [R] Improve handling of -2^31 in 32-bit integer fields

2021-02-01 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276603#comment-17276603 ] Neal Richardson commented on ARROW-11455: - We have a function we use to determine whether an

[jira] [Commented] (ARROW-11460) [R] Use system compression libraries if present on Linux

2021-02-01 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276598#comment-17276598 ] Neal Richardson commented on ARROW-11460: - No, I mean build time. I believe

[jira] [Updated] (ARROW-11460) [R] Use system compression libraries if present on Linux

2021-02-01 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-11460: Description: We vendor/bundle all compression libraries and have them disabled in the

[jira] [Updated] (ARROW-11461) [Flight][Go] GetSchema does not work with Java Flight Server

2021-02-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11461: --- Labels: pull-request-available (was: ) > [Flight][Go] GetSchema does not work with Java

[jira] [Created] (ARROW-11461) [Flight][Go] GetSchema does not work with Java Flight Server

2021-02-01 Thread Matt Topol (Jira)
Matt Topol created ARROW-11461: -- Summary: [Flight][Go] GetSchema does not work with Java Flight Server Key: ARROW-11461 URL: https://issues.apache.org/jira/browse/ARROW-11461 Project: Apache Arrow

[jira] [Commented] (ARROW-11460) [R] Use system compression libraries if present on Linux

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276586#comment-17276586 ] Antoine Pitrou commented on ARROW-11460: By "if present", do you mean detect them at runtime

[jira] [Created] (ARROW-11460) [R] Use system compression libraries if present on Linux

2021-02-01 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-11460: --- Summary: [R] Use system compression libraries if present on Linux Key: ARROW-11460 URL: https://issues.apache.org/jira/browse/ARROW-11460 Project: Apache Arrow

[jira] [Commented] (ARROW-11450) [Python] pyarrow<3 incompatible with numpy>=1.20.0

2021-02-01 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276582#comment-17276582 ] Joris Van den Bossche commented on ARROW-11450: --- bq. There are libraries whose latest

[jira] [Updated] (ARROW-11459) [Rust] Allow ListArray of primitives to be built from iterator

2021-02-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11459: --- Labels: pull-request-available (was: ) > [Rust] Allow ListArray of primitives to be built

[jira] [Created] (ARROW-11459) [Rust] Allow ListArray of primitives to be built from iterator

2021-02-01 Thread Jorge Leitão (Jira)
Jorge Leitão created ARROW-11459: Summary: [Rust] Allow ListArray of primitives to be built from iterator Key: ARROW-11459 URL: https://issues.apache.org/jira/browse/ARROW-11459 Project: Apache Arrow

[jira] [Commented] (ARROW-11450) [Python] pyarrow<3 incompatible with numpy>=1.20.0

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276519#comment-17276519 ] Antoine Pitrou commented on ARROW-11450: Unfortunately the release process is quite involved

[jira] [Closed] (ARROW-11458) PyArrow 1.x and 2.x do not work with numpy 1.20

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou closed ARROW-11458. -- Resolution: Duplicate > PyArrow 1.x and 2.x do not work with numpy 1.20 >

[jira] [Commented] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276517#comment-17276517 ] Antoine Pitrou commented on ARROW-11456: Was the Parquet file generated with Arrow? > [Python]

[jira] [Commented] (ARROW-11450) [Python] pyarrow<3 incompatible with numpy>=1.20.0

2021-02-01 Thread Zhuo Peng (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276510#comment-17276510 ] Zhuo Peng commented on ARROW-11450: --- There are libraries whose latest release (and even the dev

[jira] [Commented] (ARROW-11458) PyArrow 1.x and 2.x do not work with numpy 1.20

2021-02-01 Thread Zhuo Peng (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276505#comment-17276505 ] Zhuo Peng commented on ARROW-11458: --- Sorry just saw 

[jira] [Comment Edited] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276501#comment-17276501 ] Pac A. He edited comment on ARROW-11456 at 2/1/21, 5:21 PM:

[jira] [Comment Edited] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276501#comment-17276501 ] Pac A. He edited comment on ARROW-11456 at 2/1/21, 5:20 PM:

[jira] [Commented] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276501#comment-17276501 ] Pac A. He commented on ARROW-11456: --- [~jorisvandenbossche] This is very difficult in this case because

[jira] [Created] (ARROW-11458) PyArrow 1.x and 2.x do not work with numpy 1.20

2021-02-01 Thread Zhuo Peng (Jira)
Zhuo Peng created ARROW-11458: - Summary: PyArrow 1.x and 2.x do not work with numpy 1.20 Key: ARROW-11458 URL: https://issues.apache.org/jira/browse/ARROW-11458 Project: Apache Arrow Issue Type:

[jira] [Commented] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276480#comment-17276480 ] Joris Van den Bossche commented on ARROW-11456: --- [~apacman] would you be able to provide a

[jira] [Resolved] (ARROW-11457) [Rust] Make string comparisson kernels generic over Utf8 and LargeUtf8

2021-02-01 Thread Andrew Lamb (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Lamb resolved ARROW-11457. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 9362

[jira] [Created] (ARROW-11457) [Rust] Make string comparisson kernels generic over Utf8 and LargeUtf8

2021-02-01 Thread Andrew Lamb (Jira)
Andrew Lamb created ARROW-11457: --- Summary: [Rust] Make string comparisson kernels generic over Utf8 and LargeUtf8 Key: ARROW-11457 URL: https://issues.apache.org/jira/browse/ARROW-11457 Project:

[jira] [Updated] (ARROW-11457) [Rust] Make string comparisson kernels generic over Utf8 and LargeUtf8

2021-02-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11457: --- Labels: pull-request-available (was: ) > [Rust] Make string comparisson kernels generic

[jira] [Updated] (ARROW-11457) [Rust] Make string comparisson kernels generic over Utf8 and LargeUtf8

2021-02-01 Thread Andrew Lamb (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Lamb updated ARROW-11457: Component/s: Rust > [Rust] Make string comparisson kernels generic over Utf8 and LargeUtf8 >

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error:   {noformat} df: Final =

[jira] [Commented] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276462#comment-17276462 ] Antoine Pitrou commented on ARROW-11456: cc [~jorisvandenbossche] > [Python] Parquet reader

[jira] [Updated] (ARROW-11456) [Python] Parquet reader cannot read large strings

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-11456: --- Summary: [Python] Parquet reader cannot read large strings (was: OSError: Capacity error:

[jira] [Updated] (ARROW-11456) OSError: Capacity error: BinaryBuilder cannot reserve space for more than 2147483646 child elements

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-11456: --- Priority: Major (was: Blocker) > OSError: Capacity error: BinaryBuilder cannot reserve

[jira] [Updated] (ARROW-11456) OSError: Capacity error: BinaryBuilder cannot reserve space for more than 2147483646 child elements

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error:   {noformat} df: Final =

[jira] [Updated] (ARROW-11456) OSError: Capacity error: BinaryBuilder cannot reserve space for more than 2147483646 child elements

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Environment: pyarrow 3.0.0 / 2.0.0 pandas 1.2.1 python 3.8.6 was: pyarrow 3.0.0 / 2.0.0 pandas

[jira] [Created] (ARROW-11456) OSError: Capacity error: BinaryBuilder cannot reserve space for more than 2147483646 child elements

2021-02-01 Thread Pac A. He (Jira)
Pac A. He created ARROW-11456: - Summary: OSError: Capacity error: BinaryBuilder cannot reserve space for more than 2147483646 child elements Key: ARROW-11456 URL: https://issues.apache.org/jira/browse/ARROW-11456

[jira] [Updated] (ARROW-11456) OSError: Capacity error: BinaryBuilder cannot reserve space for more than 2147483646 child elements

2021-02-01 Thread Pac A. He (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pac A. He updated ARROW-11456: -- Description: When reading a large parquet file, I have this error:   {noformat} df: Final =

[jira] [Commented] (ARROW-11427) [Python] Windows Server 2012 w/ Xeon Platinum 8171M crashes after upgrading to pyarrow 3.0

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276444#comment-17276444 ] Antoine Pitrou commented on ARROW-11427: I've tried your reproducer script on an AVX512-enabled

[jira] [Updated] (ARROW-11455) [R] Improve handling of -2^31 in 32-bit integer fields

2021-02-01 Thread Ian Cook (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Cook updated ARROW-11455: - Description: R’s {{integer}} range is 1 smaller than the normal 32-bit integer range of C++, Java,

[jira] [Resolved] (ARROW-11449) [CI][R][Windows] Use ccache

2021-02-01 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-11449. - Fix Version/s: 4.0.0 Resolution: Fixed Issue resolved by pull request 9383

[jira] [Updated] (ARROW-11455) [R] Improve handling of -2^31 in 32-bit integer fields

2021-02-01 Thread Ian Cook (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ian Cook updated ARROW-11455: - Description: R’s {{integer}} range is 1 smaller than the normal 32-bit integer range of C++, Java,

[jira] [Created] (ARROW-11455) [R] Improve handling of -2^31 in 32-bit integer fields

2021-02-01 Thread Ian Cook (Jira)
Ian Cook created ARROW-11455: Summary: [R] Improve handling of -2^31 in 32-bit integer fields Key: ARROW-11455 URL: https://issues.apache.org/jira/browse/ARROW-11455 Project: Apache Arrow Issue

[jira] [Commented] (ARROW-11427) [Python] Windows Server 2012 w/ Xeon Platinum 8171M crashes after upgrading to pyarrow 3.0

2021-02-01 Thread Ali Cetin (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276370#comment-17276370 ] Ali Cetin commented on ARROW-11427: --- I have done some more testing. Seems like Skylake processors

[jira] [Resolved] (ARROW-11440) [Rust] [DataFusion] Add method to CsvExec to get CSV schema

2021-02-01 Thread Andrew Lamb (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Lamb resolved ARROW-11440. - Resolution: Fixed Issue resolved by pull request 9377

[jira] [Created] (ARROW-11454) [Website] [Rust] 3.0.0 Blog Post

2021-02-01 Thread Andy Grove (Jira)
Andy Grove created ARROW-11454: -- Summary: [Website] [Rust] 3.0.0 Blog Post Key: ARROW-11454 URL: https://issues.apache.org/jira/browse/ARROW-11454 Project: Apache Arrow Issue Type: Improvement

[jira] [Commented] (ARROW-11066) [Java] Is there a bug in flight AddWritableBuffer

2021-02-01 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276338#comment-17276338 ] David Li commented on ARROW-11066: -- Fixing the optimization exposes a SIGSEGV. It looks like in some

[jira] [Comment Edited] (ARROW-10344) [Python] Get all columns names (or schema) from Feather file, before loading whole Feather file

2021-02-01 Thread Gert Hulselmans (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276321#comment-17276321 ] Gert Hulselmans edited comment on ARROW-10344 at 2/1/21, 1:35 PM: --

[jira] [Comment Edited] (ARROW-10344) [Python] Get all columns names (or schema) from Feather file, before loading whole Feather file

2021-02-01 Thread Gert Hulselmans (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276321#comment-17276321 ] Gert Hulselmans edited comment on ARROW-10344 at 2/1/21, 1:34 PM: --

[jira] [Commented] (ARROW-10344) [Python] Get all columns names (or schema) from Feather file, before loading whole Feather file

2021-02-01 Thread Gert Hulselmans (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276321#comment-17276321 ] Gert Hulselmans commented on ARROW-10344: - [~weldingwelding] The first 4/6 bytes (and last 4/6

[jira] [Commented] (ARROW-11424) [C++] Add more StructType and StructArray methods

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276311#comment-17276311 ] Antoine Pitrou commented on ARROW-11424: They're a bit different: * a StructArray has a

[jira] [Updated] (ARROW-11373) [Python][Docs] Add example of specifying type for a column when reading csv file

2021-02-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11373: --- Labels: pull-request-available (was: ) > [Python][Docs] Add example of specifying type for

[jira] [Commented] (ARROW-11427) [Python] Windows Server 2012 w/ Xeon Platinum 8171M crashes after upgrading to pyarrow 3.0

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276238#comment-17276238 ] Antoine Pitrou commented on ARROW-11427: Could you try setting the environment variable

[jira] [Closed] (ARROW-5756) [Python] Remove manylinux1 support

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou closed ARROW-5756. - Resolution: Duplicate > [Python] Remove manylinux1 support > --

[jira] [Commented] (ARROW-5756) [Python] Remove manylinux1 support

2021-02-01 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276232#comment-17276232 ] Antoine Pitrou commented on ARROW-5756: --- Indeed, I'm gonna close this as duplicate. Thank you :) >

[jira] [Closed] (ARROW-11450) [Python] pyarrow<3 incompatible with numpy>=1.20.0

2021-02-01 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche closed ARROW-11450. - Fix Version/s: 3.0.0 Resolution: Duplicate > [Python] pyarrow<3

[jira] [Commented] (ARROW-11450) [Python] pyarrow<3 incompatible with numpy>=1.20.0

2021-02-01 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17276135#comment-17276135 ] Joris Van den Bossche commented on ARROW-11450: --- This is a known issue with wheels of older