[jira] [Resolved] (ARROW-16148) [C++] TPC-H generator cleanup

2022-04-21 Thread Weston Pace (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weston Pace resolved ARROW-16148. - Fix Version/s: 8.0.0 Resolution: Fixed Issue resolved by pull request 12843

[jira] [Assigned] (ARROW-16148) [C++] TPC-H generator cleanup

2022-04-21 Thread Weston Pace (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weston Pace reassigned ARROW-16148: --- Assignee: Sasha Krassovsky > [C++] TPC-H generator cleanup > -

[jira] [Created] (ARROW-16277) [Python] No builds for macOS arm64.

2022-04-21 Thread A. Coady (Jira)
A. Coady created ARROW-16277: Summary: [Python] No builds for macOS arm64. Key: ARROW-16277 URL: https://issues.apache.org/jira/browse/ARROW-16277 Project: Apache Arrow Issue Type: Task

[jira] [Resolved] (ARROW-16187) [Go][Parquet] Properly utilize BufferedStream and buffer size for reading

2022-04-21 Thread Matthew Topol (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Topol resolved ARROW-16187. --- Resolution: Fixed Issue resolved by pull request 12876

[jira] [Created] (ARROW-16276) [R] Release News

2022-04-21 Thread Jonathan Keane (Jira)
Jonathan Keane created ARROW-16276: -- Summary: [R] Release News Key: ARROW-16276 URL: https://issues.apache.org/jira/browse/ARROW-16276 Project: Apache Arrow Issue Type: Improvement

[jira] [Resolved] (ARROW-14638) [C++][R] Unknown C compiler / ccache on Arch Linux

2022-04-21 Thread Jonathan Keane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Keane resolved ARROW-14638. Fix Version/s: 8.0.0 Resolution: Fixed Issue resolved by pull request 11666

[jira] [Updated] (ARROW-16275) [C++] Add support for pushdown filtering of nested references

2022-04-21 Thread Weston Pace (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weston Pace updated ARROW-16275: Summary: [C++] Add support for pushdown filtering of nested references (was: [C++] Add support

[jira] [Created] (ARROW-16275) [C++] Add support for pushdown projection of nested references

2022-04-21 Thread Weston Pace (Jira)
Weston Pace created ARROW-16275: --- Summary: [C++] Add support for pushdown projection of nested references Key: ARROW-16275 URL: https://issues.apache.org/jira/browse/ARROW-16275 Project: Apache Arrow

[jira] [Resolved] (ARROW-15950) [Go] Lift BitSetRunReader to internal/bitutils package

2022-04-21 Thread Matthew Topol (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Topol resolved ARROW-15950. --- Fix Version/s: 8.0.0 Resolution: Fixed Issue resolved by pull request 12926

[jira] [Created] (ARROW-16274) [C++] Substrait consumer should be feature-aware

2022-04-21 Thread Weston Pace (Jira)
Weston Pace created ARROW-16274: --- Summary: [C++] Substrait consumer should be feature-aware Key: ARROW-16274 URL: https://issues.apache.org/jira/browse/ARROW-16274 Project: Apache Arrow Issue

[jira] [Commented] (ARROW-16129) [Java] Illegal reflective access operation on JDK 11

2022-04-21 Thread David Dali Susanibar Arce (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526079#comment-17526079 ] David Dali Susanibar Arce commented on ARROW-16129: --- Hi [~nrhelmi]  please if you

[jira] [Commented] (ARROW-14512) [Java][Doc] JavaDoc errors while building the docs

2022-04-21 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526077#comment-17526077 ] David Li commented on ARROW-14512: -- This is the regular JavaDoc. I don't recall what happened anymore.

[jira] [Commented] (ARROW-14512) [Java][Doc] JavaDoc errors while building the docs

2022-04-21 Thread David Dali Susanibar Arce (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526075#comment-17526075 ] David Dali Susanibar Arce commented on ARROW-14512: --- Hi [~kszucs] please could you

[jira] [Commented] (ARROW-16272) [C++][Python] Poor read performance of S3FileSystem.open_input_file when used with `pd.read_csv`

2022-04-21 Thread Sahil Gupta (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526072#comment-17526072 ] Sahil Gupta commented on ARROW-16272: - Thanks [~apitrou] ! > [C++][Python] Poor read performance of

[jira] [Commented] (ARROW-16272) [C++][Python] Poor read performance of S3FileSystem.open_input_file when used with `pd.read_csv`

2022-04-21 Thread Sahil Gupta (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526074#comment-17526074 ] Sahil Gupta commented on ARROW-16272: - > It seems that, despite {{{}nrows=100{}}}, the S3 filesystem

[jira] [Created] (ARROW-16273) [C++] Valgrind error in arrow-compute-scalar-test

2022-04-21 Thread Weston Pace (Jira)
Weston Pace created ARROW-16273: --- Summary: [C++] Valgrind error in arrow-compute-scalar-test Key: ARROW-16273 URL: https://issues.apache.org/jira/browse/ARROW-16273 Project: Apache Arrow Issue

[jira] [Commented] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-04-21 Thread Weston Pace (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526058#comment-17526058 ] Weston Pace commented on ARROW-15678: - That seems like a good solution to me. I had no idea it was

[jira] [Commented] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526055#comment-17526055 ] Antoine Pitrou commented on ARROW-15678: In any case, this is probably too involved a change for

[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526053#comment-17526053 ] Antoine Pitrou edited comment on ARROW-15678 at 4/21/22 7:51 PM: - So,

[jira] [Commented] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526053#comment-17526053 ] Antoine Pitrou commented on ARROW-15678: So, currently we are doing something such as: {code}

[jira] [Commented] (ARROW-15664) [C++] parquet reader Segfaults with illegal SIMD instruction

2022-04-21 Thread Weston Pace (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526051#comment-17526051 ] Weston Pace commented on ARROW-15664: - I'm going to investigate this more today. > [C++] parquet

[jira] [Comment Edited] (ARROW-16240) [Python] Support row_group_size/chunk_size keyword in pq.write_to_dataset with use_legacy_dataset=False

2022-04-21 Thread Weston Pace (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526047#comment-17526047 ] Weston Pace edited comment on ARROW-16240 at 4/21/22 7:42 PM: -- Your

[jira] [Commented] (ARROW-16240) [Python] Support row_group_size/chunk_size keyword in pq.write_to_dataset with use_legacy_dataset=False

2022-04-21 Thread Weston Pace (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526047#comment-17526047 ] Weston Pace commented on ARROW-16240: - Your understanding is correct. I think

[jira] [Assigned] (ARROW-16272) [C++][Python] Poor read performance of S3FileSystem.open_input_file when used with `pd.read_csv`

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-16272: -- Assignee: Antoine Pitrou > [C++][Python] Poor read performance of

[jira] [Commented] (ARROW-16272) [C++][Python] Poor read performance of S3FileSystem.open_input_file when used with `pd.read_csv`

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526022#comment-17526022 ] Antoine Pitrou commented on ARROW-16272: Hmm, thanks for the report. For now, this can be worked

[jira] [Updated] (ARROW-16272) [C++][Python] Poor read performance of S3FileSystem.open_input_file when used with `pd.read_csv`

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-16272: --- Fix Version/s: 9.0.0 > [C++][Python] Poor read performance of S3FileSystem.open_input_file

[jira] [Closed] (ARROW-15664) [C++] parquet reader Segfaults with illegal SIMD instruction

2022-04-21 Thread Jonathan Keane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Keane closed ARROW-15664. -- Resolution: Duplicate > [C++] parquet reader Segfaults with illegal SIMD instruction >

[jira] [Commented] (ARROW-15664) [C++] parquet reader Segfaults with illegal SIMD instruction

2022-04-21 Thread Jonathan Keane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526014#comment-17526014 ] Jonathan Keane commented on ARROW-15664: Also, I'm going to close this in favor of ARROW-15678 —

[jira] [Updated] (ARROW-15312) [R][C++] filtering a Parquet dataset with is.na() misses some rows

2022-04-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-15312: --- Labels: dataset pull-request-available (was: dataset) > [R][C++] filtering a Parquet

[jira] [Updated] (ARROW-16121) [Python] Deprecate the (common_)metadata(_path) attributes of ParquetDataset

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-16121: Fix Version/s: 9.0.0 (was: 8.0.0) > [Python] Deprecate the

[jira] [Updated] (ARROW-11415) [R] experimental map_batches cannot find columns

2022-04-21 Thread Will Jones (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Jones updated ARROW-11415: --- Fix Version/s: 8.0.0 (was: 9.0.0) > [R] experimental map_batches cannot find

[jira] [Updated] (ARROW-11415) [R] experimental map_batches cannot find columns

2022-04-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-11415: --- Labels: pull-request-available (was: ) > [R] experimental map_batches cannot find columns

[jira] [Commented] (ARROW-16272) [C++][Python] Poor read performance of S3FileSystem.open_input_file when used with `pd.read_csv`

2022-04-21 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526000#comment-17526000 ] David Li commented on ARROW-16272: -- Off the top of my head this is possibly because s3fs adds some

[jira] [Updated] (ARROW-16272) [C++][Python] Poor read performance of S3FileSystem.open_input_file when used with `pd.read_csv`

2022-04-21 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Li updated ARROW-16272: - Component/s: C++ Python > [C++][Python] Poor read performance of

[jira] [Updated] (ARROW-16272) [C++][Python] Poor read performance of S3FileSystem.open_input_file when used with `pd.read_csv`

2022-04-21 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Li updated ARROW-16272: - Summary: [C++][Python] Poor read performance of S3FileSystem.open_input_file when used with

[jira] [Commented] (ARROW-15664) [C++] parquet reader Segfaults with illegal SIMD instruction

2022-04-21 Thread Jonathan Keane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525997#comment-17525997 ] Jonathan Keane commented on ARROW-15664: This isn't an M1 issue, it's an issue with x86 + using

[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525995#comment-17525995 ] Antoine Pitrou edited comment on ARROW-15678 at 4/21/22 7:08 PM: - Wow,

[jira] [Commented] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525996#comment-17525996 ] Antoine Pitrou commented on ARROW-15678: [~bkietz] You may have some idea about how to fix this

[jira] [Commented] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525995#comment-17525995 ] Antoine Pitrou commented on ARROW-15678: Wow, thanks for the diagnosis [~westonpace]. So, it

[jira] [Commented] (ARROW-16266) [R] Add StructArray$create()

2022-04-21 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525992#comment-17525992 ] Neal Richardson commented on ARROW-16266: - I think the other use case would be to take Arrays in

[jira] [Commented] (ARROW-16073) [R] clean-up date time unit testing once tzdb is available on Windows

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525990#comment-17525990 ] Krisztian Szucs commented on ARROW-16073: - Postponing to 9.0 > [R] clean-up date time unit

[jira] [Updated] (ARROW-16073) [R] clean-up date time unit testing once tzdb is available on Windows

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-16073: Fix Version/s: 9.0.0 (was: 8.0.0) > [R] clean-up date time unit

[jira] [Commented] (ARROW-15664) [C++] parquet reader Segfaults with illegal SIMD instruction

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525988#comment-17525988 ] Antoine Pitrou commented on ARROW-15664: I'm not really taking a look, it needs someone to

[jira] [Updated] (ARROW-16204) [C++][Dataset] Default error existing_data_behaviour for writing dataset ignores a single file

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-16204: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++][Dataset] Default error

[jira] [Commented] (ARROW-16204) [C++][Dataset] Default error existing_data_behaviour for writing dataset ignores a single file

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525989#comment-17525989 ] Krisztian Szucs commented on ARROW-16204: - Postponing to 9.0 > [C++][Dataset] Default error

[jira] [Resolved] (ARROW-12659) [C++][Compute] Support SimplifyWithGuarantee(is_null(foo), is_valid(foo))

2022-04-21 Thread Jonathan Keane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Keane resolved ARROW-12659. Fix Version/s: 8.0.0 Resolution: Fixed Issue resolved by pull request 12891

[jira] [Updated] (ARROW-13530) [C++] Implement cumulative sum compute function

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-13530: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++] Implement cumulative sum

[jira] [Commented] (ARROW-13530) [C++] Implement cumulative sum compute function

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525986#comment-17525986 ] Krisztian Szucs commented on ARROW-13530: - Postponing to 9.0. > [C++] Implement cumulative sum

[jira] [Updated] (ARROW-15329) [Python] Add character limit to ChunkedArray repr

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-15329: Fix Version/s: (was: 8.0.0) > [Python] Add character limit to ChunkedArray repr >

[jira] [Updated] (ARROW-15329) [Python] Add character limit to ChunkedArray repr

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-15329: Fix Version/s: 9.0.0 > [Python] Add character limit to ChunkedArray repr >

[jira] [Commented] (ARROW-15329) [Python] Add character limit to ChunkedArray repr

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525985#comment-17525985 ] Krisztian Szucs commented on ARROW-15329: - Postponing to 9.0 > [Python] Add character limit to

[jira] [Commented] (ARROW-15664) [C++] parquet reader Segfaults with illegal SIMD instruction

2022-04-21 Thread Jonathan Keane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525984#comment-17525984 ] Jonathan Keane commented on ARROW-15664: We have a solution, though there is concern it will

[jira] [Updated] (ARROW-14596) [Python] parquet.read_table nested fields in columns does not work for use_legacy_dataset=False

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-14596: Fix Version/s: 9.0.0 (was: 8.0.0) > [Python] parquet.read_table

[jira] [Commented] (ARROW-14596) [Python] parquet.read_table nested fields in columns does not work for use_legacy_dataset=False

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525981#comment-17525981 ] Krisztian Szucs commented on ARROW-14596: - Postponing to 9.0 > [Python] parquet.read_table

[jira] [Updated] (ARROW-13593) [C++][Dataset][Parquet] Support parquet modular encryption in the new Dataset API

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-13593: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++][Dataset][Parquet]

[jira] [Commented] (ARROW-13593) [C++][Dataset][Parquet] Support parquet modular encryption in the new Dataset API

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525976#comment-17525976 ] Krisztian Szucs commented on ARROW-13593: - Postponing to 9.0 > [C++][Dataset][Parquet] Support

[jira] [Commented] (ARROW-14182) [C++][Compute] Hash Join performance improvement

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525978#comment-17525978 ] Krisztian Szucs commented on ARROW-14182: - Postponing to 9.0 > [C++][Compute] Hash Join

[jira] [Updated] (ARROW-14182) [C++][Compute] Hash Join performance improvement

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-14182: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++][Compute] Hash Join

[jira] [Updated] (ARROW-14725) [C++][Compute] Extract Expression simplification passes to an extensible registry

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-14725: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++][Compute] Extract

[jira] [Updated] (ARROW-16272) Poor read performance of S3FileSystem.open_input_file when used with `pd.read_csv`

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-16272: --- Description: `pyarrow.fs.S3FileSystem.open_input_file` and

[jira] [Updated] (ARROW-15671) [GLib] Generate Vala VAPI file

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-15671: Fix Version/s: 9.0.0 (was: 8.0.0) > [GLib] Generate Vala VAPI file

[jira] [Commented] (ARROW-15671) [GLib] Generate Vala VAPI file

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525973#comment-17525973 ] Krisztian Szucs commented on ARROW-15671: - Since it's in progress I'm postponing to 9.0, but

[jira] [Updated] (ARROW-14134) [C++][Compute] Standardize generator dispatchers

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-14134: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++][Compute] Standardize

[jira] [Updated] (ARROW-15117) [Docs] Splitting the sphinx-based Arrow docs into separate sphinx projects

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-15117: Fix Version/s: 9.0.0 (was: 8.0.0) > [Docs] Splitting the

[jira] [Updated] (ARROW-9843) [C++][Python] Implement Between ternary kernel

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-9843: --- Fix Version/s: 9.0.0 (was: 8.0.0) > [C++][Python] Implement Between

[jira] [Updated] (ARROW-15612) [C++] Migrate Flight APIs to Result<>

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-15612: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++] Migrate Flight APIs to

[jira] [Updated] (ARROW-14330) [C++] Create DataHolder that can be used for caching during exec plans

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-14330: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++] Create DataHolder that

[jira] [Updated] (ARROW-14034) [Java] Unexpected Allocator states created after allocating buffer whose AllocationManager has different size from the requested size

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-14034: Fix Version/s: 9.0.0 (was: 8.0.0) > [Java] Unexpected Allocator

[jira] [Updated] (ARROW-16240) [Python] Support row_group_size/chunk_size keyword in pq.write_to_dataset with use_legacy_dataset=False

2022-04-21 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-16240: -- Parent: ARROW-16119 Issue Type: Sub-task (was: Improvement) >

[jira] [Resolved] (ARROW-12515) [Dev][Wiki][Release] Fix and update Windows RC verify script

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs resolved ARROW-12515. - Resolution: Resolved > [Dev][Wiki][Release] Fix and update Windows RC verify script >

[jira] [Updated] (ARROW-9285) [C++] Detect unauthorized memory allocations in function kernels

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-9285: --- Fix Version/s: 9.0.0 (was: 8.0.0) > [C++] Detect unauthorized memory

[jira] [Commented] (ARROW-16240) [Python] Support row_group_size/chunk_size keyword in pq.write_to_dataset with use_legacy_dataset=False

2022-04-21 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525966#comment-17525966 ] Joris Van den Bossche commented on ARROW-16240: --- cc [~westonpace] do you remember if this

[jira] [Updated] (ARROW-12084) [C++][Compute] Add remainder and quotient compute::Function

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-12084: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++][Compute] Add remainder

[jira] (ARROW-16240) [Python] Support row_group_size/chunk_size keyword in pq.write_to_dataset with use_legacy_dataset=False

2022-04-21 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16240 ] Joris Van den Bossche deleted comment on ARROW-16240: --- was (Author: jorisvandenbossche): cc @westonpace do you remember if this has been discussed before how the

[jira] [Commented] (ARROW-16240) [Python] Support row_group_size/chunk_size keyword in pq.write_to_dataset with use_legacy_dataset=False

2022-04-21 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525965#comment-17525965 ] Joris Van den Bossche commented on ARROW-16240: --- cc @westonpace do you remember if this

[jira] [Updated] (ARROW-14443) [C++] Implement Plan Fragments support for ExecPlan.

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-14443: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++] Implement Plan Fragments

[jira] [Updated] (ARROW-16240) [Python] Support row_group_size/chunk_size keyword in pq.write_to_dataset with use_legacy_dataset=False

2022-04-21 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-16240: -- Fix Version/s: 8.0.0 > [Python] Support row_group_size/chunk_size keyword in

[jira] [Updated] (ARROW-14445) [C++] Implement memory management for DataHolder

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-14445: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++] Implement memory

[jira] [Updated] (ARROW-11502) [C++] Optimize Arrow ByteStreamSplitDecode with Neon

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-11502: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++] Optimize Arrow

[jira] [Updated] (ARROW-16240) [Python] Support row_group_size/chunk_size keyword in pq.write_to_dataset with use_legacy_dataset=False

2022-04-21 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-16240: -- Description: The {{pq.write_to_dataset}} (legacy implementation) supports the

[jira] [Updated] (ARROW-16272) Poor read performance of S3FileSystem.open_input_file when used with `pd.read_csv`

2022-04-21 Thread Sahil Gupta (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Gupta updated ARROW-16272: Description: `pyarrow.fs.S3FileSystem.open_input_file` and

[jira] [Updated] (ARROW-11776) [Java][Dataset] Support writing to files within dataset scanner via JNI

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-11776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-11776: Fix Version/s: 9.0.0 (was: 8.0.0) > [Java][Dataset] Support

[jira] [Created] (ARROW-16272) Poor read performance of S3FileSystem.open_input_file when used with `pd.read_csv`

2022-04-21 Thread Sahil Gupta (Jira)
Sahil Gupta created ARROW-16272: --- Summary: Poor read performance of S3FileSystem.open_input_file when used with `pd.read_csv` Key: ARROW-16272 URL: https://issues.apache.org/jira/browse/ARROW-16272

[jira] [Updated] (ARROW-16202) [C++][Parquet] WipeOutDecryptionKeys doesn't securely wipe out keys

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-16202: --- Fix Version/s: 9.0.0 (was: 8.0.0) > [C++][Parquet]

[jira] [Updated] (ARROW-12755) [C++][Compute] Add quotient and modulo kernels

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-12755: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++][Compute] Add quotient

[jira] [Updated] (ARROW-12723) [C++][Compute] GroupBy: add unittests for individual components of hash group by

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-12723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-12723: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++][Compute] GroupBy: add

[jira] [Updated] (ARROW-16240) [Python] Support row_group_size/chunk_size keyword in pq.write_to_dataset with use_legacy_dataset=False

2022-04-21 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-16240: -- Summary: [Python] Support row_group_size/chunk_size keyword in

[jira] [Updated] (ARROW-15250) [Python][R] Temporal floor/ceil/round for should accept frequency string

2022-04-21 Thread Rok Mihevc (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rok Mihevc updated ARROW-15250: --- Labels: kernel timestamp (was: good-first-issue kernel timestamp) > [Python][R] Temporal

[jira] [Commented] (ARROW-16202) [C++][Parquet] WipeOutDecryptionKeys doesn't securely wipe out keys

2022-04-21 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525954#comment-17525954 ] Antoine Pitrou commented on ARROW-16202: Yes, probably. > [C++][Parquet] WipeOutDecryptionKeys

[jira] [Updated] (ARROW-8221) [Python][Dataset] Expose schema inference / validation options in the factory

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-8221: --- Fix Version/s: 9.0.0 (was: 8.0.0) > [Python][Dataset] Expose schema

[jira] [Updated] (ARROW-15277) [Python] Use Make to create ChunkedArray and remove checks

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-15277: Fix Version/s: 9.0.0 (was: 8.0.0) > [Python] Use Make to create

[jira] [Commented] (ARROW-14656) [Python] Add sort_by helper method to StructArray

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525952#comment-17525952 ] Krisztian Szucs commented on ARROW-14656: - Postponed to 9.0 > [Python] Add sort_by helper

[jira] [Updated] (ARROW-14656) [Python] Add sort_by helper method to StructArray

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-14656: Fix Version/s: 9.0.0 (was: 8.0.0) > [Python] Add sort_by helper

[jira] [Commented] (ARROW-16202) [C++][Parquet] WipeOutDecryptionKeys doesn't securely wipe out keys

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525951#comment-17525951 ] Krisztian Szucs commented on ARROW-16202: - [~apitrou] shall we postpone it to 9.0? >

[jira] [Updated] (ARROW-15961) [C++] Check nullability when validating fields on batches or struct arrays

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-15961: Fix Version/s: 9.0.0 (was: 8.0.0) > [C++] Check nullability when

[jira] [Updated] (ARROW-16116) [C++] Properly handle non-nullable fields in Parquet reading

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-16116: Fix Version/s: 9.0.0 > [C++] Properly handle non-nullable fields in Parquet reading >

[jira] [Commented] (ARROW-15961) [C++] Check nullability when validating fields on batches or struct arrays

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525949#comment-17525949 ] Krisztian Szucs commented on ARROW-15961: - Postponing to 9.0 based on the PR comments. > [C++]

[jira] [Commented] (ARROW-15897) [C++] Linker error when building Flight tests

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525946#comment-17525946 ] Krisztian Szucs commented on ARROW-15897: - [~apitrou] is this still an issue? > [C++] Linker

[jira] [Commented] (ARROW-15664) [C++] parquet reader Segfaults with illegal SIMD instruction

2022-04-21 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525945#comment-17525945 ] Krisztian Szucs commented on ARROW-15664: - [~jonkeane] what's the status of this issue? >
