[jira] [Created] (ARROW-17640) [C++] Add File Handling Test cases for GlobFile handling in Substrait Read

2022-09-06 Thread Vibhatha Lakmal Abeykoon (Jira)
Vibhatha Lakmal Abeykoon created ARROW-17640: Summary: [C++] Add File Handling Test cases for GlobFile handling in Substrait Read Key: ARROW-17640 URL:

[jira] [Commented] (ARROW-17601) [C++] Error when creating Expression on Decimal128 types: precision out of range

2022-09-06 Thread Yibo Cai (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601120#comment-17601120 ] Yibo Cai commented on ARROW-17601: -- For arrays, if we reduce the maximal possible precision, we don't

[jira] [Updated] (ARROW-17639) arrow::write_parquet fails when column first element is null

2022-09-06 Thread David (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David updated ARROW-17639: -- Description: * Works reticulate::py_run_string(" import pandas as pd df = pd.DataFrame( {'col1': [[1,2],

[jira] [Updated] (ARROW-17639) arrow::write_parquet fails when column first element is null

2022-09-06 Thread David (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David updated ARROW-17639: -- Description: * Works reticulate::py_run_string(" import pandas as pd df = pd.DataFrame({'col1': [[1,2], None,

[jira] [Created] (ARROW-17639) arrow::write_parquet fails when column first element is null

2022-09-06 Thread David (Jira)
David created ARROW-17639: - Summary: arrow::write_parquet fails when column first element is null Key: ARROW-17639 URL: https://issues.apache.org/jira/browse/ARROW-17639 Project: Apache Arrow Issue

[jira] [Commented] (ARROW-17595) [C++] Installation Error stdlib.h no such file or directory

2022-09-06 Thread Kouhei Sutou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601056#comment-17601056 ] Kouhei Sutou commented on ARROW-17595: -- Could you try "ol8_appstream" instead of "acx-appstream"? I

[jira] [Commented] (ARROW-17614) [CI][Python] test test_write_dataset_max_rows_per_file is producing several nightly build failures

2022-09-06 Thread Weston Pace (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601033#comment-17601033 ] Weston Pace commented on ARROW-17614: - Hmm, my guess is this is probably caused by

[jira] [Commented] (ARROW-17627) [GO][Parquet] Unable to pass metadata without StoreSchema

2022-09-06 Thread Matthew Topol (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601031#comment-17601031 ] Matthew Topol commented on ARROW-17627: --- Hey [~zhouyan1014] thanks for the proposal. This looks

[jira] [Updated] (ARROW-17638) [Go] Update C Data API support

2022-09-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17638: --- Labels: pull-request-available (was: ) > [Go] Update C Data API support >

[jira] [Assigned] (ARROW-17638) [Go] Update C Data API support

2022-09-06 Thread Matthew Topol (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Topol reassigned ARROW-17638: - Assignee: Matthew Topol > [Go] Update C Data API support >

[jira] [Created] (ARROW-17638) [Go] Update C Data API support

2022-09-06 Thread Matthew Topol (Jira)
Matthew Topol created ARROW-17638: - Summary: [Go] Update C Data API support Key: ARROW-17638 URL: https://issues.apache.org/jira/browse/ARROW-17638 Project: Apache Arrow Issue Type:

[jira] [Updated] (ARROW-17600) [Go] Implement Casting for Complex Types (List/Struct/etc.)

2022-09-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17600: --- Labels: pull-request-available (was: ) > [Go] Implement Casting for Complex Types

[jira] [Commented] (ARROW-17597) [R][C++] Why is read_csv_arrow so much slower when using S3 path notation?

2022-09-06 Thread Weston Pace (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601015#comment-17601015 ] Weston Pace commented on ARROW-17597: - I get pretty similar results. >60 seconds for s3_bucket. ~4

[jira] [Updated] (ARROW-17617) [Doc] Remove experimental marker for Flight RPC in feature matrix

2022-09-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17617: --- Labels: pull-request-available (was: ) > [Doc] Remove experimental marker for Flight RPC

[jira] [Commented] (ARROW-17617) [Doc] Remove experimental marker for Flight RPC in feature matrix

2022-09-06 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600982#comment-17600982 ] David Li commented on ARROW-17617: -- Actually yeah: Flight predates

[jira] [Commented] (ARROW-17617) [Doc] Remove experimental marker for Flight RPC in feature matrix

2022-09-06 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600981#comment-17600981 ] David Li commented on ARROW-17617: -- Looking around, I don't actually see a vote (maybe it was from

[jira] [Resolved] (ARROW-16000) [C++][Dataset] Support Latin-1 encoding

2022-09-06 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Li resolved ARROW-16000. -- Fix Version/s: 10.0.0 Resolution: Fixed Issue resolved by pull request 13820

[jira] [Commented] (ARROW-17637) [R] as.Date fails going from timestamp[us] to timestamp[s]

2022-09-06 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600960#comment-17600960 ] Neal Richardson commented on ARROW-17637: - The naive cast to date32() works: {code} >

[jira] [Updated] (ARROW-17637) [R] as.Date fails going from timestamp[us] to timestamp[s]

2022-09-06 Thread Nicola Crane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicola Crane updated ARROW-17637: - Description: Using as.Date to convert from timestamp to date fails even though this is fine in

[jira] [Updated] (ARROW-17637) [R] as.Date fails going from timestamp[us] to timestamp[s]

2022-09-06 Thread Nicola Crane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicola Crane updated ARROW-17637: - Component/s: R > [R] as.Date fails going from timestamp[us] to timestamp[s] >

[jira] [Updated] (ARROW-17637) [R] as.Date fails going from timestamp[us] to timestamp[s]

2022-09-06 Thread Nicola Crane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicola Crane updated ARROW-17637: - Description: Using as.Date to convert from timestamp to date fails in Arrow even though this

[jira] [Updated] (ARROW-17637) [R] as.Date fails going from timestamp[us] to timestamp[s]

2022-09-06 Thread Nicola Crane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicola Crane updated ARROW-17637: - Summary: [R] as.Date fails going from timestamp[us] to timestamp[s] (was: [R] as.Date fails

[jira] [Created] (ARROW-17637) [R] as.Date fails going from timestamp[s

2022-09-06 Thread Nicola Crane (Jira)
Nicola Crane created ARROW-17637: Summary: [R] as.Date fails going from timestamp[s Key: ARROW-17637 URL: https://issues.apache.org/jira/browse/ARROW-17637 Project: Apache Arrow Issue Type:

[jira] [Comment Edited] (ARROW-17636) Converting Table to pandas raises NotImplementedError (when table previously saved as partitioned parquet dataset)

2022-09-06 Thread Roberto Lobo (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600947#comment-17600947 ] Roberto Lobo edited comment on ARROW-17636 at 9/6/22 7:11 PM: -- Using an

[jira] [Comment Edited] (ARROW-17636) Converting Table to pandas raises NotImplementedError (when table previously saved as partitioned parquet dataset)

2022-09-06 Thread Roberto Lobo (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600947#comment-17600947 ] Roberto Lobo edited comment on ARROW-17636 at 9/6/22 7:09 PM: -- Using an

[jira] [Commented] (ARROW-17636) Converting Table to pandas raises NotImplementedError (when table previously saved as partitioned parquet dataset)

2022-09-06 Thread Roberto Lobo (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600947#comment-17600947 ] Roberto Lobo commented on ARROW-17636: -- Using a workaround: {code:java}

[jira] [Assigned] (ARROW-15479) [C++] Cast fixed size list to compatible fixed size list type (other values type, other field name)

2022-09-06 Thread Kshiteej K (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kshiteej K reassigned ARROW-15479: -- Assignee: Kshiteej K > [C++] Cast fixed size list to compatible fixed size list type (other

[jira] [Created] (ARROW-17636) Converting Table to pandas raises NotImplementedError (when table previously saved as partitioned parquet dataset)

2022-09-06 Thread Roberto Lobo (Jira)
Roberto Lobo created ARROW-17636: Summary: Converting Table to pandas raises NotImplementedError (when table previously saved as partitioned parquet dataset) Key: ARROW-17636 URL:

[jira] [Commented] (ARROW-17635) [Python][CI] Sync conda recipe with the arrow-cpp feedstock

2022-09-06 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600890#comment-17600890 ] Antoine Pitrou commented on ARROW-17635: Seems like the last sync was done ~6 months ago in

[jira] [Commented] (ARROW-16008) [C++] It is more expensive than expected to create a default Result

2022-09-06 Thread Todd Farmer (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600888#comment-17600888 ] Todd Farmer commented on ARROW-16008: - This issue was last updated over 90 days ago, which may be an

[jira] [Assigned] (ARROW-16008) [C++] It is more expensive than expected to create a default Result

2022-09-06 Thread Todd Farmer (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Farmer reassigned ARROW-16008: --- Assignee: (was: Weston Pace) > [C++] It is more expensive than expected to create a

[jira] [Commented] (ARROW-17635) [Python][CI] Sync conda recipe with the arrow-cpp feedstock

2022-09-06 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600887#comment-17600887 ] Antoine Pitrou commented on ARROW-17635: [~raulcd] I don't know if you would be interested in

[jira] [Created] (ARROW-17635) [Python][CI] Sync conda recipe with the arrow-cpp feedstock

2022-09-06 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-17635: -- Summary: [Python][CI] Sync conda recipe with the arrow-cpp feedstock Key: ARROW-17635 URL: https://issues.apache.org/jira/browse/ARROW-17635 Project: Apache

[jira] [Commented] (ARROW-17597) [R][C++] Why is read_csv_arrow so much slower when using S3 path notation?

2022-09-06 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600882#comment-17600882 ] Antoine Pitrou commented on ARROW-17597: bq. is accessing data via the S3 URI going to be

[jira] [Commented] (ARROW-17374) [R] R Arrow install fails with SNAPPY_LIB-NOTFOUND

2022-09-06 Thread Vincent Nijs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600881#comment-17600881 ] Vincent Nijs commented on ARROW-17374: -- I haven't tried that tbh. The docker file below is for

[jira] [Comment Edited] (ARROW-2034) [C++] Filesystem implementation for Azure Blob Storage

2022-09-06 Thread Dean MacGregor (Jira)
[ https://issues.apache.org/jira/browse/ARROW-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600865#comment-17600865 ] Dean MacGregor edited comment on ARROW-2034 at 9/6/22 4:12 PM: --- If someone

[jira] [Commented] (ARROW-2034) [C++] Filesystem implementation for Azure Blob Storage

2022-09-06 Thread Dean MacGregor (Jira)
[ https://issues.apache.org/jira/browse/ARROW-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600865#comment-17600865 ] Dean MacGregor commented on ARROW-2034: --- If someone wants to work on this but doesn't have an Azure

[jira] [Closed] (ARROW-17633) [Python][CI] test_write_dataset_max_rows_per_file is flaky

2022-09-06 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou closed ARROW-17633. -- Resolution: Duplicate > [Python][CI] test_write_dataset_max_rows_per_file is flaky >

[jira] [Updated] (ARROW-17634) pyarrow.fs import reserves large amount of memory

2022-09-06 Thread James Coder (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Coder updated ARROW-17634: Description: It seems that in version 9.0.0 `import pyarrow.fs` reserves 1+ (close to 2) gigs of

[jira] [Commented] (ARROW-17614) [CI][Python] test test_write_dataset_max_rows_per_file is producing several nightly build failures

2022-09-06 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600862#comment-17600862 ] Antoine Pitrou commented on ARROW-17614: [~vibhatha] [~westonpace] > [CI][Python] test

[jira] [Commented] (ARROW-17633) [Python][CI] test_write_dataset_max_rows_per_file is flaky

2022-09-06 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600861#comment-17600861 ] Antoine Pitrou commented on ARROW-17633: Definitely! > [Python][CI]

[jira] [Created] (ARROW-17634) pyarrow.fs import reserves large amount of memory

2022-09-06 Thread James Coder (Jira)
James Coder created ARROW-17634: --- Summary: pyarrow.fs import reserves large amount of memory Key: ARROW-17634 URL: https://issues.apache.org/jira/browse/ARROW-17634 Project: Apache Arrow Issue

[jira] [Commented] (ARROW-17633) [Python][CI] test_write_dataset_max_rows_per_file is flaky

2022-09-06 Thread Raúl Cumplido (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600852#comment-17600852 ] Raúl Cumplido commented on ARROW-17633: --- duplicate from this one?

[jira] [Updated] (ARROW-16384) [Doc][Flight] Mention Flight SQL in implementation status

2022-09-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-16384: --- Labels: pull-request-available (was: ) > [Doc][Flight] Mention Flight SQL in

[jira] [Commented] (ARROW-17633) [Python][CI] test_write_dataset_max_rows_per_file is flaky

2022-09-06 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600846#comment-17600846 ] Antoine Pitrou commented on ARROW-17633: cc [~vibhatha] [~westonpace] > [Python][CI]

[jira] [Created] (ARROW-17633) [Python][CI] test_write_dataset_max_rows_per_file is flaky

2022-09-06 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-17633: -- Summary: [Python][CI] test_write_dataset_max_rows_per_file is flaky Key: ARROW-17633 URL: https://issues.apache.org/jira/browse/ARROW-17633 Project: Apache Arrow

[jira] [Commented] (ARROW-17595) [C++] Installation Error stdlib.h no such file or directory

2022-09-06 Thread Robert Tidwell (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600841#comment-17600841 ] Robert Tidwell commented on ARROW-17595: It is a local repo of the OEL-appstream for 8.5 >

[jira] [Commented] (ARROW-17597) [R][C++] Why is read_csv_arrow so much slower when using S3 path notation?

2022-09-06 Thread Carl Boettiger (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600838#comment-17600838 ] Carl Boettiger commented on ARROW-17597: Just a note, but I think the additional latency in S3

[jira] [Assigned] (ARROW-4709) [C++] Optimize for ordered JSON fields

2022-09-06 Thread Ben Harkins (Jira)
[ https://issues.apache.org/jira/browse/ARROW-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Harkins reassigned ARROW-4709: -- Assignee: Ben Harkins > [C++] Optimize for ordered JSON fields >

[jira] [Commented] (ARROW-17597) [R][C++] Why is read_csv_arrow so much slower when using S3 path notation?

2022-09-06 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600812#comment-17600812 ] Neal Richardson commented on ARROW-17597: - https URLs are handled by R connections, not the

[jira] [Comment Edited] (ARROW-17597) [R][C++] Why is read_csv_arrow so much slower when using S3 path notation?

2022-09-06 Thread Nicola Crane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600811#comment-17600811 ] Nicola Crane edited comment on ARROW-17597 at 9/6/22 2:51 PM: -- Hmm, that's

[jira] [Commented] (ARROW-17597) [R][C++] Why is read_csv_arrow so much slower when using S3 path notation?

2022-09-06 Thread Nicola Crane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600811#comment-17600811 ] Nicola Crane commented on ARROW-17597: -- Hmm, that's a point; is accessing data via the S3 URI going

[jira] (ARROW-17597) [R][C++] Why is read_csv_arrow so much slower when using S3 path notation?

2022-09-06 Thread Nicola Crane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17597 ] Nicola Crane deleted comment on ARROW-17597: -- was (Author: thisisnic): Hmm, that's a point; is accessing data via the S3 URI going to be inherently slower than accessing it via https? I

[jira] [Updated] (ARROW-17632) [Python][C++] Add details of where is lib arrow being found during build

2022-09-06 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-17632: -- Fix Version/s: 10.0.0 > [Python][C++] Add details of where is lib arrow being

[jira] [Commented] (ARROW-17597) [R][C++] Why is read_csv_arrow so much slower when using S3 path notation?

2022-09-06 Thread Nicola Crane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600807#comment-17600807 ] Nicola Crane commented on ARROW-17597: -- Hmm, that's a point; is accessing data via the S3 URI going

[jira] [Updated] (ARROW-17632) [Python][C++] Add details of where is lib arrow being found during build

2022-09-06 Thread Raúl Cumplido (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raúl Cumplido updated ARROW-17632: -- Description: As discussed here:

[jira] [Created] (ARROW-17632) [Python][C++] Add details of where is lib arrow being found during build

2022-09-06 Thread Raúl Cumplido (Jira)
Raúl Cumplido created ARROW-17632: - Summary: [Python][C++] Add details of where is lib arrow being found during build Key: ARROW-17632 URL: https://issues.apache.org/jira/browse/ARROW-17632 Project:

[jira] [Comment Edited] (ARROW-15006) [Python][Doc] Iteratively enable more numpydoc checks

2022-09-06 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600800#comment-17600800 ] Joris Van den Bossche edited comment on ARROW-15006 at 9/6/22 2:36 PM:

[jira] [Commented] (ARROW-15006) [Python][Doc] Iteratively enable more numpydoc checks

2022-09-06 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600800#comment-17600800 ] Joris Van den Bossche commented on ARROW-15006: --- Nice overview! I think your proposal for

[jira] [Commented] (ARROW-17052) [C++][Python][FlightRPC] Ensure ::Serialize and ::Deserialize are consistently implemented

2022-09-06 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600793#comment-17600793 ] David Li commented on ARROW-17052: -- I guess PutResult is never exposed directly so don't worry about it

[jira] [Commented] (ARROW-8210) [C++][Dataset] Handling of duplicate columns in Dataset factory and scanning

2022-09-06 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600788#comment-17600788 ] Joris Van den Bossche commented on ARROW-8210: -- There is some discussion in

[jira] [Created] (ARROW-17631) Propagate table/columns comment into Arrow Schema

2022-09-06 Thread Igor Suhorukov (Jira)
Igor Suhorukov created ARROW-17631: -- Summary: Propagate table/columns comment into Arrow Schema Key: ARROW-17631 URL: https://issues.apache.org/jira/browse/ARROW-17631 Project: Apache Arrow

[jira] [Created] (ARROW-17630) Introduce column index in JdbcToArrowTypeConverter

2022-09-06 Thread Igor Suhorukov (Jira)
Igor Suhorukov created ARROW-17630: -- Summary: Introduce column index in JdbcToArrowTypeConverter Key: ARROW-17630 URL: https://issues.apache.org/jira/browse/ARROW-17630 Project: Apache Arrow

[jira] [Created] (ARROW-17629) Bind DB column to Arrow Map type in JdbcToArrowUtils

2022-09-06 Thread Igor Suhorukov (Jira)
Igor Suhorukov created ARROW-17629: -- Summary: Bind DB column to Arrow Map type in JdbcToArrowUtils Key: ARROW-17629 URL: https://issues.apache.org/jira/browse/ARROW-17629 Project: Apache Arrow

[jira] [Updated] (ARROW-16728) [Python] Switch default and deprecate use_legacy_dataset=True in ParquetDataset

2022-09-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-16728: --- Labels: pull-request-available (was: ) > [Python] Switch default and deprecate

[jira] [Commented] (ARROW-17601) [C++] Error when creating Expression on Decimal128 types: precision out of range

2022-09-06 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600769#comment-17600769 ] Neal Richardson commented on ARROW-17601: - The trouble with the status quo is that we fail even

[jira] [Commented] (ARROW-17606) [C++] Cast float to decimal truncates

2022-09-06 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600765#comment-17600765 ] Antoine Pitrou commented on ARROW-17606: Ah, I suppose we could do that, but it would be quite

[jira] [Updated] (ARROW-17320) [Python] Refine pyarrow.parquet API exposure

2022-09-06 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-17320: -- Fix Version/s: 10.0.0 > [Python] Refine pyarrow.parquet API exposure >

[jira] [Commented] (ARROW-17606) [C++] Cast float to decimal truncates

2022-09-06 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600759#comment-17600759 ] Joris Van den Bossche commented on ARROW-17606: --- This might be too naive, but when casting

[jira] [Commented] (ARROW-17597) [R][C++] Why is read_csv_arrow so much slower when using S3 path notation?

2022-09-06 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600752#comment-17600752 ] Antoine Pitrou commented on ARROW-17597: Well, I suppose time is spent in S3 accesses? How does

[jira] [Commented] (ARROW-17628) [CI][Packaging][Java] Publish latest nightly with SNAPSHOT version

2022-09-06 Thread Kouhei Sutou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600750#comment-17600750 ] Kouhei Sutou commented on ARROW-17628: -- +1 > [CI][Packaging][Java] Publish latest nightly with

[jira] [Commented] (ARROW-17597) [R][C++] Why is read_csv_arrow so much slower when using S3 path notation?

2022-09-06 Thread Nicola Crane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600742#comment-17600742 ] Nicola Crane commented on ARROW-17597: -- I had a look into this in terms of profiling it to see

[jira] [Updated] (ARROW-17597) [R][C++] Why is read_csv_arrow so much slower when using S3 path notation?

2022-09-06 Thread Nicola Crane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicola Crane updated ARROW-17597: - Component/s: C++ > [R][C++] Why is read_csv_arrow so much slower when using S3 path notation? >

[jira] [Updated] (ARROW-17597) [R][C++] Why is read_csv_arrow so much slower when using S3 path notation?

2022-09-06 Thread Nicola Crane (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicola Crane updated ARROW-17597: - Summary: [R][C++] Why is read_csv_arrow so much slower when using S3 path notation? (was: [R]

[jira] [Updated] (ARROW-14596) [Python] parquet.read_table nested fields in columns does not work for use_legacy_dataset=False

2022-09-06 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-14596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-14596: -- Fix Version/s: 10.0.0 > [Python] parquet.read_table nested fields in columns

[jira] [Commented] (ARROW-16728) [Python] Switch default and deprecate use_legacy_dataset=True in ParquetDataset

2022-09-06 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600732#comment-17600732 ] Joris Van den Bossche commented on ARROW-16728: --- Yes, indeed (but so the two first items

[jira] [Assigned] (ARROW-16728) [Python] Switch default and deprecate use_legacy_dataset=True in ParquetDataset

2022-09-06 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-16728: - Assignee: Joris Van den Bossche > [Python] Switch default and

[jira] [Updated] (ARROW-16728) [Python] Switch default and deprecate use_legacy_dataset=True in ParquetDataset

2022-09-06 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-16728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-16728: -- Fix Version/s: 10.0.0 > [Python] Switch default and deprecate

[jira] [Resolved] (ARROW-15693) [Dev] Update crossbow templates to use master or main

2022-09-06 Thread Alessandro Molina (Jira)
[ https://issues.apache.org/jira/browse/ARROW-15693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alessandro Molina resolved ARROW-15693. --- Fix Version/s: 10.0.0 Resolution: Fixed Issue resolved by pull request 13975

[jira] [Commented] (ARROW-17628) [CI][Packaging][Java] Publish latest nightly with SNAPSHOT version

2022-09-06 Thread Jacob Wujciak-Jens (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600727#comment-17600727 ] Jacob Wujciak-Jens commented on ARROW-17628: +1 > [CI][Packaging][Java] Publish latest

[jira] [Commented] (ARROW-17628) [CI][Packaging][Java] Publish latest nightly with SNAPSHOT version

2022-09-06 Thread Raúl Cumplido (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600723#comment-17600723 ] Raúl Cumplido commented on ARROW-17628: --- [~dsusanibara] [~kou] what do you think about this one? I

[jira] [Created] (ARROW-17628) [CI][Packaging][Java] Publish latest nightly with SNAPSHOT version

2022-09-06 Thread Raúl Cumplido (Jira)
Raúl Cumplido created ARROW-17628: - Summary: [CI][Packaging][Java] Publish latest nightly with SNAPSHOT version Key: ARROW-17628 URL: https://issues.apache.org/jira/browse/ARROW-17628 Project: Apache

[jira] [Commented] (ARROW-17616) [CI][Java] Java nightly upload job fails after introduction of pruning

2022-09-06 Thread Raúl Cumplido (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600720#comment-17600720 ] Raúl Cumplido commented on ARROW-17616: --- It seems we also have to update the regex as now dev

[jira] [Created] (ARROW-17627) [GO][Parquet] Unable to pass metadata without StoreSchema

2022-09-06 Thread Yan Zhou (Jira)
Yan Zhou created ARROW-17627: Summary: [GO][Parquet] Unable to pass metadata without StoreSchema Key: ARROW-17627 URL: https://issues.apache.org/jira/browse/ARROW-17627 Project: Apache Arrow

[jira] [Updated] (ARROW-13798) [Python] Selective projection of struct fields errors with use_legacy_dataset = False

2022-09-06 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-13798: -- Fix Version/s: 10.0.0 > [Python] Selective projection of struct fields errors

[jira] [Updated] (ARROW-13798) [Python] Selective projection of struct fields errors with use_legacy_dataset = False

2022-09-06 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-13798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-13798: -- Priority: Critical (was: Major) > [Python] Selective projection of struct

[jira] [Closed] (ARROW-17615) [CI][Packaging] arrow-cpp on conda nightlies fail finding Arrow package

2022-09-06 Thread Kouhei Sutou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou closed ARROW-17615. Resolution: Not A Problem This is a usage issue. (We can't use {{find_package(COMPONENTS)}} for

[jira] [Updated] (ARROW-17626) discrepancy of schema and schema.metadata if second same-name key is ignored if the second key is byte-encoded

2022-09-06 Thread Chris (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris updated ARROW-17626: -- Description: {code:python} schema = pa.schema( [('col1', pa.int8())], metadata={ 'key': 'key

[jira] [Updated] (ARROW-17626) discrepancy of schema and schema.metadata if second same-name key is byte-encoded

2022-09-06 Thread Chris (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris updated ARROW-17626: -- Description: {code:python} schema = pa.schema( [('col1', pa.int8())], metadata={ 'key': 'key

[jira] [Created] (ARROW-17626) discrepancy of schema and schema.metadata if second same-name key is ignored if the second key is byte-encoded

2022-09-06 Thread Chris (Jira)
Chris created ARROW-17626: - Summary: discrepancy of schema and schema.metadata if second same-name key is ignored if the second key is byte-encoded Key: ARROW-17626 URL:

[jira] [Assigned] (ARROW-17575) [C++][Docs] Update build document to follow new CMake package

2022-09-06 Thread Raúl Cumplido (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raúl Cumplido reassigned ARROW-17575: - Assignee: Raúl Cumplido (was: Kouhei Sutou) > [C++][Docs] Update build document to

[jira] [Commented] (ARROW-17575) [C++][Docs] Update build document to follow new CMake package

2022-09-06 Thread Raúl Cumplido (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600673#comment-17600673 ] Raúl Cumplido commented on ARROW-17575: --- Copy a comment from GitHub to remember things to add

[jira] [Commented] (ARROW-17615) [CI][Packaging] arrow-cpp on conda nightlies fail finding Arrow package

2022-09-06 Thread Raúl Cumplido (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600670#comment-17600670 ] Raúl Cumplido commented on ARROW-17615: --- I was using components as I was replicating what we have

[jira] [Updated] (ARROW-17450) Arrow-Parquet cannot read columns with Run Length Encoding (RLE)

2022-09-06 Thread Nishanth (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishanth updated ARROW-17450: - Attachment: athena_boolean.gz.parquet > Arrow-Parquet cannot read columns with Run Length Encoding

[jira] [Commented] (ARROW-17450) Arrow-Parquet cannot read columns with Run Length Encoding (RLE)

2022-09-06 Thread Nishanth (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600649#comment-17600649 ] Nishanth commented on ARROW-17450: -- Attached a sample file > Arrow-Parquet cannot read columns with

[jira] [Commented] (ARROW-17374) [R] R Arrow install fails with SNAPPY_LIB-NOTFOUND

2022-09-06 Thread Kouhei Sutou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17600643#comment-17600643 ] Kouhei Sutou commented on ARROW-17374: -- Thanks. I could build the {{Dockerfile}} successfully...

[jira] [Updated] (ARROW-17624) [C++][Acero] Window Functions add helper classes for frame calculation

2022-09-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17624: --- Labels: pull-request-available query-engine (was: query-engine) > [C++][Acero] Window

[jira] [Updated] (ARROW-17623) [C++][Acero] Window Functions add helper classes for ranking

2022-09-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-17623: --- Labels: pull-request-available query-engine (was: query-engine) > [C++][Acero] Window

[jira] [Updated] (ARROW-17625) Cast error on roundtrip of categorical column to parquet and back

2022-09-06 Thread Yishai Beeri (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yishai Beeri updated ARROW-17625: - Description: Writing a table to parquet, then reading it back fails if: # One of the columns

[jira] [Updated] (ARROW-17625) Cast error on roundtrip of categorical column to parquet and back

2022-09-06 Thread Yishai Beeri (Jira)
[ https://issues.apache.org/jira/browse/ARROW-17625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yishai Beeri updated ARROW-17625: - Description: Writing a table to parquet, then reading it back fails if: # One of the columns
