[jira] [Commented] (ARROW-1231) [C++] Add filesystem / IO implementation for Google Cloud Storage

2020-03-10 Thread Frank Natividad (Jira)
[ https://issues.apache.org/jira/browse/ARROW-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055656#comment-17055656 ] Frank Natividad commented on ARROW-1231: The XML API does exist and compatibility

[jira] [Created] (ARROW-8055) [GLib][Ruby] Add some metadata bindings to GArrowSchema

2020-03-10 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-8055: --- Summary: [GLib][Ruby] Add some metadata bindings to GArrowSchema Key: ARROW-8055 URL: https://issues.apache.org/jira/browse/ARROW-8055 Project: Apache Arrow Is

[jira] [Updated] (ARROW-8055) [GLib][Ruby] Add some metadata bindings to GArrowSchema

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8055: -- Labels: pull-request-available (was: ) > [GLib][Ruby] Add some metadata bindings to GArrowSche

[jira] [Commented] (ARROW-7830) [C++] Parquet library version doesn't change with releases

2020-03-10 Thread H. Vetinari (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055702#comment-17055702 ] H. Vetinari commented on ARROW-7830: OK, I didn't want to come off as demanding (anyt

[jira] [Commented] (ARROW-7996) Error serializing empty pandas DataFrame with pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055756#comment-17055756 ] Joris Van den Bossche commented on ARROW-7996: -- [~jdavidagudelo] Thanks for

[jira] [Updated] (ARROW-7996) [Python] Error serializing empty pandas DataFrame with pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-7996: - Summary: [Python] Error serializing empty pandas DataFrame with pyarrow (was: Er

[jira] [Commented] (ARROW-7996) [Python] Error serializing empty pandas DataFrame with pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055760#comment-17055760 ] Joris Van den Bossche commented on ARROW-7996: -- The error comes from deseria

[jira] [Updated] (ARROW-7996) [Python] Error serializing empty pandas DataFrame with pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-7996: - Labels: serialization (was: ) > [Python] Error serializing empty pandas DataFram

[jira] [Updated] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-7997: - Summary: [Python] Schema equals method with inconsistent docs in pyarrow (was: S

[jira] [Updated] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-7997: - Component/s: Python > [Python] Schema equals method with inconsistent docs in pya

[jira] [Commented] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055766#comment-17055766 ] Joris Van den Bossche commented on ARROW-7997: -- [~otaviocv] Thanks for the r

[jira] [Created] (ARROW-8056) [R] Support read and write orc file format

2020-03-10 Thread Dyfan Jones (Jira)
Dyfan Jones created ARROW-8056: -- Summary: [R] Support read and write orc file format Key: ARROW-8056 URL: https://issues.apache.org/jira/browse/ARROW-8056 Project: Apache Arrow Issue Type: New F

[jira] [Commented] (ARROW-8004) [Python] Define API for user-defined conversions of array cell values in pyarrow.array

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055776#comment-17055776 ] Joris Van den Bossche commented on ARROW-8004: -- For a more limited use case

[jira] [Commented] (ARROW-7956) [Python] Memory leak in pyarrow functions .ipc.serialize_pandas/deserialize_pandas

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055783#comment-17055783 ] Joris Van den Bossche commented on ARROW-7956: -- [~wesm] I think this was clo

[jira] [Updated] (ARROW-8010) [Python] Fixed size list not convertible to Numpy Array / pandas Series

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8010: - Summary: [Python] Fixed size list not convertible to Numpy Array / pandas Series

[jira] [Created] (ARROW-8057) Schema equality not roundtrip safe

2020-03-10 Thread Florian Jetter (Jira)
Florian Jetter created ARROW-8057: - Summary: Schema equality not roundtrip safe Key: ARROW-8057 URL: https://issues.apache.org/jira/browse/ARROW-8057 Project: Apache Arrow Issue Type: Bug

[jira] [Commented] (ARROW-8057) Schema equality not roundtrip safe

2020-03-10 Thread Florian Jetter (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055804#comment-17055804 ] Florian Jetter commented on ARROW-8057: --- Investigating the fields explicitly shows

[jira] [Updated] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-8057: -- Component/s: C++ > [C++] Schema equality not roundtrip safe through Parquet > -

[jira] [Updated] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-8057: -- Summary: [C++] Schema equality not roundtrip safe through Parquet (was: Schema equality not ro

[jira] [Comment Edited] (ARROW-7677) [C++] Handle Windows file paths with backslashes in GetTargetStats

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055814#comment-17055814 ] Joris Van den Bossche edited comment on ARROW-7677 at 3/10/20, 10:56 AM: --

[jira] [Commented] (ARROW-7677) [C++] Handle Windows file paths with backslashes in GetTargetStats

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055814#comment-17055814 ] Joris Van den Bossche commented on ARROW-7677: -- It came up in a partitioned

[jira] [Commented] (ARROW-1231) [C++] Add filesystem / IO implementation for Google Cloud Storage

2020-03-10 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055813#comment-17055813 ] Antoine Pitrou commented on ARROW-1231: --- Thank you for the explanation. I agree we

[jira] [Commented] (ARROW-7680) [C++][Dataset] Partition discovery is not working with windows path

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055815#comment-17055815 ] Joris Van den Bossche commented on ARROW-7680: -- Since ARROW-7677 is not yet

[jira] [Commented] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055816#comment-17055816 ] Antoine Pitrou commented on ARROW-8057: --- cc [~wesm] > [C++] Schema equality not ro

[jira] [Updated] (ARROW-7680) [C++][Dataset] Partition discovery is not working with windows path

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-7680: -- Labels: pull-request-available (was: ) > [C++][Dataset] Partition discovery is not working wit

[jira] [Updated] (ARROW-8056) [R] Support read and write orc file format

2020-03-10 Thread Dyfan Jones (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dyfan Jones updated ARROW-8056: --- Component/s: R > [R] Support read and write orc file format > ---

[jira] [Commented] (ARROW-7680) [C++][Dataset] Partition discovery is not working with windows path

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055828#comment-17055828 ] Joris Van den Bossche commented on ARROW-7680: -- Indeed, we are still getting

[jira] [Commented] (ARROW-8052) [Python] requirements-test.txt cannot be used with conda install --file

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055830#comment-17055830 ] Joris Van den Bossche commented on ARROW-8052: -- I don't think this should be

[jira] [Commented] (ARROW-8010) [Python] Fixed size list not convertible to Numpy Array / pandas Series

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055833#comment-17055833 ] Joris Van den Bossche commented on ARROW-8010: -- [~balancap] Thanks for the r

[jira] [Closed] (ARROW-8010) [Python] Fixed size list not convertible to Numpy Array / pandas Series

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche closed ARROW-8010. Resolution: Duplicate > [Python] Fixed size list not convertible to Numpy Array / p

[GitHub] [arrow-dist] Rajpratik71 opened a new pull request #31: optimization debian package manager tweaks

2020-03-10 Thread GitBox
Rajpratik71 opened a new pull request #31: optimization debian package manager tweaks URL: https://github.com/apache/arrow-dist/pull/31 By default, "apt" or "apt-get" on Ubuntu or Debian-based systems installs recommended but not suggested packages. By passing "--no-install-recommends"

[jira] [Assigned] (ARROW-5265) [Python/CI] Add integration test with kartothek

2020-03-10 Thread Uwe Korn (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Korn reassigned ARROW-5265: --- Assignee: Uwe Korn > [Python/CI] Add integration test with kartothek > -

[jira] [Updated] (ARROW-5265) [Python/CI] Add integration test with kartothek

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5265: -- Labels: parquet pull-request-available (was: parquet) > [Python/CI] Add integration test with

[jira] [Updated] (ARROW-3154) [Python][C++] Document how to write _metadata, _common_metadata files with Parquet datasets

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-3154: - Component/s: C++ - Dataset > [Python][C++] Document how to write _metadata, _comm

[jira] [Updated] (ARROW-2728) [Python][C++][Dataset] Support partitioned Parquet datasets using glob-style file paths

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-2728: - Component/s: C++ - Dataset > [Python][C++][Dataset] Support partitioned Parquet d

[jira] [Created] (ARROW-8058) [C++][Python][Dataset] Provide an option to skip validation in FileSystemDatasetFactoryOptions

2020-03-10 Thread Ben Kietzman (Jira)
Ben Kietzman created ARROW-8058: --- Summary: [C++][Python][Dataset] Provide an option to skip validation in FileSystemDatasetFactoryOptions Key: ARROW-8058 URL: https://issues.apache.org/jira/browse/ARROW-8058

[jira] [Assigned] (ARROW-8058) [C++][Python][Dataset] Provide an option to skip validation in FileSystemDatasetFactoryOptions

2020-03-10 Thread Ben Kietzman (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Kietzman reassigned ARROW-8058: --- Assignee: (was: Ben Kietzman) > [C++][Python][Dataset] Provide an option to skip validat

[jira] [Updated] (ARROW-8058) [C++][Python][Dataset] Provide an option to toggle validation and schema inference in FileSystemDatasetFactoryOptions

2020-03-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8058: -- Summary: [C++][Python][Dataset] Provide an option to toggle validation and sche

[jira] [Commented] (ARROW-7830) [C++] Parquet library version doesn't change with releases

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055974#comment-17055974 ] Wes McKinney commented on ARROW-7830: - > So wouldn't it be a reasonable way of lookin

[jira] [Resolved] (ARROW-7956) [Python] Memory leak in pyarrow functions .ipc.serialize_pandas/deserialize_pandas

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-7956. - Resolution: Fixed Yes, resolved as part of the patch for ARROW-4120 > [Python] Memory leak in py

[jira] [Commented] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055979#comment-17055979 ] Wes McKinney commented on ARROW-8057: - Yes, this was added in https://github.com/apa

[jira] [Closed] (ARROW-8052) [Python] requirements-test.txt cannot be used with conda install --file

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney closed ARROW-8052. --- Resolution: Not A Problem Alright, closing then > [Python] requirements-test.txt cannot be used with

[jira] [Assigned] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-8057: --- Assignee: Wes McKinney > [C++] Schema equality not roundtrip safe through Parquet >

[jira] [Updated] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8057: -- Labels: pull-request-available (was: ) > [C++] Schema equality not roundtrip safe through Parq

[jira] [Commented] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056005#comment-17056005 ] Wes McKinney commented on ARROW-8057: - I went with changing the default to {{False}},

[jira] [Updated] (ARROW-7963) [C++][Python][Dataset] Expose listing fragments

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-7963: -- Labels: pull-request-available (was: ) > [C++][Python][Dataset] Expose listing fragments > ---

[jira] [Commented] (ARROW-8056) [R] Support read and write orc file format

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056030#comment-17056030 ] Neal Richardson commented on ARROW-8056: Codewise, it probably wouldn't be too ba

[jira] [Commented] (ARROW-8056) [R] Support read and write orc file format

2020-03-10 Thread Dyfan Jones (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056107#comment-17056107 ] Dyfan Jones commented on ARROW-8056: To my knowledge R doesn't have any maintained pa

[jira] [Created] (ARROW-8059) [Python] Make FileSystem objects serializable

2020-03-10 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8059: Summary: [Python] Make FileSystem objects serializable Key: ARROW-8059 URL: https://issues.apache.org/jira/browse/ARROW-8059 Project: Apache Arrow

[jira] [Updated] (ARROW-8059) [Python] Make FileSystem objects serializable

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8059: - Fix Version/s: 0.17.0 > [Python] Make FileSystem objects serializable > -

[jira] [Created] (ARROW-8060) [Python] Make dataset Expression objects serializable

2020-03-10 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8060: Summary: [Python] Make dataset Expression objects serializable Key: ARROW-8060 URL: https://issues.apache.org/jira/browse/ARROW-8060 Project: Apache Ar

[jira] [Updated] (ARROW-8060) [Python] Make dataset Expression objects serializable

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8060: - Fix Version/s: 0.17.0 > [Python] Make dataset Expression objects serializable > -

[jira] [Commented] (ARROW-8056) [R] Support read and write orc file format

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056194#comment-17056194 ] Neal Richardson commented on ARROW-8056: Sounds like a reasonable objective. Sin

[jira] [Created] (ARROW-8061) [C++][Dataset] Ability to specify granularity of ParquetFileFragment (support row groups)

2020-03-10 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8061: Summary: [C++][Dataset] Ability to specify granularity of ParquetFileFragment (support row groups) Key: ARROW-8061 URL: https://issues.apache.org/jira/browse/ARROW

[jira] [Commented] (ARROW-8061) [C++][Dataset] Ability to specify granularity of ParquetFileFragment (support row groups)

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056201#comment-17056201 ] Joris Van den Bossche commented on ARROW-8061: -- Example usecase for this: fo

[jira] [Commented] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Otávio Vasques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056207#comment-17056207 ] Otávio Vasques commented on ARROW-7997: --- Interested! I will work on that. > [Pytho

[jira] [Commented] (ARROW-8061) [C++][Dataset] Ability to specify granularity of ParquetFileFragment (support row groups)

2020-03-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056211#comment-17056211 ] Francois Saint-Jacques commented on ARROW-8061: --- Yes, this is possible, a P

[jira] [Created] (ARROW-8062) [C++][Dataset] Parquet Dataset factory from a _metadata/_common_metadata file

2020-03-10 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8062: Summary: [C++][Dataset] Parquet Dataset factory from a _metadata/_common_metadata file Key: ARROW-8062 URL: https://issues.apache.org/jira/browse/ARROW-8062

[jira] [Commented] (ARROW-8060) [Python] Make dataset Expression objects serializable

2020-03-10 Thread Ben Kietzman (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056221#comment-17056221 ] Ben Kietzman commented on ARROW-8060: - this should probably wait for ARROW-7878 (and

[jira] [Commented] (ARROW-8059) [Python] Make FileSystem objects serializable

2020-03-10 Thread Ben Kietzman (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056228#comment-17056228 ] Ben Kietzman commented on ARROW-8059: - what will be the result of trying to serialize

[jira] [Created] (ARROW-8063) [Python] Add user guide documentation for Datasets API

2020-03-10 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8063: Summary: [Python] Add user guide documentation for Datasets API Key: ARROW-8063 URL: https://issues.apache.org/jira/browse/ARROW-8063 Project: Apache A

[jira] [Commented] (ARROW-8047) [Python][Documentation] Document migration from ParquetDataset to pyarrow.datasets

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056235#comment-17056235 ] Joris Van den Bossche commented on ARROW-8047: -- I also created ARROW-8063 fo

[jira] [Commented] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056241#comment-17056241 ] Joris Van den Bossche commented on ARROW-7997: -- Actually, there is just toda

[jira] [Commented] (ARROW-8061) [C++][Dataset] Ability to specify granularity of ParquetFileFragment (support row groups)

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056248#comment-17056248 ] Joris Van den Bossche commented on ARROW-8061: -- > Note that parallelism of R

[jira] [Commented] (ARROW-8059) [Python] Make FileSystem objects serializable

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056251#comment-17056251 ] Joris Van den Bossche commented on ARROW-8059: -- Specifically for dask's usec

[jira] [Commented] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056260#comment-17056260 ] Wes McKinney commented on ARROW-7997: - Yes, we appear to be changing the default to F

[jira] [Created] (ARROW-8064) [Dev] Implement Comment bot via Github actions

2020-03-10 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8064: -- Summary: [Dev] Implement Comment bot via Github actions Key: ARROW-8064 URL: https://issues.apache.org/jira/browse/ARROW-8064 Project: Apache Arrow Issue

[jira] [Commented] (ARROW-8039) [C++][Python][Dataset] Assemble a minimal ParquetDataset shim

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056284#comment-17056284 ] Joris Van den Bossche commented on ARROW-8039: -- > We might focus this by say

[jira] [Updated] (ARROW-8064) [Dev] Implement Comment bot via Github actions

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8064: -- Labels: pull-request-available (was: ) > [Dev] Implement Comment bot via Github actions >

[jira] [Created] (ARROW-8065) [C++][Dataset] Untangle Dataset, Fragment and ScanOptions

2020-03-10 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8065: - Summary: [C++][Dataset] Untangle Dataset, Fragment and ScanOptions Key: ARROW-8065 URL: https://issues.apache.org/jira/browse/ARROW-8065 Project: Apa

[jira] [Updated] (ARROW-8065) [C++][Dataset] Untangle Dataset, Fragment and ScanOptions

2020-03-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8065: -- Component/s: C++ - Dataset > [C++][Dataset] Untangle Dataset, Fragment and Scan

[jira] [Commented] (ARROW-8039) [C++][Python][Dataset] Assemble a minimal ParquetDataset shim

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056289#comment-17056289 ] Neal Richardson commented on ARROW-8039: Right, my thought was that we'd solve th

[jira] [Commented] (ARROW-8065) [C++][Dataset] Untangle Dataset, Fragment and ScanOptions

2020-03-10 Thread Ben Kietzman (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056295#comment-17056295 ] Ben Kietzman commented on ARROW-8065: - https://github.com/apache/arrow/pull/6570#issu

[jira] [Created] (ARROW-8066) PyArrow discards timezones

2020-03-10 Thread Markovtsev Vadim (Jira)
Markovtsev Vadim created ARROW-8066: --- Summary: PyArrow discards timezones Key: ARROW-8066 URL: https://issues.apache.org/jira/browse/ARROW-8066 Project: Apache Arrow Issue Type: Bug

[jira] [Updated] (ARROW-8065) [C++][Dataset] Untangle Dataset, Fragment and ScanOptions

2020-03-10 Thread Ben Kietzman (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Kietzman updated ARROW-8065: Description: Currently: a fragment is a product of a scan; it is a lazy collection of scan tasks c

[jira] [Commented] (ARROW-8065) [C++][Dataset] Untangle Dataset, Fragment and ScanOptions

2020-03-10 Thread Ben Kietzman (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056322#comment-17056322 ] Ben Kietzman commented on ARROW-8065: - If a {{Fragment}}'s schema is pulled from its

[jira] [Commented] (ARROW-8066) PyArrow discards timezones

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056323#comment-17056323 ] Wes McKinney commented on ARROW-8066: - Can you copy the details to this issue? > PyA

[jira] [Commented] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Otávio Vasques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056329#comment-17056329 ] Otávio Vasques commented on ARROW-7997: --- Ok! > [Python] Schema equals method with

[jira] [Updated] (ARROW-8049) [C++] Upgrade bundled Thrift version to 0.13.0

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8049: -- Labels: pull-request-available (was: ) > [C++] Upgrade bundled Thrift version to 0.13.0 >

[jira] [Assigned] (ARROW-8049) [C++] Upgrade bundled Thrift version to 0.13.0

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-8049: -- Assignee: Neal Richardson > [C++] Upgrade bundled Thrift version to 0.13.0 > -

[jira] [Updated] (ARROW-8066) PyArrow discards timezones

2020-03-10 Thread Markovtsev Vadim (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markovtsev Vadim updated ARROW-8066: Description: The original description is at [https://github.com/pandas-dev/pandas/issues/3

[jira] [Updated] (ARROW-8066) PyArrow discards timezones

2020-03-10 Thread Markovtsev Vadim (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markovtsev Vadim updated ARROW-8066: Description: The original description is at [https://github.com/pandas-dev/pandas/issues/3

[jira] [Updated] (ARROW-8066) PyArrow discards timezones

2020-03-10 Thread Markovtsev Vadim (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markovtsev Vadim updated ARROW-8066: Description: The description is at [https://github.com/pandas-dev/pandas/issues/32587] ###

[jira] [Updated] (ARROW-8066) PyArrow discards timezones

2020-03-10 Thread Markovtsev Vadim (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markovtsev Vadim updated ARROW-8066: Description: The original description is at [https://github.com/pandas-dev/pandas/issues/3

[jira] [Commented] (ARROW-8056) [R] Support read and write orc file format

2020-03-10 Thread Dyfan Jones (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056335#comment-17056335 ] Dyfan Jones commented on ARROW-8056: Thanks for checking that out for me. AWS actual

[jira] [Resolved] (ARROW-7530) [Developer] Do not include list of commits from PR in squashed summary message

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-7530. Resolution: Fixed Issue resolved by pull request 6565 [https://github.com/apache/arrow/pull

[jira] [Updated] (ARROW-8066) PyArrow discards timezones

2020-03-10 Thread Markovtsev Vadim (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markovtsev Vadim updated ARROW-8066: Description: The original description is at  [https://github.com/pandas-dev/pandas/issues/3

[jira] [Commented] (ARROW-8066) PyArrow discards timezones

2020-03-10 Thread Markovtsev Vadim (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056337#comment-17056337 ] Markovtsev Vadim commented on ARROW-8066: - [~wesm] Done. +100 to my pain (y) > P

[jira] [Updated] (ARROW-5279) [C++] Support reading delta dictionaries in IPC streams

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-5279: --- Fix Version/s: (was: 0.17.0) 1.0.0 > [C++] Support reading delta dicti

[jira] [Updated] (ARROW-6883) [C++] Support sending delta DictionaryBatch or replacement DictionaryBatch in IPC stream writer class

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-6883: --- Fix Version/s: (was: 0.17.0) 1.0.0 > [C++] Support sending delta Dicti

[jira] [Updated] (ARROW-7779) [Format] Enable integration tests for dictionaries-within-dictionaries

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-7779: --- Component/s: Integration Fix Version/s: (was: 0.17.0) 1.0.0 > [F

[jira] [Updated] (ARROW-7778) [C++] Support nested dictionaries in JSON integration format

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-7778: --- Fix Version/s: (was: 0.17.0) 1.0.0 > [C++] Support nested dictionaries

[jira] [Resolved] (ARROW-7902) [Integration] Unskip nested dictionary integration tests

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-7902. Resolution: Duplicate > [Integration] Unskip nested dictionary integration tests >

[jira] [Assigned] (ARROW-7902) [Integration] Unskip nested dictionary integration tests

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-7902: -- Assignee: (was: Ben Kietzman) > [Integration] Unskip nested dictionary integration

[jira] [Commented] (ARROW-8056) [R] Support read and write orc file format

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056347#comment-17056347 ] Neal Richardson commented on ARROW-8056: Sounds good. For the future, we can leav

[jira] [Updated] (ARROW-8066) [Python] Specify behavior for converting tz-aware datetime.datetime objects

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-8066: Summary: [Python] Specify behavior for converting tz-aware datetime.datetime objects (was: PyArrow

[jira] [Resolved] (ARROW-7963) [C++][Python][Dataset] Expose listing fragments

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-7963. Fix Version/s: 0.17.0 Resolution: Fixed Issue resolved by pull request 6570 [https:/

[jira] [Commented] (ARROW-8066) [Python] Specify behavior for converting tz-aware datetime.datetime objects

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056405#comment-17056405 ] Wes McKinney commented on ARROW-8066: - Thanks. I believe we don't have any handling

[jira] [Commented] (ARROW-8015) [Python] Build 0.16.0 wheel install for Windows + Python 3.5 and publish to PyPI

2020-03-10 Thread Lucas Pickup (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056418#comment-17056418 ] Lucas Pickup commented on ARROW-8015: - [~wesm] Did you make any progress on this? Not

[jira] [Commented] (ARROW-8015) [Python] Build 0.16.0 wheel install for Windows + Python 3.5 and publish to PyPI

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056422#comment-17056422 ] Wes McKinney commented on ARROW-8015: - Sorry, I'm sick right now (just a cold, not co
