[jira] [Commented] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060782#comment-17060782 ] Joris Van den Bossche commented on ARROW-7997: -- The PR is merged in the mean

[jira] [Commented] (ARROW-8131) [Python] Add dynamic attributes to PyArrow ExtensionArray

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060787#comment-17060787 ] Joris Van den Bossche commented on ARROW-8131: -- We could also try to make it

[jira] [Assigned] (ARROW-7996) [Python] Error serializing empty pandas DataFrame with pyarrow

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-7996: Assignee: Joris Van den Bossche > [Python] Error serializing empty pandas

[jira] [Assigned] (ARROW-7996) [Python] Error serializing empty pandas DataFrame with pyarrow

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-7996: Assignee: (was: Joris Van den Bossche) > [Python] Error serializing em

[jira] [Commented] (ARROW-8131) [Python] Add dynamic attributes to PyArrow ExtensionArray

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060805#comment-17060805 ] Joris Van den Bossche commented on ARROW-8131: -- Personally, I think ARROW-61

[jira] [Updated] (ARROW-8105) [Python] pyarrow.array segfaults when passed masked array with shrunken mask

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8105: - Component/s: Python > [Python] pyarrow.array segfaults when passed masked array w

[jira] [Resolved] (ARROW-8105) [Python] pyarrow.array segfaults when passed masked array with shrunken mask

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche resolved ARROW-8105. -- Resolution: Fixed Issue resolved by pull request 6612 [https://github.com/apach

[jira] [Created] (ARROW-8136) [C++][Python] Creating dataset from relative path no longer working

2020-03-17 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8136: Summary: [C++][Python] Creating dataset from relative path no longer working Key: ARROW-8136 URL: https://issues.apache.org/jira/browse/ARROW-8136 Pro

[jira] [Comment Edited] (ARROW-8136) [C++][Python] Creating dataset from relative path no longer working

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060930#comment-17060930 ] Joris Van den Bossche edited comment on ARROW-8136 at 3/17/20, 1:59 PM: ---

[jira] [Commented] (ARROW-8136) [C++][Python] Creating dataset from relative path no longer working

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060930#comment-17060930 ] Joris Van den Bossche commented on ARROW-8136: -- Still, users do this _all th

[jira] [Commented] (ARROW-8136) [C++][Python] Creating dataset from relative path no longer working

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060937#comment-17060937 ] Joris Van den Bossche commented on ARROW-8136: -- We also actually support rel

[jira] [Commented] (ARROW-8135) Problem importing PyArrow on a cluster

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060962#comment-17060962 ] Joris Van den Bossche commented on ARROW-8135: -- Can you give more details on

[jira] [Updated] (ARROW-8135) [Python] Problem importing PyArrow on a cluster

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8135: - Summary: [Python] Problem importing PyArrow on a cluster (was: Problem importing

[jira] [Commented] (ARROW-5666) [Python] Underscores in partition (string) values are dropped when reading dataset

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060975#comment-17060975 ] Joris Van den Bossche commented on ARROW-5666: -- This now works with the new

[jira] [Commented] (ARROW-5666) [Python] Underscores in partition (string) values are dropped when reading dataset

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060980#comment-17060980 ] Joris Van den Bossche commented on ARROW-5666: -- With the new API, you can al

[jira] [Commented] (ARROW-5310) [Python] better error message on creating ParquetDataset from empty directory

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060984#comment-17060984 ] Joris Van den Bossche commented on ARROW-5310: -- This works now with the new

[jira] [Commented] (ARROW-3861) [Python] ParquetDataset().read columns argument always returns partition column

2020-03-17 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061025#comment-17061025 ] Joris Van den Bossche commented on ARROW-3861: -- This works now correctly wit

[jira] [Updated] (ARROW-8142) [Python/C++] Casting empty table from after parquet roundtrip causes critical failure

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8142: - Description: When casting a schema of an empty table from dict encoded to non-dic

[jira] [Updated] (ARROW-8142) [Python/C++] Casting empty table from after parquet roundtrip causes critical failure

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8142: - Fix Version/s: 0.17.0 > [Python/C++] Casting empty table from after parquet round

[jira] [Updated] (ARROW-8142) [Python/C++] Casting empty table from after parquet roundtrip causes critical failure

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8142: - Component/s: C++ > [Python/C++] Casting empty table from after parquet roundtrip

[jira] [Assigned] (ARROW-8122) [Python] Empty numpy arrays with shape cannot be deserialized

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-8122: Assignee: Wenjun Si (was: Joris Van den Bossche) > [Python] Empty numpy a

[jira] [Reopened] (ARROW-7907) [Python] Conversion to pandas of empty table with timestamp type aborts

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reopened ARROW-7907: -- Assignee: (was: Wes McKinney) Actually, it seems that the linked commit o

[jira] [Commented] (ARROW-8142) [Python/C++] Casting empty table from after parquet roundtrip causes critical failure

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061622#comment-17061622 ] Joris Van den Bossche commented on ARROW-8142: -- [~fjetter] thanks for the re

[jira] [Commented] (ARROW-7907) [Python] Conversion to pandas of empty table with timestamp type aborts

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061623#comment-17061623 ] Joris Van den Bossche commented on ARROW-7907: -- So a small reproducer that c

[jira] [Commented] (ARROW-7854) [C++][Dataset] Option to memory map when reading IPC format

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061882#comment-17061882 ] Joris Van den Bossche commented on ARROW-7854: -- [~fsaintjacques] this actual

[jira] [Commented] (ARROW-5572) [Python] raise error message when passing invalid filter in parquet reading

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061915#comment-17061915 ] Joris Van den Bossche commented on ARROW-5572: -- This works now correctly wit

[jira] [Assigned] (ARROW-8088) [C++][Dataset] Partition columns with specified dictionary type result in all nulls

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-8088: Assignee: Ben Kietzman (was: Joris Van den Bossche) > [C++][Dataset] Part

[jira] [Assigned] (ARROW-8088) [C++][Dataset] Partition columns with specified dictionary type result in all nulls

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-8088: Assignee: Joris Van den Bossche (was: Ben Kietzman) > [C++][Dataset] Part

[jira] [Updated] (ARROW-8088) [C++][Dataset] Partition columns with specified dictionary type result in all nulls

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8088: - Fix Version/s: 0.17.0 > [C++][Dataset] Partition columns with specified dictionar

[jira] [Assigned] (ARROW-8088) [C++][Dataset] Partition columns with specified dictionary type result in all nulls

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-8088: Assignee: Ben Kietzman (was: Joris Van den Bossche) > [C++][Dataset] Part

[jira] [Assigned] (ARROW-8088) [C++][Dataset] Partition columns with specified dictionary type result in all nulls

2020-03-18 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-8088: Assignee: Joris Van den Bossche > [C++][Dataset] Partition columns with sp

[jira] [Updated] (ARROW-8158) [Java] Getting length of data buffer and base variable width vector

2020-03-19 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8158: - Summary: [Java] Getting length of data buffer and base variable width vector (wa

[jira] [Commented] (ARROW-8142) [Python/C++] Casting empty table from after parquet roundtrip causes critical failure

2020-03-19 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062709#comment-17062709 ] Joris Van den Bossche commented on ARROW-8142: -- It's also not specific to di

[jira] [Updated] (ARROW-8142) [C++] Casting a chunked array with 0 chunks critical failure

2020-03-19 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8142: - Summary: [C++] Casting a chunked array with 0 chunks critical failure (was: [Pyt

[jira] [Resolved] (ARROW-8159) [Python] pyarrow.Schema.from_pandas doesn't support ExtensionDtype

2020-03-19 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche resolved ARROW-8159. -- Resolution: Fixed Issue resolved by pull request 6665 [https://github.com/apach

[jira] [Assigned] (ARROW-8142) [C++] Casting a chunked array with 0 chunks critical failure

2020-03-20 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-8142: Assignee: Joris Van den Bossche (was: Ben Kietzman) > [C++] Casting a chu

[jira] [Assigned] (ARROW-8142) [C++] Casting a chunked array with 0 chunks critical failure

2020-03-20 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-8142: Assignee: Ben Kietzman (was: Joris Van den Bossche) > [C++] Casting a chu

[jira] [Created] (ARROW-8186) [Python] Dataset expression != returns bool instead of expression for invalid value

2020-03-23 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8186: Summary: [Python] Dataset expression != returns bool instead of expression for invalid value Key: ARROW-8186 URL: https://issues.apache.org/jira/browse/ARROW-8186

[jira] [Updated] (ARROW-8186) [Python] Dataset expression != returns bool instead of expression for invalid value

2020-03-23 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8186: - Description: It's a bit of a strange case, but e.g. when doing {{!= \{3\}}} you get a

[jira] [Commented] (ARROW-8039) [C++][Python][Dataset] Assemble a minimal ParquetDataset shim

2020-03-23 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064894#comment-17064894 ] Joris Van den Bossche commented on ARROW-8039: -- I expanded my existing PR fo

[jira] [Assigned] (ARROW-8039) [C++][Python][Dataset] Assemble a minimal ParquetDataset shim

2020-03-23 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-8039: Assignee: Joris Van den Bossche (was: Ben Kietzman) > [C++][Python][Datas

[jira] [Commented] (ARROW-8173) [C++] Validate ChunkedArray()'s arguments

2020-03-23 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17065033#comment-17065033 ] Joris Van den Bossche commented on ARROW-8173: -- There are {{ChunkedArray::Va

[jira] [Assigned] (ARROW-6872) [C++][Python] Empty table with dictionary-columns raises ArrowNotImplementedError

2020-03-23 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-6872: Assignee: Joris Van den Bossche > [C++][Python] Empty table with dictionar

[jira] [Created] (ARROW-8196) [Python] Empty table creation from schema with nested dictionary type

2020-03-24 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8196: Summary: [Python] Empty table creation from schema with nested dictionary type Key: ARROW-8196 URL: https://issues.apache.org/jira/browse/ARROW-8196 P

[jira] [Assigned] (ARROW-8186) [Python] Dataset expression != returns bool instead of expression for invalid value

2020-03-24 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-8186: Assignee: Joris Van den Bossche > [Python] Dataset expression != returns b

[jira] [Commented] (ARROW-8189) [Python] Python bindings for C++ Builder classes

2020-03-24 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17065761#comment-17065761 ] Joris Van den Bossche commented on ARROW-8189: -- This would be similar to the

[jira] [Updated] (ARROW-8186) [Python] Dataset expression != returns bool instead of expression for invalid value

2020-03-24 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8186: - Fix Version/s: 0.17.0 > [Python] Dataset expression != returns bool instead of ex

[jira] [Commented] (ARROW-3391) [Python] Support \0 characters in binary Parquet predicate values

2020-03-24 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17065864#comment-17065864 ] Joris Van den Bossche commented on ARROW-3391: -- Thanks for the clarification

[jira] [Commented] (ARROW-2647) [C++/Python] Provide assertion helpers in the style of pandas.testing.assert_frame_equal

2020-03-24 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17065957#comment-17065957 ] Joris Van den Bossche commented on ARROW-2647: -- I also regularly run into th

[jira] [Commented] (ARROW-2647) [C++/Python] Provide assertion helpers in the style of pandas.testing.assert_frame_equal

2020-03-24 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-2647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17065959#comment-17065959 ] Joris Van den Bossche commented on ARROW-2647: -- For a failing test I am havi

[jira] [Assigned] (ARROW-5790) [Python] Passing zero-dim numpy array to pa.array causes segfault

2019-07-09 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-5790: Assignee: Joris Van den Bossche > [Python] Passing zero-dim numpy array to

[jira] [Updated] (ARROW-5889) [Python][C++] Parquet backwards compat for timestamps without timezone broken

2019-07-09 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5889: - Fix Version/s: 0.14.1 > [Python][C++] Parquet backwards compat for timestamps wit

[jira] [Commented] (ARROW-5889) [Python][C++] Parquet backwards compat for timestamps without timezone broken

2019-07-09 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881492#comment-16881492 ] Joris Van den Bossche commented on ARROW-5889: -- [~fjetter] thanks for the te

[jira] [Commented] (ARROW-5889) [Python][C++] Parquet backwards compat for timestamps without timezone broken

2019-07-09 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881495#comment-16881495 ] Joris Van den Bossche commented on ARROW-5889: -- This is very much related to

[jira] [Updated] (ARROW-5450) [Python] TimestampArray.to_pylist() fails with OverflowError: Python int too large to convert to C long

2019-07-09 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5450: - Fix Version/s: 1.0.0 > [Python] TimestampArray.to_pylist() fails with OverflowErr

[jira] [Commented] (ARROW-5888) [Python][C++] Parquet write metadata not roundtrip safe for timezone timestamps

2019-07-09 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881503#comment-16881503 ] Joris Van den Bossche commented on ARROW-5888: -- The Parquet file format has

[jira] [Assigned] (ARROW-5873) [Python][C++] Segmentation fault when comparing schema with None

2019-07-09 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-5873: Assignee: Joris Van den Bossche > [Python][C++] Segmentation fault when co

[jira] [Commented] (ARROW-5895) [Python] New version stores timestamps as epoch ms instead of ISO timestamp string

2019-07-09 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881549#comment-16881549 ] Joris Van den Bossche commented on ARROW-5895: -- [~johwilso1] Thanks for the

[jira] [Commented] (ARROW-5610) [Python] Define extension type API in Python to "receive" or "send" a foreign extension type

2019-07-09 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881581#comment-16881581 ] Joris Van den Bossche commented on ARROW-5610: -- I am trying to wrap my head

[jira] [Commented] (ARROW-5895) [Python] New version stores timestamps as epoch ms instead of ISO timestamp string

2019-07-09 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881585#comment-16881585 ] Joris Van den Bossche commented on ARROW-5895: -- So what changed in 0.14.0 co

[jira] [Comment Edited] (ARROW-5895) [Python] New version stores timestamps as epoch ms instead of ISO timestamp string

2019-07-09 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881585#comment-16881585 ] Joris Van den Bossche edited comment on ARROW-5895 at 7/9/19 10:07 PM:

[jira] [Commented] (ARROW-5610) [Python] Define extension type API in Python to "receive" or "send" a foreign extension type

2019-07-10 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882035#comment-16882035 ] Joris Van den Bossche commented on ARROW-5610: -- > You can already implement

[jira] [Comment Edited] (ARROW-5610) [Python] Define extension type API in Python to "receive" or "send" a foreign extension type

2019-07-10 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882035#comment-16882035 ] Joris Van den Bossche edited comment on ARROW-5610 at 7/10/19 1:08 PM:

[jira] [Commented] (ARROW-5610) [Python] Define extension type API in Python to "receive" or "send" a foreign extension type

2019-07-10 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882059#comment-16882059 ] Joris Van den Bossche commented on ARROW-5610: -- > I see. My assumption was t

[jira] [Commented] (ARROW-5889) [Python][C++] Parquet backwards compat for timestamps without timezone broken

2019-07-10 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882350#comment-16882350 ] Joris Van den Bossche commented on ARROW-5889: -- {quote}On writing a schema,

[jira] [Assigned] (ARROW-5864) [Python] simplify cython wrapping of Result

2019-07-10 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-5864: Assignee: Joris Van den Bossche > [Python] simplify cython wrapping of Res

[jira] [Commented] (ARROW-5889) [Python][C++] Parquet backwards compat for timestamps without timezone broken

2019-07-10 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882428#comment-16882428 ] Joris Van den Bossche commented on ARROW-5889: -- For my education: what is an

[jira] [Created] (ARROW-5905) [Python] support conversion to decimal type from floats?

2019-07-10 Thread Joris Van den Bossche (JIRA)
Joris Van den Bossche created ARROW-5905: Summary: [Python] support conversion to decimal type from floats? Key: ARROW-5905 URL: https://issues.apache.org/jira/browse/ARROW-5905 Project: Apache

[jira] [Updated] (ARROW-5910) [Python] read_tensor() fails on non-seekable streams

2019-07-11 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5910: - Summary: [Python] read_tensor() fails on non-seekable streams (was: read_tensor(

[jira] [Updated] (ARROW-5907) base64 support of bytes-like

2019-07-11 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5907: - Description: Currently pyarrow could not be encoded by base64: {code} t = numpy.

[jira] [Commented] (ARROW-5907) base64 support of bytes-like

2019-07-11 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882942#comment-16882942 ] Joris Van den Bossche commented on ARROW-5907: -- (updated the issue with the

[jira] [Updated] (ARROW-5907) [Python] base64 support of bytes-like

2019-07-11 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5907: - Summary: [Python] base64 support of bytes-like (was: base64 support of bytes-lik

[jira] [Updated] (ARROW-5907) [Python] base64 support of bytes-like

2019-07-11 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5907: - Fix Version/s: (was: 0.14.0) > [Python] base64 support of bytes-like > --

[jira] [Created] (ARROW-5912) [Python] conversion from datetime objects with mixed timezones should normalize to UTC

2019-07-11 Thread Joris Van den Bossche (JIRA)
Joris Van den Bossche created ARROW-5912: Summary: [Python] conversion from datetime objects with mixed timezones should normalize to UTC Key: ARROW-5912 URL: https://issues.apache.org/jira/browse/ARROW-59

[jira] [Updated] (ARROW-5912) [Python] conversion from datetime objects with mixed timezones should normalize to UTC

2019-07-11 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5912: - Fix Version/s: 1.0.0 > [Python] conversion from datetime objects with mixed timez

[jira] [Created] (ARROW-5915) [C++] [Python] Set up testing for backwards compatibility of the parquet reader

2019-07-11 Thread Joris Van den Bossche (JIRA)
Joris Van den Bossche created ARROW-5915: Summary: [C++] [Python] Set up testing for backwards compatibility of the parquet reader Key: ARROW-5915 URL: https://issues.apache.org/jira/browse/ARROW-5915

[jira] [Updated] (ARROW-5915) [C++] [Python] Set up testing for backwards compatibility of the parquet reader

2019-07-11 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5915: - Description: Given the recent parquet compat problems, we should have better test

[jira] [Closed] (ARROW-4032) [Python] New pyarrow.Table functions: from_pydict(), from_pylist() and to_pylist()

2019-07-31 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche closed ARROW-4032. Resolution: Duplicate > [Python] New pyarrow.Table functions: from_pydict(), from_p

[jira] [Commented] (ARROW-4032) [Python] New pyarrow.Table functions: from_pydict(), from_pylist() and to_pylist()

2019-07-31 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896867#comment-16896867 ] Joris Van den Bossche commented on ARROW-4032: -- Closing this issue in favor

[jira] [Commented] (ARROW-6001) Add from_pydict(), from_pylist() and to_pylist() to pyarrow.Table + improve pandas.to_dict()

2019-07-31 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896870#comment-16896870 ] Joris Van den Bossche commented on ARROW-6001: -- See also ARROW-4032 for simi

[jira] [Updated] (ARROW-6001) [Python] Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

2019-07-31 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-6001: - Component/s: Python Summary: [Python] Add from_pylist() and to_pylist() to

[jira] [Commented] (ARROW-6001) [Python] Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records

2019-07-31 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896888#comment-16896888 ] Joris Van den Bossche commented on ARROW-6001: -- I think the functionality to

[jira] [Commented] (ARROW-5952) [Python] Segfault when reading empty table with category as pandas dataframe

2019-07-31 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896980#comment-16896980 ] Joris Van den Bossche commented on ARROW-5952: -- [~nugend] Thanks for the rep

[jira] [Commented] (ARROW-6081) FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmptb2ao6te_job_6e0a8ca1.parquet'

2019-07-31 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897125#comment-16897125 ] Joris Van den Bossche commented on ARROW-6081: -- The final error comes from b

[jira] [Created] (ARROW-6082) [Python] create pa.dictionary() type with non-integer indices type crashes

2019-07-31 Thread Joris Van den Bossche (JIRA)
Joris Van den Bossche created ARROW-6082: Summary: [Python] create pa.dictionary() type with non-integer indices type crashes Key: ARROW-6082 URL: https://issues.apache.org/jira/browse/ARROW-6082

[jira] [Commented] (ARROW-6004) [C++] CSV reader ignore_empty_lines option doesn't handle empty lines

2019-07-31 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897257#comment-16897257 ] Joris Van den Bossche commented on ARROW-6004: -- [~fsaintjacques] skipping em

[jira] [Assigned] (ARROW-6082) [Python] create pa.dictionary() type with non-integer indices type crashes

2019-08-01 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-6082: Assignee: Joris Van den Bossche > [Python] create pa.dictionary() type wit

[jira] [Created] (ARROW-6115) [Python] support LargeList, LargeString, LargeBinary in conversion to pandas

2019-08-02 Thread Joris Van den Bossche (JIRA)
Joris Van den Bossche created ARROW-6115: Summary: [Python] support LargeList, LargeString, LargeBinary in conversion to pandas Key: ARROW-6115 URL: https://issues.apache.org/jira/browse/ARROW-6115

[jira] [Updated] (ARROW-6114) Datatypes are not preserved when a pandas dataframe partitioned and saved as parquet file using pyarrow

2019-08-02 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-6114: - Labels: parquet (was: ) > Datatypes are not preserved when a pandas dataframe pa

[jira] [Commented] (ARROW-5480) [Python] Pandas categorical type doesn't survive a round-trip through parquet

2019-08-02 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898725#comment-16898725 ] Joris Van den Bossche commented on ARROW-5480: -- {quote}One slightly higher l

[jira] [Commented] (ARROW-6114) Datatypes are not preserved when a pandas dataframe partitioned and saved as parquet file using pyarrow

2019-08-02 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898737#comment-16898737 ] Joris Van den Bossche commented on ARROW-6114: -- [~bnriiitb] thanks for openi

[jira] [Updated] (ARROW-6114) Datatypes are not preserved when a pandas dataframe partitioned and saved as parquet file using pyarrow

2019-08-02 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-6114: - Labels: dataset parquet (was: parquet) > Datatypes are not preserved when a pand

[jira] [Updated] (ARROW-6114) [Python] Datatypes are not preserved when a pandas dataframe partitioned and saved as parquet file using pyarrow

2019-08-02 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-6114: - Summary: [Python] Datatypes are not preserved when a pandas dataframe partitioned

[jira] [Updated] (ARROW-5682) [Python] from_pandas conversion casts values to string inconsistently

2019-08-02 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-5682: - Issue Type: Bug (was: Improvement) > [Python] from_pandas conversion casts value

[jira] [Commented] (ARROW-5682) [Python] from_pandas conversion casts values to string inconsistently

2019-08-02 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898787#comment-16898787 ] Joris Van den Bossche commented on ARROW-5682: -- This seems to be specific to

[jira] [Comment Edited] (ARROW-5610) [Python] Define extension type API in Python to "receive" or "send" a foreign extension type

2019-08-02 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898927#comment-16898927 ] Joris Van den Bossche edited comment on ARROW-5610 at 8/2/19 2:48 PM: -

[jira] [Commented] (ARROW-5610) [Python] Define extension type API in Python to "receive" or "send" a foreign extension type

2019-08-02 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898927#comment-16898927 ] Joris Van den Bossche commented on ARROW-5610: -- {quote}I'll try to take a pa

[jira] [Created] (ARROW-6132) [Python] ListArray.from_arrays does not check validity of input arrays

2019-08-05 Thread Joris Van den Bossche (JIRA)
Joris Van den Bossche created ARROW-6132: Summary: [Python] ListArray.from_arrays does not check validity of input arrays Key: ARROW-6132 URL: https://issues.apache.org/jira/browse/ARROW-6132

[jira] [Assigned] (ARROW-6132) [Python] ListArray.from_arrays does not check validity of input arrays

2019-08-07 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-6132: Assignee: Joris Van den Bossche > [Python] ListArray.from_arrays does not

[jira] [Commented] (ARROW-6132) [Python] ListArray.from_arrays does not check validity of input arrays

2019-08-07 Thread Joris Van den Bossche (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901945#comment-16901945 ] Joris Van den Bossche commented on ARROW-6132: -- {{DictionaryArray.from_array
