[jira] [Resolved] (ARROW-5901) [Rust] Implement PartialEq to compare array and json values

2019-07-29 Thread Neville Dipale (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neville Dipale resolved ARROW-5901. --- Resolution: Fixed Fix Version/s: 1.0.0 Issue resolved by pull request 4940

[jira] [Commented] (ARROW-6043) [Python] Array equals returns incorrectly if NaNs are in arrays

2019-07-29 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895648#comment-16895648 ] Neal Richardson commented on ARROW-6043: Right; I was under the impression that {{NaN}} was

[jira] [Commented] (ARROW-6043) [Python] Array equals returns incorrectly if NaNs are in arrays

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895642#comment-16895642 ] Wes McKinney commented on ARROW-6043: - [~npr] I think you're talking about something different. In

[jira] [Commented] (ARROW-6061) [C++] Cannot build libarrow without rapidjson

2019-07-29 Thread Sutou Kouhei (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895607#comment-16895607 ] Sutou Kouhei commented on ARROW-6061: - It's strange. Because we use bundled RapidJSON when we can't

[jira] [Updated] (ARROW-6066) [Website] Fix blog post author header

2019-07-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6066: -- Labels: pull-request-available (was: ) > [Website] Fix blog post author header >

[jira] [Updated] (ARROW-6066) [Website] Fix blog post author header

2019-07-29 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-6066: --- Component/s: Website > [Website] Fix blog post author header >

[jira] [Created] (ARROW-6066) [Website] Fix blog post author header

2019-07-29 Thread Neal Richardson (JIRA)
Neal Richardson created ARROW-6066: -- Summary: [Website] Fix blog post author header Key: ARROW-6066 URL: https://issues.apache.org/jira/browse/ARROW-6066 Project: Apache Arrow Issue Type:

[jira] [Commented] (ARROW-6043) [Python] Array equals returns incorrectly if NaNs are in arrays

2019-07-29 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895565#comment-16895565 ] Neal Richardson commented on ARROW-6043: For reference, here's how R handles this: {code:java} $

[jira] [Comment Edited] (ARROW-6043) [Python] Array equals returns incorrectly if NaNs are in arrays

2019-07-29 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895517#comment-16895517 ] Antoine Pitrou edited comment on ARROW-6043 at 7/29/19 6:37 PM: Actually,

[jira] [Commented] (ARROW-6043) [Python] Array equals returns incorrectly if NaNs are in arrays

2019-07-29 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895517#comment-16895517 ] Antoine Pitrou commented on ARROW-6043: --- Actually, if you look on the C++ side (see

[jira] [Commented] (ARROW-6043) [Python] Array equals returns incorrectly if NaNs are in arrays

2019-07-29 Thread Keith Kraus (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895514#comment-16895514 ] Keith Kraus commented on ARROW-6043: For context, the source of us finding this bug was we were using

[jira] [Updated] (ARROW-6065) [C++] Reorganize parquet/arrow/reader.cc, remove code duplication, improve readability

2019-07-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6065: -- Labels: pull-request-available (was: ) > [C++] Reorganize parquet/arrow/reader.cc, remove

[jira] [Assigned] (ARROW-6065) [C++] Reorganize parquet/arrow/reader.cc, remove code duplication, improve readability

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-6065: --- Assignee: Wes McKinney > [C++] Reorganize parquet/arrow/reader.cc, remove code duplication,

[jira] [Created] (ARROW-6065) [C++] Reorganize parquet/arrow/reader.cc, remove code duplication, improve readability

2019-07-29 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-6065: --- Summary: [C++] Reorganize parquet/arrow/reader.cc, remove code duplication, improve readability Key: ARROW-6065 URL: https://issues.apache.org/jira/browse/ARROW-6065

[jira] [Created] (ARROW-6064) [FlightRPC] [C++] Clean up IWYU

2019-07-29 Thread lidavidm (JIRA)
lidavidm created ARROW-6064: --- Summary: [FlightRPC] [C++] Clean up IWYU Key: ARROW-6064 URL: https://issues.apache.org/jira/browse/ARROW-6064 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-6063) [FlightRPC] Implement "half-closed" semantics for DoPut

2019-07-29 Thread lidavidm (JIRA)
lidavidm created ARROW-6063: --- Summary: [FlightRPC] Implement "half-closed" semantics for DoPut Key: ARROW-6063 URL: https://issues.apache.org/jira/browse/ARROW-6063 Project: Apache Arrow Issue

[jira] [Created] (ARROW-6062) [FlightRPC] Allow timeouts on all stream reads

2019-07-29 Thread lidavidm (JIRA)
lidavidm created ARROW-6062: --- Summary: [FlightRPC] Allow timeouts on all stream reads Key: ARROW-6062 URL: https://issues.apache.org/jira/browse/ARROW-6062 Project: Apache Arrow Issue Type:

[jira] [Commented] (ARROW-6043) [Python] Array equals returns incorrectly if NaNs are in arrays

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895416#comment-16895416 ] Wes McKinney commented on ARROW-6043: - This is true -- I think we might need to think about semantic

[jira] [Commented] (ARROW-6059) [Python] Regression memory issue when calling pandas.read_parquet

2019-07-29 Thread Francisco Sanchez (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895397#comment-16895397 ] Francisco Sanchez commented on ARROW-6059: -- For some reason I cannot upload a file here, it

[jira] [Commented] (ARROW-6043) [Python] Array equals returns incorrectly if NaNs are in arrays

2019-07-29 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895396#comment-16895396 ] Antoine Pitrou commented on ARROW-6043: --- How is this surprising? It's well-known (well, perhaps not

[jira] [Comment Edited] (ARROW-6060) [Python] too large memory cost using pyarrow.parquet.read_table with use_threads=True

2019-07-29 Thread Kun Liu (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895387#comment-16895387 ] Kun Liu edited comment on ARROW-6060 at 7/29/19 3:53 PM: - [~wesmckinn] I used the

[jira] [Commented] (ARROW-6060) [Python] too large memory cost using pyarrow.parquet.read_table with use_threads=True

2019-07-29 Thread Kun Liu (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895387#comment-16895387 ] Kun Liu commented on ARROW-6060: [~wesmckinn] I used the following code to generate a sample parquet. 

[jira] [Commented] (ARROW-6058) [Python][Parquet] Failure when reading Parquet file from S3

2019-07-29 Thread Siddharth (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895353#comment-16895353 ] Siddharth commented on ARROW-6058: -- hey [~wesmckinn] thanks for your prompt reply. I checked each part

[jira] [Comment Edited] (ARROW-6058) [Python][Parquet] Failure when reading Parquet file from S3

2019-07-29 Thread Siddharth (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895353#comment-16895353 ] Siddharth edited comment on ARROW-6058 at 7/29/19 3:23 PM: --- hey [~wesmckinn]

[jira] [Commented] (ARROW-6061) [C++] Cannot build libarrow without rapidjson

2019-07-29 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895343#comment-16895343 ] Antoine Pitrou commented on ARROW-6061: --- I'd rather have this done as part of this issue. There is

[jira] [Comment Edited] (ARROW-6061) [C++] Cannot build libarrow without rapidjson

2019-07-29 Thread Hatem Helal (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895325#comment-16895325 ] Hatem Helal edited comment on ARROW-6061 at 7/29/19 3:14 PM: - I believe we

[jira] [Commented] (ARROW-6061) [C++] Cannot build libarrow without rapidjson

2019-07-29 Thread Hatem Helal (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895336#comment-16895336 ] Hatem Helal commented on ARROW-6061: I've posted a PR which restores the previous behavior of JSON

[jira] [Commented] (ARROW-6060) [Python] too large memory cost using pyarrow.parquet.read_table with use_threads=True

2019-07-29 Thread Kun Liu (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895331#comment-16895331 ] Kun Liu commented on ARROW-6060: Thanks for the response, [~wesmckinn]. I am trying to generate a sample

[jira] [Updated] (ARROW-6061) [C++] Cannot build libarrow without rapidjson

2019-07-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6061: -- Labels: pull-request-available (was: ) > [C++] Cannot build libarrow without rapidjson >

[jira] [Commented] (ARROW-6061) [C++] Cannot build libarrow without rapidjson

2019-07-29 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895329#comment-16895329 ] Antoine Pitrou commented on ARROW-6061: --- If we wanted to make JSON optional, then we would need a

[jira] [Resolved] (ARROW-6054) pyarrow.serialize should respect the value of structured dtype of numpy

2019-07-29 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-6054. --- Resolution: Fixed Fix Version/s: 1.0.0 Issue resolved by pull request 4953

[jira] [Created] (ARROW-6061) [C++] Cannot build libarrow without rapidjson

2019-07-29 Thread Hatem Helal (JIRA)
Hatem Helal created ARROW-6061: -- Summary: [C++] Cannot build libarrow without rapidjson Key: ARROW-6061 URL: https://issues.apache.org/jira/browse/ARROW-6061 Project: Apache Arrow Issue Type:

[jira] [Commented] (ARROW-6061) [C++] Cannot build libarrow without rapidjson

2019-07-29 Thread Hatem Helal (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895325#comment-16895325 ] Hatem Helal commented on ARROW-6061: I believe we used to get away with not having rapidjson

[jira] [Commented] (ARROW-6057) [Python] Parquet files v2.0 created by spark can't be read by pyarrow

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895322#comment-16895322 ] Wes McKinney commented on ARROW-6057: - Yes, this is PARQUET-458. I linked the JIRA -- some unit tests

[jira] [Updated] (ARROW-6057) [Python] Parquet files v2.0 created by spark can't be read by pyarrow

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6057: Summary: [Python] Parquet files v2.0 created by spark can't be read by pyarrow (was: Parquet

[jira] [Resolved] (ARROW-6006) [C++] Empty IPC streams containing a dictionary are corrupt

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-6006. - Resolution: Fixed Issue resolved by pull request 4947

[jira] [Resolved] (ARROW-6042) [C++] Implement alternative DictionaryBuilder that always yields int32 indices

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-6042. - Resolution: Fixed Issue resolved by pull request 4956

[jira] [Updated] (ARROW-6060) [Python] too large memory cost using pyarrow.parquet.read_table with use_threads=True

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6060: Summary: [Python] too large memory cost using pyarrow.parquet.read_table with use_threads=True

[jira] [Updated] (ARROW-6059) [Python] Regression memory issue when calling pandas.read_parquet

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6059: Summary: [Python] Regression memory issue when calling pandas.read_parquet (was: Regression

[jira] [Commented] (ARROW-6059) [Python] Regression memory issue when calling pandas.read_parquet

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895309#comment-16895309 ] Wes McKinney commented on ARROW-6059: - Can you provide a sample file that reproduces the issue? >

[jira] [Commented] (ARROW-6060) too large memory cost using pyarrow.parquet.read_table with use_threads=True

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895308#comment-16895308 ] Wes McKinney commented on ARROW-6060: - Can you provide an example file that we can use to try to find

[jira] [Comment Edited] (ARROW-6057) Parquet files v2.0 created by spark can't be read by pyarrow

2019-07-29 Thread Vladyslav Shamaida (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895275#comment-16895275 ] Vladyslav Shamaida edited comment on ARROW-6057 at 7/29/19 2:20 PM:

[jira] [Comment Edited] (ARROW-6057) Parquet files v2.0 created by spark can't be read by pyarrow

2019-07-29 Thread Vladyslav Shamaida (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895275#comment-16895275 ] Vladyslav Shamaida edited comment on ARROW-6057 at 7/29/19 2:00 PM:

[jira] [Commented] (ARROW-6057) Parquet files v2.0 created by spark can't be read by pyarrow

2019-07-29 Thread Vladyslav Shamaida (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895275#comment-16895275 ] Vladyslav Shamaida commented on ARROW-6057: --- Is this bug related to 

[jira] [Resolved] (ARROW-5924) [C++][Plasma] It is not convenient to release a GPU object

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-5924. - Resolution: Fixed Issue resolved by pull request 4877

[jira] [Commented] (ARROW-6058) [Python][Parquet] Failure when reading Parquet file from S3

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895262#comment-16895262 ] Wes McKinney commented on ARROW-6058: - Seems like one of the files is corrupted. Have you tried

[jira] [Updated] (ARROW-6058) [Python][Parquet] Failure when reading Parquet file from S3

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6058: Labels: parquet (was: ) > [Python][Parquet] Failure when reading Parquet file from S3 >

[jira] [Updated] (ARROW-6058) [Python][Parquet] Failure when reading Parquet file from S3

2019-07-29 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6058: Summary: [Python][Parquet] Failure when reading Parquet file from S3 (was:

[jira] [Resolved] (ARROW-6053) [Python] RecordBatchStreamReader::Open2 cdef type signature doesn't match C++

2019-07-29 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-6053. --- Resolution: Fixed Fix Version/s: 1.0.0 Issue resolved by pull request 4957

[jira] [Commented] (ARROW-6025) [Gandiva][Test] Error handling for missing timezone in castTIMESTAMP_utf8 tests

2019-07-29 Thread Pindikura Ravindra (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16895173#comment-16895173 ] Pindikura Ravindra commented on ARROW-6025: --- [~kszucs] - is the solution that [~wesmckinn]

[jira] [Created] (ARROW-6060) too large memory cost using pyarrow.parquet.read_table with use_threads=True

2019-07-29 Thread Kun Liu (JIRA)
Kun Liu created ARROW-6060: -- Summary: too large memory cost using pyarrow.parquet.read_table with use_threads=True Key: ARROW-6060 URL: https://issues.apache.org/jira/browse/ARROW-6060 Project: Apache Arrow

[jira] [Updated] (ARROW-6034) [C++][Gandiva] Add string functions in Gandiva

2019-07-29 Thread Prudhvi Porandla (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prudhvi Porandla updated ARROW-6034: Description: Add following functions in Gandiva - substr(str, offset, len) : returns

[jira] [Assigned] (ARROW-5924) [C++][Plasma] It is not convenient to release a GPU object

2019-07-29 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-5924: - Assignee: shengjun.li > [C++][Plasma] It is not convenient to release a GPU object >

[jira] [Updated] (ARROW-6059) Regression memory issue when calling pandas.read_parquet

2019-07-29 Thread Francisco Sanchez (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francisco Sanchez updated ARROW-6059: - Description: I have a ~3MB parquet file with the following schema: {code:java} bag_stamp:

[jira] [Updated] (ARROW-6059) Regression memory issue when calling pandas.read_parquet

2019-07-29 Thread Francisco Sanchez (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francisco Sanchez updated ARROW-6059: - Description: I have a ~2MB parquet file with the following schema: {code:java} bag_stamp:

[jira] [Created] (ARROW-6059) Regression memory issue when calling pandas.read_parquet

2019-07-29 Thread Francisco Sanchez (JIRA)
Francisco Sanchez created ARROW-6059: Summary: Regression memory issue when calling pandas.read_parquet Key: ARROW-6059 URL: https://issues.apache.org/jira/browse/ARROW-6059 Project: Apache Arrow

[jira] [Updated] (ARROW-6057) Parquet files v2.0 created by spark can't be read by pyarrow

2019-07-29 Thread Vladyslav Shamaida (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladyslav Shamaida updated ARROW-6057: -- Affects Version/s: 0.14.1 > Parquet files v2.0 created by spark can't be read by

[jira] [Updated] (ARROW-6058) pyarrow.lib.ArrowIOError: Unexpected end of stream: Page was smaller than expected

2019-07-29 Thread Siddharth (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth updated ARROW-6058: - Description: I am reading parquet data from S3 and get an ArrowIOError. Size of the data: 32 part

[jira] [Updated] (ARROW-6058) pyarrow.lib.ArrowIOError: Unexpected end of stream: Page was smaller than expected

2019-07-29 Thread Siddharth (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth updated ARROW-6058: - Description: I am reading parquet data from S3 and get an ArrowIOError. Size of the data: 32 part

[jira] [Updated] (ARROW-6058) pyarrow.lib.ArrowIOError: Unexpected end of stream: Page was smaller than expected

2019-07-29 Thread Siddharth (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth updated ARROW-6058: - Description: I am reading parquet data from S3 and get an ArrowIOError. Size of the data: 32 part

[jira] [Updated] (ARROW-6058) pyarrow.lib.ArrowIOError: Unexpected end of stream: Page was smaller than expected

2019-07-29 Thread Siddharth (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth updated ARROW-6058: - Description: I am reading parquet data from S3 and get an ArrowIOError. Size of the data: 32 part

[jira] [Created] (ARROW-6058) pyarrow.lib.ArrowIOError: Unexpected end of stream: Page was smaller than expected

2019-07-29 Thread Siddharth (JIRA)
Siddharth created ARROW-6058: Summary: pyarrow.lib.ArrowIOError: Unexpected end of stream: Page was smaller than expected Key: ARROW-6058 URL: https://issues.apache.org/jira/browse/ARROW-6058 Project:

[jira] [Created] (ARROW-6057) Parquet files v2.0 created by spark can't be read by pyarrow

2019-07-29 Thread Vladyslav Shamaida (JIRA)
Vladyslav Shamaida created ARROW-6057: - Summary: Parquet files v2.0 created by spark can't be read by pyarrow Key: ARROW-6057 URL: https://issues.apache.org/jira/browse/ARROW-6057 Project: Apache

[jira] [Updated] (ARROW-6040) [Java] Dictionary entries are required in IPC streams even when empty

2019-07-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6040: -- Labels: pull-request-available (was: ) > [Java] Dictionary entries are required in IPC