Re: Parquet-cpp has already released unofficial thrift changes

2018-09-26 Thread Wes McKinney
+1. The Thrift structures are private to parquet-cpp, so it should not be an issue. On Wed, Sep 26, 2018 at 12:23 PM Zoltan Ivanfi wrote: > > Hi, > > I just had a conversation with Nandor and he pointed out to me that even if > we broke _reading_ in parquet-cpp 1.5.0, we could simply release a 1.5

Re: Parquet-cpp has already released unofficial thrift changes

2018-09-26 Thread Zoltan Ivanfi
Hi, I just had a conversation with Nandor and he pointed out to me that even if we broke _reading_ in parquet-cpp 1.5.0, we could simply release a 1.5.1 version that fixes it. The important thing is that _writing_ is good and parquet-format-compliant in parquet-cpp 1.5.0, therefore we do not have

[jira] [Commented] (PARQUET-1425) [Format] Fix Thrift compiler warning

2018-09-26 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628976#comment-16628976 ] Wes McKinney commented on PARQUET-1425: --- I see, it's not a big deal. Perhaps ther

[jira] [Commented] (PARQUET-1425) [Format] Fix Thrift compiler warning

2018-09-26 Thread Nandor Kollar (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628950#comment-16628950 ] Nandor Kollar commented on PARQUET-1425: [~wesmckinn] would you like this to be

Re: Parquet-cpp has already released unofficial thrift changes

2018-09-26 Thread Zoltan Ivanfi
Hi, Please let me know your opinions as well. So far, all concerns have been raised only by me, which may indicate that other community members do not consider this issue serious, in which case my suggestions may be excessive and unjustified. Just to clarify: The data structures for the encryption fea

[jira] [Updated] (PARQUET-1427) [C++] Move example executables and CLI tools to Apache Arrow repo

2018-09-26 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated PARQUET-1427: Labels: pull-request-available (was: ) > [C++] Move example executables and CLI tools to

[jira] [Created] (PARQUET-1427) [C++] Move example executables and CLI tools to Apache Arrow repo

2018-09-26 Thread Wes McKinney (JIRA)
Wes McKinney created PARQUET-1427: - Summary: [C++] Move example executables and CLI tools to Apache Arrow repo Key: PARQUET-1427 URL: https://issues.apache.org/jira/browse/PARQUET-1427 Project: Parque

[jira] [Created] (PARQUET-1426) [C++] parquet-dump-schema has poor usability

2018-09-26 Thread Wes McKinney (JIRA)
Wes McKinney created PARQUET-1426: - Summary: [C++] parquet-dump-schema has poor usability Key: PARQUET-1426 URL: https://issues.apache.org/jira/browse/PARQUET-1426 Project: Parquet Issue Type

[jira] [Created] (PARQUET-1425) [Format] Fix Thrift compiler warning

2018-09-26 Thread Wes McKinney (JIRA)
Wes McKinney created PARQUET-1425: - Summary: [Format] Fix Thrift compiler warning Key: PARQUET-1425 URL: https://issues.apache.org/jira/browse/PARQUET-1425 Project: Parquet Issue Type: Bug

[jira] [Updated] (PARQUET-1387) Nanosecond precision time and timestamp - parquet-format

2018-09-26 Thread Nandor Kollar (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PARQUET-1387: --- Fix Version/s: format-2.6.0 > Nanosecond precision time and timestamp - parquet-format > --

[jira] [Updated] (PARQUET-1266) LogicalTypes union in parquet-format doesn't include UUID

2018-09-26 Thread Nandor Kollar (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PARQUET-1266: --- Fix Version/s: format-2.6.0 > LogicalTypes union in parquet-format doesn't include UUID > -

[jira] [Updated] (PARQUET-1290) Clarify maximum run lengths for RLE encoding

2018-09-26 Thread Nandor Kollar (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nandor Kollar updated PARQUET-1290: --- Fix Version/s: format-2.6.0 > Clarify maximum run lengths for RLE encoding > --

Re: Parquet-cpp has already released unofficial thrift changes

2018-09-26 Thread Gidon Gershinsky
Oh, I see now. Moving the crypto structure from position 8 to 9. Let me process that, and we can switch to a direct channel to continue this discussion. On Wed, Sep 26, 2018 at 3:30 PM Zoltan Ivanfi wrote: > Hi, > > I think it's safer if we skip id 8 altogether and use id 9 for the new > crypto

Re: Parquet-cpp has already released unofficial thrift changes

2018-09-26 Thread Zoltan Ivanfi
Hi, I think it's safer if we skip id 8 altogether and use id 9 for the new crypto structure. This way we don't have to worry about remaining backwards compatible with the accidentally released structure. Br, Zoltan On Wed, Sep 26, 2018 at 2:25 PM Gidon Gershinsky wrote: > Hi, > > I think we s
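
The reasoning behind skipping id 8 can be illustrated with a minimal sketch. It is a toy model of field-id based decoding, not the Thrift library, and it assumes that a reader identifies fields by numeric id and skips ids it does not know. The field names (`file_offset`, `crypto_metadata_v1`) and the payload values are hypothetical placeholders, not taken from the thread.

```python
# Toy model of field-id based decoding; this is NOT the Thrift library.
# Assumption: a reader keeps fields whose numeric id it knows and skips the rest.

# Hypothetical schema of an already released reader that knows id 8.
RELEASED_READER_SCHEMA = {
    2: "file_offset",
    8: "crypto_metadata_v1",
}


def decode(wire_fields, schema):
    """Return (decoded fields by name, list of skipped field ids)."""
    decoded = {}
    skipped = []
    for field_id, value in wire_fields.items():
        name = schema.get(field_id)
        if name is None:
            # Unknown id: the reader ignores it.
            skipped.append(field_id)
        else:
            decoded[name] = value
    return decoded, skipped


# Option A: a revised structure is written under the already released id 8.
reuse_id_8 = {2: 4, 8: {"layout": "revised"}}
# Option B: id 8 is left unused and the revised structure gets id 9.
skip_to_id_9 = {2: 4, 9: {"layout": "revised"}}

decoded_reuse, skipped_reuse = decode(reuse_id_8, RELEASED_READER_SCHEMA)
decoded_skip, skipped_skip = decode(skip_to_id_9, RELEASED_READER_SCHEMA)

# With option A the released reader treats the revised payload as the old structure.
print(decoded_reuse)
# With option B the released reader never sees the revised payload.
print(decoded_skip, skipped_skip)
```

In this model, reusing id 8 makes the released reader interpret the revised payload under the old field's name, whereas moving to id 9 leaves it untouched and ignored, which is why compatibility with the accidentally released structure stops being a concern.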

Re: Parquet-cpp has already released unofficial thrift changes

2018-09-26 Thread Gidon Gershinsky
Hi, I think we should use this id for its current purpose. This field was defined and merged months ago, and should be there in any scenario. Except for last week's change, the encryption format has been stable for a while now. The timing of this change was unfortunate; but the change was

Re: Parquet-cpp has already released unofficial thrift changes

2018-09-26 Thread Zoltan Ivanfi
Hi, It seems that I spoke too early. I just noticed that a new field was added to the ColumnChunk struct in https://github.com/apache/parquet-cpp/pull/463/files#diff-0589b447b73e51c88d9b2bdcb0957084R708 Although the field is optional, we can't pretend it was never released, because parquet-cpp alr

Re: Parquet-cpp has already released unofficial thrift changes

2018-09-26 Thread Gidon Gershinsky
Sounds good. I agree a feature branch is the right thing to do in these cases. Cheers, Gidon. On Wed, Sep 26, 2018 at 2:42 PM Zoltan Ivanfi wrote: > Hi, > > If the encryption code released in parquet-cpp is unused at this moment then > I think we are fine. It means that we are still free to deci

Re: Parquet-cpp has already released unofficial thrift changes

2018-09-26 Thread Zoltan Ivanfi
Hi, If the encryption code released in parquet-cpp is unused at this moment then I think we are fine. It means that we are still free to decide any way about the data structures without the risk of incompatibility issues. In this case I would suggest proceeding as we planned at the Parquet sync. Tha

Re: parquet sync notes

2018-09-26 Thread 俊杰陈
Hi Zoltan, PR #62 contains some rebase info which is not related to the change itself, so I created PR #99. Actually, it only contains a single file change now; I will add another document file later. On Wed, Sep 26, 2018 at 3:19 PM, Zoltan Ivanfi wrote: > Hi, > > It seems to me that PR #99 does not supersede PR #62, as the

Re: parquet sync notes

2018-09-26 Thread Zoltan Ivanfi
Hi, It seems to me that PR #99 does not supersede PR #62, as the latter affects 16 files but the former only modifies a single one. Or has the rest of the changes already been merged into the codebase from another PR? I checked the history and I don't see anything related. Thanks, Zoltan On Wed,