[jira] [Updated] (PARQUET-1239) [C++] Arrow table reads error when overflowing capacity of BinaryArray

2018-09-04 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated PARQUET-1239: Fix Version/s: cpp-1.6.0 (was: cpp-1.5.0) > [C++] Arrow table reads

[jira] [Commented] (PARQUET-1239) [C++] Arrow table reads error when overflowing capacity of BinaryArray

2018-09-04 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603593#comment-16603593 ] Wes McKinney commented on PARQUET-1239: It's hard to say, but things are going to be much easier

[jira] [Created] (PARQUET-1413) [C++] Remove virtual calls from parquet::Comparator hot paths on writing file

2018-09-04 Thread Wes McKinney (JIRA)
Wes McKinney created PARQUET-1413. Summary: [C++] Remove virtual calls from parquet::Comparator hot paths on writing file; Key: PARQUET-1413; URL: https://issues.apache.org/jira/browse/PARQUET-1413

Re: [RESULT] [VOTE] Moving Apache Parquet C++ development process to a monorepo structure with Apache Arrow C++

2018-09-04 Thread Wes McKinney
Great. It is definitely going to require some follow-up patches to fix up the various packaging tasks, but at least the Linux Python wheels will still be working to start.
On Tue, Sep 4, 2018 at 2:04 PM Uwe L. Korn wrote:
> Hello Wes,
>
> I have not much time this week but I hope to squeeze in

Re: [RESULT] [VOTE] Moving Apache Parquet C++ development process to a monorepo structure with Apache Arrow C++

2018-09-04 Thread Uwe L. Korn
Hello Wes, I don't have much time this week, but I hope to squeeze in a few minutes tomorrow afternoon to review the code. As this is a very big merge, I want to be extra careful not to break anything really badly. Hopefully more eyes will help. Thank you for all the work in pushing this forward

[jira] [Created] (PARQUET-1412) [C++] Incorporate CLI utilities and benchmarks into Arrow codebase

2018-09-04 Thread Wes McKinney (JIRA)
Wes McKinney created PARQUET-1412. Summary: [C++] Incorporate CLI utilities and benchmarks into Arrow codebase; Key: PARQUET-1412; URL: https://issues.apache.org/jira/browse/PARQUET-1412; Project:

Re: [RESULT] [VOTE] Moving Apache Parquet C++ development process to a monorepo structure with Apache Arrow C++

2018-09-04 Thread Wes McKinney
Dear all, the repo merge is nearly ready to go, modulo some fixes to CI. There will be a number of follow-up issues to re-establish the various (untested) build procedures in parquet-cpp. https://github.com/apache/arrow/pull/2453 I would like to merge this by EOD Wednesday 9/5, or Thursday at

Re: [VOTE] Finalizing the design and moving forward to read/write implementation

2018-09-04 Thread 俊杰陈
I agree with Jim that we might discover more when implementing the reader/writer, and there should be no major change to parquet-format because: What type of Bloom filter to use? We use a block-based Bloom filter now, and no major changes are needed if we plan to support others. Just add it to defined algorithm