JVM gets killed when loading parquet file

2018-08-26 Thread Nicolas Troncoso
Hi, I'm loading a parquet file 900:
maeve:$ parquet-tools rowcount -d part-0-6e54bc7e-bb79-423d-ae6a-d46ea55b591b-c000.snappy.parquet
part-0-6e54bc7e-bb79-423d-ae6a-d46ea55b591b-c000.snappy.parquet row count: 77318
Total RowCount: 77318
maeve:$ parquet-tools size -d

Re: Page indexing in parquet-cpp

2018-08-26 Thread Wes McKinney
While the codebases are merging, parquet-cpp tasks not concerning Arrow integration will still use the PARQUET JIRA project. On Sun, Aug 26, 2018 at 6:08 PM, Renato Marroquín Mogrovejo wrote: > Sounds good Wes. > I was unsure where to create the JIRA but I guess it goes in Arrow ;) > >

Re: Page indexing in parquet-cpp

2018-08-26 Thread Renato Marroquín Mogrovejo
Sounds good Wes. I was unsure where to create the JIRA but I guess it goes in Arrow ;) https://issues.apache.org/jira/browse/ARROW-3124 2018-08-26 22:49 GMT+02:00 Wes McKinney: > hi Renato, > > Yes, we would like to implement it. If there is not already a JIRA for > this, can you create one?

Re: Page indexing in parquet-cpp

2018-08-26 Thread Wes McKinney
Hi Renato, Yes, we would like to implement it. If there is not already a JIRA for this, can you create one? The work would not happen until after the Arrow-Parquet codebase merge (which is imminent anyway). - Wes On Sun, Aug 26, 2018 at 4:40 PM, Renato Marroquín Mogrovejo wrote: > Hi devs, > >

Page indexing in parquet-cpp

2018-08-26 Thread Renato Marroquín Mogrovejo
Hi devs, Are there any plans to implement https://issues.apache.org/jira/browse/PARQUET-922 in parquet-cpp? Is there a JIRA to follow development on this? Thanks in advance! Renato M.

[jira] [Updated] (PARQUET-1402) incorrect calculation column start offset for files created by parquet-mr 1.8.1

2018-08-26 Thread rip.nsk (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] rip.nsk updated PARQUET-1402: Attachment: test.parquet > incorrect calculation column start offset for files created by parquet-mr

Re: Doing a 1.5.0 C++ release

2018-08-26 Thread Wes McKinney
I think we should be able to cut a release now? We can also proceed with the Arrow merge at the same time once we agree how particularly to do that. On Wed, Aug 22, 2018 at 7:30 AM, Uwe L. Korn wrote: > For me it would also be quite useful to also have >

[jira] [Commented] (PARQUET-1370) [C++] Read consecutive column chunks in a single scan

2018-08-26 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592961#comment-16592961 ] Wes McKinney commented on PARQUET-1370: That would be a question for [~pitrou] or [~alendit] >

[jira] [Updated] (PARQUET-1370) [C++] Read consecutive column chunks in a single scan

2018-08-26 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/PARQUET-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated PARQUET-1370: Fix Version/s: cpp-1.6.0 > [C++] Read consecutive column chunks in a single scan >