[jira] [Commented] (PARQUET-442) Convert flat SchemaElement vector to implied nested schema data structure
[ https://issues.apache.org/jira/browse/PARQUET-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130423#comment-15130423 ]

Wes McKinney commented on PARQUET-442:
--
Yep, in progress here: https://github.com/wesm/parquet-cpp/tree/PARQUET-442. Will cc you when I've written enough tests to solidify the main APIs.

> Convert flat SchemaElement vector to implied nested schema data structure
> -
>
> Key: PARQUET-442
> URL: https://issues.apache.org/jira/browse/PARQUET-442
> Project: Parquet
> Issue Type: New Feature
> Components: parquet-cpp
> Reporter: Wes McKinney
> Assignee: Wes McKinney
>
> To assist with conversion to in-memory nested data structures. Related:
> PARQUET-441

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PARQUET-442) Convert flat SchemaElement vector to implied nested schema data structure
[ https://issues.apache.org/jira/browse/PARQUET-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131781#comment-15131781 ]

Wes McKinney commented on PARQUET-442:
--
Patch in progress: https://github.com/apache/parquet-cpp/pull/38. I'm going to add some more tests, but I'm looking for reviews from experienced Parquet people (I'd like to avoid any pitfalls with schema analysis early if possible).
[jira] [Commented] (PARQUET-442) Convert flat SchemaElement vector to implied nested schema data structure
[ https://issues.apache.org/jira/browse/PARQUET-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130409#comment-15130409 ]

Deepak Majeti commented on PARQUET-442:
---
Another priority is a `column_id` indicating the physical column number (a physical column being a `parquet::SchemaElement` with 0 children). Is your plan to follow Impala's approach, which is to build a tree from the flat `parquet::SchemaElement` array?
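To make the tree-building question above concrete, here is a minimal sketch of the idea, written in Python for brevity rather than in parquet-cpp's C++. It assumes the flat array lists the schema in depth-first pre-order with the root first, and that each element carries a child count. The `SchemaElement` and `Node` classes and the `build_tree` function are hypothetical stand-ins for illustration only; they are not the Thrift-generated `parquet::SchemaElement`, Impala's code, or the API in the patch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SchemaElement:
    """Hypothetical stand-in for one entry of the flat schema array."""
    name: str
    num_children: int = 0


@dataclass
class Node:
    """Hypothetical tree node; column_id is set only on leaves."""
    name: str
    children: List["Node"] = field(default_factory=list)
    column_id: Optional[int] = None


def build_tree(elements: List[SchemaElement]) -> Node:
    """Rebuild the nested schema from a pre-order flat list.

    Leaves (elements with 0 children) are numbered in the order they
    are encountered, which gives the physical column number.
    """
    pos = 0
    next_column_id = 0

    def visit() -> Node:
        nonlocal pos, next_column_id
        if pos >= len(elements):
            raise ValueError("child counts exceed the number of elements")
        element = elements[pos]
        pos += 1
        node = Node(element.name)
        if element.num_children == 0:
            node.column_id = next_column_id
            next_column_id += 1
        else:
            for _ in range(element.num_children):
                node.children.append(visit())
        return node

    root = visit()
    if pos != len(elements):
        raise ValueError("trailing elements not reachable from the root")
    return root


# Example: schema { a; b { c; d } } flattened in pre-order.
flat = [
    SchemaElement("schema", 2),
    SchemaElement("a"),
    SchemaElement("b", 2),
    SchemaElement("c"),
    SchemaElement("d"),
]
root = build_tree(flat)
# Leaves a, c, d receive column ids 0, 1, 2; the group b has none.
```

Numbering leaves during the same traversal keeps the column id consistent with the order of the flat array, so no second pass over the tree is needed.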
[jira] [Commented] (PARQUET-442) Convert flat SchemaElement vector to implied nested schema data structure
[ https://issues.apache.org/jira/browse/PARQUET-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129973#comment-15129973 ]

Wes McKinney commented on PARQUET-442:
--
There's a lot more to do here than fits in one JIRA. I'll post a WIP patch within the next couple of days and create a bunch of follow-up JIRAs for the remaining schema-related work. The biggest priority is initializing the column readers with a proper {{ColumnDescriptor}} indicating the maximum repetition / definition level of the leaf node (since we are unable to read most nested data at the moment).
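As a rough illustration of the repetition / definition levels mentioned above, the sketch below computes the maximum levels for each leaf using the usual Dremel-style rule: every optional or repeated node on the path from the root adds one to the maximum definition level, and every repeated node adds one to the maximum repetition level. It is written in Python for brevity. The `SchemaElement` class, the string repetition values, and `leaf_levels` are hypothetical names for illustration; this is not the `ColumnDescriptor` API from the patch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SchemaElement:
    """Hypothetical stand-in for one entry of the flat schema array."""
    name: str
    repetition: str  # "required", "optional" or "repeated"
    num_children: int = 0


def leaf_levels(elements: List[SchemaElement]) -> List[Tuple[str, int, int]]:
    """Return (dotted path, max definition level, max repetition level)
    for every leaf, assuming a pre-order flat list with the root first."""
    result: List[Tuple[str, int, int]] = []
    pos = 1  # element 0 is the root, which contributes no levels

    def visit(path: List[str], max_def: int, max_rep: int) -> None:
        nonlocal pos
        element = elements[pos]
        pos += 1
        if element.repetition != "required":
            max_def += 1  # optional and repeated nodes can be absent
        if element.repetition == "repeated":
            max_rep += 1
        path = path + [element.name]
        if element.num_children == 0:
            result.append((".".join(path), max_def, max_rep))
        else:
            for _ in range(element.num_children):
                visit(path, max_def, max_rep)

    for _ in range(elements[0].num_children):
        visit([], 0, 0)
    return result


# The Document schema from the Dremel paper, flattened in pre-order.
flat = [
    SchemaElement("Document", "required", 3),
    SchemaElement("DocId", "required"),
    SchemaElement("Links", "optional", 2),
    SchemaElement("Backward", "repeated"),
    SchemaElement("Forward", "repeated"),
    SchemaElement("Name", "repeated", 2),
    SchemaElement("Language", "repeated", 2),
    SchemaElement("Code", "required"),
    SchemaElement("Country", "optional"),
    SchemaElement("Url", "optional"),
]
levels = leaf_levels(flat)
# For example, Name.Language.Country has max definition level 3 and
# max repetition level 2, while DocId has 0 and 0.
```

A column reader needs these two maxima to interpret the level streams stored alongside the values, which is why a flat column with both at zero can be read without them but most nested data cannot.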
[jira] [Commented] (PARQUET-442) Convert flat SchemaElement vector to implied nested schema data structure
[ https://issues.apache.org/jira/browse/PARQUET-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15122317#comment-15122317 ]

Wes McKinney commented on PARQUET-442:
--
Cool -- I'm planning to use the Impala test suite as a guide for writing tests for nested schema resolution, so it's best to leave those bits (and the record data structures) alone for now. The schema will give us the information necessary to start working on the reassembly algorithms.