GitHub user paul-rogers opened a pull request:

    https://github.com/apache/drill/pull/1206

    DRILL-6314: Add complex types to result set loader

    This PR is a large one: it adds Union, (non-repeated) List and 
Repeated List type support to the column accessors, row set abstraction, result 
set loader abstraction, and associated mechanisms. The good news is that, after 
this PR, all the row set and result set loader work will be complete; we'll 
then move on to the scan operator and readers.
    
    Both Union and (non-repeated) List have very odd semantics that required 
some creative gyrations in the existing code. A (non-repeated) List can hold a 
single type (List of VarChar, say), in which case the list entries can be null, 
to model a JSON list:
    ```
    {a: ["foo", "bar"]} {a: null}
    ```
    
    List entries can also be unions (which can include null values). A List 
starts as a simple list (one type), then gets "promoted" to a Union. Much 
complexity was needed to hide this process behind the simple row set 
abstractions.
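    
    As a rough illustration of the promotion idea only (this is not the code 
in this PR, and `ListPromotionSketch`, `ListColumn` and `ElementType` are 
hypothetical names invented for this sketch), a list can track the set of 
non-null element types it has seen and report itself as a union once a second 
type appears:

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of "list promotion"; it does not use Drill's vector classes.
public class ListPromotionSketch {

    // Assumed element types, chosen only for this example.
    enum ElementType { VARCHAR, INT }

    static final class ListColumn {
        // Distinct non-null element types seen so far.
        private final Set<ElementType> types = EnumSet.noneOf(ElementType.class);
        // Entries in insertion order; null entries are allowed.
        private final List<Object> entries = new ArrayList<>();

        void add(Object value) {
            if (value != null) {
                types.add(typeOf(value));
            }
            entries.add(value);
        }

        // The list counts as "promoted" once it has seen more than one type.
        boolean isUnion() {
            return types.size() > 1;
        }

        int size() {
            return entries.size();
        }

        private static ElementType typeOf(Object value) {
            if (value instanceof String) {
                return ElementType.VARCHAR;
            }
            if (value instanceof Integer) {
                return ElementType.INT;
            }
            throw new IllegalArgumentException("Unsupported type: " + value.getClass());
        }
    }

    public static void main(String[] args) {
        ListColumn a = new ListColumn();
        a.add("foo");
        a.add(null);   // a null entry does not change the element type
        System.out.println("union after strings and null: " + a.isUnion());
        a.add(10);     // a second type triggers the "promotion"
        System.out.println("union after adding an int: " + a.isUnion());
    }
}
```

    The caller only ever calls `add()`; the switch from single type to union 
happens behind that call, which is the kind of hiding described above.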
    
    There are similarities between List and Union, and among List, Repeated 
List and "normal" Repeated (array) types. The refactoring reflects these 
commonalities.
    
    Due to the complexity of the added types, this PR revises the mechanisms 
that build a row set from an existing schema, or a schema from a container.
    
    This PR also includes a somewhat orthogonal projection mechanism that 
implements projection in the row set mechanism for simple columns, array values 
and elements within maps. This code is closely intertwined with schema 
creation, and it was not worth the effort to tease the two apart into separate 
PRs.
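    
    To make the three cases concrete, here is a minimal sketch of a projection 
check. It is not the mechanism in this PR: the class `ProjectionSketch`, its 
method `isProjected()`, and the rules it applies (an array element projects its 
whole column, a map member projects its parent map, `*` projects everything) 
are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical projection check; names and rules are assumptions for this sketch.
public class ProjectionSketch {
    private final List<String> projected = new ArrayList<>();

    ProjectionSketch(String... columns) {
        for (String col : columns) {
            // Drop array indexes such as "[2]": in this sketch, projecting
            // one element projects the whole array column.
            projected.add(col.replaceAll("\\[\\d+\\]", ""));
        }
    }

    // Returns true if the dotted column path is needed by the projection list.
    boolean isProjected(String path) {
        for (String col : projected) {
            if (col.equals("*")
                    || col.equals(path)
                    || col.startsWith(path + ".")    // path is a map holding a projected member
                    || path.startsWith(col + ".")) { // path is a member of a fully projected map
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ProjectionSketch p = new ProjectionSketch("a", "m.b", "arr[2]");
        for (String path : new String[] {"a", "m", "m.b", "m.c", "arr", "z"}) {
            System.out.println(path + " projected: " + p.isProjected(path));
        }
    }
}
```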
    
    Extensive unit tests show the results in action. These are probably the 
best place to start to understand the client view of the new mechanisms.
    
    The work is divided into a number of commits to help sort out the work for 
each layer.
    
    The row set mechanism is fully described 
[here](https://github.com/paul-rogers/drill/wiki/Batch-Handling-Upgrades).
    
    Rather than write a long description here, please take a look at the code 
and the Wiki post. Then, post questions (specific or general) and I'll address 
those particular topics which need additional clarification.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/paul-rogers/drill DRILL-6314

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1206.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1206
    
----
commit 6bd9ac00fdeb1851799fb38db618f2d951bcd2c3
Author: Paul Rogers <progers@...>
Date:   2018-04-10T21:16:55Z

    DRILL-6134: Vector revisions

commit 8bb54971ad331a95b30ebd28fc803b89568f9d8f
Author: Paul Rogers <progers@...>
Date:   2018-04-10T21:18:41Z

    DRILL-6314: Vector accessor layer

commit 18b51ba387403abced03070724c72a2c8735901d
Author: Paul Rogers <progers@...>
Date:   2018-04-10T21:21:42Z

    DRILL-6314: Row set layer

commit 333ad2c8d77a5c31d5c499da95bbb56a38989099
Author: Paul Rogers <progers@...>
Date:   2018-04-10T21:24:36Z

    DRILL-6314: Result set loader layer

commit 54a26828019bbe1103f291b220bc2d20716f9680
Author: Paul Rogers <progers@...>
Date:   2018-04-10T21:25:09Z

    DRILL-6314: Metadata layer

commit 655779f558d0a0cc36fd3a0a23a9c305b5adf521
Author: Paul Rogers <progers@...>
Date:   2018-04-10T21:25:29Z

    DRILL-6314: Misc revisions

----

