GitHub user paul-rogers opened a pull request:

    https://github.com/apache/drill/pull/1218

    DRILL-6335: Refactor row set abstractions to prepare for unions

    Refactors the column accessors to prepare for adding unions, lists and 
repeated lists.
    
    This is a subset of a PR submitted a week ago, split out in the hope that 
it will be easier to review. The original will be broken down into four or more 
smaller PRs, including this one, a refactoring of the result set loader (also 
to prepare for unions), and the union work itself.
    
    The row set mechanism is fully described 
[here](https://github.com/paul-rogers/drill/wiki/Batch-Handling-Upgrades).
    
    Rather than write a long description here, please take a look at the code 
and the Wiki post. To ease review, however, the following summarizes the 
changes:
    
    * Moved metadata from the tuple reader/writer to the column reader/writer 
so that it is available for all columns. Added a `tupleMetadata()` method to 
tuples so that they continue to provide the tuple schema.
    * Added `ProjectionType` metadata in preparation for the projection system 
to be used by the scan operator. (Projection has three states, captured by the 
new enum.)
    * Updated some tests to use a slightly simpler version of the code that 
compares two result sets.
    * Added unit tests for "indirect" readers: a reader for an SV2.
    * Refactored the offset vector writer to allow a dummy offset vector writer 
as part of the projection mechanism.
    * Changed the column accessor code gen template to use constants instead of 
hard-coded numbers for field positions.
    * Additional documentation.
    * Restructured the column accessor base classes to better organize the 
functions in preparation for lists and unions (which are far more complex than 
maps and scalars).
    * Pulled a couple of formerly-nested classes into top-level classes.
    * Reorganized the code that builds accessors; moved it into the accessor 
itself.
    * Added a more complete system for writing generic Java objects in 
tests.
    * Added temporary patches to the row set loader classes to handle the above 
changes. (The patches will be replaced in the result set loader refactoring PR 
to come later.)
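
    As a rough illustration of how a tri-state projection flag and a dummy 
offset vector writer might fit together, here is a minimal, self-contained 
sketch. It is not the code in this PR: the enum values, the `OffsetWriter` 
interface and the class names are all hypothetical stand-ins chosen for the 
example.

```java
import java.util.ArrayList;
import java.util.List;

public class ProjectionSketch {

  // Hypothetical three-state projection flag. The value names are
  // assumptions for illustration; the PR's actual enum may differ.
  enum ProjectionType { PROJECTED, UNPROJECTED, UNSPECIFIED }

  // Hypothetical, simplified stand-in for an offset vector writer.
  interface OffsetWriter {
    void setNextOffset(int offset);
    int writtenCount();
  }

  // Records offsets in a list that plays the role of the offset vector.
  static class RealOffsetWriter implements OffsetWriter {
    private final List<Integer> offsets = new ArrayList<>();

    RealOffsetWriter() {
      offsets.add(0); // an offset vector begins with a 0 entry
    }

    @Override
    public void setNextOffset(int offset) {
      offsets.add(offset);
    }

    @Override
    public int writtenCount() {
      return offsets.size() - 1;
    }
  }

  // Accepts the same calls but discards them, so client code can write to
  // an unprojected column without checking whether it is projected.
  static class DummyOffsetWriter implements OffsetWriter {
    @Override
    public void setNextOffset(int offset) {
      // intentionally a no-op
    }

    @Override
    public int writtenCount() {
      return 0;
    }
  }

  // Chooses a writer from the projection state (assumed policy: only
  // UNPROJECTED columns get the dummy writer).
  static OffsetWriter writerFor(ProjectionType type) {
    return type == ProjectionType.UNPROJECTED
        ? new DummyOffsetWriter()
        : new RealOffsetWriter();
  }

  public static void main(String[] args) {
    OffsetWriter real = writerFor(ProjectionType.PROJECTED);
    OffsetWriter dummy = writerFor(ProjectionType.UNPROJECTED);
    for (int end : new int[] {3, 7, 12}) {
      real.setNextOffset(end);
      dummy.setNextOffset(end);
    }
    System.out.println("real=" + real.writtenCount()
        + " dummy=" + dummy.writtenCount());
  }
}
```

    The point of the sketch is that the caller issues identical write calls in 
both cases; only the factory decides whether the values are kept.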
    
    The code already has extensive unit tests for this functionality. Those 
tests were rerun to demonstrate that the refactoring preserves existing 
functionality. A later PR will exercise the new structure in tests for unions, 
lists, etc.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/paul-rogers/drill DRILL-6335

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1218.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1218
    
----
commit c59a3cbe0aa910c8d31589fe1619846e6d417915
Author: Paul Rogers <progers@...>
Date:   2018-04-17T04:44:10Z

    DRILL-6335: Column accessor refactoring

commit 0716a3894f0e7e48747b04f0f934cf94a04ebd3f
Author: Paul Rogers <progers@...>
Date:   2018-04-17T17:41:07Z

    Merge fixes

----

