paul-rogers opened a new pull request #2056: DRILL-7701: EVF V2 Scan Framework
URL: https://github.com/apache/drill/pull/2056
 
 
   # [DRILL-7701](https://issues.apache.org/jira/browse/DRILL-7701): EVF V2 Scan Framework
   
   ## Description
   
   Revises the scan framework to use the revised schema resolution introduced 
in DRILL-7696.
   
   In EVF, the scan has three layers:
   
   * The operator layer which handles the Volcano-style iterator `RecordBatch` 
API.
   * The scan framework (this PR) which handles the details of schema, building 
batches, populating implicit columns, filling in missing columns and so on.
   * The schema resolution mechanism (DRILL-7696) which figures out projection, 
column names, etc.
   
   Splitting the scan into these layers makes it easier to upgrade one layer 
without making wholesale changes to the others.
   
   In this PR, the ad-hoc classes of EVF V1 are renamed and restructured into a 
few simple concepts:
   
   * A scan builder, as in EVF V1.
   * A scan lifecycle that coordinates work across readers (like the former 
`ScanSchemaOrchestrator`).
   * A reader lifecycle which, along with the `SchemaNegotiator`, does all the 
"back office" setup and book-keeping for the reader.
   * A number of helper classes to handle things like implicit columns, missing 
columns, and so on.
   
   This version learns from V1 to reduce the reader from three to two 
operations. The `open()` call of the V1 reader is merged into the constructor 
for V2. This allows many more reader fields to be declared `final` and should 
further simplify readers.
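
To illustrate the difference in shape, here is a minimal sketch of a two-operation reader. It is not the actual Drill interface: `SketchReader`, `V2StyleReader`, and the method names `next()`, `close()`, and `currentBatch()` are hypothetical, and a plain `List<String>` stands in for the data source and the batch. The point is only that, once setup moves into the constructor, the reader's state can be `final`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical two-operation reader contract, for illustration only.
interface SketchReader {
  boolean next();   // load the next batch; false when no rows remain
  void close();     // release any resources held by the reader
}

// V2-style shape: everything a separate open() step would have done
// happens in the constructor, so the fields below can be final.
class V2StyleReader implements SketchReader {
  private final Iterator<String> rows;
  private final int batchSize;
  private final List<String> batch = new ArrayList<>();
  private boolean closed;

  V2StyleReader(List<String> source, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    // "Open" the source here instead of in a separate open() call.
    this.rows = source.iterator();
    this.batchSize = batchSize;
  }

  @Override
  public boolean next() {
    // Refill the batch with up to batchSize rows.
    batch.clear();
    while (batch.size() < batchSize && rows.hasNext()) {
      batch.add(rows.next());
    }
    return !batch.isEmpty();
  }

  // Read-only view of the rows loaded by the last next() call.
  List<String> currentBatch() {
    return Collections.unmodifiableList(batch);
  }

  @Override
  public void close() {
    closed = true;
  }

  boolean isClosed() {
    return closed;
  }
}
```

A caller in this sketch constructs the reader, loops on `next()` until it returns `false`, then calls `close()`. There is no intermediate state in which the reader exists but is not yet open.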
   
   As in V1, the lifecycle comes in two "versions": a base "scan" version which 
is agnostic about the data source, and a "file" version which adds support for 
the `DrillFileSystem` and implicit columns.
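
The base/file split can be pictured roughly as follows. This is a sketch under stated assumptions, not the classes in this PR: `BaseScanSketch` and `FileScanSketch` are hypothetical names, the file system is reduced to a path string, and only three of Drill's file implicit columns (`fqn`, `filename`, `suffix`) are computed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical base "scan" version: knows nothing about files.
class BaseScanSketch {
  private final String sourceName;

  BaseScanSketch(String sourceName) {
    this.sourceName = sourceName;
  }

  String sourceName() {
    return sourceName;
  }

  // The base version contributes no implicit columns.
  Map<String, String> implicitColumns() {
    return new LinkedHashMap<>();
  }
}

// Hypothetical "file" version: layers file awareness on top of the base.
class FileScanSketch extends BaseScanSketch {
  private final String path;

  FileScanSketch(String path) {
    super("file");
    this.path = path;
  }

  @Override
  Map<String, String> implicitColumns() {
    Map<String, String> cols = super.implicitColumns();
    // File name is everything after the last '/' (whole path if none).
    String name = path.substring(path.lastIndexOf('/') + 1);
    int dot = name.lastIndexOf('.');
    cols.put("fqn", path);
    cols.put("filename", name);
    // Suffix is the extension without the dot, or empty if there is none.
    cols.put("suffix", dot < 0 ? "" : name.substring(dot + 1));
    return cols;
  }
}
```

In this sketch a non-file source would use `BaseScanSketch` directly and see no implicit columns, while a file-based reader gets them by using the file variant.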
   
   At present, no code uses V2; future PRs will switch existing readers over to 
this version one by one.
   
   Code is in the `v3` package. The goal is to eventually replace existing code 
and move the code here up a level to eliminate the "v3" naming.
   
   ## Documentation
   
   No user-visible changes (except those described in DRILL-7696).
   
   Developers will need to be aware of the slight differences to build a reader 
using V2 EVF vs. V1. Those details will appear in the first PR that performs a 
conversion. Extensive Javadoc appears in the code.
    
   ## Testing
   
   Added new unit tests to parallel (and eventually replace) those for the V1 
framework. Reran all unit tests to ensure no regressions.
   

