Jason Altekruse created DRILL-1858:
--------------------------------------

             Summary: Parquet reader should only explicitly fill in data for a 
column requested but not in the file if there are no valid columns found
                 Key: DRILL-1858
                 URL: https://issues.apache.org/jira/browse/DRILL-1858
             Project: Apache Drill
          Issue Type: Improvement
            Reporter: Jason Altekruse


If columns are requested from a Parquet file that do not appear in that 
particular file (users may have a directory full of files that share some 
columns but not others), then in most cases we do not need to create a vector 
to represent these columns. They can be materialized (as a vector filled with 
nulls) later, when they are referenced in other parts of the query, such as a 
filter or join condition. The current behavior of the reader is to always fill 
vectors for these columns, which just creates extra payload to ship around 
until the vectors are actually referenced.
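
The proposed behavior can be illustrated with a minimal, language-neutral 
sketch. This is not Drill's reader code: `Batch`, `read_batch`, and `column` 
are hypothetical names, and plain Python lists stand in for value vectors. It 
also assumes that when none of the requested columns exist in the file, the 
reader null-fills all of them, which is one reading of the summary above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Batch:
    """Hypothetical stand-in for a record batch: a row count plus named vectors."""
    row_count: int
    vectors: Dict[str, List[Optional[int]]] = field(default_factory=dict)

    def column(self, name: str) -> List[Optional[int]]:
        """Return the vector for `name`, creating an all-null vector on first reference."""
        if name not in self.vectors:
            # Deferred materialization: only pay for the null vector once an
            # operator (e.g. a filter or join) actually asks for the column.
            self.vectors[name] = [None] * self.row_count
        return self.vectors[name]


def read_batch(file_columns: Dict[str, List[Optional[int]]],
               row_count: int,
               requested: List[str]) -> Batch:
    """Hypothetical reader: emit only the requested columns that exist in the file."""
    present = {name: file_columns[name] for name in requested if name in file_columns}
    if not present:
        # Assumed fallback: no requested column exists in this file, so
        # explicitly null-fill the requested columns instead of emitting
        # a batch with no vectors at all.
        present = {name: [None] * row_count for name in requested}
    return Batch(row_count=row_count, vectors=present)
```

With a file containing columns `a` and `b`, requesting `a` and `c` yields a 
batch that carries only `a`; the vector for `c` is created the first time 
`batch.column("c")` is called downstream.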



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
