Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/906#discussion_r134627805
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java ---
    @@ -768,4 +765,73 @@ else if (exprHasPrefix && refHasPrefix) {
           }
         }
       }
    +
    +  /**
    +   * handle FAST NONE specially when Project for query output. This happens when input returns a
    +   * FAST NONE directly ( input does not return any batch with schema/data).
    +   *
    +   * Project operator has to return a batch with schema derived using the following 3 rules:
    +   *  Case 1:  *  ==>  expand into an empty list of columns.
    +   *  Case 2:  regular column reference ==> treat as nullable-int column
    +   *  Case 3:  expressions => Call ExpressionTreeMaterialization over an empty vector contain.
    --- End diff --
    
    Is this description confusing two different scenarios?
    
    1. Empty result set, but a schema is provided. (The Scan Batch changes go out of their way to provide a schema when possible.)
    2. Null result set: no rows and no schema.
    
    The rules in the Javadoc seem to relate to the second case: there are no columns to project.
    
    But what do we do in the first case (when we have a schema, but no rows)? We should do exactly what we'd do if we had data: match up columns, insert nullable ints for missing columns, etc.
    
    Now, think of the null result set as an empty result set whose schema has no columns. *Exactly the same* rules apply. We match up columns (for a wildcard or a project list), but will find none. So, we'll replace every column reference with a nullable int.
    
    The point is, there should be only one code path, not two, and that one path should gracefully handle the case in which the schema is empty.
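
    To make the "one code path" idea concrete, here is a rough sketch. It is not Drill code: `ProjectionSketch`, `project`, and the string type names are hypothetical, a schema is modeled as a plain map from column name to type, and only wildcards and plain column references are covered (no expressions). An empty map stands in for both "no schema" and "schema with no columns".

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are Drill APIs.
final class ProjectionSketch {
    static final String NULLABLE_INT = "NULLABLE INT";

    /**
     * Resolves a project list against an input schema (column name -> type).
     * The same loop runs whether the schema has columns or is empty.
     */
    static Map<String, String> project(Map<String, String> inputSchema,
                                       List<String> projectList) {
        Map<String, String> output = new LinkedHashMap<>();
        for (String ref : projectList) {
            if (ref.equals("*")) {
                // Wildcard: expand to every input column; adds nothing if the schema is empty.
                output.putAll(inputSchema);
            } else {
                // Plain reference: keep the input type, or fall back to nullable int if missing.
                output.put(ref, inputSchema.getOrDefault(ref, NULLABLE_INT));
            }
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("a", "VARCHAR");

        // Schema present, no rows: "a" keeps its type, "b" is missing.
        System.out.println(project(schema, Arrays.asList("a", "b")));

        // No schema at all: same method, same rules, nothing to match.
        System.out.println(project(new LinkedHashMap<>(), Arrays.asList("*", "b")));
    }
}
```

    With this shape, the "fast none" input needs no special branch: the wildcard expands to nothing and every named column falls through to the nullable-int default.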
    
    That said, debugging the existing code path may well be tedious, and it may be faster to create a new one. I wonder what that does to ongoing maintenance costs, however: future developers will have to understand the original path and also maintain the parallel "fast none" path.

