Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1125#discussion_r169857960
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java ---
    @@ -418,11 +438,13 @@ private void measureColumn(ValueVector v, String prefix) {
         netRowWidthCap50 += ! colSize.isVariableWidth ? colSize.estSize :
             8 /* offset vector */ + roundUpToPowerOf2(Math.min(colSize.estSize,50));
             // above change 8 to 4 after DRILL-5446 is fixed
    +
    +    return colSize;
       }
     
    -  private void expandMap(AbstractMapVector mapVector, String prefix) {
    +  private void expandMap(ColumnSize colSize, AbstractMapVector mapVector, String prefix) {
         for (ValueVector vector : mapVector) {
    -      measureColumn(vector, prefix);
    +      colSize.childColumnSizes.put(prefix + vector.getField().getName(), measureColumn(vector, prefix));
    --- End diff --
    
    This is subject to aliasing. Suppose I have two maps:
    
    ```
    aa(b)
    a(ab)
    ```
    When I add the child vectors, both will produce a combined name of `aab`.
    
    We can't use dots in names for the same reason:
    
    ```
    a.b(c)
    a(b.c)
    ```
    
    Both will produce `a.b.c`.
    
    In the new "result set loader" code, all places that handle trees of 
columns use actual trees of maps.
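    
    As a rough illustration of the tree approach (a minimal sketch only; `ColumnNode` and its `child` method are hypothetical names, not the actual result set loader classes), each map keeps its own child map, so names are only ever compared within one parent:
    
    ```java
    import java.util.LinkedHashMap;
    import java.util.Map;
    
    // Hypothetical node type, for illustration only; not a Drill class.
    final class ColumnNode {
      final String name;
      // Children are keyed by their own simple name, scoped to this parent.
      final Map<String, ColumnNode> children = new LinkedHashMap<>();
    
      ColumnNode(String name) {
        this.name = name;
      }
    
      // Returns the named child, creating it on first use.
      ColumnNode child(String childName) {
        return children.computeIfAbsent(childName, ColumnNode::new);
      }
    
      public static void main(String[] args) {
        ColumnNode root = new ColumnNode("");
        root.child("aa").child("b");   // aa(b)
        root.child("a").child("ab");   // a(ab)
        // The two maps stay distinct because no combined name is ever built.
        System.out.println(root.children.keySet());            // [aa, a]
        System.out.println(root.child("aa").children.keySet()); // [b]
        System.out.println(root.child("a").children.keySet());  // [ab]
      }
    }
    ```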
    
    A crude-but-effective solution is to join the names with a character that 
    is not legal inside a name. The only valid choice is the back-tick, since we 
    use that in SQL to quote names. If we do that, we now have:
    
    ```
    aa`b
    a`ab
    a.b`c
    a`b.c
    ```
    
    And the names are now un-aliased.
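    
    To make the difference concrete, here is a small standalone sketch. The class `ColumnKeyDemo` and its helper methods are made up for this comment and are not part of `RecordBatchSizer`; the sizes are arbitrary placeholder values.
    
    ```java
    import java.util.LinkedHashMap;
    import java.util.Map;
    
    // Hypothetical helper, for illustration only; not part of RecordBatchSizer.
    final class ColumnKeyDemo {
    
      // Plain concatenation, as in the diff above: prefix + child name.
      static String concatKey(String parent, String child) {
        return parent + child;
      }
    
      // Dot-separated key; still ambiguous if names may contain dots.
      static String dotKey(String parent, String child) {
        return parent + "." + child;
      }
    
      // Back-tick-separated key; assumes a back-tick never appears inside a name.
      static String backTickKey(String parent, String child) {
        return parent + "`" + child;
      }
    
      public static void main(String[] args) {
        Map<String, Integer> concat = new LinkedHashMap<>();
        concat.put(concatKey("aa", "b"), 10);
        concat.put(concatKey("a", "ab"), 20);   // same key "aab": overwrites the first entry
        System.out.println(concat.size());      // 1
    
        Map<String, Integer> dotted = new LinkedHashMap<>();
        dotted.put(dotKey("a.b", "c"), 10);
        dotted.put(dotKey("a", "b.c"), 20);     // same key "a.b.c": overwrites the first entry
        System.out.println(dotted.size());      // 1
    
        Map<String, Integer> ticked = new LinkedHashMap<>();
        ticked.put(backTickKey("aa", "b"), 10);
        ticked.put(backTickKey("a", "ab"), 20);
        ticked.put(backTickKey("a.b", "c"), 30);
        ticked.put(backTickKey("a", "b.c"), 40);
        System.out.println(ticked.size());      // 4
      }
    }
    ```
    
    The flat-map fix is the smaller change; the tree form avoids depending on any character being reserved.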

