[jira] [Commented] (DRILL-5657) Implement size-aware result set loader

ASF GitHub Bot (JIRA) Tue, 14 Nov 2017 09:56:29 -0800

    [ https://issues.apache.org/jira/browse/DRILL-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251826#comment-16251826 ]


ASF GitHub Bot commented on DRILL-5657:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/914#discussion_r150755238
  
    --- Diff: exec/vector/src/main/java/org/apache/drill/exec/record/MaterializedField.java ---
    @@ -168,6 +174,58 @@ public boolean equals(Object obj) {
                 Objects.equals(this.type, other.type);
       }
     
    +  public boolean isEquivalent(MaterializedField other) {
    +    if (! name.equalsIgnoreCase(other.name)) {
    +      return false;
    +    }
    +
    +    // Requires full type equality, including fields such as precision and scale.
    +    // But, unset fields are equivalent to 0. Can't use the protobuf-provided
    +    // isEquals(), that treats set and unset fields as different.
    +
    +    if (type.getMinorType() != other.type.getMinorType()) {
    +      return false;
    +    }
    +    if (type.getMode() != other.type.getMode()) {
    +      return false;
    +    }
    +    if (type.getScale() != other.type.getScale()) {
    +      return false;
    +    }
    +    if (type.getPrecision() != other.type.getPrecision()) {
    +      return false;
    +    }
    +
    +    // Compare children -- but only for maps, not the internal children
    +    // for Varchar, repeated or nullable types.
    +
    +    if (type.getMinorType() != MinorType.MAP) {
    +      return true;
    +    }
    +
    +    if (children == null  ||  other.children == null) {
    +      return children == other.children;
    +    }
    +    if (children.size() != other.children.size()) {
    +      return false;
    +    }
    +
    +    // Maps are name-based, not position. But, for our
    +    // purposes, we insist on identical ordering.
    +
    +    Iterator<MaterializedField> thisIter = children.iterator();
    +    Iterator<MaterializedField> otherIter = other.children.iterator();
    +    while (thisIter.hasNext()) {
    --- End diff --
    
    The row set and writer abstractions require identical ordering so that column 
indexes are well-defined. Here we face the age-old philosophical question 
of "sameness." Sameness is instrumental: sameness-for-a-purpose. In this case, we 
want to know whether two schemas are equivalent for the purposes of referencing 
columns by index. We recently did a fix elsewhere in which we do use the looser 
definition: that A and B contain the same columns, but in possibly different 
orderings. Added a comment to explain this.
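
    To make the two notions of sameness concrete, here is a minimal sketch. It 
does not use Drill's classes: `Col`, `sameByIndex()` and `sameByName()` are 
hypothetical names invented for this illustration, the type is reduced to a 
plain string, and column names are assumed to be unique within a schema.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class SchemaSameness {

  // Hypothetical stand-in for a column: a name plus a type label.
  // This is NOT Drill's MaterializedField.
  static final class Col {
    final String name;
    final String type;

    Col(String name, String type) {
      this.name = name;
      this.type = type;
    }

    // Names compare case-insensitively, as in the diff above.
    boolean matches(Col other) {
      return name.equalsIgnoreCase(other.name) && type.equals(other.type);
    }
  }

  // Strict sameness: the same columns in the same positions, so that
  // index i refers to the same column in both schemas.
  static boolean sameByIndex(List<Col> a, List<Col> b) {
    if (a.size() != b.size()) {
      return false;
    }
    Iterator<Col> aIter = a.iterator();
    Iterator<Col> bIter = b.iterator();
    while (aIter.hasNext()) {
      if (!aIter.next().matches(bIter.next())) {
        return false;
      }
    }
    return true;
  }

  // Loose sameness: the same columns in any order.
  // Assumes column names are unique within each schema.
  static boolean sameByName(List<Col> a, List<Col> b) {
    if (a.size() != b.size()) {
      return false;
    }
    Map<String, Col> byName = new HashMap<>();
    for (Col c : a) {
      byName.put(c.name.toLowerCase(Locale.ROOT), c);
    }
    for (Col c : b) {
      Col match = byName.remove(c.name.toLowerCase(Locale.ROOT));
      if (match == null || !match.matches(c)) {
        return false;
      }
    }
    return byName.isEmpty();
  }

  public static void main(String[] args) {
    List<Col> ab = Arrays.asList(new Col("a", "INT"), new Col("b", "VARCHAR"));
    List<Col> ba = Arrays.asList(new Col("B", "VARCHAR"), new Col("a", "INT"));
    System.out.println(sameByIndex(ab, ba));
    System.out.println(sameByName(ab, ba));
  }
}
```

    For the two schemas in `main()`, the index-based check reports a mismatch 
while the name-based check reports a match. Only the first guarantees that a 
column index means the same thing in both schemas.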


> Implement size-aware result set loader
> --------------------------------------
>
>                 Key: DRILL-5657
>                 URL: https://issues.apache.org/jira/browse/DRILL-5657
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: Future
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: Future
>
>
> A recent extension to Drill's set of test tools created a "row set" 
> abstraction to allow us to create, and verify, record batches with very few 
> lines of code. Part of this work involved creating a set of "column 
> accessors" in the vector subsystem. Column readers provide a uniform API to 
> obtain data from columns (vectors), while column writers provide a uniform 
> writing interface.
> DRILL-5211 discusses a set of changes to limit value vectors to 16 MB in size 
> (to avoid memory fragmentation due to Drill's two memory allocators). The 
> column accessors have proven to be so useful that they will be the basis for 
> the new, size-aware writers used by Drill's record readers.
> A step in that direction is to retrofit the column writers to use the 
> size-aware {{setScalar()}} and {{setArray()}} methods introduced in 
> DRILL-5517.
> Since the test framework row set classes are (at present) the only consumer 
> of the accessors, those classes must also be updated with the changes.
> This then allows us to add a new "row mutator" class that handles size-aware 
> vector writing, including the case in which a vector fills in the middle of a 
> row.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
