Github user skambha commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17185#discussion_r208059884

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/package.scala ---
    @@ -169,25 +181,50 @@ package object expressions {
           })
         }

    -    // Find matches for the given name assuming that the 1st part is a qualifier (i.e. table name,
    -    // alias, or subquery alias) and the 2nd part is the actual name. This returns a tuple of
    +    // Find matches for the given name assuming that the 1st two parts are qualifier
    +    // (i.e. database name and table name) and the 3rd part is the actual column name.
    +    //
    +    // For example, consider an example where "db1" is the database name, "a" is the table name
    +    // and "b" is the column name and "c" is the struct field name.
    +    // If the name parts is db1.a.b.c, then Attribute will match
    --- End diff --

@cloud-fan, thank you for your suggestion and question. Existing Spark behavior follows precedence rules in the column resolution logic, and this patch follows the same pattern. I am looking into the SQL standard to see if it defines any column resolution rules, but I have not found any yet. However, when I researched existing databases, I observed different behaviors among them; these are listed in Section 2/Table A of the design doc [here](https://drive.google.com/file/d/1zKm3aNZ3DpsqIuoMvRsf0kkDkXsAasxH/view). I agree we can improve the checks in the existing precedence to go all the way and verify that the nested field actually exists, although the user can always qualify the field to resolve the ambiguity. Shall we open another issue to discuss improving the existing resolution logic?
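The precedence discussed above (prefer the longest qualifier match before falling back to a nested-field interpretation) can be sketched as a toy model. This is illustrative only, not Spark's actual resolver; the `resolve` function and the attribute representation are assumptions made for the example:

```python
# Toy model of multi-part name resolution precedence (illustrative only;
# not Spark's actual implementation). An attribute is a column name plus
# its qualifier, e.g. column "b" qualified by ("db1", "a").

def resolve(name_parts, attributes):
    """Try the longest qualifier first (database.table), then table only,
    then no qualifier; any parts left over after the matched column are
    treated as nested struct-field accesses."""
    for qlen in (2, 1, 0):
        if qlen >= len(name_parts):
            continue  # not enough parts for this qualifier length
        qualifier = tuple(name_parts[:qlen])
        column = name_parts[qlen]
        for attr_name, attr_qualifier in attributes.items():
            suffix = attr_qualifier[-qlen:] if qlen else ()
            if attr_name == column and suffix == qualifier:
                # (resolved column, remaining nested-field path)
                return attr_name, name_parts[qlen + 1:]
    return None  # no attribute matched


attrs = {"b": ("db1", "a")}  # column "b" of table "a" in database "db1"
print(resolve(["db1", "a", "b", "c"], attrs))  # -> ('b', ['c'])
print(resolve(["b", "c"], attrs))              # -> ('b', ['c'])
```

Because the longest qualifier interpretation always wins, a name like `a.b.c` would resolve to column `c` of a table `b` if one existed, even when the user meant field `c` of column `b` in table `a`; that is the ambiguity the thread suggests resolving by checking whether the nested field actually exists.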