[GitHub] spark pull request #20276: [SPARK-14948][SQL] disambiguate attributes in joi...

viirya Wed, 17 Jan 2018 03:06:01 -0800

Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20276#discussion_r162018508
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
    @@ -1234,11 +1234,24 @@ class Dataset[T] private[sql](
           if (sqlContext.conf.supportQuotedRegexColumnName) {
             colRegex(colName)
           } else {
    -        val expr = resolve(colName)
    -        Column(expr)
    +        createCol(colName)
           }
       }
     
    +  private def createCol(name: String): Column = {
    +    val expr = resolve(name) transform {
    +      case a: AttributeReference =>
    +        // Associate the returned `AttributeReference` with the `AnalysisBarrier` of this Dataset,
    +        // by putting the barrier id into `AttributeReference.metadata`. This information is only
    +        // used to disambiguate the attributes in join condition when resolving self-join and
    +        // de-duplicating the right side plan.
    --- End diff --
    
    Shall we clarify that this metadata will be removed after analysis?
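
To illustrate the lifecycle the question is about, here is a minimal sketch using a toy model rather than Spark's actual classes. `Attr`, `BarrierTagging`, `BarrierIdKey`, `tag`, and `strip` are all hypothetical names introduced for illustration; the real metadata key and the place where the PR strips it are not shown in this diff.

```scala
// Toy model, not Spark's actual classes: every name below is hypothetical.
final case class Attr(name: String, exprId: Long, metadata: Map[String, String] = Map.empty)

object BarrierTagging {
  // Hypothetical metadata key; the PR's real key name is not shown in this diff.
  val BarrierIdKey = "__barrier_id"

  // Tag an attribute with the id of the Dataset (barrier) it was resolved from.
  def tag(a: Attr, barrierId: Long): Attr =
    a.copy(metadata = a.metadata + (BarrierIdKey -> barrierId.toString))

  // Drop the tag once analysis is done, so later phases never see it.
  def strip(a: Attr): Attr =
    a.copy(metadata = a.metadata - BarrierIdKey)

  def main(args: Array[String]): Unit = {
    val id = Attr("id", exprId = 1L)
    val left = tag(id, barrierId = 10L)
    val right = tag(id, barrierId = 11L)
    // Same exprId on both sides of a self-join, but the tags tell them apart.
    assert(left.exprId == right.exprId && left != right)
    // After stripping, both are the original attribute again.
    assert(strip(left) == id && strip(right) == id)
  }
}
```

If the tag is only meaningful during analysis, stating that in the code comment (and stripping it in one place) would make it clear that later phases should never depend on it.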



