[GitHub] spark pull request #18959: [SPARK-18394][SQL] Make an Attribute.toSeq output...

maropu Wed, 16 Aug 2017 01:07:39 -0700

Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18959#discussion_r133383172
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AttributeSet.scala
 ---
    @@ -121,7 +121,12 @@ class AttributeSet private (val baseSet: 
Set[AttributeEquals])
     
       // We must force toSeq to not be strict otherwise we end up with a 
[[Stream]] that captures all
       // sorts of things in its closure.
    -  override def toSeq: Seq[Attribute] = baseSet.map(_.a).toArray.toSeq
    +  override def toSeq: Seq[Attribute] = {
    +    // We need to keep a deterministic output order for `baseSet` because 
this affects a variable
    +    // order in generated code (e.g., `GenerateColumnAccessor`).
    +    // See SPARK-18394 for details.
    +    baseSet.map(_.a).toArray.sortBy { a => (a.name, a.exprId.id) }
    --- End diff --
    
    Yea, as you suggested, I initially did so. But, I just kept [the original 
code](https://github.com/apache/spark/commit/c4787a3690a9ed3b8b2c6c294fc4a6915436b6f7#diff-75576f0ec7f9d8b5032000245217d233R105)
 cuz I was afraid this change wrongly affected the others. cc: @marmbrus



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18959: [SPARK-18394][SQL] Make an Attribute.toSeq output...

Reply via email to