Takeshi Yamamuro created SPARK-20390:
----------------------------------------

             Summary: Non-deterministic expressions could exist in grouping keys
                 Key: SPARK-20390
                 URL: https://issues.apache.org/jira/browse/SPARK-20390
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.1.0
            Reporter: Takeshi Yamamuro


Only deterministic expressions should appear in grouping keys; however, non-deterministic ones can end up there in some cases.
This is because an `AttributeReference` does not carry the `deterministic` 
property of the expression it was resolved from in query plans.
An example is as follows:

{code}
scala> val df = sql("""select rand(0), count(1) group by 1""")
df: org.apache.spark.sql.DataFrame = [rand(0): double, count(1): bigint]

scala> df.explain(true)
== Parsed Logical Plan ==
'Aggregate [1], [unresolvedalias('rand(0), None), unresolvedalias('count(1), None)]
+- OneRowRelation$

== Analyzed Logical Plan ==
rand(0): double, count(1): bigint
Aggregate [_nondeterministic#92], [_nondeterministic#92 AS rand(0)#90, count(1) AS count(1)#91L]
+- Project [rand(0) AS _nondeterministic#92]
   +- OneRowRelation$

== Optimized Logical Plan ==
Aggregate [_nondeterministic#92], [_nondeterministic#92 AS rand(0)#90, count(1) AS count(1)#91L]
+- Project [rand(0) AS _nondeterministic#92]
   +- OneRowRelation$

== Physical Plan ==
*HashAggregate(keys=[_nondeterministic#92], functions=[count(1)], output=[rand(0)#90, count(1)#91L])
+- Exchange hashpartitioning(_nondeterministic#92, 200)
   +- *HashAggregate(keys=[_nondeterministic#92], functions=[partial_count(1)], output=[_nondeterministic#92, count#94L])
      +- *Project [rand(0) AS _nondeterministic#92]
         +- Scan OneRowRelation[]

scala> df.show
+------------------+--------+
|           rand(0)|count(1)|
+------------------+--------+
|0.8446490682263027|       1|
+------------------+--------+

{code}
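
As the analyzed plan shows, the analyzer pulls `rand(0)` into a child `Project` and the `Aggregate`'s grouping key becomes a plain `AttributeReference` (`_nondeterministic#92`), whose `deterministic` flag is always true. A minimal sketch of a check that recovers the real determinism by resolving the attribute back to the `Alias` that produced it is below; this is only an illustration against the Catalyst 2.1 APIs, not the actual Spark fix, and the helper name `groupingKeysAreDeterministic` is hypothetical:

{code}
import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Expression}
import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, LogicalPlan, Project}

// Hypothetical helper (not part of Spark): checks whether every grouping key
// of an Aggregate-over-Project is deterministic, resolving attributes back to
// the aliased expressions that defined them.
def groupingKeysAreDeterministic(plan: LogicalPlan): Boolean = plan match {
  case Aggregate(groupingExprs, _, Project(projectList, _)) =>
    groupingExprs.forall {
      case attr: AttributeReference =>
        // `attr.deterministic` is always true, so instead look up the Alias
        // in the child Project with the same exprId and test its child.
        projectList.collectFirst {
          case a @ Alias(child, _) if a.exprId == attr.exprId => child.deterministic
        }.getOrElse(true)
      case e: Expression => e.deterministic
    }
  case _ => true
}
{code}

Applied to the analyzed plan above, this would return false, because `_nondeterministic#92` resolves back to `rand(0)`, whereas checking `deterministic` directly on the grouping key's `AttributeReference` returns true.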
