Mikhail Nikoliukin created SPARK-54758:
------------------------------------------

             Summary: Fix generator resolution order
                 Key: SPARK-54758
                 URL: https://issues.apache.org/jira/browse/SPARK-54758
             Project: Spark
          Issue Type: Task
          Components: SQL
    Affects Versions: 4.2.0
            Reporter: Mikhail Nikoliukin


While testing generators as part of 
[https://issues.apache.org/jira/projects/SPARK/issues/SPARK-54687], I noticed 
that when a projection contains multiple generators, the order in which they are 
resolved depends on rule order. The following test highlights this:
{code:java}
-- !query
SELECT explode(array(0, 1, 2)), explode(array(10, 20))
-- !query analysis
Project [col#x, col#x]
+- Generate explode(array(10, 20)), false, [col#x]
   +- Generate explode(array(0, 1, 2)), false, [col#x]
      +- OneRowRelation


-- !query
SELECT explode(array(sin(0), 1, 2)), explode(array(10, 20))
-- !query analysis
Project [col#x, col#x]
+- Generate explode(array(SIN(cast(0 as double)), cast(1 as double), cast(2 as double))), false, [col#x]
   +- Generate explode(array(10, 20)), false, [col#x]
      +- OneRowRelation{code}
As the two plans show, adding a function call inside a generator's argument 
delays that generator's resolution, which swaps the two Generate nodes in the 
resulting plan.

This does not affect the query result, but it makes it hard to keep the plan 
produced by the single-pass analyzer compatible with the plan produced by the 
old analyzer.
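
The swap described above can be illustrated with a toy model. This is only a sketch and not Spark's analyzer code: the function name build_plan and the "passes until resolved" numbers are hypothetical, and the model simply assumes that a generator is extracted into a Generate node in the first pass in which its arguments are resolved, with each extracted generator stacked on top of the plan built so far.

```python
# Toy model only: this is NOT Spark's analyzer. It assumes (hypothetically)
# that a generator is pulled out of the projection into a Generate node during
# the first pass in which its arguments are fully resolved.

def build_plan(generators):
    """generators: list of (sql_text, passes_until_resolved), in SELECT order.

    Returns the Generate node labels from outermost to innermost.
    """
    pending = list(generators)
    plan = []  # innermost node first
    current_pass = 1
    while pending:
        # Extract every generator resolved by this pass, keeping SELECT order.
        ready = [g for g in pending if g[1] <= current_pass]
        for sql_text, _ in ready:
            plan.append(sql_text)
        pending = [g for g in pending if g[1] > current_pass]
        current_pass += 1
    return list(reversed(plan))  # outermost first, as in the printed plans


# Both generators resolve in the same pass: the first one ends up innermost.
same_pass = build_plan([
    ("explode(array(0, 1, 2))", 1),
    ("explode(array(10, 20))", 1),
])

# Assumption: the sin() call costs one extra pass, so that generator is
# extracted later and ends up above the other one.
delayed = build_plan([
    ("explode(array(sin(0), 1, 2))", 2),
    ("explode(array(10, 20))", 1),
])

print(same_pass)
print(delayed)
```

Under these assumptions the model reproduces the node order of both plans in the test: the generator whose resolution is delayed always lands on the outside, regardless of its position in the SELECT list.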



