Mikhail Nikoliukin created SPARK-54758:
------------------------------------------
Summary: Fix generator resolution order
Key: SPARK-54758
URL: https://issues.apache.org/jira/browse/SPARK-54758
Project: Spark
Issue Type: Task
Components: SQL
Affects Versions: 4.2.0
Reporter: Mikhail Nikoliukin
While testing generators as part of
[https://issues.apache.org/jira/projects/SPARK/issues/SPARK-54687], I noticed
that when a projection contains multiple generators, the order in which they
are resolved depends on the analyzer rule order. The following test highlights
this:
{code:java}
-- !query
SELECT explode(array(0, 1, 2)), explode(array(10, 20))
-- !query analysis
Project [col#x, col#x]
+- Generate explode(array(10, 20)), false, [col#x]
   +- Generate explode(array(0, 1, 2)), false, [col#x]
      +- OneRowRelation
-- !query
SELECT explode(array(sin(0), 1, 2)), explode(array(10, 20))
-- !query analysis
Project [col#x, col#x]
+- Generate explode(array(SIN(cast(0 as double)), cast(1 as double), cast(2 as double))), false, [col#x]
   +- Generate explode(array(10, 20)), false, [col#x]
      +- OneRowRelation
{code}
As the plans show, adding a function call to a generator's arguments delays
its resolution, which swaps the order of the Generate nodes in the resulting
plan.
This does not affect the end result, but it makes it hard to keep the
single-pass analyzer's plan compatible with the one produced by the old
analyzer.
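
The swap can be illustrated with a toy model in Python. This is only a sketch and not Spark's analyzer code: the names Generator, passes_until_resolved and fixed_point_generate_order are hypothetical, and the model assumes that each fixed-point pass extracts every already-resolved generator from left to right, stacking a new Generate node on top of the current child, while unresolved generators wait for a later pass.

```python
from dataclasses import dataclass


@dataclass
class Generator:
    """A generator expression in the SELECT list (toy model, hypothetical)."""
    name: str
    # Assumption: number of extra analyzer passes the arguments need before
    # they count as resolved (0 for plain literals, 1 if a function such as
    # sin() must be looked up and its arguments cast first).
    passes_until_resolved: int


def fixed_point_generate_order(generators):
    """Return Generate nodes from innermost (extracted first) to outermost."""
    pending = list(generators)
    extracted = []
    iteration = 0
    while pending:
        still_pending = []
        for gen in pending:
            if gen.passes_until_resolved <= iteration:
                # Resolved: this generator becomes the next Generate node,
                # placed on top of everything extracted so far.
                extracted.append(gen.name)
            else:
                # Arguments not resolved yet: retry on the next pass.
                still_pending.append(gen)
        pending = still_pending
        iteration += 1
    return extracted


literals_only = [
    Generator("explode(array(0, 1, 2))", 0),
    Generator("explode(array(10, 20))", 0),
]
with_function = [
    Generator("explode(array(sin(0), 1, 2))", 1),
    Generator("explode(array(10, 20))", 0),
]

print(fixed_point_generate_order(literals_only))
print(fixed_point_generate_order(with_function))
```

Under these assumptions the first query keeps the SELECT-list order, with explode(array(0, 1, 2)) innermost, while in the second query the generator containing sin(0) is extracted one pass later and ends up as the outermost Generate node, matching the two plans above.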