[ 
https://issues.apache.org/jira/browse/SPARK-52033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-52033:
-----------------------------------
    Labels: pull-request-available  (was: )

> Bug in Generate node when child node has multiple copies of the same attribute
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-52033
>                 URL: https://issues.apache.org/jira/browse/SPARK-52033
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.5.0, 4.0.0
>            Reporter: Harsh Motwani
>            Priority: Major
>              Labels: pull-request-available
>
> In a Generate node, when the child output contains multiple copies of the 
> same attribute but the node's own output contains a different number of 
> copies of that attribute, the node fails in codegen and returns a wrong 
> result in non-codegen mode. This can be reproduced with a SQL UDF and a 
> `LATERAL VIEW EXPLODE`:
> {code:java}
> sql("""create or replace temporary function spark_func (params array<struct<x int, y int>>)
>             | returns STRUCT<a: int, b: int> LANGUAGE SQL
>             | return (select ns from (
>             | SELECT try_divide(SUM(item.x * item.y), SUM(item.x * item.x)) AS beta1,
>             | NAMED_STRUCT('a', beta1,'b', beta1) ns
>             | FROM (SELECT params) LATERAL VIEW EXPLODE(params) AS item LIMIT 1));""".stripMargin)
> sql("""select spark_func(collect_list(NAMED_STRUCT('x', 1, 'y', 1))) as result;""").collect()
> {code}
> This code hits an assertion failure in codegen 
> [here|https://github.com/harshmotw-db/spark/blob/921eba838bf1e88b5e455ee72e8edad94b71f00c/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala#L156].
> When codegen is disabled, it returns {null, null}, even though the correct 
> output is {1, 1}.
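>
> The non-codegen behavior can be checked with a sketch like the one below. 
> It assumes that "codegen is disabled" means turning off whole-stage codegen 
> through the `spark.sql.codegen.wholeStage` setting, and that `spark_func` 
> from the snippet above is already defined in the session; the report does 
> not say which setting was used.
> {code:java}
> // Assumption: whole-stage codegen is the codegen path being disabled.
> sql("SET spark.sql.codegen.wholeStage=false")
> // Per the report, this returns {null, null} instead of the expected {1, 1}.
> sql("""select spark_func(collect_list(NAMED_STRUCT('x', 1, 'y', 1))) as result;""").collect()
> {code}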
