[ 
https://issues.apache.org/jira/browse/SPARK-49836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved SPARK-49836.
----------------------------------
    Fix Version/s: 4.0.0
                   3.4.4
                   3.5.4
         Assignee: Jungtaek Lim
       Resolution: Fixed

Issue resolved via https://github.com/apache/spark/pull/48309

> The outer query is broken when the subquery uses window function which 
> receives time window as parameter
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-49836
>                 URL: https://issues.apache.org/jira/browse/SPARK-49836
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL, Structured Streaming
>    Affects Versions: 3.4.2, 3.4.0, 3.4.1, 3.5.0, 4.0.0, 3.5.1, 3.5.2, 3.4.3, 
> 3.5.3
>            Reporter: Andrzej Zera
>            Assignee: Jungtaek Lim
>            Priority: Blocker
>              Labels: correctness
>             Fix For: 4.0.0, 3.4.4, 3.5.4
>
>
> NOTE: This ticket describes the actual issue found by root cause analysis 
> (RCA) of the tests the reporter submitted, but I still want to keep [~azera] 
> as the reporter to give proper credit. The tests point directly at the edge 
> cases, and I could easily spot the root cause from the simple reproducers. 
> Thanks [~azera]!
> This issue affects Spark 3.4.0 and later. Culprit issue link: 
> https://issues.apache.org/jira/browse/SPARK-40821
> This is a silly bug, and even after figuring out what happened I'm not sure 
> exactly how return works in Scala, but here's my observation.
> [https://github.com/apache/spark/blob/c54c017e93090a5fb2edf1b5ef029561b6387a3f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveTimeWindows.scala#L90-L94]
> This lambda is a partial function. I can't find the exact definition of how 
> "return" behaves inside a partial function. It looks like calling return 
> does not only complete the execution of the current context (here, the 
> Aggregate), but somehow completes the execution of a broader context, so 
> the operator above it is "lost".
> Thanks [~azera] for the report of the broken test!
> EDIT: This does not seem to happen in every case. We have relevant tests 
> in MultiStatefulOperatorsSuite, and they have been passing for a while.
> The issue appears to be tied to subqueries. I don't understand why the 
> breakage happens only selectively, but that's what I observed.
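
For readers unfamiliar with the behavior described above: in Scala, a "return" inside a lambda (including a partial function literal) returns from the enclosing method, not from the lambda. The compiler implements this by throwing scala.runtime.NonLocalReturnControl, which the enclosing method catches. The sketch below is only an illustration under stated assumptions: Node, Leaf, Unary, transformUp, buggy and fixed are hypothetical names on a toy tree, not Spark's LogicalPlan or the code in ResolveTimeWindows, and it does not attempt to explain why the real breakage shows up only with subqueries.

```scala
// Toy plan tree for illustration only; this is NOT Spark's LogicalPlan.
sealed trait Node
case class Leaf(name: String) extends Node
case class Unary(name: String, child: Node) extends Node

object NonLocalReturnDemo {
  // Bottom-up transform: rewrite the child first, then apply the rule to the node.
  def transformUp(node: Node)(rule: PartialFunction[Node, Node]): Node = {
    val withNewChild: Node = node match {
      case Unary(n, c) => Unary(n, transformUp(c)(rule))
      case leaf        => leaf
    }
    rule.applyOrElse(withNewChild, identity[Node])
  }

  // Buggy rule: `return` exits buggy() itself, unwinding every transformUp
  // frame above the Aggregate, so the parent operators are never rebuilt.
  def buggy(plan: Node): Node = {
    transformUp(plan) {
      case u @ Unary("Aggregate", _) =>
        return u // non-local return from the enclosing method
    }
  }

  // Fixed rule: the matched node is simply the value of the case body.
  def fixed(plan: Node): Node = {
    transformUp(plan) {
      case u @ Unary("Aggregate", _) => u
    }
  }

  def main(args: Array[String]): Unit = {
    val plan = Unary("Project", Unary("Aggregate", Leaf("Source")))
    println(buggy(plan)) // Unary(Aggregate,Leaf(Source)): the Project is lost
    println(fixed(plan)) // Unary(Project,Unary(Aggregate,Leaf(Source)))
  }
}
```

In this toy case the fix is to let the case body evaluate to the node instead of calling return, so the traversal continues and the operators above are preserved.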


