[ https://issues.apache.org/jira/browse/SPARK-49836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim resolved SPARK-49836.
----------------------------------
    Fix Version/s: 4.0.0
                   3.4.4
                   3.5.4
         Assignee: Jungtaek Lim
       Resolution: Fixed

Issue resolved via https://github.com/apache/spark/pull/48309

> The outer query is broken when the subquery uses window function which
> receives time window as parameter
> ----------------------------------------------------------------------
>
>                 Key: SPARK-49836
>                 URL: https://issues.apache.org/jira/browse/SPARK-49836
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL, Structured Streaming
>    Affects Versions: 3.4.2, 3.4.0, 3.4.1, 3.5.0, 4.0.0, 3.5.1, 3.5.2, 3.4.3,
>                      3.5.3
>            Reporter: Andrzej Zera
>            Assignee: Jungtaek Lim
>            Priority: Blocker
>              Labels: correctness
>             Fix For: 4.0.0, 3.4.4, 3.5.4
>
>
> NOTE: This ticket describes the actual issue found after root-cause analysis
> (RCA) of the failing tests the reporter submitted, but I still want to keep
> [~azera] as the reporter to give proper credit. The tests point directly at
> the edge cases, and I could easily spot the root cause from the simple
> reproducers. Thanks [~azera]!
>
> This issue affects Spark 3.4.0 and later. Culprit issue link:
> https://issues.apache.org/jira/browse/SPARK-40821
>
> This is a silly bug. Even after figuring out what happened, I'm not sure
> exactly how "return" works in Scala, but here is my observation.
>
> [https://github.com/apache/spark/blob/c54c017e93090a5fb2edf1b5ef029561b6387a3f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveTimeWindows.scala#L90-L94]
>
> This lambda is a partial function, and I can't find an exact definition of
> what calling "return" inside a partial function does. It looks like calling
> "return" does not only complete the execution of the current context (here,
> the Aggregate), but somehow completes execution in a broader context, so the
> operator above it is "lost".
>
> Thanks [~azera] for the report of the broken test!
>
> EDIT: This does not seem to happen in every case. We have relevant tests
> in MultiStatefulOperatorsSuite, and they have been passing for a while.
> The issue appears to be coupled with subqueries. I don't understand why the
> breakage happens selectively, but that is what I observed.
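
For readers unfamiliar with the behavior described above: in Scala 2, a "return" inside a function literal (including a partial-function literal) is a non-local return. It returns from the enclosing method, not from the lambda, and is implemented by throwing a control exception that the enclosing method catches. Scala 3 deprecates this form. The sketch below illustrates the effect on a toy tree. Node, Leaf, Unary, transformUp, buggyRule and fixedRule are hypothetical names for illustration only; they are not Spark's classes, not the code in ResolveTimeWindows, and the "fixed" variant is just one way to avoid the early return, not necessarily what the PR does.

```scala
object NonLocalReturnDemo {
  // Toy plan tree (hypothetical; stands in for a chain of unary operators).
  sealed trait Node
  case class Leaf(name: String) extends Node
  case class Unary(name: String, child: Node) extends Node

  // Bottom-up transform: rewrite the child first, then apply `rule` to the
  // resulting node wherever the partial function is defined.
  def transformUp(node: Node)(rule: PartialFunction[Node, Node]): Node = {
    val withNewChild: Node = node match {
      case Unary(name, child) => Unary(name, transformUp(child)(rule))
      case leaf               => leaf
    }
    rule.applyOrElse(withNewChild, (n: Node) => n)
  }

  // Buggy: the author means "leave Aggregate untouched and keep going", but
  // `return` exits buggyRule itself with only the Aggregate subtree.
  def buggyRule(plan: Node): Node = transformUp(plan) {
    case u @ Unary(name, child) =>
      if (name == "Aggregate") return u // non-local return from buggyRule
      Unary(name + "*", child)          // mark every other operator as visited
  }

  // One possible fix: make the skip an ordinary expression result, so the
  // traversal continues and the parent operators are preserved.
  def fixedRule(plan: Node): Node = transformUp(plan) {
    case u @ Unary(name, child) =>
      if (name == "Aggregate") u else Unary(name + "*", child)
  }

  def main(args: Array[String]): Unit = {
    val plan = Unary("Project", Unary("Aggregate", Leaf("source")))
    println(buggyRule(plan)) // the Project above the Aggregate is gone
    println(fixedRule(plan)) // the Project is kept (and marked as visited)
  }
}
```

In this sketch the control exception propagates through every transformUp frame still on the stack, which is why the operators above the node that triggered the "return" disappear from the result. A tree in which the early-return branch is never reached is unaffected, which would be consistent with the breakage showing up only for some query shapes.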