[ https://issues.apache.org/jira/browse/SPARK-37690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17464354#comment-17464354 ]
Robin commented on SPARK-37690:
-------------------------------

Someone [here|https://community.databricks.com/s/question/0D53f00001Qugr7CAB/upgrading-from-spark-24-to-32-recursive-view-errors-when-using] has suggested this is an intentional breaking change introduced in Spark 3.1.

From [Migration Guide: SQL, Datasets and DataFrame - Spark 3.1.1 Documentation (apache.org)|https://spark.apache.org/docs/3.1.1/sql-migration-guide.html]:

> In Spark 3.1, the temporary view will have same behaviors with the permanent view, i.e. capture and store runtime SQL configs, SQL text, catalog and namespace. The captured view properties will be applied during the parsing and analysis phases of the view resolution. To restore the behavior before Spark 3.1, {*}you can set spark.sql.legacy.storeAnalyzedPlanForView to true{*}.

Grateful if someone could clarify. Worth noting that the example code works in Spark 3.1.2, just not in 3.2.0. It's not obvious to me that the above quote implies `createOrReplaceTempView` would fail in the example code posted in the issue.

> Recursive view `df` detected (cycle: `df` -> `df`)
> --------------------------------------------------
>
>                 Key: SPARK-37690
>                 URL: https://issues.apache.org/jira/browse/SPARK-37690
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.2.0
>            Reporter: Robin
>            Priority: Major
>
> In Spark 3.2.0, you can no longer reuse the same name for a temporary view. This change is backwards incompatible, and means a common way of running pipelines of SQL queries no longer works.
> The following is a simple reproducible example that works in Spark 2.x and 3.1.2, but not in 3.2.0:
> {code:python}
> from pyspark.context import SparkContext
> from pyspark.sql import SparkSession
>
> sc = SparkContext.getOrCreate()
> spark = SparkSession(sc)
>
> sql = """ SELECT id as col_1, rand() AS col_2 FROM RANGE(10); """
> df = spark.sql(sql)
> df.createOrReplaceTempView("df")
>
> sql = """ SELECT * FROM df """
> df = spark.sql(sql)
> df.createOrReplaceTempView("df")
>
> sql = """ SELECT * FROM df """
> df = spark.sql(sql)
> {code}
> The following error is now produced:
> {code:python}
> AnalysisException: Recursive view `df` detected (cycle: `df` -> `df`)
> {code}
> I'm reasonably sure this change is unintentional in 3.2.0, since it breaks a lot of legacy code, and the `createOrReplaceTempView` method is named explicitly such that replacing an existing view should be allowed. An internet search suggests other users have run into similar problems, e.g. [here|https://community.databricks.com/s/question/0D53f00001Qugr7CAB/upgrading-from-spark-24-to-32-recursive-view-errors-when-using]

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
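For anyone hitting this on upgrade: per the migration-guide quote above, the pre-3.1 behaviour can be restored with the documented legacy flag. A minimal configuration sketch follows (setting the flag at session-build time is my assumption of where it belongs; only `spark.sql.legacy.storeAnalyzedPlanForView` itself is taken from the migration guide, and this requires a running Spark installation):

```python
from pyspark.sql import SparkSession

# Restore the pre-3.1 temporary-view behaviour, where the analyzed plan
# is stored instead of the SQL text (so re-pointing a view name at a
# query that reads the old view does not form a name cycle).
spark = (
    SparkSession.builder
    .config("spark.sql.legacy.storeAnalyzedPlanForView", "true")
    .getOrCreate()
)
```

Alternatively, the cycle can be avoided without the legacy flag by giving each pipeline stage a distinct view name, so no view's SQL text refers to its own name.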