You are correct, I understand. My only concern is the backward-compatibility
problem: this worked in previous versions of Apache Spark. It's painful when
an out-of-the-box feature breaks without documentation or a workaround such
as a "spark.sql.legacy.keepSqlRecursive" true/false flag. It's not about "my
code"; it's about all the production code running out there.

Thank you so much

On Mon, Dec 13, 2021 at 2:32 PM Sean Owen <sro...@gmail.com> wrote:

> I think we're going around in circles - you should not do this. You
> essentially have "__TABLE__ = SELECT * FROM __TABLE__", and I hope it's
> clear why that can't work in general.
> On the first execution, sure, maybe the "old" __TABLE__ refers to "SELECT
> 1", but what about the second time? If you stick to that interpretation,
> it isn't actually executing correctly, even though it 'works'. If you
> execute it as is, it fails for circularity. Both are bad, so it's simply
> disallowed.
> Just fix your code?
>
> On Mon, Dec 13, 2021 at 11:27 AM Daniel de Oliveira Mantovani <
> daniel.oliveira.mantov...@gmail.com> wrote:
>
>> I've reduced the code to a minimal reproduction of the issue:
>>
>> val df = spark.sql("SELECT 1")
>> df.createOrReplaceTempView("__TABLE__")
>> spark.sql("SELECT * FROM __TABLE__").show
>> val df2 = spark.sql("SELECT *,2 FROM __TABLE__")
>> df2.createOrReplaceTempView("__TABLE__") // Exception in Spark 3.2 but
>> works for Spark 2.4.x and Spark 3.1.x
>> spark.sql("SELECT * FROM __TABLE__").show
>>
>> org.apache.spark.sql.AnalysisException: Recursive view `__TABLE__`
>> detected (cycle: `__TABLE__` -> `__TABLE__`)
>>   at
>> org.apache.spark.sql.errors.QueryCompilationErrors$.recursiveViewDetectedError(QueryCompilationErrors.scala:2045)
>>   at
>> org.apache.spark.sql.execution.command.ViewHelper$.checkCyclicViewReference(views.scala:515)
>>   at
>> org.apache.spark.sql.execution.command.ViewHelper$.$anonfun$checkCyclicViewReference$2(views.scala:522)
>>   at
>> org.apache.spark.sql.execution.command.ViewHelper$.$anonfun$checkCyclicViewReference$2$adapted(views.scala:522)
>>
>> On Mon, Dec 13, 2021 at 2:10 PM Sean Owen <sro...@gmail.com> wrote:
>>
>>> _shrug_ I think this is a bug fix, unless I am missing something here.
>>> You shouldn't just use __TABLE__ for everything, and I'm not seeing a
>>> good reason to do that other than that it's what you do now.
>>> I'm not sure it's coming across that this _can't_ work in the general
>>> case.
>>>
>>> On Mon, Dec 13, 2021 at 11:03 AM Daniel de Oliveira Mantovani <
>>> daniel.oliveira.mantov...@gmail.com> wrote:
>>>
>>>>
>>>> In this context, I don't want to worry about the name of the temporary
>>>> table. That's why it is "__TABLE__".
>>>> The point is that this behavior in Spark 3.2.x breaks backward
>>>> compatibility with all previous versions of Apache Spark. In my opinion,
>>>> we should at least have a flag like "spark.sql.legacy.keepSqlRecursive"
>>>> true/false.
>>>>
>>>
>>
>> --
>>
>> --
>> Daniel Mantovani
>>
>>

-- 

--
Daniel Mantovani
