[jira] [Assigned] (SPARK-34560) Cannot join datasets of SHOW TABLES

Apache Spark (Jira) Sat, 27 Feb 2021 05:05:04 -0800


     [ https://issues.apache.org/jira/browse/SPARK-34560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Apache Spark reassigned SPARK-34560:
------------------------------------

    Assignee: Apache Spark

> Cannot join datasets of SHOW TABLES
> -----------------------------------
>
>                 Key: SPARK-34560
>                 URL: https://issues.apache.org/jira/browse/SPARK-34560
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Maxim Gekk
>            Assignee: Apache Spark
>            Priority: Major
>
> The following example illustrates the issue:
> {code:scala}
> scala> sql("CREATE NAMESPACE ns1")
> res8: org.apache.spark.sql.DataFrame = []
> scala> sql("CREATE NAMESPACE ns2")
> res9: org.apache.spark.sql.DataFrame = []
> scala> sql("CREATE TABLE ns1.tbl1 (c INT)")
> res10: org.apache.spark.sql.DataFrame = []
> scala> sql("CREATE TABLE ns2.tbl2 (c INT)")
> res11: org.apache.spark.sql.DataFrame = []
> scala> val show1 = sql("SHOW TABLES IN ns1")
> show1: org.apache.spark.sql.DataFrame = [namespace: string, tableName: string ... 1 more field]
> scala> val show2 = sql("SHOW TABLES IN ns2")
> show2: org.apache.spark.sql.DataFrame = [namespace: string, tableName: string ... 1 more field]
> scala> show1.show
> +---------+---------+-----------+
> |namespace|tableName|isTemporary|
> +---------+---------+-----------+
> |      ns1|     tbl1|      false|
> +---------+---------+-----------+
> scala> show2.show
> +---------+---------+-----------+
> |namespace|tableName|isTemporary|
> +---------+---------+-----------+
> |      ns2|     tbl2|      false|
> +---------+---------+-----------+
> scala> show1.join(show2).where(show1("tableName") =!= show2("tableName")).show
> org.apache.spark.sql.AnalysisException: Column tableName#17 are ambiguous. 
> It's probably because you joined several Datasets together, and some of these 
> Datasets are the same. This column points to one of the Datasets but Spark is 
> unable to figure out which one. Please alias the Datasets with different 
> names via `Dataset.as` before joining them, and specify the column using 
> qualified name, e.g. `df.as("a").join(df.as("b"), $"a.id" > $"b.id")`. You 
> can also set spark.sql.analyzer.failAmbiguousSelfJoin to false to disable 
> this check.
>   at org.apache.spark.sql.execution.analysis.DetectAmbiguousSelfJoin$.apply(DetectAmbiguousSelfJoin.scala:157)
> {code}
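
The error message itself suggests a workaround: alias each Dataset with `Dataset.as` and refer to the columns by qualified name. Below is a minimal sketch for the spark-shell session above, assuming `show1` and `show2` are still in scope. The aliases `a` and `b` and the name `joined` are illustrative only, and the snippet has not been verified against the affected 3.2.0 build.

```scala
// Continues the spark-shell session from the report: `show1` and `show2`
// are the DataFrames returned by the two SHOW TABLES statements.
// spark-shell imports spark.implicits._ automatically, which provides $"...".
val joined = show1.as("a")                   // illustrative alias for the ns1 listing
  .join(show2.as("b"))                       // illustrative alias for the ns2 listing
  .where($"a.tableName" =!= $"b.tableName")  // qualified names instead of show1(...)/show2(...)

joined.show()
```

Outside spark-shell, `import spark.implicits._` is needed for the `$"..."` column syntax.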



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
