[jira] [Comment Edited] (SPARK-26727) CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException

2019-03-03 Thread Gabor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783074#comment-16783074
 ] 

Gabor Somogyi edited comment on SPARK-26727 at 3/4/19 7:55 AM:
---

[~Udbhav Agrawal] I was dealing with "failOnDataLoss=false should not return duplicated records: v1" and was doing some debugging on the latest master branch.
The test creates a "DontFailOnDataLoss" table which should be dropped at the end, but after a while the test started failing constantly with an exception stating that the table already exists.
I tried to work around this by setting an overwrite flag (sketched below), but that only produced further, different exceptions. Since that approach wasn't leading anywhere, I simply removed the directory instead.
Hope this info helps.
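For reference, a minimal sketch of what "setting an overwrite flag" can look like via the DataFrameWriter API; the SparkSession setup, the sample data, and the table name below are illustrative placeholders, not the actual test code:
{noformat}
import org.apache.spark.sql.SparkSession

// Hypothetical example: save mode "overwrite" replaces an existing table
// instead of failing with TableAlreadyExistsException.
val spark = SparkSession.builder().appName("overwrite-example").getOrCreate()
import spark.implicits._

val df = Seq(1, 2, 3).toDF("value")
// mode("overwrite") drops and recreates the table if it already exists.
df.write.mode("overwrite").saveAsTable("DontFailOnDataLoss")
{noformat}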


was (Author: gsomogyi):
I was dealing with "failOnDataLoss=false should not return duplicated records: v1" and was doing some debugging on the latest master branch.
The test creates a "DontFailOnDataLoss" table which should be dropped at the end, but after a while the test started failing constantly with an exception stating that the table already exists.
I tried to work around this by setting an overwrite flag, but that only produced further, different exceptions. Since that approach wasn't leading anywhere, I simply removed the directory instead.
Hope this info helps.

> CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException
> ---
>
> Key: SPARK-26727
> URL: https://issues.apache.org/jira/browse/SPARK-26727
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Srinivas Yarra
>Priority: Major
>
> We experienced that sometimes the Hive query "CREATE OR REPLACE VIEW <view name> AS SELECT <columns> FROM <table>" fails with the following exception:
> {code:java}
> // code placeholder
> org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: Table or 
> view '' already exists in database 'default'; at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:314)
>  at 
> org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:165) 
> at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3365) at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3364) at 
> org.apache.spark.sql.Dataset.<init>(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80) at 
> org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) ... 49 elided
> {code}
> {code}
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res1: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res2: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res3: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res4: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res5: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res6: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res7: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res8: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res9: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res10: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res11: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW 

[jira] [Comment Edited] (SPARK-26727) CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException

2019-03-01 Thread Gabor Somogyi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782162#comment-16782162
 ] 

Gabor Somogyi edited comment on SPARK-26727 at 3/1/19 10:21 PM:


I've seen this issue lately while dealing with a unit test, but that test created a table, not a view (latest master).


was (Author: gsomogyi):
I've seen this issue lately when I was dealing with a unit test but that test 
created a table and not a view.

> CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException
> ---
>
> Key: SPARK-26727
> URL: https://issues.apache.org/jira/browse/SPARK-26727
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Srinivas Yarra
>Priority: Major
>
> We experienced that sometimes the Hive query "CREATE OR REPLACE VIEW <view name> AS SELECT <columns> FROM <table>" fails with the following exception:
> {code:java}
> // code placeholder
> org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: Table or 
> view '' already exists in database 'default'; at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:314)
>  at 
> org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:165) 
> at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3365) at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3364) at 
> org.apache.spark.sql.Dataset.<init>(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80) at 
> org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) ... 49 elided
> {code}
> {code}
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res1: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res2: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res3: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res4: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res5: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res6: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res7: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res8: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res9: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res10: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res11: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") 
> org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: Table or 
> view 'testsparkreplace' already exists in database 'default'; at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply$mcV$sp(HiveExternalCatalog.scala:246)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:236)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:236)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:236)
>  at 
> 

[jira] [Comment Edited] (SPARK-26727) CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException

2019-02-05 Thread Laszlo Rigo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760604#comment-16760604
 ] 

Laszlo Rigo edited comment on SPARK-26727 at 2/5/19 10:05 AM:
--

For me these calls still seem to be asynchronous:
{noformat}
var i = 0
try {
  while (true) {
    println(i)
    spark.sql("DROP VIEW IF EXISTS testSparkReplace")
    spark.sql("CREATE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
    while (!spark.catalog.tableExists("testSparkReplace")) {}
    i = i + 1
  }
} catch {
  case e: Exception => e.printStackTrace()
}{noformat}
This script failed with the same exception when the value of 'i' was 183:
{noformat}
org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: 
AlreadyExistsException(message:Table testsparkreplace already exists);
at 
org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:236)
at 
org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94)
at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:319)
at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:175)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195)
at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3365)
at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3364)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:195)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
at $line17.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.liftedTree1$1(<console>:30)
at $line17.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:26)
at $line17.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:41)
at $line17.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(<console>:43)
at $line17.$read$$iw$$iw$$iw$$iw$$iw.<init>(<console>:45)
at $line17.$read$$iw$$iw$$iw$$iw.<init>(<console>:47)
at $line17.$read$$iw$$iw$$iw.<init>(<console>:49)
at $line17.$read$$iw$$iw.<init>(<console>:51)
at $line17.$read$$iw.<init>(<console>:53)
at $line17.$read.<init>(<console>:55)
at $line17.$read$.<init>(<console>:59)
at $line17.$read$.<clinit>(<console>)
at $line17.$eval$.$print$lzycompute(<console>:7)
at $line17.$eval$.$print(<console>:6)
at $line17.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:793)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1054)
at 
scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:645)
at 
scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:644)
at 
scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at 
scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
at 
scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:644)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:576)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:572)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:819)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:837)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:691)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:404)
at 

[jira] [Comment Edited] (SPARK-26727) CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException

2019-02-05 Thread Laszlo Rigo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755846#comment-16755846
 ] 

Laszlo Rigo edited comment on SPARK-26727 at 2/5/19 9:29 AM:
-

[~dongjoon], We have this hive-metastore rpm installed:

rpm -qi hive-metastore
 Name : hive-metastore Relocations: (not relocatable)
 Version : 1.1.0+cdh5.4.8+277 Vendor: (none)
 Release : 1.cdh5.4.8.p1373.1769.el6 Build Date: Fri 25 Mar 2016 05:46:42 PM CET
 Install Date: Tue 29 Jan 2019 07:48:54 PM CET Build Host: 
ec2-pkg-centos-6-1226.vpc.cloudera.com
 Group : System/Daemons Source RPM: 
hive-1.1.0+cdh5.4.8+277-1.cdh5.4.8.p1373.1769.el6.src.rpm
 Size : 5458 License: ASL 2.0
 Signature : DSA/SHA1, Fri 25 Mar 2016 06:12:22 PM CET, Key ID f90c0d8fe8f86acd
 URL : [http://hive.apache.org/]
 Summary : Shared metadata repository for Hive.
 Description :
 This optional package hosts a metadata server for Hive clients across a 
network to use.


hive --version
 Hive 1.1.0-cdh5.4.8
 Subversion 
[file:///data/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hive-1.1.0-cdh5.4.8]
 -r Unknown
 Compiled by jenkins on Fri Mar 25 09:38:39 PDT 2016
 From source with checksum e4569745d1b7e0b2785263766d99cffc


was (Author: rigolaszlo):
[~dongjoon], We have this hive-metastore rpm installed:

rpm -qi hive-metastore
 Name : hive-metastore Relocations: (not relocatable)
 Version : 1.1.0+cdh5.4.8+277 Vendor: (none)
 Release : 1.cdh5.4.8.p1373.1769.el6 Build Date: Fri 25 Mar 2016 05:46:42 PM CET
 Install Date: Tue 29 Jan 2019 07:48:54 PM CET Build Host: 
ec2-pkg-centos-6-1226.vpc.cloudera.com
 Group : System/Daemons Source RPM: 
hive-1.1.0+cdh5.4.8+277-1.cdh5.4.8.p1373.1769.el6.src.rpm
 Size : 5458 License: ASL 2.0
 Signature : DSA/SHA1, Fri 25 Mar 2016 06:12:22 PM CET, Key ID f90c0d8fe8f86acd
 URL : [http://hive.apache.org/]
 Summary : Shared metadata repository for Hive.
 Description :
 This optional package hosts a metadata server for Hive clients across a 
network to use.

 

# hive --version
Hive 1.1.0-cdh5.4.8
Subversion 
file:///data/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hive-1.1.0-cdh5.4.8
 -r Unknown
Compiled by jenkins on Fri Mar 25 09:38:39 PDT 2016
From source with checksum e4569745d1b7e0b2785263766d99cffc

> CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException
> ---
>
> Key: SPARK-26727
> URL: https://issues.apache.org/jira/browse/SPARK-26727
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Srinivas Yarra
>Priority: Major
>
> We experienced that sometimes the Hive query "CREATE OR REPLACE VIEW <view name> AS SELECT <columns> FROM <table>" fails with the following exception:
> {code:java}
> // code placeholder
> org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: Table or 
> view '' already exists in database 'default'; at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:314)
>  at 
> org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:165) 
> at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3365) at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3364) at 
> org.apache.spark.sql.Dataset.<init>(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80) at 
> org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) ... 49 elided
> {code}
> {code}
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res1: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res2: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res3: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res4: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as 

[jira] [Comment Edited] (SPARK-26727) CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException

2019-02-05 Thread Laszlo Rigo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755846#comment-16755846
 ] 

Laszlo Rigo edited comment on SPARK-26727 at 2/5/19 9:28 AM:
-

[~dongjoon], We have this hive-metastore rpm installed:

rpm -qi hive-metastore
 Name : hive-metastore Relocations: (not relocatable)
 Version : 1.1.0+cdh5.4.8+277 Vendor: (none)
 Release : 1.cdh5.4.8.p1373.1769.el6 Build Date: Fri 25 Mar 2016 05:46:42 PM CET
 Install Date: Tue 29 Jan 2019 07:48:54 PM CET Build Host: 
ec2-pkg-centos-6-1226.vpc.cloudera.com
 Group : System/Daemons Source RPM: 
hive-1.1.0+cdh5.4.8+277-1.cdh5.4.8.p1373.1769.el6.src.rpm
 Size : 5458 License: ASL 2.0
 Signature : DSA/SHA1, Fri 25 Mar 2016 06:12:22 PM CET, Key ID f90c0d8fe8f86acd
 URL : [http://hive.apache.org/]
 Summary : Shared metadata repository for Hive.
 Description :
 This optional package hosts a metadata server for Hive clients across a 
network to use.

 

# hive --version
Hive 1.1.0-cdh5.4.8
Subversion 
file:///data/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hive-1.1.0-cdh5.4.8
 -r Unknown
Compiled by jenkins on Fri Mar 25 09:38:39 PDT 2016
From source with checksum e4569745d1b7e0b2785263766d99cffc


was (Author: rigolaszlo):
[~dongjoon], We have this hive-metastore rpm installed:

rpm -qi hive-metastore
Name : hive-metastore Relocations: (not relocatable)
Version : 1.1.0+cdh5.4.8+277 Vendor: (none)
Release : 1.cdh5.4.8.p1373.1769.el6 Build Date: Fri 25 Mar 2016 05:46:42 PM CET
Install Date: Tue 29 Jan 2019 07:48:54 PM CET Build Host: 
ec2-pkg-centos-6-1226.vpc.cloudera.com
Group : System/Daemons Source RPM: 
hive-1.1.0+cdh5.4.8+277-1.cdh5.4.8.p1373.1769.el6.src.rpm
Size : 5458 License: ASL 2.0
Signature : DSA/SHA1, Fri 25 Mar 2016 06:12:22 PM CET, Key ID f90c0d8fe8f86acd
URL : http://hive.apache.org/
Summary : Shared metadata repository for Hive.
Description :
This optional package hosts a metadata server for Hive clients across a network 
to use.

> CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException
> ---
>
> Key: SPARK-26727
> URL: https://issues.apache.org/jira/browse/SPARK-26727
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Srinivas Yarra
>Priority: Major
>
> We experienced that sometimes the Hive query "CREATE OR REPLACE VIEW <view name> AS SELECT <columns> FROM <table>" fails with the following exception:
> {code:java}
> // code placeholder
> org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: Table or 
> view '' already exists in database 'default'; at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:314)
>  at 
> org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:165) 
> at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3365) at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3364) at 
> org.apache.spark.sql.Dataset.<init>(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80) at 
> org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) ... 49 elided
> {code}
> {code}
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res1: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res2: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res3: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res4: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res5: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res6: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 

[jira] [Comment Edited] (SPARK-26727) CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException

2019-01-25 Thread Laszlo Rigo (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752203#comment-16752203
 ] 

Laszlo Rigo edited comment on SPARK-26727 at 1/25/19 12:09 PM:
---

Hi,

It can be reproduced with a scala snippet in spark-shell:
{noformat}
try {
  while (true) {
    spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
  }
} catch {
  case e: Exception => e.printStackTrace()
}{noformat}
It fails within a few seconds with the exception.

We found out that it might happen because of this commit: 
[https://github.com/apache/spark/commit/e862dc904963cf7832bafc1d3d0ea9090bbddd81]
 

In 
[sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala|https://github.com/apache/spark/commit/e862dc904963cf7832bafc1d3d0ea9090bbddd81#diff-cfec39cf8accabd227bd325f0a0a5f30]
 the dropTable and createTable functions are called in sequence, but dropTable appears to behave asynchronously: after the exception the view is indeed gone (so the drop did happen), yet it was apparently still visible at the moment createTable was called.
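To make that sequence concrete, here is a minimal sketch of the drop-then-create flow described above; the Catalog trait and its methods are simplified stand-ins rather than Spark's SessionCatalog API, and only illustrate why a non-atomic replace can race with a metastore drop that has not become visible yet:
{noformat}
// Simplified model of the race: "replace" is implemented as drop + create,
// so if the drop is still in flight on the metastore side, the create can
// still see the old entry and fail with "table already exists".
trait Catalog {
  def tableExists(name: String): Boolean
  def dropTable(name: String): Unit    // may complete asynchronously
  def createTable(name: String): Unit  // throws if the name still exists
}

def createOrReplaceView(catalog: Catalog, name: String): Unit = {
  if (catalog.tableExists(name)) {
    catalog.dropTable(name)
  }
  // Window for the race: nothing guarantees the drop above is visible yet.
  catalog.createTable(name)
}
{noformat}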

We also tried to connect to the same Hive database with Spark 2.1 and we did 
not experience this issue by running the scala script above.


was (Author: rigolaszlo):
Hi,

It can be reproduced with a scala snippet in spark-shell:
{noformat}
try {
  while (true) {
    spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy FROM ae_dual")
  }
} catch {
  case e: Exception => e.printStackTrace()
}{noformat}
It fails within a few seconds with the exception.

We found out that it might happen because of this commit: 
[https://github.com/apache/spark/commit/e862dc904963cf7832bafc1d3d0ea9090bbddd81]
 

In 
[sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala|https://github.com/apache/spark/commit/e862dc904963cf7832bafc1d3d0ea9090bbddd81#diff-cfec39cf8accabd227bd325f0a0a5f30]
 the dropTable and createTable functions are called, but the dropTable seems to 
be an asynchronous call, because after the exception the view is missing, so it 
is dropped, but it is still available during the createTable function call.

> CREATE OR REPLACE VIEW query fails with TableAlreadyExistsException
> ---
>
> Key: SPARK-26727
> URL: https://issues.apache.org/jira/browse/SPARK-26727
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Srinivas Yarra
>Priority: Major
>
> We experienced that sometimes the Hive query "CREATE OR REPLACE VIEW <view name> AS SELECT <columns> FROM <table>" fails with the following exception:
> {code:java}
> // code placeholder
> org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: Table or 
> view '' already exists in database 'default'; at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:314)
>  at 
> org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:165) 
> at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3365) at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3364) at 
> org.apache.spark.sql.Dataset.<init>(Dataset.scala:195) at 
> org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80) at 
> org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) ... 49 elided
> {code}
> {code}
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res1: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res2: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res3: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res4: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as SELECT dummy 
> FROM ae_dual") res5: org.apache.spark.sql.DataFrame = [] 
> scala> spark.sql("CREATE OR REPLACE VIEW testSparkReplace as