[jira] [Commented] (SPARK-15269) Creating external table leaves empty directory under warehouse directory
[ https://issues.apache.org/jira/browse/SPARK-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297446#comment-15297446 ]

Apache Spark commented on SPARK-15269:
--------------------------------------

User 'liancheng' has created a pull request for this issue:
https://github.com/apache/spark/pull/13270

> Creating external table leaves empty directory under warehouse directory
> -------------------------------------------------------------------------
>
>                 Key: SPARK-15269
>                 URL: https://issues.apache.org/jira/browse/SPARK-15269
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL, Tests
>    Affects Versions: 2.0.0
>            Reporter: Cheng Lian
>            Assignee: Xin Wu
>
> Adding the following test case in {{HiveDDLSuite}} may reproduce this issue:
> {code}
> test("foo") {
>   withTempPath { dir =>
>     val path = dir.getCanonicalPath
>     spark.range(1).write.json(path)
>     withTable("ddl_test1") {
>       sql(s"CREATE TABLE ddl_test1 USING json OPTIONS (PATH '$path')")
>       sql("DROP TABLE ddl_test1")
>       sql(s"CREATE TABLE ddl_test1 USING json AS SELECT 1 AS a")
>     }
>   }
> }
> {code}
> Note that the first {{CREATE TABLE}} command creates an external table, since
> data source tables are always external when the {{PATH}} option is specified.
> When executing the second {{CREATE TABLE}} command, which creates a managed
> table with the same name, it fails because there's already an unexpected
> directory with the same name as the table name in the warehouse directory:
> {noformat}
> [info] - foo *** FAILED *** (7 seconds, 649 milliseconds)
> [info]   org.apache.spark.sql.AnalysisException: path file:/Users/lian/local/src/spark/workspace-b/target/tmp/warehouse-205e25e7-8918-4615-acf1-10e06af7c35c/ddl_test1 already exists.;
> [info]   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation.run(InsertIntoHadoopFsRelation.scala:88)
> [info]   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:57)
> [info]   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:55)
> [info]   at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:69)
> [info]   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
> [info]   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
> [info]   at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
> [info]   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> [info]   at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
> [info]   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
> [info]   at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:85)
> [info]   at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:85)
> [info]   at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:417)
> [info]   at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:231)
> [info]   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:57)
> [info]   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:55)
> [info]   at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:69)
> [info]   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
> [info]   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
> [info]   at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
> [info]   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> [info]   at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
> [info]   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
> [info]   at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:85)
> [info]   at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:85)
> [info]   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
> [info]   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
> [info]   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:62)
> [info]   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:541)
> [info]   at org.apache.spark.sql.test.SQLTestUtils$$anonfun$sql$1.apply(SQLTestUtils.scala:59)
> [info]   at org.apache.spark.sql.test.SQLTestUtils$$anonfun$sql$1.apply(SQLTestUtils.scala:59)
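For illustration, a minimal sketch of a follow-up assertion one could add to the repro above. It is not part of the original test, and it assumes the warehouse lives on the local filesystem so the {{file:}} prefix can simply be stripped:

{code}
// Hypothetical check, not in the original test: right after the DROP TABLE,
// the warehouse still holds an empty ddl_test1 directory -- the stray
// directory this ticket is about. Assumes a local file: warehouse.
import java.io.File
val warehouseDir = spark.conf.get("spark.sql.warehouse.dir").stripPrefix("file:")
val stray = new File(warehouseDir, "ddl_test1")
assert(stray.isDirectory && stray.list().isEmpty)
{code}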
[jira] [Commented] (SPARK-15269) Creating external table leaves empty directory under warehouse directory
[ https://issues.apache.org/jira/browse/SPARK-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297408#comment-15297408 ]

Cheng Lian commented on SPARK-15269:
------------------------------------

Two facts make this issue pretty hard to fix cleanly:

# When persisting an external Spark SQL data source table to the Hive metastore, we can't store the data location URI of the external table in the standard Hive {{o.a.h.hive.ql.metadata.Table.dataLocation}} field, because Hive only accepts directory paths as location URIs, while Spark SQL also allows reading from a single file. For this reason, we have to store the actual data location as a SerDe property and ignore the standard {{dataLocation}} field.
# When creating a table, {{Hive.createTable}} always tries to create an empty table directory under the default warehouse directory when {{o.a.h.hive.ql.metadata.Table.dataLocation}} is null. However, for external tables, this directory won't be deleted when the table is dropped.

This leads to the following contradiction:

- We can't set {{Table.dataLocation}}, because it has to be a directory path, while we must also allow file paths as data locations.
- We have to set {{Table.dataLocation}}, because otherwise Hive creates an unexpected empty directory but doesn't remove it when the external table is dropped, which causes the bug described in this ticket.

Here are two options:

# Work around this contradiction by setting {{Table.dataLocation}} to a random location and then deleting it manually after creating the external table.
#- Pros: Fixes the bug and keeps backward compatibility.
#- Cons: Sounds like a pretty ad hoc, dirty fix.
# As Hive does, only allow directory paths as data locations when creating Spark SQL external data source tables in Spark 2.0.
#- Pros: Cleaner fix.
#- Cons: Breaks backward compatibility.

I'm working on a fix using the first approach; a sketch of the cleanup idea follows below.
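For illustration only, a minimal sketch of the cleanup idea behind the first option, written against the Hadoop {{FileSystem}} API. {{dropStrayTableDir}} is a hypothetical helper, not code from the actual patch:

{code}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical helper (not from the actual patch): remove the empty
// <warehouse>/<table> directory that Hive.createTable can leave behind
// when Table.dataLocation is null.
def dropStrayTableDir(warehouseDir: String, tableName: String): Unit = {
  val dir = new Path(warehouseDir, tableName)
  val fs = dir.getFileSystem(new Configuration())
  // Only delete an existing, empty directory; a non-empty directory
  // holds real data and must be left alone.
  if (fs.exists(dir) && fs.listStatus(dir).isEmpty) {
    fs.delete(dir, /* recursive = */ false)
  }
}
{code}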
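Relatedly, a hypothetical way to observe the first fact above from a session, while the external table from the repro still exists; the exact Spark 2.0 output layout is an assumption here, not verified:

{code}
// Hypothetical inspection (assumed output layout): for a data source
// table, the real data location is expected to show up as a "path"
// storage/serde property rather than in Hive's standard location field.
spark.sql("DESCRIBE FORMATTED ddl_test1").show(100, truncate = false)
{code}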
[jira] [Commented] (SPARK-15269) Creating external table leaves empty directory under warehouse directory
[ https://issues.apache.org/jira/browse/SPARK-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283685#comment-15283685 ]

Apache Spark commented on SPARK-15269:
--------------------------------------

User 'xwu0226' has created a pull request for this issue:
https://github.com/apache/spark/pull/13120
[jira] [Commented] (SPARK-15269) Creating external table leaves empty directory under warehouse directory
[ https://issues.apache.org/jira/browse/SPARK-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282215#comment-15282215 ]

Xin Wu commented on SPARK-15269:
--------------------------------

FYI: the reason the default database paths obtained in different ways differed, as mentioned above, is that I had an older metastore_db in my SPARK_HOME, where the metastore database keeps the old hive.metastore.warehouse.dir value (/user/hive/warehouse). After removing this metastore_db, the database path is now consistent.
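For illustration, one way to spot this kind of mismatch: a minimal sketch, assuming a Spark 2.0 {{SparkSession}} named {{spark}}:

{code}
// Compare the configured warehouse directory with the location the
// metastore reports for the default database; a stale local metastore_db
// can make the two disagree.
println(spark.conf.get("spark.sql.warehouse.dir"))
spark.sql("DESCRIBE DATABASE default").show(truncate = false)
{code}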
[jira] [Commented] (SPARK-15269) Creating external table leaves empty directory under warehouse directory
[ https://issues.apache.org/jira/browse/SPARK-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281748#comment-15281748 ]

Xin Wu commented on SPARK-15269:
--------------------------------

Yes, I can. Thanks!