[jira] [Commented] (SPARK-15269) Creating external table leaves empty directory under warehouse directory
[ https://issues.apache.org/jira/browse/SPARK-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297446#comment-15297446 ]

Apache Spark commented on SPARK-15269:
--------------------------------------

User 'liancheng' has created a pull request for this issue:
https://github.com/apache/spark/pull/13270

> Creating external table leaves empty directory under warehouse directory
> -------------------------------------------------------------------------
>
>                 Key: SPARK-15269
>                 URL: https://issues.apache.org/jira/browse/SPARK-15269
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL, Tests
>    Affects Versions: 2.0.0
>            Reporter: Cheng Lian
>            Assignee: Xin Wu
>
> Adding the following test case in {{HiveDDLSuite}} may reproduce this issue:
> {code}
> test("foo") {
>   withTempPath { dir =>
>     val path = dir.getCanonicalPath
>     spark.range(1).write.json(path)
>     withTable("ddl_test1") {
>       sql(s"CREATE TABLE ddl_test1 USING json OPTIONS (PATH '$path')")
>       sql("DROP TABLE ddl_test1")
>       sql(s"CREATE TABLE ddl_test1 USING json AS SELECT 1 AS a")
>     }
>   }
> }
> {code}
> Note that the first {{CREATE TABLE}} command creates an external table, since
> data source tables are always external when the {{PATH}} option is specified.
> When executing the second {{CREATE TABLE}} command, which creates a managed
> table with the same name, it fails because there's already an unexpected
> directory with the same name as the table name in the warehouse directory:
> {noformat}
> [info] - foo *** FAILED *** (7 seconds, 649 milliseconds)
> [info]   org.apache.spark.sql.AnalysisException: path file:/Users/lian/local/src/spark/workspace-b/target/tmp/warehouse-205e25e7-8918-4615-acf1-10e06af7c35c/ddl_test1 already exists.;
> [info]   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation.run(InsertIntoHadoopFsRelation.scala:88)
> [info]   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:57)
> [info]   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:55)
> [info]   at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:69)
> [info]   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
> [info]   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
> [info]   at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
> [info]   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> [info]   at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
> [info]   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
> [info]   at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:85)
> [info]   at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:85)
> [info]   at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:417)
> [info]   at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:231)
> [info]   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:57)
> [info]   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:55)
> [info]   at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:69)
> [info]   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
> [info]   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
> [info]   at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
> [info]   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> [info]   at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
> [info]   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
> [info]   at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:85)
> [info]   at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:85)
> [info]   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
> [info]   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
> [info]   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:62)
> [info]   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:541)
> [info]   at org.apache.spark.sql.test.SQLTestUtils$$anonfun$sql$1.apply(SQLTestUtils.scala:59)
> [info]   at org.apache.spark.sql.test.SQLTestUtils$$anonfun$sql$1.apply(SQLTestUtils.scala:59)
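For illustration, a minimal sketch of a follow-up assertion one could add to the repro above. It is not part of the original test, and it assumes the warehouse lives on the local filesystem so the {{file:}} prefix can simply be stripped:

{code}
// Hypothetical check, not in the original test: right after the DROP TABLE,
// the warehouse still holds an empty ddl_test1 directory -- the stray
// directory this ticket is about. Assumes a local file: warehouse.
import java.io.File
val warehouseDir = spark.conf.get("spark.sql.warehouse.dir").stripPrefix("file:")
val stray = new File(warehouseDir, "ddl_test1")
assert(stray.isDirectory && stray.list().isEmpty)
{code}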
[jira] [Commented] (SPARK-15269) Creating external table leaves empty directory under warehouse directory
[ https://issues.apache.org/jira/browse/SPARK-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297408#comment-15297408 ]

Cheng Lian commented on SPARK-15269:
------------------------------------

Two facts make this issue pretty hard to fix cleanly:

# When persisting an external Spark SQL data source table to the Hive metastore, we can't store the data location URI of the external table in the standard Hive {{o.a.h.hive.ql.metadata.Table.dataLocation}} field, because Hive only accepts directory paths as location URIs, while Spark SQL also allows reading from a single file. For this reason, we have to store the actual data location as a SerDe property and ignore the standard {{dataLocation}} field.
# When creating a table, {{Hive.createTable}} always tries to create an empty table directory under the default warehouse directory when {{o.a.h.hive.ql.metadata.Table.dataLocation}} is null. However, for external tables, this directory won't be deleted when the table is dropped.

This leads to the following contradiction:

- We can't set {{Table.dataLocation}}, because it has to be a directory path, while we must also allow file paths as data locations.
- We have to set {{Table.dataLocation}}, because otherwise Hive creates an unexpected empty directory but doesn't remove it when the external table is dropped, which causes the bug described in this ticket.

Here are two options:

# Work around this contradiction by setting {{Table.dataLocation}} to a random location and then deleting it manually after creating the external table.
#- Pros: Fixes the bug and keeps backward compatibility.
#- Cons: Sounds like a pretty ad hoc, dirty fix.
# As Hive does, only allow directory paths as data locations when creating Spark SQL external data source tables in Spark 2.0.
#- Pros: Cleaner fix.
#- Cons: Breaks backward compatibility.

I'm working on a fix using the first approach; a sketch of the cleanup idea follows below.
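For illustration only, a minimal sketch of the cleanup idea behind the first option, written against the Hadoop {{FileSystem}} API. {{dropStrayTableDir}} is a hypothetical helper, not code from the actual patch:

{code}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical helper (not from the actual patch): remove the empty
// <warehouse>/<table> directory that Hive.createTable can leave behind
// when Table.dataLocation is null.
def dropStrayTableDir(warehouseDir: String, tableName: String): Unit = {
  val dir = new Path(warehouseDir, tableName)
  val fs = dir.getFileSystem(new Configuration())
  // Only delete an existing, empty directory; a non-empty directory
  // holds real data and must be left alone.
  if (fs.exists(dir) && fs.listStatus(dir).isEmpty) {
    fs.delete(dir, /* recursive = */ false)
  }
}
{code}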
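Relatedly, a hypothetical way to observe the first fact above from a session, while the external table from the repro still exists; the exact Spark 2.0 output layout is an assumption here, not verified:

{code}
// Hypothetical inspection (assumed output layout): for a data source
// table, the real data location is expected to show up as a "path"
// storage/serde property rather than in Hive's standard location field.
spark.sql("DESCRIBE FORMATTED ddl_test1").show(100, truncate = false)
{code}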
[jira] [Commented] (SPARK-15269) Creating external table leaves empty directory under warehouse directory
[ https://issues.apache.org/jira/browse/SPARK-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283685#comment-15283685 ]

Apache Spark commented on SPARK-15269:
--------------------------------------

User 'xwu0226' has created a pull request for this issue:
https://github.com/apache/spark/pull/13120
[jira] [Commented] (SPARK-15269) Creating external table leaves empty directory under warehouse directory
[ https://issues.apache.org/jira/browse/SPARK-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282215#comment-15282215 ]

Xin Wu commented on SPARK-15269:
--------------------------------

FYI: the reason the default database paths obtained in different ways differed, as mentioned above, is that I had an older metastore_db in my SPARK_HOME, where the metastore database keeps the old hive.metastore.warehouse.dir value (/user/hive/warehouse). After removing this metastore_db, the database path is now consistent.
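For illustration, one way to spot this kind of mismatch: a minimal sketch, assuming a Spark 2.0 {{SparkSession}} named {{spark}}:

{code}
// Compare the configured warehouse directory with the location the
// metastore reports for the default database; a stale local metastore_db
// can make the two disagree.
println(spark.conf.get("spark.sql.warehouse.dir"))
spark.sql("DESCRIBE DATABASE default").show(truncate = false)
{code}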
[jira] [Commented] (SPARK-15269) Creating external table leaves empty directory under warehouse directory
[ https://issues.apache.org/jira/browse/SPARK-15269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281748#comment-15281748 ]

Xin Wu commented on SPARK-15269:
--------------------------------

Yes, I can. Thanks!