liupengcheng created SPARK-28195:
------------------------------------

             Summary: CheckAnalysis not working for Command and reports misleading error message
                 Key: SPARK-28195
                 URL: https://issues.apache.org/jira/browse/SPARK-28195
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.3.2
            Reporter: liupengcheng


We encountered an issue when executing `InsertIntoDataSourceDirCommand`: its 
query relied on a non-existent table or view, but instead of a table-not-found 
error we got the following misleading error message:
{code:java}
Caused by: org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to dataType on unresolved object, tree: 'kr.objective_id
at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute.dataType(unresolved.scala:105)
at org.apache.spark.sql.types.StructType$$anonfun$fromAttributes$1.apply(StructType.scala:440)
at org.apache.spark.sql.types.StructType$$anonfun$fromAttributes$1.apply(StructType.scala:440)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.spark.sql.types.StructType$.fromAttributes(StructType.scala:440)
at org.apache.spark.sql.catalyst.plans.QueryPlan.schema$lzycompute(QueryPlan.scala:159)
at org.apache.spark.sql.catalyst.plans.QueryPlan.schema(QueryPlan.scala:159)
at org.apache.spark.sql.execution.datasources.DataSource.planForWriting(DataSource.scala:544)
at org.apache.spark.sql.execution.command.InsertIntoDataSourceDirCommand.run(InsertIntoDataSourceDirCommand.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
at org.apache.spark.sql.execution.adaptive.QueryStage.executeCollect(QueryStage.scala:246)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3277)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3276)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:277)
... 11 more

{code}
After looking into the code, I found that this is because we have supported the 
`runSQLOnFiles` feature since 2.3: if a table does not exist and is not a 
temporary table, the query is treated as running directly on files.

The `ResolveSQLOnFile` rule tries to resolve it and returns an 
`UnresolvedRelation` on failure (it is not actually SQL on files, so resolution 
fails). Because a Command has no children, `CheckAnalysis` skips checking the 
`UnresolvedRelation`, and we finally get the misleading error message above 
when executing the command.
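The skip can be illustrated with a toy tree (illustrative classes only, not Spark's actual `TreeNode`/`Command` API): a command keeps its query as a plain field, so `children` is empty and a bottom-up traversal never reaches the unresolved relation inside the query.

{code:scala}
// Toy model of why CheckAnalysis misses the unresolved relation:
// the command holds its query as a field, not as a child node.
sealed trait Node {
  def children: Seq[Node]
  def foreachUp(f: Node => Unit): Unit = {
    children.foreach(_.foreachUp(f))
    f(this)
  }
}
case class UnresolvedRelation(name: String) extends Node {
  val children: Seq[Node] = Nil
}
case class InsertCommand(query: Node) extends Node {
  // The query is held as a field, NOT exposed via children.
  val children: Seq[Node] = Nil
}

var visited = List.empty[String]
InsertCommand(UnresolvedRelation("kr")).foreachUp {
  case _: UnresolvedRelation => visited :+= "UnresolvedRelation"
  case _: InsertCommand      => visited :+= "InsertCommand"
}
// visited == List("InsertCommand"): the UnresolvedRelation was never checked.
{code}
So the bad relation survives analysis and only blows up later, when execution asks the unresolved attribute for its dataType.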

I think maybe we should run checkAnalysis on the command's query plan. Or is 
there a reason for not checking analysis for commands?
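One possible shape of the fix, sketched against toy types (illustrative, not Spark's actual `CheckAnalysis` API): when the checker meets a command, descend into the query it holds and fail fast on anything still unresolved.

{code:scala}
// Hypothetical sketch: make the analysis check recurse into a command's
// held query so an unresolved relation fails with a clear message.
sealed trait Plan { def children: Seq[Plan] = Nil }
case class Unresolved(table: String) extends Plan
case class Resolved(table: String) extends Plan
case class DirInsertCommand(query: Plan) extends Plan

def checkAnalysis(plan: Plan): Unit = plan match {
  case DirInsertCommand(query) => checkAnalysis(query) // descend into the held query
  case Unresolved(t) =>
    throw new IllegalArgumentException(s"Table or view not found: $t")
  case _ => plan.children.foreach(checkAnalysis)
}

// A resolved query passes; an unresolved one now fails early and clearly.
checkAnalysis(DirInsertCommand(Resolved("t1")))
val err =
  try { checkAnalysis(DirInsertCommand(Unresolved("kr"))); "" }
  catch { case e: IllegalArgumentException => e.getMessage }
// err == "Table or view not found: kr"
{code}
With a check like this, the example above would fail at analysis time with a table-not-found error rather than the UnresolvedException from the stack trace.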



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
