Sanchay Javeria created LIVY-770:
------------------------------------

             Summary: Livy sql session doesn't return the correct errors
                 Key: LIVY-770
                 URL: https://issues.apache.org/jira/browse/LIVY-770
             Project: Livy
          Issue Type: Bug
          Components: Server
         Environment: Ubuntu18
            Reporter: Sanchay Javeria

A Livy session of kind "sql" doesn't always return the correct error message for failed SQL queries. For example, run any query in a SQL session on a partitioned table without specifying a partition predicate:

1) Create the session:
{code:java}
curl --location --request POST 'http://<livy_instance>:8088/sessions' --header 'Content-Type: application/json' --data-raw '{"kind": "sql", "conf":{"livy.spark.master":"yarn"}}'
{code}

2) Submit the statement:
{code:java}
curl --location --request POST 'http://<livy_instance>:8088/sessions/0/statements' --header 'Content-Type: application/json' --data-raw '{"code": "select * from default.partitioned_table limit 1"}'
{code}

Livy returns this stack trace, which contains only the frames of the outermost exception:
{code:java}
Traceback: ['org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198)', 'org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159)', 'org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)', 'org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)', 'org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)', 'org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)', 'org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)', 'org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)', 'org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)', 'org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)', 'org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)', 'org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)', 'org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)', 'org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)', 
'org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)', 'org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)', 'org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)', 'org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)', 'org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)', 'org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)', 'org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)', 'org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)', 'org.apache.spark.sql.DataFrameWriter.json(DataFrameWriter.scala:545)', 'org.apache.livy.repl.SQLInterpreter.execute(SQLInterpreter.scala:104)', 'org.apache.livy.repl.Session$$anonfun$7.apply(Session.scala:274)', 'org.apache.livy.repl.Session$$anonfun$7.apply(Session.scala:272)', 'scala.Option.map(Option.scala:146)', 'org.apache.livy.repl.Session.org$apache$livy$repl$Session$$executeCode(Session.scala:272)', 'org.apache.livy.repl.Session$$anonfun$execute$1.apply$mcV$sp(Session.scala:168)', 'org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163)', 'org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163)', 'scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)', 'scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)', 'java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)', 'java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)', 'java.lang.Thread.run(Thread.java:748)'] {code} However, the real stack trace in the driver logs will be something like: {code:java} 20/05/08 19:19:56 WARN repl.SQLInterpreter: Fail to execute query select * from default.partitioned_table where limit 1 org.apache.spark.SparkException: Job aborted. 
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198) at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159) at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104) at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102) at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80) at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80) at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676) at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73) at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676) at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285) at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271) at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229) 
at org.apache.spark.sql.DataFrameWriter.json(DataFrameWriter.scala:545) at org.apache.livy.repl.SQLInterpreter.execute(SQLInterpreter.scala:104) at org.apache.livy.repl.Session$$anonfun$7.apply(Session.scala:274) at org.apache.livy.repl.Session$$anonfun$7.apply(Session.scala:272) at scala.Option.map(Option.scala:146) at org.apache.livy.repl.Session.org$apache$livy$repl$Session$$executeCode(Session.scala:272) at org.apache.livy.repl.Session$$anonfun$execute$1.apply$mcV$sp(Session.scala:168) at org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163) at org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163) at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree: Exchange SinglePartition +- *(1) LocalLimit 1 +- Scan hive default.unique_actions_sa [userid#5L, action_type#6, count#7, count_sa#8, dt#9], HiveTableRelation `default`.`unique_actions_sa`, org.apache.hadoop.hive.ql.io.orc.OrcSerde, [userid#5L, action_type#6, count#7, count_sa#8], [dt#9] at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56) at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.doExecute(ShuffleExchangeExec.scala:119) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at 
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:391) at org.apache.spark.sql.execution.BaseLimitExec$class.inputRDDs(limit.scala:62) at org.apache.spark.sql.execution.GlobalLimitExec.inputRDDs(limit.scala:108) at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:627) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:143)
... 35 more
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: No partition predicate found for partitioned table default.partitioned_table.
{code}

Note the last line: *Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: No partition predicate found for partitioned table default.partitioned_table.*

Is there a way for Livy to fetch this root cause and return it to the user, so that they don't have to dig through the driver logs?
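
One possible direction, as a rough sketch only: the driver log shows the useful message sitting at the end of the "Caused by" chain, so whatever code builds the error response could walk `Throwable.getCause()` down to the innermost exception and report that. The class and method names below (`RootCauseSketch`, `rootCause`, `describe`) are hypothetical and not part of Livy, and plain JDK exceptions stand in for the Spark and Hive exception types so the example runs without those libraries on the classpath.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/** Hypothetical helper, not part of Livy: finds the innermost cause of a failure. */
final class RootCauseSketch {

    /** Follows getCause() to the end of the chain, stopping if the chain loops back on itself. */
    static Throwable rootCause(Throwable error) {
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Throwable current = error;
        while (current.getCause() != null && seen.add(current)) {
            current = current.getCause();
        }
        return current;
    }

    /** Formats the root cause as "ClassName: message", similar to a "Caused by" log line. */
    static String describe(Throwable error) {
        Throwable root = rootCause(error);
        return root.getClass().getName() + ": " + root.getMessage();
    }

    public static void main(String[] args) {
        // Stand-ins for SparkException -> TreeNodeException -> SemanticException.
        Throwable failure = new RuntimeException(
            "Job aborted.",
            new IllegalStateException(
                "execute, tree: Exchange SinglePartition",
                new IllegalArgumentException(
                    "No partition predicate found for partitioned table default.partitioned_table.")));

        // Prints the innermost exception's class name and message.
        System.out.println(describe(failure));
    }
}
```

Whether the statement's error value should be replaced by the root cause or merely extended with it is a design choice this sketch does not settle; keeping the outer "Job aborted." message alongside the root cause would preserve the existing behaviour for clients that match on it.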