[ https://issues.apache.org/jira/browse/LIVY-770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

RightBitShift updated LIVY-770:
-------------------------------
    Description: 
A Livy session with kind "sql" doesn't always return the correct error message for failed SQL queries.

For example, run any query in a SQL session against a partitioned table without specifying a partition predicate:

1) Create the SQL session:
{code:java}
curl --location --request POST 'http://<livy_instance>:8088/sessions' --header 
'Content-Type: application/json' --data-raw '{"kind": "sql", 
"conf":{"livy.spark.master":"yarn"}}'{code}
 

2) Submit a query against the partitioned table:
{code:java}
curl --location --request POST 
'http://<livy_instance>:8088/sessions/0/statements' --header 'Content-Type: 
application/json' --data-raw '{"code": "select * from default.partitioned_table 
limit 1"}'{code}
 

Livy returns this stack trace in the statement output:

 
{code:java}
Traceback: 
['org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198)',
 
'org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159)',
 
'org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)',
 
'org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)',
 
'org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)',
 
'org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)',
 
'org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)',
 
'org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)',
 
'org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)',
 'org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)', 
'org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)', 
'org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)',
 
'org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)', 
'org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)',
 
'org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)',
 
'org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)',
 
'org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)',
 
'org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)',
 'org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)', 
'org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)',
 'org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)', 
'org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)', 
'org.apache.spark.sql.DataFrameWriter.json(DataFrameWriter.scala:545)', 
'org.apache.livy.repl.SQLInterpreter.execute(SQLInterpreter.scala:104)', 
'org.apache.livy.repl.Session$$anonfun$7.apply(Session.scala:274)', 
'org.apache.livy.repl.Session$$anonfun$7.apply(Session.scala:272)', 
'scala.Option.map(Option.scala:146)', 
'org.apache.livy.repl.Session.org$apache$livy$repl$Session$$executeCode(Session.scala:272)',
 
'org.apache.livy.repl.Session$$anonfun$execute$1.apply$mcV$sp(Session.scala:168)',
 'org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163)', 
'org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163)', 
'scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)',
 'scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)', 
'java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)',
 
'java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)',
 'java.lang.Thread.run(Thread.java:748)']
{code}
 

However, the real stack trace in the driver logs will be something like: 

 
{code:java}
20/05/08 19:19:56 WARN repl.SQLInterpreter: Fail to execute query select * from 
default.partitioned_table limit 1
org.apache.spark.SparkException: Job aborted.
        at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198)
        at 
org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159)
        at 
org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
        at 
org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
        at 
org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
        at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
        at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
        at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
        at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at 
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
        at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
        at 
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
        at 
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
        at 
org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
        at 
org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
        at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
        at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
        at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
        at 
org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
        at 
org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
        at org.apache.spark.sql.DataFrameWriter.json(DataFrameWriter.scala:545)
        at org.apache.livy.repl.SQLInterpreter.execute(SQLInterpreter.scala:104)
        at org.apache.livy.repl.Session$$anonfun$7.apply(Session.scala:274)
        at org.apache.livy.repl.Session$$anonfun$7.apply(Session.scala:272)
        at scala.Option.map(Option.scala:146)
        at 
org.apache.livy.repl.Session.org$apache$livy$repl$Session$$executeCode(Session.scala:272)
        at 
org.apache.livy.repl.Session$$anonfun$execute$1.apply$mcV$sp(Session.scala:168)
        at 
org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163)
        at 
org.apache.livy.repl.Session$$anonfun$execute$1.apply(Session.scala:163)
        at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
        at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: 
execute, tree:
Exchange SinglePartition
+- *(1) LocalLimit 1
   +- Scan hive default.unique_actions_sa [userid#5L, action_type#6, count#7, 
count_sa#8, dt#9], HiveTableRelation `default`.`unique_actions_sa`, 
org.apache.hadoop.hive.ql.io.orc.OrcSerde, [userid#5L, action_type#6, count#7, 
count_sa#8], [dt#9]

        at 
org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)
        at 
org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.doExecute(ShuffleExchangeExec.scala:119)
        at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
        at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
        at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
        at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at 
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
        at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
        at 
org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:391)
        at 
org.apache.spark.sql.execution.BaseLimitExec$class.inputRDDs(limit.scala:62)
        at 
org.apache.spark.sql.execution.GlobalLimitExec.inputRDDs(limit.scala:108)
        at 
org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:627)
        at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
        at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
        at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
        at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at 
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
        at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
        at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:143)
        ... 35 more
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: No partition 
predicate found for partitioned table
 default.partitioned_table.
{code}
 

Notice the root cause at the end of the trace: *Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: No partition predicate found for partitioned table default.partitioned_table.*

Is there any way Livy can fetch this root cause and return it to the user, without having to dig through the driver logs?
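
One possible direction, sketched below. This is only a minimal sketch of the idea, not the actual Livy implementation: before building the traceback, walk the exception's cause chain the same way Throwable.printStackTrace does, so the final SemanticException is included in what Livy returns. The {{TracebackUtil}}/{{fullTraceback}} names are made up for illustration, and wiring this into {{SQLInterpreter.execute}} is an assumption about where a fix could go.

{code:scala}
import scala.annotation.tailrec

// Hypothetical helper, not existing Livy code: flattens an exception and all
// of its causes into one list of lines, similar to Throwable.printStackTrace
// (the "... 35 more" common-frame trimming is omitted for brevity).
object TracebackUtil {
  def fullTraceback(e: Throwable): List[String] = {
    @tailrec
    def loop(t: Throwable, prefix: String, acc: List[String]): List[String] = {
      val header = s"$prefix${t.getClass.getName}: ${t.getMessage}"
      val frames = t.getStackTrace.toList.map(frame => "  at " + frame)
      val lines  = acc ++ (header :: frames)
      Option(t.getCause) match {
        case Some(cause) => loop(cause, "Caused by: ", lines)
        case None        => lines
      }
    }
    loop(e, "", Nil)
  }
}
{code}

If the SQL interpreter built the traceback it returns from something like {{fullTraceback(e)}} rather than only the top-level exception's frames, the "No partition predicate found for partitioned table" cause would reach the REST client directly instead of being visible only in the driver logs.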


> Livy sql session doesn't return the correct error stack trace
> -------------------------------------------------------------
>
>                 Key: LIVY-770
>                 URL: https://issues.apache.org/jira/browse/LIVY-770
>             Project: Livy
>          Issue Type: Bug
>          Components: Server
>         Environment: Ubuntu18
>            Reporter: RightBitShift
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
