Sanford Ryza created SPARK-53735:
------------------------------------
Summary: Hide server-side JVM stack traces by default in spark-pipelines output
Key: SPARK-53735
URL: https://issues.apache.org/jira/browse/SPARK-53735
Project: Spark
Issue Type: Improvement
Components: Declarative Pipelines
Affects Versions: 4.1.0
Reporter: Sanford Ryza

Error output for failing pipeline runs can be very verbose and includes a lot of information that is not relevant to the user. We should hide the server-side JVM stack traces by default. Example output below (a rough sketch of one possible client-side approach follows it):
2025-09-26 17:07:50: Failed to resolve flow: 'spark_catalog.default.rental_bike_trips'.
Error: [TABLE_OR_VIEW_NOT_FOUND] The table or view `spark_catalog`.`default`.`rental_bike_trips_raws` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS. SQLSTATE: 42P01;
'UnresolvedRelation [spark_catalog, default, rental_bike_trips_raws], [], true
Traceback (most recent call last):
  File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 360, in <module>
    run(
  File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 287, in run
    handle_pipeline_events(result_iter)
  File "/Users/sandy.ryza/oss/python/pyspark/pipelines/spark_connect_pipeline.py", line 53, in handle_pipeline_events
    for result in iter:
  File "/Users/sandy.ryza/oss/python/pyspark/sql/connect/client/core.py", line 1169, in execute_command_as_iterator
    for response in self._execute_and_fetch_as_iterator(req, observations or {}):
  File "/Users/sandy.ryza/oss/python/pyspark/sql/connect/client/core.py", line 1559, in _execute_and_fetch_as_iterator
    self._handle_error(error)
  File "/Users/sandy.ryza/oss/python/pyspark/sql/connect/client/core.py", line 1833, in _handle_error
    self._handle_rpc_error(error)
  File "/Users/sandy.ryza/oss/python/pyspark/sql/connect/client/core.py", line 1904, in _handle_rpc_error
    raise convert_exception(
pyspark.errors.exceptions.connect.AnalysisException: Failed to resolve flows in the pipeline.
A flow can fail to resolve because the flow itself contains errors or because it reads from an upstream flow which failed to resolve.
Flows with errors: spark_catalog.default.rental_bike_trips
Flows that failed due to upstream errors:
To view the exceptions that were raised while resolving these flows, look for flow failures that precede this log.
JVM stacktrace:
org.apache.spark.sql.pipelines.graph.UnresolvedPipelineException
    at org.apache.spark.sql.pipelines.graph.GraphValidations.validateSuccessfulFlowAnalysis(GraphValidations.scala:284)
    at org.apache.spark.sql.pipelines.graph.GraphValidations.validateSuccessfulFlowAnalysis$(GraphValidations.scala:247)
    at org.apache.spark.sql.pipelines.graph.DataflowGraph.validateSuccessfulFlowAnalysis(DataflowGraph.scala:33)
    at org.apache.spark.sql.pipelines.graph.DataflowGraph.$anonfun$validationFailure$1(DataflowGraph.scala:186)
    at scala.util.Try$.apply(Try.scala:217)
    at org.apache.spark.sql.pipelines.graph.DataflowGraph.validationFailure$lzycompute(DataflowGraph.scala:185)
    at org.apache.spark.sql.pipelines.graph.DataflowGraph.validationFailure(DataflowGraph.scala:185)
    at org.apache.spark.sql.pipelines.graph.DataflowGraph.validate(DataflowGraph.scala:173)
    at org.apache.spark.sql.pipelines.graph.PipelineExecution.resolveGraph(PipelineExecution.scala:109)
    at org.apache.spark.sql.pipelines.graph.PipelineExecution.startPipeline(PipelineExecution.scala:48)
    at org.apache.spark.sql.pipelines.graph.PipelineExecution.runPipeline(PipelineExecution.scala:63)
    at org.apache.spark.sql.connect.pipelines.PipelinesHandler$.startRun(PipelinesHandler.scala:294)
    at org.apache.spark.sql.connect.pipelines.PipelinesHandler$.handlePipelinesCommand(PipelinesHandler.scala:93)
    at org.apache.spark.sql.connect.planner.SparkConnectPlanner.handlePipelineCommand(SparkConnectPlanner.scala:2727)
    at org.apache.spark.sql.connect.planner.SparkConnectPlanner.process(SparkConnectPlanner.scala:2697)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.handleCommand(ExecuteThreadRunner.scala:322)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1(ExecuteThreadRunner.scala:224)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1$adapted(ExecuteThreadRunner.scala:196)
    at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$2(SessionHolder.scala:349)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:804)
    at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$1(SessionHolder.scala:349)
    at org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:94)
    at org.apache.spark.sql.artifact.ArtifactManager.$anonfun$withResources$1(ArtifactManager.scala:112)
    at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:187)
    at org.apache.spark.sql.artifact.ArtifactManager.withClassLoaderIfNeeded(ArtifactManager.scala:102)
    at org.apache.spark.sql.artifact.ArtifactManager.withResources(ArtifactManager.scala:111)
    at org.apache.spark.sql.connect.service.SessionHolder.withSession(SessionHolder.scala:348)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.executeInternal(ExecuteThreadRunner.scala:196)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.org$apache$spark$sql$connect$execution$ExecuteThreadRunner$$execute(ExecuteThreadRunner.scala:125)
    at org.apache.spark.sql.connect.execution.ExecuteThreadRunner$ExecutionThread.run(ExecuteThreadRunner.scala:347)
25/09/26 10:07:50 INFO ShutdownHookManager: Shutdown hook called
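
A rough sketch of one possible client-side shape for this (not the actual pipelines CLI code): catch the PySparkException at the CLI entry point, print only the user-facing message, and keep the server-side frames behind an opt-in. The JVM_STACKTRACE_MARKER constant, run_with_trimmed_errors helper, and full_traceback flag are hypothetical names for illustration; the marker matches the output above, but the real exception formatting may differ.

import sys

from pyspark.errors import PySparkException

# Hypothetical marker; matches the "JVM stacktrace:" line in the output above.
JVM_STACKTRACE_MARKER = "JVM stacktrace:"


def run_with_trimmed_errors(run_pipeline, full_traceback: bool = False) -> None:
    """Run the pipeline, hiding server-side JVM frames unless requested."""
    try:
        run_pipeline()
    except PySparkException as e:
        message = str(e)
        if not full_traceback:
            # Keep only the user-facing error message; drop everything from
            # the JVM stacktrace marker onward.
            message = message.split(JVM_STACKTRACE_MARKER, 1)[0].rstrip()
        print(message, file=sys.stderr)
        sys.exit(1)

A --full-traceback CLI flag (or equivalent config) could restore the verbose output for debugging. Alternatively, the client could stop requesting server-side stacktraces altogether; I believe Spark Connect exposes spark.sql.connect.serverStacktrace.enabled for that, though changing its default would affect more than the pipelines CLI.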