[ https://issues.apache.org/jira/browse/SPARK-53735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SPARK-53735:
-----------------------------------
Labels: pull-request-available (was: )
> Hide server-side JVM stack traces by default in spark-pipelines output
> ----------------------------------------------------------------------
>
> Key: SPARK-53735
> URL: https://issues.apache.org/jira/browse/SPARK-53735
> Project: Spark
> Issue Type: Improvement
> Components: Declarative Pipelines
> Affects Versions: 4.1.0
> Reporter: Sanford Ryza
> Priority: Major
> Labels: pull-request-available
>
> Error output for failing pipeline runs can be very verbose and include a lot
> of information that isn't relevant to the user. We should hide the
> server-side JVM stack traces by default. For example (a rough sketch of one
> possible client-side fix follows the output below):
>
> 2025-09-26 17:07:50: Failed to resolve flow: 'spark_catalog.default.rental_bike_trips'.
> Error: [TABLE_OR_VIEW_NOT_FOUND] The table or view `spark_catalog`.`default`.`rental_bike_trips_raws` cannot be found. Verify the spelling and correctness of the schema and catalog.
> If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
> To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS. SQLSTATE: 42P01;
> 'UnresolvedRelation [spark_catalog, default, rental_bike_trips_raws], [], true
>
> Traceback (most recent call last):
>   File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 360, in <module>
>     run(
>   File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 287, in run
>     handle_pipeline_events(result_iter)
>   File "/Users/sandy.ryza/oss/python/pyspark/pipelines/spark_connect_pipeline.py", line 53, in handle_pipeline_events
>     for result in iter:
>   File "/Users/sandy.ryza/oss/python/pyspark/sql/connect/client/core.py", line 1169, in execute_command_as_iterator
>     for response in self._execute_and_fetch_as_iterator(req, observations or {}):
>   File "/Users/sandy.ryza/oss/python/pyspark/sql/connect/client/core.py", line 1559, in _execute_and_fetch_as_iterator
>     self._handle_error(error)
>   File "/Users/sandy.ryza/oss/python/pyspark/sql/connect/client/core.py", line 1833, in _handle_error
>     self._handle_rpc_error(error)
>   File "/Users/sandy.ryza/oss/python/pyspark/sql/connect/client/core.py", line 1904, in _handle_rpc_error
>     raise convert_exception(
> pyspark.errors.exceptions.connect.AnalysisException:
> Failed to resolve flows in the pipeline.
>
> A flow can fail to resolve because the flow itself contains errors or because it reads from an upstream flow which failed to resolve.
>
> Flows with errors: spark_catalog.default.rental_bike_trips
> Flows that failed due to upstream errors:
>
> To view the exceptions that were raised while resolving these flows, look for flow failures that precede this log.
>
> JVM stacktrace:
> org.apache.spark.sql.pipelines.graph.UnresolvedPipelineException
>   at org.apache.spark.sql.pipelines.graph.GraphValidations.validateSuccessfulFlowAnalysis(GraphValidations.scala:284)
>   at org.apache.spark.sql.pipelines.graph.GraphValidations.validateSuccessfulFlowAnalysis$(GraphValidations.scala:247)
>   at org.apache.spark.sql.pipelines.graph.DataflowGraph.validateSuccessfulFlowAnalysis(DataflowGraph.scala:33)
>   at org.apache.spark.sql.pipelines.graph.DataflowGraph.$anonfun$validationFailure$1(DataflowGraph.scala:186)
>   at scala.util.Try$.apply(Try.scala:217)
>   at org.apache.spark.sql.pipelines.graph.DataflowGraph.validationFailure$lzycompute(DataflowGraph.scala:185)
>   at org.apache.spark.sql.pipelines.graph.DataflowGraph.validationFailure(DataflowGraph.scala:185)
>   at org.apache.spark.sql.pipelines.graph.DataflowGraph.validate(DataflowGraph.scala:173)
>   at org.apache.spark.sql.pipelines.graph.PipelineExecution.resolveGraph(PipelineExecution.scala:109)
>   at org.apache.spark.sql.pipelines.graph.PipelineExecution.startPipeline(PipelineExecution.scala:48)
>   at org.apache.spark.sql.pipelines.graph.PipelineExecution.runPipeline(PipelineExecution.scala:63)
>   at org.apache.spark.sql.connect.pipelines.PipelinesHandler$.startRun(PipelinesHandler.scala:294)
>   at org.apache.spark.sql.connect.pipelines.PipelinesHandler$.handlePipelinesCommand(PipelinesHandler.scala:93)
>   at org.apache.spark.sql.connect.planner.SparkConnectPlanner.handlePipelineCommand(SparkConnectPlanner.scala:2727)
>   at org.apache.spark.sql.connect.planner.SparkConnectPlanner.process(SparkConnectPlanner.scala:2697)
>   at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.handleCommand(ExecuteThreadRunner.scala:322)
>   at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1(ExecuteThreadRunner.scala:224)
>   at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1$adapted(ExecuteThreadRunner.scala:196)
>   at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$2(SessionHolder.scala:349)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:804)
>   at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$1(SessionHolder.scala:349)
>   at org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:94)
>   at org.apache.spark.sql.artifact.ArtifactManager.$anonfun$withResources$1(ArtifactManager.scala:112)
>   at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:187)
>   at org.apache.spark.sql.artifact.ArtifactManager.withClassLoaderIfNeeded(ArtifactManager.scala:102)
>   at org.apache.spark.sql.artifact.ArtifactManager.withResources(ArtifactManager.scala:111)
>   at org.apache.spark.sql.connect.service.SessionHolder.withSession(SessionHolder.scala:348)
>   at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.executeInternal(ExecuteThreadRunner.scala:196)
>   at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.org$apache$spark$sql$connect$execution$ExecuteThreadRunner$$execute(ExecuteThreadRunner.scala:125)
>   at org.apache.spark.sql.connect.execution.ExecuteThreadRunner$ExecutionThread.run(ExecuteThreadRunner.scala:347)
> 25/09/26 10:07:50 INFO ShutdownHookManager: Shutdown hook called
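>
> One possible direction (a rough client-side sketch, not a committed design;
> the helper name and the show_jvm_stacktrace flag are hypothetical): trim
> everything from the "JVM stacktrace:" marker onward before the CLI prints
> the error, with an opt-in to restore the full trace.
> {code:python}
> import re
>
> # Marker that separates the user-facing error message from the
> # server-side JVM frames in Spark Connect exception text.
> _JVM_TRACE_MARKER = re.compile(r"^JVM stacktrace:$", re.MULTILINE)
>
>
> def format_pipeline_error(message: str, show_jvm_stacktrace: bool = False) -> str:
>     """Drop the server-side JVM stack trace unless the user opts in."""
>     if show_jvm_stacktrace:
>         return message
>     match = _JVM_TRACE_MARKER.search(message)
>     return message[: match.start()].rstrip() if match else message
> {code}
> (Alternatively, the server side could avoid attaching the trace at all; if I
> remember right, spark.sql.connect.serverStacktrace.enabled already gates
> whether the server includes it in error details.)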