cdkrot opened a new pull request, #42949:
URL: https://github.com/apache/spark/pull/42949

   ### What changes were proposed in this pull request?
   
   Add error logging to `addArtifact` (see the example under "How was this
   patch tested?"). The logging code is moved into a separate file to avoid a
   circular dependency.
   
   ### Why are the changes needed?
   
   Currently, when `addArtifact` is called with a file that doesn't exist,
   the user gets a cryptic error:
   
   ```
   grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
           status = StatusCode.UNKNOWN
           details = "Exception iterating requests!"
           debug_error_string = "None"
   >
   ```
   
   This is impossible to debug without digging deep into the subject.
   
   This happens because `addArtifact` is implemented as a client-side
   streaming RPC, and the actual error is raised while gRPC consumes the
   iterator that generates the requests. Unfortunately, gRPC doesn't print any
   debug information that would help the user understand the problem.
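
The failure mode described above can be sketched without gRPC. This is a minimal illustration, not the code from this PR: `create_requests`, `consume_like_transport`, and the logger name are hypothetical, and the consumer is a stand-in that mimics a transport replacing the original exception with a generic one. The point it shows is that logging inside the request generator preserves the real cause.

```python
import logging
from typing import Iterator

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("artifact_sketch")


def create_requests(path: str, chunk_size: int = 1024) -> Iterator[bytes]:
    """Hypothetical request generator: yields chunks of a local file."""
    try:
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                yield chunk
    except Exception as e:
        # Log here, because the consumer of the iterator may hide the cause.
        logger.error("Failed to execute addArtifact: %s", e)
        raise


def consume_like_transport(requests: Iterator[bytes]) -> int:
    """Stand-in for a streaming transport that hides iterator errors."""
    try:
        return sum(len(chunk) for chunk in requests)
    except Exception:
        # The original exception is dropped, as in the cryptic error above.
        raise RuntimeError("Exception iterating requests!") from None
```

Because the generator body only runs once the consumer starts iterating, the `open` call fails inside the consumer's loop, which is why the caller sees only the generic error unless the generator logs it first.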
   
   ### Does this PR introduce _any_ user-facing change?
   
   Additional logging, which is opt-in the same way as before, through the
   `SPARK_CONNECT_LOG_LEVEL` environment variable.
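
As a rough sketch of what opt-in logging driven by an environment variable can look like: the function name `configure_logging`, the logger name, and the `env` parameter are assumptions made for illustration and are not taken from this PR. The format string is inferred from the log line shown in the test output and may not match the real one exactly.

```python
import logging
import os
from typing import Mapping


def configure_logging(env: Mapping[str, str] = os.environ) -> logging.Logger:
    """Return a logger that stays silent unless SPARK_CONNECT_LOG_LEVEL is set."""
    # Hypothetical logger name, used only for this sketch.
    logger = logging.getLogger("spark_connect_sketch")
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter(
                "%(asctime)s %(process)d %(levelname)s %(funcName)s %(message)s"
            )
        )
        logger.addHandler(handler)

    level = env.get("SPARK_CONNECT_LOG_LEVEL")
    if level:
        # setLevel accepts level names such as "ERROR" or "DEBUG".
        logger.setLevel(level.upper())
        logger.disabled = False
    else:
        # Opt-in: without the variable, nothing is logged.
        logger.disabled = True
    return logger
```

With this shape, starting the client with `SPARK_CONNECT_LOG_LEVEL=error` in the environment would be enough to surface the new error message, and leaving it unset keeps the current quiet behavior.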
   
   ### How was this patch tested?
   
   ```
   >>> s.addArtifact("XYZ", file=True)
   2023-09-15 17:06:40,078 11789 ERROR _create_requests Failed to execute 
addArtifact: [Errno 2] No such file or directory: 
'/Users/alice.sayutina/apache_spark/python/XYZ'
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File 
"/Users/alice.sayutina/apache_spark/python/pyspark/sql/connect/session.py", 
line 743, in addArtifacts
       self._client.add_artifacts(*path, pyfile=pyfile, archive=archive, 
file=file)
   
   [....]
   
     File 
"/Users/alice.sayutina/oss-venv/lib/python3.11/site-packages/grpc/_channel.py", 
line 910, in _end_unary_response_blocking
       raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated 
with:
           status = StatusCode.UNKNOWN
           details = "Exception iterating requests!"
           debug_error_string = "None"
   >
   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
