This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new f62b36c  [SPARK-38128][PYTHON][TESTS] Show full stacktrace in tests by default in PySpark tests
f62b36c is described below

commit f62b36c6d3964c40336959b129b284edb8097f61
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Mon Feb 7 21:18:04 2022 +0900

[SPARK-38128][PYTHON][TESTS] Show full stacktrace in tests by default in PySpark tests

### What changes were proposed in this pull request?

This PR proposes to show the full stacktrace of the Python worker and the JVM in PySpark by controlling `spark.sql.pyspark.jvmStacktrace.enabled` and `spark.sql.execution.pyspark.udf.simplifiedTraceback.enabled` only in tests.

### Why are the changes needed?

[SPARK-33407](https://issues.apache.org/jira/browse/SPARK-33407) and [SPARK-31849](https://issues.apache.org/jira/browse/SPARK-31849) hide the Java stacktrace and the internal Python worker-side traceback by default, so that end users see simpler error messages. Specifically for unit tests, however, that makes test failures harder to debug, so we should show the full stacktrace by default in tests.

### Does this PR introduce _any_ user-facing change?

No, this change is test-only.

### How was this patch tested?

Manually tested. Test failures now show logs as below:

**Before:**

```
======================================================================
ERROR [3.480s]: test (pyspark.sql.tests.test_functions.FunctionsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  ...
pyspark.sql.utils.PythonException: An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/.../pyspark/sql/tests/test_functions.py", line 60, in <lambda>
    self.spark.range(1).select(udf(lambda x: x / 0)("id")).show()
ZeroDivisionError: division by zero

----------------------------------------------------------------------
Ran 1 test in 12.468s

FAILED (errors=1)
```

**After:**

```
======================================================================
ERROR [3.259s]: test (pyspark.sql.tests.test_functions.FunctionsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  ...
pyspark.sql.utils.PythonException: An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/.../pyspark/worker.py", line 678, in main
    process()
  File "/.../pyspark/worker.py", line 670, in process
    serializer.dump_stream(out_iter, outfile)
  File "/.../lib/pyspark/serializers.py", line 217, in dump_stream
    self.serializer.dump_stream(self._batched(iterator), stream)
  ...
ZeroDivisionError: division by zero

JVM stacktrace:
...
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:558)
    at org.apache.spark.sql.execution.python.PythonUDFRunner$$anon$2.read(PythonUDFRunner.scala:86)
    at org.apache.spark.sql.execution.python.PythonUDFRunner$$anon$2.read(PythonUDFRunner.scala:68)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:511)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
...
Driver stacktrace:
...
Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
...
    ... 1 more

----------------------------------------------------------------------
Ran 1 test in 12.610s

FAILED (errors=1)
```

Closes #35423 from HyukjinKwon/SPARK-38128.
Authored-by: Hyukjin Kwon <gurwls...@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 42979a6..59a896a 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -2383,7 +2383,8 @@ object SQLConf {
         "and shows a Python-friendly exception only.")
       .version("3.0.0")
       .booleanConf
-      .createWithDefault(false)
+      // show full stacktrace in tests but hide in production by default.
+      .createWithDefault(Utils.isTesting)
 
   val ARROW_SPARKR_EXECUTION_ENABLED =
     buildConf("spark.sql.execution.arrow.sparkr.enabled")
@@ -2440,7 +2441,8 @@ object SQLConf {
         "shows the exception messages from UDFs. Note that this works only with CPython 3.7+.")
       .version("3.1.0")
       .booleanConf
-      .createWithDefault(true)
+      // show full stacktrace in tests but hide in production by default.
+      .createWithDefault(!Utils.isTesting)
 
   val PANDAS_GROUPED_MAP_ASSIGN_COLUMNS_BY_NAME =
     buildConf("spark.sql.legacy.execution.pandas.groupedMap.assignColumnsByName")
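For reference, the failing case in the logs above can be reproduced outside the test suite with a short standalone script. This is a minimal sketch: the UDF expression is taken verbatim from the test_functions.py traceback, while the session setup and the exception handling around it are illustrative assumptions, not part of the commit.

```
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.utils import PythonException

# Session setup is an assumption for illustration; any local session works.
spark = SparkSession.builder.master("local[1]").getOrCreate()

try:
    # Same failing expression as in test_functions.py per the traceback above;
    # dividing by zero inside the UDF raises ZeroDivisionError in the Python worker.
    spark.range(1).select(udf(lambda x: x / 0)("id")).show()
except PythonException as e:
    # Depending on the two configs this commit flips for tests, this message is
    # either the short Python-friendly traceback or the full worker + JVM stacktrace.
    print(e)
```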
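Independently of `Utils.isTesting`, an application that wants the verbose test-time behavior can opt in per session: both entries appear to be regular runtime SQL confs per the diff (built with `buildConf`, not `buildStaticConf`). A sketch, with values mirroring the test-time defaults introduced here:

```
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Surface the JVM stacktrace (test-time default after this commit: Utils.isTesting).
spark.conf.set("spark.sql.pyspark.jvmStacktrace.enabled", "true")

# Disable the simplified Python worker traceback (test-time default: !Utils.isTesting).
spark.conf.set("spark.sql.execution.pyspark.udf.simplifiedTraceback.enabled", "false")
```

Conversely, a test that prefers the production-style short messages can set the two values back to `false` and `true` respectively.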