Hi,

I've seen a hang of a job (or rather, of one of its executors) when the message of an uncaught exception contains bytes which cannot be properly decoded as Unicode characters. The last lines in the executor log were:
PySpark worker failed with exception:
Traceback (most recent call last):
  File "/data/1/yarn/local/usercache/ubuntu/appcache/application_1492496523387_0009/container_1492496523387_0009_01_000006/pyspark.zip/pyspark/worker.py", line 178, in main
    write_with_length(traceback.format_exc().encode("utf-8"), outfile)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 1386: ordinal not in range(128)

After that nothing happened for hours; no CPU was used on the machine running the executor.

First seen with Spark on YARN:
  Spark 2.1.0, Scala 2.11.8
  Python 2.7.6
  Hadoop 2.6.0-cdh5.11.0

Reproduced with Spark 2.1.0 and Python 2.7.12 in local mode and traced down to this small script:
https://gist.github.com/sebastian-nagel/310a5a5f39cc668fb71b6ace208706f7

Is this a known problem? Of course, one may argue that the job would have failed anyway, but a hang-up isn't that nice; on YARN it blocks resources (containers) until it is killed.

Thanks,
Sebastian
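
P.S. For those who don't want to open the gist: below is a minimal sketch of the kind of job that triggers it (not the exact script, just its shape; names are made up, Python 2 and local mode assumed). The exception message contains a raw non-ASCII byte, so when the worker's error handler runs traceback.format_exc().encode("utf-8") on the resulting byte string, Python 2 first attempts an implicit ASCII decode and raises the UnicodeDecodeError shown above instead of reporting the original error.

    # hypothetical minimal reproducer (Python 2, PySpark local mode)
    from pyspark import SparkContext

    def fail(x):
        # the message contains a raw byte that is not valid ASCII/UTF-8
        raise ValueError("bad byte: \x8b")

    sc = SparkContext("local", "unicode-hang-repro")
    # the task fails, and the worker then hangs while trying to
    # report the traceback back to the driver
    sc.parallelize([1, 2, 3]).map(fail).collect()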