GitHub user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22782#discussion_r227336318

    --- Diff: bin/docker-image-tool.sh ---
    @@ -79,7 +79,7 @@ function build {
       fi

       # Verify that Spark has actually been built/is a runnable distribution
    -  #Â i.e. the Spark JARs that the Docker files will place into the image are present
    --- End diff --

    As for the issue itself, it stems from a historical quirk of Python. In Python 2, `str` was a bytes-like string. That came to look like a mistake, since it confuses users about the difference between bytes and strings, so Python 3 made `str` a Unicode string, as in other programming languages.

    `open(...).read()` returns a `str` (which is bytes) in Python 2, but a Unicode string in Python 3, so an implicit conversion from bytes to string is needed there. It seems it had to work this way to minimise breaking changes in users' code.

    So a bytes-to-string conversion happened here, and unfortunately the default encoding on our Jenkins system is set to ASCII (even though UTF-8 is arguably the more common choice).

    As for non-ASCII characters themselves, please see the justification at http://www.scalastyle.org/rules-dev.html in ScalaStyle.
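To make the failure mode concrete, here is a minimal Python 3 sketch of what happens when bytes containing a UTF-8 non-breaking space are decoded as ASCII versus UTF-8. The sample bytes and the helper names `decode_or_none` and `non_ascii_positions` are hypothetical and only for illustration. They are not taken from the Spark build scripts or from the Jenkins job.

```python
# Illustrative bytes: the second comment line starts with '#' followed by
# a UTF-8 encoded non-breaking space (U+00A0 -> 0xC2 0xA0).
SAMPLE = (
    b"# Verify that Spark has actually been built\n"
    b"#\xc2\xa0i.e. the Spark JARs are present\n"
)


def decode_or_none(data: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


def non_ascii_positions(data: bytes):
    """Return 1-based (line, column) pairs for every byte >= 0x80."""
    positions = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        # Iterating over bytes in Python 3 yields integers.
        for col, byte in enumerate(line, start=1):
            if byte >= 0x80:
                positions.append((line_no, col))
    return positions


if __name__ == "__main__":
    print("ascii :", decode_or_none(SAMPLE, "ascii"))
    print("utf-8 :", repr(decode_or_none(SAMPLE, "utf-8")))
    print("non-ASCII bytes at:", non_ascii_positions(SAMPLE))
```

When reading a file in Python 3, passing an explicit `encoding="utf-8"` to `open(...)` avoids depending on the system default encoding. A check like `non_ascii_positions` can flag stray non-ASCII bytes before they reach an ASCII-only environment.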