[ https://issues.apache.org/jira/browse/SPARK-23240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341609#comment-16341609 ]
Bruce Robbins commented on SPARK-23240:
---------------------------------------

I will be making a pull request.

> PythonWorkerFactory issues unhelpful message when pyspark.daemon produces
> bogus stdout
> --------------------------------------------------------------------------------------
>
>                 Key: SPARK-23240
>                 URL: https://issues.apache.org/jira/browse/SPARK-23240
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.2.1
>            Reporter: Bruce Robbins
>            Priority: Minor
>
> Environmental issues or site-local customizations (e.g., a sitecustomize.py
> present in the Python install directory) can interfere with daemon.py's
> output to stdout. PythonWorkerFactory produces unhelpful messages when this
> happens, causing some head scratching before the actual issue is determined.
>
> Case #1: Extraneous data in pyspark.daemon's stdout. In this case,
> PythonWorkerFactory uses the output as the daemon's port number and ends up
> throwing an exception when creating the socket:
> {noformat}
> java.lang.IllegalArgumentException: port out of range:1819239265
>     at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143)
>     at java.net.InetSocketAddress.<init>(InetSocketAddress.java:188)
>     at java.net.Socket.<init>(Socket.java:244)
>     at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:78)
> {noformat}
>
> Case #2: No data in pyspark.daemon's stdout. In this case,
> PythonWorkerFactory throws an EOFException when reading from the
> Process input stream.
>
> The second case is somewhat less mysterious than the first, because
> PythonWorkerFactory also displays the stderr from the Python process.
>
> When there is unexpected or missing output in pyspark.daemon's stdout,
> PythonWorkerFactory should say so.
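A small sketch of why Case #1 produces that particular error, assuming the handshake works as the stack trace suggests: the daemon writes its listening port to stdout as a 4-byte big-endian integer, and the JVM side reads it back with a plain int read. Any text printed to stdout before the port (for example by a site-local sitecustomize.py) gets consumed as the port instead. The bogus port in the report, 1819239265, is exactly the ASCII bytes "loca" read as a big-endian int; the "loca" text here is only an illustration that reproduces that value, not a claim about what the actual interfering output was.

```python
import struct

# What the daemon is expected to write first: its port as a
# 4-byte big-endian integer (struct format ">i").
port = 54321
wire = struct.pack(">i", port)
assert struct.unpack(">i", wire)[0] == port

# If stray text reaches stdout first, the JVM's int read consumes
# the first four bytes of that text as the "port". The ASCII bytes
# "loca" decode to exactly the bogus value in the stack trace:
bogus = struct.unpack(">i", b"loca")[0]
print(bogus)  # 1819239265, matching "port out of range:1819239265"
```

This also shows why the proposed fix helps: when the value read is out of the valid port range, echoing the raw bytes actually seen on stdout would immediately reveal the interfering text.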
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)