Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20519#discussion_r166415978
  
    --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
    @@ -181,17 +181,33 @@ private[spark] class PythonWorkerFactory(pythonExec: String, envVars: Map[String
           }
     
           try {
    +        // get a server socket so that the launched daemon can tell us its server port
    +        val serverSocket = new ServerSocket(0, 0, InetAddress.getByAddress(Array(127, 0, 0, 1)))
    +        val serverPort = serverSocket.getLocalPort
    +
             // Create and start the daemon
    -        val pb = new ProcessBuilder(Arrays.asList(pythonExec, "-m", daemonModule))
    +        val pb = new ProcessBuilder(Arrays.asList(pythonExec, "-m", daemonModule,
    +          serverPort.toString))
             val workerEnv = pb.environment()
             workerEnv.putAll(envVars.asJava)
             workerEnv.put("PYTHONPATH", pythonPath)
             // This is equivalent to setting the -u flag; we use it because ipython doesn't support -u:
             workerEnv.put("PYTHONUNBUFFERED", "YES")
             daemon = pb.start()
     
    +        // get the local port of the daemon's server socket,
    +        // but don't wait forever for the daemon to connect
    +        serverSocket.setSoTimeout(10000)
    +        val socketToDaemon = serverSocket.accept()
    --- End diff --
    
    So, the problem with TCP sockets is: what if some other, possibly malicious, 
process connects here instead of the process you expect?
    
    You can add an auth token and expose it to the child in an env variable, 
but that complicates the code a bit more. Sometimes I wish we could create 
proper pipes between processes in Java...
    
    (You also probably want some `try..finally` error handling to close the 
server socket, at least.)
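    
    Roughly what I have in mind, folded into the code above. Just a sketch, not 
an existing API: the env var name `PYSPARK_AUTH_TOKEN`, the token format, and 
the echo-the-token-first handshake are all made up for illustration.
    
    ```scala
    import java.io.DataInputStream
    import java.net.{InetAddress, ServerSocket, Socket}
    import java.nio.charset.StandardCharsets
    import java.security.{MessageDigest, SecureRandom}
    import java.util.Base64
    
    object DaemonAuthSketch {
      // Launches the daemon and returns the authenticated callback socket.
      def launchAuthenticated(pythonExec: String, daemonModule: String): Socket = {
        // Random token; the child proves who it is by echoing it back first.
        val tokenBytes = new Array[Byte](16)
        new SecureRandom().nextBytes(tokenBytes)
        val authToken = Base64.getEncoder.encodeToString(tokenBytes)
    
        val serverSocket =
          new ServerSocket(0, 0, InetAddress.getByAddress(Array[Byte](127, 0, 0, 1)))
        try {
          val pb = new ProcessBuilder(pythonExec, "-m", daemonModule,
            serverSocket.getLocalPort.toString)
          pb.environment().put("PYSPARK_AUTH_TOKEN", authToken) // made-up env var name
          val daemon = pb.start()
    
          serverSocket.setSoTimeout(10000)
          val socketToDaemon = serverSocket.accept()
          // The first bytes on the connection must be the token; reject anything else.
          val expected = authToken.getBytes(StandardCharsets.UTF_8)
          val received = new Array[Byte](expected.length)
          new DataInputStream(socketToDaemon.getInputStream).readFully(received)
          if (!MessageDigest.isEqual(received, expected)) {
            socketToDaemon.close()
            daemon.destroy()
            throw new SecurityException("unexpected client on daemon callback socket")
          }
          socketToDaemon // caller reads the daemon's server port from here
        } finally {
          // Close the listening socket no matter what, including accept() timeouts.
          serverSocket.close()
        }
      }
    }
    ```
    
    Comparing with `MessageDigest.isEqual` keeps the token check constant-time, 
and the `finally` makes sure the listening socket is closed even if `accept()` 
times out.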

