haiyangsea created SPARK-19469:
----------------------------------

             Summary: PySpark should allow driver process on different machine
                 Key: SPARK-19469
                 URL: https://issues.apache.org/jira/browse/SPARK-19469
             Project: Spark
          Issue Type: Wish
          Components: PySpark
    Affects Versions: 1.6.3
            Reporter: haiyangsea
In my scenario, there is a resident Spark driver process (created in YARN cluster mode); all PySpark Python clients connect to this resident driver process and share its SparkContext. The Python client process also runs other applications that have their own environment requirements, so I want to separate the PySpark Python client process from the driver process.

There are two main limitations in PySpark (sketched in the code below):

1. The *parallelize* method uses a local file to transfer data between the Java process and the Python process.
2. The hard-coded *localhost* is used to transfer task results and intermediate data between the Java process and the Python process.

Is there any way to achieve my goal?
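For context, the two code paths in question look roughly like the following. This is a simplified, hedged sketch loosely based on PySpark 1.6 (python/pyspark/context.py and python/pyspark/rdd.py); the helper names parallelize_sketch and load_from_socket_sketch are illustrative paraphrases, not the verbatim Spark internals.

{code}
# Simplified sketch of the two mechanisms described above, loosely based on
# PySpark 1.6. Names are paraphrased; this is not the actual Spark source.
import socket
from tempfile import NamedTemporaryFile


def parallelize_sketch(sc, data, serializer, num_slices):
    """Limitation 1: data is staged through a file on the driver machine's
    local filesystem, so the Python client must share a disk with the
    JVM driver process."""
    # sc._temp_dir is a driver-local scratch directory in PySpark 1.6.
    temp_file = NamedTemporaryFile(delete=False, dir=sc._temp_dir)
    try:
        serializer.dump_stream(data, temp_file)  # write pickled batches
    finally:
        temp_file.close()
    # The JVM driver re-reads the same local path, which is impossible
    # if the driver JVM runs on a different machine.
    return sc._jvm.PythonRDD.readRDDFromFile(sc._jsc, temp_file.name,
                                             num_slices)


def load_from_socket_sketch(port, serializer):
    """Limitation 2: results served by the JVM are always fetched from
    "localhost", so collect() breaks when the driver JVM is remote."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect(("localhost", port))  # hard-coded host, no way to override
    try:
        rf = sock.makefile("rb", 65536)
        for item in serializer.load_stream(rf):
            yield item
    finally:
        sock.close()
{code}

Supporting a remote driver would presumably mean replacing the temp-file handoff with a network channel and making the result-serving host configurable instead of hard-coded to localhost.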