[ https://issues.apache.org/jira/browse/SPARK-17960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-17960:
------------------------------
    Assignee: Jagadeesan A S

> Upgrade to Py4J 0.10.4
> ----------------------
>
>                 Key: SPARK-17960
>                 URL: https://issues.apache.org/jira/browse/SPARK-17960
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>            Reporter: holdenk
>            Assignee: Jagadeesan A S
>            Priority: Trivial
>              Labels: starter
>             Fix For: 2.1.0
>
>
> In general we should try to keep up to date with Py4J's new releases. The
> changes in this one are small
> ( https://github.com/bartdag/py4j/milestone/21?closed=1 ) and shouldn't impact
> Spark in any significant way, so I'm going to tag this as a starter issue for
> someone looking to get a deeper understanding of how PySpark works.
> Upgrading Py4J can be a bit tricky compared to updating other packages. In
> general the steps are:
> 1) Upgrade the Py4J version on the Java side
> 2) Update the py4j src zip file we bundle with Spark
> 3) Make sure everything still works (especially the streaming tests, because
> we do weird things to make streaming work and it's the most likely place to
> break during a Py4J upgrade).
> You can see how these bits have been done in past releases by looking in the
> git log for the last time we changed the Py4J version numbers. Sometimes, even
> for "compatible" releases like this one, we may need to make some small code
> changes inside PySpark because we hook into Py4J's internals, but I don't
> think this should be the case here.
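
One way to double-check steps 1 and 2 is to scan the files that mention Py4J for version strings that were missed. The following is only a minimal sketch under stated assumptions: the helper name stale_py4j_refs and the two sample lines are hypothetical illustrations, not actual Spark files or tooling, and the real list of files to check is best taken from the git log of the previous upgrade.

```python
import re

# Matches the version embedded in names such as "py4j-0.10.3-src.zip".
PY4J_REF = re.compile(r"py4j-(\d+(?:\.\d+)+)")


def stale_py4j_refs(text, target_version):
    """Return (line_number, version) pairs for Py4J references
    whose version differs from target_version."""
    stale = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in PY4J_REF.finditer(line):
            if match.group(1) != target_version:
                stale.append((line_number, match.group(1)))
    return stale


# Hypothetical sample input: one outdated reference, one updated reference.
sample = (
    'export PYTHONPATH="$SPARK_HOME/python/lib/py4j-0.10.3-src.zip:$PYTHONPATH"\n'
    'zip_path = "python/lib/py4j-0.10.4-src.zip"\n'
)

for line_number, version in stale_py4j_refs(sample, "0.10.4"):
    print(f"line {line_number}: still references py4j-{version}")
```

A check like this only catches leftover version strings. It does not replace step 3, since behavioural breakage (for example in the streaming tests) only shows up when the test suites are actually run.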