[jira] [Commented] (SPARK-10361) model.predictAll() fails at user_product.first()
[ https://issues.apache.org/jira/browse/SPARK-10361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723621#comment-14723621 ]

Velu nambi commented on SPARK-10361:

Thanks [~srowen]. Is this a known issue? Any suggestions?

> model.predictAll() fails at user_product.first()
> ------------------------------------------------
>
>                 Key: SPARK-10361
>                 URL: https://issues.apache.org/jira/browse/SPARK-10361
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, PySpark
>    Affects Versions: 1.3.1, 1.4.1, 1.5.0
>         Environment: Windows 10, Python 2.7, with all three versions of Spark
>            Reporter: Velu nambi
>
> This code, adapted from the documentation, fails when calling predictAll()
> after ALS.train():
>
> 15/08/31 00:11:45 ERROR PythonRDD: Python worker exited unexpectedly (crashed)
> java.net.SocketException: Connection reset by peer: socket write error
>     at java.net.SocketOutputStream.socketWrite0(Native Method)
>     at java.net.SocketOutputStream.socketWrite(Unknown Source)
>     at java.net.SocketOutputStream.write(Unknown Source)
>     at java.io.BufferedOutputStream.write(Unknown Source)
>     at java.io.DataOutputStream.write(Unknown Source)
>     at java.io.FilterOutputStream.write(Unknown Source)
>     at org.apache.spark.api.python.PythonRDD$.org$apache$spark$api$python$PythonRDD$$write$1(PythonRDD.scala:413)
>     at org.apache.spark.api.python.PythonRDD$$anonfun$writeIteratorToStream$1.apply(PythonRDD.scala:425)
>     at org.apache.spark.api.python.PythonRDD$$anonfun$writeIteratorToStream$1.apply(PythonRDD.scala:425)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>     at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
>     at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:425)
>     at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:248)
>     at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1772)
>     at org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:208)
>
> 15/08/31 00:11:45 ERROR PythonRDD: This may have been caused by a prior exception:
> java.net.SocketException: Connection reset by peer: socket write error
>     (same stack trace as above)
>
> 15/08/31 00:11:45 ERROR Executor: Exception in task 0.0 in stage 187.0 (TID 85)
> java.net.SocketException: Connection reset by peer: socket write error
>     (same stack trace as above)
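The SocketException above is what the JVM side logs when the Python worker process dies while the JVM is still streaming task data to it. A minimal, Spark-free sketch of that failure mode in plain Python (all names here are illustrative; the "JVM side" is simulated by a thread that closes the connection the way a crashed worker would):

```python
import socket
import threading
import time

def jvm_side(server_sock):
    # Accept one connection, then close it immediately -- this is what the
    # writer thread in the JVM effectively experiences when the Python
    # worker it is feeding crashes mid-stream.
    conn, _ = server_sock.accept()
    conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # any free port
server.listen(1)
port = server.getsockname()[1]

t = threading.Thread(target=jvm_side, args=(server,))
t.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
t.join()  # peer has now closed its end

error_seen = None
try:
    # Keep writing into the dead connection; once the peer's RST arrives,
    # the OS fails the write. In Java this surfaces as
    # java.net.SocketException: Connection reset by peer: socket write error.
    for _ in range(50):
        client.send(b"x" * 4096)
        time.sleep(0.01)
except OSError as exc:
    error_seen = exc
finally:
    client.close()
    server.close()

print(type(error_seen).__name__)
```

The point is that the reset is a symptom reported by the surviving process; the root cause is whatever killed the peer, which is why the JVM stack trace alone does not identify the bug.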
[ https://issues.apache.org/jira/browse/SPARK-10361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723668#comment-14723668 ]

Sean Owen commented on SPARK-10361:

The tests are passing and I haven't heard anything like this. It does point to a local problem. At least, this stack trace is not the problem per se; the Python process wasn't able to connect to the JVM. You'd need to see why.
[ https://issues.apache.org/jira/browse/SPARK-10361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723193#comment-14723193 ]

Sean Owen commented on SPARK-10361:

The error is right here though: your Python and JVM processes aren't able to communicate.
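One concrete way to check that the two processes can communicate is to verify that the interpreter Spark would launch as a worker actually starts. Spark chooses the worker interpreter from the `PYSPARK_PYTHON` environment variable, falling back to `python` on the PATH. The sketch below is a hypothetical diagnostic, not part of Spark; it falls back to the current interpreter (`sys.executable`) so it runs anywhere:

```python
import os
import subprocess
import sys

# Spark spawns Python workers with the interpreter named by PYSPARK_PYTHON.
# If that interpreter is missing, wrong, or crashes on startup, the JVM's
# writer thread sees exactly a "Connection reset by peer" when it tries to
# stream task data to the dead worker.
exe = os.environ.get("PYSPARK_PYTHON") or sys.executable

result = subprocess.run(
    [exe, "-c", "import sys; print('.'.join(map(str, sys.version_info[:3])))"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    timeout=30,
)

print("worker interpreter:", exe)
print("exit code:", result.returncode)
print("version:", result.stdout.decode().strip())
```

A non-zero exit code, or a version that differs from the driver's, would be the first thing to chase before suspecting MLlib itself.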
[ https://issues.apache.org/jira/browse/SPARK-10361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723141#comment-14723141 ]

Sean Owen commented on SPARK-10361:

Doesn't this just appear to be a network problem, or some other problem specific to your cluster? The error is a SocketException.
[ https://issues.apache.org/jira/browse/SPARK-10361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723162#comment-14723162 ]

Velu nambi commented on SPARK-10361:

[~srowen] I'm running a standalone version of Spark on Windows. I didn't see any process crash or anything suspicious in the firewall logs. Let me know if I'm missing something?