Re: PySpark on YARN "port out of range"

2015-06-22 Thread Andrew Or
Unfortunately there is not a great way to do it without modifying Spark to print more things it reads from the stream.

2015-06-20 23:10 GMT-07:00 John Meehan:
> Yes it seems to be consistently "port out of range:1315905645". Is there
> any way to see what the python process is actually outputti

Re: PySpark on YARN "port out of range"

2015-06-19 Thread Andrew Or
Hm, one thing to check is whether the same port (1315905645) appears many times. The way PySpark works today is that the JVM reads the port from the stdout of the Python process. If there is some interference in the output from the Python side (e.g. any print statements or exception messages), then the Jav
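
One way to check the interference theory is to turn the bogus port number back into the bytes it came from. The sketch below assumes the JVM reads the port as a 4-byte big-endian integer; the thread does not state this, and the helper name `decode_bogus_port` is made up for illustration.

```python
import struct


def decode_bogus_port(value: int) -> bytes:
    """Return the 4 bytes that, read as a big-endian signed int, yield `value`.

    Assumption: the reader on the JVM side consumes exactly 4 bytes in
    big-endian order. If the real framing differs, this does not apply.
    """
    return struct.pack(">i", value)


if __name__ == "__main__":
    raw = decode_bogus_port(1315905645)
    print(raw)  # b'No m'
    # Show the bytes as text, replacing anything that is not ASCII.
    print(raw.decode("ascii", errors="replace"))
```

Under that assumption the "port" is the ASCII text `No m`, which would be consistent with a Python message being written to stdout before the real port. What the full message was cannot be recovered from these four bytes alone.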

PySpark on YARN "port out of range"

2015-06-19 Thread John Meehan
Has anyone encountered this “port out of range” error when launching PySpark jobs on YARN? It is sporadic (e.g. 2/3 jobs get this error).

LOG:
15/06/19 11:49:44 INFO scheduler.TaskSetManager: Lost task 0.3 in stage 39.0 (TID 211) on executor xxx.xxx.xxx.com : java.lan