Re: ExecutorLostFailure when working with RDDs

2015-10-09 Thread Ivan Héda
The solution is to set 'spark.shuffle.io.preferDirectBufs' to 'false'. Then it works. Cheers!
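For reference, this property can be set cluster-wide in spark-defaults.conf or per application at submit time. A minimal sketch, assuming a standard spark-submit invocation on YARN (the application script name is a placeholder):

```shell
# Option 1: cluster-wide, in conf/spark-defaults.conf
spark.shuffle.io.preferDirectBufs   false

# Option 2: per application, at submit time (my_app.py is a placeholder)
spark-submit \
  --master yarn \
  --conf spark.shuffle.io.preferDirectBufs=false \
  my_app.py
```

Setting it to false makes the shuffle service favor heap buffers over direct (off-heap) buffers, which can help when off-heap allocations push executors past YARN's memory limits.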

ExecutorLostFailure when working with RDDs

2015-10-09 Thread Ivan Héda
Hi,

I'm facing an issue with PySpark (1.5.1, 1.6.0-SNAPSHOT) running over YARN (2.6.0-cdh5.4.4). Everything seems fine when working with DataFrames, but when I need RDDs the workers start to fail, as in the following code:

table1 = sqlContext.table('someTable')
table1.count()  ## OK ## cca 500