The solution is to set 'spark.shuffle.io.preferDirectBufs' to 'false'.
With that setting in place, everything works.
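
For reference, a minimal sketch of how that setting can be applied when building the context in a standalone PySpark script (the config key is the real Spark property; the rest of the setup is just an illustrative assumption -- it can equally be passed via --conf on spark-submit or put in spark-defaults.conf):

    from pyspark import SparkConf, SparkContext
    from pyspark.sql import SQLContext

    # Tell the shuffle Netty transport to avoid direct (off-heap) buffers
    conf = SparkConf().set("spark.shuffle.io.preferDirectBufs", "false")
    sc = SparkContext(conf=conf)
    sqlContext = SQLContext(sc)

or, equivalently, on the command line:

    spark-submit --conf spark.shuffle.io.preferDirectBufs=false your_app.py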
Cheers!
On Fri, Oct 9, 2015 at 3:13 PM, Ivan Héda <ivan.h...@gmail.com> wrote:
> Hi,
>
> I'm facing an issue with PySpark (1.5.1, 1.6.0-SNAPSHOT) running over Yarn
> (2.6.0-cdh5.4.4)
Hi,
I'm facing an issue with PySpark (1.5.1, 1.6.0-SNAPSHOT) running over Yarn
(2.6.0-cdh5.4.4). Everything seems fine when working with DataFrames, but
when I need RDDs the workers start to fail, as in the following code:
table1 = sqlContext.table('someTable')
table1.count()  ## OK -- approx. 500