Do you have a large number of tasks? Driver OOM can happen when a job has
many tasks and a small driver, or when you use accumulators of list-like
data structures.
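
For example, an accumulator built over a growing collection keeps every
added element on the driver, so its footprint scales with the data rather
than with any memory setting. A minimal sketch (events, isBad, and
recordId are placeholder names):

  import scala.collection.mutable.ArrayBuffer

  // Elements added by executors are merged into one buffer on the driver,
  // so driver memory grows with the number of matching records.
  val badIds = sc.accumulableCollection(ArrayBuffer[String]())
  events.foreach { rec => if (isBad(rec)) badIds += recordId(rec) }
  println(badIds.value.size)  // the full list lives in driver memory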

2015-12-11 11:17 GMT-08:00 Zhan Zhang <zzh...@hortonworks.com>:

> I think you are fetching too many results to the driver. Typically, it is
> not recommended to collect much data to the driver. But if you have to, you
> can increase the driver memory when submitting jobs.
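>
> For example (illustrative sizes, tune them to your cluster):
>
>   spark-submit \
>     --driver-memory 8g \
>     --conf spark.driver.maxResultSize=4g \
>     ... the rest of your usual arguments ...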
>
> Thanks.
>
> Zhan Zhang
>
> On Dec 11, 2015, at 6:14 AM, Tom Seddon <mr.tom.sed...@gmail.com> wrote:
>
> I have a job that is running into intermittent [SparkDriver]
> java.lang.OutOfMemoryError: Java heap space errors. Before I was getting
> this error, I was getting errors saying the result size exceeded
> spark.driver.maxResultSize. This does not make any sense to me, as there
> are no actions in my job that send data to the driver - just a pull of
> data from S3, a map and reduceByKey, and then a conversion to DataFrame
> and a saveAsTable action that puts the results back on S3.
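>
> Roughly the shape of the job, simplified, with placeholder names and an
> illustrative S3 path:
>
>   import sqlContext.implicits._
>
>   val raw = sc.textFile("s3n://my-bucket/input/")        // pull from S3
>   val counts = raw.map(parseToPair).reduceByKey(_ + _)   // map + reduceByKey
>   val df = counts.toDF("key", "count")                   // to DataFrame
>   df.write.saveAsTable("results")                        // back onto S3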
>
> I've found a few references to reduceByKey and spark.driver.maxResultSize
> having some importance, but cannot fathom how this setting could be related.
>
> Would greatly appreciate any advice.
>
> Thanks in advance,
>
> Tom
>
>
>
