Raising Spark's memoryOverhead setting is what your own logs recommend
("Consider boosting spark.yarn.executor.memoryOverhead"), and it has
helped me with these issues in the past.  Search for "memoryOverhead"
here:
http://spark.apache.org/docs/latest/running-on-yarn.html
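
For example, you could relaunch the shell with an explicit overhead.
This is just a sketch reusing the parameters from your message; the
4096 MB value is only a starting point to tune, not a recommendation:

# 4096 MB of per-executor overhead is an example value; tune it for your job
spark-shell --master yarn --deploy-mode client \
  --driver-memory 16G --num-executors 500 \
  --executor-cores 7 --executor-memory 10G \
  --conf spark.yarn.executor.memoryOverhead=4096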

That said, you're running on a huge cluster as it is.  If it's possible to
filter your tables down before the join (keeping just the rows/columns you
need), that may be a better solution.
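
Something like this is what I have in mind (a rough sketch; the table
and column names are made up, and I'm assuming the hiveContext you
already have in the shell):

// Project and filter each table before the join, instead of joining the
// full 10 GB tables and discarding rows afterwards.
// (db/table/column names below are placeholders)
val t1 = hiveContext.table("db.table1")
  .select("join_key", "col_a")
  .filter("col_a IS NOT NULL")
val t2 = hiveContext.table("db.table2")
  .select("join_key", "col_b")
val joined = t1.join(t2, "join_key")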

Jon

On Mon, Feb 13, 2017 at 5:27 AM, nancy henry <nancyhenry6...@gmail.com>
wrote:

> Hi All,
>
> I am getting the below error while trying to join 3 tables, which are in
> ORC format in Hive, from 5 10 GB tables through HiveContext in Spark.
>
> Container killed by YARN for exceeding memory limits. 11.1 GB of 11 GB
> physical memory used. Consider boosting
> spark.yarn.executor.memoryOverhead.
> 17/02/13 02:21:19 WARN YarnSchedulerBackend$YarnSchedulerEndpoint:
> Container killed by YARN for exceeding memory limits. 11.1 GB of 11 GB
> physical memory used
>
>
> I am using the memory parameters below to launch the shell. What else
> could I increase among these parameters, or do I need to change any other
> configuration settings? Please let me know.
>
> spark-shell --master yarn --deploy-mode client --driver-memory 16G
> --num-executors 500 --executor-cores 7 --executor-memory 10G
>
>