Hi, 

I recently upgraded from Spark 1.2.1 to 1.3.1 (through HDP). 

I have a job that does a cartesian product on two datasets (at least 2K and 500K 
lines respectively) to do string matching.

I updated it to use DataFrames because the old code wouldn't run anymore 
(deprecated RDD functions).
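
For context, the DataFrame version looks roughly like the sketch below (simplified; 
the column names, inputs and matching UDF are placeholders, not my real code):

  import org.apache.spark.sql.SQLContext
  import org.apache.spark.sql.functions.udf

  val sqlContext = new SQLContext(sc)

  // Placeholder inputs: ~2K rows and ~500K rows
  val small = sqlContext.load("hdfs:///data/small.parquet")
  val large = sqlContext.load("hdfs:///data/large.parquet")

  // Placeholder matching logic; the real job does fuzzier string matching
  val isMatch = udf((a: String, b: String) => a.equalsIgnoreCase(b))

  // join() with no condition gives the cartesian product,
  // then the UDF keeps only the matching pairs
  val matches = small.join(large)
    .filter(isMatch(small("name"), large("name")))

  matches.save("hdfs:///output/matches")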

It used to run well and use all of the allocated memory and cores, but it no 
longer does, and I can't figure out why.

My cluster has 4 workers, each with 14 available vCores and 40 GB of RAM.

I run the job with the following properties: 
--master yarn-client
--num-executors 12
--executor-cores 4
--executor-memory 12G
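
In full, the submit command looks roughly like this (the class and jar names 
below are placeholders):

  spark-submit \
    --master yarn-client \
    --num-executors 12 \
    --executor-cores 4 \
    --executor-memory 12G \
    --class com.example.StringMatching \
    string-matching.jar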

That should give me 3 executors per worker (12 executors across 4 workers, using 
3 × 4 = 12 of the 14 vCores and 3 × 12 GB = 36 GB of the 40 GB on each worker). 
However, I always end up with ~3 executors in total, and 2 tasks that fail with 
"OutOfMemoryError: GC overhead limit exceeded", restart, and then fail again 
because the output directory already exists.

Any idea?

Thanks!

César.



