By any chance, does this thread look similar to your issue: http://apache-spark-developers-list.1001551.n3.nabble.com/Lost-executor-on-YARN-ALS-iterations-td7916.html ?
On Tue, Mar 24, 2015 at 5:23 AM Harut Martirosyan <harut.martiros...@gmail.com> wrote:

> What is the performance overhead caused by YARN, or which configurations
> are changed when the app is run through YARN?
>
> The following example:
>
> sqlContext.sql("SELECT dayStamp(date),
>   count(distinct deviceId) AS c
>   FROM full
>   GROUP BY dayStamp(date)
>   ORDER BY c DESC
>   LIMIT 10")
>   .collect()
>
> runs in the shell when we use the standalone scheduler:
>
> ./spark-shell --master spark://sparkmaster:7077 --executor-memory 20g \
>   --executor-cores 10 --driver-memory 10g --num-executors 8
>
> but fails by losing an executor when we run it through YARN:
>
> ./spark-shell --master yarn-client --executor-memory 20g \
>   --executor-cores 10 --driver-memory 10g --num-executors 8
>
> There are no telling logs, just messages that executors are being lost,
> and connection-refused errors (apparently due to the executor failures).
> The cluster is the same: 8 nodes with 64 GB of RAM each.
> The data format is Parquet.
>
> --
> RGRDZ Harut
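For context: in threads like the one linked above, the common cause is that YARN enforces a per-container memory limit, and an executor requesting `--executor-memory 20g` of heap also uses off-heap memory, so the NodeManager kills the container when total usage exceeds the limit. The standalone scheduler does no such enforcement, which would explain why the same job succeeds there. A possible mitigation, sketched under the assumption that this is the cause (the 4096 MB overhead value is illustrative, not a tested setting):

```shell
# Ask YARN for extra off-heap headroom per executor container, so the
# NodeManager does not kill executors whose total memory use exceeds
# the 20g heap request (spark.yarn.executor.memoryOverhead is in MB).
./spark-shell --master yarn-client \
  --executor-memory 20g \
  --executor-cores 10 \
  --driver-memory 10g \
  --num-executors 8 \
  --conf spark.yarn.executor.memoryOverhead=4096
```

To confirm whether this is the problem, check the YARN NodeManager logs on the affected hosts for a message about a container "running beyond physical memory limits" around the time the executor is reported lost.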