Which Spark release are you using? Have you registered for all the events provided by SparkListener?
If so, can you do an event-wise summation of execution time?

Thanks

On Thu, Apr 7, 2016 at 11:03 AM, JasmineGeorge <j.geo...@samsung.com> wrote:
> We are running a batch job with the following specifications:
> • Building a RandomForest with config: maxBins = 100, depth = 19, number of trees = 20
> • Multiple runs with different input data sizes: 2.8 GB, 10 million records
> • We are running the Spark application on YARN in cluster mode, with 3 Node
> Managers (each with 16 virtual cores and 96 GB RAM)
> • Spark config:
> o spark.driver.cores = 2
> o spark.driver.memory = 32G
> o spark.executor.instances = 5 and spark.executor.cores = 8, so 40 cores in total
> o spark.executor.memory = 32G, so total executor memory around 160 GB
>
> We are collecting execution times for the tasks using a SparkListener, and
> also the total execution time for the application from the Spark Web UI.
> Across all the tests we saw consistently that the sum of the execution
> times of all the tasks accounts for about 60% of the total application
> run time.
> We are wondering where the remaining 40% of the time is being spent.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Only-60-of-Total-Spark-Batch-Application-execution-time-spent-in-Task-Processing-tp26703.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
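Not a definitive diagnosis, but the gap usually comes from time the task bodies themselves never see: scheduler delay, task deserialization, result serialization, driver-side work between stages (RandomForest aggregates statistics on the driver between iterations), and idle gaps between stage submissions. The event-wise summation suggested above could be sketched roughly like this, in plain Python; `TaskTiming` and its fields are hypothetical stand-ins for the per-task metrics a `SparkListener` receives in `onTaskEnd`, and the numbers are illustrative only:

```python
from dataclasses import dataclass

@dataclass
class TaskTiming:
    """Stand-in for per-task metrics a SparkListener sees in onTaskEnd.

    All values are in milliseconds; the numbers used below are made up
    purely to illustrate the accounting.
    """
    launch_to_finish_ms: int      # wall time from task launch to completion
    executor_run_ms: int          # time the task body actually ran
    deserialize_ms: int           # task deserialization on the executor
    result_serialization_ms: int  # serializing the result for the driver

def scheduler_delay_ms(t: TaskTiming) -> int:
    """Whatever wall time is left over after the measured phases."""
    return (t.launch_to_finish_ms - t.executor_run_ms
            - t.deserialize_ms - t.result_serialization_ms)

def breakdown(tasks):
    """Event-wise summation of time across all tasks."""
    return {
        "executor_run": sum(t.executor_run_ms for t in tasks),
        "deserialize": sum(t.deserialize_ms for t in tasks),
        "result_serialization": sum(t.result_serialization_ms for t in tasks),
        "scheduler_delay": sum(scheduler_delay_ms(t) for t in tasks),
    }

if __name__ == "__main__":
    tasks = [TaskTiming(1000, 600, 150, 50),
             TaskTiming(2000, 1300, 200, 100)]
    b = breakdown(tasks)
    wall = sum(t.launch_to_finish_ms for t in tasks)
    # The phases add back up to per-task wall time; time spent outside
    # tasks entirely (driver work, gaps between stages) never appears here.
    assert sum(b.values()) == wall
    print(b)
```

Note that even when this breakdown sums to 100% of per-task wall time, driver-side time (DAG scheduling, per-iteration aggregation in RandomForest) and gaps between stages still fall outside it, which is a common source of the "missing" 40%.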