Jasmine:
Let us know if listening to more events gives you a better picture.
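
For reference, here is a minimal sketch of the event-wise summation discussed in this thread. It is written in Python against JSON event records instead of a live SparkListener. It assumes event logging is enabled (spark.eventLog.enabled) and that each log line is a JSON object with the field names used here ("Event", "Task Info", "Launch Time", "Finish Time", and so on). Those names should be checked against a real log from your Spark release. The function name and the sample records are hypothetical.

```python
import json
from collections import defaultdict


def sum_durations_by_event(lines):
    """Sum elapsed milliseconds per event category from JSON event lines.

    Field names are assumptions about the event log layout; verify them
    against an actual log before relying on the totals.
    """
    totals = defaultdict(int)
    job_start = {}  # Job ID -> submission time in ms
    for line in lines:
        line = line.strip()
        if not line:
            continue
        ev = json.loads(line)
        kind = ev.get("Event")
        if kind == "SparkListenerTaskEnd":
            info = ev["Task Info"]
            totals["task"] += info["Finish Time"] - info["Launch Time"]
        elif kind == "SparkListenerStageCompleted":
            info = ev["Stage Info"]
            totals["stage"] += info["Completion Time"] - info["Submission Time"]
        elif kind == "SparkListenerJobStart":
            job_start[ev["Job ID"]] = ev["Submission Time"]
        elif kind == "SparkListenerJobEnd":
            start = job_start.pop(ev["Job ID"], None)
            if start is not None:
                totals["job"] += ev["Completion Time"] - start
    return dict(totals)


# Hypothetical sample records (not taken from any real run).
sample = [
    json.dumps({"Event": "SparkListenerJobStart", "Job ID": 0,
                "Submission Time": 1000}),
    json.dumps({"Event": "SparkListenerTaskEnd",
                "Task Info": {"Launch Time": 1100, "Finish Time": 1400}}),
    json.dumps({"Event": "SparkListenerTaskEnd",
                "Task Info": {"Launch Time": 1100, "Finish Time": 1500}}),
    json.dumps({"Event": "SparkListenerStageCompleted",
                "Stage Info": {"Submission Time": 1050,
                               "Completion Time": 1600}}),
    json.dumps({"Event": "SparkListenerJobEnd", "Job ID": 0,
                "Completion Time": 1700}),
]

totals = sum_durations_by_event(sample)
print(totals)
```

Tasks run in parallel, so the summed task time can exceed the stage or job wall-clock time. Take the number of cores into account when comparing these totals with the application run time.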

Thanks

On Thu, Apr 7, 2016 at 1:54 PM, Jasmine George <j.geo...@samsung.com> wrote:

> Hi Ted,
>
>
>
> Thanks for replying so fast.
>
>
>
> We are using Spark 1.5.2.
>
> I was collecting only TaskEnd events.
>
> I can do the event-wise summation for a couple of runs and get back to you.
>
>
>
> Thanks,
>
> Jasmine
>
>
>
> *From:* Ted Yu [mailto:yuzhih...@gmail.com]
> *Sent:* Thursday, April 07, 2016 1:43 PM
> *To:* JasmineGeorge
> *Cc:* user
> *Subject:* Re: Only 60% of Total Spark Batch Application execution time
> spent in Task Processing
>
>
>
> Which Spark release are you using?
>
>
>
> Have you registered for all the events provided by SparkListener?
>
>
>
> If so, can you do an event-wise summation of execution time?
>
>
>
> Thanks
>
>
>
> On Thu, Apr 7, 2016 at 11:03 AM, JasmineGeorge <j.geo...@samsung.com>
> wrote:
>
> We are running a batch job with the following specifications:
> •       Building RandomForest with config: maxbins=100, depth=19, num of trees = 20
> •       Multiple runs with different input data size 2.8 GB, 10 Million records
> •       We are running the Spark application on YARN in cluster mode, with 3 Node Managers (each with 16 virtual cores and 96G RAM)
> •       Spark config:
> o       spark.driver.cores = 2
> o       spark.driver.memory = 32 G
> o       spark.executor.instances = 5 and spark.executor.cores = 8, so 40 cores in total.
> o       spark.executor.memory = 32G, so total executor memory is around 160 G.
>
> We are collecting execution times for the tasks using a SparkListener, and
> also the total execution time for the application from the Spark Web UI.
> Across all the tests, we consistently saw that the sum of the execution
> times of all the tasks accounts for about 60% of the total application
> run time.
> We are wondering where the remaining 40% of the time is being spent.
>
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Only-60-of-Total-Spark-Batch-Application-execution-time-spent-in-Task-Processing-tp26703.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
>
>
>
