[jira] [Commented] (TEZ-2148) Slow container grabbing with Capacity Scheduler in comparision to MapReduce

Johannes Zillmann (JIRA) Thu, 26 Feb 2015 05:23:24 -0800

    [ 
https://issues.apache.org/jira/browse/TEZ-2148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338385#comment-14338385
 ]


Johannes Zillmann commented on TEZ-2148:
----------------------------------------

Hey [~jeffzhang], oh yes, I missed that part.
So it's a series of jobs, 4 to be exact. For MapReduce we submit 4 
jobs/applications, and I attached the application log of the first job only. 
Since we use session mode for Tez, its application log contains the entries of 
all 4 DAG submissions, but only the first one is of concern.
Also, both client logs cover only the 1st of the 4 jobs.
HTH

> Slow container grabbing with Capacity Scheduler in comparision to MapReduce
> ---------------------------------------------------------------------------
>
>                 Key: TEZ-2148
>                 URL: https://issues.apache.org/jira/browse/TEZ-2148
>             Project: Apache Tez
>          Issue Type: Task
>    Affects Versions: 0.5.1
>            Reporter: Johannes Zillmann
>         Attachments: applicationLogs.zip, capacity-scheduler.xml, 
> client-mapreduce.log, client-tez.log, dag1.pdf, dag2.pdf, dag3.pdf, dag4.pdf
>
>
> A customer experienced the following:
> - Set up a CapacityScheduler for user 'company'
> - The same processing job on the same data is faster with MapReduce than with 
> Tez when the cluster is under "normal" load. Only if nothing else runs on 
> Hadoop does Tez outperform MapReduce. (It's hard to give exact data here since 
> we get all information second hand from the customer, but the timings were 
> pretty stable over a dozen runs: the MapReduce job took about 70 sec and Tez 
> about 170 sec.)
> So the question is: is there some difference in how Tez grabs resources from 
> the capacity scheduler compared to MapReduce?
> Looking at the logs, it looks like Tez is always very slow in starting the 
> containers, whereas MapReduce parallelizes very quickly.
> Attached are the client and application logs for the Tez and MapReduce runs, 
> as well as the scheduler configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
