[ https://issues.apache.org/jira/browse/TEZ-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rajesh Balamohan resolved TEZ-1512.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 0.6.0
     Hadoop Flags: Reviewed

Thanks [~sseth]. Committed to master.

commit ddef389a976793da397856f397398bdddc8db123
Author: Rajesh Balamohan <rbalamo...@apache.org>
Date:   Thu Aug 28 13:41:04 2014 +0530

> VertexImpl.getTask(int) can be CPU intensive when lots of tasks are present in the vertex
> -----------------------------------------------------------------------------------------
>
>                 Key: TEZ-1512
>                 URL: https://issues.apache.org/jira/browse/TEZ-1512
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>              Labels: performance
>             Fix For: 0.6.0
>
>         Attachments: TEZ-1512.1.WIP.patch, TEZ-1512.2.patch,
>                      large_job_small_tasks.svg, with_patch_large_job_small_tasks.svg
>
>
> I ran a synthetic benchmark (without much input data) with the Tez app.
> The goal was to understand the bare minimum time Tez takes for container
> launch, reuse, scheduling, etc.
> Profiling DAGAppMaster showed that a lot of CPU time was spent in
> VertexImpl.getTask(int), which is called as part of event handling and
> transitions.
> This problem would be more prevalent in large jobs that have lots of small
> tasks.
> I will attach the perf SVG output of the DAG soon.

--
This message was sent by Atlassian JIRA
(v6.2#6252)