[ https://issues.apache.org/jira/browse/TEZ-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rajesh Balamohan reassigned TEZ-1512:
-------------------------------------

    Assignee: Rajesh Balamohan

> VertexImpl.getTask(int) can be CPU intensive when lots of tasks are present in the vertex
> -----------------------------------------------------------------------------------------
>
>                 Key: TEZ-1512
>                 URL: https://issues.apache.org/jira/browse/TEZ-1512
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>              Labels: performance
>         Attachments: TEZ-1512.1.WIP.patch, TEZ-1512.2.patch, large_job_small_tasks.svg, with_patch_large_job_small_tasks.svg
>
>
> I ran a synthetic benchmark (without much input data) with the Tez app to understand the bare minimum time Tez takes for container launch, reuse, scheduling, etc.
> Profiling DAGAppMaster showed that a lot of CPU time was spent in VertexImpl.getTask(int), which is called as part of event handling and transitions.
> This problem would be more prevalent in large jobs that have lots of small tasks.
> I will attach the perf SVG output of the DAG soon.

--
This message was sent by Atlassian JIRA (v6.2#6252)