Gopal V created TEZ-1698:
----------------------------

             Summary: Use ResourceCalculatorPlugin instead of 
ResourceCalculatorProcessTree in Tez
                 Key: TEZ-1698
                 URL: https://issues.apache.org/jira/browse/TEZ-1698
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.5.2
            Reporter: Gopal V
         Attachments: ProcfsBasedProcessTree.png

ResourceCalculatorProcessTree scrapes all of /proc/ for PIDs which are part of 
the current task's process group.

This work is mostly wasted in Tez. YARN has to do it because it has the PID of 
the container-executor process (bash) and has to trace the bash -> java spawn 
inheritance; Tez has no such requirement.

!ProcfsBasedProcessTree.png!

The effect is less clearly visible with the profiler turned on, as the cost is 
primarily syscall overhead in the kernel (reached via the following codepath 
in YARN).

{code}
  private List<String> getProcessList() {
    String[] processDirs = (new File(procfsDir)).list();
...
    for (String dir : processDirs) {
      try {
        if ((new File(procfsDir, dir)).isDirectory()) {
          processList.add(dir);
        }
...

  public void updateProcessTree() {
    if (!pid.equals(deadPid)) {
      // Get the list of processes
      List<String> processList = getProcessList();
...
      for (String proc : processList) {
        // Get information for each process
        ProcessInfo pInfo = new ProcessInfo(proc);
        if (constructProcessInfo(pInfo, procfsDir) != null) {
          allProcessInfo.put(proc, pInfo);
          if (proc.equals(this.pid)) {
            me = pInfo; // cache 'me'
            processTree.put(proc, pInfo);
          }
        }
      }
{code}
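
For contrast with the full /proc scan above, here is a minimal sketch of reading memory usage for the current process only. It assumes a Linux procfs, where /proc/self resolves to the calling process, so one file is read instead of listing and inspecting every PID directory. The class SelfProcStat and its methods are hypothetical names for illustration; this is not the ResourceCalculatorPlugin implementation and is not code from Tez or YARN.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper for illustration; not part of Tez or YARN. */
final class SelfProcStat {

  /** Returns the resident set size in pages (field 24 of /proc/[pid]/stat). */
  static long parseRssPages(String statLine) {
    // Field 2 (comm) may contain spaces or parentheses, so split after the last ')'.
    int end = statLine.lastIndexOf(')');
    if (end < 0) {
      throw new IllegalArgumentException("malformed stat line");
    }
    String[] rest = statLine.substring(end + 1).trim().split("\\s+");
    // rest[0] is field 3 (state), so field 24 (rss) is rest[21].
    if (rest.length < 22) {
      throw new IllegalArgumentException("stat line has too few fields");
    }
    return Long.parseLong(rest[21]);
  }

  /** Reads the stat line of the calling process; Linux only. */
  static String readSelfStat() throws IOException {
    return new String(
        Files.readAllBytes(Paths.get("/proc/self/stat")), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    System.out.println("rss pages: " + parseRssPages(readSelfStat()));
  }
}
```

The value is in pages, so it would need to be multiplied by the system page size to get bytes. Whether a single-process read like this is sufficient depends on whether the task spawns child processes that also need to be accounted for.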



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
