[jira] [Commented] (TEZ-1698) Cut down on ResourceCalculatorProcessTree overheads in Tez

2014-10-30 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191204#comment-14191204
 ] 

Gopal V commented on TEZ-1698:
--

+1 - Thanks Rajesh, this looks good.

This isn't on by default, so it should be good for 0.5.2 as well.

> Cut down on ResourceCalculatorProcessTree overheads in Tez
> --
>
> Key: TEZ-1698
> URL: https://issues.apache.org/jira/browse/TEZ-1698
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.2
> Reporter: Gopal V
> Assignee: Rajesh Balamohan
> Attachments: ProcfsBasedProcessTree.png, ProcfsFiles.png, 
> TEZ-1698.1.patch, TEZ-1698.2.patch, TEZ-1698.3.patch, TEZ-1698.4.patch
>
>
> ResourceCalculatorProcessTree scrapes all of /proc/ for PIDs which are part of 
> the current task's process group.
> This work is mostly wasted in Tez. YARN has to do it because it has the PID 
> of the container-executor process (bash) and must trace the bash -> java 
> spawn inheritance; Tez has no such need.
> !ProcfsBasedProcessTree.png!
> The latency effect is less clearly visible with the profiler turned on, as 
> it comes primarily from the rate of syscalls and the overhead in the kernel 
> (via the following codepath in YARN).
> !ProcfsFiles.png!
> {code}
>   private List<String> getProcessList() {
>     String[] processDirs = (new File(procfsDir)).list();
>     ...
>     for (String dir : processDirs) {
>       try {
>         if ((new File(procfsDir, dir)).isDirectory()) {
>           processList.add(dir);
>         }
>     ...
>   public void updateProcessTree() {
>     if (!pid.equals(deadPid)) {
>       // Get the list of processes
>       List<String> processList = getProcessList();
>       ...
>       for (String proc : processList) {
>         // Get information for each process
>         ProcessInfo pInfo = new ProcessInfo(proc);
>         if (constructProcessInfo(pInfo, procfsDir) != null) {
>           allProcessInfo.put(proc, pInfo);
>           if (proc.equals(this.pid)) {
>             me = pInfo; // cache 'me'
>             processTree.put(proc, pInfo);
>           }
>         }
>       }
> {code}
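
For contrast with the /proc scan quoted above, here is a minimal sketch of sampling only the current JVM, with no directory listing and no per-PID file reads. It is an illustration under stated assumptions, not necessarily what the attached TEZ-1698 patches do: the class name SelfProcessMetrics and its methods are hypothetical, and the numbers it reports (JVM-managed memory, process CPU time) are not the same quantities a procfs-based process tree reports. The CPU reading relies on com.sun.management.OperatingSystemMXBean, which may be absent on some JVMs, so the code falls back to -1.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical helper: samples only the current JVM and never lists /proc. */
final class SelfProcessMetrics {
  private SelfProcessMetrics() {}

  /** Heap plus non-heap memory currently used by this JVM, in bytes. */
  static long usedMemoryBytes() {
    MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
    return memory.getHeapMemoryUsage().getUsed()
        + memory.getNonHeapMemoryUsage().getUsed();
  }

  /** CPU time consumed by this JVM in nanoseconds, or -1 if unavailable. */
  static long cpuTimeNanos() {
    OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
    // Assumption: the extended bean is present (true on common HotSpot/OpenJDK builds).
    if (os instanceof com.sun.management.OperatingSystemMXBean) {
      return ((com.sun.management.OperatingSystemMXBean) os).getProcessCpuTime();
    }
    return -1L;
  }

  public static void main(String[] args) {
    System.out.println("used memory bytes: " + usedMemoryBytes());
    System.out.println("cpu time nanos:    " + cpuTimeNanos());
  }
}
```

Each call touches only in-process counters, so its cost does not grow with the number of processes on the host, unlike listing every entry under /proc.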



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-1698) Cut down on ResourceCalculatorProcessTree overheads in Tez

2014-10-30 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191458#comment-14191458
 ] 

Siddharth Seth commented on TEZ-1698:
-

[~rajesh.balamohan] - I think the license header needs to be at the top of 
each file, before the package declaration and imports.

> Cut down on ResourceCalculatorProcessTree overheads in Tez
> --
>
> Key: TEZ-1698
> URL: https://issues.apache.org/jira/browse/TEZ-1698
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.2
> Reporter: Gopal V
> Assignee: Rajesh Balamohan
> Fix For: 0.5.2
>
> Attachments: ProcfsBasedProcessTree.png, ProcfsFiles.png, 
> TEZ-1698.1.patch, TEZ-1698.2.patch, TEZ-1698.3.patch, TEZ-1698.4.patch
>
>


