Here's a program with a couple of problems.

It runs three concurrent child processes and measures the resource usage 
of each of them separately.  As a dummy child I'm using /bin/sh -c 
"yes >/dev/null", which I let run for a few seconds before forcibly 
terminating it.

package main

import (
	"context"
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// child runs the dummy command for n seconds, then reports its resource usage.
func child(n int, done chan int) {
	defer func() { done <- 0 }()
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(n)*time.Second)
	defer cancel()

	cmd := exec.CommandContext(ctx, "/bin/sh", "-c", "yes >/dev/null")
	err := cmd.Run()
	if err != nil {
		fmt.Printf("%d Run(): %v\n", n, err)
	}
	if cmd.ProcessState == nil {
		fmt.Printf("%d nil ProcessState\n", n)
		return
	}
	// SysUsage() returns the rusage collected when the child was waited for.
	if rusage, ok := cmd.ProcessState.SysUsage().(*syscall.Rusage); ok {
		fmt.Printf("rusage %d: Utime=%v, Stime=%v, Maxrss=%v\n", n, rusage.Utime,
			rusage.Stime, rusage.Maxrss)
	} else {
		fmt.Printf("%d no rusage\n", n)
	}
}

func main() {
	done := make(chan int)
	go child(4, done)
	go child(1, done)
	go child(2, done)
	// Wait for all three children to report.
	<-done
	<-done
	<-done
	fmt.Println("Bye!")
}

*Problem 1*: when the context timeout expires, the shell is killed, but its 
descendant process ("yes") isn't.  This leaves three orphaned "yes" 
processes running, burning CPU on your machine, which have to be found and 
killed manually.  (Aside: that's why I didn't want to post it on 
play.golang.org, although I expect it has strong protections against this 
sort of thing.)

When the context timeout expires, exec.CommandContext kills the process by 
calling Process.Kill.  The documentation 
<https://golang.org/pkg/os/#Process.Kill> is ambiguous about whether 
Process.Kill sends a SIGTERM or a SIGKILL (since "kill" is both the name of 
the syscall and the name of a signal).  Looking at the implementation 
<https://github.com/golang/go/blob/master/src/os/exec_posix.go#L65>, it 
appears to send SIGKILL, which cannot be caught, so there's no opportunity 
for the process to kill its descendants.

I'm not sure what the right solution is here, but I think it involves 
sending a signal to a process group (-pid) rather than to a single 
process, which could be done if the child runs in its own process group 
(setpgid? setsid?).

*Problem 2*: the Utime/Stime CPU usage printed is very low.  I believe it's 
showing me the resource usage for the parent shell, but not the child "yes" 
process.  I'd like to have the resource usage for the subprocess *and* its 
descendants.

As far as I can see, the usage comes from wait4() here: 
https://github.com/golang/go/blob/master/src/os/exec_unix.go#L43.  The 
manpage for wait4 says:

       If rusage is not NULL, the struct rusage to which it points will
       be filled with accounting information about the child.  See
       getrusage(2) for details.

However it doesn't say if it uses RUSAGE_CHILDREN or RUSAGE_SELF, 
which getrusage() lets you specify.  A bit of Googling turns up that some 
systems have a wait6 
<http://manpages.ubuntu.com/manpages/xenial/man2/waitpid.2freebsd.html> 
which returns both forms of usage.

Although Go lets me call Getrusage() 
<https://golang.org/pkg/syscall/#Getrusage> directly, this isn't much use 
if there are multiple concurrent children, because RUSAGE_CHILDREN is a 
single total for the whole calling process.  And as far as I can see, Go 
doesn't let me fork() my own child explicitly so I could measure its 
descendants separately.

Right now I'm thinking I'll have to invoke a wrapper binary, e.g.

exec.CommandContext(ctx, "measure_resource", "real_program", "arg1", "arg2")

where "measure_resource" calls Getrusage(RUSAGE_CHILDREN) and writes it to 
stderr just before terminating, and the parent extracts this from stderr.  
It could also apply its own session with setsid, and/or implement a softer 
timeout than the hard SIGKILL that exec.CommandContext() generates.

Can anyone think of a cleaner solution to this?

Many thanks,

Brian.
