modernc.org/ccgo/v3 compiles test C source files into Go source file and
executes the result to check if it compiles and performs correctly. To
avoid incorrectly transpilled programs to hang the test, the resulting Go
code is executed like this:

        out, err := func() ([]byte, error) {
                ctx, cancel := context.WithTimeout(context.Background(),
60*time.Second)
                defer cancel()

                return exec.CommandContext(ctx, "go",
append([]string{"run", main}, runargs...)...).CombinedOutput()
        }()

(
https://gitlab.com/cznic/ccgo/-/blob/b4ea401c94ad40a2cd3146e9c1c5f0ec85f8834b/v3/lib/all_test.go#L976
)

The code is lifted almost verbatim from the example here:
https://pkg.go.dev/os/exec@go1.17.2#example-CommandContext

For some time I suspected the code does not work properly on some builders:
https://gitlab.com/cznic/ccgo/-/issues/19. I now tried to investigate and
wrap my head around what's wrong. Here's a distilled experiment, consisting
of four short Go files (also enclosed as an archive for convenience).

        jnml@e5-1650:~/tmp$ ls -l
        total 16
        -rw-r--r-- 1 jnml jnml  77 Oct 12 14:38 crash.go
        -rw-r--r-- 1 jnml jnml 112 Oct 12 14:16 hang.go
        -rw-r--r-- 1 jnml jnml 740 Oct 12 14:41 main.go
        -rw-r--r-- 1 jnml jnml 111 Oct 12 14:30 sleep.go
        jnml@e5-1650:~/tmp$ cat main.go
        package main

        import (
                "context"
                "fmt"
                "os"
                "os/exec"
                "runtime"
                "strings"
                "time"
        )

        func main() {
                fmt.Printf("%s, GO111MODULE=%v\n", runtime.Version(),
os.Getenv("GO111MODULE"))
                run("A", "sleep", "10")
                run("B", "go", "run", "sleep.go")
                run("C", "go", "run", "crash.go")
                run("D", "go", "run", "hang.go")
        }

        func run(cs, cmd string, args ...string) {
                ctx, cancel := context.WithTimeout(context.Background(),
time.Second)

                defer cancel()

                fmt.Printf("\ncase %s: excuting: %s %s\n", cs, cmd,
strings.Join(args, " "))
                t0 := time.Now()
                out, err := exec.CommandContext(ctx, cmd,
args...).CombinedOutput()
                fmt.Printf("returned in %s\n", time.Since(t0))
                if err != nil {
                        fmt.Printf("FAIL: %s, err: %s\n", out, err)
                        return
                }
        }
        jnml@e5-1650:~/tmp$ cat crash.go
        //go:build ignore
        // +build ignore

        package main

        func main() {
                select {}
        }
        jnml@e5-1650:~/tmp$ cat hang.go
        //go:build ignore
        // +build ignore

        package main

        import "net"

        var _ net.Interface

        func main() {
                select {}
        }
        jnml@e5-1650:~/tmp$ cat sleep.go
        //go:build ignore
        // +build ignore

        package main

        import "time"

        func main() {
                time.Sleep(10 * time.Second)
        }
        jnml@e5-1650:~/tmp$

Here's the output of executing main.go:

        jnml@e5-1650:~/tmp$ go run main.go
        go1.17.2, GO111MODULE=

        case A: excuting: sleep 10
        returned in 1.000413389s
        FAIL: , err: signal: killed

        case B: excuting: go run sleep.go
        returned in 10.130517587s
        FAIL: , err: signal: killed

        case C: excuting: go run crash.go
        returned in 90.458159ms
        FAIL: fatal error: all goroutines are asleep - deadlock!

        goroutine 1 [select (no cases)]:
        main.main()
        /home/jnml/tmp/crash.go:7 +0x17
        exit status 2
        , err: exit status 1

        case D: excuting: go run hang.go
        ^Csignal: interrupt
        jnml@e5-1650:~/tmp$

The program was killed by ctrl-c after hanging for more than 10 secs.

Case A is equal to the exec.ContextWithTimeout linked earlier. It works as
expected, killing the child process in one second.

Case B also returns an error "signal killed", but only after the child
process exits after time.Sleep(10*time.Second). Q1: Why is the child
process in this case not killed after the timeout of one second?

Case C works as expected, a crashed child process returns immediately.

Case D differs from case C only in that it imports "net". This was observed
recently, after modernc.org/libc started importing "github.com/google/uuid",
which in turn imports "net". Q2: Why does hang.go _not_ crash?

To verify if using 'go run' may have something to do with it, I later tried
also the below and it seems to disprove the hypothesis:

        jnml@e5-1650:~/tmp$ go build -o crash crash.go && ./crash
        fatal error: all goroutines are asleep - deadlock!

        goroutine 1 [select (no cases)]:
        main.main()
        /home/jnml/tmp/crash.go:7 +0x17
        jnml@e5-1650:~/tmp$ go build -o hang hang.go && ./hang
        ^C
        jnml@e5-1650:~/tmp$

Summary: There are two things I either do not understand correctly or that
might be bugs:

1)  exec.CommandWithContext is not always cancelled in one second as
expected for passing it context.WithTimeout(context.Background(),
time.Second).

2) The deadlock detector might be confused by merely importing package
"net", probably because of (transitively) initialization of that package.

Any insights will be appreciated, thanks in advance.

-j

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CAA40n-Utm34ROhU4pLrTCg22jwJ1mtXyo9KSsd8j5fc7Q5V0Ag%40mail.gmail.com.

Attachment: src.tar.gz
Description: application/gzip

Reply via email to