On Sun, Jul 5, 2020 at 4:14 PM Marcin Romaszewicz <marc...@gmail.com> wrote:
>
> On Sun, Jul 5, 2020 at 1:05 PM Ian Lance Taylor <i...@golang.org> wrote:
>>
>> On Sun, Jul 5, 2020 at 10:54 AM Marcin Romaszewicz <marc...@gmail.com> wrote:
>> >
>> > I'm hitting a problem using os/exec's Cmd.Start to run a process.
>> >
>> > I'm setting Cmd.Stdout and Cmd.Stderr to the writer of the same io.Pipe, 
>> > and spawn a goroutine to consume the pipe reader until I reach EOF. I then 
>> > call cmd.Start(), do some additional work, and call cmd.Wait(). The 
>> > runtime of the executable I launch is 15-30 minutes, and stdout/stderr 
>> > output is minimal, a few tens of kB during this 15-30 minute run.
>> >
>> > When the pipe reaches EOF or errors out, I close the pipe reader, exit the 
>> > goroutine reading the pipe, and that's when cmd.Wait() returns, exactly as 
>> > documented.
>> >
>> > This works exactly as described about 70% of the time. The remaining 30% 
>> > of the time, cmd.Wait() returns an error, which stringifies as "signal: 
>> > broken pipe". I'm running thousands of copies of this executable across 
>> > thousands of instances in AWS, so I have a big data set here. The broken 
>> > pipe error happens at the very end when my exec'd executable is exiting, 
>> > so as far as I can tell, it's run successfully and is hitting this error 
>> > on exit.
>> >
>> > I realize that SIGPIPE and EPIPE are common ways that processes clean each 
>> > other up, and that shells do a lot of work hiding them, so I've also tried 
>> > using exec.Cmd to spawn bash, which in turn runs my executable, but I 
>> > still get a lot of these deaths due to SIGPIPE.
>> >
>> > I've tried to reproduce this with simple commands - like `cat 
>> > <longfile.txt>`, and none of these simple commands ever result in the 
>> > broken pipe, and I capture all their output without issue. The command I'm 
>> > running differs in that it uses quite a lot of resources and the machine 
>> > is doing significant work when the executable is exiting. However, the 
>> > SIGPIPE is being received by the application, not my Go code, implying 
>> > that the Go side is closing the pipe. I can't find where this is happening.
>> >
>> > Any tips on how to chase this down?
>>
>> The executable is dying due to receiving a SIGPIPE signal.  As you
>> know, that means that it made a write system call to a pipe that had
>> no open readers.  If you're confident that you are reading all the
>> data from the pipe in the Go program, then the natural first thing to
>> check is the other possible pipe: if you are reading from stdout,
>> check what happens on stderr, and vice-versa.
>>
>> Since that probably won't help, and since you can reproduce it with some
>> reliability, try running the whole system under strace -f.  That will
>> show you the system calls both of your program and of the subprocess,
>> and should let you determine exactly which write is triggering the
>> SIGPIPE, and let you verify that the read end of the pipe has been
>> closed.
>>
>> And if that doesn't help, perhaps you can modify the subprocess to
>> catch SIGPIPE and get a stack trace, again with the goal of finding
>> out exactly what write is failing.
>>
>> Hope this helps.
>
>
> Thanks for the tips.
>
> The comment on Stdout and Stderr on cmd says:
>
> // If Stdout and Stderr are the same writer, and have a type that can
> // be compared with ==, at most one goroutine at a time will call Write.
>
> Using an io.Pipe shared between these two should result in both being drained 
> correctly, right?

I guess I don't see how that affects what I said one way or another.

Although, let me back up a second: are you really using an io.Pipe
rather than an os.Pipe?  An io.Pipe shouldn't lead to a SIGPIPE
signal.

Ian
