On Thursday, November 29, 2018 at 6:22:56 PM UTC+13, Justin Israel wrote:
>
>
>
> On Thu, Nov 29, 2018 at 6:20 PM Justin Israel <justinisr...@gmail.com> 
> wrote:
>
>> On Thu, Nov 29, 2018 at 5:32 PM Ian Lance Taylor <i...@golang.org> wrote:
>>
>>> On Wed, Nov 28, 2018 at 7:18 PM Justin Israel <justinisr...@gmail.com> 
>>> wrote:
>>> >
>>> > I've got a service that I have been testing quite a lot over the last 
>>> few days. Only after I handed it off to a colleague for some testing was 
>>> he able to produce a SIGBUS panic that I had not seen before:
>>> >
>>> > go 1.11.2 linux/amd64
>>> >
>>> > The service does set up its own SIGINT/SIGTERM handling via the 
>>> typical signal.Notify approach. The nature of the program is that it 
>>> listens on nats.io message queues and receives requests to run tasks 
>>> as sub-processes. My tests have been running between 40 and 200 of these 
>>> instances over the course of a few days. But this panic occurred on a 
>>> completely different machine than those I had been testing on...
>>> >
>>> > goroutine 1121 [runnable (scan)]:
>>> > fatal error: unexpected signal during runtime execution
>>> > panic during panic
>>> > [signal SIGBUS: bus error code=0x2 addr=0xfa2adc pc=0x451637]
>>> >
>>> > runtime stack:
>>> > runtime.throw(0xcf7fe3, 0x2a)
>>> >         /vol/apps/go/1.11.2/src/runtime/panic.go:608 +0x72
>>> > runtime.sigpanic()
>>> >         /vol/apps/go/1.11.2/src/runtime/signal_unix.go:374 +0x2f2
>>> > runtime.gentraceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 
>>> 0xc0004baa80, 0x0, 0x0, 0x64, 0x0, 0x0, 0x0, ...)
>>> >         /vol/apps/go/1.11.2/src/runtime/traceback.go:190 +0x377
>>> > runtime.traceback1(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 
>>> 0xc0004baa80, 0x0)
>>> >         /vol/apps/go/1.11.2/src/runtime/traceback.go:728 +0xf3
>>> > runtime.traceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 
>>> 0xc0004baa80)
>>> >         /vol/apps/go/1.11.2/src/runtime/traceback.go:682 +0x52
>>> > runtime.tracebackothers(0xc00012e780)
>>> >         /vol/apps/go/1.11.2/src/runtime/traceback.go:947 +0x187
>>> > runtime.dopanic_m(0xc00012e780, 0x42dcc2, 0x7f83f6ffc808, 0x1)
>>> >         /vol/apps/go/1.11.2/src/runtime/panic.go:805 +0x2aa
>>> > runtime.fatalthrow.func1()
>>> >         /vol/apps/go/1.11.2/src/runtime/panic.go:663 +0x5f
>>> > runtime.fatalthrow()
>>> >         /vol/apps/go/1.11.2/src/runtime/panic.go:660 +0x57
>>> > runtime.throw(0xcf7fe3, 0x2a)
>>> >         /vol/apps/go/1.11.2/src/runtime/panic.go:608 +0x72
>>> > runtime.sigpanic()
>>> >         /vol/apps/go/1.11.2/src/runtime/signal_unix.go:374 +0x2f2
>>> > runtime.gentraceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 
>>> 0xc0004baa80, 0x0, 0x0, 0x7fffffff, 0x7f83f6ffcd00, 0x0, 0x0, ...)
>>> >         /vol/apps/go/1.11.2/src/runtime/traceback.go:190 +0x377
>>> > runtime.scanstack(0xc0004baa80, 0xc000031270)
>>> >         /vol/apps/go/1.11.2/src/runtime/mgcmark.go:786 +0x15a
>>> > runtime.scang(0xc0004baa80, 0xc000031270)
>>> >         /vol/apps/go/1.11.2/src/runtime/proc.go:947 +0x218
>>> > runtime.markroot.func1()
>>> >         /vol/apps/go/1.11.2/src/runtime/mgcmark.go:264 +0x6d
>>> > runtime.markroot(0xc000031270, 0xc000000047)
>>> >         /vol/apps/go/1.11.2/src/runtime/mgcmark.go:245 +0x309
>>> > runtime.gcDrain(0xc000031270, 0x6)
>>> >         /vol/apps/go/1.11.2/src/runtime/mgcmark.go:882 +0x117
>>> > runtime.gcBgMarkWorker.func2()
>>> >         /vol/apps/go/1.11.2/src/runtime/mgc.go:1858 +0x13f
>>> > runtime.systemstack(0x7f83f7ffeb90)
>>> >         /vol/apps/go/1.11.2/src/runtime/asm_amd64.s:351 +0x66
>>> > runtime.mstart()
>>> >         /vol/apps/go/1.11.2/src/runtime/proc.go:1229
>>> >
>>> > Much appreciated for any insight.
>>>
>>> Is the problem repeatable?
>>>
>>> It looks like it crashed while tracing back the stack during garbage
>>> collection, but I don't know why since the panic was evidently able to
>>> trace back the stack just fine.
>>>
>>
>>
>> Thanks for the reply. Unfortunately it was rare and never happened in my 
>> own testing of thousands of runs of this service. The colleague that saw 
>> this crash on one of his workstations was not able to repro it after 
>> attempting another run of the workflow. I wasn't really sure how to debug 
>> this particular crash since it was in the gc and I have seen a "panic 
>> during panic" before. Thought it might jump out at someone.
>>
>
> Oops. I meant that I *haven't* seen a "panic during panic" before :-) 
>
>
>>
>>> Ian
>>>
>>

This is a follow-up to the earlier SIGBUS issue in my application. While I 
still don't have a way to reproduce the problem, I have received reports 
from my users of another similar SIGBUS:

unexpected fault address 0x7fdf50
fatal error: fault
[signal 0xb code=0x2 addr=0x7fdf50 pc=0x7fdf50]

runtime.throw(0xad7840, 0x5)
        /s/go/1.12.1/src/runtime/panic.go:617 +0x72 fp=0xc000f75aa8 sp=0xc000f75a78 pc=0x444a5e
runtime.sigpanic()
        /s/go/1.12.1/src/runtime/sigpanic_unix.go:387 +0x47e fp=0xc000f75ad8 sp=0xc000f75aa8 pc=0x444a5e
project.com/project/obj.(*Server).newPushHandler.func1.1.1(0xc0008ea330, 0x25, 0x0)

This is an anonymous inline function closure that was passed to a nats.io 
client topic subscription. If I am reading this correctly, it seems the 
address of the anonymous function is suddenly invalid (the fault address 
and the pc are the same, 0x7fdf50)?

i.e.:

go func() {
    // ...
    someChan := make(chan bool, 1)
    natsConn.Subscribe(topic, func(_ string, typ Type) {
        // ...
        someChan <- true
    })
    // ...
}()

Could I be triggering a bug with this anonymous function closure inside the 
goroutine? I can try defining things outside the goroutine, including the 
handler function, roughly as in the sketch below. But honestly, without a 
reliable way to trigger the crash I would be fumbling in the dark. 
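
For reference, this is roughly the restructuring I have in mind. It is only 
a sketch, reusing the placeholder names from the snippet above (natsConn, 
topic, Type) plus a made-up handleDone name, and it assumes natsConn is an 
encoded connection so the (subject, value) handler signature still applies:

someChan := make(chan bool, 1)

// Handler defined outside the goroutine instead of inline; it still
// captures someChan from the enclosing scope.
handleDone := func(_ string, typ Type) {
    // ... existing handling ...
    someChan <- true
}

go func() {
    // ...
    if _, err := natsConn.Subscribe(topic, handleDone); err != nil {
        // ... log/handle the subscribe error ...
        return
    }
    // ...
}()

The behaviour should be the same (the handler still captures someChan); it 
only moves where the closure is defined.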

Justin 
