Joshua M. Clulow writes:

> On 22 June 2015 at 15:47, Ryan Zezeski <[email protected]> wrote:
>>> Do you have core files for this failure:
>>>   Assertion failed: ucp != NULL, file ../common/lx_brand.c, line 797
>> Unfortunately I do not see any cores from around the time that happened.
>> In fact, I feel like it didn't create a core, is that possible? I
>> remember the process never actually died. I have global cores enabled
>> and see cores for other crashes.
>>
>> Looking in my notes I see that when the assert tripped the test runner
>> process started spinning in __sigaction() and never returned. I had a
>> pstack of this moment but deleted it...sigh.
>
> It's possible that the SIGABRT generated by the assertion was being
> (mis)handled by the runtime.  This would not be the first managed code
> environment to obscure an assertion with a SIGABRT handler.
>

Yes, the runtime installs its own SIGABRT handler, though I'm not quite
sure of its purpose.

https://github.com/mono/mono/blob/mono-4.0.1.44/mono/mini/mini-posix.c#L197

Looks like SIGSEGV is rebranded for debugging:

https://github.com/mono/mono/blob/mono-4.0.1.44/mono/mini/mini.c#L6736

> You could try setting LX_NO_ABORT_HANDLER in the environment, which
> will silently ignore the attempt to install a SIGSEGV or SIGABRT
> handler from Linux code.  Mono may require SIGSEGV for NULL
> dereference exceptions, though, if it's similar to the JDK; you might
> need to patch the LX brand signal code.  You should be able to rebuild
> and lofs mount "lx_brand.so.1" without needing to regenerate the
> entire platform image and reboot.

That's good to know, but the assert failure is extremely rare. I have
reproduced this issue what feels like hundreds of times and only saw the
assert trip twice. Furthermore, I think it only tripped on the previous
platform image, and it may have been when I was accidentally mixing
libraries from one mono build with the runtime of another because of a
copy-pasta error.

The predominant issue is that the test gets stuck in various ways. I'm
not sure if this is an lx issue or a mono issue (passes just fine on
KVM) or maybe a little of both.

...and then it hits me...Are you saying that maybe mono's
SIGABRT/SIGSEGV handlers are hiding underlying issues? I see some code
about signal chaining, alternate stacks, ignoring SIGABRT in certain
contexts, and all this other stuff that looks way complicated and
fragile. I suppose I could try running as you said and see what happens.

-Z

