The way to debug these kinds of issues is to use gdb.

The best clue in your message seems to be "The GHC heap size is
restricted to a small size ~32MB using '-M32M' rts option".
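
Since the crash is intermittent, it may help to bound the reproduction loop and record the exit status of the first failing run before moving to gdb. The following is only a minimal sketch: the helper name run_until_fail and the run limit are hypothetical, and it assumes a POSIX shell.

```sh
#!/bin/sh
# Hypothetical helper (not part of GHC, cabal or streamly): rerun a command
# until it fails or a run limit is reached.
# Usage: run_until_fail MAX_RUNS COMMAND [ARGS...]
run_until_fail() {
    max_runs=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_runs" ]; do
        if "$@"; then
            # The run succeeded, so try again.
            attempt=$((attempt + 1))
        else
            # Capture the exit status of the failing run and stop.
            status=$?
            echo "run $attempt failed with exit status $status"
            return 0
        fi
    done
    echo "no failure in $max_runs runs"
    return 1
}
```

For example, "run_until_fail 1000 cabal run properties" would stop at the first failing run. In most shells an exit status above 128 means the process was killed by a signal (139 is 128 + 11, i.e. SIGSEGV). The failing test binary can then be started under gdb with "gdb --args" followed by the binary and its arguments.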

Good luck!

Matt

On Sun, Feb 2, 2020 at 10:26 PM Harendra Kumar <[email protected]> wrote:
>
> Hi,
>
> While running a test-suite for the streaming library streamly I am 
> encountering a crash which seems to happen at random places at different 
> times. The common messages are:
>
> * Segmentation fault: 11
> * internal error: scavenge_mark_stack: unimplemented/strange closure type 24792696 @ 0x4200a623e0
> * internal error: update_fwd: unknown/strange object  223743520
>
> and several other such messages. Prima facie this looks like the memory is 
> getting corrupted/scribbled somehow. My first suspicion was that this could 
> be a problem in the streamly library code. But I have stripped down the code
> to the bare minimum, and there is no C FFI code and no poking of memory pointers.
>
> My next suspicion was the hspec/quickcheck testing code that is being used in 
> this test. I checked the hspec code to ensure that there is no C code/pointer 
> poking in any of the code involved. But no luck there either; I am still
> looking to strip down that code further.
>
> My suspicion is now moving more towards the GHC RTS. This issue shows up
> only when the following conditions are met:
>
> * hspec "parallel" combinator is used to run tests in parallel
> * streamly concurrent code is being tested which can create many threads
> * The GHC heap size is restricted to a small size ~32MB using "-M32M" rts 
> option.
> * It is consistently seen with GHC 8.6.5 as well as GHC 8.8.1
>
> It never occurs when the heap size is not restricted. I have seen random
> crashes before as well, with an "IO manager die" message, when using concurrent
> networking IO with streamly. Since it was not easily reproducible back then, I
> stopped chasing it. But now it looks like that issue might also be a
> manifestation of the same underlying problem.
>
> My guess is it could be something in the RTS concurrency/threading related 
> code. Let me know if the symptoms ring a bell or if you can point to 
> something specific based on the symptoms. Also, what are the usual 
> tools/methods/debugging aids/flags to debug such issues in GHC? If it is not
> a GHC issue, what are the possible ways in which such a problem can be
> induced by application code?
>
> Meanwhile, I am also trying to simplify the reproducing code further to 
> remove other factors as much as possible. The current code is at
> https://github.com/composewell/streamly on the ghc-segfault branch. Run
> "while true; do cabal run properties || break; done" in the shell, and if you
> are lucky it may crash soon. The test code is in "test/Prop.hs", here:
> https://github.com/composewell/streamly/blob/ghc-segfault/test/Prop.hs
>
> -harendra
> _______________________________________________
> ghc-devs mailing list
> [email protected]
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs