#367: Infinite loops can hang Concurrent Haskell
-------------------------------------+--------------------------------------
Reporter: simonpj | Owner: nobody
Type: bug | Status: assigned
Priority: lowest | Milestone: _|_
Component: Compiler | Version: 6.4.1
Severity: normal | Resolution: None
Keywords: scheduler allocation | Difficulty: Unknown
Testcase: | Os: Unknown/Multiple
Architecture: Unknown/Multiple |
-------------------------------------+--------------------------------------
Comment (by SamB):
According to CapabilitiesAndScheduling:
We do have a time-slice mechanism: the timer interrupt (see Timer.c)
sets the context_switch flag, which causes the running thread to return to
the scheduler the next time a heap check fails (at the end of the current
nursery block). When a heap check fails, the thread doesn't necessarily
always return to the scheduler: as long as the context_switch flag isn't
set, and there is another block in the nursery, it resets Hp and HpLim to
point to the new block, and continues.
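
To make the quoted behaviour concrete, here is a rough C model of the decision taken when a heap check fails. It is only a sketch under assumptions: context_switch, Hp and HpLim are the names used above, while Nursery, BLOCK_WORDS, heap_check_failed and the fixed block count are hypothetical and not taken from the RTS sources.

```c
#include <signal.h>
#include <stddef.h>

/* Hypothetical, simplified model. Only context_switch, Hp and HpLim are
   names from the text above; everything else is invented for the sketch. */

enum { BLOCK_WORDS = 1024 };       /* assumed nursery block size, in words */

typedef struct {
    size_t *blocks[4];             /* nursery blocks (fixed count for the sketch) */
    int     n_blocks;              /* how many entries of blocks[] are in use */
    int     current;               /* index of the block Hp points into */
    size_t *Hp;                    /* next free word */
    size_t *HpLim;                 /* end of the current block */
} Nursery;

/* Set asynchronously by the timer signal handler. */
static volatile sig_atomic_t context_switch = 0;

typedef enum { CONTINUE_RUNNING, RETURN_TO_SCHEDULER } HeapCheckResult;

/* Called when an allocation would move Hp past HpLim. */
HeapCheckResult heap_check_failed(Nursery *n)
{
    /* Yield if the timer asked for a switch or the nursery is exhausted. */
    if (context_switch || n->current + 1 >= n->n_blocks) {
        return RETURN_TO_SCHEDULER;
    }
    /* Otherwise move on to the next block and keep running. */
    n->current++;
    n->Hp    = n->blocks[n->current];
    n->HpLim = n->Hp + BLOCK_WORDS;
    return CONTINUE_RUNNING;
}
```

The point of the model is that context_switch is only ever inspected inside heap_check_failed. A loop that never allocates never fails a heap check, so it never gets here and never yields.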
To fix this bug, we need a way for the timer signal handler to *force* the
Haskell code to stop in bounded time.
Two ways that come to mind for the handler to force a stop are:
1. insert a breakpoint (or a special jump?) at some pre-arranged point
in any arbitrarily-long allocation-free loop, such that the breakpoint
signal handler can safely enter the scheduler
2. use some sort of instruction-by-instruction Call Frame Information,
something like that of DWARF 2 and up, to figure out the innermost frame's
stack layout.
Approach 1 is kind of icky insofar as it might cause spurious stops in
other threads, or worse!
For those unfamiliar with it: DWARF's CFI represents a function of type
(IP value × register name) → Maybe (how to find the value of that register
in the caller), where Nothing means "that register got clobbered".
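
As a rough illustration of that function's shape, here is a table-driven lookup in C. This is a sketch only: RuleKind, RecoveryRule, CfiRow and cfi_lookup are hypothetical names, the three rule kinds are a simplified subset, and real DWARF CFI is encoded quite differently (as a compact instruction stream rather than a flat table).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types: they model the shape of the mapping, not DWARF's encoding. */
typedef enum {
    RULE_SAME_VALUE,      /* register still holds the caller's value */
    RULE_AT_CFA_OFFSET,   /* value was saved at (frame address + operand) */
    RULE_IN_REGISTER      /* value was moved to register number `operand` */
} RuleKind;

typedef struct {
    RuleKind kind;
    int64_t  operand;     /* offset or register number, depending on kind */
} RecoveryRule;

typedef struct {
    uintptr_t    ip_lo, ip_hi;   /* half-open IP range [ip_lo, ip_hi) */
    int          reg;            /* register this row describes */
    RecoveryRule rule;
} CfiRow;

/* (IP value, register) -> Maybe rule.
   Returns true and fills *out when the caller's value of `reg` can be
   recovered at `ip`; returns false for the "Nothing" (clobbered) case. */
bool cfi_lookup(const CfiRow *rows, size_t n_rows,
                uintptr_t ip, int reg, RecoveryRule *out)
{
    for (size_t i = 0; i < n_rows; i++) {
        if (rows[i].reg == reg && ip >= rows[i].ip_lo && ip < rows[i].ip_hi) {
            *out = rows[i].rule;
            return true;
        }
    }
    return false;
}
```

Because the answer depends on the exact IP, a handler that interrupts the code at an arbitrary instruction can still ask where each of the caller's registers currently lives.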
For approach 2 to work, we would obviously also need to know which of the
interrupted code's registers and stack slots hold pointers that the garbage
collector should follow.
Any other ideas about how the timer handler can guarantee re-entering the
scheduler in bounded time?
There *is* the obvious "check every time around a non-allocating loop"
approach, but it seems obvious that that would cost far too much where we
can least afford it. (Is this actually true? So many things that seem
obvious aren't...)
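
For concreteness, the "check every time around" approach amounts to something like the following C sketch. The names count_with_checks and LoopStatus are hypothetical, and the loop body is a stand-in for whatever allocation-free work the compiled Haskell code would be doing; the cost in question is the one extra load and branch per iteration.

```c
#include <signal.h>
#include <stdint.h>

/* Set asynchronously by the timer signal handler. */
static volatile sig_atomic_t context_switch = 0;

typedef enum { LOOP_FINISHED, LOOP_PREEMPTED } LoopStatus;

/* Sums the integers below `limit` without allocating.
   *i and *acc hold the loop state so the caller can resume after a yield. */
LoopStatus count_with_checks(uint64_t limit, uint64_t *i, uint64_t *acc)
{
    while (*i < limit) {
        if (context_switch) {
            return LOOP_PREEMPTED;   /* the extra per-iteration check */
        }
        *acc += *i;
        (*i)++;
    }
    return LOOP_FINISHED;
}
```

Whether that check really is too expensive would have to be measured; the sketch only shows where it would sit.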
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/367#comment:10>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler