Re: Understanding behavior of BlockedIndefinitelyOnMVar exception

2011-07-26 Thread Brandon Simmons
On Tue, Jul 26, 2011 at 1:25 AM, Edward Z. Yang  wrote:
> Hello Brandon,
>
> The answer is subtle, and has to do with what references are kept in code,
> which make an object considered reachable.  Essentially, the main thread
> itself keeps the MVar live while it still has forking to do, so that
> it cannot get garbage collected and trigger these errors.

Ah, okay. That seems like an obvious explanation for the exceptions
to be raised at the same time in the forked threads.

>
> Here is a simple demonstrative program:
>
>    main = do
>        lock <- newMVar ()
>        forkIO (takeMVar lock)
>        forkIO (takeMVar lock)
>        forkIO (takeMVar lock)
>

(snip)

>
> But in the meantime (esp. between invocation 2 and 3), the MVar cannot be
> garbage collected, because it is live on the stack.
>
> Could GHC have been more clever in this case?  Not in general, since deciding
> whether or not a reference will actually be used or not boils down to the
> halting problem.
>
>    loop = threadDelay 100 >> loop -- prevent blackholing from discovering this
>    main = do
>        lock <- newEmptyMVar
>        t1 <- newEmptyMVar
>        forkIO (takeMVar lock >> putMVar t1 ())
>        forkIO (loop `finally` putMVar lock ())
>        takeMVar t1
>
> Maybe we could do something where MVar references are known to be writer ends
> or read ends, and let the garbage collector know that an MVar with only read
> ends left is a deadlocked one.  However, this would be a very imprecise
> analysis, and would not help in your original code (since all of your 
> remaining
> threads had the possibility of writing to the MVar: it doesn't become clear
> that they can't until they all hit their takeMVar statements.)

I think this is the crux of what I was confused about. I had assumed
read vs. write was being taken into account by the runtime in raising
BlockedIndefinitelyOnMVar. This makes it obvious:

loop = threadDelay 100 >> loop -- prevent blackholing from discovering this
main = do
   lock <- newEmptyMVar
   forkIO (loop `finally` takeMVar lock)
   takeMVar lock


Given that, I still can't say I understand what is happening in my
original code. I'll try to work out an even simpler example on my own.

Thanks for  the thoughtful response,
Brandon


>
> Cheers,
> Edward
>

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Understanding behavior of BlockedIndefinitelyOnMVar exception

2011-07-25 Thread Edward Z. Yang
Hello Brandon,

The answer is subtle, and has to do with what references are kept in code,
which make an object considered reachable.  Essentially, the main thread
itself keeps the MVar live while it still has forking to do, so that
it cannot get garbage collected and trigger these errors.

Here is a simple demonstrative program:

main = do
lock <- newMVar ()
forkIO (takeMVar lock)
forkIO (takeMVar lock)
forkIO (takeMVar lock)

Consider what the underlying code needs to do after it has performed
the first forkIO.  'lock' is a local variable that the code generator
knows it's going to need later in the function body. So what does it
do? It saves it on the stack.

// R1 is a pointer to the MVar
cqo:
Hp = Hp + 8;
if (Hp > HpLim) goto cqq;
I32[Hp - 4] = spd_info;
I32[Hp + 0] = R1;
I32[Sp + 0] = R1;
R1 = Hp - 3;
I32[Sp - 4] = spe_info;
Sp = Sp - 4;
jump stg_forkzh ();

(Ignore the Hp > HpLim; that's just the heap check.)

This lives on until we continue executing the main thread at spe_info
(at which point we may or may not deallocate the stack frame).  But what
happens instead?

cqk:
Hp = Hp + 8;
if (Hp > HpLim) goto cqm;
I32[Hp - 4] = sph_info;
I32[Hp + 0] = I32[Sp + 4];
R1 = Hp - 3;
I32[Sp + 0] = spi_info;
jump stg_forkzh ();

We keep the pointer to the MVar to the stack, because we know there
is yet /another/ forkIO (takeMVar lock) coming up. (It's located at
Sp + 4; you have to squint a little since Sp is being fiddled
with, but it's still there, we just overwrite the infotable with
a new one.)

Finally, spi_info decides we don't need the contents of Sp + 4 anymore,
and overwrites it accordingly:

cqg:
Hp = Hp + 8;
if (Hp > HpLim) goto cqi;
I32[Hp - 4] = spl_info;
I32[Hp + 0] = I32[Sp + 4];
R1 = Hp - 3;
I32[Sp + 4] = spm_info;
Sp = Sp + 4;
jump stg_forkzh ();

But in the meantime (esp. between invocation 2 and 3), the MVar cannot be
garbage collected, because it is live on the stack.

Could GHC have been more clever in this case?  Not in general, since deciding
whether or not a reference will actually be used or not boils down to the
halting problem.

loop = threadDelay 100 >> loop -- prevent blackholing from discovering this
main = do
lock <- newEmptyMVar
t1 <- newEmptyMVar
forkIO (takeMVar lock >> putMVar t1 ())
forkIO (loop `finally` putMVar lock ())
takeMVar t1

Maybe we could do something where MVar references are known to be writer ends
or read ends, and let the garbage collector know that an MVar with only read
ends left is a deadlocked one.  However, this would be a very imprecise
analysis, and would not help in your original code (since all of your remaining
threads had the possibility of writing to the MVar: it doesn't become clear
that they can't until they all hit their takeMVar statements.)

Cheers,
Edward

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Understanding behavior of BlockedIndefinitelyOnMVar exception

2011-07-25 Thread Brandon Simmons
On Sun, Jul 24, 2011 at 10:07 PM, Edward Z. Yang  wrote:
> Excerpts from Felipe Almeida Lessa's message of Sun Jul 24 22:02:36 -0400 
> 2011:
>> Does anything change if you somehow force a GC sometime after "good2"?
>>  Perhaps with some calculation generating garbage, perhaps with
>> performGC.  IIRC, the runtime detects BlockedIndefinitelyOnMVar on GC.
>>  But I'm probably wrong =).
>
> That's correct.
>
>   resurrectThreads is called after garbage collection on the list of
>   threads found to be garbage.  Each of these threads will be woken
>   up and sent a signal: BlockedOnDeadMVar if the thread was blocked
>   on an MVar, or NonTermination if the thread was blocked on a Black
>   Hole.
>
> Cheers,
> Edward
>

Thanks, Edward. I'm going to take a look at the GHC source and see if
I can grok any of it. Any comment on whether it is correct behavior to
have the exception raised in all the threads attempting a readMVar at
once (if that's actually what's happening), even though an exception
handler will fill the MVar for subsequent threads?

I think I'm not totally clear on what qualifies as "indefinitely"

Thanks again,
Brandon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Understanding behavior of BlockedIndefinitelyOnMVar exception

2011-07-25 Thread Brandon Simmons
On Sun, Jul 24, 2011 at 10:02 PM, Felipe Almeida Lessa
 wrote:
> On Sun, Jul 24, 2011 at 7:56 PM, Brandon Simmons
>  wrote:
>> What I think I've learned here is that the BlockedIndefinitelyOnMVar
>> exception is raised in all the blocked threads "at once" as it were.
>> That despite the fact that the handler code in 'lockPrint' restores
>> the lock for successive threads.
>>
>> This would also seem to imply that putMVar's in an exception handler
>> don't stop the runtime from raising the BlockedIndefinitelyOnMVar. But
>> that doesn't really seem right.
>
> Does anything change if you somehow force a GC sometime after "good2"?
>  Perhaps with some calculation generating garbage, perhaps with
> performGC.  IIRC, the runtime detects BlockedIndefinitelyOnMVar on GC.
>  But I'm probably wrong =).

Here is a variation that calls 'performGC' after the first thread is
forked. It prints the exception simultaneously right before  the last
'threadDelay':

main2 = do
lock <- newMVar ()
forkIO $ lockPrint "good1" lock

threadDelay 100
forkIO $ badLockPrint "bad" lock

-- these both raise blocked indefinitely exception
threadDelay 100
forkIO $ lockPrint "good2" lock
performGC
threadDelay 100
forkIO $ lockPrint "good3" lock

threadDelay 100

Perhaps laziness is confusing the issue as well?

Thanks and sorry for the delayed response,
Brandon Simmons



>
> Cheers,
>
> --
> Felipe.
>

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Understanding behavior of BlockedIndefinitelyOnMVar exception

2011-07-24 Thread Edward Z. Yang
Excerpts from Felipe Almeida Lessa's message of Sun Jul 24 22:02:36 -0400 2011:
> Does anything change if you somehow force a GC sometime after "good2"?
>  Perhaps with some calculation generating garbage, perhaps with
> performGC.  IIRC, the runtime detects BlockedIndefinitelyOnMVar on GC.
>  But I'm probably wrong =).

That's correct.

   resurrectThreads is called after garbage collection on the list of
   threads found to be garbage.  Each of these threads will be woken
   up and sent a signal: BlockedOnDeadMVar if the thread was blocked
   on an MVar, or NonTermination if the thread was blocked on a Black
   Hole.

Cheers,
Edward

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Understanding behavior of BlockedIndefinitelyOnMVar exception

2011-07-24 Thread Felipe Almeida Lessa
On Sun, Jul 24, 2011 at 7:56 PM, Brandon Simmons
 wrote:
> What I think I've learned here is that the BlockedIndefinitelyOnMVar
> exception is raised in all the blocked threads "at once" as it were.
> That despite the fact that the handler code in 'lockPrint' restores
> the lock for successive threads.
>
> This would also seem to imply that putMVar's in an exception handler
> don't stop the runtime from raising the BlockedIndefinitelyOnMVar. But
> that doesn't really seem right.

Does anything change if you somehow force a GC sometime after "good2"?
 Perhaps with some calculation generating garbage, perhaps with
performGC.  IIRC, the runtime detects BlockedIndefinitelyOnMVar on GC.
 But I'm probably wrong =).

Cheers,

-- 
Felipe.

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Understanding behavior of BlockedIndefinitelyOnMVar exception

2011-07-24 Thread Brandon Simmons
I'm trying to really understand how the BlockedIndefinitelyOnMVar
exception works in concurrent code as I would like to rely on it as a
useful runtime signal in a concurrency library I'm working on.

Here is some code illustrating a function restoring an abandoned lock
in a single-threaded program and works as I would expect:

 START CODE 
module Main
where

import Control.Concurrent
import Control.Exception

-- This raises the exception only once and the lock is successfully restored:
main1 = do
lock <- newMVar ()
lockPrint "good1" lock
badLockPrint "bad" lock
-- exception is raised and lock is restored here:
lockPrint "good2" lock
-- no exception raised:
lockPrint "good3" lock
readMVar lock

lockPrint :: String -> MVar () -> IO ()
lockPrint name v =
do e <- try $ takeMVar v :: IO (Either BlockedIndefinitelyOnMVar ())
   -- either print exception, or print name:
   either print (const $ putStrLn name) e
   `finally`  putMVar v ()

-- perhaps simulates an operation that died before it could return a lock:
badLockPrint :: String -> MVar () -> IO ()
badLockPrint s v = do
takeMVar v
putStrLn s
-- Forgot to return the lock here!:
 END CODE 


Now here is a variation of 'main' that forks the operations:


 START CODE 
main0 = do
lock <- newMVar ()
forkIO $ lockPrint "good1" lock

threadDelay 100
forkIO $ badLockPrint "bad" lock

-- these both raise blocked indefinitely exception
threadDelay 100
forkIO $ lockPrint "good2" lock
threadDelay 100
forkIO $ lockPrint "good3" lock

    threadDelay 100
 END CODE 


What I think I've learned here is that the BlockedIndefinitelyOnMVar
exception is raised in all the blocked threads "at once" as it were.
That despite the fact that the handler code in 'lockPrint' restores
the lock for successive threads.

This would also seem to imply that putMVar's in an exception handler
don't stop the runtime from raising the BlockedIndefinitelyOnMVar. But
that doesn't really seem right.

Can anyone comment on the two conclusions above?

FWIW, this was an interesting related thread:
http://comments.gmane.org/gmane.comp.lang.haskell.glasgow.user/18667

Thanks,
Brandon Simmons
http://coder.bsimmons.name

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: BlockedIndefinitelyOnMVar exception

2010-07-05 Thread Neil Mitchell
>> I wrote my Chan around the abstraction:
>>
>> data Chan a = Chan (MVar (Either [a] [MVar a]))
>>
>> The Chan either has elements in it (Left), or has readers waiting for
>> elements (Right). To get the fairness properties on Chan you might
>> want to make these two lists Queue's, but I think the basic principle
>> still works. By using this abstraction my Chan was a lot simpler. With
>> this scheme implementing isEmpyChan or unGetChan would both work
>> nicely. My Chan was not designed for performance. (In truth I replaced
>> the Left with IntMap a, and inserted elements with a randomly chosen
>> key, but the basic idea is the same.)
>
> I like the idea.  But what happens if one of the blocked threads gets killed
> by a killThread (e.g. a timeout) while it is waiting?  Won't we still give
> it an element of the Chan sometime in the future?  Perhaps this doesn't
> happen in your scenario, but it seems to throw a spanner in the works for
> using this as a general-purpose implementation.

I hadn't thought of that at all - my scenario doesn't have any threads
being killed. With the thought of threads dying concurrency
abstractions become significantly harder - I hadn't quite realised how
hard that must make it.

Thanks, Neil
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: BlockedIndefinitelyOnMVar exception

2010-07-05 Thread Simon Marlow

On 04/07/2010 21:51, Neil Mitchell wrote:

http://hackage.haskell.org/trac/ghc/ticket/4154


Yup, that's a bug.  Not clear if it's fixable.


http://hackage.haskell.org/trac/ghc/ticket/3527


That too.  A very similar bug in fact, if there is a fix it will probably
fix both of them.  The problem is that readChan holds a lock on the read end
of the Chan, so neither isEmptyChan nor unGetChan can work when a reader is
blocked.


I wrote my Chan around the abstraction:

data Chan a = Chan (MVar (Either [a] [MVar a]))

The Chan either has elements in it (Left), or has readers waiting for
elements (Right). To get the fairness properties on Chan you might
want to make these two lists Queue's, but I think the basic principle
still works. By using this abstraction my Chan was a lot simpler. With
this scheme implementing isEmpyChan or unGetChan would both work
nicely. My Chan was not designed for performance. (In truth I replaced
the Left with IntMap a, and inserted elements with a randomly chosen
key, but the basic idea is the same.)


I like the idea.  But what happens if one of the blocked threads gets 
killed by a killThread (e.g. a timeout) while it is waiting?  Won't we 
still give it an element of the Chan sometime in the future?  Perhaps 
this doesn't happen in your scenario, but it seems to throw a spanner in 
the works for using this as a general-purpose implementation.


The STM version doesn't have this bug, of course :-)  But then, it 
doesn't have fairness either.


Cheers,
Simon


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: BlockedIndefinitelyOnMVar exception

2010-07-04 Thread Neil Mitchell
>> http://hackage.haskell.org/trac/ghc/ticket/4154
>
> Yup, that's a bug.  Not clear if it's fixable.
>
>> http://hackage.haskell.org/trac/ghc/ticket/3527
>
> That too.  A very similar bug in fact, if there is a fix it will probably
> fix both of them.  The problem is that readChan holds a lock on the read end
> of the Chan, so neither isEmptyChan nor unGetChan can work when a reader is
> blocked.

I wrote my Chan around the abstraction:

data Chan a = Chan (MVar (Either [a] [MVar a]))

The Chan either has elements in it (Left), or has readers waiting for
elements (Right). To get the fairness properties on Chan you might
want to make these two lists Queue's, but I think the basic principle
still works. By using this abstraction my Chan was a lot simpler. With
this scheme implementing isEmpyChan or unGetChan would both work
nicely. My Chan was not designed for performance. (In truth I replaced
the Left with IntMap a, and inserted elements with a randomly chosen
key, but the basic idea is the same.)

>> own Chan implementation worked. My Chan had different properties (it
>> queues items randomly) and a subset of the Chan functions, so it still
>> doesn't prove any issue with Chan - but I am now sceptical.
>
> It's surprising how difficult it is to get these MVar-based abstractions
> right.  Some thorough testing of Chan is probably in order.

Agreed! In this project I wrote 8 different concurrency abstractions.
I had bugs in most. MVar is a great building block on which to put
higher layered abstractions, but using it correctly is tricky. I found
that I used MVar's in four ways:

1) MVar's which are always full, and are just locks around data for
consistency. Created with newMVar, used with modifyMVar.

2) MVar's which contain unit and are used for locking something other
than data (i.e. a file on disk). Created with newMVar, used with
withMVar.

3) MVar's which are used to signal computation can begin, created with
newMVarEmpty, given to someone who calls putMVar (), and waited on by
the person who created them.

4) MVar's which go in a higher-level concurrency operation - CountVars
(variables which wait until they have been signaled N times), RandChan
(Chan but with randomness), Pool (thread pool) etc.

Thanks, Neil
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: BlockedIndefinitelyOnMVar exception

2010-07-04 Thread Simon Marlow

On 04/07/10 10:30, Neil Mitchell wrote:

Hi Simon,


My suspicion for the root cause of the problem is that Concurrent.Chan
is incorrect. In the course of debugging this problem we found 2 bugs
in Chan, and while I never tracked down any other bugs in Chan, I no
longer trust it. By rewriting parts of the program, including avoiding
Chan, the bugs disappeared.I don't think I'll be using Chan again
until after someone has proven in correct.


Considering Chan is<150 lines of code and has been around for many years,
that's amazing!  Did you report the bugs?  Is it anything to do with
asynchronous exceptions?


Nothing to do with async exceptions. I found:

http://hackage.haskell.org/trac/ghc/ticket/4154


Yup, that's a bug.  Not clear if it's fixable.


http://hackage.haskell.org/trac/ghc/ticket/3527


That too.  A very similar bug in fact, if there is a fix it will 
probably fix both of them.  The problem is that readChan holds a lock on 
the read end of the Chan, so neither isEmptyChan nor unGetChan can work 
when a reader is blocked.



Of course, there's also the async exceptions bug still around:

http://hackage.haskell.org/trac/ghc/ticket/3160


Yes, that's a bug (though not in Chan).


However, even after having a program with no async exceptions (I never
used them), and eliminating unGetChan and isEmpyChan, I still got
bugs. I have no proof they came from the Chan module, and no minimal
test case was ever able to recreate them, but the same program with my
own Chan implementation worked. My Chan had different properties (it
queues items randomly) and a subset of the Chan functions, so it still
doesn't prove any issue with Chan - but I am now sceptical.


It's surprising how difficult it is to get these MVar-based abstractions 
right.  Some thorough testing of Chan is probably in order.


Cheers, 
Simon
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: BlockedIndefinitelyOnMVar exception

2010-07-04 Thread Neil Mitchell
Hi Simon,

>> My suspicion for the root cause of the problem is that Concurrent.Chan
>> is incorrect. In the course of debugging this problem we found 2 bugs
>> in Chan, and while I never tracked down any other bugs in Chan, I no
>> longer trust it. By rewriting parts of the program, including avoiding
>> Chan, the bugs disappeared.I don't think I'll be using Chan again
>> until after someone has proven in correct.
>
> Considering Chan is <150 lines of code and has been around for many years,
> that's amazing!  Did you report the bugs?  Is it anything to do with
> asynchronous exceptions?

Nothing to do with async exceptions. I found:

http://hackage.haskell.org/trac/ghc/ticket/4154
http://hackage.haskell.org/trac/ghc/ticket/3527

Of course, there's also the async exceptions bug still around:

http://hackage.haskell.org/trac/ghc/ticket/3160

However, even after having a program with no async exceptions (I never
used them), and eliminating unGetChan and isEmpyChan, I still got
bugs. I have no proof they came from the Chan module, and no minimal
test case was ever able to recreate them, but the same program with my
own Chan implementation worked. My Chan had different properties (it
queues items randomly) and a subset of the Chan functions, so it still
doesn't prove any issue with Chan - but I am now sceptical.

> You should have more luck with Control.Concurrent.STM.TChan, incedentally.
>  It's much easier to get right, and when we benchmarked it, performance was
> about the same (all those withMVar/modifyMVars in Chan are quite expensive),
> plus you get to compose it easily: reading from either of 2 TChans is
> trivial.

The performance of the Haskell is irrelevant - the program spends all
its time invoking system calls. Looking at the implementation I am
indeed much more trusting of TChan, I'll be using that in future if
there is ever a need.

Thanks, Neil
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: BlockedIndefinitelyOnMVar exception

2010-07-02 Thread Simon Marlow

On 01/07/2010 21:10, Neil Mitchell wrote:

Hi Simon,

Thanks for the excellent information. I've now debugged my problem,
and think I've got the last of the MVar blocking problems out.


* How confident are people that this exception does really mean that
it is in a blocked state? Is there any chance the error could be
raised incorrectly?


There have been one or two bugs in the past that could lead to this
exception being raised incorrectly, but I'm not aware of any right now.
  It's not inconceivable of course.


I have no reason to think it's broken. I found at least 3 separate
concurrency bugs in various parts (one added the day before, one over
a year old, one of which had been introduced while trying to work
around the MVar problem).

My suspicion for the root cause of the problem is that Concurrent.Chan
is incorrect. In the course of debugging this problem we found 2 bugs
in Chan, and while I never tracked down any other bugs in Chan, I no
longer trust it. By rewriting parts of the program, including avoiding
Chan, the bugs disappeared.I don't think I'll be using Chan again
until after someone has proven in correct.


Considering Chan is <150 lines of code and has been around for many 
years, that's amazing!  Did you report the bugs?  Is it anything to do 
with asynchronous exceptions?


You should have more luck with Control.Concurrent.STM.TChan, 
incedentally.  It's much easier to get right, and when we benchmarked 
it, performance was about the same (all those withMVar/modifyMVars in 
Chan are quite expensive), plus you get to compose it easily: reading 
from either of 2 TChans is trivial.


Cheers,
Simon
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: BlockedIndefinitelyOnMVar exception

2010-07-01 Thread Neil Mitchell
Hi Simon,

Thanks for the excellent information. I've now debugged my problem,
and think I've got the last of the MVar blocking problems out.

>> * How confident are people that this exception does really mean that
>> it is in a blocked state? Is there any chance the error could be
>> raised incorrectly?
>
> There have been one or two bugs in the past that could lead to this
> exception being raised incorrectly, but I'm not aware of any right now.
>  It's not inconceivable of course.

I have no reason to think it's broken. I found at least 3 separate
concurrency bugs in various parts (one added the day before, one over
a year old, one of which had been introduced while trying to work
around the MVar problem).

My suspicion for the root cause of the problem is that Concurrent.Chan
is incorrect. In the course of debugging this problem we found 2 bugs
in Chan, and while I never tracked down any other bugs in Chan, I no
longer trust it. By rewriting parts of the program, including avoiding
Chan, the bugs disappeared.I don't think I'll be using Chan again
until after someone has proven in correct.

>> * Any debugging tips for this problem?
>
> I'd use the event log: compile with -debug, run with +RTS -Ds -l, and dump
> the event log with show-ghc-events (cabal install ghc-events).  Or just dump
> it to stderr with +RTS -Ds, if the log isn't too large.  Use
> GHC.Exts.traceEvent to add your own events to the trace.

The event log is fantastic!

Thanks, Neil
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: BlockedIndefinitelyOnMVar exception

2010-06-29 Thread Simon Marlow

On 26/06/10 12:28, Neil Mitchell wrote:

I have a very big and highly threaded program that generates a
BlockedIndefinitelyOnMVar exception when run. I have spent a
reasonable amount of time pouring over the source code, as has Max
Bolingbroke. Neither of us have the slightest idea why it raises the
exception.

Some questions:

* Does anyone know the exact sequence of actions that causes this
exception to be thrown? I couldn't find it written down.


Sure - it means the garbage collector found that the thread was blocked 
on an MVar that is otherwise unreachable, and hence the thread could 
never be awoken.



* How confident are people that this exception does really mean that
it is in a blocked state? Is there any chance the error could be
raised incorrectly?


There have been one or two bugs in the past that could lead to this 
exception being raised incorrectly, but I'm not aware of any right now. 
 It's not inconceivable of course.



* Any debugging tips for this problem?


I'd use the event log: compile with -debug, run with +RTS -Ds -l, and 
dump the event log with show-ghc-events (cabal install ghc-events).  Or 
just dump it to stderr with +RTS -Ds, if the log isn't too large.  Use 
GHC.Exts.traceEvent to add your own events to the trace.


Cheers,
Simon
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: BlockedIndefinitelyOnMVar exception

2010-06-26 Thread Neil Mitchell
> My understanding was that this error occurred when one thread was blocked,
> waiting on an MVar, and no other thread in the program has a reference to
> that MVar (this can be detected during GC).  Ergo, the blocked thread will
> end up waiting forever because no-one can ever wake it up again.

That certainly seems a sensible rule - I'll see if that can help me
debug my problem.

> Do you actually have use of MVars in your program directly, or are they
> being used via a library?  And do you at least know which thread is
> throwing this exception?  It should be catchable so you can probably wrap
> the arguments to your forkIO calls with a catcher than indicates which
> thread blew up.

I use MVar's directly, use Chan/QSem, and have about 5 concurrency
data types built on top of MVar's - they're everywhere.

I also have a thread pool structure, so tasks move between threads
regularly - knowing which thread got blocked isn't very interesting.

Thanks for the information,

Neil
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: BlockedIndefinitelyOnMVar exception

2010-06-26 Thread nccb2
> Hi,
>
> I have a very big and highly threaded program that generates a
> BlockedIndefinitelyOnMVar exception when run. I have spent a
> reasonable amount of time pouring over the source code, as has Max
> Bolingbroke. Neither of us have the slightest idea why it raises the
> exception.
>
> Some questions:
>
> * Does anyone know the exact sequence of actions that causes this
> exception to be thrown? I couldn't find it written down.
> * How confident are people that this exception does really mean that
> it is in a blocked state? Is there any chance the error could be
> raised incorrectly?
> * Any debugging tips for this problem?

My understanding was that this error occurred when one thread was blocked,
waiting on an MVar, and no other thread in the program has a reference to
that MVar (this can be detected during GC).  Ergo, the blocked thread will
end up waiting forever because no-one can ever wake it up again.

Whenever I have had this error (or its STM equivalent) I think it was
always telling the truth.  I seem to remember it was often a symptom of
another thread terminating unexpectedly.  So if thread A is blocked on an
MVar that it is expecting thread B to write to, then thread B terminating
can cause this error to arise in thread A, even though the real problem is
in thread B.

Do you actually have use of MVars in your program directly, or are they
being used via a library?  And do you at least know which thread is
throwing this exception?  It should be catchable so you can probably wrap
the arguments to your forkIO calls with a catcher than indicates which
thread blew up.

Thanks,

Neil.



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


BlockedIndefinitelyOnMVar exception

2010-06-26 Thread Neil Mitchell
Hi,

I have a very big and highly threaded program that generates a
BlockedIndefinitelyOnMVar exception when run. I have spent a
reasonable amount of time pouring over the source code, as has Max
Bolingbroke. Neither of us have the slightest idea why it raises the
exception.

Some questions:

* Does anyone know the exact sequence of actions that causes this
exception to be thrown? I couldn't find it written down.
* How confident are people that this exception does really mean that
it is in a blocked state? Is there any chance the error could be
raised incorrectly?
* Any debugging tips for this problem?

Thanks, Neil
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users