Re: A bug of multicore IO manager

2013-09-05 Thread 山本和彦
Hi,

 If you need two Ctrl-Cs to kill the program, it probably means that it
 deadlocked somewhere, maybe in the RTS.  Kazu: if you can attach to
 the deadlocked process with gdb and get stack traces of all the
 threads, that might help.

To debug with GDB, I compiled Mighty with -debug. This changed the
behavior, and I got the following error:

mighty-20130905: internal error: ASSERTION FAILED: file rts/sm/MarkWeak.c, line 371

(GHC version 7.7.20130901 for i386_unknown_linux)
Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

Simon, can you tell what's going on?

--Kazu

___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: A bug of multicore IO manager

2013-09-05 Thread Akio Takano
I'm going to try to make a small test case today (probably after 08:00
UTC), but feel free to try it. If my guess is correct, reverting the patch
should fix the problem.
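
A small test case along these lines might look like the sketch below. It is
only an illustration, not the test case promised above, and it has not been
confirmed to trigger the assertion. It assumes that C finalizers attached
through Foreign.ForeignPtr end up in addCFinalizerToWeak#; the names
newBuffer and attachFinalizers are made up for this example.

```haskell
import Control.Monad (replicateM)
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, touchForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, mallocBytes)
import System.Mem (performGC)

-- Allocate a small C buffer and attach free() as a C finalizer.
-- Assumption: this path reaches addCFinalizerToWeak# in the RTS.
newBuffer :: IO (ForeignPtr Word8)
newBuffer = mallocBytes 16 >>= newForeignPtr finalizerFree

-- Attach n C finalizers, force a major GC while the buffers are
-- still alive, and report how many buffers were created.
attachFinalizers :: Int -> IO Int
attachFinalizers n = do
  bufs <- replicateM n newBuffer
  performGC
  mapM_ touchForeignPtr bufs  -- keep the buffers alive across the GC
  return (length bufs)

main :: IO ()
main = attachFinalizers 100000 >>= print
```

To exercise the RTS assertions, the program would presumably need to be
built with both -threaded and -debug, as Mighty was.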

On Fri, Sep 6, 2013 at 7:38 AM, Kazu Yamamoto k...@iij.ad.jp wrote:

 Hi Takano-san,

  It looks like after the commit, addCFinalizerToWeak# can call into the GC
  with the closure lock held. This means the info pointer points to
  stg_WHITEHOLE_info, breaking the asserted invariant. I haven't done any
  testing to confirm this, however.

 I can try. Should I revert this patch?

 --Kazu



Re: A bug of multicore IO manager

2013-09-04 Thread Simon Marlow

On 03/09/13 22:57, Johan Tibell wrote:

 Hi Kazu,

 On Tue, Sep 3, 2013 at 2:52 PM, Kazu Yamamoto k...@iij.ad.jp wrote:

 Hi,

 As I said before, I started running an HTTP server using Mio in the
 real world. Unfortunately, the daemon is not stable.

 After one day or so, the server cannot accept any HTTP requests.  No
 error messages from the server.

 The server is alive. To terminate the server (running in a screen
 terminal), a single Ctrl-c is not enough. Typing Ctrl-c again
 terminates the server.


 Could you run an strace on the process in this state so we can get an
 idea what it's doing?

 After several tests, I'm getting convinced that this occurs only when
 +RTS -Nx is specified (where x = 2). The server runs well if +RTS
 -Nx is not specified.


 That indicates that the problem is with the threaded RTS and perhaps
 with the IO manager.

 My question: if a program compiled with GHC needs a double Ctrl-c to
 terminate, what state is the program in?


 If Ctrl+C generates an exception (does it?) there could be an
 overzealous exception catcher somewhere that catches all exceptions,
 including your Ctrl+C.


The first Ctrl-C is sent as an Interrupted exception to the main thread.
The second Ctrl-C sends a SIGINT as usual, which tends to kill the
process.


If you need two Ctrl-Cs to kill the program, it probably means that it 
deadlocked somewhere, maybe in the RTS.  Kazu: if you can attach to the 
deadlocked process with gdb and get stack traces of all the threads, 
that might help.
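
The first-Ctrl-C behaviour, and Johan's point about a catch-all handler, can
be sketched in a few lines. This is an illustration only: it assumes the
exception delivered for the first Ctrl-C is UserInterrupt from
Control.Exception, and it simulates the signal with throwTo from a helper
thread rather than sending a real SIGINT. The names withSimulatedCtrlC,
carefulLoop, overzealousLoop and demo are made up for this example.

```haskell
import Control.Concurrent (forkIO, myThreadId, threadDelay, throwTo)
import Control.Exception
  (AsyncException (UserInterrupt), IOException, SomeException, catch, try)

-- Run an action while a helper thread throws UserInterrupt at the
-- calling thread after 0.1s, standing in for the first Ctrl-C.
withSimulatedCtrlC :: IO a -> IO a
withSimulatedCtrlC act = do
  me <- myThreadId
  _ <- forkIO (threadDelay 100000 >> throwTo me UserInterrupt)
  act

-- Handles only IOException, so UserInterrupt propagates to the caller.
carefulLoop :: IO String
carefulLoop =
  (threadDelay 2000000 >> return "finished")
    `catch` \e -> return ("io error: " ++ show (e :: IOException))

-- Handles SomeException, so the interrupt is swallowed as well.
overzealousLoop :: IO String
overzealousLoop =
  (threadDelay 2000000 >> return "finished")
    `catch` \e -> return ("swallowed: " ++ show (e :: SomeException))

-- Run both variants and report how each one ended.
demo :: IO (Either AsyncException String, Either AsyncException String)
demo = do
  a <- try (withSimulatedCtrlC carefulLoop)
  b <- try (withSimulatedCtrlC overzealousLoop)
  return (a, b)

main :: IO ()
main = demo >>= print
```

In the first case the interrupt escapes and would end the program; in the
second the handler absorbs it and the program carries on, so a second Ctrl-C
is needed. A deadlock in the RTS would show the same symptom from the
outside, which is why the gdb stack traces are still worth getting.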


Cheers,
Simon





Re: A bug of multicore IO manager

2013-09-03 Thread Andreas Voellmy
Kazu, thanks for noticing this! I will try to recreate it on my server as
well.

-Andi


On Tue, Sep 3, 2013 at 5:57 PM, Johan Tibell johan.tib...@gmail.com wrote:

 Hi Kazu,

 On Tue, Sep 3, 2013 at 2:52 PM, Kazu Yamamoto k...@iij.ad.jp wrote:

 Hi,

 As I said before, I started running an HTTP server using Mio in the real
 world. Unfortunately, the daemon is not stable.

 After one day or so, the server cannot accept any HTTP requests.  No
 error messages from the server.

 The server is alive. To terminate the server (running in a screen
 terminal), a single Ctrl-c is not enough. Typing Ctrl-c again terminates
 the server.


 Could you run an strace on the process in this state so we can get an idea
 what it's doing?


 After several tests, I'm getting convinced that this occurs only when
 +RTS -Nx is specified (where x = 2). The server runs well if +RTS
 -Nx is not specified.


 That indicates that the problem is with the threaded RTS and perhaps with
 the IO manager.


 My question: if a program compiled with GHC needs a double Ctrl-c to
 terminate, what state is the program in?


 If Ctrl+C generates an exception (does it?) there could be an overzealous
 exception catcher somewhere that catches all exceptions, including your
 Ctrl+C.



 P.S.

 It seems to me that the server also is leaking space. The server is
 getting fatter gradually.


 Could you use the profiler to see what type of objects are leaking?




Re: A bug of multicore IO manager

2013-09-03 Thread Andreas Voellmy
Hi Kazu,

What sort of workload was the Mighty server under during those 1 or 2 days
while you waited for it to become unresponsive? I.e., was this a production
web server? Or were you generating requests at some frequency, or leaving it
mostly idle?

-Andi







Re: A bug of multicore IO manager

2013-09-03 Thread 山本和彦
Hi Andi,

 What sort of workload was the Mighty server under during those 1 or 2 days
 while you waited for it to become unresponsive? I.e., was this a production
 web server? Or were you generating requests at some frequency, or leaving it
 mostly idle?

I ran Mighty on http://mew.org. This is my private domain, which hosts
my free programs and articles. It's not very busy, but not idle
either.

I did not generate requests from measurement tools.

--Kazu
