Re: A bug of multicore IO manager
Hi,

> If you need two Ctrl-Cs to kill the program, it probably means that it deadlocked somewhere, maybe in the RTS. Kazu: if you can attach to the deadlocked process with gdb and get stack traces of all the threads, that might help.

To debug with GDB, I compiled Mighty with -debug. This changes the behavior, and I got the following error:

  mighty-20130905: internal error: ASSERTION FAILED: file rts/sm/MarkWeak.c, line 371
      (GHC version 7.7.20130901 for i386_unknown_linux)
      Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug

Simon, can you tell what's going on?

--Kazu
___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs
Re: A bug of multicore IO manager
I'm going to try to make a small test case today (probably after 08:00 UTC), but feel free to try it. If my guess is correct, reverting the patch should fix the problem.

On Fri, Sep 6, 2013 at 7:38 AM, Kazu Yamamoto k...@iij.ad.jp wrote:
> Hi Takano-san,
>
>> It looks like after the commit, addCFinalizerToWeak# can call into the GC with the closure lock held. This means the info pointer points to stg_WHITEHOLE_info, breaking the asserted invariant. I haven't done any testing to confirm this, however.
>
> I can try. Should I revert this patch?
>
> --Kazu
Re: A bug of multicore IO manager
On 03/09/13 22:57, Johan Tibell wrote:
> Hi Kazu,
>
> On Tue, Sep 3, 2013 at 2:52 PM, Kazu Yamamoto k...@iij.ad.jp wrote:
>> Hi,
>>
>> As I said before, I started running HTTP server using Mio in the real world. Unfortunately, the daemon is not stable. After one day or so, the server cannot accept any HTTP requests. No error messages from the server. The server is alive. To terminate the server (running in a screen terminal), single Ctrl-c is not enough. Typing Ctrl-c again terminates the server.
>
> Could you run an strace on the process in this state so we can get an idea what it's doing?
>
>> After several tests, I'm getting convinced that this occurs only when +RTS -Nx is specified (where x = 2). The server runs well if +RTS -Nx is not specified.
>
> That indicates that the problem is with the threaded RTS and perhaps with the IO manager.
>
>> My question: if the program compiled with GHC needs double Ctrl-c to terminate, what is the situation of the program?
>
> If Ctrl+C generates an exception (does it?) there could be an overzealous exception catcher somewhere that catches all exceptions, including your Ctrl+C.

The first Ctrl-C is sent as an Interrupted exception to the main thread. The second Ctrl-C sends a SIGINT as usual, which tends to kill the process.

If you need two Ctrl-Cs to kill the program, it probably means that it deadlocked somewhere, maybe in the RTS. Kazu: if you can attach to the deadlocked process with gdb and get stack traces of all the threads, that might help.

Cheers,
Simon
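
To illustrate the two points above (the first Ctrl-C arriving as an exception on the main thread, and an overzealous handler swallowing it), here is a minimal Haskell sketch. It assumes the exception in question is the UserInterrupt constructor of AsyncException from Control.Exception, and it simulates Ctrl-C by throwing that exception directly instead of sending a real signal. The names interrupted, swallowAll and catchSyncOnly are hypothetical and are not taken from Mighty or the RTS.

```haskell
import Control.Exception

-- Hypothetical stand-in for an action that is interrupted by Ctrl-C.
-- Assumption: we simulate the interrupt by throwing UserInterrupt
-- directly instead of delivering a real SIGINT.
interrupted :: IO ()
interrupted = throwIO UserInterrupt

-- Overzealous handler: catches every exception, UserInterrupt included,
-- so the interrupt never reaches the caller.
swallowAll :: IO () -> IO String
swallowAll act = (act >> return "finished") `catch` \e ->
  return ("swallowed: " ++ show (e :: SomeException))

-- More careful handler: rethrows asynchronous exceptions such as
-- UserInterrupt and only handles everything else.
catchSyncOnly :: IO () -> IO String
catchSyncOnly act = (act >> return "finished") `catch` \e ->
  case fromException e of
    Just ae -> throwIO (ae :: AsyncException)
    Nothing -> return ("handled: " ++ show (e :: SomeException))

main :: IO ()
main = do
  -- The catch-all handler hides the interrupt from the caller.
  putStrLn =<< swallowAll interrupted
  -- The careful handler lets the interrupt propagate.
  r <- try (catchSyncOnly interrupted)
  case r of
    Left UserInterrupt -> putStrLn "UserInterrupt propagated"
    Left e             -> putStrLn ("other async exception: " ++ show e)
    Right s            -> putStrLn s
```

If a server wraps its main loop in something like swallowAll, a single Ctrl-C could be absorbed in this way. That is only one possible explanation; it does not rule out the deadlock discussed above.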
Re: A bug of multicore IO manager
Kazu, thanks for noticing this! I will try to recreate it on my server as well.

-Andi

On Tue, Sep 3, 2013 at 5:57 PM, Johan Tibell johan.tib...@gmail.com wrote:
> Hi Kazu,
>
> On Tue, Sep 3, 2013 at 2:52 PM, Kazu Yamamoto k...@iij.ad.jp wrote:
>> Hi,
>>
>> As I said before, I started running HTTP server using Mio in the real world. Unfortunately, the daemon is not stable. After one day or so, the server cannot accept any HTTP requests. No error messages from the server. The server is alive. To terminate the server (running in a screen terminal), single Ctrl-c is not enough. Typing Ctrl-c again terminates the server.
>
> Could you run an strace on the process in this state so we can get an idea what it's doing?
>
>> After several tests, I'm getting convinced that this occurs only when +RTS -Nx is specified (where x = 2). The server runs well if +RTS -Nx is not specified.
>
> That indicates that the problem is with the threaded RTS and perhaps with the IO manager.
>
>> My question: if the program compiled with GHC needs double Ctrl-c to terminate, what is the situation of the program?
>
> If Ctrl+C generates an exception (does it?) there could be an overzealous exception catcher somewhere that catches all exceptions, including your Ctrl+C.
>
>> P.S. It seems to me that the server also is leaking space. The server is getting fatter gradually.
>
> Could you use the profiler to see what type of objects are leaking?
Re: A bug of multicore IO manager
Hi Kazu,

What sort of workload was the Mighty server under during those 1 or 2 days while you waited for it to become unresponsive? I.e., was this a production web server? Or were you generating requests at some frequency or leaving it mostly idle?

-Andi

On Tue, Sep 3, 2013 at 6:29 PM, Andreas Voellmy andreas.voel...@gmail.com wrote:
> Kazu, thanks for noticing this! I will try to recreate it on my server as well.
>
> -Andi
Re: A bug of multicore IO manager
Hi Andi,

> What sort of workload was the mighty server under during those 1 or 2 days while you waited for it to become unresponsive. I.e. was this a production web server? Or were you generating requests at some frequency or leaving it mostly idle?

I ran Mighty on http://mew.org. This is my private domain, which hosts my free programs and articles. It's not very busy, but not idle either. I did not generate requests with measurement tools.

--Kazu