On 21/01/2015 03:43, Michael Jones wrote:
Simon,

The code below hangs on the frameEx function.

But, if I change it to:

        f  <- frameCreate objectNull idAny "linti-scope PMBus Scope Tool" rectZero (frameDefaultStyle .|. wxMAXIMIZE)

it will progress, but no frame pops up, except once in many tries. It still hangs, 
but only after progressing through all the setup code.

However, I reported earlier that a non-GUI version was also hanging, so I am 
not blaming wxHaskell. I am just noting that in this case it is where things go 
wrong.

Anyone,

Are there any wxHaskell experts around that might have some insight?

(Remember, works on single core 32 bit, works on quad core 64 bit,
fails on 2 core 64 bit. Using GHC 7.8.3. Any recent updates to the
code base to fix problems like this?)

No, there are no recently fixed or outstanding bugs in this area that I'm aware of.

From the symptoms I strongly suspect there's an unsafe foreign call somewhere causing problems, or another busy-wait loop.
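
For reference, a minimal sketch of the difference between the two kinds of foreign import. The names sleepUnsafe and sleepSafe are made up for illustration, and POSIX sleep() is only a stand-in for whatever long-running C call a binding might make; nothing here is taken from wxHaskell or the I2C code. An unsafe call cannot be preempted, and because a garbage collection needs every capability to stop, one long unsafe call can stall the other Haskell threads too. A safe call, in a program linked with -threaded, releases its capability for the duration of the call.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.Types (CUInt (..))

-- `unsafe`: cheapest calling convention, but the RTS cannot preempt the
-- call, so the capability making it is stuck until the C function returns.
-- Intended only for short, non-blocking C functions.
foreign import ccall unsafe "unistd.h sleep"
  sleepUnsafe :: CUInt -> IO CUInt

-- `safe`: slightly more overhead per call, but with the -threaded RTS the
-- capability is released, so other Haskell threads keep running while the
-- C function blocks.
foreign import ccall safe "unistd.h sleep"
  sleepSafe :: CUInt -> IO CUInt

main :: IO ()
main = do
  -- sleep returns the number of seconds left unslept.
  left <- sleepSafe 0
  print left
```

When auditing a binding, any blocking or long-running function imported as unsafe (or with no safety annotation checked) would be a candidate for the symptoms described above.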

Cheers,
Simon



-------- CODE SAMPLE --------

gui :: IO ()
gui
   = do
        values <- varCreate []                            -- Values to be painted
        timeLine <- varCreate 0                           -- Line time
        sample <- varCreate 0                             -- Sample Number
        running <- varCreate True                         -- True when telemetry is active

        -- <<HANG HERE>>

        f <- frameEx frameDefaultStyle [ text := "linti-scope PMBus Scope Tool"] objectNull

        -- Setup GUI components code was here

        return ()

go :: IO ()
go = do
     putStrLn "Start GUI"
     start $ gui

exeMain :: IO ()
exeMain = do
   hSetBuffering stdout NoBuffering
   getArgs >>= parse
   where
     parse ["-h"] = usage   >> exit
     parse ["-v"] = version >> exit
     parse []     = go
     parse [url, port, session, target] = goServer url port (read session) (read target)

     usage   = putStrLn "Usage: linti-scope [url, port, session, target]"
     version = putStrLn "Haskell linti-scope 0.1.0.0"
     exit    = System.Exit.exitWith System.Exit.ExitSuccess
     die     = System.Exit.exitWith (System.Exit.ExitFailure 1)

#ifndef MAIN_FUNCTION
#define MAIN_FUNCTION exeMain
#endif
main = MAIN_FUNCTION

On Jan 20, 2015, at 9:00 AM, Simon Marlow <marlo...@gmail.com> wrote:

My guess would be that either
- a thread is in a non-allocating loop
- a long-running foreign call is marked unsafe

Either of these would block the other threads.  ThreadScope together with some 
traceEventIO calls might help you identify the culprit.
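
A minimal sketch of what such instrumentation could look like. Debug.Trace.traceEventIO is part of base; the helper name traced and the labels are made up for illustration. With GHC 7.8 the program would be linked with -eventlog and run with +RTS -l so that the events end up in the .eventlog file that ThreadScope reads.

```haskell
module Main (main) where

import Control.Exception (bracket_)
import Debug.Trace (traceEventIO)

-- Wrap an IO action with a pair of user events, so the eventlog shows
-- when the action started and when it finished (or threw an exception).
traced :: String -> IO a -> IO a
traced label =
  bracket_ (traceEventIO ("begin " ++ label))
           (traceEventIO ("end " ++ label))

main :: IO ()
main = do
  -- Stand-ins for the real steps (frame creation, forking the IO threads).
  n <- traced "setup" (return (length [1 .. 1000 :: Int]))
  traced "report" (print n)
```

A "begin" event with no matching "end" in the log would point at the step that never returns.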

Cheers,
Simon

On 20/01/2015 15:49, Michael Jones wrote:
Simon,

This was fixed some time back. I combed the code base looking for other busy 
loops and there are no more. I commented out the code that runs the I2C + 
Machines + IO stuff, and only left the GUI code. It appears that just the 
wxhaskell part of the program fails to start. This matches a previous 
observation based on printing.

I’ll see if I can hack up the code to a minimal set that I can publish. All the 
IP is in the I2C code, so I might be able to get it down to one file.

Mike

On Jan 19, 2015, at 3:37 AM, Simon Marlow <marlo...@gmail.com> wrote:

Hi Michael,

Previously in this thread it was pointed out that your code was doing busy 
waiting, and so the problem can be fixed by modifying your code to not do busy 
waiting.  Did you do this?  The -C flag is just a workaround which will make 
the RTS reschedule more often, it won't fix the underlying problem.

The code you showed us was:

sendTransactions :: MonadIO m => SMBusDevice DeviceDC590 -> TVar Bool -> ProcessT m (Spec, String) ()
sendTransactions dev dts = repeatedly $ do
  dts' <- liftIO $ atomically $ readTVar dts
  when (dts' == True) (do
      (_, transactions) <- await
      liftIO $ sendOut dev transactions)

This loops when the contents of the TVar is False.
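
A minimal sketch of the non-busy-waiting alternative, using only the stm package. The names waitEnabled and nextWhenEnabled, and the TQueue standing in for the upstream of the machine, are made up for illustration. The idea is to let STM block: check calls retry when the flag is False, so the thread sleeps until another thread writes the TVar instead of spinning. In the code above this would roughly amount to replacing the readTVar and when step with a blocking transaction such as atomically (readTVar dts >>= check).

```haskell
module Main (main) where

import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM

-- Block until the flag holds True. `check False` retries the transaction,
-- which suspends the thread until the TVar is written again.
waitEnabled :: TVar Bool -> STM ()
waitEnabled flag = readTVar flag >>= check

-- Take the next item from the queue, but only while the flag is True.
nextWhenEnabled :: TVar Bool -> TQueue a -> STM a
nextWhenEnabled flag queue = waitEnabled flag >> readTQueue queue

main :: IO ()
main = do
  flag  <- newTVarIO False
  queue <- newTQueueIO
  done  <- newEmptyTMVarIO
  -- Consumer: sleeps (no CPU use) until the flag is True and an item exists.
  _ <- forkIO $ do
    item <- atomically (nextWhenEnabled flag queue)
    atomically (putTMVar done item)
  atomically (writeTQueue queue "transaction")
  threadDelay 100000                  -- consumer stays blocked meanwhile
  atomically (writeTVar flag True)    -- wakes the consumer
  atomically (takeTMVar done) >>= putStrLn
```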

Cheers,
Simon

On 18/01/2015 01:15, Michael Jones wrote:
I have narrowed down the problem a bit. It turns out that many times if
I run the program and wait long enough, it will start. Judging by the event
log, it sometimes takes from 1,000 to 10,000 entries before it does.

When I look at a good start vs. a slow start, I see that in both cases
things start up and there is some thread activity for threads 2 and 3,
then the application starts creating other threads, which is when the
wxhaskell GUI pops up and IO out of my /dev/i2c begins. In the slow case,
it just gets stuck on thread 2/3 activity for a very long time.

If I switch from -C0.001 to -C0.010, the startup is more reliable, in
that most starts result in an immediate GUI and i2c IO.

The behavior suggests to me that some initial threads are starving the
ability for other threads to start, and perhaps on a dual core machine
it is more of a problem than single or quad core machines. For certain,
due to some printing, I know that the main thread is starting, and that
a print just before the first fork is not printing. Code between them is
evaluating wxhaskell functions, but the main frame is not yet asked to
become visible. From last week, I know that a non-GUI version of the
app is getting stuck, but I do not know if it eventually runs like this
case.

Is there some convention that when I look at an event log you can tell
which threads are OS threads vs threads from fork?

Perhaps someone that knows the scheduler might have some advice. It
seems odd that a scheduler could behave this way. The scheduler should
have some built in notion of fairness.


On Jan 12, 2015, at 11:02 PM, Michael Jones <m...@proclivis.com> wrote:

Sorry I am reviving an old problem, but it has resurfaced, such that
one system behaves different than another.

Using -C0.001 solved problems on a Mac + VM + Ubuntu 14. It worked on
a single core 32 bit Atom NUC. But on a dual core Atom MinnowBoardMax,
something bad is going on. In summary, the same code that runs on two
machines does not run on a third machine. So this indicates I have not
made any breaking changes to the code or cabal files. Compiling with
GHC 7.8.3.

This bad system has Ubuntu 14 installed, with an updated Linux 3.18.1
kernel. It is a dual core 64 bit x86 Atom processor. The application
hangs at startup. If I remove the -C0.00N option and instead use -V0,
the application runs. It has bad timing properties, but it does at
least run. Note that a hang hangs an IO thread talking USB, and the
GUI thread.

When testing with the -C0.00N option, it did run 2 times out of 20
tries, so fail means fail most but not all of the time. When it did
run, it continued to run properly. This perhaps indicates some kind of
internal race condition.

In the fail to run case, it does some printing up to the point where
it tries to create a wxHaskell frame. In another non-UI version of the
program it also fails to run. Logging to a file gives a similar
indication. It is clear that the program starts up, then fails during
the run in some form of lockup, well after the initial startup code.

If I run with the strace command, it always runs with -C0.00N.

All the above was done with profiling enabled, so I removed that and
instead enabled eventlog to look for clues.

In this case it lies between good and bad, in that IO to my USB is
working, but the GUI comes up blank and never paints. Running this
case without -v0 (event log) the gui partially paints and stops, but
USB continues.

Questions:

1) Does ghc 7.8.4 have any improvements that might pertain to these
kinds of scheduling/thread problems?
2) Is there anything about the nature of a thread using USB, I2C, or
wxHaskell IO that leads to problems that a pure calculation app would
not have?
3) Any ideas how to track down the problem when changing conditions
(compiler or runtime options) affects behavior?
4) Are there other options besides -V and -C for the runtime that
might apply?
5) What does -V0 do that makes a problem program run?

Mike




On Oct 29, 2014, at 6:02 PM, Michael Jones <m...@proclivis.com> wrote:

John,

Adding -C0.005 makes it much better. Using -C0.001 makes it behave
more like -N4.

Thanks. This saves my project, as I need to deploy on a single core
Atom and was stuck.

Mike

On Oct 29, 2014, at 5:12 PM, John Lato <jwl...@gmail.com> wrote:

By any chance do the delays get shorter if you run your program with
`+RTS -C0.005` ?  If so, I suspect you're having a problem very
similar to one that we had with ghc-7.8 (7.6 too, but it's worse on
ghc-7.8 for some reason), involving possible misbehavior of the
thread scheduler.

On Wed, Oct 29, 2014 at 2:18 PM, Michael Jones <m...@proclivis.com> wrote:

    I have a general question about thread behavior in 7.8.3 vs 7.6.X

    I moved from 7.6 to 7.8 and my application behaves very
    differently. I have three threads, an application thread that
    plots data with wxhaskell or sends it over a network (depends on
    settings), a thread doing usb bulk writes, and a thread doing
    usb bulk reads. Data is moved around with TChan, and TVar is
    used for coordination.

    When the application was compiled with 7.6, my stream of usb
    traffic was smooth. With 7.8, there are lots of delays where
    nothing seems to be running. These delays are up to 40 ms,
    whereas with 7.6 delays were about 1 ms.

    When I add -N2 or -N4, the 7.8 program runs fine. But on 7.6 it
    runs fine without -N2/4.

    The program is compiled -O2 with profiling. The -N2/4 version
    uses more memory,  but in both cases with 7.8 and with 7.6 there
    is no space leak.

    I tried to compile and use -ls so I could take a look with
    threadscope, but the application hangs and writes no data to the
    file. The CPU fans run wild like it is in an infinite loop. It
    at least pops an unpainted wxhaskell window, so it got partially
    running.

    One of my libraries uses option -fsimpl-tick-factor=200 to get
    around the compiler.

    What do I need to know about changes to threading and event
    logging between 7.6 and 7.8? Is there some general documentation
    somewhere that might help?

    I am on Ubuntu 14.04 LTS. I downloaded the 7.8 tool chain tar
    ball and installed myself, after removing 7.6 with apt-get.

    Any hints appreciated.

    Mike


    _______________________________________________
    Glasgow-haskell-users mailing list
    Glasgow-haskell-users@haskell.org
    <mailto:Glasgow-haskell-users@haskell.org>
    http://www.haskell.org/mailman/listinfo/glasgow-haskell-users



