Re: Increased memory usage with GHC 7.10.1
I have run out of memory before when compiling on small machines using GHC 7.8. By small I mean 4GB RAM and no swap, say a small dual-core Atom, an almost embedded design. That forced me to compile on a laptop and mount file systems to run it. But since Ubuntu runs well on a NUC, it is nice to be able to run the compiler on it, even if it is a bit slow.

On Apr 2, 2015, at 9:08 AM, George Colpitts george.colpi...@gmail.com wrote:

I'm curious why the amount of RAM is relevant, as all of our OSes have virtual memory, so only the size of the heap and the amount of swap should be relevant for an Out Of Memory error, right? How big is your heap? The amount of RAM should only affect speed (i.e. if there is excessive paging) but should not affect Out Of Memory, right?

On Thu, Apr 2, 2015 at 9:47 AM, Jan Stolarek jan.stola...@p.lodz.pl wrote:

I will. But I was curious whether this is only me or whether anyone else is seeing similar behaviour. And what about a performance comparison between 7.8.4 and 7.10.1? Do we have any numbers?

Janek

On Thursday, 2 April 2015, Richard Eisenberg wrote:

Post a bug report! :)

On Apr 2, 2015, at 8:19 AM, Jan Stolarek jan.stola...@p.lodz.pl wrote:

An update from my second machine, this time with 4GB of RAM. Compiling Agda ran out of memory (again in the Agda.TypeChecking.Serialise module) and I had to kill the build. But once I restarted the build, the module compiled successfully in a matter of minutes, using around 50% of memory. This looks like some kind of memory leak in GHC.

Janek

On Wednesday, 1 April 2015, Jan Stolarek wrote:

Forall hi,

I just upgraded both of my machines to GHC 7.10.1. I keep sandboxed installations of GHC, and this means I had to rebuild Agda and Idris because the binaries built with GHC 7.8.4 were stored inside the deactivated 7.8.4 sandbox. Sadly, I had problems building both Agda and Idris due to GHC taking up all available memory. With Idris the problematic module was Idris.ElabTerm (~2900 LOC).
The interesting part of the story is that when I do a clean build of Idris, GHC consumes all memory when compiling that module and I have to kill the build. But when I restart the build after killing GHC, the module is compiled using a reasonable amount of memory and within reasonable time. With Agda the problematic module is Agda.TypeChecking.Serialise (~2000 LOC). The trick of killing the build and restarting it didn't work in this case. I had to compile Agda with GHC 7.8.4 (which works without problems, though the mentioned module still requires a lot of memory) and alter my setup so that the Agda binary is not stored inside the GHC sandbox.

I wonder if any of you came across similar issues with GHC 7.10.1? Do we have any performance data that lets us compare the memory usage and performance of GHC 7.10.1 with previous stable releases?

All of the above happened on 64-bit Debian Wheezy with 2GB of RAM.

Janek

---
Politechnika Łódzka
Lodz University of Technology

This email contains information intended solely for the use of the individual to whom it is addressed. If you are not the intended recipient or if you have received this message in error, please notify the sender and delete it from your system.

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users
Re: Thread behavior in 7.8.3
Bas,

I have not upgraded, mainly because my problems manifest without enabling USB. However, I think I can upgrade in a few days and move forward. Are you using ghc 7.8.10 these days or something older?

Mike

On Jan 21, 2015, at 12:52 PM, Bas van Dijk v.dijk@gmail.com wrote:

Hi Michael,

Are you already using usb-1.3.0.0? If not, could you upgrade and test again? That release fixed the deadlock that Ben and Carter were talking about.

Good luck,
Bas

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Thread behavior in 7.8.3
Bas,

I checked my cabal file and I was already using 1.3.0.0.

Mike

On Jan 21, 2015, at 12:52 PM, Bas van Dijk v.dijk@gmail.com wrote:

Hi Michael,

Are you already using usb-1.3.0.0? If not, could you upgrade and test again? That release fixed the deadlock that Ben and Carter were talking about.

Good luck,
Bas
Re: Thread behavior in 7.8.3
Simon,

I went back and retested my non-GUI version and it seems to work fine. But here is what is strange: the non-GUI version is really just a client server version of what I have problems with. I have a non-GUI app running the USB and streaming data to a server. The client app (the one that has the lockup) works fine when in the client server mode. In this mode, it executes the very same code I listed below that locked up. The main difference is where the code below says “Setup GUI components code was here”. The client server version just connects to the server rather than starting up the USB IO.

Strange that the behavior is so sensitive. Are there any plans to make the scheduling more pre-emptive so that rogue threads can’t derail an application? This seems to open up lots of difficulties when you are reusing lots of libraries you are not familiar with to build a large application. The more libraries you use, the more unknown risk you are taking that your project is killed because you can’t meet a deadline.

I think I’ll let 7.10 settle down with one maintenance release and then give it a try just to see if it is any different. If that fails, I’ll scratch my head some more. What I don’t want to do is dig into wxHaskell’s FFI. I have a Python GUI started and I can just use that. The motivation for that was the inability to get wxHaskell to work on all three platforms (Windows, Linux, Mac), and getting Python GUIs to work on all three was not too hard. Granted, I would prefer Haskell, but it is an enormous task to make a GUI work on all platforms. Unlike non-GUI libraries, it does not “just work”, at least it didn’t for me.

Mike

On Jan 21, 2015, at 3:18 AM, Simon Marlow marlo...@gmail.com wrote:

On 21/01/2015 03:43, Michael Jones wrote:

Simon,

The code below hangs on the frameEx function. But, if I change it to:

    f <- frameCreate objectNull idAny "linti-scope PMBus Scope Tool" rectZero (frameDefaultStyle .|. wxMAXIMIZE)

it will progress, but no frame pops up, except once in many tries. It still hangs, but progresses through all the setup code. However, I did make past statements that a non-GUI version was hanging. So I am not blaming wxHaskell. Just noting that in this case it is where things go wrong.

Anyone,

Are there any wxHaskell experts around that might have some insight? (Remember, works on single core 32 bit, works on quad core 64 bit, fails on 2 core 64 bit. Using GHC 7.8.3. Any recent updates to the code base to fix problems like this?)

No, there are no recently fixed or outstanding bugs in this area that I'm aware of. From the symptoms I strongly suspect there's an unsafe foreign call somewhere causing problems, or another busy-wait loop.

Cheers,
Simon

CODE SAMPLE

    gui :: IO ()
    gui = do
        values   <- varCreate []    -- Values to be painted
        timeLine <- varCreate 0     -- Line time
        sample   <- varCreate 0     -- Sample Number
        running  <- varCreate True  -- True when telemetry is active

        -- HANG HERE
        f <- frameEx frameDefaultStyle [ text := "linti-scope PMBus Scope Tool" ] objectNull

        -- Setup GUI components code was here

        return ()

    go :: IO ()
    go = do
        putStrLn "Start GUI"
        start $ gui

    exeMain :: IO ()
    exeMain = do
        hSetBuffering stdout NoBuffering
        getArgs >>= parse
      where
        parse ["-h"] = usage >> exit
        parse ["-v"] = version >> exit
        parse []     = go
        parse [url, port, session, target] = goServer url port (read session) (read target)

        usage   = putStrLn "Usage: linti-scope [url, port, session, target]"
        version = putStrLn "Haskell linti-scope 0.1.0.0"
        exit    = System.Exit.exitWith System.Exit.ExitSuccess
        die     = System.Exit.exitWith (System.Exit.ExitFailure 1)

    #ifndef MAIN_FUNCTION
    #define MAIN_FUNCTION exeMain
    #endif
    main = MAIN_FUNCTION

On Jan 20, 2015, at 9:00 AM, Simon Marlow marlo...@gmail.com wrote:

My guess would be that either
 - a thread is in a non-allocating loop
 - a long-running foreign call is marked unsafe

Either of these would block the other threads.
ThreadScope together with some traceEventIO calls might help you identify the culprit. Cheers, Simon On 20/01/2015 15:49, Michael Jones wrote: Simon, This was fixed some time back. I combed the code base looking for other busy loops and there are no more. I commented out the code that runs the I2C + Machines + IO stuff, and only left the GUI code. It appears that just the wxhaskell part of the program fails to start. This matches a previous observation based on printing. I’ll see if I can hack up the code to a minimal set that I can publish. All the IP is in the I2C code, so I might be able to get it down to one file. Mike On Jan 19, 2015, at 3:37 AM, Simon Marlow marlo...@gmail.com wrote: Hi Michael
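Simon's second suspect, a long-running foreign call marked unsafe, comes down to a one-word difference in the import declaration. The following is only a minimal sketch and is not taken from Michael's code or from wxHaskell: the binding names `sleepSafe`, `sleepUnsafe` and the helper `napSafe` are made up for illustration, and POSIX `sleep` from `unistd.h` stands in for whatever blocking C function a real binding would call.

```haskell
import Foreign.C.Types (CUInt (..))

-- 'safe' import: with the threaded RTS (-threaded) the calling thread
-- releases its capability for the duration of the C call, so other
-- Haskell threads can keep running while C blocks.
foreign import ccall safe "unistd.h sleep"
  sleepSafe :: CUInt -> IO CUInt

-- 'unsafe' import: cheaper to call, but the capability is held until the
-- C function returns. No other Haskell thread can run on that capability,
-- and a garbage collection has to wait for the call to finish.
foreign import ccall unsafe "unistd.h sleep"
  sleepUnsafe :: CUInt -> IO CUInt

-- Hypothetical helper: sleep through the safe binding and return the
-- number of seconds left unslept, as reported by sleep(3).
napSafe :: Int -> IO Int
napSafe secs = fmap fromIntegral (sleepSafe (fromIntegral secs))
```

When auditing a binding for this problem, the thing to look for is an `unsafe` import on any C function that can block or run for a long time; whether such an import exists in the libraries used in this thread is not established here.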
Re: Thread behavior in 7.8.3
Simon,

This was fixed some time back. I combed the code base looking for other busy loops and there are no more. I commented out the code that runs the I2C + Machines + IO stuff, and only left the GUI code. It appears that just the wxhaskell part of the program fails to start. This matches a previous observation based on printing.

I’ll see if I can hack up the code to a minimal set that I can publish. All the IP is in the I2C code, so I might be able to get it down to one file.

Mike

On Jan 19, 2015, at 3:37 AM, Simon Marlow marlo...@gmail.com wrote:

Hi Michael,

Previously in this thread it was pointed out that your code was doing busy waiting, and so the problem can be fixed by modifying your code to not do busy waiting. Did you do this? The -C flag is just a workaround which will make the RTS reschedule more often; it won't fix the underlying problem.

The code you showed us was:

    sendTransactions :: MonadIO m => SMBusDevice DeviceDC590 -> TVar Bool -> ProcessT m (Spec, String) ()
    sendTransactions dev dts = repeatedly $ do
        dts' <- liftIO $ atomically $ readTVar dts
        when (dts' == True) (do
            (_, transactions) <- await
            liftIO $ sendOut dev transactions)

This loops when the contents of the TVar is False.

Cheers,
Simon

On 18/01/2015 01:15, Michael Jones wrote:

I have narrowed down the problem a bit. It turns out that many times if I run the program and wait long enough, it will start. Given an event log, it may take from 1000-1 entries sometimes. When I look at a good start vs. slow start, I see that in both cases things start up and there is some thread activity for thread 2 and 3, then the application starts creating other threads, which is when the wxhaskell GUI pops up and IO out my /dev/i2c begins. In the slow case, it just gets stuck on thread 2/3 activity for a very long time. If I switch from -C0.001 to -C0.010, the startup is more reliable, in that most starts result in an immediate GUI and i2c IO.
The behavior suggests to me that some initial threads are starving the ability for other threads to start, and perhaps on a dual core machine it is more of a problem than single or quad core machines. For certain, due to some printing, I know that the main thread is starting, and that a print just before the first fork is not printing. Code between them is evaluating wxhaskell functions, but the main frame is not yet asked to become visible. From last week, I know that an non-gui version of the app is getting stuck, but I do not know if it eventually runs like this case. Is there some convention that when I look at an event log you can tell which threads are OS threads vs threads from fork? Perhaps someone that knows the scheduler might have some advice. It seems odd that a scheduler could behave this way. The scheduler should have some built in notion of fairness. On Jan 12, 2015, at 11:02 PM, Michael Jones m...@proclivis.com mailto:m...@proclivis.com wrote: Sorry I am reviving an old problem, but it has resurfaced, such that one system behaves different than another. Using -C0.001 solved problems on a Mac + VM + Ubuntu 14. It worked on a single core 32 bit Atom NUC. But on a dual core Atom MinnowBoardMax, something bad is going on. In summary, the same code that runs on two machines does not run on a third machine. So this indicates I have not made any breaking changes to the code or cabal files. Compiling with GHC 7.8.3. This bad system has Ubuntu 14 installed, with an updated Linux 3.18.1 kernel. It is a dual core 64 bit I86 Atom processor. The application hangs at startup. If I remove the -C0.00N option and instead use -V0, the application runs. It has bad timing properties, but it does at least run. Note that a hang hangs an IO thread talking USB, and the GUI thread. When testing with the -C0.00N option, it did run 2 times out of 20 tries, so fail means fail most but not all of the time. When it did run, it continued to run properly. 
This perhaps indicates some kind of internal race condition. In the fail to run case, it does some printing up to the point where it tries to create a wxHaskell frame. In another non-UI version of the program it also fails to run. Logging to a file gives a similar indication. It is clear that the program starts up, then fails during the run in some form of lockup, well after the initial startup code. If I run with the strace command, it always runs with -C0.00N. All the above was done with profiling enabled, so I removed that and instead enabled eventlog to look for clues. In this case it lies between good and bad, in that IO to my USB is working, but the GUI comes up blank and never paints. Running this case without -v0 (event log) the gui partially paints and stops, but USB continues. Questions: 1) Does ghc 7.8.4 have any improvements that might pertain to these kinds of scheduling/thread problems? 2) Is there anything about
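Simon's point in the message above is that `sendTransactions` spins whenever the `TVar` holds False. A common way to avoid that kind of polling with STM is to block inside the transaction until the flag changes. This is a minimal sketch under stated assumptions, not the poster's code: the names `waitUntilEnabled` and `demo` are hypothetical, and it leaves out the machines pipeline and `sendOut` entirely.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM (TVar, atomically, check, newTVarIO, readTVar, writeTVar)

-- Block, without spinning, until the flag is True.
-- 'check False' calls 'retry', which suspends the thread until a TVar the
-- transaction has read (here: the flag) is written by another transaction.
waitUntilEnabled :: TVar Bool -> IO ()
waitUntilEnabled flag = atomically (readTVar flag >>= check)

-- Hypothetical demo: a worker waits for the flag, then reports back.
demo :: IO String
demo = do
  flag <- newTVarIO False
  done <- newEmptyMVar
  _ <- forkIO $ do
    waitUntilEnabled flag            -- suspended here while the flag is False
    putMVar done "worker ran after flag was set"
  threadDelay 100000                 -- 0.1 s during which the worker is blocked
  atomically (writeTVar flag True)   -- this write wakes the worker
  takeMVar done
```

A thread blocked this way is not runnable at all while the flag is False, so it cannot starve other threads regardless of the `-C` setting.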
Re: Thread behavior in 7.8.3
Simon,

The code below hangs on the frameEx function. But, if I change it to:

    f <- frameCreate objectNull idAny "linti-scope PMBus Scope Tool" rectZero (frameDefaultStyle .|. wxMAXIMIZE)

it will progress, but no frame pops up, except once in many tries. It still hangs, but progresses through all the setup code. However, I did make past statements that a non-GUI version was hanging. So I am not blaming wxHaskell. Just noting that in this case it is where things go wrong.

Anyone,

Are there any wxHaskell experts around that might have some insight? (Remember, works on single core 32 bit, works on quad core 64 bit, fails on 2 core 64 bit. Using GHC 7.8.3. Any recent updates to the code base to fix problems like this?)

CODE SAMPLE

    gui :: IO ()
    gui = do
        values   <- varCreate []    -- Values to be painted
        timeLine <- varCreate 0     -- Line time
        sample   <- varCreate 0     -- Sample Number
        running  <- varCreate True  -- True when telemetry is active

        -- HANG HERE
        f <- frameEx frameDefaultStyle [ text := "linti-scope PMBus Scope Tool" ] objectNull

        -- Setup GUI components code was here

        return ()

    go :: IO ()
    go = do
        putStrLn "Start GUI"
        start $ gui

    exeMain :: IO ()
    exeMain = do
        hSetBuffering stdout NoBuffering
        getArgs >>= parse
      where
        parse ["-h"] = usage >> exit
        parse ["-v"] = version >> exit
        parse []     = go
        parse [url, port, session, target] = goServer url port (read session) (read target)

        usage   = putStrLn "Usage: linti-scope [url, port, session, target]"
        version = putStrLn "Haskell linti-scope 0.1.0.0"
        exit    = System.Exit.exitWith System.Exit.ExitSuccess
        die     = System.Exit.exitWith (System.Exit.ExitFailure 1)

    #ifndef MAIN_FUNCTION
    #define MAIN_FUNCTION exeMain
    #endif
    main = MAIN_FUNCTION

On Jan 20, 2015, at 9:00 AM, Simon Marlow marlo...@gmail.com wrote:

My guess would be that either
 - a thread is in a non-allocating loop
 - a long-running foreign call is marked unsafe

Either of these would block the other threads. ThreadScope together with some traceEventIO calls might help you identify the culprit.
Cheers,
Simon

On 20/01/2015 15:49, Michael Jones wrote:

Simon,

This was fixed some time back. I combed the code base looking for other busy loops and there are no more. I commented out the code that runs the I2C + Machines + IO stuff, and only left the GUI code. It appears that just the wxhaskell part of the program fails to start. This matches a previous observation based on printing.

I’ll see if I can hack up the code to a minimal set that I can publish. All the IP is in the I2C code, so I might be able to get it down to one file.

Mike

On Jan 19, 2015, at 3:37 AM, Simon Marlow marlo...@gmail.com wrote:

Hi Michael,

Previously in this thread it was pointed out that your code was doing busy waiting, and so the problem can be fixed by modifying your code to not do busy waiting. Did you do this? The -C flag is just a workaround which will make the RTS reschedule more often; it won't fix the underlying problem.

The code you showed us was:

    sendTransactions :: MonadIO m => SMBusDevice DeviceDC590 -> TVar Bool -> ProcessT m (Spec, String) ()
    sendTransactions dev dts = repeatedly $ do
        dts' <- liftIO $ atomically $ readTVar dts
        when (dts' == True) (do
            (_, transactions) <- await
            liftIO $ sendOut dev transactions)

This loops when the contents of the TVar is False.

Cheers,
Simon

On 18/01/2015 01:15, Michael Jones wrote:

I have narrowed down the problem a bit. It turns out that many times if I run the program and wait long enough, it will start. Given an event log, it may take from 1000-1 entries sometimes. When I look at a good start vs. slow start, I see that in both cases things start up and there is some thread activity for thread 2 and 3, then the application starts creating other threads, which is when the wxhaskell GUI pops up and IO out my /dev/i2c begins. In the slow case, it just gets stuck on thread 2/3 activity for a very long time. If I switch from -C0.001 to -C0.010, the startup is more reliable, in that most starts result in an immediate GUI and i2c IO.
The behavior suggests to me that some initial threads are starving the ability for other threads to start, and perhaps on a dual core machine it is more of a problem than single or quad core machines. For certain, due to some printing, I know that the main thread is starting, and that a print just before the first fork is not printing. Code between them is evaluating wxhaskell functions, but the main frame is not yet asked to become visible. From last week, I know that an non-gui version of the app is getting stuck, but I do not know if it eventually runs like this case. Is there some convention that when I look at an event log you can tell which
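Simon's suggestion of combining ThreadScope with `traceEventIO` can be sketched as follows. `traceEventIO` is the real function from `Debug.Trace` in base; the wrapper `traced`, the `demo` action and the "START"/"STOP" label text are assumptions made for illustration. The events only reach the eventlog when the program is linked with `-eventlog` and run with `+RTS -l`; otherwise the calls do nothing visible.

```haskell
import Control.Exception (bracket_)
import Debug.Trace (traceEventIO)

-- Wrap an action in a pair of user events so that the stretch of time it
-- occupies can be located in the eventlog. The stop event is emitted even
-- if the action throws.
traced :: String -> IO a -> IO a
traced label =
  bracket_ (traceEventIO ("START " ++ label))
           (traceEventIO ("STOP " ++ label))

-- Hypothetical demo: trace a small computation and return its result.
demo :: IO Int
demo = traced "sum" (return $! sum [1 .. 100 :: Int])
```

Wrapping the steps around the point where startup stalls (for example the frame creation and the first fork) would show which step was entered but never left, and which thread was running at the time.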
Re: Thread behavior in 7.8.3
Donn,

True, but in that case I was using a driver for the Aardvark, and my current two test cases use:

A) DC1613A from LTC
B) /dev/i2c driver with an FFI wrapper I wrote

Case A uses the haskell usb package and libusb. I suppose SIGVTALRM could be used in the libusb driver. I know for sure it is not used by my I2C stuff, unless it is behind the /dev/i2c user mode calls.

But interesting. Obviously the scheduler is using timers from the OS. Is it really an advantage not to use OS threads all around? Is there any way to enable such behavior to see if things are better?

Mike

On Jan 17, 2015, at 11:00 PM, Donn Cave d...@avvanta.com wrote:

Quoth Michael Jones m...@proclivis.com, ...

5) What does -V0 do that makes a problem program run?

Well, there's that SIGVTALRM barrage, you may remember we went over that mid-August. I expect there are other effects.

Donn
Re: Thread behavior in 7.8.3
I have narrowed down the problem a bit. It turns out that many times if I run the program and wait long enough, it will start. Given an event log, it may take from 1000-1 entries sometimes. When I look at a good start vs. slow start, I see that in both cases things startup and there is some thread activity for thread 2 and 3, then the application starts creating other threads, which is when the wxhaskell GUI pops up and IO out my /dev/i2c begins. In the slow case, it just gets stuck on thread 2/3 activity for a very long time. If I switch from -C0.001 to -C0.010, the startup is more reliable, in that most starts result in an immediate GUI and i2c IO. The behavior suggests to me that some initial threads are starving the ability for other threads to start, and perhaps on a dual core machine it is more of a problem than single or quad core machines. For certain, due to some printing, I know that the main thread is starting, and that a print just before the first fork is not printing. Code between them is evaluating wxhaskell functions, but the main frame is not yet asked to become visible. From last week, I know that an non-gui version of the app is getting stuck, but I do not know if it eventually runs like this case. Is there some convention that when I look at an event log you can tell which threads are OS threads vs threads from fork? Perhaps someone that knows the scheduler might have some advice. It seems odd that a scheduler could behave this way. The scheduler should have some built in notion of fairness. On Jan 12, 2015, at 11:02 PM, Michael Jones m...@proclivis.com wrote: Sorry I am reviving an old problem, but it has resurfaced, such that one system behaves different than another. Using -C0.001 solved problems on a Mac + VM + Ubuntu 14. It worked on a single core 32 bit Atom NUC. But on a dual core Atom MinnowBoardMax, something bad is going on. In summary, the same code that runs on two machines does not run on a third machine. 
So this indicates I have not made any breaking changes to the code or cabal files. Compiling with GHC 7.8.3. This bad system has Ubuntu 14 installed, with an updated Linux 3.18.1 kernel. It is a dual core 64 bit I86 Atom processor. The application hangs at startup. If I remove the -C0.00N option and instead use -V0, the application runs. It has bad timing properties, but it does at least run. Note that a hang hangs an IO thread talking USB, and the GUI thread. When testing with the -C0.00N option, it did run 2 times out of 20 tries, so fail means fail most but not all of the time. When it did run, it continued to run properly. This perhaps indicates some kind of internal race condition. In the fail to run case, it does some printing up to the point where it tries to create a wxHaskell frame. In another non-UI version of the program it also fails to run. Logging to a file gives a similar indication. It is clear that the program starts up, then fails during the run in some form of lockup, well after the initial startup code. If I run with the strace command, it always runs with -C0.00N. All the above was done with profiling enabled, so I removed that and instead enabled eventlog to look for clues. In this case it lies between good and bad, in that IO to my USB is working, but the GUI comes up blank and never paints. Running this case without -v0 (event log) the gui partially paints and stops, but USB continues. Questions: 1) Does ghc 7.8.4 have any improvements that might pertain to these kinds of scheduling/thread problems? 2) Is there anything about the nature of a thread using USB, I2C, or wxHaskell IO that leads to problems that a pure calculation app would not have? 3) Any ideas how to track down the problem when changing conditions (compiler or runtime options) affects behavior? 4) Are there other options besides -V and -C for the runtime that might apply? 5) What does -V0 do that makes a problem program run? 
Mike On Oct 29, 2014, at 6:02 PM, Michael Jones m...@proclivis.com wrote: John, Adding -C0.005 makes it much better. Using -C0.001 makes it behave more like -N4. Thanks. This saves my project, as I need to deploy on a single core Atom and was stuck. Mike On Oct 29, 2014, at 5:12 PM, John Lato jwl...@gmail.com wrote: By any chance do the delays get shorter if you run your program with `+RTS -C0.005` ? If so, I suspect you're having a problem very similar to one that we had with ghc-7.8 (7.6 too, but it's worse on ghc-7.8 for some reason), involving possible misbehavior of the thread scheduler. On Wed, Oct 29, 2014 at 2:18 PM, Michael Jones m...@proclivis.com wrote: I have a general question about thread behavior in 7.8.3 vs 7.6.X I moved from 7.6 to 7.8 and my application behaves very differently. I have three threads, an application thread that plots data with wxhaskell or sends it over a network (depends
Re: Thread behavior in 7.8.3
Ben,

Interesting. In this case, I can duplicate the problem when not using USB (USB to i2c dongle) by using /dev/i2c_n, and when I do use USB, in some cases the USB is working (I can see i2c on the scope) but the GUI is hung. So I believe this is not causing the problem.

Thanks,
Mike

On Jan 13, 2015, at 1:02 PM, Ben Gamari bgamari.f...@gmail.com wrote:

Michael Jones m...@proclivis.com writes:

Sorry I am reviving an old problem, but it has resurfaced, such that one system behaves different than another. [snip]

1) Does ghc 7.8.4 have any improvements that might pertain to these kinds of scheduling/thread problems?
2) Is there anything about the nature of a thread using USB, I2C, or wxHaskell IO that leads to problems that a pure calculation app would not have?

Do you know about [1]? This is a regression due to an interface change that arose from the new event manager. `usb` 1.3.0.0 has a workaround; GHC 7.10 will have a fixed event manager [2]. Given that it sounds like your program works some of the time, this may not be relevant, but I thought it would be negligent not to mention it.

Cheers,
- Ben

[1] https://github.com/basvandijk/usb/issues/7
[2] https://phabricator.haskell.org/D347
Re: Thread behavior in 7.8.3
Sorry I am reviving an old problem, but it has resurfaced, such that one system behaves different than another. Using -C0.001 solved problems on a Mac + VM + Ubuntu 14. It worked on a single core 32 bit Atom NUC. But on a dual core Atom MinnowBoardMax, something bad is going on. In summary, the same code that runs on two machines does not run on a third machine. So this indicates I have not made any breaking changes to the code or cabal files. Compiling with GHC 7.8.3. This bad system has Ubuntu 14 installed, with an updated Linux 3.18.1 kernel. It is a dual core 64 bit I86 Atom processor. The application hangs at startup. If I remove the -C0.00N option and instead use -V0, the application runs. It has bad timing properties, but it does at least run. Note that a hang hangs an IO thread talking USB, and the GUI thread. When testing with the -C0.00N option, it did run 2 times out of 20 tries, so fail means fail most but not all of the time. When it did run, it continued to run properly. This perhaps indicates some kind of internal race condition. In the fail to run case, it does some printing up to the point where it tries to create a wxHaskell frame. In another non-UI version of the program it also fails to run. Logging to a file gives a similar indication. It is clear that the program starts up, then fails during the run in some form of lockup, well after the initial startup code. If I run with the strace command, it always runs with -C0.00N. All the above was done with profiling enabled, so I removed that and instead enabled eventlog to look for clues. In this case it lies between good and bad, in that IO to my USB is working, but the GUI comes up blank and never paints. Running this case without -v0 (event log) the gui partially paints and stops, but USB continues. Questions: 1) Does ghc 7.8.4 have any improvements that might pertain to these kinds of scheduling/thread problems? 
2) Is there anything about the nature of a thread using USB, I2C, or wxHaskell IO that leads to problems that a pure calculation app would not have?
3) Any ideas how to track down the problem when changing conditions (compiler or runtime options) affects behavior?
4) Are there other options besides -V and -C for the runtime that might apply?
5) What does -V0 do that makes a problem program run?

Mike

On Oct 29, 2014, at 6:02 PM, Michael Jones m...@proclivis.com wrote:

John,

Adding -C0.005 makes it much better. Using -C0.001 makes it behave more like -N4. Thanks. This saves my project, as I need to deploy on a single core Atom and was stuck.

Mike

On Oct 29, 2014, at 5:12 PM, John Lato jwl...@gmail.com wrote:

By any chance do the delays get shorter if you run your program with `+RTS -C0.005`? If so, I suspect you're having a problem very similar to one that we had with ghc-7.8 (7.6 too, but it's worse on ghc-7.8 for some reason), involving possible misbehavior of the thread scheduler.

On Wed, Oct 29, 2014 at 2:18 PM, Michael Jones m...@proclivis.com wrote:

I have a general question about thread behavior in 7.8.3 vs 7.6.X.

I moved from 7.6 to 7.8 and my application behaves very differently. I have three threads: an application thread that plots data with wxhaskell or sends it over a network (depends on settings), a thread doing usb bulk writes, and a thread doing usb bulk reads. Data is moved around with TChan, and TVar is used for coordination.

When the application was compiled with 7.6, my stream of usb traffic was smooth. With 7.8, there are lots of delays where nothing seems to be running. These delays are up to 40ms, whereas with 7.6 delays were 1ms or so.

When I add -N2 or -N4, the 7.8 program runs fine. But on 7.6 it runs fine without -N2/4.

The program is compiled -O2 with profiling. The -N2/4 version uses more memory, but in both cases, with 7.8 and with 7.6, there is no space leak.

I tried to compile and use -ls so I could take a look with threadscope, but the application hangs and writes no data to the file. The CPU fans run wild like it is in an infinite loop. It at least pops an unpainted wxhaskell window, so it got partially running.

One of my libraries uses option -fsimpl-tick-factor=200 to get around the compiler.

What do I need to know about changes to threading and event logging between 7.6 and 7.8? Is there some general documentation somewhere that might help?

I am on Ubuntu 14.04 LTS. I downloaded the 7.8 tool chain tar ball and installed it myself, after removing 7.6 with apt-get.

Any hints appreciated.

Mike
Re: GHC 7.8.3 thread hang
I am having difficulty imagining how there could be a loop that evaluates a black hole a second time by the same thread. The paper Haskell on a Shared-Memory Processor mentions that the runtime must not do this. My take on this description is that it should never happen. Even if code was recursive, it should not happen. Code might recurse forever and chew up memory until failure, but a thread creating a black hold should not trip up on its own black hole. So I am not sure what I am looking for. There is no recursion in the encode. Can you give some examples of what I should be looking for? Mike On Nov 11, 2014, at 8:37 PM, John Lato jwl...@gmail.com wrote: The blocked on black hole message is very suspicious. It means that thread 7 is blocked waiting for another thread to evaluate a thunk. But in this case, it's thread 7 that created that thunk and is supposed to be doing the evaluating. This is some evidence that Gregory's theory is correct and your encode function loops somewhere. On Wed Nov 12 2014 at 11:25:30 AM Michael Jones m...@proclivis.com wrote: Gregory, The options in the 7.8.3 user guide says in the -Msize option that by default the heap is unlimited. I have several applications, and they all have messages like: 7fddc7bcd700: cap 2: waking up thread 7 on cap 2 7fddc7bcd700: cap 2: thread 4 stopped (yielding) 7fddcaad6740: cap 2: running thread 7 (ThreadRunGHC) 7fddcaad6740: cap 2: thread 7 stopped (heap overflow) 7fddcaad6740: cap 2: requesting parallel GC 7fddc5ffe700: cap 0: starting GC 7fddc57fd700: cap 1: starting GC 7fdda77fe700: cap 3: starting GC 7fddcaad6740: cap 2: starting GC I assumed that when the heap ran out of space, it caused a GC, or it enlarged the heap. The programs that have these messages run for very long periods of time, and when I heap profile them, they use about 500KM to 1MB over long periods of time, and are quite stable. As a test, I ran the hang application with profiling to see if memory jumps up before or after the hang. 
What I notice is that the app moves along using about 800KB, then there is a spike to 2MB at the hang. So I believe you, but I am confused about the RTS behavior: how can I have all these overflow messages in a normal application, and how do I tell the difference between these routine messages and a real heap problem?

So, I dug deeper into the log. A normal execution for sending a command looks like:

7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)
7f99e6ffd700: cap 0: thread 7 stopped (heap overflow)
7f99e6ffd700: cap 0: requesting parallel GC
7f99e6ffd700: cap 0: starting GC
7f99e6ffd700: cap 0: GC working
7f99e6ffd700: cap 0: GC idle
7f99e6ffd700: cap 0: GC done
7f99e6ffd700: cap 0: GC idle
7f99e6ffd700: cap 0: GC done
7f99e6ffd700: cap 0: all caps stopped for GC
7f99e6ffd700: cap 0: finished GC
7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)
7f99e6ffd700: cap 0: sendCommand
7f99e6ffd700: cap 0: sendCommand: encoded
7f99e6ffd700: cap 0: sendCommand: size 4
7f99e6ffd700: cap 0: sendCommand: unpacked
7f99e6ffd700: cap 0: Sending command of size 4
7f99e6ffd700: cap 0: Sending command of size \NUL\EOT
7f99e6ffd700: cap 0: sendCommand: sent
7f99e6ffd700: cap 0: sendCommand: flushed
7f99e6ffd700: cap 0: thread 7 stopped (blocked on an MVar)
7f99e6ffd700: cap 0: running thread 2 (ThreadRunGHC)
7f99e6ffd700: cap 0: thread 2 stopped (yielding)
7f99e6ffd700: cap 0: running thread 45 (ThreadRunGHC)
7f99e6ffd700: cap 0: fetchTelemetryServer
7f99e6ffd700: cap 0: fetchTelemetryServer: got lock

The thread is run, overflows, GCs, runs again, then blocks on an MVar.
For the hang case:

7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)
7f99e6ffd700: cap 0: sendCommand
7f99e6ffd700: cap 0: thread 7 stopped (heap overflow)
7f99e6ffd700: cap 0: requesting parallel GC
7f99e6ffd700: cap 0: starting GC
7f99e6ffd700: cap 0: GC working
7f99e6ffd700: cap 0: GC idle
7f99e6ffd700: cap 0: GC done
7f99e6ffd700: cap 0: GC idle
7f99e6ffd700: cap 0: GC done
7f99e6ffd700: cap 0: all caps stopped for GC
7f99e6ffd700: cap 0: finished GC
7f9a05362a40: cap 0: running thread 1408 (ThreadRunGHC)
7f9a05362a40: cap 0: thread 1408 stopped (yielding)
7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)
7f99e6ffd700: cap 0: thread 7 stopped (heap overflow)
7f99e6ffd700: cap 0: requesting parallel GC
7f99e6ffd700: cap 0: starting GC
7f99e6ffd700: cap 0: GC working
7f99e6ffd700: cap 0: GC idle
7f99e6ffd700: cap 0: GC done
7f99e6ffd700: cap 0: GC idle
7f99e6ffd700: cap 0: GC done
7f99e6ffd700: cap 0: all caps stopped for GC
7f99e6ffd700: cap 0: finished GC
7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)
7f99e6ffd700: cap 0: thread 7 stopped (yielding)
7f99e6ffd700: cap 0: running thread 2 (ThreadRunGHC)
7f99e6ffd700: cap 0: thread 2 stopped (yielding)
7f99e6ffd700: cap 0: running thread 45 (ThreadRunGHC)
7f99e6ffd700: cap 0: fetchTelemetryServer
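When comparing a normal run against a hang, it can help to pull out just the stop reasons of one Haskell thread from the `+RTS -v` output. The following is a minimal sketch, assuming log lines shaped like the ones quoted above (`<os thread>: cap N: thread M stopped (reason)`); the function names `parseStop`, `stopReasons` and `lastStop` are made up for this illustration and are not part of GHC or any library.

```haskell
import Data.Maybe (mapMaybe)

-- | Parse one line of `+RTS -v` output. Returns the Haskell thread number
-- and the stop reason when the line is a "thread N stopped (...)" event,
-- and Nothing for every other kind of line.
parseStop :: String -> Maybe (Int, String)
parseStop line =
  case words line of
    (_osThread : "cap" : _capNo : "thread" : n : "stopped" : reason)
      | [(t, "")] <- reads n -> Just (t, unwords reason)
    _ -> Nothing

-- | All stop reasons recorded for one Haskell thread, in log order.
stopReasons :: Int -> [String] -> [String]
stopReasons t ls = [r | (t', r) <- mapMaybe parseStop ls, t' == t]

-- | The last stop reason for a thread, i.e. how it was last descheduled
-- before the log ends. Nothing if the thread never appears as stopped.
lastStop :: Int -> [String] -> Maybe String
lastStop t ls =
  case stopReasons t ls of
    [] -> Nothing
    rs -> Just (last rs)
```

Feeding it `lines <$> readFile "app.log"` (a hypothetical file name) and asking for `lastStop 7` would show whether thread 7 ended on `(blocked on an MVar)`, as in the normal run, or kept cycling through `(heap overflow)` and `(yielding)`, as in the hang.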
Re: GHC 7.8.3 thread hang
OS: Ubuntu 14.X LTS

GHC-Options: -rtsopts -threaded -debug -eventlog

Behaves the same with GHC-Options: -rtsopts -threaded -O2

Mike

On Nov 11, 2014, at 8:01 AM, Carter Schonwald carter.schonw...@gmail.com wrote:

what OS are you on? what build options did you use?
Re: GHC 7.8.3 thread hang
Gregory,

You are correct, it moved the problem.

I'll study the code looking for loops, but... one thing to note, there are two uses of the data for IO, and both would force evaluation of length/encode. One puts data in the GUI, one sends data by Ethernet. When only the GUI code is forcing IO, there is no hang. It is the Ethernet IO that causes the hang. Because the GUI forcing does not hang, it leads me to believe there are no loops in the code.

Below is some description of the overall application. "->" means order of operation. T == Thread#. I note where the hang is, and which statement's removal fixes it.

Mike

Main
  T1 -> put TVar -> Start T2 -> Start T3

Async Module (TChan1/2/3 input/output wrapper around Eth based server in another process)
  (This module takes a function that sends and recv data so that it is independent from communication mechanism)
  (There are 2 output channels, one for response to commands in TChan1, and one for a stream of telemetry)
  T2 -> take TChan1 (config) -> Serialize -> (Hangs here) Send Eth
  T2 -> Read Eth -> Deserialize -> put TChan2 (data1) put TChan3 (data2)

fetchTelemetryServer (Consumes from Async Module T2 and puts data in sequence for callback)
  T3 -> tryTake MVar2 (server lock) -> take TChan3 (data) -> put TSequence (data) -> put MVar2 (server unlock)

showOptionDialogServer (Produces for Async Module T2)
  (Changes the config in a dialog and sends it to the server to modify telemetry definition)
  wxH Menu Callback -> take MVar2 (server lock) -> take MVar1 (gui lock) -> take TVar (config) ->
    convert config to strings
    display in dialog
    pull strings from dialog
    build modified config
    put TVar (config) -> put TChan1 (config) <- (Remove and no hang)
    put MVar1 (gui unlock) -> put MVar2 (server lock)

performTelemetryServer (Takes data from telemetry and prepares it for painting)
  wxH Timer Callback -> take MVar1 (gui lock) -> take TSequence (data) -> modify data put Var (data2) -> repaint -> put MVar1 (gui unlock)

onPaint (Update the graphs in the GUI)
  wx Paint Callback -> tryTake MVar1 (gui lock) -> get Var (data2) -> get TVar (config) -> draw on gui -> put MVar1 (gui unlock)

On Nov 11, 2014, at 9:45 AM, Gregory Collins g...@gregorycollins.net wrote:

On Mon, Nov 10, 2014 at 11:11 PM, Michael Jones m...@proclivis.com wrote:

    ec <- return $ encode command
    traceEventIO $ "sendCommand: encoded"
    l <- ec `seq` return $ BSL.length ec

Your encode function probably loops on some inputs. When you call return $ foo, foo is not forced; that doesn't happen in your example until BSL.length forces its input. If I'm right, changing the first line to return $! encode command will move the hang to before the call to traceEventIO.

-- Gregory Collins g...@gregorycollins.net

___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
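The difference between `return $ foo` and `return $! foo` described above can be seen in a small self-contained sketch. Here `badEncode` is a hypothetical stand-in for an encode function that misbehaves; it calls `error` instead of looping so that the effect is observable without hanging. The names `lazyBind` and `strictBind` are likewise invented for this illustration.

```haskell
import Control.Exception (SomeException, try)

-- | Hypothetical stand-in for an encoder that fails as soon as its result
-- is demanded. A real looping encoder would hang instead of throwing.
badEncode :: Int -> String
badEncode _ = error "encode diverged"

-- | `return $ x` does not force x, so the bind succeeds and the bad value
-- travels on as an unevaluated thunk. The result is True: no exception yet.
lazyBind :: IO Bool
lazyBind = do
  r <- try (return $ badEncode 1) :: IO (Either SomeException String)
  return (either (const False) (const True) r)

-- | `return $! x` forces x to weak head normal form before returning, so
-- the failure surfaces at the bind itself. The result is False.
strictBind :: IO Bool
strictBind = do
  r <- try (return $! badEncode 1) :: IO (Either SomeException String)
  return (either (const False) (const True) r)
```

Note that `$!` only forces to weak head normal form. For a lazy ByteString that means the first chunk at most, so a later `BSL.length` can still be the point where the remaining work happens.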
Re: GHC 7.8.3 thread hang
Those are all over the log even when it runs properly. So I assume the runtime is resizing the heap or something. Perhaps someone knows if this is normal or not. Mike On Nov 11, 2014, at 12:24 PM, Brandon Allbery allber...@gmail.com wrote: On Tue, Nov 11, 2014 at 2:11 AM, Michael Jones m...@proclivis.com wrote: 7fe537eff700: cap 0: sendCommand: encoded PROBLEM HERE 7fe537eff700: cap 0: thread 7 stopped (heap overflow) Is it just me, or is that second message significant? -- brandon s allbery kf8nh sine nomine associates allber...@gmail.com ballb...@sinenomine.net unix, openafs, kerberos, infrastructure, xmonadhttp://sinenomine.net ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: GHC 7.8.3 thread hang
and why it can hang on a black hole. But if you have some hints, let me know.

Mike

On Nov 11, 2014, at 4:01 PM, Gregory Collins g...@gregorycollins.net wrote:

On Tue, Nov 11, 2014 at 2:06 PM, Michael Jones m...@proclivis.com wrote:

Those are all over the log even when it runs properly. So I assume the runtime is resizing the heap or something.

No, it means you're exhausting the heap (maybe the runtime stack for the thread running encode), probably because encode is infinite-looping. I think Occam's razor applies here, check that any recursion you're doing is actually reducing the recursive argument. Perhaps you could post the code (e.g. http://gist.github.com/)?

G

-- Gregory Collins g...@gregorycollins.net
GHC 7.8.3 thread hang
I am trying to debug a lockup problem (and need help with debugging technique), where hang means a thread stops at a known place during evaluation, and other threads continue.

The code near the problem is like:

    ec <- return $ encode command
    l <- return $ BSL.length ec
    ss <- return $ BSC.unpack ec

It does not matter if I use let or return, or if the length is taken after unpack. I used return so I could use this code for tracing, with strictness to try to find the exact statement that is the problem:

    traceEventIO "sendCommand"
    ec <- return $ encode command
    traceEventIO $ "sendCommand: encoded"
    l <- ec `seq` return $ BSL.length ec
    traceEventIO $ "sendCommand: size " ++ (show l)
    ss <- ec `seq` return $ BSC.unpack ec

When this runs, the program executes this many times, but always hangs under a certain condition.

For good evaluations:

7f04173ff700: cap 0: sendCommand
7f04173ff700: cap 0: sendCommand: encoded
7f04173ff700: cap 0: sendCommand: size 4
7f04173ff700: cap 0: sendCommand: unpacked
7f04173ff700: cap 0: Sending command of size 4
7f04173ff700: cap 0: Sending command of size \NUL\EOT
7f04173ff700: cap 0: sendCommand: sent
7f04173ff700: cap 0: sendCommand: flushed

for bad evaluation:

7f04173ff700: cap 0: sendCommand
7f04173ff700: cap 0: sendCommand: encoded

The lockup occurs when length is taken.

The difference between the working and non-working case is as follows:

A wxHaskell callback stuffs some data in a TChan. A thread started at application startup is reading the TChan and calling the code that hangs. If it did not hang, it would send it by TCP to another process/computer.

In the working case the callback pops a dialog, and passes data from one TChan to another TChan.

In the failing case, the data is used to generate strings in a wxHaskell grid, then it is parsed, and new data is made. The new data is a combination of old and new pieces of sub data. The shape of the data is identical, because I am not making any edits to the rows.
So when data that the callback sends to TChan is unmodified, no hang. But when the data is used to make text, put it in the gui, process it, and generate new data, it hangs.

As a test I modified the code so that the text is not put into the gui. The results are the same. This indicates it has something to do with creating strings and then data from strings and mixing old and new subdata.

Strings are created with show. Data is created by pattern matching and generating numbers from strings.

I should also point out that in the working case, the size of the resulting string is small, say 3. In the hang case, the resulting string would be long, say 5000-1.

I assume there are no limits to the size of ByteStrings or fundamental issues with the RTS stack/heap that require special settings.

I am using the following revisions:

GHC 7.8.3
base ==4.7.*,
mtl ==2.2.1,
containers == 0.5.5.1,
transformers ==0.4.1.0,
random == 1.0.1.1,
wx == 0.91.0.0,
wxcore == 0.91.0.0,
wxdirect == 0.91.0.0,
colour == 2.3.3,
stm == 2.4.2,
monad-loops == 0.4.2.1,
time == 1.4.2,
old-locale == 1.0.0.6,
fast-logger == 2.2.3,
network == 2.6.0.2,
bytestring == 0.10.4.0,
control-monad-loop == 0.1,
binary == 0.7.2.2,

I know that nobody can have an answer based on this. But what I am hoping is either there is some known bug, or someone can guide me in narrowing it down.

The event log does not have anything unusual in it. Other threads keep running, and I can exit the application normally. The thread does not throw an exception. It just hangs.

When I run the app, I just use +RTS -v

Perhaps there are some other options that might give more info?
— SNIPPET of log — 7fe544cfea40: cap 0: running thread 5 (ThreadRunGHC) 7fe544cfea40: cap 0: thread 5 stopped (yielding) 7fe544cfea40: cap 0: running thread 5 (ThreadRunGHC) 7fe544cfea40: cap 0: thread 5 stopped (suspended while making a foreign call) 7fe544cfea40: cap 0: running thread 5 (ThreadRunGHC) 7fe544cfea40: cap 0: thread 5 stopped (blocked on an MVar) 7fe537eff700: cap 0: running thread 2 (ThreadRunGHC) 7fe537eff700: cap 0: waking up thread 5 on cap 0 7fe537eff700: cap 0: thread 2 stopped (yielding) 7fe544cfea40: cap 0: running thread 5 (ThreadRunGHC) 7fe544cfea40: cap 0: sendCommand 7fe544cfea40: cap 0: sendCommand: encoded 7fe544cfea40: cap 0: sendCommand: size 3 WORKS HERE 7fe544cfea40: cap 0: sendCommand: unpacked 7fe544cfea40: cap 0: Sending command of size 3 7fe544cfea40: cap 0: Sending command of size \NUL\ETX 7fe544cfea40: cap 0: sendCommand: sent 7fe544cfea40: cap 0: sendCommand: flushed 7fe544cfea40: cap 0: thread 5 stopped (blocked on an MVar) 7fe537eff700: cap 0: running thread 2 (ThreadRunGHC) 7fe537eff700: cap 0: thread 2 stopped (yielding) 7fe537eff700: cap 0: running thread 2 (ThreadRunGHC) 7fe537eff700: cap 0: thread 2 stopped
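One way to narrow down which expression never finishes is to force each intermediate value explicitly and bound the wait. The sketch below uses `evaluate` from Control.Exception and `timeout` from System.Timeout, both in base; the helper name `forceWithin` is invented here and is not taken from the thread.

```haskell
import Control.Exception (evaluate)
import System.Timeout (timeout)

-- | Force a value to weak head normal form, giving up after the given
-- number of microseconds. Nothing means evaluation did not finish in time,
-- which points at this value as the one that diverges.
forceWithin :: Int -> a -> IO (Maybe a)
forceWithin micros x = timeout micros (evaluate x)
```

In the code from the original post this could be applied as `forceWithin 1000000 (BSL.length ec)`, with a trace event on either side. Two caveats, stated as assumptions rather than guarantees: `evaluate` only reaches weak head normal form, which is enough for an `Int64` length but not for a nested structure, and the timeout is delivered as an asynchronous exception, so a loop that never allocates may not be interrupted unless the code is built with `-fno-omit-yields`.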
Re: Thread behavior in 7.8.3
My hope is that if my threads are doing IO, the scheduler acts when there is an IO action with delay, or when STM blocks, etc.

So at the end of my pipe out, I have:

    sendTransactions :: MonadIO m => SMBusDevice DeviceDC590 -> TVar Bool -> ProcessT m (Spec, String) ()
    sendTransactions dev dts = repeatedly $ do
      dts' <- liftIO $ atomically $ readTVar dts
      when (dts' == True) (do
        (_, transactions) <- await
        liftIO $ sendOut dev transactions)

And my pipe in:

    returnTransactionResults :: MonadIO m => SMBusDevice DeviceDC590 -> TVar Bool -> SourceT m (Spec, Char)
    returnTransactionResults dev dts = repeatedly $ do
      (status, spec) <- liftIO $ readIn2 dev
      oldDts <- liftIO $ atomically $ readTVar dts
      let dts' = (ord $ status!!1) .&. 0x20
      let newDts = dts' /= 0
      when (oldDts /= newDts) (liftIO $ atomically $ writeTVar dts newDts)
      when (length spec /= 0) (mapM_ (\ch -> yield (executeSpec, ch)) spec)

sendOut will do a usb bulk write, and readIn2 will do a usb bulk read. Hopefully, somewhere in the usb code IO blocks for an interrupt (probably in libusb), and that allows the scheduler to switch threads. Given the behavior, I assume this is not the case, and it requires time slicing to switch threads.

I also send data between the in/out pipes via TChan. Remembering that each pipe is in a thread, hopefully if a readTChan blocks, the scheduler reschedules and the other thread runs.

For context, I do a lot of RTOS work, so my worldview of the expected behavior comes from that perspective.

Mike

On Oct 29, 2014, at 6:41 PM, Edward Z. Yang ezy...@mit.edu wrote:

Yes, that's right.

I brought it up because you mentioned that there might still be occasional delays, and those might be caused by a thread not being preemptible for a while.

Edward

Excerpts from John Lato's message of 2014-10-29 17:31:45 -0700:

My understanding is that -fno-omit-yields is subtly different.
I think that's for the case when a function loops without performing any heap allocations, and thus would never yield even after the context switch timeout. In my case the looping function does perform heap allocations and does eventually yield, just not until after the timeout. Is that understanding correct? (technically, doesn't it change to yielding after stack checks or something like that?) On Thu, Oct 30, 2014 at 8:24 AM, Edward Z. Yang ezy...@mit.edu wrote: I don't think this is directly related to the problem, but if you have a thread that isn't yielding, you can force it to yield by using -fno-omit-yields on your code. It won't help if the non-yielding code is in a library, and it won't help if the problem was that you just weren't setting timeouts finely enough (which sounds like what was happening). FYI. Edward Excerpts from John Lato's message of 2014-10-29 17:19:46 -0700: I guess I should explain what that flag does... The GHC RTS maintains capabilities, the number of capabilities is specified by the `+RTS -N` option. Each capability is a virtual machine that executes Haskell code, and maintains its own runqueue of threads to process. A capability will perform a context switch at the next heap block allocation (every 4k of allocation) after the timer expires. The timer defaults to 20ms, and can be set by the -C flag. Capabilities perform context switches in other circumstances as well, such as when a thread yields or blocks. My guess is that either the context switching logic changed in ghc-7.8, or possibly your code used to trigger a switch via some other mechanism (stack overflow or something maybe?), but is optimized differently now so instead it needs to wait for the timer to expire. The problem we had was that a time-sensitive thread was getting scheduled on the same capability as a long-running non-yielding thread, so the time-sensitive thread had to wait for a context switch timeout (even though there were free cores available!). 
I expect even with -N4 you'll still see occasional delays (perhaps 5% of calls). We've solved our problem with judicious use of `forkOn`, but that won't help at N1. We did see this behavior in 7.6, but it's definitely worse in 7.8. Incidentally, has there been any interest in a work-stealing scheduler? There was a discussion from about 2 years ago, in which Simon Marlow noted it might be tricky, but it would definitely help in situations like this. John L. On Thu, Oct 30, 2014 at 8:02 AM, Michael Jones m...@proclivis.com wrote: John, Adding -C0.005 makes it much better. Using -C0.001 makes it behave more like -N4. Thanks. This saves my project, as I need to deploy on a single core Atom and was stuck. Mike On Oct 29, 2014, at 5:12 PM, John Lato jwl...@gmail.com wrote: By any chance do the delays get shorter if you run your program with `+RTS -C0.005` ? If so, I suspect you're having a problem very similar to one that we had with ghc-7.8 (7.6 too, but it's worse
Thread behavior in 7.8.3
I have a general question about thread behavior in 7.8.3 vs 7.6.X

I moved from 7.6 to 7.8 and my application behaves very differently. I have three threads: an application thread that plots data with wxhaskell or sends it over a network (depends on settings), a thread doing usb bulk writes, and a thread doing usb bulk reads. Data is moved around with TChan, and TVar is used for coordination.

When the application was compiled with 7.6, my stream of usb traffic was smooth. With 7.8, there are lots of delays where nothing seems to be running. These delays are up to 40ms, whereas with 7.6 delays were 1ms or so.

When I add -N2 or -N4, the 7.8 program runs fine. But on 7.6 it runs fine without -N2/4.

The program is compiled -O2 with profiling. The -N2/4 version uses more memory, but in both cases with 7.8 and with 7.6 there is no space leak.

I tried to compile and use -ls so I could take a look with threadscope, but the application hangs and writes no data to the file. The CPU fans run wild like it is in an infinite loop. It at least pops an unpainted wxhaskell window, so it got partially running.

One of my libraries uses option -fsimpl-tick-factor=200 to get around the compiler.

What do I need to know about changes to threading and event logging between 7.6 and 7.8? Is there some general documentation somewhere that might help?

I am on Ubuntu 14.04 LTS. I downloaded the 7.8 tool chain tar ball and installed it myself, after removing 7.6 with apt-get.

Any hints appreciated.

Mike
Re: Thread behavior in 7.8.3
Ben,

I am using Bas van Dijk's usb, and I am past the -threading issue by using the latest commit.

I don't have any easy way of making comparisons between 7.6 and 7.8 productivity, but from oscilloscope activity, I can't see any difference. The only difference I see is the thread scheduling on 7.8 for -N1 vs -N2/4.

If -sstderr gives some notion of productivity, I'll have to do an experiment between -N1 and -N2/4. Uncharted territory for me. I'll set up an experiment tonight.

I am not familiar with strace. I'll fix that soon.

Mike

On Oct 29, 2014, at 10:24 AM, Ben Gamari bgamari.f...@gmail.com wrote:

Michael Jones m...@proclivis.com writes:

I have a general question about thread behavior in 7.8.3 vs 7.6.X I moved from 7.6 to 7.8 and my application behaves very differently. I have three threads, an application thread that plots data with wxhaskell or sends it over a network (depends on settings), a thread doing usb bulk writes, and a thread doing usb bulk reads. Data is moved around with TChan, and TVar is used for coordination.

Are you using Bas van Dijk's `usb` library by any chance? If so, you should be aware of this [1] issue.

When the application was compiled with 7.6, my stream of usb traffic was smooth. With 7.8, there are lots of delays where nothing seems to be running. These delays are up to 40ms, whereas with 7.6 delays were 1ms or so. When I add -N2 or -N4, the 7.8 program runs fine. But on 7.6 it runs fine without -N2/4. The program is compiled -O2 with profiling. The -N2/4 version uses more memory, but in both cases with 7.8 and with 7.6 there is no space leak.

Have you looked at the RTS's output when run with `+RTS -sstderr`? Is productivity any different between the two tests?

I tried to compile and use -ls so I could take a look with threadscope, but the application hangs and writes no data to the file. The CPU fans run wild like it is in an infinite loop.

Oh dear, this doesn't sound good at all.
Have you tried getting a backtrace out of gdb? Usually this isn't terribly useful but in this case since the event log is involved it might be getting stuck in the RTS which should give a useful backtrace. If not, perhaps strace will give some clues as to what is happening (you'll probably want to hide SIGVTALRM to improve signal/noise)?

Cheers,

- Ben

[1] https://github.com/basvandijk/usb/issues/7
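The productivity figure that `+RTS -sstderr` prints at exit can also be sampled from inside a running program. The sketch below assumes a newer base than the one discussed in this thread: `getRTSStats`, `getRTSStatsEnabled` and the `RTSStats` record come from GHC.Stats in base 4.10 and later, and they only return data when the program is started with `+RTS -T` (or `-s`). The helper names `productivityOf` and `currentProductivity` are invented for this example, and the ratio used here (mutator CPU time over total CPU time) is an approximation of what the RTS summary reports.

```haskell
import Data.Int (Int64)
import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- | Fraction of CPU time spent running the program itself (the mutator)
-- rather than the garbage collector. Both arguments are in nanoseconds.
productivityOf :: Int64 -> Int64 -> Double
productivityOf mutatorNs totalNs
  | totalNs <= 0 = 0
  | otherwise = fromIntegral mutatorNs / fromIntegral totalNs

-- | Productivity so far, or Nothing when RTS statistics are not enabled
-- (the program was not started with +RTS -T or -s).
currentProductivity :: IO (Maybe Double)
currentProductivity = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then return Nothing
    else do
      s <- getRTSStats
      return (Just (productivityOf (mutator_cpu_ns s) (cpu_ns s)))
```

Logging this value once per second under -N1 and again under -N2 would give a rough side-by-side comparison without waiting for the program to exit.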
Re: Thread behavior in 7.8.3
John,

Adding -C0.005 makes it much better. Using -C0.001 makes it behave more like -N4.

Thanks. This saves my project, as I need to deploy on a single core Atom and was stuck.

Mike

On Oct 29, 2014, at 5:12 PM, John Lato jwl...@gmail.com wrote:

By any chance do the delays get shorter if you run your program with `+RTS -C0.005` ? If so, I suspect you're having a problem very similar to one that we had with ghc-7.8 (7.6 too, but it's worse on ghc-7.8 for some reason), involving possible misbehavior of the thread scheduler.
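Besides shortening the context-switch timer with `-C`, the thread mentions `forkOn` as a way to keep a time-sensitive thread off a capability that is busy with a long-running one. The following is a minimal sketch using `forkOn` and `threadCapability` from Control.Concurrent; the helper name `runPinned` is invented for this illustration, and, as noted in the thread, pinning cannot help when the program runs with a single capability.

```haskell
import Control.Concurrent (forkOn, myThreadId, threadCapability)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- | Fork a thread pinned to the given capability and report back which
-- capability it is running on and whether it is locked to it. forkOn
-- interprets the capability number modulo the number of capabilities.
runPinned :: Int -> IO (Int, Bool)
runPinned cap = do
  done <- newEmptyMVar
  _ <- forkOn cap $ do
    tid <- myThreadId
    info <- threadCapability tid
    putMVar done info
  takeMVar done
```

With `+RTS -N4`, a usb reader started via `forkOn 1` and a plotting thread started via `forkOn 2` would not compete for the same run queue. Which threads deserve their own capability is an application decision that this sketch does not try to make.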
Re: Odd FFI behavior
Donn, I was able to duplicate my problem in C using SIGVTALRM. Can someone explain the impact of using -V0 ? What does it do to performance, etc? Mike Sent from my iPad On Aug 13, 2014, at 9:56 AM, Donn Cave d...@avvanta.com wrote: [ ... re -V0 ] Thanks, this solved the problem. I would like to know more about what the signals are doing, and what am I giving up by disabling them? My hope is I can then go back to the dll expert and ask why this is causing their library a problem and try to see if they can solve the problem from their end, etc. I'm disgracefully ignorant about that. When I've been forced to run this way, it doesn't seem to do any very obvious immediate harm to the application at all, but I could be missing long term effects. The problem with the library might be easy to fix, and in principle it's sure worth looking into - while the GHC runtime delivers signals on an exceptionally massive scale, there are plenty of normal UNIX applications that use signals, maybe timers just like this for example, and it's easy to set up a similar test environment using setitimer(2) to provide the signal bombardment. (I believe GHC actually uses SIGVTALRM rather than SIGALRM, but don't think it will make any difference.) But realistically, in the end sometimes we can't get a fix for it, so it's interesting to know how -V0 works out as a work-around. I hope you will keep us posted. Donn ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Odd FFI behavior
Donn, Thanks, this solved the problem. I would like to know more about what the signals are doing, and what am I giving up by disabling them? My hope is I can then go back to the dll expert and ask why this is causing their library a problem and try to see if they can solve the problem from their end, etc. Mike On Aug 12, 2014, at 11:04 PM, Donn Cave d...@avvanta.com wrote: ... Because the failures are not general in that they target one particular value, and seem to be affected by time, it makes me wonder if there is some subtle Haskell run time issue. Like, could the garbage collector be interacting with things? Does anyone have an idea what kind of things to look for? Sure - not that I have worked out in any detail how this would do what you're seeing, but it's easy to do and often enough works. Compile with RTS options enabled and invoke with RTS option -V0. That will disable the runtime internal timer, which uses signals. The flood of signals from this source can interrupt functions that aren't really designed to deal with that, because in a more normal context they don't have to. Donn ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Odd FFI behavior
I have some strange behavior with GHC 7.6.3 on Ubuntu 14 LTS when using FFI and I am looking for some ideas on what could be going on.

Fundamentally, adding wait calls (delays coded in the C) changes the behavior of the C, in that returned status codes have proper values when there are delays, and return errors when there are no delays. But these same calls result in proper behavior on the Aardvark's serial bus, return proper data, etc. Only the status gets messed up.

The module calls a thin custom C layer over the Aardvark C layer, which dynamically loads a dll and makes calls into it. The thin layer just makes the use of c2hs easier.

It is always possible there is some kind of memory issue, but there is no pattern to the mishap. It is random. Adding delays just changes the probability of a bad status.

I made a C version of my application calling the same custom C layer, and there are no problems. This sort of indicates the problem is with the FFI.

Because the failures are not general in that they target one particular value, and seem to be affected by time, it makes me wonder if there is some subtle Haskell run time issue. Like, could the garbage collector be interacting with things?

Does anyone have an idea what kind of things to look for?
Mike

DLL loader:

    static void *_loadFunction (const char *name, int *result) {
        static DLL_HANDLE handle = 0;
        void * function = 0;

        /* Load the shared library if necessary */
        if (handle == 0) {
            u32 (*version) (void);
            u16 sw_version;
            u16 api_version_req;

            _setSearchPath();
            handle = dlopen(SO_NAME, RTLD_LAZY);
            if (handle == 0) {
    #if API_DEBUG
                fprintf(stderr, "Unable to load %s\n", SO_NAME);
                fprintf(stderr, "%s\n", dlerror());
    #endif
                *result = API_UNABLE_TO_LOAD_LIBRARY;
                return 0;
            }

            version = (void *)dlsym(handle, "c_version");
            if (version == 0) {
    #if API_DEBUG
                fprintf(stderr, "Unable to bind c_version() in %s\n", SO_NAME);
                fprintf(stderr, "%s\n", dlerror());
    #endif
                handle = 0;
                *result = API_INCOMPATIBLE_LIBRARY;
                return 0;
            }

            /* Low 16 bits: library version; high 16 bits: required header version */
            sw_version      = (u16)((version() >>  0) & 0xffff);
            api_version_req = (u16)((version() >> 16) & 0xffff);
            if (sw_version < API_REQ_SW_VERSION ||
                API_HEADER_VERSION < api_version_req)
            {
    #if API_DEBUG
                fprintf(stderr, "\nIncompatible versions:\n");

                fprintf(stderr, "Header version = v%d.%02d ",
                        (API_HEADER_VERSION >> 8) & 0xff, API_HEADER_VERSION & 0xff);

                if (sw_version < API_REQ_SW_VERSION)
                    fprintf(stderr, "(requires library >= %d.%02d)\n",
                            (API_REQ_SW_VERSION >> 8) & 0xff,
                            API_REQ_SW_VERSION & 0xff);
                else
                    fprintf(stderr, "(library version OK)\n");

                fprintf(stderr, "Library version = v%d.%02d ",
                        (sw_version >> 8) & 0xff,
                        (sw_version >> 0) & 0xff);

                if (API_HEADER_VERSION < api_version_req)
                    fprintf(stderr, "(requires header >= %d.%02d)\n",
                            (api_version_req >> 8) & 0xff,
                            (api_version_req >> 0) & 0xff);
                else
                    fprintf(stderr, "(header version OK)\n");
    #endif
                handle = 0;
                *result = API_INCOMPATIBLE_LIBRARY;
                return 0;
            }
        }

        /* Bind the requested function in the shared library */
        function = (void *)dlsym(handle, name);
        *result = function ? API_OK : API_UNABLE_TO_LOAD_FUNCTION;
        return function;
    }
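The version check in the loader can be read in isolation. Below is a minimal sketch, assuming (as the shifts by 0 and 16 in the loader suggest) that the 32-bit word returned by the library packs the software version in the low 16 bits and the required header version in the high 16 bits. The names version_info, unpack_version and versions_compatible are hypothetical and not part of the Aardvark API.

```c
#include <stdint.h>

/* Hypothetical types and names for illustration only. */
typedef struct {
    uint16_t sw_version;       /* version of the loaded shared library */
    uint16_t api_version_req;  /* minimum header version the library requires */
} version_info;

/* Split the packed 32-bit version word into its two 16-bit halves. */
version_info unpack_version(uint32_t packed)
{
    version_info v;
    v.sw_version      = (uint16_t)(packed & 0xffffu);
    v.api_version_req = (uint16_t)((packed >> 16) & 0xffffu);
    return v;
}

/* Returns 1 when both sides are new enough for each other, 0 otherwise:
 * the library must be at least req_sw, and the header must be at least
 * the version the library asks for. */
int versions_compatible(version_info v, uint16_t req_sw, uint16_t header_version)
{
    return v.sw_version >= req_sw && header_version >= v.api_version_req;
}
```

For example, a packed value of 0x0500050A would unpack to library version 5.10 requiring header version 5.00. None of this touches the status codes that go wrong in the report; it only makes the loader's compatibility rule easier to follow.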
Re: Cross compiling for Cortex A9
I ran the same program on my dev Linux box and compared the logs to see if there were clues.

On my dev box (Ubuntu), here are some snippets of the log (working app):

    new task (taskCount: 1)
    cap 0: created thread 1
    new bound thread (1)
    cap 0: schedule()
    cap 0: running thread 1 (ThreadRunGHC)
    New weak pointer at 0x7f7788704050
    stg_ap_0_ret... base:GHC.MVar.MVar(0x7f7788704148)
    stg_ap_0_ret... ghc-prim:GHC.Tuple.(,)(0x7f7788704281, 0x7f7788704272)
    stg_ap_0_ret... ghc-prim:GHC.Types.:(0x7f7788704421, 0x7f778870440a)
    New weak pointer at 0x7f7788706188
    stg_ap_0_ret... base:GHC.IO.Handle.Types.FileHandle(0x6d4080, 0x7f77887060b0)

And on the failing embedded target:

    new task (taskCount: 1)
    cap 0: created thread 1
    new bound thread (1)
    cap 0: schedule()
    cap 0: running thread 1 (ThreadRunGHC)
    New weak pointer at 0x76c02038
    stg_ap_0_ret... base:GHC.MVar.MVar(0x76c020d4)
    stg_ap_0_ret... ghc-prim:GHC.Tuple.(,)(0x76c021a9, 0x76c0219a)
    cap 0: thread 1 stopped (yielding)
    cap 0: running thread 1 (ThreadRunGHC)
    stg_ap_0_ret... ghc-prim:GHC.Types.:(0x76c023d1, 0x76c023be)
    stg_ap_0_ret... ghc-prim:GHC.Tuple.(,)(0x76c029a1, 0x76c02992)
    stg_ap_0_ret... THUNK_SELECTOR(0x46ee50, CAF, 0x76c05068)
    stg_ap_0_ret... ghc-prim:GHC.Types.[]()
    stg_ap_v_ret... IND_STATIC(0x76c0360c)
    stg_ap_v_ret... FUN/1(0x59088, CAF, 0x76c02980)
    cap 0: thread 1 stopped (stack overflow)

A couple of things are different on the failing target. First, the thread yields and is then allowed to run again, whereas on Ubuntu it just runs to completion. Second, there is a second set of primitive calls:

    stg_ap_0_ret... ghc-prim:GHC.Tuple.(,)(0x76c029a1, 0x76c02992)
    stg_ap_0_ret... ghc-prim:GHC.Types.[]()
    cap 0: thread 1 stopped (stack overflow)

And this is followed by the stack overflow. These two calls are not present when run on Ubuntu.
The original source is:

    module Main where

    main :: IO ()
    main = putStr "Hello world!"

As a reminder, the difference is:

    Working: standard GHC on Ubuntu
    Failing: cross compiled for Cortex A and run on a Wandboard with Yocto Linux

Do these two extra calls give any clues to the cause of the stack overflow?

The other thing to note is that the working version eventually gets to various IO calls. These are never reached in the failing version. So it seems to be choking in the type system code before it gets to IO.

Is there a test program that could be run that might narrow down the problem?

Mike
Re: Cross compiling for Cortex A9
To be complete, there is an old link on compiling for ARM, and it recommends this build process:

    $ chmod ugo+rx build-ghc-arm.sh
    (edit build-ghc-arm.sh to fix EOF)
    $ ./build-ghc-arm.sh -j4
    $ make test
    $ sudo make install
    $ arm-poky-linux-gnueabi-ghc --info

This way of building produces the same result as below. So I am guessing that the problems this fixed moons ago no longer exist, or produce different problems that are masked.

What will help me, not knowing the RTS, is if anyone can recognize the failure pattern and give me some hints about where in the code the problem might be. That way I can add some logging specifically related to the problem to get more data, and limit my study of the RTS code to a specific set of files.

My other thought is whether there is some test code that might help reveal the problem: something that, if it runs and the log is clean, eliminates possible problems.

The RTS is able to write a log via stdio, so that could be exploited to run test code that can log information. But the hello world fails to write to stdio. So perhaps the problem occurs in the startup phase, where the thread is created, and things go wrong before hello world is executed. This would suggest any test code would have to be run within the RTS itself.

Is there some way to enable some ASSERTIONS within the RTS that would help narrow down the problem?

Also, is there any way to compile up something small to run the RTS without any Haskell libraries, and debug it with a C/C++ debugger? I am able to debug other C/C++ programs on the target device, so if I had a C/C++ program, I could run and debug it.

Mike

On Aug 2, 2014, at 4:04 PM, Michael Jones m...@proclivis.com wrote:

> Karel,
>
> My configure hack is now limited to two hacks on GHC 7.8.3: the vendor "poky", and the ARM settings that make the target architecture correct. I also enabled profiling in build.mk. So pretty clean. No compiler options. GHC compiles clean.
> I am getting a stack overflow during execution of the hello world application.
>
> I compiled like:
>
>     arm-poky-linux-gnueabi-ghc -static -debug -auto-all -caf-all -prof Main.hs
>
> I ran on the target like:
>
>     ./Main +RTS -Ds -Di -Db -DS -Dt -Da 2>log
>
> I attached a log. Search for overflow.
>
>     created capset 0 of type 2
>     created capset 1 of type 3
>     cap 0: initialised
>     assigned cap 0 to capset 0
>     assigned cap 0 to capset 1
>     free block list [0]:
>     free block list [1]:
>     free block list [2]:
>     free block list [3]:
>     free block list [4]:
>     free block list [5]:
>     free block list [6]:
>       group at 0x76c82000, length 126 blocks
>     free block list [7]:
>     free block list [8]:
>     free block list [0]:
>     free block list [1]:
>     free block list [2]:
>     free block list [3]:
>     free block list [4]:
>     free block list [5]:
>     free block list [6]:
>       group at 0x76c82000, length 126 blocks
>     free block list [7]:
>     free block list [8]:
>     free block list [0]:
>     free block list [1]:
>     free block list [2]:
>     free block list [3]:
>     free block list [4]:
>     free block list [5]:
>     free block list [6]:
>       group at 0x76c82000, length 125 blocks
>     free block list [7]:
>     free block list [8]:
>     free block list [0]:
>     free block list [1]:
>     free block list [2]:
>     free block list [3]:
>     free block list [4]:
>     free block list [5]:
>     free block list [6]:
>       group at 0x76c82000, length 124 blocks
>     free block list [7]:
>     free block list [8]:
>     free block list [0]:
>     free block list [1]:
>     free block list [2]:
>     free block list [3]:
>     free block list [4]:
>     free block list [5]:
>     free block list [6]:
>       group at 0x76c82000, length 123 blocks
>     free block list [7]:
>     free block list [8]:
>     free block list [0]:
>     free block list [1]:
>     free block list [2]:
>     free block list [3]:
>     free block list [4]:
>     free block list [5]:
>     free block list [6]:
>       group at 0x76c82000, length 122 blocks
>     free block list [7]:
>     free block list [8]:
>     new task (taskCount: 1)
>     cap 0: created thread 1
>     new bound thread (1)
>     cap 0: schedule()
>     cap 0: running thread 1 (ThreadRunGHC)
>     stg_ap_v_ret... PAP/1(0x5ba25a, 0x5a3670)
>     stg_ap_0_ret... base:GHC.MVar.MVar(0x76c020d4)
>     stg_ap_v_ret... Main.main(0xa964, CAF:main)
>     stg_ap_v_ret... PAP/1(0x5bce1a, 0x76c02160)
>     stg_ap_0_ret... ghc-prim:GHC.Tuple.(,)(0x76c021a9, 0x76c0219a)
>     stg_ap_0_ret... ghc-prim:GHC.Types.:(0x76c023d1, 0x76c023be)
>     cap 0: thread 1 stopped (yielding)
>     cap 0: running thread 1 (ThreadRunGHC)
>     stg_ap_0_ret... base:GHC.IO.Encoding.Types.TextEncoding(0x5aadd0, 0x76c026c5, 0x76c02665)
>     stg_ap_v_ret... GHC.IO.Encoding.getForeignEncoding(0x59458, CAF)
>     stg_ap_0_ret... ghc-prim:GHC.Tuple.(,)(0x76c029a1, 0x76c02992)
>     stg_ap_0_ret... FUN/1(0x59088, CAF, 0x76c02980)
>     stg_ap_v_ret... FUN/1(0x59088, CAF, 0x76c02980)
>     stg_ap_0_ret... base:GHC.IO.Encoding.Types.TextEncoding(0x5aadd0, 0x76c02aa1, 0x76c02a41)
>     stg_ap_0_ret... ghc-prim:GHC.Types.:(0x76c023d1, 0x76c02b40)
>     stg_ap_0_ret... ghc-prim:GHC.Types.:(0x76c023ad, 0x76c02ba8)
>     stg_ap_0_ret... ghc-prim:GHC.Types.:(0x76c02389, 0x76c02c10)
>     stg_ap_0_ret... ghc-prim:GHC.Types
Re: Cross compiling for Cortex A9
    ... FUN/1(0x59088, CAF, 0x76c02980)
    stg_ap_v_ret... IND_STATIC(0x76c0360c)
    stg_ap_v_ret... FUN/1(0x59088, CAF, 0x76c02980)
    stg_ap_v_ret... IND_STATIC(0x76c0360c)
    stg_ap_v_ret... FUN/1(0x59088, CAF, 0x76c02980)
    cap 0: thread 1 stopped (stack overflow)

On Aug 2, 2014, at 12:07 PM, Karel Gardas karel.gar...@centrum.cz wrote:

> On 08/ 2/14 07:04 AM, Michael Jones wrote:
>>  ,("target arch","ArchARM {armISA = ARMv7, armISAExt = [VFPv3,NEON], armABI = HARD}")
>>  ,("target word size","4")
>
> this looks good, but I hope you got it on clean tree, i.e. without your configure hack...
>
> Karel
Re: Cross compiling for Cortex A9
Karel,

On CFLAGS.. When the cross compiler is compiled, does it use gcc, or is gcc only used to compile libraries with the stage-1 compiler? Because if gcc is used for both, I would need different flags for each, and I don't know how to make that happen.

Mike

On Aug 1, 2014, at 3:19 AM, Karel Gardas karel.gar...@centrum.cz wrote:

> OK, so you do have ghc-stage1 which is a cross-compiler. Now, what does ghc-stage1 --info tell you? Aha, you hacked configure, hmm. I don't think this is needed especially if you use proper CFLAGS.
>
> Karel
Re: Cross compiling for Cortex A9
Karel,

I think I proved that make dies early if CFLAGS has these options. The local gcc does not support ARM-related flags.

I am trying to build the toolchain with --with-arch= and --with-tune= so that it defaults to my target. This will bypass any GHC build issues.

Mike
Re: Cross compiling for Cortex A9
Karel,

I was able to get GHC to compile by rebuilding the gcc cross compiler using --with-arch= and --with-tune=.

However, it is not a satisfactory solution, because if a gcc toolchain uses multilibs, adding those options does not make sense. In the case of the Yocto toolchain, it does not use multilibs, so I can get away with it.

I'll test the compiler tomorrow and see how things go.

Here is the GHC info after building:

    [("Project name","The Glorious Glasgow Haskell Compilation System")
    ,("GCC extra via C opts"," -fwrapv")
    ,("C compiler command","arm-poky-linux-gnueabi-gcc")
    ,("C compiler flags"," -fno-stack-protector")
    ,("C compiler link flags","")
    ,("Haskell CPP command","arm-poky-linux-gnueabi-gcc")
    ,("Haskell CPP flags","-E -undef -traditional ")
    ,("ld command","/home/mike/fsl-community-bsp/build/tmp/sysroots/x86_64-linux/usr/bin/cortexa9hf-vfp-neon-poky-linux-gnueabi/arm-poky-linux-gnueabi-ld")
    ,("ld flags","")
    ,("ld supports compact unwind","YES")
    ,("ld supports build-id","YES")
    ,("ld supports filelist","NO")
    ,("ld is GNU ld","YES")
    ,("ar command","/home/mike/fsl-community-bsp/build/tmp/sysroots/x86_64-linux/usr/bin/cortexa9hf-vfp-neon-poky-linux-gnueabi/arm-poky-linux-gnueabi-ar")
    ,("ar flags","q")
    ,("ar supports at file","YES")
    ,("touch command","touch")
    ,("dllwrap command","/bin/false")
    ,("windres command","/bin/false")
    ,("libtool command","libtool")
    ,("perl command","/usr/bin/perl")
    ,("target os","OSLinux")
    ,("target arch","ArchARM {armISA = ARMv7, armISAExt = [VFPv3,NEON], armABI = HARD}")
    ,("target word size","4")
    ,("target has GNU nonexec stack","False")
    ,("target has .ident directive","True")
    ,("target has subsections via symbols","False")
    ,("Unregisterised","NO")
    ,("LLVM llc command","/usr/bin/llc")
    ,("LLVM opt command","/usr/bin/opt")
    ,("Project version","7.8.3")
    ,("Booter version","7.6.3")
    ,("Stage","1")
    ,("Build platform","x86_64-unknown-linux")
    ,("Host platform","x86_64-unknown-linux")
    ,("Target platform","arm-poky-linux")
    ,("Have interpreter","YES")
    ,("Object splitting supported","NO")
    ,("Have native code generator","NO")
    ,("Support SMP","YES")
    ,("Tables next to code","YES")
    ,("RTS ways","l debug thr thr_debug thr_l ")
    ,("Support dynamic-too","YES")
    ,("Support parallel --make","YES")
    ,("Dynamic by default","NO")
    ,("GHC Dynamic","NO")
    ,("Leading underscore","NO")
    ,("Debug on","False")
    ,("LibDir","/home/mike/ghc-7.8.3/inplace/lib")
    ,("Global Package DB","/home/mike/ghc-7.8.3/inplace/lib/package.conf.d")
    ]
Re: Cross compiling for Cortex A9
I have some progress, and a problem. First, note that I am using the 7.8.3 tarball for this discussion. If you read through, you will see a request for help at the end.

It looks like the cross compilation is trying to build stage2 when it shouldn't.

In order to get the resulting stage1 cross compiler to have:

    ,("target arch","ArchARM {armISA = ARMv7, armISAExt = [VFPv3,NEON], armABI = HARD}")

I hacked this:

    AC_DEFUN([GET_ARM_ISA],
    [
        changequote(, )dnl
        ARM_ISA=ARMv7
        ARM_ISA_EXT="[VFPv3,NEON]"
        changequote([, ])dnl
        [ARM_ABI="HARD"]
    ])

Now, my gcc cross compiler does not default to ARMv7. To compile for my Cortex A, I add these options:

    -march=armv7-a -mthumb-interwork -mfloat-abi=hard -mfpu=neon -mtune=cortex-a9

So I need to make sure that when building the libraries with stage1, it passes the correct options. To do that:

    AC_DEFUN([FPTOOLS_SET_C_LD_FLAGS],
    [
        AC_MSG_CHECKING([Setting up $2, $3, $4 and $5])
        …
        arm-poky-*)
            $2="$$2 -march=armv7-a -mthumb-interwork -mfloat-abi=hard -mfpu=neon -mtune=cortex-a9"
            ;;

Which results in a stage1 compiler with:

    ,("C compiler flags"," -march=armv7-a -mthumb-interwork -mfloat-abi=hard -mfpu=neon -mtune=cortex-a9 -fno-stack-protector")

As the build proceeds, all calls to stage1 are completing. Then, the build gets to this point:

    inplace/bin/mkdirhier compiler/stage2/build//.
    rm -f compiler/stage2/build/Config.hs
    Creating compiler/stage2/build/Config.hs ...
    done.

And I assume this means it is trying to build stage2. Correct me if I am wrong.

Eventually I get a build failure like this:

    gcc -E -DMAKING_GHC_BUILD_SYSTEM_DEPENDENCIES -march=armv7-a -mthumb-interwork -mfloat-abi=hard -mfpu=neon -mtune=cortex-a9 -fno-stack-protector -Icompiler/. -Icompiler/parser -Icompiler/utils -Icompiler/../rts/dist/build -Icompiler/stage2 -DGHCI -I'/home/mike/ghc-7.8.3/libraries/process/include' -I'/home/mike/ghc-7.8.3/libraries/directory/include' -I'/home/mike/ghc-7.8.3/libraries/unix/include' -I'/home/mike/ghc-7.8.3/libraries/time/include' -I'/home/mike/ghc-7.8.3/libraries/containers/include' -I'/home/mike/ghc-7.8.3/libraries/bytestring/include' -I'/home/mike/ghc-7.8.3/libraries/base/include' -I'/home/mike/ghc-7.8.3/rts/dist/build' -I'/home/mike/ghc-7.8.3/includes' -I'/home/mike/ghc-7.8.3/includes/dist-derivedconstants/header' -MM -x c compiler/parser/cutils.c -MF compiler/stage2/build/.depend-v.c_asm.bit
    gcc: error: unrecognized command line option ‘-mthumb-interwork’
    gcc: error: unrecognized command line option ‘-mfloat-abi=hard’
    gcc: error: unrecognized command line option ‘-mfpu=neon’
    make[1]: *** [compiler/stage2/build/.depend-v.c_asm] Error 1
    make: *** [all] Error 2

It is applying the -march… arguments to the local gcc. I am guessing that compiling for stage2 is using the local gcc because stage2 is only built when not making a cross compiler.

Now, in build.mk I have:

    BuildFlavour = quick-cross

Which is supposed to prevent stage2 compilation.

Something is wrong. Either I need to stop stage2 compilation, if that is what this is really doing, or prevent gcc from using the extra arguments. But I see no reason to run gcc. It seems like that would only be used at stage0, if at all.

Mike

On Jul 14, 2014, at 10:12 AM, Karel Gardas karel.gar...@centrum.cz wrote:

> On 07/14/14 04:58 PM, Michael Jones wrote:
>> Karel,
>>
>> Thanks. This helps.
>>
>> If I understand, you have Linux running on a Panda, and on that Panda system you have gcc, and you compile GHC on the Panda itself, rather than build a cross compiler. I can see the advantage of building this way.
>
> Correct!
>
>> As far as cross compilers, I have a reason for trying to build a cross compiler, other than the ability to keep the image of the target small. That is, eventually, I want to be able to compile for an RTOS and/or bare iron system. I decided that learning to cross compile for Linux first would be a good approach. Learn the build system on something known to work. So I probably will not give up on that.
>
> That is right, in future I'd also like to give a try to port GHC to some free RTOS and for this I'd need to use cross-compiling anyway, so I'll be on the same boat...
>
>> I got a book on Autoconf. I'll just have to dig a level deeper into the whole build system. Mainly it means learning the M4 system. I have never used it.
>>
>> Below are the defines from the command you suggested. Thanks for that, got me over an ignorance hump. At least this define, __ARM_ARCH_5T__, is in the aclocal.m4 file. So I will have to study the macros until I figure out what controls the gcc options passed to the gcc cross compiler.
>>
>> I guess my question is what actually controls this result ("target arch", "ArchARM {armISA = ARMv7, armISAExt = [VFPv3,NEON], armABI = HARD}")? Are these controlled by the defines below, or are they controlled by passing gcc arguments to the gcc compiler when the Haskell compiler calls gcc?
>
> Basically speaking
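The question of which defines drive the detected target can be probed outside the GHC build. The following is a rough sketch, assuming GCC-style ARM predefined macros (__ARM_ARCH_7A__, __ARM_ARCH_5T__, __ARM_PCS_VFP, __ARM_NEON__); the function names are hypothetical, and this is not the code in aclocal.m4, only an illustration of how compiler defaults surface as preprocessor defines.

```c
#include <stdio.h>

/* Hypothetical helpers: report what the compiler that built this file
 * targets by default, judged only from its predefined macros. */
const char *arm_isa_name(void)
{
#if defined(__ARM_ARCH_7A__)
    return "ARMv7-A";
#elif defined(__ARM_ARCH_5T__)
    return "ARMv5T";
#elif defined(__arm__)
    return "other ARM";
#else
    return "not ARM";
#endif
}

/* 1 if the hard-float calling convention (FP values in FP registers) is in use. */
int arm_hard_float_abi(void)
{
#if defined(__ARM_PCS_VFP)
    return 1;
#else
    return 0;
#endif
}

/* 1 if NEON is enabled for this compilation. */
int arm_has_neon(void)
{
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
    return 1;
#else
    return 0;
#endif
}

/* Print a one-line summary of the three checks above. */
void print_target_defaults(void)
{
    printf("ISA: %s, hard-float ABI: %d, NEON: %d\n",
           arm_isa_name(), arm_hard_float_abi(), arm_has_neon());
}
```

Calling print_target_defaults() from a small main, built once with the bare cross gcc and once with the explicit -march/-mfloat-abi/-mfpu options, should show whether the toolchain's defaults already match the Cortex-A9 target. If the two builds disagree, a configure check that runs the compiler without those options would likely see the ARMv5 defaults.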
Re: Cross compiling for Cortex A9
Karel,

Not solved yet, but I did not notice the target problem. It is obvious once pointed out. I'll try to fix that, try again, and report back.

Thanks,

Mike

On Jul 11, 2014, at 4:35 AM, Karel Gardas karel.gar...@centrum.cz wrote:

> I'm not sure if this is already solved, but if you cross-compile to A9, why do you use ARMv5 platform OS?
>
>     ("target arch","ArchARM {armISA = ARMv5, armISAExt = [], armABI = HARD}")
>
> This looks really strange: armABI HARD means FP data in FP regs, yet there is no VFP in armISAExt, and armISA is even set to ARMv5. For example on my Ubuntu 12.04 I do have:
>
>     ("target arch", "ArchARM {armISA = ARMv7, armISAExt = [VFPv3,NEON], armABI = HARD}")
>
> which is right for pandaboard which is dual Cortex-A9.
>
> So, in short, I really do not know if you do not hit some corner case in LLVM here. I would certainly suggest, especially considering your Cortex-A9 target, to update your OS to get to what I do have here: ARMv7+VFPv3/NEON+ABI HARD.
>
> BTW: Another issue may be that GHC misconfigures on your platform, and then you will need to tell us more about your target OS...
Cheers,
Karel

On 07/ 8/14 07:51 AM, Michael Jones wrote:

I am pasting both the info from the HOST and TARGET compilers:

HOST

    [(Project name,The Glorious Glasgow Haskell Compilation System)
    ,(GCC extra via C opts, -fwrapv)
    ,(C compiler command,/usr/bin/gcc)
    ,(C compiler flags, -fno-stack-protector -Wl,--hash-size=31 -Wl,--reduce-memory-overheads)
    ,(ar command,/usr/bin/ar)
    ,(ar flags,q)
    ,(ar supports at file,YES)
    ,(touch command,touch)
    ,(dllwrap command,/bin/false)
    ,(windres command,/bin/false)
    ,(perl command,/usr/bin/perl)
    ,(target os,OSLinux)
    ,(target arch,ArchX86_64)
    ,(target word size,8)
    ,(target has GNU nonexec stack,True)
    ,(target has .ident directive,True)
    ,(target has subsections via symbols,False)
    ,(LLVM llc command,llc)
    ,(LLVM opt command,opt)
    ,(Project version,7.6.3)
    ,(Booter version,7.6.3)
    ,(Stage,2)
    ,(Build platform,x86_64-unknown-linux)
    ,(Host platform,x86_64-unknown-linux)
    ,(Target platform,x86_64-unknown-linux)
    ,(Have interpreter,YES)
    ,(Object splitting supported,YES)
    ,(Have native code generator,YES)
    ,(Support SMP,YES)
    ,(Unregisterised,NO)
    ,(Tables next to code,YES)
    ,(RTS ways,l debug thr thr_debug thr_l thr_p dyn debug_dyn thr_dyn thr_debug_dyn thr_debug_p)
    ,(Leading underscore,NO)
    ,(Debug on,False)
    ,(LibDir,/usr/lib/ghc)
    ,(Global Package DB,/usr/lib/ghc/package.conf.d)
    ,(Gcc Linker flags,["-Wl,--hash-size=31","-Wl,--reduce-memory-overheads"])
    ,(Ld Linker flags,["--hash-size=31","--reduce-memory-overheads"])
    ]

TARGET ===

    [(Project name,The Glorious Glasgow Haskell Compilation System)
    ,(GCC extra via C opts, -fwrapv)
    ,(C compiler command,arm-poky-linux-gnueabi-gcc)
    ,(C compiler flags, -fno-stack-protector)
    ,(C compiler link flags,)
    ,(ld command,arm-poky-linux-gnueabi-ld)
    ,(ld flags,)
    ,(ld supports compact unwind,YES)
    ,(ld supports build-id,YES)
    ,(ld supports filelist,NO)
    ,(ld is GNU ld,YES)
    ,(ar command,/usr/bin/ar)
    ,(ar flags,q)
    ,(ar supports at file,YES)
    ,(touch command,touch)
    ,(dllwrap command,/bin/false)
    ,(windres command,/bin/false)
    ,(libtool command,libtool)
    ,(perl command,/usr/bin/perl)
    ,(target os,OSLinux)
    ,(target arch,ArchARM {armISA = ARMv5, armISAExt = [], armABI = HARD})
    ,(target word size,4)
    ,(target has GNU nonexec stack,False)
    ,(target has .ident directive,True)
    ,(target has subsections via symbols,False)
    ,(Unregisterised,NO)
    ,(LLVM llc command,llc)
    ,(LLVM opt command,opt)
    ,(Project version,7.8.2)
    ,(Booter version,7.6.3)
    ,(Stage,1)
    ,(Build platform,x86_64-unknown-linux)
    ,(Host platform,x86_64-unknown-linux)
    ,(Target platform,arm-unknown-linux)
    ,(Have interpreter,YES)
    ,(Object splitting supported,NO)
    ,(Have native code generator,NO)
    ,(Support SMP,YES)
    ,(Tables next to code,YES)
    ,(RTS ways,l debug thr thr_debug thr_l )
    ,(Support dynamic-too,YES)
    ,(Support parallel --make,YES)
    ,(Dynamic by default,NO)
    ,(GHC Dynamic,NO)
    ,(Leading underscore,NO)
    ,(Debug on,False)
    ,(LibDir,/usr/local/lib/arm-unknown-linux-gnueabi-ghc-7.8.2)
    ,(Global Package DB,/usr/local/lib/arm-unknown-linux-gnueabi-ghc-7.8.2/package.conf.d)
    ]

On Jul 7, 2014, at 10:42 PM, Carter Schonwald carter.schonw...@gmail.com wrote:

could you share the output of ghc --info?

On Tue, Jul 8, 2014 at 12:10 AM, Michael Jones m...@proclivis.com wrote:

I am having problems building a GHC cross compiler for Linux (Yocto on a Wandboard) running on a Cortex A9, and need some advice on how to debug it.

The cross compiler produces an executable that runs on the TARGET but fails to print. So I need help coming up with a strategy to narrow down the root cause.

Some details:

The application:

    main = do putStrLn "Haskell start"

The command line options work. The program runs, and I can step
Re: Cross compiling for Cortex A9
Karel,

I have failed to figure out how to make this happen:

    (target arch, ArchARM {armISA = ARMv7, armISAExt = [VFPv3,NEON], armABI = HARD})

I have added poky to the list of vendors in aclocal.m4, then configured like this:

    ./configure --target=arm-poky-linux-gnueabi --with-gcc=arm-poky-linux-gnueabi-gcc

But I end up with ARMv5.

I am new to Autotools and the Haskell build system, so I am not sure what controls this. I assume the idea is that the gcc cross-compiler compiles some code that checks for versions when it evaluates stuff like:

    AC_COMPILE_IFELSE([
        AC_LANG_PROGRAM(
            [],
            [#if defined(__ARM_ARCH_2__)  || \
                 defined(__ARM_ARCH_3__)  || \
                 defined(__ARM_ARCH_3M__) || \
                 defined(__ARM_ARCH_4__)  || \
                 defined(__ARM_ARCH_4T__) || \

So I then suspect the compiler needs options like -mcpu=cortex-a9 -mfpu=neon to get the proper version macro defined, so that the code can check the architecture. But even if all these assumptions are true, it is not clear where to put these options. And that is a big if.

Can you explain, or point to a doc that explains, how this works? The Haskell Developer Wiki does not get into the details at this level. Or, if you have tweaked files for the Panda, I can diff them against mine and figure it out.

Thanks,
Mike

On Jul 11, 2014, at 4:35 AM, Karel Gardas karel.gar...@centrum.cz wrote:

I'm not sure if this is already solved, but if you cross-compile to A9, why do you use an ARMv5 platform OS?

    (target arch,ArchARM {armISA = ARMv5, armISAExt = [], armABI = HARD})

This looks really strange: armABI is HARD, which means FP data in FP registers, yet there is no VFP in armISAExt, and armISA is even set to ARMv5. For example, on my Ubuntu 12.04 I have:

    (target arch, ArchARM {armISA = ARMv7, armISAExt = [VFPv3,NEON], armABI = HARD})

which is right for the PandaBoard, which is a dual Cortex-A9.

So, in short, I really do not know whether you are hitting some corner case in LLVM here.
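The configure question in this message (which compiler-predefined macros decide between ARMv5 and ARMv7) can be explored offline. What follows is a minimal Python sketch, not GHC's actual configure logic. It takes the text of a macro dump, such as the cross compiler prints for `gcc -dM -E - < /dev/null`, and classifies it the way the quoted `#if defined(__ARM_ARCH_...)` fragment appears to. The macro grouping, the function names, and the two sample dumps are assumptions made for illustration; only the first five macro names and `__ARM_ARCH_5T__` come from the thread.

```python
import re

# Macro names this sketch treats as "older than ARMv6". The first five appear
# in the configure fragment quoted in the thread, and __ARM_ARCH_5T__ is
# mentioned elsewhere in it; the remaining names are assumed for illustration.
PRE_V6_MACROS = {
    "__ARM_ARCH_2__", "__ARM_ARCH_3__", "__ARM_ARCH_3M__",
    "__ARM_ARCH_4__", "__ARM_ARCH_4T__",
    "__ARM_ARCH_5__", "__ARM_ARCH_5T__", "__ARM_ARCH_5E__", "__ARM_ARCH_5TE__",
}


def defined_macros(dump: str) -> set:
    """Collect macro names from '#define NAME VALUE' lines."""
    return set(re.findall(r"^#define\s+(\w+)", dump, flags=re.MULTILINE))


def classify_arm_isa(macros: set) -> str:
    """Hypothetical classification: the oldest matching group wins."""
    if macros & PRE_V6_MACROS:
        return "ARMv5"
    if any(re.fullmatch(r"__ARM_ARCH_6\w*__", m) for m in macros):
        return "ARMv6"
    if any(re.fullmatch(r"__ARM_ARCH_7\w*__", m) for m in macros):
        return "ARMv7"
    return "unknown"


# Made-up sample dumps, trimmed to the lines that matter here.
DEFAULT_DUMP = "#define __arm__ 1\n#define __ARM_ARCH_5T__ 1\n"
CORTEX_A9_DUMP = (
    "#define __arm__ 1\n#define __ARM_ARCH_7A__ 1\n#define __ARM_NEON__ 1\n"
)

if __name__ == "__main__":
    for name, dump in [("default", DEFAULT_DUMP), ("cortex-a9", CORTEX_A9_DUMP)]:
        print(name, classify_arm_isa(defined_macros(dump)))
```

If the real dump from the cross compiler shows `__ARM_ARCH_5T__` without extra flags and an `__ARM_ARCH_7...` macro once `-mcpu=cortex-a9` is added, that would support the suspicion that the compiler flags, rather than the vendor entry, drive the detected ISA.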
Cross compiling for Cortex A9
I am having problems building a GHC cross compiler for Linux (Yocto on a Wandboard) running on a Cortex A9, and need some advice on how to debug it.

The cross compiler produces an executable that runs on the TARGET but fails to print. So I need help coming up with a strategy to narrow down the root cause.

Some details:

The application:

    main = do putStrLn "Haskell start"

The command line options work. The program runs, and I can step through the assembly. Debug data is printed to the console. But putStrLn fails, and the program enters an infinite loop.

I compile my app as follows:

    arm-unknown-linux-gnueabi-ghc -debug -static Main.hs

Using -threaded does not fix the problem.

Let me compare debug data from a run on my HOST with a run on my TARGET. First, a run from my HOST:

    created capset 0 of type 2
    created capset 1 of type 3
    cap 0: initialised
    assigned cap 0 to capset 0
    assigned cap 0 to capset 1
    cap 0: created thread 1
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (suspended while making a foreign call)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (finished)
    cap 0: created thread 2
    cap 0: running thread 2 (ThreadRunGHC)
    cap 0: thread 2 stopped (finished)
    cap 0: starting GC
    cap 0: GC working
    cap 0: GC idle
    cap 0: GC done
    cap 0: GC idle
    cap 0: GC done
    cap 0: GC idle
    cap 0: GC done
    cap 0: GC idle
    cap 0: GC done
    cap 0: all caps stopped for GC
    cap 0: finished GC
    removed cap 0 from capset 0
    removed cap 0 from capset 1
    cap 0: shutting down
    deleted capset 0
    deleted capset 1

And it prints properly. So this is my reference for what it should do on the TARGET.
When I run on my TARGET, I get:

    created capset 0 of type 2
    created capset 1 of type 3
    cap 0: initialised
    assigned cap 0 to capset 0
    assigned cap 0 to capset 1
    cap 0: created thread 1
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (yielding)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (yielding)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (yielding)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (yielding)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (yielding)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (yielding)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (stack overflow)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (stack overflow)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (stack overflow)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (yielding)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (stack overflow)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (stack overflow)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (stack overflow)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (yielding)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (stack overflow)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (heap overflow)
    cap 0: starting GC
    cap 0: GC working
    cap 0: GC idle
    cap 0: GC done
    cap 0: GC idle
    cap 0: GC done
    cap 0: GC idle
    cap 0: GC done
    cap 0: all caps stopped for GC
    cap 0: finished GC
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (yielding)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (stack overflow)
    ...

And the debug data goes on forever, just as debugging assembly demonstrated an infinite loop.

Clearly, the following does not occur:

    cap 0: thread 1 stopped (suspended while making a foreign call)

And there are overflows.
If I had to guess, some code is in a loop retrying the foreign call and failing. Certainly it is in some kind of loop: I found a place where I can put a breakpoint, and telling GDB to continue hits the breakpoint over and over at the same place. So somewhere there is a loop.

I can step through the application with GDB and see names of files and offsets in assembly. But without true source-level debugging, that is a bit rough, especially for someone who does not know the RTS code. If there were a way to compile such that the C source code was available, and a place to break, that would help. However, I suspect that since it never makes a foreign call, there is no place in C to put the breakpoint anyway. So I am also assuming there is no direct way to debug with GDB.

But I can see debug output printed to the console. My hope is that there is a way to enable more printing, or a place where I can add more print functions to help find the problem.

So I think I need one of the following:

- A solution from someone who has seen this before, perhaps on the iPhone
- How to enable more debug logging
- Where in the source code to add debug statements to narrow down the problem

Thanks for any help you can give.

Mike

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
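The HOST/TARGET comparison in the message above can be made mechanical. As a rough sketch (Python, with hypothetical helper names; the two excerpts are shortened copies of the logs in the post), the following counts the reasons given in `thread N stopped (...)` lines and reports which reasons show up in only one of the runs.

```python
import re
from collections import Counter

# Matches lines such as "cap 0: thread 1 stopped (yielding)".
STOP_RE = re.compile(r"thread (\d+) stopped \(([^)]*)\)")


def stop_reasons(log: str) -> Counter:
    """Count how often each stop reason occurs in an RTS debug log."""
    return Counter(reason for _thread, reason in STOP_RE.findall(log))


def only_in_first(first: Counter, second: Counter) -> set:
    """Stop reasons that occur in `first` but never in `second`."""
    return set(first) - set(second)


# Shortened excerpts of the two logs from the post.
HOST_LOG = """\
cap 0: running thread 1 (ThreadRunGHC)
cap 0: thread 1 stopped (suspended while making a foreign call)
cap 0: running thread 1 (ThreadRunGHC)
cap 0: thread 1 stopped (finished)
cap 0: running thread 2 (ThreadRunGHC)
cap 0: thread 2 stopped (finished)
"""
TARGET_LOG = """\
cap 0: running thread 1 (ThreadRunGHC)
cap 0: thread 1 stopped (yielding)
cap 0: running thread 1 (ThreadRunGHC)
cap 0: thread 1 stopped (stack overflow)
cap 0: running thread 1 (ThreadRunGHC)
cap 0: thread 1 stopped (heap overflow)
"""

if __name__ == "__main__":
    host, target = stop_reasons(HOST_LOG), stop_reasons(TARGET_LOG)
    print("only on HOST:  ", sorted(only_in_first(host, target)))
    print("only on TARGET:", sorted(only_in_first(target, host)))
```

Run on the full logs, a summary like this makes the observation in the post explicit: in the portion shown, the TARGET run never reports `suspended while making a foreign call` or `finished`, while the HOST run never reports an overflow.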
Re: Cross compiling for Cortex A9
I am pasting both the info from the HOST and TARGET compilers:

HOST

    [(Project name,The Glorious Glasgow Haskell Compilation System)
    ,(GCC extra via C opts, -fwrapv)
    ,(C compiler command,/usr/bin/gcc)
    ,(C compiler flags, -fno-stack-protector -Wl,--hash-size=31 -Wl,--reduce-memory-overheads)
    ,(ar command,/usr/bin/ar)
    ,(ar flags,q)
    ,(ar supports at file,YES)
    ,(touch command,touch)
    ,(dllwrap command,/bin/false)
    ,(windres command,/bin/false)
    ,(perl command,/usr/bin/perl)
    ,(target os,OSLinux)
    ,(target arch,ArchX86_64)
    ,(target word size,8)
    ,(target has GNU nonexec stack,True)
    ,(target has .ident directive,True)
    ,(target has subsections via symbols,False)
    ,(LLVM llc command,llc)
    ,(LLVM opt command,opt)
    ,(Project version,7.6.3)
    ,(Booter version,7.6.3)
    ,(Stage,2)
    ,(Build platform,x86_64-unknown-linux)
    ,(Host platform,x86_64-unknown-linux)
    ,(Target platform,x86_64-unknown-linux)
    ,(Have interpreter,YES)
    ,(Object splitting supported,YES)
    ,(Have native code generator,YES)
    ,(Support SMP,YES)
    ,(Unregisterised,NO)
    ,(Tables next to code,YES)
    ,(RTS ways,l debug thr thr_debug thr_l thr_p dyn debug_dyn thr_dyn thr_debug_dyn thr_debug_p)
    ,(Leading underscore,NO)
    ,(Debug on,False)
    ,(LibDir,/usr/lib/ghc)
    ,(Global Package DB,/usr/lib/ghc/package.conf.d)
    ,(Gcc Linker flags,["-Wl,--hash-size=31","-Wl,--reduce-memory-overheads"])
    ,(Ld Linker flags,["--hash-size=31","--reduce-memory-overheads"])
    ]

TARGET ===

    [(Project name,The Glorious Glasgow Haskell Compilation System)
    ,(GCC extra via C opts, -fwrapv)
    ,(C compiler command,arm-poky-linux-gnueabi-gcc)
    ,(C compiler flags, -fno-stack-protector)
    ,(C compiler link flags,)
    ,(ld command,arm-poky-linux-gnueabi-ld)
    ,(ld flags,)
    ,(ld supports compact unwind,YES)
    ,(ld supports build-id,YES)
    ,(ld supports filelist,NO)
    ,(ld is GNU ld,YES)
    ,(ar command,/usr/bin/ar)
    ,(ar flags,q)
    ,(ar supports at file,YES)
    ,(touch command,touch)
    ,(dllwrap command,/bin/false)
    ,(windres command,/bin/false)
    ,(libtool command,libtool)
    ,(perl command,/usr/bin/perl)
    ,(target os,OSLinux)
    ,(target arch,ArchARM {armISA = ARMv5, armISAExt = [], armABI = HARD})
    ,(target word size,4)
    ,(target has GNU nonexec stack,False)
    ,(target has .ident directive,True)
    ,(target has subsections via symbols,False)
    ,(Unregisterised,NO)
    ,(LLVM llc command,llc)
    ,(LLVM opt command,opt)
    ,(Project version,7.8.2)
    ,(Booter version,7.6.3)
    ,(Stage,1)
    ,(Build platform,x86_64-unknown-linux)
    ,(Host platform,x86_64-unknown-linux)
    ,(Target platform,arm-unknown-linux)
    ,(Have interpreter,YES)
    ,(Object splitting supported,NO)
    ,(Have native code generator,NO)
    ,(Support SMP,YES)
    ,(Tables next to code,YES)
    ,(RTS ways,l debug thr thr_debug thr_l )
    ,(Support dynamic-too,YES)
    ,(Support parallel --make,YES)
    ,(Dynamic by default,NO)
    ,(GHC Dynamic,NO)
    ,(Leading underscore,NO)
    ,(Debug on,False)
    ,(LibDir,/usr/local/lib/arm-unknown-linux-gnueabi-ghc-7.8.2)
    ,(Global Package DB,/usr/local/lib/arm-unknown-linux-gnueabi-ghc-7.8.2/package.conf.d)
    ]

On Jul 7, 2014, at 10:42 PM, Carter Schonwald carter.schonw...@gmail.com wrote:

could you share the output of ghc --info?

On Tue, Jul 8, 2014 at 12:10 AM, Michael Jones m...@proclivis.com wrote:

I am having problems building a GHC cross compiler for Linux (Yocto on a Wandboard) running on a Cortex A9, and need some advice on how to debug it.

The cross compiler produces an executable that runs on the TARGET but fails to print. So I need help coming up with a strategy to narrow down the root cause.

Some details:

The application:

    main = do putStrLn "Haskell start"

The command line options work. The program runs, and I can step through the assembly. Debug data is printed to the console. But putStrLn fails, and the program enters an infinite loop.

I compile my app as follows:

    arm-unknown-linux-gnueabi-ghc -debug -static Main.hs

Using -threaded does not fix the problem.

Let me compare debug data from a run on my HOST with a run on my TARGET.
First, a run from my HOST:

    created capset 0 of type 2
    created capset 1 of type 3
    cap 0: initialised
    assigned cap 0 to capset 0
    assigned cap 0 to capset 1
    cap 0: created thread 1
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (suspended while making a foreign call)
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (finished)
    cap 0: created thread 2
    cap 0: running thread 2 (ThreadRunGHC)
    cap 0: thread 2 stopped (finished)
    cap 0: starting GC
    cap 0: GC working
    cap 0: GC idle
    cap 0: GC done
    cap 0: GC idle
    cap 0: GC done
    cap 0: GC idle
    cap 0: GC done
    cap 0: GC idle
    cap 0: GC done
    cap 0: all caps stopped for GC
    cap 0: finished GC
    removed cap 0 from capset 0
    removed cap 0 from capset 1
    cap 0: shutting down
    deleted capset 0
    deleted capset 1

And it prints properly. So this is my reference for what it should do on the TARGET.

When I run on my TARGET, I get: created capset 0