STM and fairness
Hi, I'd like to know a bit about the STM implementation in GHC, specifically about how it tries to achieve fairness. I've been reading Composable Memory Transactions but it does not contain that much details on this specific matter. What I want to know boils down to this: what order are processes run which have been woken up from a call to retry? When programming with condition variables the standard behaviour is that the process which has waited the longest is the first one to get to run. But that doesn't seem to be the behaviour here. Consider the following program: \begin{code} module STMFair where import Control.Concurrent import Control.Concurrent.STM test n = do v - newTVarIO 0 mapM_ (\n - forkIO (process n v) threadDelay delay) [1..n] atomically (writeTVar v 1) threadDelay delay delay = 50 process id var = do putStrLn (Process ++ show id ++ started) atomically $ do v - readTVar var if v == 0 then retry else return () putStrLn (Process ++ show id ++ finished) \end{code} When I run 'test 2' I expect it to print: Process 1 started Process 2 started Process 1 finished Process 2 finished This would correspond to the oldest process being executed first. But that is not what happens instead I get this (ghci 6.8.2, Ubuntu Linux): Process 1 started Process 2 started Process 2 finished Process 1 finished This is certainly not the behaviour I would want. I discovered this behaviour when implementing the dining philosophers using STM and there one of the philosophers gets starved. Except, that he's not quite starved. When I run the simulation long enough he will eventually be able to eat but then for a long time there will be some other philosopher that is starved. I find this behaviour very mysterious and it would be nice to have some light shed on it. Apart from this mysterious behaviour it seems quite easy to improve the fairness of the implementation. From my examples above it seems that the wait queues for a transactional variable do contain the processes in the order they call retry (try running 'test n' for some large n). It just seems that they are given to the scheduler in the wrong order, so all that needs to be done is to reverse the list. Am I right? Thanks for reading, Josef ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: STM and fairness
On Fri, Feb 29, 2008 at 4:27 PM, Roberto Zunino [EMAIL PROTECTED] wrote: Josef Svenningsson wrote: What I want to know boils down to this: what order are processes run which have been woken up from a call to retry? IIUC, the order of wake up is irrelevant, since *all* the threads will re-run the transaction in parallel. So, even if thread 1 is the first to wake up, thread 2 might beat it in the race, and complete its transaction first. That's not quite right since there is no true parallelism here. I'm running on a single core (which I suppose I could have mentioned) and so it is up the scheduler to make sure that processes get a fair chance at doing their business, i.e. achieving fairness. The point I was trying to make is that the scheduler isn't doing a very good job in this case. I suggest you put some random delay in your fairness tests, maybe using unsafeIOtoSTM, so that you can improve starvation ;-) I'd rather fix the scheduler. Also, try running a very slow (much-delayed) transaction againts several fast ones. I expect the slow one will never reach completion. Indeed. This is a well known problem with STM but afaict orthogonal to the problem I'm talking about. AFAIK, achieving fairness in STM can be quite hard (not unlike other mainstream approaches to concurrency, sadly). Yes. Still, in the particular situation I showed I think we can do a better job than what is currently being done. Cheers, Josef ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: static constants -- ideas?
jay: Don Stewart [EMAIL PROTECTED]: jay: Don Stewart [EMAIL PROTECTED]: jay: I also have constants that are too large to compile. I am resigned to loading them from data files--other solutions seem even worse. ... Data.Binary eases the irritation somewhat. Did you try bytestring literals (and maybe parsing them in-memory with Data.Binary)? I finally squeezed enough time to try it, and it didn't work for me. -- ghc Overflow.hs [1 of 1] Compiling Overflow ( Overflow.hs, Overflow.o ) Enable optimisations! Compile with ghc -O2. You need this to avoid having a very slow pack call at runtime. Overflow.hs:8:10:stack overflow: use +RTS -Ksize to increase it -- where Overflow.hs is in the vicinity of 40M and looks like -- {-# LANGUAGE OverloadedStrings #-} module Overflow where import qualified Data.ByteString.Lazy as S bigData :: S.ByteString bigData = \0\0\0\0\0\5\67\195\0\0\0\0... -- I didn't compress it, because Codec.Compression.GZip didn't compile for me. It looked like a library change since 6.6 broke it. Probably you don't have the zlib.h header? Or make sure you have the latest version of zlib from hackage -- it does work. Is there a handy string escaping function in the libraries somewhere? It only took a minute to write one, and I spent longer than that looking, so maybe it's the wrong question Surely it's in there somewhere, and I'm just 2 dum 2 c. The show function? ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: static constants -- ideas?
Don Stewart [EMAIL PROTECTED]: jay: Don Stewart [EMAIL PROTECTED]: jay: I also have constants that are too large to compile. I am resigned to loading them from data files--other solutions seem even worse. ... Data.Binary eases the irritation somewhat. Did you try bytestring literals (and maybe parsing them in-memory with Data.Binary)? I finally squeezed enough time to try it, and it didn't work for me. -- ghc Overflow.hs [1 of 1] Compiling Overflow ( Overflow.hs, Overflow.o ) Overflow.hs:8:10:stack overflow: use +RTS -Ksize to increase it -- where Overflow.hs is in the vicinity of 40M and looks like -- {-# LANGUAGE OverloadedStrings #-} module Overflow where import qualified Data.ByteString.Lazy as S bigData :: S.ByteString bigData = \0\0\0\0\0\5\67\195\0\0\0\0... -- I didn't compress it, because Codec.Compression.GZip didn't compile for me. It looked like a library change since 6.6 broke it. Is there a handy string escaping function in the libraries somewhere? It only took a minute to write one, and I spent longer than that looking, so maybe it's the wrong question Surely it's in there somewhere, and I'm just 2 dum 2 c. Jay ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: static constants -- ideas?
Don Stewart [EMAIL PROTECTED]: jay: Don Stewart [EMAIL PROTECTED]: jay: I also have constants that are too large to compile. I am resigned to loading them from data files--other solutions seem even worse. ... Data.Binary eases the irritation somewhat. Did you try bytestring literals (and maybe parsing them in-memory with Data.Binary)? That didn't occur to me, since neither of my large constants includes strings I think you're suggesting that each constant could appear in the source as a long bytestring and be deserialized into the data structure. If that works, it should improve the startup time, but it's still not as nice as simply compiling it straight up. I'll try it. Here's an example, which stores a Data.Map in a gzip-compressed bytestring literal (a C string literal in the compiled code). The Map is reconstructed on startup. {-# LANGUAGE OverloadedStrings #-} import Data.Binary import qualified Data.Map as M import qualified Data.ByteString.Char8 as S import Data.ByteString.Lazy import Codec.Compression.GZip -- -- this is a gzip compressed literal bytestring, storing a binary- encoded Data.Map -- mytable = \US\139\b\NUL\NUL\NUL\NUL\NUL\NUL\ETXEN\219\SO\194 \f\197\224 \188\196\CAN\227\US\224\171~\NAKc\GS4ce\161`\178\191\215(\176\190\180\167 \231\210\n\241\171\203\191\ti\157\217\149\249 \ENQ\214\9\202\162\179a \132X\233\ESC=\231\215\164\SYN\157\DC2D\226*\146\174o\t\167\DLE\209\i_ \240\193\129\199W\250nC\CAN\212\CAN\162J\160\141C\178\133\216;[EMAIL PROTECTED] \203\209x\205\140\166\RS\163\237]9f\170\143\ACK\163g\223\STX\184\7\rH \222\FSW\130\7D\197\NUL\164\0U\193\186\t\186o\228\180~\NUL\a6\249\137# \SOH\NUL\NUL main = print = M.lookup ghc m where -- build the table from the bytestring: m :: M.Map String (Maybe String) m = decode . decompress . fromChunks . return $ mytable Running it: $ ./A Just dinosaur! :) Important to use a bytestring, since that gets compiled to a C string literal (and not messed with by the simplifier). -- Don ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: static constants -- ideas?
Don Stewart [EMAIL PROTECTED]: jay: Don Stewart [EMAIL PROTECTED]: jay: Don Stewart [EMAIL PROTECTED]: jay: I also have constants that are too large to compile. I am resigned to loading them from data files--other solutions seem even worse. ... Data.Binary eases the irritation somewhat. Did you try bytestring literals (and maybe parsing them in-memory with Data.Binary)? I finally squeezed enough time to try it, and it didn't work for me. -- ghc Overflow.hs [1 of 1] Compiling Overflow ( Overflow.hs, Overflow.o ) Enable optimisations! Compile with ghc -O2. You need this to avoid having a very slow pack call at runtime. Yes, I tried basic variations like that. The result is the same with -O1 or with -O2, and with Data.ByteString or Data.ByteString.Lazy . I'm just 2 dum 2 c. The show function? Ha ha! Jay ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: static constants -- ideas?
jay: Don Stewart [EMAIL PROTECTED]: jay: Don Stewart [EMAIL PROTECTED]: jay: Don Stewart [EMAIL PROTECTED]: jay: I also have constants that are too large to compile. I am resigned to loading them from data files--other solutions seem even worse. ... Data.Binary eases the irritation somewhat. Did you try bytestring literals (and maybe parsing them in-memory with Data.Binary)? I finally squeezed enough time to try it, and it didn't work for me. -- ghc Overflow.hs [1 of 1] Compiling Overflow ( Overflow.hs, Overflow.o ) Enable optimisations! Compile with ghc -O2. You need this to avoid having a very slow pack call at runtime. Yes, I tried basic variations like that. The result is the same with -O1 or with -O2, and with Data.ByteString or Data.ByteString.Lazy . Ok, hmm, that really shouldn't be the case. Do you have the example available somewhere? It's just a 40M inline bytestring? -- Don ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
RE: STM and fairness
| I'd like to know a bit about the STM implementation in GHC, | specifically about how it tries to achieve fairness. I've been reading | Composable Memory Transactions but it does not contain that much | details on this specific matter. What I want to know boils down to | this: what order are processes run which have been woken up from a | call to retry? Tim is the one who implemented this stuff, so I'm ccing him. If threads queue up on a single MVar, it's obvious how to achieve fairness of a sort. Furthremore, if 100 threads are blocked on one MVar, the scheduler can wake up exactly one when the MVar is filled. With STM it's much less obvious. First, a thread may block on a whole bunch of TVars; if any of them are changed, the thread should re-run. So there is no single list of threads to reverse or not reverse. Second, if 100 threads are blocked on a TVar, t, waking up just one of them may not suffice -- it may read some more TVars and then retry again, re-blocking itself on t (plus some more). The only simple thing to do is to wake all of them up. In common situations (e.g. a buffer), we may wake up all 100 threads, only for 99 of them to lose the race and block again. This arises from the fact that transactions do a wonderful thing, by letting you perform multiple operations atomically -- but that makes it harder to optimize. All that said, you may well be right that one could do a better job of scheduling. For example, even though there may be lots of threads blocked on a TVar, and all must be made runnable, they could perhaps be run in the same order that they blocked, so the longest-blocked got to run first. I don't think we try to do that, but Tim would know. By all means suggest a patch! Simon ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: static constants -- ideas?
Bryan O'Sullivan [EMAIL PROTECTED] wrote: The trick I usually use in cases like this is to compile the data as C code and link against it, then access it from Haskell via a Ptr. For my particular application, I really need to ship a single static binary that has it all -- data as well as algorithms -- so I'm going with the FFI. It's too bad that I end up working in the IO monad much of the time. I hope we'll have massive static constants someday soon! -- _jsn ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users