RE: Bound Threads
| I think the essence of Daan's proposal was that the goals we hope to
| achieve using 'bound', 'threadsafe', and other ffi annotations could
| be achieved by adding a small amount of additional functionality
| and wrappers and that benefits of doing this are:

That all sounds splendid. If only I understood what the proposal actually was. (Precisely; I do have a vague understanding of it, but it's a slippery topic.)

Simon

___
FFI mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/ffi
Re: Bound Threads
Daan writes:
> What I mostly wanted to ensure is that people have really thought
> about this carefully and that they could give strong reasons for
> choosing a particular design over another.

SimonPJ replies:
> The difficulty is that I can't give strong reasons for choosing X
> over Y when I don't understand what Y is. Next week is ok though.

I think the essence of Daan's proposal was that the goals we hope to achieve using 'bound', 'threadsafe', and other ffi annotations could be achieved by adding a small amount of additional functionality and wrappers and that benefits of doing this are:

- greater transparency to programmers
- greater flexibility because those primitives can be used in many different ways
- more future-proof because of greater flexibility
- simpler design

Some inspiration for this approach might be found in the move from GreenCard to the standard FFI. GreenCard gives you some important functionality that covers many cases you might want but, since it doesn't cover _everything_ you want to do, we've seen a gradual addition of features over the years as the shoe is made to pinch a little less.

I think the details of which operations to add were largely intended to show that this kind of approach was plausible rather than being intended as a complete proposal.

[Personally, I'm not sure that Daan is right - but I think it's worth exploring a little.]

--
Alastair Reid [EMAIL PROTECTED]
Reid Consulting (UK) Limited
http://www.reid-consulting-uk.ltd.uk/alastair/
RE: Bound Threads
| However, my proposal is not anywhere fundamentally difficult -- in its essence, I just
| propose to move the implementation of the thread allocation strategy from the RTS/C
| code, to a Haskell library. This gives programmers both a low-level interface for
| explicit access and a high-level interface as it is now.

It may well not be difficult, but nevertheless I am having difficulty understanding it. I'll have to wait till you have time to explain it.

| What I mostly wanted to ensure is that people have really thought about this carefully
| and that they could give strong reasons for choosing a particular design over another.

The difficulty is that I can't give strong reasons for choosing X over Y when I don't understand what Y is. Next week is ok though.

Good luck with thesis writing!

Simon
Re: Bound Threads
Hi all,

> I think everyone is keen to make progress on this bound-threads stuff.
> You have an alternative idea which we are trying to understand. Do you
> plan to have a go at the operational semantics, as a way of explaining it?

Sorry for not having replied. I am very busy finishing my thesis and I can't look into it sooner than next week. The thesis-finishing business is in any case taking so much time that I can't really help out on implementation or other time consuming activities.

However, my proposal is not anywhere fundamentally difficult -- in its essence, I just propose to move the implementation of the thread allocation strategy from the RTS/C code, to a Haskell library. This gives programmers both a low-level interface for explicit access and a high-level interface as it is now.

> At the moment we're a bit stuck: no one wants to move on before we have
> some kind of consensus, but you're the only one who can help us
> understand your proposal.

Well, it is not my intention to stop progress! I haven't fully worked out my design, for example, it seems that dynamic rescheduling of haskell threads to OS threads is rather difficult -- I can only say more about this next week.

What I mostly wanted to ensure is that people have really thought about this carefully and that they could give strong reasons for choosing a particular design over another. If you feel that this is the case -- by all means continue as you have done and disregard my disturbances.

All the best,
Daan.

> Simon

| -Original Message-
| From: Simon Peyton-Jones [mailto:[EMAIL PROTECTED]
| Sent: 17 March 2003 22:06
| To: Daan Leijen; Wolfgang Thaller; [EMAIL PROTECTED]
| Subject: RE: Bound Threads
|
| | Maybe, the forkOS/forkIO approach is flawed, but I think we
| | should only rule it out when we can provide a convincing
| | example where only the keyword approach would work, and where
| | we can't use combinators to achieve the same effect.
|
| Daan,
|
| There has been extended discussion on this stuff, which Wolfgang and
| Simon and I tried to boil out into a document. It's hard to say exactly
| what 'safe' or 'bound' exports, or whatever, might mean, so we give a
| little operational semantics.
|
| My hope is that the very same operational-semantic framework would serve
| to describe your system. Would you like to write its transition rules,
| in the same style? Then we could compare the two more easily. Without
| that, I am hard pressed to understand the implications of what you
| suggest, just as I was hard pressed to understand Wolfgang's proposal
| till we had it specified.
|
| You can find the document in the CVS repository in
| haskell-report/ffi/threads.tex
|
| Simon
RE: Bound Threads
Daan

I think everyone is keen to make progress on this bound-threads stuff. You have an alternative idea which we are trying to understand. Do you plan to have a go at the operational semantics, as a way of explaining it?

At the moment we're a bit stuck: no one wants to move on before we have some kind of consensus, but you're the only one who can help us understand your proposal.

Simon

| -Original Message-
| From: Simon Peyton-Jones [mailto:[EMAIL PROTECTED]
| Sent: 17 March 2003 22:06
| To: Daan Leijen; Wolfgang Thaller; [EMAIL PROTECTED]
| Subject: RE: Bound Threads
|
| | Maybe, the forkOS/forkIO approach is flawed, but I think we
| | should only rule it out when we can provide a convincing
| | example where only the keyword approach would work, and where
| | we can't use combinators to achieve the same effect.
|
| Daan,
|
| There has been extended discussion on this stuff, which Wolfgang and
| Simon and I tried to boil out into a document. It's hard to say exactly
| what 'safe' or 'bound' exports, or whatever, might mean, so we give a
| little operational semantics.
|
| My hope is that the very same operational-semantic framework would serve
| to describe your system. Would you like to write its transition rules,
| in the same style? Then we could compare the two more easily. Without
| that, I am hard pressed to understand the implications of what you
| suggest, just as I was hard pressed to understand Wolfgang's proposal
| till we had it specified.
|
| You can find the document in the CVS repository in
| haskell-report/ffi/threads.tex
|
| Simon
RE: Bound Threads
| Now, what I don't like about my proposal and your proposal is
| that the user has to be aware of OS threads when making
| foreign calls by wrapping it in "threadSafe" or adding
| "threadsafe" sometimes -- but maybe that is unavoidable.

Actually, the proposal currently on the table, which no one has objected to, is

* abolish the distinction between "safe" and "threadsafe"
* make "safe" the default (it already is)

So the user needs to do something special only if she wants to do something unsafe (by adding "unsafe" to the foreign import).

Simon
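To make the two remaining annotations concrete, here is a minimal sketch of a `safe` and an `unsafe` foreign import as they can be written with the standard FFI. The choice of `sin` and `sleep` is only for illustration, and `sleep` assumes a POSIX system. How other Haskell threads behave while a `safe` call is in progress depends on the implementation and runtime, which is the question this thread keeps returning to.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CDouble (..), CUInt (..))

-- "unsafe": the programmer promises that the call is short and never
-- calls back into Haskell, so the runtime may skip its bookkeeping.
foreign import ccall unsafe "math.h sin"
  c_sin :: CDouble -> CDouble

-- "safe" (also what you get when no annotation is written): the call is
-- allowed to block or to re-enter Haskell.
-- Assumption: a POSIX system that provides sleep() in unistd.h.
foreign import ccall safe "unistd.h sleep"
  c_sleep :: CUInt -> IO CUInt

main :: IO ()
main = do
  print (c_sin 0)   -- cheap, pure C call
  _ <- c_sleep 0    -- potentially blocking C call
  putStrLn "done"
```

Under this scheme the programmer only writes an annotation when opting out of the default, as Simon describes.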
RE: Bound Threads
| Maybe, the forkOS/forkIO approach is flawed, but I think we
| should only rule it out when we can provide a convincing
| example where only the keyword approach would work, and where
| we can't use combinators to achieve the same effect.

Daan,

There has been extended discussion on this stuff, which Wolfgang and Simon and I tried to boil out into a document. It's hard to say exactly what 'safe' or 'bound' exports, or whatever, might mean, so we give a little operational semantics.

My hope is that the very same operational-semantic framework would serve to describe your system. Would you like to write its transition rules, in the same style? Then we could compare the two more easily. Without that, I am hard pressed to understand the implications of what you suggest, just as I was hard pressed to understand Wolfgang's proposal till we had it specified.

You can find the document in the CVS repository in haskell-report/ffi/threads.tex

Simon
Re: Bound Threads
Hi Daan & everyone else,

> Now, I can be easily convinced that "threadsafe" is the way to go,
> whenever there is a compelling example where forkOS/forkIO fails.

I'm not arguing that there is a situation where forkOS/forkIO fail to provide sufficient functionality. In fact, I'm afraid they may provide too much functionality, leading to an overly complex solution. Also, I'm afraid that "safe" foreign imports, with their strange blocking behaviour, will be a permanent source of bugs.

As I've already said, I consider lightweight threads and the ability to run multiple lightweight threads in one OS thread to be an implementation detail of many current Haskell implementations. Implementation details of the runtime system should not be exposed to the user (although they may be available as primitives that may change from compiler version to compiler version). Providing forkOS/forkIO would _require_ future implementations to provide this functionality, even if there is no longer any technical reason for it.

> Now, what I don't like about my proposal and your proposal is that the
> user has to be aware of OS threads when making foreign calls by wrapping
> it in "threadSafe" or adding "threadsafe" sometimes -- but maybe that is
> unavoidable.

With threadsafe and bound, we could just treat "threadsafe" and "bound" as defaults. Then, the user does not have to be aware of the OS thread/lightweight issue, unless (s)he wants to optimize for performance. Only then would the user need to add "unsafe" or "unbound" (or remove "bound") in some places. (GHC 5.05 already has "threadsafe" as default; as "safe" doesn't seem to provide any advantages, it is currently not implemented in the threaded RTS).

> On the other hand, it seems awkward/impossible to make sure that certain
> functions are always called from a specific OS thread with the
> "threadsafe" approach while the forkOS/forkIO primitives are always
> available to the programmer for explicit management of threads.

How would you make sure that a certain function is always called from a specific OS thread (other than the current one) in C? I don't see why the runtime system should provide special support for this, it looks like trying to beat C at interacting with C libraries...

> Thanks for the explanation. Just on the side, it seems that this
> approach always involves an OS context switch for bound callbacks. This
> doesn't happen with the forkOS/forkIO approach. It may be rather
> expensive, take a mouse motion event handler for example.

(I assume you meant to say "unbound callbacks". Bound callbacks don't require a thread-switch). Unbound callbacks require a thread-switch in the approach I just described*. However (after having thought about it for a few minutes), I'm convinced that this extra thread-switch can easily be optimized away in the RTS. The resulting implementation would still be simpler than what is required to support the forkIO/forkOS primitives.

( * ... plus two extra OS thread switches that have nothing to do with this issue. They'd be necessary with either approach until someone makes certain improvements to the storage manager)

>> Maybe we need a more exact specification of how forkOS/forkIO should
>> behave, especially with respect to foreign calls blocking other
>> threads. Could you elaborate on how you would expect "normal" (safe)
>> foreign calls to behave in different situations?
>
> 1) "forkOS" starts a new haskell thread in a new OS thread
> 2) "forkIO" starts a new haskell thread in the current OS thread
> 3) maybe we should also add "forkUnbound" that forks a haskell thread
>    that can automatically be moved between OS threads.
> 4) a foreign call blocks all haskell threads that are attached to the
>    current OS thread until a) the call returns, or b) the call enters
>    Haskell again.

Does that make sense for implementations that don't use the "current" OS thread for executing the Haskell code (just for foreign calls)? Such implementations would have to do extra locking that's not really necessary...

> 5) on top of this, we can implement
>    *) a library function "fork" that forks a new haskell thread that
>       maybe runs in separate OS thread, depending on the architecture.

Wouldn't that mean it would be impossible to predict which other haskell threads, if any, would be blocked by a particular call to a foreign import?

Cheers,
Wolfgang
Re: Bound Threads
For what it is worth I favour Daan's approach, adding keywords to a language seems to want for a sufficiently rich formalism/framework.

> 3) maybe we should also add "forkUnbound" that forks a haskell thread
> that can automatically be moved between OS threads.

In the spirit of searching for the most basic primitives - why not instead define a primitive for moving haskell threads between os threads, as this primitive seems to be required in any case in order to effect the "automatic" movement between threads. Such a primitive would allow alternative thread management strategies to be defined in haskell. forkUnbound could then be implemented by registering the haskell thread with the thread manager etc...

such a strategy would require explicitly identifying os threads - making them almost first class, so to speak - is this bad ?

whilst ...

> We never want to specify what OS thread is running a particular Haskell
> thread.

in order to define strategies where the above is true, I believe we do need to be explicit. These strategies can either be implemented by the rts, or in a library. I think that the latter is better in the sense of more open.

explicit movement of Haskell threads between OS threads could potentially be extended to thread migration between processes?

> * "forkOS" doesn't have to use a new OS thread to run Haskell threads,
> just when calling a foreign function, so it would work on Hugs too for
> example. (as explained in a previous mail).

why complicate matters, if forkOs/forkNative is supposed to fork a native thread then why introduce exceptions, as is not forking of a native thread part of the meaning of the operation?
Re: Bound Threads
Hi Wolfgang,

>> Maybe, the forkOS/forkIO approach is flawed, but I think we should only
>> rule it out when we can provide a convincing example where only the
>> keyword approach would work, and where we can't use combinators to
>> achieve the same effect.
>
> That's unfair ;-) --- I could also claim the reverse and say that we
> stick with threadsafe/bound until we have a convincing example... Using
> combinators sounds good, but there are things where combinators are not
> automatically the best choice (we're usually not using combinators to
> implement lazyness, either), and we don't know yet whether this is the
> case here or not. So let's just get on with the discussion

No, it is not unfair. We should only introduce a new keyword/syntax if we are unable to express the behaviour in plain Haskell or if it is very awkward to do so.

Now, I can be easily convinced that "threadsafe" is the way to go, whenever there is a compelling example where forkOS/forkIO fails. At first, I thought that one could never come up with a counter example: since forkOS/forkIO are the most primitive calls, we can always model any strategy implemented in C (and more). However, I just realised that there is more than meets the eye: you just described that "GHC moves all non-bound Haskell threads to a second OS thread". This is something that can not be described without adding another primitive (forkUnbound?) and it may lead to an example that is not expressible with forkOS/forkIO.

On the other hand, it seems awkward/impossible to make sure that certain functions are always called from a specific OS thread with the "threadsafe" approach while the forkOS/forkIO primitives are always available to the programmer for explicit management of threads.

In general though, I have learned over the years that it is most of the time better to first find an implementation using low-level primitives in Haskell, and maybe later add special syntax, than to implement a strategy directly in C.

>> About the example: [snip] That is amazing :-) [...]
>
> [snip] makes sure that there is a second OS thread available (some
> overhead the first time) and b) makes sure that all non-bound [in the
> current implementation: all] Haskell threads are executed by the second
> OS thread from now on. [snip]

Thanks for the explanation. Just on the side, it seems that this approach always involves an OS context switch for bound callbacks. This doesn't happen with the forkOS/forkIO approach. It may be rather expensive, take a mouse motion event handler for example.

> Maybe we need a more exact specification of how forkOS/forkIO should
> behave, especially with respect to foreign calls blocking other threads.
> Could you elaborate on how you would expect "normal" (safe)
> foreign calls to behave in different situations?

1) "forkOS" starts a new haskell thread in a new OS thread
2) "forkIO" starts a new haskell thread in the current OS thread
3) maybe we should also add "forkUnbound" that forks a haskell thread that can automatically be moved between OS threads.
4) a foreign call blocks all haskell threads that are attached to the current OS thread until a) the call returns, or b) the call enters Haskell again.
5) on top of this, we can implement
   *) a library function "fork" that forks a new haskell thread that maybe runs in separate OS thread, depending on the architecture.
   *) a "threadSafe" combinator to make non-blocking foreign calls.

notes:
* "forkOS" doesn't have to use a new OS thread to run Haskell threads, just when calling a foreign function, so it would work on Hugs too for example. (as explained in a previous mail).
* The naming is a bit inconveniently chosen. Better would be:
    forkOS => forkNativeThread
    forkIO => forkHaskellThread
  Now, "forkIO" can be implemented in terms of those two functions, and can be used when the user doesn't care about how the thread is scheduled.

Now, what I don't like about my proposal and your proposal is that the user has to be aware of OS threads when making foreign calls by wrapping it in "threadSafe" or adding "threadsafe" sometimes -- but maybe that is unavoidable.

All the best,
Daan.
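For readers who want to poke at the forkOS/forkIO distinction, the following is a small sketch using the functions that GHC's Control.Concurrent exports today (`forkIO`, `forkOS`, `isCurrentThreadBound`, `rtsSupportsBoundThreads`). It is not an implementation of the numbered rules above: in particular, GHC's `forkIO` does not pin the new thread to the current OS thread as rule 2 would require. The helper name `boundIn` is made up for this example.

```haskell
module Main where

import Control.Concurrent
  ( forkIO, forkOS, isCurrentThreadBound, rtsSupportsBoundThreads )
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical helper: start a thread with the given fork function and
-- report whether that thread is bound to a particular OS thread.
boundIn :: (IO () -> IO a) -> IO Bool
boundIn fork = do
  result <- newEmptyMVar
  _ <- fork (isCurrentThreadBound >>= putMVar result)
  takeMVar result

main :: IO ()
main = do
  viaForkIO <- boundIn forkIO
  putStrLn ("forkIO thread bound: " ++ show viaForkIO)
  -- forkOS raises an error on a runtime without OS-thread support,
  -- so check first (with GHC this means linking with -threaded).
  if rtsSupportsBoundThreads
    then do
      viaForkOS <- boundIn forkOS
      putStrLn ("forkOS thread bound: " ++ show viaForkOS)
    else putStrLn "forkOS is unavailable on this runtime"
```

Whether a thread is bound only matters for the foreign calls it makes, which is the distinction the later messages in this thread insist on.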
Re: Bound Threads
(I keep forgetting to correctly fill out the To: and Cc: fields before sending my reply... so here's the copy that should have been sent to the list 10 minutes ago...)

Daan Leijen wrote:

> Hi Wolfgang, I feel like you are beating my proposal to death here, and
> I find it hard to react individually to all your remarks.

Sorry, maybe I shouldn't shoot all my arguments at you at once :-)

> I'll try to focus on the main issue: You are worried that the forkOS and
> forkIO distinction is too primitive and that it rules out sophisticated
> scheduling on SMP processors for example.

That is closely related to the point that I don't want implementation details (that several Haskell threads can run in one OS thread) to be part of a defined interface that has to be supported by all implementations.

My main point is probably my strong phobia of "safe" calls: they are not safe to use. FFI newbies are repeatedly getting bitten by the fact that "safe" calls block everything else. There is no "natural" reason for a call to block other threads, it's a consequence of an implementation detail (lightweight threads). The intention behind threadsafe is to make this implementation detail invisible, and therefore harmless. The forkOS/forkIO proposal seems to rely on having foreign calls that block other threads (that run in the same OS thread)... or doesn't it?

> Maybe, the forkOS/forkIO approach is flawed, but I think we should only
> rule it out when we can provide a convincing example where only the
> keyword approach would work, and where we can't use combinators to
> achieve the same effect.

That's unfair ;-) --- I could also claim the reverse and say that we stick with threadsafe/bound until we have a convincing example... Using combinators sounds good, but there are things where combinators are not automatically the best choice (we're usually not using combinators to implement lazyness, either), and we don't know yet whether this is the case here or not. So let's just get on with the discussion

> About the example: [snip] That is amazing :-) [...]

Well, of course, it's "none of your business" to know. Different Haskell runtimes could use completely different schemes, and you wouldn't even notice.

In GHC, the second OS thread is spawned earlier, the first time a threadsafe call is made. When the threadsafe call is made, GHC a) makes sure that there is a second OS thread available (some overhead the first time) and b) makes sure that all non-bound [in the current implementation: all] Haskell threads are executed by the second OS thread from now on.

Now if a bound callback is invoked [not yet implemented], the bound callback is executed in the thread that the wrapper was called in (the first OS thread). All other Haskell threads (if any) continue to run in the second OS thread. If forkIO is called, the new Haskell thread is just added to the list of threads to be run by the second OS thread, no new OS thread has to be spawned, and forkIO doesn't have to know about threadsafe or about bound.

If a non-bound callback is invoked [available now in the CVS HEAD], the callback is executed along with the other background Haskell threads in the second OS thread.

> Since the RTS can't guess whether to use new OS threads or not at
> forkIO, I assumed that you could mark "wrapper" functions (callbacks)
> with a "threadsafe" attribute. If this is not the case, I don't
> understand how the implementation could work in this particular example.

It doesn't need to guess, "threadsafe" is just for imports, and it works anyway. See above, and if I haven't explained it clearly, try http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/rts-libs/multi-thread.html (which I really should update, it's outdated), or ask again.

Maybe we need a more exact specification of how forkOS/forkIO should behave, especially with respect to foreign calls blocking other threads. Could you elaborate on how you would expect "normal" (safe) foreign calls to behave in different situations?

Cheers,
Wolfgang
Re: Bound Threads
Hi Wolfgang,

I feel like you are beating my proposal to death here, and I find it hard to react individually to all your remarks.

I'll try to focus on the main issue: You are worried that the forkOS and forkIO distinction is too primitive and that it rules out sophisticated scheduling on SMP processors for example.

I think that you are right if the programmer uses forkOS and forkIO directly all the time but in general, we can provide abstractions *in haskell* to deal with scheduling. See my "fork" example in the previous mail. The bottom line is that if we have access from Haskell to the primitives, we can always implement any smart scheme that we need or want. If you implement a strategy via keywords, it is fixed.

I don't promote using low-level programming but I think that "bound" and "threadsafe" should be implemented using combinators in Haskell itself, with some low-level functions like forkIO.

Maybe, the forkOS/forkIO approach is flawed, but I think we should only rule it out when we can provide a convincing example where only the keyword approach would work, and where we can't use combinators to achieve the same effect.

All the best,
Daan.

About the example:

>> Now, I have an example from the wxHaskell GUI library that exposes some
>> of the problems with multiple threads. I can't say it can be solved
>> nicely with forkOS, so I wonder how it would work out with "threadsafe":
>
> [snip] In case that wxWindows makes the assumption that its functions
> are invoked _from the same OS thread_ that it used to call your
> callback, you can add "bound" to the foreign export statement for your
> callback. Then you would use just forkIO and everything would work.

That is amazing :-) Suppose that there is one OS thread running here -- the GUI thread. When the callback is called, I use "bound" so that the callback still runs in the GUI thread. When I use "forkIO", the forked computation could also run in the same GUI thread (I guess it is unspecified, but suppose it does). When the callback returns to C land, the eventloop will wait and since there is still just one OS thread running, the forked computation will block. Urk, somehow the RTS has to be smart enough to spawn a new OS thread when forkIO is called. Right?

>> Since the haskell function is called via a callback, I guess that
>> "threadsafe" should also apply to "wrapper" functions -- that is, when
>> the foreign world calls haskell, we use another OS thread to run the
>> haskell code.
>
> Sorry, no clue what you mean... could you elaborate? "threadsafe" just
> applies to imports, not to exports and wrappers. "bound" applies to
> exports and wrappers, but not to imports. Should it be different?

Since the RTS can't guess whether to use new OS threads or not at forkIO, I assumed that you could mark "wrapper" functions (callbacks) with a "threadsafe" attribute. If this is not the case, I don't understand how the implementation could work in this particular example.
Re: Bound Threads
> In general, I think that only the programmer knows what strategy to use.

Do programmers know? I know about my own program, but do I know about the library that I am going to use? Does it use forkOS or forkIO? What will be the consequences if it uses forkIO and I do a lengthy foreign call? Does the library writer know about my program? I fear I'll end up wrapping every call to *Haskell* libraries in your "threadSafe" combinator - just to be sure that the library and my program don't interfere. I'm very afraid of having some long debugging sessions once we have this "feature".

> Now, I have an example from the wxHaskell GUI library that exposes some
> of the problems with multiple threads. I can't say it can be solved
> nicely with forkOS, so I wonder how it would work out with "threadsafe":

In both cases, threadsafe and forkOS, we would have two OS threads, and we would have OS thread context switches between them. So it will get us nowhere to count the OS thread context switches involved (incidentally, handling GUI events is an area where we can easily afford the cost of OS thread context switches).

You would mark the call to the wxWindows event loop as threadsafe (actually, in the CVS version of GHC, "safe" is currently a shorthand for "threadsafe" as nobody has yet provided a logically sound and meaningful definition of the semantics that "safe" should have instead).

In case that wxWindows makes the assumption that its functions are invoked _from the same OS thread_ that it used to call your callback, you can add "bound" to the foreign export statement for your callback. Then you would use just forkIO and everything would work. (If wxWindows doesn't allow access from multiple threads, then of course you can't call wx functions from the thread you just forked. But that's the same no matter which proposal we follow.)

> The problem is that the "processor" thread won't run since we have
> returned to C-land and the haskell scheduler can't run.

Threadsafe means that you don't need to care about things like that.

> [...] but it is how it is done in all other major programming languages.

Only other languages don't require you to manually keep track of a correspondence between lightweight and heavyweight threads.

> Since the haskell function is called via a callback, I guess that
> "threadsafe" should also apply to "wrapper" functions -- that is, when
> the foreign world calls haskell, we use another OS thread to run the
> haskell code.

Sorry, no clue what you mean... could you elaborate? "threadsafe" just applies to imports, not to exports and wrappers. "bound" applies to exports and wrappers, but not to imports. Should it be different?

Cheers,
Wolfgang
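Since the discussion turns on the difference between imports on one side and exports and wrappers on the other, here is a small sketch of a "wrapper" import (Haskell function to C function pointer, i.e. a callback) and a "dynamic" import (calling through a C function pointer), both part of the standard FFI. The "dynamic" import stands in for the foreign library that would normally invoke the callback. The proposed "bound" annotation is deliberately not written, because it is proposal syntax from this thread, and the names `mkCallback` and `callCallback` are made up for the example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CInt (..))
import Foreign.Ptr (FunPtr, freeHaskellFunPtr)

-- "wrapper": wrap a Haskell function as a C function pointer that
-- foreign code can call back into.
foreign import ccall "wrapper"
  mkCallback :: (CInt -> IO CInt) -> IO (FunPtr (CInt -> IO CInt))

-- "dynamic": call through a C function pointer. It plays the role of the
-- C library (for instance a GUI event loop) invoking the callback.
foreign import ccall "dynamic"
  callCallback :: FunPtr (CInt -> IO CInt) -> CInt -> IO CInt

main :: IO ()
main = do
  cb <- mkCallback (\n -> return (n + 1))
  r  <- callCallback cb 41     -- Haskell -> C -> Haskell round trip
  print r
  freeHaskellFunPtr cb         -- wrappers must be freed explicitly
```

The question debated above is which OS thread runs the Haskell code behind `cb` when C calls it, and which OS thread performs any foreign calls that code makes in turn.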
Re: Bound Threads
> I have just spent some time reading through all the discussions and the
> new "threads" document and I would like to propose the addition of a new
> library function.
>
>   forkOS :: IO () -> IO ThreadID

Something like that is already in the proposal, only it's currently called forkBoundThread and it doesn't return the ThreadID (that can be changed, though).

> With this, I also propose that "forkIO" always runs a Haskell thread in
> the same OS thread that the current Haskell thread runs in. (i.e.
> "forkIO": same OS thread, "forkOS": new OS thread)

In the proposal we wrote: "The specification shouldn’t explicitly require lightweight “green” threads to exist. The specification should be implementable in a simple and obvious way in haskell systems that always use a 1:1 correspondence between Haskell threads and OS threads."

The idea was that lightweight ("green") threads are an optimization only (do they have any other advantage?), not a language feature, and that implementations of Haskell should not be forced to support a complex thread management system. Your proposal obviously contradicts this.

What is the advantage of explicitly requiring one OS thread to execute (the foreign calls made by) several Haskell threads? So far, I was only able to think of two possible situations:

a) The foreign functions don't care what thread they are called from
   In that case, I would like the implementation to run my Haskell threads in the most efficient way possible. Currently, that means scheduling them all in one OS thread, but that is an implementation detail that I don't want to care about when I'm writing a normal application. On a four-processor-SMP machine, the most efficient way is to run them simultaneously in four OS threads (no implementation currently supports this, but there's experimental code in the GHC repository).

b) The foreign functions do care what thread they are called from
   In that case I want the implementation to have an exact correspondence between Haskell threads and OS thread. I just want to think about "one thread", and I don't want to manage some correspondence between Haskell threads and OS threads manually.

> Using the new primitive, we can view the new "threadsafe" keyword as
> syntactic sugar:
>
>   foreign import threadsafe foo :: Int -> IO Int
>
>   ===>
>
>   foo :: Int -> IO Int
>   foo i = threadSafe (primFoo i)
>
>   foreign import "foo" primFoo :: Int -> IO Int
>
> where
>
>   threadSafe :: IO a -> IO a
>   threadSafe io
>     = do result <- newEmptyMVar
>          forkOS (do{ x <- io; putMVar result x })
>          getMVar result

That looks dangerous: I want to call both threadsafe imports and unsafe imports from a "bound thread", and I expect all foreign calls from a bound thread to be executed from the same OS thread (by the definition of a "bound thread"). This implementation of "threadsafe" always uses another (new or pooled) OS thread for the threadsafe call.

>   getOSThread :: ThreadID -> OSThreadID
>   forkIOIn :: OSThreadID -> IO () -> IO ThreadID

Why should the RTS do inter-OS-thread messaging for us?

> I have the feeling that it is not difficult to implement "forkOS" and
> family once the runtime system has been upgraded to support multiple OS
> threads. Wolfgang, you seem to be the expert on the OS thread area,
> would it be hard?

It would definitely be more difficult to implement in GHC than the current proposal, but it could be done. In fact I think that implementing it would be more fun for me than having to use it afterwards.

> I am not saying that we should discard the "threadsafe" keyword as it
> might be a useful shorthand, but I think that it is in general a mistake
> to try to keep the management of OS threads implicit -- don't use new
> keywords, add combinators to implement them!

Management of OS threads _should_ be kept implicit. Ideally, the user should never notice that the GHC runtime is using green threads internally.
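The `threadSafe` combinator quoted earlier in this message can be tried out with today's Control.Concurrent. The sketch below is a hedged adaptation rather than Daan's exact code: it uses `takeMVar` and the real `forkOS`, falls back to `forkIO` when the runtime has no OS-thread support, and forwards exceptions to the caller (an addition, since otherwise a failing action would leave the caller waiting on the MVar).

```haskell
module Main where

import Control.Concurrent (forkIO, forkOS, rtsSupportsBoundThreads)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, throwIO, try)

-- Run an action in a freshly forked thread and wait for its result.
threadSafe :: IO a -> IO a
threadSafe io = do
  result <- newEmptyMVar
  -- forkOS fails on a runtime without OS-thread support, so fall back.
  let fork = if rtsSupportsBoundThreads then forkOS else forkIO
  _ <- fork (try io >>= putMVar result)
  outcome <- takeMVar result
  case outcome of
    Left e  -> throwIO (e :: SomeException)  -- re-raise in the caller
    Right x -> return x

main :: IO ()
main = threadSafe (return (6 * 7 :: Int)) >>= print
```

The sketch also makes Wolfgang's objection visible: the action, and therefore any foreign call inside it, runs on a different OS thread from the caller, which is exactly what a bound thread must not do.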
> I feel that the following has happened; urk, we need some way of keeping haskell threads running while calling C; we add "threadsafe"; whoops, sometimes a function expects that it is run in the same OS thread; we add "bound"; whoops, sometimes functions expect to be run from a specific OS thread... unsolved??

Not unsolved. Use Control.Concurrent.Chan :-)

> Before we know it, we have added tons of new keywords to solve the wrong problem.

The problem is that some Haskell implementations try to optimize concurrency by doing the scheduling themselves. We have to provide hints (threadsafe and bound) to the implementation to specify just how much it is allowed to optimize. We should never be required to explicitly do the "optimization" in the source code. It will break with SMP implementations (which I expect to be using in a few years), because different optimizations are required: suddenly it will be desirable to have multiple OS threads for performance reasons.

> Maybe it is time to take a step back and use a somewhat lower level model with two fork variants: "forkIO" (in the same OS thread) and "forkOS
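The remark about Control.Concurrent.Chan can be made concrete with a small sketch. It assumes GHC's Control.Concurrent as it exists today (which exports forkOS and rtsSupportsBoundThreads); the names Worker, startWorker and runOn are hypothetical and appear in neither message. One dedicated thread reads IO actions from a channel and runs them, so every foreign call submitted through runOn is made by that one thread. When the program is not linked with the threaded runtime the sketch falls back to forkIO.

```haskell
import Control.Concurrent (forkIO, forkOS, rtsSupportsBoundThreads)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forever, join)

-- | Hypothetical handle for a thread that executes submitted actions.
newtype Worker = Worker (Chan (IO ()))

-- | Start the dedicated thread. With the threaded RTS it is a bound
-- thread (forkOS); otherwise the sketch falls back to forkIO.
startWorker :: IO Worker
startWorker = do
  jobs <- newChan
  let fork = if rtsSupportsBoundThreads then forkOS else forkIO
  _ <- fork (forever (join (readChan jobs)))
  return (Worker jobs)

-- | Run an action on the worker thread and wait for its result.
-- Simplification: an exception in the action kills the worker.
runOn :: Worker -> IO a -> IO a
runOn (Worker jobs) io = do
  reply <- newEmptyMVar
  writeChan jobs (io >>= putMVar reply)
  takeMVar reply

main :: IO ()
main = do
  w <- startWorker
  -- 'return (i * i)' stands in for a foreign call that cares about its thread.
  xs <- mapM (\i -> runOn w (return (i * i))) [1, 2, 3 :: Int]
  print xs
```

The messaging between threads lives in ordinary library code here, which is the point of the remark: the runtime does not have to provide a forkIOIn primitive for it.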
Re: Bound Threads
Hi Simon,

> I'd like to point out a concept that I think is being missed here: We never want to specify what OS thread is running a particular Haskell thread. Why not? Because (a) it doesn't matter: the programmer can never tell, and (b) we want to give the implementation freedom to spread Haskell threads across multiple OS threads to make use of multiple real CPUs.

I agree that these are valid points. However, as I said, I don't think we can do (b), i.e. automatic management, in many real-world situations. The above points are mostly useful in a pure Haskell setting. In general, I think that only the programmer knows what strategy to use. In particular, we can provide a "fork" function that forks off a new Haskell thread that maybe runs in a new OS thread, or on another CPU; basically implementing the above concept for programs that don't care about how the Haskell threads are distributed over OS threads:

    fork :: IO () -> IO ThreadID
    fork io
      = do newOS <- [complex algorithm that determines if a new OS thread is needed.]
           if (newOS)
            then forkOS io
            else do threadID <- [complex algorithm that determines in which existing thread we run it]
                    forkIOIn threadID io

Note that we can now implement our really sophisticated distributed algorithms in plain Haskell.

> The point is that you want to specify which OS thread is used to invoke a foreign function, NOT which OS thread is used to execute Haskell code. The semantics that Simon & I wrote make this clear.

This is a good point and that is also the weakness of the "forkOS", "forkIO" approach: it is less declarative and thus leaves less freedom to the implementation. However, I hope that through functions like "fork", we can bring back declarativeness by abstraction.
> If we keep thinking like this, then implementations like Hugs can be single-threaded internally but switch OS threads to call out to foreign functions, and implementations like GHC can be multi-threaded internally and avoid switching threads when calling out to foreign functions.

Ha, this is not true :-) We are saved by your observation that in the Haskell world we can't observe whether we run in a different OS thread or not. Thus a single-threaded Hugs will implement forkOS as forkIO but still attach a different "Hugs OS thread identifier" to the Haskell thread. When a foreign call is made, it matches the Hugs OS thread identifiers and uses a different OS thread if necessary, maintaining a mapping between the Hugs OS thread identifiers and the spawned OS threads.

> > threadSafe :: IO a -> IO a
> > threadSafe io
> >   = do result <- newEmptyMVar
> >        forkOS (do{ x <- io; putMVar result x })
> >        getMVar result
>
> This forces a thread switch when calling a threadsafe foreign function, which is something I think we want to avoid.

We can refine the implementation to avoid a thread switch when it is the only Haskell thread running in the current OS thread:

    threadSafeEx :: IO a -> IO a
    threadSafeEx io
      = do count <- getHaskellThreadCountInTheCurrentOSThread
           if (count > 1) then threadSafe io else io

> I'm basing this on two assumptions: (a) switching OS threads is expensive and (b) threadsafe foreign calls are common. I could potentially be wrong on either of these, and I'm prepared to be persuaded. But if both (a) and (b) turn out to be true, then worse is worse in this case.

I think you are right on (a), but I also think that we can avoid it just as it can sometimes be avoided when implemented in C in the runtime. Can't say anything about (b).

All the best, Daan.

Now, I have an example from the wxHaskell GUI library that exposes some of the problems with multiple threads.
I can't say it can be solved nicely with forkOS, so I wonder how it would work out with "threadsafe":

The example is a Haskell initialization function that is called via a callback from the GUI library. The Haskell initialization function wants to do a lot of processing but still stay reactive to close events, for example. Since events are processed in an event loop, new events can only come in by returning from the callback. So, the initialization function forks off a Haskell thread (the processor) to do all the work and returns as soon as possible to the C GUI library. Now, the event loop starts to wait for the next event in C land. The problem is that the "processor" thread won't run since we have returned to C land and the Haskell scheduler can't run.

We can solve it by running the processor thread with "forkOS". I can't say it is a particularly nice solution but it is how it is done in all other major programming languages.

I wonder how the "threadsafe" keyword can be used to solve this problem. Since the Haskell function is called via a callback, I guess that "threadsafe" should also apply to "wrapper" functions -- that is, when the foreign world calls Haskell, we use another OS thread to run the Haskell code. However, I think that we are then forced to use an OS thread context switch??

Cheers, Simon
RE: Bound Threads
> I have just spent some time reading through all the discussions and the new "threads" document and I would like to propose the addition of a new library function.
>
> > forkOS :: IO () -> IO ThreadID
>
> The function "forkOS" forks a new Haskell thread that runs in a new OS (or native) thread. With this, I also propose that "forkIO" always runs a Haskell thread in the same OS thread that the current Haskell thread runs in.
> (i.e. "forkIO": same OS thread, "forkOS": new OS thread)

I'd like to point out a concept that I think is being missed here:

We never want to specify what OS thread is running a particular Haskell thread.

Why not? Because (a) it doesn't matter: the programmer can never tell, and (b) we want to give the implementation freedom to spread Haskell threads across multiple OS threads to make use of multiple real CPUs.

But the programmer CAN tell, you scream! How can the programmer tell? Only by calling foreign functions. The point is that you want to specify which OS thread is used to invoke a foreign function, NOT which OS thread is used to execute Haskell code. The semantics that Simon & I wrote make this clear.

If we keep thinking like this, then implementations like Hugs can be single-threaded internally but switch OS threads to call out to foreign functions, and implementations like GHC can be multi-threaded internally and avoid switching threads when calling out to foreign functions.

> Using the new primitive, we can view the new "threadsafe" keyword as syntactic sugar:
>
> > foreign import threadsafe foo :: Int -> IO Int
>
> ===>
>
> > foo :: Int -> IO Int
> > foo i = threadSafe (primFoo i)
> >
> > foreign import "foo" primFoo :: Int -> IO Int
>
> where
>
> > threadSafe :: IO a -> IO a
> > threadSafe io
> >   = do result <- newEmptyMVar
> >        forkOS (do{ x <- io; putMVar result x })
> >        getMVar result

This forces a thread switch when calling a threadsafe foreign function, which is something I think we want to avoid.
I'm basing this on two assumptions: (a) switching OS threads is expensive and (b) threadsafe foreign calls are common. I could potentially be wrong on either of these, and I'm prepared to be persuaded. But if both (a) and (b) turn out to be true, then worse is worse in this case.

Cheers, Simon
Re: Bound Threads
Hi all,

I have just spent some time reading through all the discussions and the new "threads" document and I would like to propose the addition of a new library function.

    forkOS :: IO () -> IO ThreadID

The function "forkOS" forks a new Haskell thread that runs in a new OS (or native) thread. With this, I also propose that "forkIO" always runs a Haskell thread in the same OS thread that the current Haskell thread runs in. (i.e. "forkIO": same OS thread, "forkOS": new OS thread)

Using the new primitive, we can view the new "threadsafe" keyword as syntactic sugar:

    foreign import threadsafe foo :: Int -> IO Int

===>

    foo :: Int -> IO Int
    foo i = threadSafe (primFoo i)

    foreign import "foo" primFoo :: Int -> IO Int

where

    threadSafe :: IO a -> IO a
    threadSafe io
      = do result <- newEmptyMVar
           forkOS (do{ x <- io; putMVar result x })
           getMVar result

Note that "forkOS" can use thread pooling in its implementation. The advantage of a separate function "forkOS" is that we put control back in the user's hands, as a programmer can be very specific about which Haskell threads are part of a certain OS thread and can be specific about the OS thread that is used to run a foreign function. In other words, it is absolutely clear to which OS thread a Haskell thread is bound. In this respect, it helps to have another function that runs a Haskell thread in a specific OS thread.

    getOSThread :: ThreadID -> OSThreadID
    forkIOIn    :: OSThreadID -> IO () -> IO ThreadID

I have the feeling that it is not difficult to implement "forkOS" and family once the runtime system has been upgraded to support multiple OS threads. Wolfgang, you seem to be the expert on the OS thread area, would it be hard?

I am not saying that we should discard the "threadsafe" keyword as it might be a useful shorthand, but I think that it is in general a mistake to try to keep the management of OS threads implicit -- don't use new keywords, add combinators to implement them!
I feel that the following has happened; urk, we need some way of keeping haskell threads running while calling C; we add "threadsafe"; whoops, sometimes a function expects that it is run in the same OS thread; we add "bound"; whoops, sometimes functions expect to be run from a specific OS thread... unsolved?? Before we know it, we have added tons of new keywords to solve the wrong problem.

Maybe it is time to take a step back and use a somewhat lower level model with two fork variants: "forkIO" (in the same OS thread) and "forkOS" (in a new OS thread). It seems that none of the above problems occur when having explicit control. In general it seems that OS threads are a resource that is too subtle to be managed automatically as they have a profound impact on how libraries are used and applications are structured.

All the best, Daan.

"worse is better" :-)
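As a rough illustration of the threadSafe combinator proposed in this message, here is a minimal sketch against GHC's Control.Concurrent as it exists today, which does export a forkOS. Several details are assumptions that are not in the original: takeMVar stands in for the message's getMVar, exceptions raised by the action are forwarded to the caller, and the code falls back to forkIO when the program is not linked with the threaded runtime. It is not meant as a definition of what the "threadsafe" keyword does.

```haskell
import Control.Concurrent (forkIO, forkOS, rtsSupportsBoundThreads)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, throwIO, try)

-- | Run an action in a freshly forked thread and wait for its result.
-- Assumption: exceptions are passed back to the caller so that a
-- failing action cannot leave the caller blocked forever.
threadSafe :: IO a -> IO a
threadSafe io = do
  result <- newEmptyMVar
  -- forkOS needs the threaded RTS; otherwise fall back to forkIO.
  let fork = if rtsSupportsBoundThreads then forkOS else forkIO
  _ <- fork (try io >>= putMVar result)
  outcome <- takeMVar result
  case outcome of
    Left e  -> throwIO (e :: SomeException)
    Right x -> return x

main :: IO ()
main = do
  -- 'return (6 * 7)' stands in for a blocking foreign call such as primFoo.
  n <- threadSafe (return (6 * 7 :: Int))
  print n
```

Every call through this wrapper pays for a thread creation and a hand-over of the result, which is the cost questioned in the replies.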
Re: Bound Threads
Alastair Reid wrote:

> > Does anyone plan to add support for multiple OS threads to Hugs or NHC?
>
> I think it will depend a bit on the complexity so let me sketch how I think it can be implemented.
>
> First let me outline my current understanding of what 'bound' means. Consider the following scenario:
>
>   Haskell program is running in OS thread 't1'
>   Haskell program calls C function 'foo'.
>   'foo' forks a new OS thread 't2'.
>   In parallel: 't1' calls Haskell function 'f1' and 't2' calls Haskell function 'f2'
>   'f1' calls C function 'g1'
>   'f2' calls C function 'g2'
>
> My understanding is that 'bound' requires that 'g1' be executed by thread 't1' and that 'g2' be executed by thread 't2'. It would be nice if 'f1' and 'f2' could run simultaneously but the ffi is not going to impose that on us. If 'f1' were to block on an MVar, 'f2' could start running and vice-versa. While 'g1' is running, 'f2' can run and while 'g2' is running, 'f1' can run.

Yes, if the Haskell functions 'f1' and 'f2' are both exported using "bound", that is

    foreign export bound "f1" f1 :: something
    foreign export bound "f2" f2 :: something

> Based on this understanding, I believe that single-threaded runtimes could easily implement 'bound' by doing nothing more than using a lock to ensure that at most one OS thread executes Haskell code at once.

That was the general intention. Things might get more complex though if you want non-blocking foreign calls (a.k.a. "threadsafe"). Then you could either always create one OS thread for every Haskell thread, or use a slightly more complex scheme (as GHC does, but I'm sure there are other ways of doing this). Implementing "threadsafe" is basically independent from implementing the bound threads proposal, i.e. you can implement one without implementing the other (GHC currently has "threadsafe", but "bound threads" are still science fiction).

Cheers, Wolfgang
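The single-lock scheme quoted above can be modelled in a few lines. This is only a toy model under stated assumptions: Haskell threads stand in for the OS threads 't1' and 't2', an MVar stands in for the runtime's global lock, and the names HaskellLock, enterHaskell and callOut are hypothetical. A real runtime would do this in C around its evaluator.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, newMVar, putMVar, takeMVar)
import Control.Exception (bracket_)

-- | Hypothetical global lock: a full MVar means "free", empty means "held".
newtype HaskellLock = HaskellLock (MVar ())

newHaskellLock :: IO HaskellLock
newHaskellLock = fmap HaskellLock (newMVar ())

-- | Foreign code enters Haskell: take the lock, release it on exit.
enterHaskell :: HaskellLock -> IO a -> IO a
enterHaskell (HaskellLock m) = bracket_ (takeMVar m) (putMVar m ())

-- | Haskell calls out to C: give up the lock for the duration of the
-- call so that another thread may run Haskell code, then take it back.
callOut :: HaskellLock -> IO a -> IO a
callOut (HaskellLock m) = bracket_ (putMVar m ()) (takeMVar m)

main :: IO ()
main = do
  lock <- newHaskellLock
  done <- newEmptyMVar
  -- 'worker' plays the role of f1/f2; the 'return' plays g1/g2.
  let worker name = do
        r <- enterHaskell lock (callOut lock (return (length name)))
        putMVar done r
  _ <- forkIO (worker "f1")
  _ <- forkIO (worker "f2")
  a <- takeMVar done
  b <- takeMVar done
  print (a + b)
```

Because each thread makes its own call-outs, 'g1' is executed by the thread that runs 'f1' and 'g2' by the thread that runs 'f2', which is what the scenario requires.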
RE: Bound Threads
| First let me outline my current understanding of what 'bound' means.
| Consider the following scenario:
|
|   Haskell program is running in OS thread 't1'
|   Haskell program calls C function 'foo'.
|   'foo' forks a new OS thread 't2'.
|   In parallel: 't1' calls Haskell function 'f1' and
|                't2' calls Haskell function 'f2'
|   'f1' calls C function 'g1'
|   'f2' calls C function 'g2'
|
| My understanding is that 'bound' requires that 'g1' be executed by
| thread 't1' and that 'g2' be executed by thread 't2'.

You didn't say where the program mentions 'bound'!

I tried very hard to give a precise description of what bound threads are, in haskell-report/ffi/threads.tex. (Wolfgang circulated a PDF recently.) Does that specification make sense? Does it answer your question? (If not, we should improve it.) I'm sure it would be improved by examples -- would you like to add one?

Simon
Re: Bound Threads
> Does anyone plan to add support for multiple OS threads to Hugs or NHC?

I think it will depend a bit on the complexity so let me sketch how I think it can be implemented.

First let me outline my current understanding of what 'bound' means. Consider the following scenario:

  Haskell program is running in OS thread 't1'
  Haskell program calls C function 'foo'.
  'foo' forks a new OS thread 't2'.
  In parallel: 't1' calls Haskell function 'f1' and 't2' calls Haskell function 'f2'
  'f1' calls C function 'g1'
  'f2' calls C function 'g2'

My understanding is that 'bound' requires that 'g1' be executed by thread 't1' and that 'g2' be executed by thread 't2'. It would be nice if 'f1' and 'f2' could run simultaneously but the ffi is not going to impose that on us. If 'f1' were to block on an MVar, 'f2' could start running and vice-versa. While 'g1' is running, 'f2' can run and while 'g2' is running, 'f1' can run.

Based on this understanding, I believe that single-threaded runtimes could easily implement 'bound' by doing nothing more than using a lock to ensure that at most one OS thread executes Haskell code at once. Thus, a global lock would have to be acquired when a bound function is called or when a thread starts running and the lock would be released when a thread stops running (completes, calls out to C or blocks).

This sounds pretty simple (a few tricky corner cases to get right but no major upheaval in the runtime systems) and the locking requirements are quite modest (so, hopefully, portable) so I think an implementation is pretty likely to happen. Timescale will depend on when people find time or money to do it.

-- Alastair Reid [EMAIL PROTECTED] Reid Consulting (UK) Limited http://www.reid-consulting-uk.ltd.uk/alastair/
Re: Bound Threads
Simon Peyton-Jones wrote:

> [I've updated the "Semantics for foreign threads" document by re-ordering the sections a bit. It'd benefit from having a bit more formal syntax.

That should be just a matter of copying from the ffi spec and adding an additional specialid... I had hoped to avoid learning how to use yet another TeX package... but OK, I'll do that.

> No one has commented a single word on the operational semantics. I don't know whether that's because it's so clear that no discussion is needed, or so opaque that no discussion is possible.]

I spent about half an hour staring at a printout, before I saw that it was all perfectly logical if I corrected one typo. Now I think there are no typos left in the formal semantics, so it should need less staring...

Anyway, do you think the proposal has been discussed enough for me to start working on a prototype implementation?

About threadsafe/safe:

> I think Wolfgang is saying that the apparent efficiency gain of not requiring thread-safety is illusory, [..]

In GHC, we might be able to save some time, but only for calls without call-back. As soon as there's a call-back, I can't see how we can save any time at all.

> [...] and so we can abolish the safe/threadsafe distinction. I think that would be a very worthwhile gain. Does anyone disagree with this?

Kill it kill it kill it! We don't need that distinction. However, it might be worthwhile for other implementations of Haskell, but we can't tell, because GHC is the only one that supports "threadsafe" at the moment. Does anyone plan to add support for multiple OS threads to Hugs or NHC?

If someone wants to keep "safe" in the FFI spec as an "optimization hint", then that's fine for me, but something has to be done about the misleading naming ("safe" is NOT SAFE) and an optimization hint should never be the default. And it should be made absolutely clear that an implementation may treat everything as "threadsafe".
Cheers, Wolfgang
RE: Bound Threads
[I've updated the "Semantics for foreign threads" document by re-ordering the sections a bit. It'd benefit from having a bit more formal syntax. No one has commented a single word on the operational semantics. I don't know whether that's because it's so clear that no discussion is needed, or so opaque that no discussion is possible.]

| > I must admit that I can't remember the exact semantic distinction between those "safe" and "threadsafe"
|
| The problem is, nobody does... the original implementation didn't work in all cases. The original implementation made "safe" calls block all other haskell threads in some cases, and crashed in other cases.
| "Threadsafe" means that calling the foreign import shouldn't block or otherwise disturb other haskell threads. "Safe" means... well... almost nobody seems to know, and still fewer people agree on it.
| In the current "HEAD", there is no difference between threadsafe and safe. If someone comes up with a clear specification of why and how "safe" should be different from threadsafe, things might change again.

Indeed, the semantics in the "semantics of bound threads" document makes no distinction between "safe" and "threadsafe" either.

The original intention was this:

- a threadsafe foreign call must not block Haskell threads, even if the foreign call blocks in foreign land.
- a safe call is not required to obey this constraint.

The motivation was that thread-safety might require more admin (e.g. relinquishing the lock on the main Haskell heap), and this admin might be costly.

A side consequence of GHC's implementation (albeit not of the above specification) is that no Haskell threads progress during a safe call (unless it provokes a call-back..?). I recall that some people actually started to rely on this, though it was never intended as part of the spec.

I think Wolfgang is saying that the apparent efficiency gain of not requiring thread-safety is illusory, and so we can abolish the safe/threadsafe distinction.
I think that would be a very worthwhile gain. Does anyone disagree with this?

Simon
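For readers who want to see where these annotations sit in source code, here is a minimal sketch using the two safety levels defined in the Haskell 2010 FFI chapter, safe and unsafe (that report has no threadsafe keyword). The choice of sin and cos from the C math library and the Haskell names safeSin and unsafeCos are made only for illustration and do not come from the thread.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble (..))

-- A "safe" import: the runtime has to allow the C code to call back
-- into Haskell, so the call carries some extra bookkeeping.
foreign import ccall safe "math.h sin"
  safeSin :: CDouble -> IO CDouble

-- An "unsafe" import: cheaper, but the C code must not call back
-- into Haskell.
foreign import ccall unsafe "math.h cos"
  unsafeCos :: CDouble -> IO CDouble

main :: IO ()
main = do
  s <- safeSin 0
  c <- unsafeCos 0
  print (s, c)
```

Whether a safe call lets other Haskell threads keep running is a property of the implementation and its runtime configuration, which is the question this message is about.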
Re: Bound Threads
Sven Panne wrote:

> Just to make sure I have understood everything correctly: To make HOpenGL work in the presence of a threaded RTS, the only places which need a change are the stub factories for GLUT callbacks, where a "bound" attribute is required now.

Correct.

> What about the foreign import of glutMainLoop (the main native dispatcher for GLUT, which calls the Haskell callbacks mentioned above when an event occurs and blocks the rest of the time)? It is currently "safe", is this OK? Or should it be "threadsafe"?

It should probably be "threadsafe", just in case somebody wants to do some (non-OpenGL) work in the background.

> I must admit that I can't remember the exact semantic distinction between those two attributes anymore... :-}

The problem is, nobody does... the original implementation didn't work in all cases. The original implementation made "safe" calls block all other haskell threads in some cases, and crashed in other cases. "Threadsafe" means that calling the foreign import shouldn't block or otherwise disturb other haskell threads. "Safe" means... well... almost nobody seems to know, and still fewer people agree on it. In the current "HEAD", there is no difference between threadsafe and safe. If someone comes up with a clear specification of why and how "safe" should be different from threadsafe, things might change again.

This issue is (I think/hope) entirely orthogonal to the Bound Threads proposal.

Cheers, Wolfgang
Re: Bound Threads
I just skimmed over Wolfgang's PDF and things look quite reasonable. Just to make sure I have understood everything correctly: To make HOpenGL work in the presence of a threaded RTS, the only places which need a change are the stub factories for GLUT callbacks, where a "bound" attribute is required now.

What about the foreign import of glutMainLoop (the main native dispatcher for GLUT, which calls the Haskell callbacks mentioned above when an event occurs and blocks the rest of the time)? It is currently "safe", is this OK? Or should it be "threadsafe"? I must admit that I can't remember the exact semantic distinction between those two attributes anymore... :-}

Cheers, S.