Re: Ideas for a Object-Belongs-to-Thread threading model
BrowserUK wrote:
>> - there are the interpreter processes.
>
> Inventing (overloaded) terminology will just create confusion. Very unhelpful in a context that suffers more than its fair share already.

Okay, I should probably call them Actors, to use more precise terminology, since this is heavily inspired by two Actor Model languages.

>> - The interpreter implements a scheduler, just like POE.
>
> POE does *NOT* implement a scheduler.

Okay, mentioning POE was just a side comment; it doesn't interfere directly with the model.

>> 3 - The scheduler, unlike POE, should be able to schedule in several OS threads, such that any OS thread may raise any waiting process.
>
> And how are you going to implement that?

That was the part I took directly from the languages that inspired this; just take a look at how Erlang and the IO language schedule their actors.

> The only way would be for there to be multiple concurrent (kernel threaded) instances of the state-machine running, sharing (as in shared state concurrency) their controlling state.

But maybe each actor is tied to a particular OS thread, which would simplify things a bit... Also, it is possible to suspend an actor in order to implement a time-sharing scheduler as well...

daniel
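To make the "suspend an actor to implement time-sharing" idea concrete, here is a minimal sketch, assuming Python generators stand in for actors and every `yield` is a suspension point. The names `actor` and `run_round_robin` are made up for illustration; this is cooperative scheduling in a single OS thread, not how Erlang or Io actually schedule their actors.

```python
from collections import deque

def actor(name, steps, log):
    """A hypothetical actor: does `steps` units of work, suspending after each."""
    for i in range(steps):
        log.append((name, i))
        yield  # suspension point: control goes back to the scheduler

def run_round_robin(actors):
    """Resume each waiting actor in turn until all of them have finished."""
    ready = deque(actors)
    while ready:
        current = ready.popleft()
        try:
            next(current)        # run until the actor's next suspension point
        except StopIteration:
            continue             # the actor finished, so it is dropped
        ready.append(current)    # suspended but not finished: back of the queue

log = []
run_round_robin([actor("a", 2, log), actor("b", 3, log)])
```

After the run, `log` shows the two actors interleaved rather than executed back to back. A scheduler spanning several OS threads would additionally need a way to share or partition the `ready` queue, which is exactly the point BrowserUK raises.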
Second Version of Ideas for a Object-Belongs-to-Thread threading model
On Tue, 2010-05-11 at 21:45 -0300, Daniel Ruoso wrote:
> The threading model topic still needs lots of thinking, so I decided to try out some ideas.

After BrowserUK's feedback and some more reading (including http://www.c2.com/cgi/wiki?MessagePassingConcurrency ) and links from there on, I decided to rewrite those ideas in a slightly different model, but still in the same spirit.

0 - The idea is inspired by Erlang and the IO Language. In addition to OS threads there are Coroutine Groups.

1 - No memory is shared between Coroutine Groups, so no locking is necessary.

2 - A value and a coroutine always belong to a Coroutine Group, which should be assigned to a single OS thread, thus naturally implementing synchronized access to data.

3 - The interpreter implements a scheduler, which will pick one of the waiting coroutines that belong to the groups assigned to the current thread. The scheduler may also suspend a coroutine in order to implement time-sharing. The scheduler should support blocking states in the coroutines.

4 - When comparing to Perl 5, each coroutine is an ithread, but memory is shared between all the coroutines in the same group, given that they will always run in the same OS thread.

5 - When a coroutine group is created, it is assigned to one OS thread. The interpreter might decide to create new OS threads as necessary; it might optionally implement one OS thread per coroutine group.

6 - In order to implement inter-coroutine-group communication, there are:

  6.1 - A MessageQueue works just like a Unix pipe; it looks like a slurpy array. It has a configurable buffer size, and coroutines might block when trying to read from and/or write to it.

  6.2 - A RemoteInvocation is an object that has an identifier, a capture (which might, optionally, point to a MessageQueue as input) and another MessageQueue to be used as output. New coroutines are created in the target group to execute that invocation.

  6.3 - An InvocationQueue is a special type of MessageQueue that accepts only RemoteInvocation objects.

  6.4 - A RemoteValue is an object that proxies requests to another coroutine group through a RemoteInvocation.

7 - The coroutine group boundary is drawn by language constructs such as async, the feed operator, junctions and hyper operators.

8 - A value might have its ownership transferred to another group if it can be detected that this value is in use only for that invocation or return value, in order to reduce the number of RemoteInvocations.

9 - A value might do a special ThreadSafe role if it is thread-safe (such as implementing bindings to thread-safe native libraries), in which case it is sent as-is to a different group.

10 - A value might do a special ThreadCloneable role if it should be cloned instead of being proxied through a RemoteValue when sent in a RemoteInvocation.

11 - The MessageQueue notifies the scheduler whenever new data is available in that queue so the target coroutine might be raised.

12 - Exception handling gets a bit hairy, since exceptions might only be raised at the calling scope when the value is consumed.

13 - List assignment and Sink context might result in synchronized behavior.

comments are appreciated...

daniel
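The MessageQueue of item 6.1 can be sketched roughly as follows, assuming a bounded `queue.Queue` from the Python standard library plays the role of the pipe buffer and two OS threads play the role of two coroutine groups. The class and its `write`/`close` methods are illustrative names only, not part of any proposed Perl 6 API.

```python
import queue
import threading

class MessageQueue:
    """Hypothetical bounded, pipe-like queue between two coroutine groups."""
    _EOF = object()  # sentinel marking the closed write end

    def __init__(self, buffer_size):
        self._q = queue.Queue(maxsize=buffer_size)

    def write(self, item):
        self._q.put(item)          # blocks while the buffer is full

    def close(self):
        self._q.put(self._EOF)

    def __iter__(self):
        while True:
            item = self._q.get()   # blocks while the buffer is empty
            if item is self._EOF:
                return
            yield item

def producer(mq):
    for n in range(5):
        mq.write(n * n)
    mq.close()

mq = MessageQueue(buffer_size=2)
writer = threading.Thread(target=producer, args=(mq,))
writer.start()
received = list(mq)   # consumed lazily, much like iterating a slurpy array
writer.join()
```

With a buffer of two, the producer blocks as soon as it gets two items ahead of the reader, which is the pipe-like back-pressure the item describes. In this sketch only the queue itself is touched by both threads; the values pass through it rather than being shared.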
Third and simplified version of Ideas for a Object-Belongs-to-Thread threading model
On Tue, 2010-05-11 at 21:45 -0300, Daniel Ruoso wrote:
> The threading model topic still needs lots of thinking, so I decided to try out some ideas.

After I sent the second version, I realized I could make it simpler by just assuming one OS thread per Coroutine Group... so here goes the new version.

0 - No memory is shared between threads, so no locking is necessary.

1 - A value and a coroutine always belong to a thread, thus naturally implementing synchronized access to data.

2 - Coroutines are, conceptually, equivalent to green threads, running in the same OS thread. Coroutines waiting for a return value are blocked.

3 - The interpreter implements a scheduler, which will pick one of the waiting coroutines. It may also suspend a coroutine in order to implement time-sharing.

4 - In order to implement inter-thread communication, there are:

  4.1 - A MessageQueue works just like a Unix pipe; it looks like a slurpy array. It has a configurable buffer size, and coroutines might block when trying to read from and/or write to it.

  4.2 - A RemoteInvocation is an object that has an identifier, a capture (which might, optionally, point to a MessageQueue as input) and another MessageQueue to be used as output. New coroutines are created in the target thread to execute that invocation.

  4.3 - An InvocationQueue is a special type of MessageQueue that accepts only RemoteInvocation objects.

  4.4 - A RemoteValue is an object that proxies requests to another coroutine group through a RemoteInvocation.

5 - The thread group boundary is drawn by language constructs such as async, the feed operator, junctions and hyper operators.

6 - A value might have its ownership transferred to another thread if it can be detected that this value is in use only for that invocation or return value, in order to reduce the number of RemoteInvocations.

7 - A value might do a special ThreadSafe role if it is thread-safe (such as implementing bindings to thread-safe native libraries), in which case it is sent as-is to a different thread.

8 - A value might do a special ThreadCloneable role if it should be cloned instead of being proxied through a RemoteValue when sent in a RemoteInvocation.

9 - Exception handling gets a bit hairy, since exceptions might only be raised at the calling scope when the value is consumed.

10 - List assignment and Sink context might result in synchronized behavior.

daniel
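Items 4.2 to 4.4 and item 9 can be pictured with a rough sketch, assuming plain Python threads and `queue.Queue` objects stand in for the interpreter's threads and queues. `RemoteInvocation`, `RemoteValue` and `owner_thread` below are hypothetical illustrations of the roles named in the list; in particular, this sketch runs each invocation inline in the owner's loop instead of spawning a new coroutine for it.

```python
import queue
import threading

class RemoteInvocation:
    """Hypothetical: names a target and method, carries arguments, owns an output queue."""
    def __init__(self, target_id, method, args):
        self.target_id = target_id
        self.method = method
        self.args = args
        self.output = queue.Queue(maxsize=1)

def owner_thread(invocations, objects):
    """Only this thread touches `objects`; everyone else goes through invocations."""
    while True:
        inv = invocations.get()
        if inv is None:                      # shutdown marker
            return
        try:
            result = getattr(objects[inv.target_id], inv.method)(*inv.args)
            inv.output.put(("ok", result))
        except Exception as exc:
            inv.output.put(("error", exc))

class RemoteValue:
    """Proxy that forwards method calls to the thread owning the real value."""
    def __init__(self, invocations, target_id):
        self._invocations = invocations
        self._target_id = target_id

    def call(self, method, *args):
        inv = RemoteInvocation(self._target_id, method, args)
        self._invocations.put(inv)
        status, payload = inv.output.get()   # the caller blocks until the result arrives
        if status == "error":
            raise payload                    # surfaces in the calling scope on consumption
        return payload

invocations = queue.Queue()                  # plays the role of an InvocationQueue
objects = {"counts": []}                     # touched only by the worker thread
worker = threading.Thread(target=owner_thread, args=(invocations, objects))
worker.start()

proxy = RemoteValue(invocations, "counts")
proxy.call("append", 3)
proxy.call("append", 4)
total = proxy.call("__len__")
try:
    proxy.call("index", 99)                  # ValueError is raised in the owner thread
    failed = False
except ValueError:
    failed = True
invocations.put(None)
worker.join()
```

The list in `objects` is never accessed from the main thread while the worker is running, so no lock is needed around it. The exception path shows why item 9 calls exception handling "hairy": the error happens in the owner thread but only becomes visible where the caller consumes the output.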
Re: Second Version of Ideas for a Object-Belongs-to-Thread threading model
I might have some more to say about any threading model later, but for now I wanted to make everyone aware of a scripting language that is truly multi-threaded; you may want to check it out. Some of its syntax is Perlish, whereas some is not. The point is that it is supposed to scale on SMP machines. It's called Qore - http://www.qore.org. I maintain the FreeBSD port for it and have played with it quite a bit. It's a nice interface, though traditional. And it does seem to scale pretty well.

If the debate is shared memory threads vs message passing (à la Erlang), then I would suggest that they are not mutually exclusive (pun intended) and could actually provide some complementary benefits if deployed on a large scale distributed memory machine composed of SMP nodes. In other words, a mixed mode style of distributed programming where the SMP threads run on each node and the MP is used to connect these disjoint processes over the network.

I know that SMP threads are best implemented with a low level runtime (maybe even using a Qore backend?), but I have no idea how one might facilitate Erlang style remote processes. Still, I believe offering both styles would be totally awesome :^).

Cheers,
Brett

On Wed, May 12, 2010 at 09:50:19AM -0300, Daniel Ruoso wrote:
> On Tue, 2010-05-11 at 21:45 -0300, Daniel Ruoso wrote:
>> The threading model topic still needs lots of thinking, so I decided to try out some ideas.
>
> After BrowserUK's feedback and some more reading (including http://www.c2.com/cgi/wiki?MessagePassingConcurrency ) and links from there on, I decided to rewrite those ideas in a slightly different model, but still in the same spirit.
>
> [...]

--
B. Estrade estr...@gmail.com
Re: Ideas for a Object-Belongs-to-Thread threading model
Daniel Ruoso wrote:
> Hi,
>
> The threading model topic still needs lots of thinking, so I decided to try out some ideas.
>
> Every concurrency model has its advantages and drawbacks. I've been wondering about these ideas for a while now and I think I finally have a sketch. My primary concerns were:
>
> 1 - It can't require locking: locking is just not scalable;
> 2 - It should perform better with lots of cores even if it suffers when you have only a few;
> 3 - It shouldn't require complicated memory management techniques that will make it difficult to bind native libraries (yes, STM is damn hard);
> 4 - It should support implicit threading and implicit event-based programming (i.e. the feed operator);
> 5 - It must be easier to use than Perl 5 shared variables;
> 6 - It can't use a Global Interpreter Lock (that is already said in 1, but, as this is a widely accepted idea in some other environments, I thought it would be better to make it explicit).
>
> The idea I started with was that every object has an owner thread, and only that thread should talk to it, and I ended up with the following, comments are appreciated:
>
> comments? ideas?

Before discussing the implementation, I think it's worthwhile stating what it is that you are attempting to abstract. For example, is the abstraction intended for a mapping down to a GPU (e.g. OpenCL) with a hierarchical address space, or is it intended for a multicore CPU with linear address space, or is it intended to abstract a LAN, with communication via sockets (reliable TCP? unreliable UDP?), or is it intended to abstract the internet/cloud? Are you thinking in terms of streaming computation where throughput is dominant, or interacting agents where latency is the critical metric?

I'm not sure that it makes sense to talk of a single abstraction that supports all of those environments. However, there may be a bunch of abstractions that can be combined in different ways.

"Object belongs to thread" can have two interpretations: one is that the object-thread binding lasts for the life of the object; the other is that a client that wishes to use an object must request ownership, and wait for it to be granted (in some scenarios, the granting of ownership would require the thread to migrate to the physical processor that owns the state).

In many cases, we might find that specific object-state must live in specific places, but not all of the state that is encapsulated by an object lives in the same place. Often, an object will encapsulate state that is, itself, accessed via objects. If a model requires delegated access to owned state to be passed through an intermediate object then this may imply significant overhead. A better way to think about such scenarios may be that a client would request access to a subset of methods, and thus we have "role belongs to thread", not "object belongs to thread". One could imagine that a FIFO object might have a put role and a get role that producer/consumer clients would (temporarily) own while using them (note that granting of ownership may imply arbitration, and later forced revocation if the resource ownership is not released or extended before some timeout expires). It may be wrong to conflate "role" as a unit of reuse with "role" as an owned window onto a subset of an object's methods.

Perl6 has a set of language primitives to support various aspects of concurrency. It is indeed interesting to consider how these map to vastly different computation platforms: OpenCL vs OpenMP vs Cloud. It seems a little premature to be defining roles (e.g. RemoteInvocation) without defining the mapping of the core operators to these various models of computation.

Dave.
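The FIFO with a put role and a get role could look something like the sketch below, assuming each role is guarded by its own lock and "requesting ownership" means acquiring that lock with a timeout. `Role` and `Fifo` are hypothetical names for illustration. Forced revocation after a timeout is not implemented here, and nothing stops a client from keeping the granted methods after releasing the role, so this shows only the arbitration half of the idea.

```python
import threading
from collections import deque
from types import SimpleNamespace

class Role:
    """Hypothetical owned window onto a subset of an object's methods."""
    def __init__(self, **methods):
        self._methods = SimpleNamespace(**methods)
        self._owner_lock = threading.Lock()

    def acquire(self, timeout):
        """Request ownership; returns the granted methods, or None if the wait times out."""
        if self._owner_lock.acquire(timeout=timeout):
            return self._methods
        return None

    def release(self):
        self._owner_lock.release()

class Fifo:
    """A FIFO whose put and get sides can be owned by different clients."""
    def __init__(self):
        self._items = deque()
        self.put_role = Role(put=self._items.append)
        self.get_role = Role(get=self._items.popleft)

fifo = Fifo()
producer = fifo.put_role.acquire(timeout=0.1)
consumer = fifo.get_role.acquire(timeout=0.1)
producer.put("job-1")
second_producer = fifo.put_role.acquire(timeout=0.1)   # refused: the role is already owned
item = consumer.get()
fifo.put_role.release()
fifo.get_role.release()
```

The producer and the consumer each hold a different role of the same object at the same time, which a strict "object belongs to thread" rule would not allow without a proxy in between.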
Re: Ideas for a Object-Belongs-to-Thread threading model
On Wed, 2010-05-12 at 10:12 -0700, Dave Whipp wrote:
> Before discussing the implementation, I think it's worthwhile stating what it is that you are attempting to abstract. For example, is the abstraction intended for a mapping down to a GPU (e.g. OpenCL) with a hierarchical address space, or is it intended for a multicore CPU with linear address space, or is it intended to abstract a LAN, with communication via sockets (reliable TCP? unreliable UDP?), or is it intended to abstract the internet/cloud?

Initially I'd consider regular OS threads and queues implemented in the process address space. I'd consider other abstractions to be possible, but it is probably better to implement them as separate modules...

daniel
Fwd: Ideas for a Object-Belongs-to-Thread threading model
Forgot to send this to the list.

-- Forwarded message --
From: Alex Elsayed eternal...@gmail.com
Date: Wed, May 12, 2010 at 8:55 PM
Subject: Re: Ideas for a Object-Belongs-to-Thread threading model
To: Daniel Ruoso dan...@ruoso.com

You may find interesting a paper that was (at one point) listed in the /topic of #perl6. The paper is:

Combining Events And Threads For Scalable Network Services
http://www.cis.upenn.edu/~stevez/papers/LZ07.ps

Steve Zdancewic and Peng Li, who wrote it, implemented their proof of concept in Haskell, and I think it would mesh rather well with the 'hybrid threads' GSoC project that Parrot is undertaking. What's more, the proof of concept performed very well, well enough that the threading/event abstractions were never a bottleneck even up to 10M threads (for memory usage, this came out to 48 bytes of overhead per thread), and with 100 threads it outperformed NPTL (pthreads) + AIO on IO. It's also CPS based, which fits pretty well.
Re: Ideas for a Object-Belongs-to-Thread threading model
On Wed, May 12, 2010 at 8:57 PM, Alex Elsayed eternal...@gmail.com wrote:
> Forgot to send this to the list.
>
> -- Forwarded message --
> From: Alex Elsayed eternal...@gmail.com
> ...
> It's also CPS based, which fits pretty well.

Here's another, one that might fit more readily with perlesque/CLR:

Actors that Unify Threads and Events
pdf: http://lamp.epfl.ch/~phaller/doc/haller07actorsunify.pdf
slides: http://lamp.epfl.ch/~phaller/doc/ActorsUnify.pdf

"In this paper we present an abstraction of actors that combines the benefits of thread-based and event-based concurrency. Threads support blocking operations such as system I/O, and can be executed on multiple processor cores in parallel. Event-based computation, on the other hand, is more lightweight and scales to large numbers of actors. We also present a set of combinators that allows a flexible composition of these actors. Scala actors are implemented on the JVM, but our techniques can be applied to all multi-threaded VMs with a similar architecture, such as the CLR."
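The event-based half of that abstract can be illustrated with a small sketch, assuming actors are plain objects with a mailbox and that a shared `ThreadPoolExecutor` from the Python standard library supplies the OS threads. `EventActor` is a made-up class and is not the Scala actors API. The point it tries to show is that an idle actor occupies no thread at all, so a hundred actors can be served by two workers.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

class EventActor:
    """Hypothetical event-based actor: no dedicated thread, messages run as pool tasks."""
    def __init__(self, pool, handler):
        self._pool = pool
        self._handler = handler
        self._mailbox = queue.Queue()
        self._lock = threading.Lock()
        self._scheduled = False      # True while a drain task is queued or running

    def send(self, msg):
        self._mailbox.put(msg)
        with self._lock:
            if self._scheduled:
                return               # the active drain task will pick the message up
            self._scheduled = True
        self._pool.submit(self._drain)

    def _drain(self):
        """Handle queued messages one at a time, then give the worker thread back."""
        while True:
            try:
                msg = self._mailbox.get_nowait()
            except queue.Empty:
                with self._lock:
                    if self._mailbox.empty():
                        self._scheduled = False
                        return
                continue             # a message arrived during the check; keep going
            self._handler(msg)

results = []
ordered = []
with ThreadPoolExecutor(max_workers=2) as pool:
    actors = [EventActor(pool, results.append) for _ in range(100)]
    for i, a in enumerate(actors):
        a.send(i)
    sequencer = EventActor(pool, ordered.append)
    for i in range(10):
        sequencer.send(i)
```

Each actor still handles its own messages strictly one at a time and in order, because at most one drain task per actor exists at any moment. A handler that blocks on I/O would tie up a pool worker, which is the case the paper's thread-based mode is meant to cover.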