Since they've come up, here's the scheme I've been pondering, based in large part on experience with POSIX threads at the C level, POSIX-style threads at the interpreter level (with perl 5's 5.005-style threads), and the interpreter/semi-fork style threads that perl 5's using now.

First off, the most important thing is that we absolutely guarantee that interpreter internal data structures are consistent and threadsafe. Nothing that bytecode does should be able to corrupt the internals of parrot. (We'll not talk about C extensions here--they can screw things up in massive and horribly creative ways, of course)

Second, we're assuming that the *non* threaded case is the common case. (This is definitely the assumption that I'm most expecting to regret in the future)

Third, we're assuming that we are *not* providing any sort of guarantees to user code about threads. User-level thread safety without the user coding for it is one of those "solve the halting problem" things. While it'd be cool to solve, we've got better things to do before the sun goes nova. :)

So, the scheme is this.

1) Only PMCs may be shared across interpreters.

2) PMCs always live in the interpreter in which they were allocated.

3) Only shareable PMCs may cross interpreters

4) Putting a PMC into a shared container shares the PMC and anything that the PMC points to. (This may involve a cascade of sharing. Too bad.) PMCs may pitch an exception instead of sharing their contents.

5) *Secondary* sharing failures are interpreter-killing fatal errors. Primary sharing failures are exceptions. That is, if you share something and that thing says no, it's an exception. If you share something and that something goes and shares something else and that fails, it's fatal.

6) Immutable things may be shared across interpreters with no synchronization or need for sub-sharing. Other than constants we don't have many of those.

7) Immutable things may *not* point to mutable things. The converse is OK.

8) We're probably going to have to rejig the string functions some, and access them via a vtable off strings or something of the sort, so we can swap in and out threadsafe memory allocation and string allocation routines.

9) We need to add a share() entry to the vtable list for PMCs.
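
To make rules 4 through 6 and 9 a bit more concrete, here's a rough model of what a share() entry might do. It's Python rather than anything that would live in parrot itself, and every name in it (PMC, ShareError, FatalShareError, the shareable and immutable flags) is made up for illustration. The fatal error derives from BaseException only to stand in for "this kills the interpreter".

```python
class ShareError(Exception):
    """Primary failure: the thing you asked to share said no (rule 5)."""


class FatalShareError(BaseException):
    """Secondary failure: stands in for an interpreter-killing error."""


class PMC:
    """Hypothetical stand-in for a PMC; not parrot's real structure."""

    def __init__(self, name, refs=(), shareable=True, immutable=False):
        self.name = name
        self.refs = list(refs)      # other PMCs this one points to
        self.shareable = shareable
        self.immutable = immutable
        self.shared = False

    def share(self, _depth=0):
        # Rule 6: immutable things need no synchronization or sub-sharing.
        # Already-shared PMCs are skipped so cycles terminate.
        if self.immutable or self.shared:
            return
        if not self.shareable:
            if _depth == 0:
                raise ShareError(self.name)       # primary: an exception
            raise FatalShareError(self.name)      # secondary: fatal
        self.shared = True
        # Rule 4: sharing cascades to everything this PMC points to.
        for ref in self.refs:
            ref.share(_depth + 1)


leaf = PMC("leaf")
box = PMC("box", refs=[leaf])
box.share()   # marks both box and leaf as shared
```

The _depth argument is just one way to tell a primary failure from a secondary one; a real vtable entry would presumably carry that distinction some other way.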

As to what gets shared and not shared...

There are three types of thread spawning we're going to do.

Type 1) Start a new thread with a new interpreter that shares *nothing* with the original, not even any communication.

Type 2) Start a new thread with a new interpreter and communicate with it in what's essentially an opto-isolated way. Done by sending messages to the other thread and waiting for acknowledgement. Messages received from another thread can be considered read-only and immutable, and should be copied into the receiving thread and acknowledged as soon as possible.

Type 3) Start a new thread with a new interpreter and put it into the current thread pool. Threads of this sort may share everything, but aren't required to do so.
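
The type 2 handshake (send, copy on the receiving side, acknowledge) might look something like the sketch below. Again this is a Python model with invented names (Mailbox, send, receive, run_demo), using the standard threading, queue and copy modules; it assumes the sender blocks until the acknowledgement arrives, which is only one possible reading of "waiting for acknowledgement".

```python
import copy
import queue
import threading


class Mailbox:
    """Hypothetical one-way channel between two interpreters' threads."""

    def __init__(self):
        self._q = queue.Queue()

    def send(self, payload):
        # Post the message, then wait until the receiver acknowledges it.
        ack = threading.Event()
        self._q.put((payload, ack))
        ack.wait()

    def receive(self):
        payload, ack = self._q.get()
        # Treat the incoming message as read-only: take a private copy,
        # then acknowledge as soon as the copy exists.
        private = copy.deepcopy(payload)
        ack.set()
        return private


def run_demo():
    box = Mailbox()
    received = []

    def worker():
        msg = box.receive()
        msg.append("seen by worker")   # touches only the private copy
        received.append(msg)

    t = threading.Thread(target=worker)
    t.start()
    original = ["hello"]
    box.send(original)
    t.join()
    return original, received[0]
```

Because the receiver works on its own copy, nothing it does afterwards is visible to the sender, which is the whole point of the isolation.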

Type 1 threads start fresh, and basically get nothing but a chunk of bytecode and whatever that bytecode instantiates. This is essentially the same as firing up a brand new parrot process, except it's in a shared memory space. These threads can be free-running and don't require any sort of locking as they're completely independent. We count on the OS and system libraries to provide adequate locking where needed. (System memory allocation, system IO, and the like)

Type 2 threads are a bit more tightly coupled. When started, the interpreter may optionally mark its contents as cloneable into the new interpreter, in which case it gets a copy of all the creating thread's data and whatnot. (We can COW this or whatever--optimizations are fine as long as it looks like a copy)

With type 2 threads, libraries get their "You're being instantiated" function called, but do not get their "you're being loaded" function called, as they do when they're originally loaded. That's because they're *not* being loaded, just re-instantiated into a new thread. This may or may not do anything.

Type 3 threads clone off their internals (once again COWing themselves if we want) and then present everything as shared. Any library loaded in one interpreter is loaded into all of them, and we don't (I think, this can be argued) call the "you're being instantiated" function more than once. Once loaded, a package and all its data are visible to every thread in the type 3 thread group.
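
Here's a small model of which library hooks fire for type 2 versus type 3 spawns, as described above. All the names (Library, Interpreter, on_load, on_instantiate, spawn_type2, spawn_type3) are invented, and it assumes that an original load calls both hooks, which the text implies but doesn't spell out. It also skips the optional "mark contents as cloneable" step and just carries the libraries across.

```python
class Library:
    """Hypothetical library that counts how often each hook is called."""

    def __init__(self, name):
        self.name = name
        self.load_calls = 0
        self.instantiate_calls = 0

    def on_load(self):            # "you're being loaded"
        self.load_calls += 1

    def on_instantiate(self):     # "you're being instantiated"
        self.instantiate_calls += 1


class Interpreter:
    def __init__(self):
        self.libraries = []

    def load(self, lib):
        # Assumption: a fresh load fires both hooks.
        lib.on_load()
        lib.on_instantiate()
        self.libraries.append(lib)

    def spawn_type2(self):
        # New interpreter gets its own library list; each library is
        # re-instantiated but not re-loaded.
        child = Interpreter()
        for lib in self.libraries:
            lib.on_instantiate()
            child.libraries.append(lib)
        return child

    def spawn_type3(self):
        # Same thread group: the library list itself is shared, and no
        # hook is called again.
        child = Interpreter()
        child.libraries = self.libraries
        return child
```

Sharing the list object in spawn_type3 is what makes a library loaded in one interpreter show up in all of them; whether the instantiate hook should fire again there is the open question flagged above.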

There'll be more, and soon, but let's hack into this part to start with and we'll go from here.
--
Dan


--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
