RFC 178 proposes a shared data model for Perl6 threads. In a shared
data model:
- globals are shared unless localized
- file-scoped lexicals are shared unless the thread recompiles the
  file 
- block-scoped lexicals may be shared by
  - passing a reference to them
  - closures
  - declaring one subroutine within the scope of another

In short, lots of stuff is shared, and just about everything can be
shared. 

To prevent the interpreter from crashing in a shared data model, every
access to a named variable must be protected by a mutex. For an
assignment like $a = $b, that means something like

    lock(b.mutex)
    fetch(b)
    unlock(b.mutex)
    lock(a.mutex)
    store(a)
    unlock(a.mutex)
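
To make the cost concrete, here is a rough sketch in C, using
pthreads. The struct layout and the names are invented for
illustration; they are not how any real interpreter lays things out.

    #include <pthread.h>

    typedef struct {
        pthread_mutex_t mutex;  /* guards this variable's value    */
        int             value;  /* stand-in for a real Perl value  */
    } shared_var;

    static int fetch(shared_var *v) {
        pthread_mutex_lock(&v->mutex);
        int tmp = v->value;
        pthread_mutex_unlock(&v->mutex);
        return tmp;
    }

    static void store(shared_var *v, int tmp) {
        pthread_mutex_lock(&v->mutex);
        v->value = tmp;
        pthread_mutex_unlock(&v->mutex);
    }

    int main(void) {
        shared_var a = { PTHREAD_MUTEX_INITIALIZER, 0 };
        shared_var b = { PTHREAD_MUTEX_INITIALIZER, 42 };
        store(&a, fetch(&b));   /* $a = $b: two lock/unlock pairs */
        return 0;
    }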
    
It has been argued on perl6-internals that 
- acquiring mutexes takes time
- most variables aren't shared
- we should optimize for the common case by requiring a :shared
  attribute on shared variables. 

Variables declared without a :shared attribute would be isolated: each
thread gets its own value for the variable. In this model, the user
incurs the cost of mutexes only for data that is actually shared
between threads.
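
The cost argument can be sketched the same way. In the sketch below
(C again, with an invented variable layout), only variables declared
:shared carry a mutex:

    #include <pthread.h>
    #include <stddef.h>

    typedef struct {
        pthread_mutex_t *mutex;  /* non-NULL only for :shared variables */
        int              value;
    } var;

    static int fetch(var *v) {
        if (v->mutex == NULL)          /* the common, unshared case: */
            return v->value;           /* no locking at all          */
        pthread_mutex_lock(v->mutex);  /* only :shared variables pay */
        int tmp = v->value;
        pthread_mutex_unlock(v->mutex);
        return tmp;
    }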

This is a valid argument. However, an isolated data model has its own
costs, and we need to understand these, so that we can compare them to
the costs of a shared data model.

The first interesting question is: How does a thread get access to its
own value for a variable?

We can break the problem into two broad cases:
- All threads execute the same op tree
- Each thread executes its own copy of the op tree

Let's look at these in detail.

1. All threads execute the same op tree

Consider an op like

        fetch(b)

If you actually compile a Perl program, like

        $a = $b
        
and then look at the op tree, you won't find the symbol "$b", or "b",
anywhere in it. The fetch() op does not hold the name of the variable
$b; rather, it holds a pointer to the value of $b.

If each thread is to have its own value for $b, then the fetch() op
can't hold a pointer to *the* value. Instead, it must hold a pointer
to a map that indexes from thread ID to the value of $b for that
thread. Thread IDs tend to be sparse, so the map can't be implemented
as an array. It will have to be a hash, a B*-tree, a balanced binary
tree, or the like.
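
For concreteness, here is a sketch in C of what fetch() has to do in
this scheme. The names are invented, a linked list stands in for the
hash or tree, and locking of the map itself is left out:

    #include <stdlib.h>

    /* One entry per thread that has touched this variable. */
    typedef struct tid_entry {
        unsigned long     tid;    /* thread ID                   */
        int               value;  /* that thread's private value */
        struct tid_entry *next;
    } tid_entry;

    /* The op no longer points at *the* value; it points at a map. */
    typedef struct {
        tid_entry *entries;
    } per_thread_var;

    static int fetch(per_thread_var *v, unsigned long my_tid) {
        /* Every access searches the map first. */
        for (tid_entry *e = v->entries; e; e = e->next)
            if (e->tid == my_tid)
                return e->value;

        /* First access from this thread: create a fresh value. */
        tid_entry *e = malloc(sizeof *e);
        e->tid     = my_tid;
        e->value   = 0;
        e->next    = v->entries;
        v->entries = e;
        return e->value;
    }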

We can do this: we can build maps. But they take space to build, and
they take time to search, and we incur that space for every variable,
and we incur that time for every variable access.


2. Each thread executes its own copy of the op tree

This breaks down further according to how much of the op tree we copy,
and when we copy it. Here are several possibilities:

2.1 Copy everything at thread creation
This is simple and straightforward. We copy the op tree for every
subroutine in the entire program at thread creation. As we copy the
ops, we create new values for all the variables, and set the new ops
to point to the new values. Obviously, this takes space and time.
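
A sketch of the idea in C, with invented structures and the op tree
flattened into a list of ops:

    #include <stdlib.h>

    typedef struct op {
        int        type;   /* e.g. a fetch, a store, ...          */
        int       *value;  /* the value this op works on, if any  */
        struct op *next;   /* next op (a real tree has children)  */
    } op;

    /* Run once at thread creation, over every sub in the program.
     * (A real copy would also track which ops share a value, so that
     * all the ops for $b end up pointing at the same new cell.)     */
    static op *copy_ops(const op *src) {
        if (src == NULL)
            return NULL;
        op *dst    = malloc(sizeof *dst);
        dst->type  = src->type;
        dst->value = NULL;
        if (src->value != NULL) {
            /* new value cell for the new thread, seeded from the old */
            dst->value  = malloc(sizeof *dst->value);
            *dst->value = *src->value;
        }
        dst->next = copy_ops(src->next);
        return dst;
    }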

2.2 Copy subroutines on demand
We could defer copying subroutines until they are actually called by
the new thread. However, this leads to a problem analogous to the one
discussed in case 1 above. The entersub() op can no longer hold a
pointer to *the* coderef for the subroutine. Instead, it must hold a
pointer to a map that indexes from thread ID to the coderef.

The first time a thread calls a subroutine, it finds that there is no
entry for it in the map, makes a copy of the subroutine for itself,
and enters it into the map. Subsequent calls find the entry in the map
and call the copy immediately. Either way, every subroutine call must
search the map to find the coderef.
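
Here is a sketch in C. Again the names are invented, a list stands in
for the map, and locking of the map is left out:

    #include <stdlib.h>

    typedef struct coderef {   /* stand-in for a compiled subroutine */
        int dummy;
    } coderef;

    typedef struct sub_entry {
        unsigned long     tid;
        coderef          *code;   /* this thread's copy of the sub */
        struct sub_entry *next;
    } sub_entry;

    /* What the entersub() op points at instead of *the* coderef. */
    typedef struct {
        coderef   *template_code;  /* the original, compiled once  */
        sub_entry *copies;         /* thread ID -> per-thread copy */
    } sub_map;

    static coderef *copy_sub(const coderef *src) {
        coderef *dst = malloc(sizeof *dst);
        *dst = *src;               /* real code would copy the op tree */
        return dst;
    }

    /* Every call pays for this search before the sub can be entered. */
    static coderef *entersub(sub_map *s, unsigned long my_tid) {
        for (sub_entry *e = s->copies; e; e = e->next)
            if (e->tid == my_tid)
                return e->code;

        sub_entry *e = malloc(sizeof *e);    /* first call: copy now */
        e->tid    = my_tid;
        e->code   = copy_sub(s->template_code);
        e->next   = s->copies;
        s->copies = e;
        return e->code;
    }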

2.3 Copy just the subroutines we need at thread creation
We could do a control flow analysis to determine the collection of
subroutines that can be called by a thread, and copy just those
subroutines when the thread is created. In this implementation, there
is no thread ID map: the entersub() op holds a pointer to the coderef.

This trades a more complex implementation for greater run-time
efficiency. Constructs like &$foo() are likely to complicate control
flow analysis. We could probably punt on hard cases and make them go
through a thread ID map.
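
Here is a sketch in C of the copy-at-creation walk, assuming the
compiler has already built a call graph. Indirect calls like &$foo()
would not appear in such a graph, which is why the hard cases would
fall back to a thread ID map:

    typedef struct sub {
        struct sub **callees;   /* subs this sub can call directly   */
        int          n_callees;
        int          copied;    /* visited during this thread's copy
                                   (reset per thread creation)       */
    } sub;

    /* Depth-first walk from the thread's entry subroutine, copying
     * everything reachable. No per-call map is needed afterwards.  */
    static void copy_for_thread(sub *s) {
        if (s->copied)
            return;
        s->copied = 1;
        /* ... duplicate s's op tree and values for the new thread ... */
        for (int i = 0; i < s->n_callees; i++)
            copy_for_thread(s->callees[i]);
    }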


RFC 178 describes a shared data model, and there has been enough
discussion of it on perl6-internals that we have some understanding of
its performance characteristics. RFCs for other thread models would
allow us to discuss them in definite terms, and come to some
understanding of their performance characteristics, as well. This
would then be a basis for choosing one model over another. Any
volunteers?


- SWM
