> The problem is, as long as expressions can be within each other,
> and include terms that are multiple expressions, a robust deadlock
> avoidance strategy is required even with cooperative threading.
In order to understand this, we need to think in more detail about how
the Perl interpreter works. Here is a sketch of the internal data
structure that represents a string in Perl:
    struct Scalar
    {
        int   length;
        char *pData;
    };
and here is the C code required to carry out the Perl statement
    $a = $b;

    free(a.pData);                          // 1
    a.length = b.length;                    // 2
    a.pData = malloc(a.length);             // 3
    memcpy(a.pData, b.pData, a.length);     // 4
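
For anyone who wants to compile the sketch, here is one way the same four
steps could be wrapped in a function. The name scalar_assign() and the
error convention are made up for illustration; this is not how the real
Perl internals are written. Unlike the four-line version, it allocates the
new buffer before freeing the old one, so that $a = $a and an
out-of-memory condition do not destroy the value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Same layout as the sketch above: a length plus a heap buffer.
struct Scalar
{
    int   length;
    char *pData;
};

// Hypothetical helper that carries out $a = $b.
// Returns 0 on success, -1 if out of memory (a is then left unchanged).
int scalar_assign(struct Scalar *a, const struct Scalar *b)
{
    char *pNew;

    assert(a != NULL && b != NULL);
    if (a == b)
        return 0;                       // $a = $a: nothing to do

    // Allocate at least one byte so an empty string does not depend on
    // what malloc(0) returns.
    pNew = malloc(b->length > 0 ? (size_t)b->length : 1);
    if (pNew == NULL)
        return -1;
    if (b->length > 0)
        memcpy(pNew, b->pData, (size_t)b->length);

    free(a->pData);                     // statement 1
    a->length = b->length;              // statement 2
    a->pData  = pNew;                   // statements 3 and 4, done up front
    return 0;
}
```

None of this changes the threading argument: there are still several
separate steps, and another thread can still get in between them.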
Now let's look at this in a multi-threaded environment.
    Thread A        Thread B        Thread C
    $a = $b         $a = $c         $b = $d
Preemptive
With preemptive threads, it is entirely possible for threads B and C
to run between statements 3 and 4 in thread A and reallocate
a.pData and b.pData. Thread A's memcpy() may then copy with a length
that no longer matches one of the buffers. This will cause data
corruption and/or heap corruption, and will probably crash the
interpreter.
To prevent this, we have to give every scalar a mutex, and lock that
mutex before we read or write the scalar.
    struct Scalar // only represents strings
    {
        Mutex mutex;
        int   length;
        char *pData;
    };

    lock(a.mutex);
    lock(b.mutex);
    free(a.pData);                          // 1
    a.length = b.length;                    // 2
    a.pData = malloc(a.length);             // 3
    memcpy(a.pData, b.pData, a.length);     // 4
    unlock(a.mutex);
    unlock(b.mutex);
Now, no other thread can operate on $a or $b during statements
1-4. BUT...we have a new problem. Threads A, B, and C are all trying
to lock $a, $b, $c, and $d at the same time. This can lead to deadlock:
as soon as one thread holds $a and waits for $b while another holds $b
and waits for $a (say, $a = $b in one thread and $b = $a in another),
neither can proceed.
There are ways to avoid deadlock, but it's a big interpreter, and if
you don't get it exactly right everywhere, you lose.
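
One well-known way to avoid this particular deadlock is to take the two
locks in a fixed global order, for example by address, no matter which
scalar is on which side of the assignment. The sketch below assumes POSIX
mutexes in place of the generic Mutex type, and scalar_assign_locked() is
a made-up name, not a Perl internals function.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct Scalar // only represents strings
{
    pthread_mutex_t mutex;
    int             length;
    char           *pData;
};

// Hypothetical helper that carries out $a = $b with both mutexes held.
// Returns 0 on success, -1 if out of memory (a is then left unchanged).
int scalar_assign_locked(struct Scalar *a, struct Scalar *b)
{
    struct Scalar *first, *second;
    char *pNew;
    int rc = -1;

    assert(a != NULL && b != NULL);
    if (a == b)
        return 0;   // $a = $a; also avoids locking one mutex twice

    // Always lock the lower address first, so that $a = $b and $b = $a
    // request the two mutexes in the same order.
    if ((uintptr_t)a < (uintptr_t)b) { first = a; second = b; }
    else                             { first = b; second = a; }

    pthread_mutex_lock(&first->mutex);
    pthread_mutex_lock(&second->mutex);

    pNew = malloc(b->length > 0 ? (size_t)b->length : 1);
    if (pNew != NULL) {
        if (b->length > 0)
            memcpy(pNew, b->pData, (size_t)b->length);
        free(a->pData);
        a->length = b->length;
        a->pData  = pNew;
        rc = 0;
    }

    pthread_mutex_unlock(&second->mutex);
    pthread_mutex_unlock(&first->mutex);
    return rc;
}
```

This handles one assignment. The hard part is that every place in the
interpreter that takes more than one lock has to follow the same ordering
rule, which is exactly the "get it right everywhere" problem.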
Cooperative
With cooperative threads, there is no way any other thread can
execute during the assignment, unless the C code does a yield().
(This is the *definition* of cooperative).
    free(a.pData);                          // 1
    a.length = b.length;                    // 2
    a.pData = malloc(a.length);             // 3
    yield();                                // DON'T DO THIS!!!
    memcpy(a.pData, b.pData, a.length);     // 4
You don't need mutexes, and without mutexes, you can't have deadlock.
Others have pointed out that code inside sub-expressions and blocks
could also assign to our variables. This is true, but it isn't our
problem. As long as each assignment is carried out correctly by the
interpreter, then each variable always has a valid value, computed
from other valid values. For example,
    $a = "abcd";
    $b = "wxyz";

    Thread A        Thread B
    $a = $b;        $b = $a;
All the interpreter guarantees is:
1. It won't crash.
2. After both threads run, $a and $b each hold a complete value, either
   "abcd" or "wxyz" (and not, for example, "abyz").
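
To make the guarantee concrete, here is a toy model of a cooperative
interpreter, written only for this example: each "thread" is a list of
assignments, and the scheduler switches threads only between whole
assignments, never inside one. The Op and Thread structs and the
functions assign(), run_all(), and scalar_from() are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct Scalar { int length; char *pData; };

// One interpreter-level operation: dst = src.
struct Op { struct Scalar *dst; const struct Scalar *src; };

// A toy cooperative thread: a list of operations and a cursor.
struct Thread { const struct Op *ops; int count; int next; };

// Build a Scalar holding a copy of a C string (without the trailing NUL).
struct Scalar scalar_from(const char *s)
{
    struct Scalar v;
    v.length = (int)strlen(s);
    v.pData = malloc(v.length > 0 ? (size_t)v.length : 1);
    if (v.pData == NULL)
        abort();
    memcpy(v.pData, s, (size_t)v.length);
    return v;
}

// The four statements from the sketch, with no yield() anywhere inside.
void assign(struct Scalar *a, const struct Scalar *b)
{
    assert(a != NULL && b != NULL);
    if (a == b)
        return;
    free(a->pData);                                             // 1
    a->length = b->length;                                      // 2
    a->pData = malloc(a->length > 0 ? (size_t)a->length : 1);   // 3
    if (a->pData == NULL)
        abort();
    if (a->length > 0)
        memcpy(a->pData, b->pData, (size_t)a->length);          // 4
}

// Round-robin scheduler: a thread switch can only happen between
// operations, which is where a real interpreter would yield.
void run_all(struct Thread *threads, int n)
{
    int remaining = 1;
    while (remaining) {
        remaining = 0;
        for (int i = 0; i < n; i++) {
            struct Thread *t = &threads[i];
            if (t->next < t->count) {
                const struct Op *op = &t->ops[t->next++];
                assign(op->dst, op->src);
                remaining = 1;
            }
        }
    }
}
```

With thread A listed first, $a = $b runs to completion before $b = $a
starts, so both variables end up as "wxyz". With thread B listed first,
both end up as "abcd". Either way each variable holds a whole string.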
If the user cares whether the final values of $a and $b are "abcd" or
"wxyz", then they have to do their own synchronization.
- SWM