Ok, this is going to be a long one, so please bear with me.
I'll start with a question.
1. std.algorithm.move() and std.container
TDPL describes when the compiler can and cannot perform a move
automatically. For cases where it isn't done automatically but we
explicitly require a move, we have std.algorithm.move(). This function
comes in extremely handy, especially when we want to pass around data
that cannot otherwise be copied (@disabled this(this)), for example
when sinking some unique value into a thread or storing it in a
container. Or even both.
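As a minimal sketch of the explicit move described above (the Token struct and the consume() and demo() functions are hypothetical names made up for illustration), something like this should work with a type whose postblit is disabled:

```d
import std.algorithm : move;

// Hypothetical non-copyable type: the postblit is disabled,
// so values can only be moved, never copied.
struct Token
{
    int id;
    @disable this(this);
}

// Takes its argument by value, so the caller has to hand over
// an rvalue: either a temporary or the result of move().
int consume(Token t)
{
    return t.id;
}

// Moves a Token through a second local variable and into consume().
int demo()
{
    auto a = Token(42);
    // auto b = a;        // would not compile: Token is not copyable
    auto b = move(a);     // explicit move; `a` should not be relied on afterwards
    return consume(move(b));
}
```

Here demo() only ever transfers the value, so no copy is requested at any point.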
But why is there no practical way of storing such uncopyable data in
the standard containers? I.e. both Array and DList try to perform a
copy when insert() is called, and happily fail when this(this) is
@disabled. Same with access: front() returns by value, so again no
luck with @disabled this(this).
What is interesting though is that the range interfaces for containers
do allow for moveFront() et al., and for some containers they're even
defined.
So it's safe to move contents *out* but not *in*?
Is there some deeper technical reasoning behind this that I fail to
see?
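To make the question concrete, here is a rough sketch of what "moving in" could look like. FixedStack, Token and demo() are hypothetical names, not anything from std.container, and the fixed-capacity storage is only an assumption made to keep the example short:

```d
import std.algorithm : move;

// Hypothetical non-copyable element type.
struct Token
{
    int id;
    @disable this(this);
}

// Hypothetical fixed-capacity stack that never copies its elements.
struct FixedStack(T, size_t capacity)
{
    private T[capacity] slots;
    private size_t count;

    // By-value parameter: callers pass a temporary or move(x).
    void push(T value)
    {
        assert(count < capacity, "stack is full");
        move(value, slots[count]); // move *in* to the slot
        ++count;
    }

    T pop()
    {
        assert(count > 0, "stack is empty");
        --count;
        return move(slots[count]); // move *out* of the slot
    }
}

int demo()
{
    FixedStack!(Token, 2) s;
    s.push(Token(7));      // temporary, moved in
    auto t = Token(9);
    s.push(move(t));       // lvalue, explicitly moved in
    auto first = s.pop();  // holds id 9
    auto second = s.pop(); // holds id 7
    return first.id * 10 + second.id;
}
```

Both directions go through std.algorithm.move(), so at least for this toy container nothing seems to require a copy.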
Below is a medium-sized rant that's somewhat unrelated to the above,
aimed at receiving insights from those who are interested, so if
you're not, just skip it :)
I'll use quotes here to distinguish the language qualifier from
ordinary words.
These are mostly my current thoughts on "shared" and its usage, and
I'd like you to point out where I'm wrong on this sensitive topic;
any feedback is greatly appreciated.
2. "shared" is transitive. How transitive?
Declaring something as "shared" means that all of its representation
is also "shared". This is a good thing, right? But it does have
certain implications. Consider a shared data structure (for example, a
(multi)producer (multi)consumer queue). If it's designed to store
anything that has indirections (pointers, references), those had
better be either provably unique (not possible in D except for
immutable data) or "shared". In fact, the whole stored type should
just be "shared", which is enforced by the compiler. Thus, we come to
this:
    shared class Queue(T) {
        private Container!T q; // that'll be shared(Container!T),
                               // which will in turn store shared(T)
        alias shared(T) Type;
        // push and pop are of course synchronized
        void push(Type) { ... }
        Type pop() { ... }
    }
But when there are no indirections (i.e. a primitive type or, more
practically, a struct with some primitive fields and a bunch of
methods that reason about that data or maybe do something with it),
it all comes down to usage. In the case of that queue, no two threads
could possibly access the same data simultaneously.
Let me define such a struct real quick (somewhat contrived, but it
should state the intent):
    struct Packet {
        ulong ID;
        ubyte[32] header;
        ubyte[64] data;
        string type() inout @property { ... }
        ulong checkSum() inout @property { ... }
        Variant payload() inout @property { ... }
    }
Note that I do have arrays in there, but they cannot possibly
introduce any aliasing,
since they're static.
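The "no aliasing" claim can be checked mechanically with std.traits.hasUnsharedAliasing. The three structs below are hypothetical stand-ins (Plain mirrors the fields of Packet without its methods), so treat this as a sketch of the check rather than anything definitive:

```d
import std.traits : hasUnsharedAliasing;

// Only value fields and static arrays: no indirections at all.
struct Plain      { ulong id; ubyte[32] header; ubyte[64] data; }

// A slice is a pointer plus a length, so it aliases mutable data.
struct WithSlice  { ulong id; ubyte[] data; }

// string is immutable(char)[], and immutable data is safe to share.
struct WithString { ulong id; string label; }

static assert(!hasUnsharedAliasing!Plain);
static assert( hasUnsharedAliasing!WithSlice);
static assert(!hasUnsharedAliasing!WithString);
```

Swapping a static array for a slice is enough to flip the result, which is exactly the distinction I care about here.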
As soon as a producer pushes such a value, it releases ownership of
it, and some consumer later gains ownership. Remember, there are no
indirections, so no two threads could race on the same data. But I
cannot just declare a plain struct and then start pushing it into
that queue. It wouldn't work, because the queue expects a "shared"
type.
One solution would be to use a cast. On one hand, this is feasible:
such data is really only logically shared when it's somewhere
"in-between" threads, i.e. sits in a queue. The "shared" queue owns
the data for a moment, and thus makes the data itself "shared". As
soon as a consumer pops that value off the queue, it can be cast back
to non-"shared".
On the other hand, this is ugly: it imposes a convention for handling
one "shared" type (the queue) together with another non-"shared" one
(the struct), i.e. "always cast when you push or pop". Convention is
not a reasonable justification for bypassing the type system.
Another solution would be to instantiate those structs as "shared" in
the first place. But that won't work either: now any methods those
structs have must also have "shared" overloads. In other words, I
suddenly need to provide a "shared" interface for my struct. Well ok,
I can do that trivially, by just declaring the whole struct type as
"shared". But this is wrong. "shared" advertises a certain promise:
this data is allowed to be accessed by more than one thread at a
time. This implies that access to the data had better be
synchronized. In other words, I would have to actually *write* the
synchronization for something that would never *need*
synchronization. If I don't do it and simply leave the struct
declared as "shared" (or provide all the relevant "shared"
overloads), I'm shooting someone in the foot: imagine that later
someone starts using my types, sees that this struct is declared
"shared" and happily assumes that it can be used concurrently. In
that contrived example it would be easy to see that's not actually
the case. But reality is cruel.
Bang.
So ideally I'd want the queue to handle this situation for me,
and luckily I can:
    shared class Queue(T) {
        static if (hasUnsharedAliasing!T) {
            private Container!T q; // as before
            alias shared(T) Type;
        } else {
            private __gshared Container!T q; // nothing is "shared" here
            alias T Type;
        }
        // push and pop are still synchronized :)
        void push(Type) { ... }
        Type pop() { ... }
    }
But that's not the end of it. As seen from that definition, I'm using
some container (Container(T)) as the actual storage. The current
definition of Queue requires that Container(T) have a "shared"
interface. Either that, or I implement the whole storing business
myself right there in the Queue. The latter is certainly not
feasible, especially since, depending on the requirements for the
Queue, I may need different storage capabilities.
In short: I don't need that container to be "shared" at all (provided
it's a sane container that doesn't do anything with the data except
store it). And in fact, if it were "shared" already, I wouldn't need
to define Queue at all, I'd just use Container directly.
Therefore, the final iteration of Queue would look like this:
    shared class Queue(T) {
        static if (hasUnsharedAliasing!T)
            alias shared(T) Type;
        else
            alias T Type;
        private __gshared Container!Type q;
        // still synchronized :)
        void push(Type) { ... }
        Type pop() { ... }
    }
All synchronization issues are handled by Queue, Container merely
stores the data,
which is in turn "shared" or not depending on its representation.
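For anyone who wants to poke at this, here is a compilable sketch of the same idea. It deliberately swaps one technique: instead of __gshared (which, when applied to a member, also makes it static, i.e. one container for all instances), it keeps a per-instance array and casts away "shared" while the object's monitor is held. SketchQueue, Reading, tryPop() and demo() are hypothetical names, and a plain array stands in for Container:

```d
import std.traits : hasUnsharedAliasing;

// Hypothetical payload without indirections.
struct Reading { ulong id; ubyte[8] data; }

class SketchQueue(T)
{
    // Only types that carry unshared indirections get qualified.
    static if (hasUnsharedAliasing!T)
        alias Type = shared(T);
    else
        alias Type = T;

    private Type[] items; // per-instance storage, stand-in for Container

    void push(Type value) shared
    {
        synchronized (this)
        {
            // Assumption: holding the monitor makes it acceptable
            // to treat the storage as unshared for the moment.
            auto self = cast(SketchQueue!T) this;
            self.items ~= value;
        }
    }

    // Returns false when the queue is empty instead of blocking.
    bool tryPop(ref Type value) shared
    {
        bool ok = false;
        synchronized (this)
        {
            auto self = cast(SketchQueue!T) this;
            if (self.items.length > 0)
            {
                value = self.items[0];
                self.items = self.items[1 .. $];
                ok = true;
            }
        }
        return ok;
    }
}

bool demo()
{
    auto q = new shared(SketchQueue!Reading)();
    Reading r;
    r.id = 5;
    q.push(r);                 // plain, unqualified struct goes in
    Reading got;
    bool first = q.tryPop(got);
    bool second = q.tryPop(got);
    return first && got.id == 5 && !second;
}
```

The casts are still not @safe, so this does not remove the problem, it only moves it inside the synchronized blocks.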
This is conceptually how std.concurrency works: it allows you to
send and receive
plain value types without imposing any "shared" qualification on
them, but as soon
as you try to send a non-"shared" reference type or a struct with
non-"shared" pointers,
it won't allow it.
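A small sketch of that behaviour, using only spawn, send, receiveOnly and ownerTid from std.concurrency (the Reading struct and the worker() and roundTrip() functions are hypothetical names for illustration):

```d
import std.concurrency : ownerTid, receiveOnly, send, spawn;

// Hypothetical plain value type: no "shared" qualification anywhere.
struct Reading { ulong id; ubyte[8] data; }

// Receives one Reading and answers the owner with id + 1.
void worker()
{
    auto r = receiveOnly!Reading();
    ownerTid.send(r.id + 1);
}

ulong roundTrip(ulong id)
{
    auto tid = spawn(&worker);
    Reading r;
    r.id = id;
    tid.send(r);              // fine: Reading has no indirections
    // int x;
    // tid.send(&x);          // rejected at compile time: int* aliases
    //                        // mutable thread-local data
    return receiveOnly!ulong();
}
```

The value crosses the thread boundary by copy, so neither side ever needs to see it as "shared".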
There is, however, a snag with this: __gshared is not @safe. But
getting rid of it would mean only one thing: the Queue could only
ever store shared(T), which kind of defeats the initial point.
So, is "shared" really not as transitive as D wants it to be?
I imagine by now you already have a big list of "you're
incorrect"s to stick in my face,
or you probably have already stopped reading :)
I have some more doubts regarding my handling of "shared", but
I'll leave them for later so as
to not bore you to death.