Ok, this is going to be a long one, so please bear with me.

I'll start with a question.

1. std.algorithm.move() and std.container

TDPL describes when the compiler can and cannot perform a move automatically. For the cases where it isn't done automatically but we explicitly want a move, we have std.algorithm.move(). This function comes in extremely handy when we want to pass around data that cannot otherwise be copied (i.e. this(this) is @disabled), for example when sinking some unique value into a thread, or storing it in a container.
Or even both.
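
To make the move() case concrete, here is a minimal sketch. Unique and consume() are names I made up for illustration; the only library call is std.algorithm.move().

```d
import std.algorithm : move;

// Hypothetical uncopyable type: the postblit is disabled.
struct Unique
{
    int handle;
    @disable this(this);
}

// Hypothetical sink that takes its argument by value.
int consume(Unique u)
{
    return u.handle;
}

void main()
{
    auto a = Unique(42);

    // Passing the lvalue would require a copy, so this is rejected.
    static assert(!__traits(compiles, consume(a)));

    // An explicit move hands the value over without copying it.
    assert(consume(move(a)) == 42);
}
```

The same pattern is what I'd like to be able to write against a container's insert().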

But why is there no practical way of storing such uncopyable data in the standard containers? Both Array and DList try to perform a copy when insert() is called, and happily fail when this(this) is @disabled. Same with access: front() returns by value, so again no luck with @disabled this(this).

What is interesting though is that range interfaces for containers
do allow for moveFront() et al., and for some containers they're even defined.
So it's safe to move contents *out* but not *in*?
Is there some deeper technical reasoning behind this that I fail to see?




Below is a medium rant that's somewhat unrelated to the above, and is
aimed at receiving insights from those who're interested,
so if you're not, just skip it :)

I'll use quotes here to distinguish the language qualifier from the ordinary word.
These are mostly my current thoughts on "shared" and its usage. I'd like you to point out where I'm wrong on this sensitive topic; any feedback is greatly appreciated.


2. "shared" is transitive. How transitive?

Declaring something as "shared" means that its whole representation is also "shared". This is a good thing, right? But it does have certain implications. Consider a shared data structure (for example, a (multi)producer (multi)consumer queue). If it's designed to store anything that has indirections (pointers, references), those had better be either provably unique (not possible in D except for immutable data) or "shared". In fact, the whole stored type should just be "shared", which is enforced by the compiler. Thus, we come to this:

shared class Queue(T) {
        private Container!T q;  // that'll be shared(Container!T),
                                // which will in turn store shared(T)
        alias shared(T) Type;

        // push and pop are of course synchronized
        void push(Type) { ... }
        Type pop() { ... }
}

But when there are no indirections (i.e. a primitive type or, more practically, a struct with some primitive fields and a bunch of methods that reason about that data or maybe do something with it), it all comes down to usage. In the case of that queue, no two threads could possibly access the same data simultaneously.
Let me define such a type real quick (somewhat contrived, but it should state the intent):

struct Packet {
        ulong ID;
        ubyte[32] header;
        ubyte[64] data;

        string type() inout @property { ... }
        ulong checkSum() inout @property { ... }
        Variant payload() inout @property { ... }
}

Note that I do have arrays in there, but they cannot possibly introduce any aliasing,
since they're static.

As soon as a producer pushes such a value, it releases ownership of it, and some consumer later gains ownership. Remember, there are no indirections, so no two threads could race against the same data. But I cannot just declare a plain struct and then start pushing it into that queue. It wouldn't work, because the queue expects a "shared" type.

One solution would be to use a cast. On one hand, this is feasible: such data is really only logically shared when it's somewhere "in-between" threads, i.e. sitting in a queue. The "shared" queue owns the data for a moment, and thus makes the data itself "shared". As soon as a consumer pops that value off the queue, it can be cast back to non-"shared".

This is ugly: it imposes a certain convention for handling one "shared" type (the queue) together with another non-"shared" one (the struct), i.e. "always cast when you push or pop". Convention is not a reasonable justification for bypassing the type system.

Another solution would be to instantiate those structs as "shared" in the first place. But that won't work either: now any methods that those structs have must also have "shared" overloads. In other words, I suddenly need to provide a "shared" interface for my struct. Well ok, I can do that trivially, by just declaring the whole struct type as "shared". But this is wrong. "shared" advertises a certain promise: this data is allowed to be accessed by more than one thread at a time. This implies that access to the data had better be synchronized. In other words, I would have to actually *write* the synchronization for something that would never *need* synchronization. If I don't do it and simply leave the struct declared as "shared" (or provide all the relevant "shared" overloads), I'm shooting someone in the foot: imagine that later someone starts using my types, sees that this struct is declared "shared" and happily assumes that it can be used concurrently. In this contrived example it would be easy to see that's not actually the case. But reality is cruel.
Bang.
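
A small sketch of the "shared overloads" problem, assuming a cut-down Packet whose checkSum() body is a stand-in I invented (the real one is elided above):

```d
struct Packet
{
    ulong id;
    ubyte[32] header;
    ubyte[64] data;

    // Stand-in body for illustration only.
    ulong checkSum() inout @property { return id; }
}

void main()
{
    Packet local;
    local.id = 5;
    assert(local.checkSum == 5);   // fine on an unqualified instance

    shared Packet inFlight;
    // inout stands in for mutable, const and immutable, but not for
    // shared, so the very same method is rejected on a shared instance.
    static assert(!__traits(compiles, inFlight.checkSum));
}
```

So the struct either grows a second, "shared" set of methods, or its users cast.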

So ideally I'd want the queue to handle this situation for me, and luckily I can:

shared class Queue(T) {
        static if (hasUnsharedAliasing!T) {
                private Container!T q;           // as before
                alias shared(T) Type;
        } else {
                private __gshared Container!T q; // nothing is "shared" here
                alias T Type;
        }

        // push and pop are still synchronized :)
        void push(Type) { ... }
        Type pop() { ... }
}
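
For reference, this is roughly what the static if keys on. std.traits.hasUnsharedAliasing is the real Phobos trait; QueueElement, WithPointer and WithSharedPointer are hypothetical helpers that exist only in this sketch.

```d
import std.traits : hasUnsharedAliasing;

struct Packet
{
    ulong id;
    ubyte[32] header;   // static arrays are stored in place
    ubyte[64] data;
}

struct WithPointer       { int* p; }          // mutable, thread-local indirection
struct WithSharedPointer { shared(int)* p; }  // indirection to shared data only

// Hypothetical helper: picks the element type the way the Queue above does.
template QueueElement(T)
{
    static if (hasUnsharedAliasing!T)
        alias QueueElement = shared(T);
    else
        alias QueueElement = T;
}

static assert(!hasUnsharedAliasing!Packet);
static assert( hasUnsharedAliasing!WithPointer);
static assert(!hasUnsharedAliasing!WithSharedPointer);
static assert(!hasUnsharedAliasing!string);    // immutable(char)[]
static assert( hasUnsharedAliasing!(int[]));

static assert(is(QueueElement!Packet == Packet));
static assert(is(QueueElement!WithPointer == shared(WithPointer)));

void main() {}
```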

But that's not the end of it. As seen from that definition, I'm using some container (Container!T) as the actual storage. The current definition of Queue requires that Container!T have a "shared" interface. Either that, or I implement the whole storing business myself right there in the Queue. The latter is certainly not feasible, especially since, depending on the requirements for the Queue, I may need different storage capabilities.
In short: I don't need that container to be "shared" at all (provided it's a sane container that doesn't do anything else with the data except for storing it). And in fact, if it were "shared" already, I wouldn't need to define Queue at all, I'd just use Container directly.

Therefore, final iteration of Queue would look like this:

shared class Queue(T) {
        static if (hasUnsharedAliasing!T)
                alias shared(T) Type;
        else
                alias T Type;

        private __gshared Container!Type q; // caveat: a __gshared field is implicitly static

        // still synchronized :)
        void push(Type) { ... }
        Type pop() { ... }      
}

All synchronization issues are handled by Queue; Container merely stores the data,
which is in turn "shared" or not depending on its representation.

This is conceptually how std.concurrency works: it allows you to send and receive plain value types without imposing any "shared" qualification on them, but as soon as you try to send a non-"shared" reference type or a struct with non-"shared" pointers,
it won't allow it.
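
A minimal sketch of that std.concurrency behaviour, assuming the same cut-down Packet as a stand-in; spawn, send, receiveOnly and ownerTid are the actual library functions, everything else is illustrative.

```d
import std.concurrency;

struct Packet
{
    ulong id;
    ubyte[32] header;
    ubyte[64] data;
}

// Receives one Packet by value and reports its id back to the owner.
void consumer()
{
    auto p = receiveOnly!Packet();
    ownerTid.send(p.id);
}

void main()
{
    auto tid = spawn(&consumer);

    Packet p;
    p.id = 7;
    tid.send(p);                        // plain value type: no "shared" needed
    assert(receiveOnly!ulong() == 7);

    // A pointer to mutable thread-local data is refused at compile time.
    int* ptr;
    static assert(!__traits(compiles, tid.send(ptr)));
}
```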

There is, however, a snag with this: __gshared is not @safe. But getting rid of it would mean only one thing: the Queue could only ever store shared(T), which
kind of kills the initial point.

So, is "shared" really not as transitive as D wants it to be?

I imagine by now you already have a big list of "you're incorrect"s to stick in my face,
or you probably have already stopped reading :)

I have some more doubts regarding my handling of "shared", but I'll leave them for later so as
to not bore you to death.
