A question about move() and a rant about shared

Stanislav Blinov Fri, 24 Jan 2014 09:11:16 -0800

Ok, this is going to be a long one, so please bear with me.


I'll start with a question.

1. std.algorithm.move() and std.container

TDPL describes when a compiler can and cannot perform a move automatically. For cases when it isn't done automatically but we explicitly require a move, we have std.algorithm.move(). This function comes in extremely handy, especially when we want to pass around data that cannot otherwise be copied (disabled this(this)). For example, sinking some unique value into a thread or storing it in a container.

Or even both.

But why is there no practical way of storing such uncopyable data in standard containers? I.e. both Array and DList try to perform a copy when insert() is called, and happily fail when this(this) is @disabled. Same with access: front() returns by value, so again no luck with @disabled this(this).
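To make the starting point concrete, here is a minimal sketch of what move() buys us for a non-copyable type. Token and consume() are made-up names for illustration only; the destructor is there on purpose, since move() is documented to reset the source to .init for structs that define a destructor or postblit.

```d
import std.algorithm : move;

// Hypothetical non-copyable type, for illustration only.
struct Token
{
    int id;
    @disable this(this); // copying is forbidden
    ~this() {}           // a destructor makes move() reset the source to .init
}

// Takes its argument by value, so callers must hand over an rvalue.
int consume(Token t)
{
    return t.id;
}

unittest
{
    auto a = Token(42);
    // Token b = a;     // would not compile: Token is not copyable
    Token b = move(a);  // explicit move instead of a copy
    assert(b.id == 42);
    assert(a.id == 0);  // the source was reset to Token.init

    // Sinking the value into a function works the same way.
    assert(consume(move(b)) == 42);
}
```

The unittest block can be run with something like dmd -unittest -main.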

What is interesting though is that range interfaces for containers do allow for moveFront() et al., and for some containers they're even defined.

So it's safe to move contents *out* but not *in*?

Is there some deeper technical reasoning behind this that I fail to see?
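For the sake of discussion, this is roughly what I'd expect "moving in" to look like. It is only a sketch under my own assumptions: MoveOnlyStack and Handle are hypothetical, the storage is a fixed-size array to keep things short, and none of this is how Array or DList are actually implemented.

```d
import std.algorithm : move;

// Hypothetical non-copyable element type.
struct Handle
{
    int id;
    @disable this(this);
    ~this() {}
}

// Hypothetical fixed-capacity stack that accepts and returns
// elements by move only, never by copy.
struct MoveOnlyStack(T, size_t capacity)
{
    private T[capacity] slots;
    private size_t count;

    // The caller passes an rvalue (e.g. the result of move()),
    // which is then moved into the next free slot.
    void push(T value)
    {
        assert(count < capacity, "stack is full");
        move(value, slots[count]);
        ++count;
    }

    // Moves the top element out to the caller.
    T pop()
    {
        assert(count > 0, "stack is empty");
        --count;
        return move(slots[count]);
    }
}

unittest
{
    MoveOnlyStack!(Handle, 4) stack;
    auto h = Handle(7);
    stack.push(move(h)); // move in
    auto back = stack.pop(); // move out
    assert(back.id == 7);
}
```

So at least for a hand-rolled container nothing seems to prevent insertion by move; the question is why the standard ones don't offer it.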

Below is a medium rant that's somewhat unrelated to the above, and is aimed at receiving insights from those who're interested, so if you're not, just skip it :)

I'll use quotes here to distinguish words from language qualifier.

These are mostly my current thoughts on "shared" and its usage, and I'd like you to point out where I'm wrong on this sensitive topic. Any feedback is greatly appreciated.


2. "shared" is transitive. How transitive?

Declaring something as "shared" means that all its representation is also "shared". This is a good thing, right? But it does have certain implications. Consider a shared data structure (for example, a (multi)producer (multi)consumer queue). If it's designed to store anything that has indirections (pointers, references), those had better be either provably unique (not possible in D except for immutable data), or "shared". In fact, the whole stored type should just be "shared", which is enforced by the compiler. Thus, we come to this:

shared class Queue(T) {
        private Container!T q;  // that'll be shared(Container!T),
                                // which will in turn store shared(T)
        alias shared(T) Type;

        // push and pop are of course synchronized
        void push(Type) { ... }
        Type pop() { ... }
}

But when there are no indirections (i.e. a primitive type, or, more practically, a struct with some primitive fields and a bunch of methods that reason about that data or maybe do something with it), it all comes down to usage. In the case of that queue, no two threads could possibly access the same data simultaneously.

Let me define it real quick (somewhat contrived but should state the intent):


struct Packet {
        ulong ID;
        ubyte[32] header;
        ubyte[64] data;

        string type() inout @property { ... }
        ulong checkSum() inout @property { ... }
        Variant payload() inout @property { ... }
}

Note that I do have arrays in there, but they cannot possibly introduce any aliasing, since they're static.
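This claim can be checked at compile time with std.traits.hasUnsharedAliasing. The Packet below is a trimmed copy of the struct above (fields only, methods left out), and Node is a made-up counterexample with a raw pointer.

```d
import std.traits : hasUnsharedAliasing;

// Trimmed copy of Packet: fields only.
struct Packet
{
    ulong ID;
    ubyte[32] header;
    ubyte[64] data;
}

// Hypothetical counterexample: a mutable, thread-local pointer.
struct Node
{
    int value;
    Node* next;
}

// Static arrays are stored inline, so Packet has no aliasing at all.
static assert(!hasUnsharedAliasing!Packet);

// A raw pointer or a mutable slice is unshared aliasing.
static assert(hasUnsharedAliasing!Node);
static assert(hasUnsharedAliasing!(int[]));
```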

As soon as a producer pushes such a value, it releases ownership of it, and some consumer later gains ownership. Remember, there are no indirections, so no two threads could race against the same data. But I cannot just declare a plain struct and then start pushing it into that queue. It wouldn't work, because the queue expects a "shared" type.

One solution would be to use a cast. On one hand, this is feasible: such data is really only logically shared when it's somewhere "in-between" threads, i.e. sits in a queue. The "shared" queue owns the data for a moment, and thus makes the data itself "shared". As soon as a consumer pops that value off the queue, it can be cast back to non-"shared".

This is ugly: it imposes a certain convention on handling one "shared" type (the queue) together with another non-"shared" one (the struct), i.e. "always cast when you push or pop". Convention is not a reasonable justification for bypassing the type system.

Another solution would be to instantiate those structs as "shared" in the first place. But that won't work either: now any methods that those structs have must also have "shared" overloads. In other words, I suddenly need to provide a "shared" interface for my struct. Well ok, I can do that trivially, by just declaring the whole struct type as "shared". But this is wrong. "shared" advertises a certain promise: this data is allowed to be accessed by more than one thread at a time. This implies that access to the data had better be synchronized. In other words, I would have to actually *write* the synchronization for something that would never *need* synchronization. If I don't do it and simply leave the struct declared as "shared" (or have all relevant "shared" overloads), I'm shooting someone in the foot: imagine that later someone starts using my types, sees that this struct is declared "shared" and happily assumes that it can be used concurrently. In that contrived example it would be easy to see that's not actually the case. But reality is cruel.

Bang.

So ideally I'd want the queue to handle this situation for me, and luckily I can:


shared class Queue(T) {
        static if (hasUnsharedAliasing!T) {
                private Container!T q;           // as before
                alias shared(T) Type;
        } else {
                private __gshared Container!T q; // nothing is "shared" here
                alias T Type;
        }

        // push and pop are still synchronized :)
        void push(Type) { ... }
        Type pop() { ... }
}

But that's not the end of it. As seen from that definition, I'm using some container (Container(T)) as actual storage. The current definition of Queue requires that Container(T) have a "shared" interface. Either that, or I implement the whole storing business myself right there in the Queue. The latter is certainly not feasible, especially since, depending on the requirements for the Queue, I may need different storage capabilities.

In short: I don't need that container to be "shared" at all (provided it's a sane container that doesn't do anything else with the data except for storing it). And in fact, if it were "shared" already, I wouldn't need to define Queue at all, I'd just use Container directly.


Therefore, final iteration of Queue would look like this:

shared class Queue(T) {
        static if (hasUnsharedAliasing!T)
                alias shared(T) Type;
        else
                alias T Type;

        private __gshared Container!Type q;

        // still synchronized :)
        void push(Type) { ... }
        Type pop() { ... }      
}

All synchronization issues are handled by Queue, Container merely stores the data, which is in turn "shared" or not depending on its representation.
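Here is a compilable sketch of that idea, with several assumptions of my own. A plain dynamic array stands in for Container, tryPop() replaces pop() so that an empty queue can be reported, and Packet is again trimmed to its fields. One more deviation: a __gshared member behaves like a static (one per program, not one per instance), so the sketch keeps ordinary per-instance storage and casts away "shared" while holding the lock instead. I have only exercised it with an alias-free T.

```d
import std.traits : hasUnsharedAliasing;

struct Packet
{
    ulong ID;
    ubyte[32] header;
    ubyte[64] data;
}

final class Queue(T)
{
    // Store shared(T) only when T could alias thread-local data.
    static if (hasUnsharedAliasing!T)
        alias Type = shared(T);
    else
        alias Type = T;

    private Type[] items; // assumption: stand-in for Container!Type

    void push(Type value) shared
    {
        synchronized (this)
        {
            // We hold the lock, so treating the storage as unshared
            // is a deliberate, unchecked assumption.
            auto self = cast(Queue!T) this;
            self.items ~= value;
        }
    }

    // Returns false when the queue is empty.
    bool tryPop(out Type value) shared
    {
        synchronized (this)
        {
            auto self = cast(Queue!T) this;
            if (self.items.length == 0)
                return false;
            value = self.items[0];
            self.items = self.items[1 .. $];
            return true;
        }
    }
}

unittest
{
    static assert(is(Queue!Packet.Type == Packet)); // no "shared" imposed
    auto q = new shared(Queue!Packet);
    Packet p;
    p.ID = 9;
    q.push(p); // plain Packet goes in, no cast at the call site
    Packet r;
    assert(q.tryPop(r) && r.ID == 9);
}
```

The cast is still there, but it lives in one place inside the queue rather than at every push and pop.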

This is conceptually how std.concurrency works: it allows you to send and receive plain value types without imposing any "shared" qualification on them, but as soon as you try to send a non-"shared" reference type or a struct with non-"shared" pointers, it won't allow it.
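A small sketch of that behaviour, using only spawn, send, receiveOnly and ownerTid from std.concurrency. Packet is trimmed to its fields, while worker(), roundTrip() and Node are names I made up for the example.

```d
import std.concurrency;

struct Packet
{
    ulong ID;
    ubyte[32] header;
    ubyte[64] data;
}

// Hypothetical struct with a thread-local pointer.
struct Node
{
    int value;
    Node* next;
}

// Receives one Packet by value and replies with its ID plus one.
void worker()
{
    auto p = receiveOnly!Packet();
    ownerTid.send(p.ID + 1);
}

// Sends a plain, non-"shared" Packet to a new thread and waits for the reply.
ulong roundTrip(ulong id)
{
    auto tid = spawn(&worker);
    Packet p;
    p.ID = id;
    tid.send(p); // no "shared" qualifier and no cast needed
    return receiveOnly!ulong();
}

unittest
{
    assert(roundTrip(41) == 42);

    // Sending something with unshared aliasing is rejected at compile time.
    static assert(!__traits(compiles, send(Tid.init, Node.init)));
}
```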

There is, however, a nag with this: __gshared is not @safe. But getting rid of it would mean only one thing: the Queue could only ever store shared(T), which kind of kills the initial message.

So, is "shared" really not as transitive as D wants it to be?

I imagine by now you already have a big list of "you're incorrect"s to stick in my face, or you probably have already stopped reading :)

I have some more doubts regarding my handling of "shared", but I'll leave them for later so as to not bore you to death.
