On Monday, 20 October 2014 at 13:29:47 UTC, Marco Leise wrote:
What guarantees is shared supposed to provide?

Shared means that multiple threads can access the data. The
guarantee is that if data is neither shared nor immutable,
only the current thread can see it.
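
For illustration, here's a minimal sketch of that default (the variable names are mine): module-level variables in D are thread-local unless explicitly marked shared.

import core.thread;

int local;         // thread-local by default: each thread gets its own copy
shared int global; // a single instance, visible to every thread

void main() {
    local = 1;
    auto t = new Thread({
        assert(local == 0); // this thread's fresh copy, not main's 1
    });
    t.start();
    t.join();
}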

What if I have a thread that contains some shared data? Should
the thread be created as shared, be cast to shared after
construction, or not be shared at all, with fine-grained
shared applied to the respective data fields?

Since Thread is by its very nature a shared thing, Thread should probably be defined as shared. But more generally it depends on the use case.


What does shared have to do with synchronization?

Only shared data can be synchronized. It makes no sense to
synchronize thread local data.

Define synchronized. With atomic ops on word-sized items this
is clear, but what does it mean for aggregates? The language
shows us no connection between synchronization and the shared
data. What is one unit of data that is to be synchronized?

I think there's some conflation of two separate uses of "synchronized" here. I think the above is actually talking about synchronized methods (ie. involving a mutex).
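
For reference, the two forms look roughly like this (my sketch, using core.sync.mutex):

import core.sync.mutex;

class Counter {
    private int count;

    // synchronized *method*: the compiler locks this object's monitor
    // for the duration of the call
    synchronized void increment() { ++count; }
}

// synchronized *statement*: locks an explicit object around a block
void bump(Mutex m, ref int x) {
    synchronized (m) { ++x; }
}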


What does shared have to do with memory barriers?

Reading/writing shared data emits memory barriers to ensure
sequential consistency (not implemented).

That's exactly the problem. It assumes the unit is a
word-sized item.

I'd say the real problem is more that it assumes, or at least suggests, that sequential consistency of shared variables will result in a correct program. It won't, for any non-trivial uses of shared variables. Lock-free programming is really, really hard, even for experts. Using shared variables in this way shouldn't be easy semantically because it provides a false sense of security, resulting in programs that are silently broken in weird ways under some conditions but not others.


If I have a Mutex to protect my unit of shared data, I
don't need "volatile" handling of shared data.

    private shared class SomeThread : Thread
    {
    private:

        Condition m_condition;
        bool m_shutdown = false;
        ...
    }

Yep. This is one of my biggest issues with shared as it applies to user-defined types. I even raised it on the now-defunct concurrency mailing list before the design was finalized. Sadly, there's no good way to sort this out, because:

import core.atomic;

shared class A {
    int m_count = 0;

    // lock-free access: an atomic read-modify-write
    void increment() shared {
        m_count.atomicOp!"+="(1);
    }

    // locked access: a plain read under the object's monitor
    int getCount() synchronized {
        return m_count;
    }
}

If we make accesses of shared variables non-atomic inside synchronized methods, there may be conflicts with their use in shared methods. Also:

shared class A {
    void doSomething() synchronized {
        doSomethingElse(); // the object's lock is already held here
    }

    private void doSomethingElse() synchronized {
        // forced to re-acquire the same lock
    }
}

doSomethingElse must be synchronized because the compiler insists on it, even though I as the programmer know it doesn't have to be (it's only ever called with the lock already held). And yes, private methods are visible within the module, but the same rule applies. In essence, we can't avoid recursive mutexes for implementing synchronized, and we're stuck with a lot of recursive locks and unlocks no matter what, as soon as we slap a "shared" label on something.
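
Spelled out by hand, that lowering looks roughly like this (my sketch; core.sync.mutex's Mutex is recursive, which is exactly what the nested call relies on):

import core.sync.mutex;

class A {
    private Mutex m;
    this() { m = new Mutex; }

    void doSomething() {
        m.lock(); scope (exit) m.unlock();
        doSomethingElse(); // locks m a second time from the same thread
    }

    private void doSomethingElse() {
        m.lock(); scope (exit) m.unlock(); // safe only because m is recursive
    }
}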


m_shutdown will be shared and it is shared data, but it is
synchronized by the Mutex contained in that condition.
Automatic memory barriers and such would only slow down
execution.

Yes. Though there's no overhead in having a Mutex synchronize one more operation. A Mutex is basically just a shared variable indicating locked state. When you release a Mutex, a shared variable is written to indicate that the Mutex is unlocked, and the memory model of the language/platform/CPU guarantees that all operations logically occurring before that shared write actually do complete before it, at least from the perspective of anyone who acquires the same Mutex before looking at the protected data (ie. there's a reader-writer contract).
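
As a toy sketch of that contract (my example, not how any real Mutex is implemented), the whole mechanism reduces to an acquire/release pair on a single shared flag:

import core.atomic;

struct ToyLock {
    shared bool locked;

    void lock() {
        // acquire: spin until we flip false -> true; reads of the
        // protected data cannot be hoisted above this point
        while (!cas(&locked, false, true)) { }
    }

    void unlock() {
        // release: all prior writes to the protected data complete before
        // this store becomes visible to the next thread that acquires
        atomicStore!(MemoryOrder.rel)(locked, false);
    }
}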


What are the semantics of casting FROM unshared TO shared?

Make sure there are no other unshared references to that same
data.

What are the semantics of casting FROM shared TO
unshared?

Make sure there are no other shared references to that same
data.

That's just wrong to ask. `SomeThread` is a worker thread and
data is passed to it regularly through a shared reference that
is certainly never going away until the thread dies.
Yet I must be able to "unshare" its list of work items to
process them.

Sure, but at that point they are no longer referenced by the shared Thread, correct? The rule is simply that you can't be trying to read or write data using both shared and unshared operations, because of that reader-writer contract I mentioned above.
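
In code, something like this hypothetical drain step (the names and layout are mine):

import core.sync.mutex;

// Take ownership of the pending items under the lock. Once the shared
// queue has forgotten them, this thread holds the only remaining
// reference, so casting away shared is legitimate.
string[] takeWork(ref shared(string[]) queue, Mutex m) {
    synchronized (m) {
        auto mine = cast(string[]) queue;
        queue = null;
        return mine;
    }
}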


Now let's say I have an "empty" property. Shared or unshared?

        override @property bool empty() const
        {
                return m_list.empty;
        }

It is only called internally by the thread itself after
entering a certain critical section. I _know_ that m_list wont
be accessible by other threads while .empty is running.

So one thing about shared that Walter confirmed at some point is that atomic ops won't be imposed on operations within a shared method. But I suspect that someone is likely to read the preceding sentence and say "woah! We never said that! And if we did, that's wrong!" In short, I agree with you that shared, as described, kind of sucks here because you're stuck with a ton of inefficiency that you, as an intelligent programmer, know is unnecessary.


But this seeming 1:1 relationship between entering "the"
critical section and stripping shared is of course
non-existent. Aggregates may contain Mutexes protecting
different fields or even stacking on top of each other.

So the text should read:

What are the semantics of casting FROM shared TO unshared?

Make sure that during the period the data is unshared, no
other thread can modify those parts of it that you will be
accessing. If you don't use synchronization objects with
built-in memory barriers like a Mutex, it is your
responsibility to properly synchronize data access through
e.g. atomicLoad/Store.

That at least in general sanctifies casting away shared for
the purpose of calling a method under protection of a user
defined critical section.

It's more complicated than that, because you don't know how long a given operation takes to propagate to another CPU. Simply performing a shared write is meaningless if something else is performing an unshared read, because optimization happens at both points: the write side and the read side.

In essence, the CPU performs the same optimizations as a compiler. Depending on the architecture, reads may be rearranged to occur before other reads or writes, and writes may be rearranged to occur before other reads and writes. On most architectures the CPU makes some intelligent guesses about what operations are safe to rearrange (look into "dependent loads"), though on some few others like the DEC Alpha (a CPU invented by crazy people), they do not and if you don't explicitly tell them what needs to happen, it won't.
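
To make that two-sided contract concrete, here's a minimal sketch of the classic flag-passing pattern (the function names are mine; MemoryOrder.raw is druntime's relaxed ordering, used here just to keep the accesses well-defined):

import core.atomic;
import core.thread;

shared int  data;
shared bool ready;

void writer() {
    atomicStore!(MemoryOrder.raw)(data, 42);
    // release: the write to 'data' completes before 'ready' reads as true
    atomicStore!(MemoryOrder.rel)(ready, true);
}

void reader() {
    // acquire: pairs with the release above; without it, the read of
    // 'data' may effectively be satisfied before the read of 'ready'
    // (by the CPU or the compiler), and the assert can fail
    while (!atomicLoad!(MemoryOrder.acq)(ready)) { }
    assert(atomicLoad!(MemoryOrder.raw)(data) == 42);
}

void main() {
    auto t = new Thread(&writer);
    t.start();
    reader();
    t.join();
}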

Basically what's needed is some way to have the compiler optimize according to the same rules as the CPU (the goal of "shared"). Or in lieu of that, to have some "don't optimize this" instruction to tell the compiler to keep its dirty hands off your carefully constructed code, so the only thing you need to worry about is what the CPU is trying to do. This is what "volatile" was meant for in D1 and I really liked it, but I think I was the only one.

There's a paper on release consistency that I think is fantastic. I'll link it later if I can find it on the interweb. CPUs seem to be converging on memory ordering even stricter than release consistency, but the release consistency model is basically equivalent to how mutexes work, so it's a model everyone already understands.
