subject:"Re\: Something needs to happen with shared, and soon."


On 11/15/12 3:30 PM, David Nadlinger wrote:

On Thursday, 15 November 2012 at 23:22:32 UTC, Sean Kelly wrote:

On Nov 15, 2012, at 3:05 PM, David Nadlinger  wrote:

Well, to be picky, that depends on what kind of memory operation you
mean – moving non-volatile loads/stores across volatile ones is
typically considered acceptable.


Usually not, really. Like if you implement a mutex, you don't want
non-volatile operations to be hoisted above the mutex acquire or sunk
below the mutex release. However, it's safe to move additional
operations into the block where the mutex is held.


Oh well, I was just being stupid when typing up my response: What I
meant to say is that you _can_ reorder a set of memory operations
involving atomic/volatile ones unless you violate the guarantees of the
chosen memory order option.

So, for Andrei's statement to be true, shared needs to be defined as
making all memory operations sequentially consistent. Walter doesn't
seem to think this is the way to go, at least if that is what he is
referring to as »memory barriers«.


Shared must be sequentially consistent.

Andrei

Re: Something needs to happen with shared, and soon.


On 11/15/12 3:05 PM, David Nadlinger wrote:

On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei Alexandrescu wrote:

On 11/15/12 1:29 PM, David Nadlinger wrote:

On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu
wrote:

That is correct. My point is that compiler implementers would follow
some specification. That specification would contain informationt hat
atomicLoad and atomicStore must have special properties that put them
apart from any other functions.


What are these special properties? Sorry, it seems like we are talking
past each other…


For example you can't hoist a memory operation before a shared load or
after a shared store.


Well, to be picky, that depends on what kind of memory operation you
mean – moving non-volatile loads/stores across volatile ones is
typically considered acceptable.


In D that's fine (as long as in-thread SC is respected) because 
non-shared vars are guaranteed to be thread-local.



But still, you can't move memory operations across any other arbitrary
function call either (unless you can prove it is safe by inspecting the
callee's body, obviously), so I don't see where atomicLoad/atomicStore
would be special here.


It is special because e.g. on x86 the function is often a simple 
unprotected load or store. So after the inliner has at it, there's 
nothing to stay in the way of reordering. The point is the compiler must 
understand the semantics of acquire and release.



Andrei

Re: Something needs to happen with shared, and soon.

On Nov 15, 2012, at 3:30 PM, David Nadlinger  wrote:

> On Thursday, 15 November 2012 at 23:22:32 UTC, Sean Kelly wrote:
>> On Nov 15, 2012, at 3:05 PM, David Nadlinger  wrote:
>>> Well, to be picky, that depends on what kind of memory operation you mean – 
>>> moving non-volatile loads/stores across volatile ones is typically 
>>> considered acceptable.
>> 
>> Usually not, really.  Like if you implement a mutex, you don't want 
>> non-volatile operations to be hoisted above the mutex acquire or sunk below 
>> the mutex release.  However, it's safe to move additional operations into 
>> the block where the mutex is held.
> 
> Oh well, I was just being stupid when typing up my response: What I meant to 
> say is that you _can_ reorder a set of memory operations involving 
> atomic/volatile ones unless you violate the guarantees of the chosen memory 
> order option.
> 
> So, for Andrei's statement to be true, shared needs to be defined as making 
> all memory operations sequentially consistent. Walter doesn't seem to think 
> this is the way to go, at least if that is what he is referring to as »memory 
> barriers«.

I think because of the as-if rule, the compiler can continue to optimize all it 
wants between volatile operations.  Just not across them.

Re: Something needs to happen with shared, and soon.


On Thursday, 15 November 2012 at 23:22:32 UTC, Sean Kelly wrote:
On Nov 15, 2012, at 3:05 PM, David Nadlinger 
 wrote:
Well, to be picky, that depends on what kind of memory 
operation you mean – moving non-volatile loads/stores across 
volatile ones is typically considered acceptable.


Usually not, really.  Like if you implement a mutex, you don't 
want non-volatile operations to be hoisted above the mutex 
acquire or sunk below the mutex release.  However, it's safe to 
move additional operations into the block where the mutex is 
held.


Oh well, I was just being stupid when typing up my response: What 
I meant to say is that you _can_ reorder a set of memory 
operations involving atomic/volatile ones unless you violate the 
guarantees of the chosen memory order option.


So, for Andrei's statement to be true, shared needs to be defined 
as making all memory operations sequentially consistent. Walter 
doesn't seem to think this is the way to go, at least if that is 
what he is referring to as »memory barriers«.


David

Re: Something needs to happen with shared, and soon.

On Nov 15, 2012, at 3:05 PM, David Nadlinger  wrote:

> On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei Alexandrescu wrote:
>> On 11/15/12 1:29 PM, David Nadlinger wrote:
>>> On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu wrote:
 That is correct. My point is that compiler implementers would follow
 some specification. That specification would contain informationt hat
 atomicLoad and atomicStore must have special properties that put them
 apart from any other functions.
>>> 
>>> What are these special properties? Sorry, it seems like we are talking
>>> past each other…
>> 
>> For example you can't hoist a memory operation before a shared load or after 
>> a shared store.
> 
> Well, to be picky, that depends on what kind of memory operation you mean – 
> moving non-volatile loads/stores across volatile ones is typically considered 
> acceptable.

Usually not, really.  Like if you implement a mutex, you don't want 
non-volatile operations to be hoisted above the mutex acquire or sunk below the 
mutex release.  However, it's safe to move additional operations into the block 
where the mutex is held.

Re: Something needs to happen with shared, and soon.

On Nov 15, 2012, at 2:18 PM, David Nadlinger  wrote:

> On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:
>> On Nov 15, 2012, at 5:16 AM, deadalnix  wrote:
>>> What is the point of ensuring that the compiler does not reorder 
>>> load/stores if the CPU is allowed to do so ?
>> 
>> Because we can write ASM to tell the CPU not to.  We don't have any such 
>> ability for the compiler right now.
> 
> I think the question was: Why would you want to disable compiler code motion 
> for loads/stores which are not atomic, as the CPU might ruin your assumptions 
> anyway?

A barrier isn't always necessary to achieve the desired ordering on a given 
system.  But I'd still call out to ASM to make sure the intended operation 
happened.  I don't know that I'd ever feel comfortable with "volatile x=y" even 
if what I'd do instead is just a MOV.

Re: Something needs to happen with shared, and soon.

On Thursday, 15 November 2012 at 22:58:53 UTC, Andrei 
Alexandrescu wrote:

On 11/15/12 2:18 PM, David Nadlinger wrote:
On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly 
wrote:
On Nov 15, 2012, at 5:16 AM, deadalnix  
wrote:


What is the point of ensuring that the compiler does not 
reorder

load/stores if the CPU is allowed to do so ?


Because we can write ASM to tell the CPU not to. We don't 
have any

such ability for the compiler right now.


I think the question was: Why would you want to disable 
compiler code
motion for loads/stores which are not atomic, as the CPU might 
ruin your

assumptions anyway?


The compiler does whatever it takes to ensure sequential 
consistency for shared use, including possibly inserting fences 
in certain places.


Andrei


How does this have anything to do with deadalnix' question that I 
rephrased at all? It is not at all clear that shared should do 
this (it currently doesn't), and the question was explicitly 
about Walter's statement that shared should disable compiler 
reordering, when at the same time *not* inserting barriers/atomic 
ops. Thus the »which are not atomic« qualifier in my message.


David

Re: Something needs to happen with shared, and soon.

On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei 
Alexandrescu wrote:

On 11/15/12 1:29 PM, David Nadlinger wrote:
On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei 
Alexandrescu wrote:
That is correct. My point is that compiler implementers would 
follow
some specification. That specification would contain 
informationt hat
atomicLoad and atomicStore must have special properties that 
put them

apart from any other functions.


What are these special properties? Sorry, it seems like we are 
talking

past each other…


For example you can't hoist a memory operation before a shared 
load or after a shared store.


Well, to be picky, that depends on what kind of memory operation 
you mean – moving non-volatile loads/stores across volatile 
ones is typically considered acceptable.


But still, you can't move memory operations across any other 
arbitrary function call either (unless you can prove it is safe 
by inspecting the callee's body, obviously), so I don't see where 
atomicLoad/atomicStore would be special here.


David

Re: Something needs to happen with shared, and soon.


On 11/15/12 1:29 PM, David Nadlinger wrote:

On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu wrote:

That is correct. My point is that compiler implementers would follow
some specification. That specification would contain informationt hat
atomicLoad and atomicStore must have special properties that put them
apart from any other functions.


What are these special properties? Sorry, it seems like we are talking
past each other…


For example you can't hoist a memory operation before a shared load or 
after a shared store.


Andrei

Re: Something needs to happen with shared, and soon.


On 11/15/12 2:18 PM, David Nadlinger wrote:

On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:

On Nov 15, 2012, at 5:16 AM, deadalnix  wrote:


What is the point of ensuring that the compiler does not reorder
load/stores if the CPU is allowed to do so ?


Because we can write ASM to tell the CPU not to. We don't have any
such ability for the compiler right now.


I think the question was: Why would you want to disable compiler code
motion for loads/stores which are not atomic, as the CPU might ruin your
assumptions anyway?


The compiler does whatever it takes to ensure sequential consistency for 
shared use, including possibly inserting fences in certain places.


Andrei

Re: Something needs to happen with shared, and soon.


On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:
On Nov 15, 2012, at 5:16 AM, deadalnix  
wrote:


What is the point of ensuring that the compiler does not 
reorder load/stores if the CPU is allowed to do so ?


Because we can write ASM to tell the CPU not to.  We don't have 
any such ability for the compiler right now.


I think the question was: Why would you want to disable compiler 
code motion for loads/stores which are not atomic, as the CPU 
might ruin your assumptions anyway?


David

Re: Something needs to happen with shared, and soon.

On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei 
Alexandrescu wrote:
That is correct. My point is that compiler implementers would 
follow some specification. That specification would contain 
informationt hat atomicLoad and atomicStore must have special 
properties that put them apart from any other functions.


What are these special properties? Sorry, it seems like we are 
talking past each other…


[1] I am not sure where the point of diminishing returns is 
here,
although it might make sense to provide the same options as 
C++11. If I

remember correctly, D1/Tango supported a lot more levels of
synchronization.


We could start with sequential consistency and then explore 
riskier/looser policies.


I'm not quite sure what you are saying here. The functions in 
core.atomic already exist, and currently offer four levels (raw, 
acq, rel, seq). Are you suggesting to remove the other options?


David

Re: Something needs to happen with shared, and soon.

2012-11-15 Thread Jacob Carlborg


On 2012-11-15 11:52, Manu wrote:


Interesting concept. Nice idea, could certainly be useful, but it
doesn't address the problem as directly as my suggestion.
There are still many problem situations, for instance, any time a
template is involved. The template doesn't know to do that internally,
but under my proposal, you lock it prior to the workload, and then the
template works as expected. Templates won't just break and fail whenever
shared is involved, because assignments would be legal. They'll just
assert that the thing is locked at the time, which is the programmers
responsibility to ensure.


I don't understand how a template would cause problems.

--
/Jacob Carlborg

Re: Something needs to happen with shared, and soon.

On Nov 15, 2012, at 7:17 AM, Andrei Alexandrescu 
 wrote:

> On 11/15/12 1:08 AM, Manu wrote:
>> 
>> Side note: I still think a convenient and fairly practical solution is
>> to make 'shared' things 'lockable'; where you can lock()/unlock() them,
>> and assignment to/from shared things is valid (no casting), but a
>> runtime assert insists that the entity is locked whenever it is
>> accessed.
> 
> This (IIUC) is conflating mutex-based synchronization with memory models and 
> atomic operations. I suggest we postpone anything related to that for the 
> sake of staying focused.

By extension, I'd suggest postponing anything related to classes as well.

Re: Something needs to happen with shared, and soon.

On Nov 15, 2012, at 5:16 AM, deadalnix  wrote:
> 
> What is the point of ensuring that the compiler does not reorder load/stores 
> if the CPU is allowed to do so ?

Because we can write ASM to tell the CPU not to.  We don't have any such 
ability for the compiler right now.

Re: Something needs to happen with shared, and soon.

On Nov 15, 2012, at 5:10 AM, deadalnix  wrote:

> Le 14/11/2012 23:21, Andrei Alexandrescu a écrit :
>> On 11/14/12 12:00 PM, Sean Kelly wrote:
>>> On Nov 14, 2012, at 6:16 AM, Andrei
>>> Alexandrescu wrote:
>>> 
 On 11/14/12 1:20 AM, Walter Bright wrote:
> On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
>> If the compiler should/does not add memory barriers, then is there a
>> reason for
>> having it built into the language? Can a library solution be enough?
> 
> Memory barriers can certainly be added using library functions.

 The compiler must understand the semantics of barriers such as e.g.
 it doesn't hoist code above an acquire barrier or below a release
 barrier.
>>> 
>>> That was the point of the now deprecated "volatile" statement. I still
>>> don't entirely understand why it was deprecated.
>> 
>> Because it's better to associate volatility with data than with code.
> 
> Happy to see I'm not alone on that one.
> 
> Plus, volatile and sequential consistency are 2 different beast. Volatile 
> means no register promotion and no load/store reordering. It is required, but 
> not sufficient for concurrency.

It's sufficient for concurrency when coupled with library code that does the 
hardware-level synchronization.  In short, a program has two separate machines 
doing similar optimizations on it: the compiler and the CPU.  In D we can use 
ASM to control CPU optimizations, and in D1 we had "volatile" to control 
compiler optimizations.  "volatile" was the minimum required for handling the 
compiler portion and was easy to get wrong, but it used only one keyword and I 
suspect was relatively easy to implement on the compiler side as well.

Re: Something needs to happen with shared, and soon.

On Nov 15, 2012, at 4:54 AM, deadalnix  wrote:

> Le 14/11/2012 21:01, Sean Kelly a écrit :
>> On Nov 14, 2012, at 6:32 AM, Andrei 
>> Alexandrescu  wrote:
>>> 
>>> This is a simplification of what should be going on. The 
>>> core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the 
>>> compiler generate sequentially consistent code with them (i.e. not perform 
>>> certain reorderings). Then there are loads and stores with weaker 
>>> consistency semantics (acquire, release, acquire/release, and consume).
>> 
>> No.  These functions all contain volatile ask blocks.  If the compiler 
>> respected the "volatile" it would be enough.
> 
> It is sufficient for monocore and mostly correct for x86. But isn't enough.
> 
> volatile isn't for concurency, but memory mapping.

Traditionally, the term "volatile" is for memory mapping.  The description of 
"volatile" for D1, though, would have worked for concurrency.  Or is there some 
example you can provide where this isn't true?

Re: Something needs to happen with shared, and soon.

On Nov 11, 2012, at 6:30 PM, Walter Bright  wrote:
> 
> To make a shared type work in an algorithm, you have to:
> 
> 1. ensure single threaded access by aquiring a mutex
> 2. cast away shared
> 3. operate on the data
> 4. cast back to shared
> 5. release the mutex

So what happens if you pass a reference to the now non-shared object to a 
function that caches a local reference to it?  Half the point of the attribute 
is to protect us from accidents like this.

Re: Something needs to happen with shared, and soon.

On Nov 14, 2012, at 6:28 PM, Andrei Alexandrescu 
 wrote:

> On 11/14/12 4:50 PM, Sean Kelly wrote:
>> On Nov 14, 2012, at 2:25 PM, Andrei
>> Alexandrescu  wrote:
>> 
>>> On 11/14/12 1:09 PM, Walter Bright wrote:
 Yes. And also, I agree that having something typed as "shared"
 must prevent the compiler from reordering them. But that's
 separate from inserting memory barriers.
>>> 
>>> It's the same issue at hand: ordering properly and inserting
>>> barriers are two ways to ensure one single goal, sequential
>>> consistency. Same thing.
>> 
>> Sequential consistency is great and all, but it doesn't render
>> concurrent code correct.  At worst, it provides a false sense of
>> security that somehow it does accomplish this, and people end up
>> actually using it as such.
> 
> Yah, but the baseline here is acquire-release which has subtle differences 
> that are all the more maddening.

Really?  Acquire-release always seemed to have equivalent safety to me.  
Typically, the user doesn't even have to understand that optimization can occur 
upwards across the trailing boundary of the block, etc, to produce correct 
code.  Though I do agree that the industry is moving towards sequential 
consistency, so there may be no point in trying for something weaker.

Re: Something needs to happen with shared, and soon.

On Nov 15, 2012, at 3:16 AM, Regan Heath  wrote:
> 
> I suggested something similar as did Sönke:
> http://forum.dlang.org/thread/k7orpj$1tt5$1...@digitalmars.com?page=2#post-op.wnnuiio554xghj:40puck.auriga.bhead.co.uk
> 
> According to deadalnix the compiler magic I suggested to add the mutex isn't 
> possible:
> http://forum.dlang.org/thread/k7orpj$1tt5$1...@digitalmars.com?page=3#post-k7qsb5:242gqk:241:40digitalmars.com
> 
> Most of our ideas can be implemented with a wrapper template containing the 
> sync object (mutex, etc).

If I understand you correctly, you don't need anything that explicitly contains 
the sync object.  A global table of mutexes used according to the address of 
the value to be mutated should work.


> So... my feeling is that the best solution for "shared", ignoring the memory 
> barrier aspect which I would relegate to a different feature and solve a 
> different way, is..
> 
> 1. Remove the existing mutex from object.
> 2. Require that all objects passed to synchronized() {} statements implement 
> a synchable(*) interface
> 3. Design a Shared(*) wrapper template/struct that contains a mutex and 
> implements synchable(*)
> 4. Design a Shared(*) base class which contains a mutex and implements 
> synchable(*)

It would be nice to eliminate the mutex that's optionally built into classes 
now.  The possibility of having to allocate a new mutex on whatever random 
function call happens to be the first one with "synchronized" is kinda not 
great.

Re: Something needs to happen with shared, and soon.

2012-11-15 Thread Dmitry Olshansky


11/15/2012 8:33 AM, Michel Fortin пишет:


If you want to declare the mutex separately, you could do it by
specifying a variable instead of a type in the variable declaration:

 Mutex m;
 synchronized(m) int i;

 synchronized(i)
 {
 // implicit: m.lock();
 // implicit: scope (exit) m.unlock();
 i++;
 }


While the rest of proposal was more or less fine. I don't get why we 
need escape control of mutex at all - in any case it just opens a 
possibility to shout yourself in the foot.


I'd say:
"Need direct access to mutex? - Go on with the manual way it's still 
right there (and scope(exit) for that matter)".


Another problem is that somebody clever can escape reference to unlocked 
'i' inside of synchronized to somewhere else.


But anyway we can make it in the library right about now.

synchronized T ---> Synchronized!T
synchronized(i){ ... } --->

i.access((x){
//will lock & cast away shared T inside of it
...
});

I fail to see what it doesn't solve (aside of syntactic sugar).

The key point is that Synchronized!T is otherwise an opaque type.
We could pack a few other simple primitives like 'load', 'store' etc. 
All of them will go through lock-unlock.


Even escaping a reference can be solved by passing inside of 'access'
a proxy of T. It could even asserts that the lock is in indeed locked.

Same goes about Atomic!T. Though the set of primitives is quite limited 
depending on T.
(I thought that built-in shared(T) is already atomic though so no need 
to reinvent this wheel)


It's time we finally agree that 'shared' qualifier is an assembly 
language of multi-threading based on sharing. It just needs some safe 
patterns in the library.


That and clarifying explicitly what guarantees (aside from being well.. 
being shared) it provides w.r.t. memory model.


Until reaching this thread I was under impression that shared means:
- globally visible
- atomic operations for stuff that fits in one word
- sequentially consistent guarantee
- any other forms of access are disallowed except via casts

--
Dmitry Olshansky

Re: Something needs to happen with shared, and soon.

2012-11-15 Thread Mehrdad

Would it be useful if 'shared' in D did something like 'volatile' 
in C++ (as in, Andrei's article on volatile-correctness)?

http://www.drdobbs.com/cpp/volatile-the-multithreaded-programmers-b/184403766

Re: Something needs to happen with shared, and soon.

2012-11-15 Thread Dmitry Olshansky


11/15/2012 1:06 AM, Walter Bright пишет:

On 11/14/2012 3:14 AM, Benjamin Thaut wrote:

A small code example which would break as soon as we allow destructing
of shared
value types would really be nice.


I hate to repeat myself, but:

Thread 1:
 1. create shared object
 2. pass reference to that object to Thread 2
 3. destroy object

Thread 2:
 1. manipulate that object


Ain't structs typically copied anyway?

Reference would imply pointer then. If the struct is on the stack (weird 
but could be) then the thread that created it destroys the object once. 
The thing is as unsafe as escaping a pointer is.


Personally I think that shared stuff allocated on the stack is 
here-be-dragons @system code in any case.


Otherwise it's GC's responsibility to destroy heap allocated struct when 
there are no references to it.


What's so puzzling about it?

BTW currently GC-allocated structs are not having their destructor 
called at all. The bug is however _minor_ ...


http://d.puremagic.com/issues/show_bug.cgi?id=2834

--
Dmitry Olshansky

Re: Something needs to happen with shared, and soon.


On 11/15/12 1:08 AM, Manu wrote:

On 14 November 2012 19:54, Andrei Alexandrescu
mailto:seewebsiteforem...@erdani.org>>
wrote:
Yah, the whole point here is that we need something IN THE LANGUAGE
DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION.

THIS IS VERY IMPORTANT.


I won't outright disagree, but this seems VERY dangerous to me.

You need to carefully study all popular architectures, and consider that
if the language is made to depend on these primitives, and the
architecture doesn't support it, or support that particular style of
implementation (fairly likely), than D will become incompatible with a
huge number of architectures on that day.


All contemporary languages that are serious about concurrency support 
atomic primitives one way or another. We must too. There's no two ways 
about it.


[snip]

Side note: I still think a convenient and fairly practical solution is
to make 'shared' things 'lockable'; where you can lock()/unlock() them,
and assignment to/from shared things is valid (no casting), but a
runtime assert insists that the entity is locked whenever it is
accessed.


This (IIUC) is conflating mutex-based synchronization with memory models 
and atomic operations. I suggest we postpone anything related to that 
for the sake of staying focused.



Andrei

Re: Something needs to happen with shared, and soon.

On 15 November 2012 15:00, Jonathan M Davis  wrote:

> On Thursday, November 15, 2012 14:32:47 Manu wrote:
> > On 15 November 2012 13:38, Jonathan M Davis  wrote:
>
> > I don't really see the difference, other than, as you say, the cast is
> > explicit.
> > Obviously the possibility for the situation you describe exists, it's
> > equally possible with the cast, except this way, the usage pattern is
> made
> > more convenient, the user has a convenient way to control the locks and
> > most importantly, it would work with templates.
> > That said, this sounds like another perfect application of 'scope'.
> Perhaps
> > only scope parameters can receive a locked, shared thing... that would
> > mechanically protect you against escape.
>
> You could make casting away const implicit too, which would make some code
> easier, but it would be a disaster, because the programer wouldn't have a
> clue
> that it's happening in many cases, and the code would end up being very,
> very
> wrong. Implicitly casting away shared would put you in the same boat.


... no, they're not even the same thing. const things can not be changed.
Shared things are still mutable things, and perfectly compatible with other
non-shared mutable things, they just have some access control requirements.

_Maybe_ you could get away with it in very restricted circumstances where
> both pure
> and scope are being used, but then it becomes so restrictive that it's
> nearly
> useless anyway. And again, it would be hidden from the programmer, when
> this
> is something that _needs_ to be explicit. Having implicit locks happen on
> you
> could really screw with any code trying to do explicit locks, as would be
> needed anyway in all but the most basic cases.
>

I think you must have misunderstood my suggestion, I certainly didn't
suggest locking would be implicit.
All locks would be explicit, all I suggested is that shared things would
gain an associated mutex, and an implicit assert that said mutex is locked
whenever it is accessed, rather than deny assignment between
shared/unshared things.

You could use lock methods, or a nice alternative would be to submit them
to some sort of synchronised scope like luka illustrates.

I'm of the opinion that for the time being, explicit lock control is
mandatory (anything else is a distant dream), and atomic primitives may not
be relied upon.

> 2. It's often the case that you need to lock/unlock groups of stuff
> together
> > > such that locking specific variables is of often of limited use and
> would
> > > just
> > > introduce pointless extra locks when dealing with multiple variables.
> It
> > > would
> > > also increase the risk of deadlocks, because you wouldn't have much -
> if
> > > any -
> > > control over what order locks were acquired in when dealing with
> multiple
> > > shared variables.
> >
> > Your fear is precisely the state we're in now, except it puts all the
> work
> > on the user to create and use the synchronisation objects, and also to
> > assert that things are locked when they are accessed.
> > I'm just suggesting some reasonably simple change that would make the
> > situation more usable and safer immediately, short of waiting for all
> these
> > fantastic designs being discussed having time to simmer and manifest.
>
> Except that with your suggestion, you're introducing potential deadlocks
> which
> are outside of the programmer's control, and you're introducing extra
> overhead
> with those locks (both in terms of memory and in terms of the runtime
> costs).
> Not to mention, it would probably cause all kinds of issues for something
> like
> shared int* to have a mutex with it, because then its size is completely
> different from int*. It also would cause even worse problems when that
> shared
> int* was cast to int* (aside from the size issues), because all of the
> locking
> that was happening for the shared int* was invisible. If you want automatic
> locks, then use synchronized classes. That's what they're for.
>
> Honestly, I really don't buy into the idea that it makes sense for shared
> to
> magically make multi-threaded code work without the programmer worrying
> about
> locks. Making it so that it's well-defined as to what's atomic is great for
> code that has any chance of being lock-free, but it's still up to the
> programmer to understand when locks are and aren't needed and how to use
> them
> correctly. I don't think that it can possibly work for it to be automatic.
> It's far to easy to introduce deadlocks, and it would only work in the
> simplest of cases anyway, meaning that the programmer needs to understand
> and
> properly solve the issues anyway. And if the programmer has to understand
> it
> all to get it right, why bother adding the extra overhead and deadlock
> potential caused by automatically locking anything? D provides some great
> synchronization primitives. People should use them.
>

To all above:
You've completely misunderstood my suggestion. It's basically

Re: Something needs to happen with shared, and soon.

2012-11-15 Thread Sönke Ludwig

Am 15.11.2012 05:32, schrieb Andrei Alexandrescu:
> On 11/14/12 7:24 PM, Jonathan M Davis wrote:
>> On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:
>>> I have no idea what we want to do about this situation though. Regardless of
>>> what we do with memory barriers and the like, it has no impact on whether
>>> casts are required. And I think that introducing the shared equivalent of
>>> const would be a huge mistake, because then most code would end up being
>>> written using that attribute, meaning that all code essentially has to be
>>> treated as shared from the standpoint of compiler optimizations. It would
>>> almost be the same as making everything shared by default again. So, as far
>>> as I can see, casting is what we're forced to do.
>>
>> Actually, I think that what it comes down to is that shared works nicely when
>> you have a type which is designed to be shared, and it encapsulates 
>> everything
>> that it needs. Where it starts requiring casting is when you need to pass it
>> to other stuff.
>>
>> - Jonathan M Davis
> 
> TDPL 13.14 explains that inside synchronized classes, top-level shared is 
> automatically lifted.
> 
> Andrei

There are three problems I currently see with this:

 - It's not actually implemented
 - It's not safe because unshared references can be escaped or dragged in
 - Synchronized classes provide no way to avoid the automatic locking in 
certain methods, but often
it is necessary to have more fine-grained control for efficiency reasons, or to 
avoid dead-locks

Re: Something needs to happen with shared, and soon.


Le 15/11/2012 10:08, Manu a écrit :

The Nintendo Wii for instance, not an unpopular machine, only sold 130
million units! Does not have synchronisation instructions in the
architecture (insane, I know, but there it is. I've had to spend time
working around this in the past).
I'm sure it's not unique in this way.



Can you elaborate on that ?

Re: Something needs to happen with shared, and soon.


Le 14/11/2012 22:09, Walter Bright a écrit :

On 11/14/2012 7:08 AM, Andrei Alexandrescu wrote:

On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:

On 14-11-2012 15:14, Andrei Alexandrescu wrote:

On 11/14/12 1:19 AM, Walter Bright wrote:

On 11/13/2012 11:56 PM, Jonathan M Davis wrote:

Being able to have double-checked locking work would be valuable, and
having
memory barriers would reduce race condition weirdness when locks
aren't used
properly, so I think that it would be desirable to have memory
barriers.


I'm not saying "memory barriers are bad". I'm saying that having the
compiler blindly insert them for shared reads/writes is far from the
right way to do it.


Let's not hasten. That works for Java and C#, and is allowed in C++.

Andrei




I need some clarification here: By memory barrier, do you mean x86's
mfence, sfence, and lfence?


Sorry, I was imprecise. We need to (a) define intrinsics for loading
and storing
data with high-level semantics (a short list: acquire, release,
acquire+release,
and sequentially-consistent) and THEN (b) implement the needed code
generation
appropriately for each architecture. Indeed on x86 there is little
need to
insert fence instructions, BUT there is a definite need for the
compiler to
prevent certain reorderings. That's why implementing shared data
operations
(whether implicit or explicit) as sheer library code is NOT possible.


Because as Walter said, inserting those blindly when unnecessary can
lead to terrible performance because it practically murders
pipelining.


I think at this point we need to develop a better understanding of
what's going
on before issuing assessments.


Yes. And also, I agree that having something typed as "shared" must
prevent the compiler from reordering them. But that's separate from
inserting memory barriers.



I'm sorry but that is dumb.

What is the point of ensuring that the compiler does not reorder 
load/stores if the CPU is allowed to do so ?

Re: Something needs to happen with shared, and soon.


Le 14/11/2012 23:21, Andrei Alexandrescu a écrit :

On 11/14/12 12:00 PM, Sean Kelly wrote:

On Nov 14, 2012, at 6:16 AM, Andrei
Alexandrescu wrote:


On 11/14/12 1:20 AM, Walter Bright wrote:

On 11/13/2012 11:37 PM, Jacob Carlborg wrote:

If the compiler should/does not add memory barriers, then is there a
reason for
having it built into the language? Can a library solution be enough?


Memory barriers can certainly be added using library functions.


The compiler must understand the semantics of barriers such as e.g.
it doesn't hoist code above an acquire barrier or below a release
barrier.


That was the point of the now deprecated "volatile" statement. I still
don't entirely understand why it was deprecated.


Because it's better to associate volatility with data than with code.



Happy to see I'm not alone on that one.

Plus, volatile and sequential consistency are 2 different beast. 
Volatile means no register promotion and no load/store reordering. It is 
required, but not sufficient for concurrency.

Re: Something needs to happen with shared, and soon.

On Thursday, November 15, 2012 14:32:47 Manu wrote:
> On 15 November 2012 13:38, Jonathan M Davis  wrote:

> I don't really see the difference, other than, as you say, the cast is
> explicit.
> Obviously the possibility for the situation you describe exists, it's
> equally possible with the cast, except this way, the usage pattern is made
> more convenient, the user has a convenient way to control the locks and
> most importantly, it would work with templates.
> That said, this sounds like another perfect application of 'scope'. Perhaps
> only scope parameters can receive a locked, shared thing... that would
> mechanically protect you against escape.

You could make casting away const implicit too, which would make some code 
easier, but it would be a disaster, because the programer wouldn't have a clue 
that it's happening in many cases, and the code would end up being very, very 
wrong. Implicitly casting away shared would put you in the same boat. _Maybe_ 
you could get away with it in very restricted circumstances where both pure 
and scope are being used, but then it becomes so restrictive that it's nearly 
useless anyway. And again, it would be hidden from the programmer, when this 
is something that _needs_ to be explicit. Having implicit locks happen on you 
could really screw with any code trying to do explicit locks, as would be 
needed anyway in all but the most basic cases.

> 2. It's often the case that you need to lock/unlock groups of stuff together
> > such that locking specific variables is of often of limited use and would
> > just
> > introduce pointless extra locks when dealing with multiple variables. It
> > would
> > also increase the risk of deadlocks, because you wouldn't have much - if
> > any -
> > control over what order locks were acquired in when dealing with multiple
> > shared variables.
> 
> Your fear is precisely the state we're in now, except it puts all the work
> on the user to create and use the synchronisation objects, and also to
> assert that things are locked when they are accessed.
> I'm just suggesting some reasonably simple change that would make the
> situation more usable and safer immediately, short of waiting for all these
> fantastic designs being discussed having time to simmer and manifest.

Except that with your suggestion, you're introducing potential deadlocks which 
are outside of the programmer's control, and you're introducing extra overhead 
with those locks (both in terms of memory and in terms of the runtime costs). 
Not to mention, it would probably cause all kinds of issues for something like 
shared int* to have a mutex with it, because then its size is completely 
different from int*. It also would cause even worse problems when that shared 
int* was cast to int* (aside from the size issues), because all of the locking 
that was happening for the shared int* was invisible. If you want automatic 
locks, then use synchronized classes. That's what they're for.

Honestly, I really don't buy into the idea that it makes sense for shared to 
magically make multi-threaded code work without the programmer worrying about 
locks. Making it so that it's well-defined as to what's atomic is great for 
code that has any chance of being lock-free, but it's still up to the 
programmer to understand when locks are and aren't needed and how to use them 
correctly. I don't think that it can possibly work for it to be automatic. 
It's far to easy to introduce deadlocks, and it would only work in the 
simplest of cases anyway, meaning that the programmer needs to understand and 
properly solve the issues anyway. And if the programmer has to understand it 
all to get it right, why bother adding the extra overhead and deadlock 
potential caused by automatically locking anything? D provides some great 
synchronization primitives. People should use them.

I think that the only things that share really needs to be solving are:

1. Indicating to the compiler via the type system that the object is not 
thread-local. This properly segregates shared and unshared code and allows the 
compiler to take advantage of thread locality for optimizations and avoid 
optimizations with shared code that screw up threading (e.g. double-checked 
locking won't work if the compiler does certain optimizations).

2. Making it explicit and well-defined as part of the language which operations 
can assumed to be atomic (even if it that set of operations is very small, 
having it be well-defined is valuable).

3. Ensuring sequential consistency so that it's possible to do lock-free code 
when atomic operations permit it and so that there are fewer weird issues due 
to undefined behavior.

- Jonathan M Davis

Re: Something needs to happen with shared, and soon.

2012-11-15 Thread luka8088

On 15.11.2012 11:52, Manu wrote:

On 15 November 2012 12:14, Jacob Carlborg mailto:d...@me.com>> wrote:

On 2012-11-15 10:22, Manu wrote:

Not to repeat my prev post... but in reply to Walter's take on
it, it
would be interesting if 'shared' just added implicit lock()/unlock()
methods to do the mutex acquisition and then remove the cast
requirement, but have the language runtime assert that the object is
locked whenever it is accessed (this guarantees the safety in a more
useful way, the casts are really annying). I can't imagine a
simpler and
more immediately useful solution.

How about implementing a library function, something like this:

shared int i;

lock(i, (x) {
// operate on x
});

* "lock" will acquire a lock
* Cast away shared for "i"
* Call the delegate with the now plain "int"
* Release the lock

http://pastebin.com/tfQ12nJB

Interesting concept. Nice idea, could certainly be useful, but it
doesn't address the problem as directly as my suggestion.
There are still many problem situations, for instance, any time a
template is involved. The template doesn't know to do that internally,
but under my proposal, you lock it prior to the workload, and then the
template works as expected. Templates won't just break and fail whenever
shared is involved, because assignments would be legal. They'll just
assert that the thing is locked at the time, which is the programmers
responsibility to ensure.

I managed to make a simple example that works with the current
implementation:

http://dpaste.dzfl.pl/27b6df62

http://forum.dlang.org/thread/k7orpj$1tt5$1...@digitalmars.com?page=4#post-k7s0gs:241h45:241:40digitalmars.com

It seems to me that solving this shared issue cannot be done purely on a
compiler basis but will require a runtime support. Actually I don't see
how it can be done properly without telling "this lock must be locked
when accessing this variable".

http://dpaste.dzfl.pl/edbd3e10

Re: Something needs to happen with shared, and soon.


Le 14/11/2012 21:01, Sean Kelly a écrit :

On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu 
 wrote:


This is a simplification of what should be going on. The 
core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the 
compiler generate sequentially consistent code with them (i.e. not perform 
certain reorderings). Then there are loads and stores with weaker consistency 
semantics (acquire, release, acquire/release, and consume).


No.  These functions all contain volatile ask blocks.  If the compiler respected the 
"volatile" it would be enough.


It is sufficient for monocore and mostly correct for x86. But isn't enough.

volatile isn't for concurency, but memory mapping.

Re: Something needs to happen with shared, and soon.

2012-11-15 Thread Benjamin Thaut


Am 15.11.2012 12:48, schrieb Jonathan M Davis:


Yeah. If the reference passed across were shared, then the runtime should see
it as having multiple references, and if it's _not_ shared, that means that
you cast shared away (unsafe, since it's a cast) and passed it across threads
without making sure that it was the only reference on the original thread. In
that case, you shot yourself in the foot by using an @system construct
(casting) and not getting it right. I don't see why the runtime would have to
worry about that.

Unless the problem is that the object is a value type, so when it goes away on
the first thread, it _has_ to be destroyed? If that's the case, then it's a
pointer that was passed across rather than a reference, and then you've
effectively done the same thing as returning a pointer to a local variable,
which is @system and again only happens if you're getting @system wrong, which
the compiler generally doesn't protect you from beyond giving you an error in
the few cases where it can determine for certain that what you're doing is
wrong (which is a fairly limited portion of the time).

So, as far as I can see - unless I'm just totally missing something here -
either you're dealing with shared objects on the heap here, in which case, the
object shouldn't be destroyed on the first thread unless you do it manually (in
which case, you're doing something stupid in @system code), or you're dealing
with passing pointers to shared value types across threads, which is
essentially the equivalent of escaping a pointer to a local variable (in which
case, you're doing something stupid in @system code). In either case, it's
you're doing something stupid in @system code, and I don't see why the runtime
would have to worry about it. You shot yourself in the foot by incorrectly
using @system code. If you want protection agains that, then don't use @system
code.

- Jonathan M Davis



Thank you, thats exatcly how I'm thinking too. And because of this it 
makes absolutley no sense to me to disallow the destruction of a shared 
struct, if it is allocated on the stack or as a global. If it is 
allocated on the heap you can't destory it manually anyway because 
delete is deprecated.


And for exatcly this reason I wanted a code example from Walter. Because 
just listing a few bullet points does not make a real world use case.


Kind Regards
Benjamin Thaut

Re: Something needs to happen with shared, and soon.

On Thursday, November 15, 2012 10:22:22 Jacob Carlborg wrote:
> On 2012-11-14 22:06, Walter Bright wrote:
> > I hate to repeat myself, but:
> > 
> > Thread 1:
> >  1. create shared object
> >  2. pass reference to that object to Thread 2
> >  3. destroy object
> > 
> > Thread 2:
> >  1. manipulate that object
> 
> Why would the object be destroyed if there's still a reference to it? If
> the object is manually destroyed I don't see what threads have to do
> with it since you can do the same thing in a single thread application.

Yeah. If the reference passed across were shared, then the runtime should see 
it as having multiple references, and if it's _not_ shared, that means that 
you cast shared away (unsafe, since it's a cast) and passed it across threads 
without making sure that it was the only reference on the original thread. In 
that case, you shot yourself in the foot by using an @system construct 
(casting) and not getting it right. I don't see why the runtime would have to 
worry about that.

Unless the problem is that the object is a value type, so when it goes away on 
the first thread, it _has_ to be destroyed? If that's the case, then it's a 
pointer that was passed across rather than a reference, and then you've 
effectively done the same thing as returning a pointer to a local variable, 
which is @system and again only happens if you're getting @system wrong, which 
the compiler generally doesn't protect you from beyond giving you an error in 
the few cases where it can determine for certain that what you're doing is 
wrong (which is a fairly limited portion of the time).

So, as far as I can see - unless I'm just totally missing something here - 
either you're dealing with shared objects on the heap here, in which case, the 
object shouldn't be destroyed on the first thread unless you do it manually (in 
which case, you're doing something stupid in @system code), or you're dealing 
with passing pointers to shared value types across threads, which is 
essentially the equivalent of escaping a pointer to a local variable (in which 
case, you're doing something stupid in @system code). In either case, it's 
you're doing something stupid in @system code, and I don't see why the runtime 
would have to worry about it. You shot yourself in the foot by incorrectly 
using @system code. If you want protection agains that, then don't use @system 
code.

- Jonathan M Davis

Re: Something needs to happen with shared, and soon.

On Thursday, November 15, 2012 11:22:30 Manu wrote:
> Not to repeat my prev post... but in reply to Walter's take on it, it would
> be interesting if 'shared' just added implicit lock()/unlock() methods to
> do the mutex acquisition and then remove the cast requirement, but have the
> language runtime assert that the object is locked whenever it is accessed
> (this guarantees the safety in a more useful way, the casts are really
> annying). I can't imagine a simpler and more immediately useful solution.
> 
> In fact, it's a reasonably small step to this being possible with
> user-defined attributes. Although attributes have no current mechanism to
> add a mutex, and lock/unlock methods to the object being attributed (like
> is possible in Java/C#), but maybe it's not a huge leap.

1. It wouldn't stop you from needing to cast away shared at all, because 
without casting away shared, you wouldn't be able to pass it to anything, 
because the types would differ. Even if you were arguing that doing something 
like

void foo(C c) {...}
shared c = new C;
foo(c); //no cast required, lock automatically taken

it wouldn't work, because then foo could wile away a reference to c somewhere, 
and the type system would have no way of knowing that it was a shared variable 
that was being wiled away as opposed to a thread-local one, which means that 
it'll likely generate incorrect code. That can happen with the cast as well, 
but at least in that case, you're forced to be explicit about it, and it's 
automatically @system. If it's done for you, it'll be easy to miss and screw 
up.

2. It's often the case that you need to lock/unlock groups of stuff together 
such that locking specific variables is of often of limited use and would just 
introduce pointless extra locks when dealing with multiple variables. It would 
also increase the risk of deadlocks, because you wouldn't have much - if any - 
control over what order locks were acquired in when dealing with multiple 
shared variables.

- Jonathan M Davis

Re: Something needs to happen with shared, and soon.

On Wednesday, November 14, 2012 20:32:35 Andrei Alexandrescu wrote:
> TDPL 13.14 explains that inside synchronized classes, top-level shared
> is automatically lifted.

Then it's doing the casting for you. I suppose that that's an argument that 
using synchronized classes when dealing with shared is the way to go (which 
IIRC TDPL does argue), but that only applies to classes, and there are plenty 
of cases (maybe even the majority) where it's built-in types like arrays or 
AAs which people are trying to share, and synchronized classes won't help them 
there unless they create wrapper types. And explicit casting will be required 
for them. And of course, anyone wanting to use mutexes or synchronized blocks 
will have to use explicit casts regardless of what they're protecting, because 
it won't be inside a synchronized class. So, while synchronized classes make 
dealing with classes nicer, they only handle a very specific portion of  what 
might be used with shared.

In any case, I clearly need to reread TDPL's threading stuff (and maybe the 
whole book). It's been a while since I read it, and I'm getting rusty on the 
details.

By the way, speaking of synchronized classes, as I understand it, they're 
still broken with regards to TDPL in that synchronized is still used on 
functions rather than classes like TDPL describes. So, they aren't currently a 
solution regardless of what the language actual design is supposed to be. 
Obviously, that should be fixed though.

- Jonathan M Davis

Re: Something needs to happen with shared, and soon.

2012-11-15 Thread Regan Heath

On Thu, 15 Nov 2012 04:33:20 -, Michel Fortin  
 wrote:


On 2012-11-15 02:51:13 +, "Jonathan M Davis"   
said:


I have no idea what we want to do about this situation though.  
Regardless of
what we do with memory barriers and the like, it has no impact on  
whether

casts are required.


Let me restate and extend that idea to atomic operations. Declare a  
variable using the synchronized storage class and it automatically get a  
mutex:


synchronized int i; // declaration

i++; // error, variable shared

synchronized (i)
i++; // fine, variable is thread-local inside synchronized block

Synchronized here is some kind of storage class causing two things: a  
mutex is attached to the variable declaration, and the type of the  
variable is made shared. The variable being shared, you can't access it  
directly. But a synchronized statement will make the variable non-shared  
within its bounds.


Now, if you want a custom mutex class, write it like this:

synchronized(SpinLock) int i;

synchronized(i)
{
// implicit: i.mutexof.lock();
// implicit: scope (exit) i.mutexof.unlock();
i++;
}

If you want to declare the mutex separately, you could do it by  
specifying a variable instead of a type in the variable declaration:


Mutex m;
synchronized(m) int i;

synchronized(i)
{
// implicit: m.lock();
// implicit: scope (exit) m.unlock();
i++;
}

Also, if you have a read-write mutex and only need read access, you  
could declare that you only need read access using const:


synchronized(RWMutex) int i;

synchronized(const i)
{
// implicit: i.mutexof.constLock();
// implicit: scope (exit) i.mutexof.constUnlock();
i++; // error, i is const
}

And finally, if you want to use atomic operations, declare it this way:

synchronized(Atomic) int i;

You can't really synchronize on something protected by Atomic:

	syncronized(i) // cannot make sycnronized block, no lock/unlock method  
in Atomic

{}

But you can call operators on it while synchronized, it works for  
anything implemented by Atomic:


synchronized(i)++; // implicit: Atomic.opUnary!"++"(i);

Because the policy object is associated with the variable declaration,  
when locking the mutex you need direct access to the original variable,  
or an alias to it. Same for performing atomic operations. You can't pass  
a reference to some function and have that function perform the locking.  
If that's a problem it can be avoided by having a way to pass the mutex  
to the function, or by passing an alias to a template.


+1

I suggested something similar as did Sönke:
http://forum.dlang.org/thread/k7orpj$1tt5$1...@digitalmars.com?page=2#post-op.wnnuiio554xghj:40puck.auriga.bhead.co.uk

According to deadalnix the compiler magic I suggested to add the mutex  
isn't possible:

http://forum.dlang.org/thread/k7orpj$1tt5$1...@digitalmars.com?page=3#post-k7qsb5:242gqk:241:40digitalmars.com

Most of our ideas can be implemented with a wrapper template containing  
the sync object (mutex, etc).


So... my feeling is that the best solution for "shared", ignoring the  
memory barrier aspect which I would relegate to a different feature and  
solve a different way, is..


1. Remove the existing mutex from object.
2. Require that all objects passed to synchronized() {} statements  
implement a synchable(*) interface
3. Design a Shared(*) wrapper template/struct that contains a mutex and  
implements synchable(*)
4. Design a Shared(*) base class which contains a mutex and implements  
synchable(*)


Then we design classes which are always shared using the base class and we  
wrap other objects we want to share in Shared() and use them in  
synchronized statements.


This would then relegate any builtin "shared" statement to be solely a  
storage class which makes the object global and not thread local.


(*) names up for debate

R

--
Using Opera's revolutionary email client: http://www.opera.com/mail/

Re: Something needs to happen with shared, and soon.

On 15 November 2012 12:14, Jacob Carlborg  wrote:

> On 2012-11-15 10:22, Manu wrote:
>
>  Not to repeat my prev post... but in reply to Walter's take on it, it
>> would be interesting if 'shared' just added implicit lock()/unlock()
>> methods to do the mutex acquisition and then remove the cast
>> requirement, but have the language runtime assert that the object is
>> locked whenever it is accessed (this guarantees the safety in a more
>> useful way, the casts are really annying). I can't imagine a simpler and
>> more immediately useful solution.
>>
>
> How about implementing a library function, something like this:
>
> shared int i;
>
> lock(i, (x) {
> // operate on x
> });
>
> * "lock" will acquire a lock
> * Cast away shared for "i"
> * Call the delegate with the now plain "int"
> * Release the lock
>
> http://pastebin.com/tfQ12nJB

Interesting concept. Nice idea, could certainly be useful, but it doesn't
address the problem as directly as my suggestion.
There are still many problem situations, for instance, any time a template
is involved. The template doesn't know to do that internally, but under my
proposal, you lock it prior to the workload, and then the template works as
expected. Templates won't just break and fail whenever shared is involved,
because assignments would be legal. They'll just assert that the thing is
locked at the time, which is the programmers responsibility to ensure.

Re: Something needs to happen with shared, and soon.

2012-11-15 Thread Jacob Carlborg


On 2012-11-15 10:22, Manu wrote:


Not to repeat my prev post... but in reply to Walter's take on it, it
would be interesting if 'shared' just added implicit lock()/unlock()
methods to do the mutex acquisition and then remove the cast
requirement, but have the language runtime assert that the object is
locked whenever it is accessed (this guarantees the safety in a more
useful way, the casts are really annying). I can't imagine a simpler and
more immediately useful solution.


How about implementing a library function, something like this:

shared int i;

lock(i, (x) {
// operate on x
});

* "lock" will acquire a lock
* Cast away shared for "i"
* Call the delegate with the now plain "int"
* Release the lock

http://pastebin.com/tfQ12nJB

--
/Jacob Carlborg

Re: Something needs to happen with shared, and soon.

2012-11-15 Thread Jacob Carlborg


On 2012-11-14 22:06, Walter Bright wrote:


I hate to repeat myself, but:

Thread 1:
 1. create shared object
 2. pass reference to that object to Thread 2
 3. destroy object

Thread 2:
 1. manipulate that object


Why would the object be destroyed if there's still a reference to it? If 
the object is manually destroyed I don't see what threads have to do 
with it since you can do the same thing in a single thread application.


--
/Jacob Carlborg

Re: Something needs to happen with shared, and soon.

On 15 November 2012 04:30, Andrei Alexandrescu <
seewebsiteforem...@erdani.org> wrote:

> On 11/11/12 6:30 PM, Walter Bright wrote:
>
>> 1. ensure single threaded access by aquiring a mutex
>> 2. cast away shared
>> 3. operate on the data
>> 4. cast back to shared
>> 5. release the mutex
>>
>
> This is very different from how I view we should do things (and how we
> actually agreed to do things and how I wrote in TDPL).
>
> I can't believe I need to restart this on a cold cache.

The pattern Walter describes is primitive and useful, I'd like to see
shared assist to that end (see my previous post).
You can endeavour to do any other fancy stuff you like, but until some
distant future when it's actually done, then proven and well supported,
I'll keep doing this.

Not to repeat my prev post... but in reply to Walter's take on it, it would
be interesting if 'shared' just added implicit lock()/unlock() methods to
do the mutex acquisition and then remove the cast requirement, but have the
language runtime assert that the object is locked whenever it is accessed
(this guarantees the safety in a more useful way, the casts are really
annying). I can't imagine a simpler and more immediately useful solution.

In fact, it's a reasonably small step to this being possible with
user-defined attributes. Although attributes have no current mechanism to
add a mutex, and lock/unlock methods to the object being attributed (like
is possible in Java/C#), but maybe it's not a huge leap.

Re: Something needs to happen with shared, and soon.

On 14 November 2012 19:54, Andrei Alexandrescu <
seewebsiteforem...@erdani.org> wrote:

> On 11/14/12 9:31 AM, David Nadlinger wrote:
>
>> On Wednesday, 14 November 2012 at 15:08:35 UTC, Andrei Alexandrescu wrote:
>>
>>> Sorry, I was imprecise. We need to (a) define intrinsics for loading
>>> and storing data with high-level semantics (a short list: acquire,
>>> release, acquire+release, and sequentially-consistent) and THEN (b)
>>> implement the needed code generation appropriately for each
>>> architecture. Indeed on x86 there is little need to insert fence
>>> instructions, BUT there is a definite need for the compiler to prevent
>>> certain reorderings. That's why implementing shared data operations
>>> (whether implicit or explicit) as sheer library code is NOT possible.
>>>
>>
>> Sorry, I didn't see this message of yours before replying (the perils of
>> threaded news readers…).
>>
>> You are right about the fact that we need some degree of compiler
>> support for atomic instructions. My point was that is it already
>> available, otherwise it would have been impossible to implement
>> core.atomic.{atomicLoad, atomicStore} (for DMD inline asm is used, which
>> prohibits compiler code motion).
>>
>
> Yah, the whole point here is that we need something IN THE LANGUAGE
> DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION.
>
> THIS IS VERY IMPORTANT.

I won't outright disagree, but this seems VERY dangerous to me.

You need to carefully study all popular architectures, and consider that if
the language is made to depend on these primitives, and the architecture
doesn't support it, or support that particular style of implementation
(fairly likely), than D will become incompatible with a huge number of
architectures on that day.

This is a very big deal. I would be scared to see the compiler generate
intrinsic calls to atomic synchronisation primitives. It's almost like
banning architectures from the language.

The Nintendo Wii for instance, not an unpopular machine, only sold 130
million units! Does not have synchronisation instructions in the
architecture (insane, I know, but there it is. I've had to spend time
working around this in the past).
I'm sure it's not unique in this way.

People getting fancy with lock-free/atomic operations will probably wrap it
up in libraries. And they're not globally applicable, atomic memory
operations don't magically solve problems, they require very specific
structures and access patterns around them. I'm just not convinced they
should be intrinsics issued by the language. They're just not as well
standardised as 'int' or 'float'.

Side note: I still think a convenient and fairly practical solution is to
make 'shared' things 'lockable'; where you can lock()/unlock() them, and
assignment to/from shared things is valid (no casting), but a runtime
assert insists that the entity is locked whenever it is accessed.* *It's
simplistic, but it's safe, and it works with the same primitives that
already exist and are proven. Let the programmer mark the lock/unlock
moments, worry about sequencing, etc... at least for the time being. Don't
try and do it automatically (yet).
The broad use cases in D aren't yet known, but making 'shared' useful today
would be valuable.

 Thus, »we«, meaning on a language level, don't need to change anything
>> about the current situations, with the possible exception of adding
>> finer-grained control to core.atomic.MemoryOrder/mysnc [1]. It is the
>> duty of the compiler writers to provide the appropriate means to
>> implement druntime on their code generation infrastructure – and indeed,
>> the situation in DMD could be improved, using inline asm is hitting a
>> fly with a sledgehammer.
>>
>
> That is correct. My point is that compiler implementers would follow some
> specification. That specification would contain informationt hat atomicLoad
> and atomicStore must have special properties that put them apart from any
> other functions.
>
>
>  David
>>
>>
>> [1] I am not sure where the point of diminishing returns is here,
>> although it might make sense to provide the same options as C++11. If I
>> remember correctly, D1/Tango supported a lot more levels of
>> synchronization.
>>
>
> We could start with sequential consistency and then explore riskier/looser
> policies.
>
>
> Andrei
>

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Michel Fortin


On 2012-11-15 02:51:13 +, "Jonathan M Davis"  said:


I have no idea what we want to do about this situation though. Regardless of
what we do with memory barriers and the like, it has no impact on whether
casts are required.


One thing I'm confused about right now is how people are using shared. 
If you're using shared with atomic operations, then you need barriers 
when accessing or mutating the variable. If you're using shared with 
mutexes, spin-locks, etc., you don't care about the barriers. But you 
can't use it with both at the same time. So which of these shared 
stands for?


In both of these cases, there's an implicit policy for accessing or 
mutating the variable. I think the language need some way to express 
that policy. I suggested some time ago a way to protect variables with 
mutexes so that the compiler can actually help you use those mutexes 
correctly[1]. The idea was to associate a mutex to the variable 
declaration. This could be extended to support an atomic access policy.


Let me restate and extend that idea to atomic operations. Declare a 
variable using the synchronized storage class and it automatically get 
a mutex:


synchronized int i; // declaration

i++; // error, variable shared

synchronized (i)
i++; // fine, variable is thread-local inside synchronized block

Synchronized here is some kind of storage class causing two things: a 
mutex is attached to the variable declaration, and the type of the 
variable is made shared. The variable being shared, you can't access it 
directly. But a synchronized statement will make the variable 
non-shared within its bounds.


Now, if you want a custom mutex class, write it like this:

synchronized(SpinLock) int i;

synchronized(i)
{
// implicit: i.mutexof.lock();
// implicit: scope (exit) i.mutexof.unlock();
i++;
}

If you want to declare the mutex separately, you could do it by 
specifying a variable instead of a type in the variable declaration:


Mutex m;
synchronized(m) int i;

synchronized(i)
{
// implicit: m.lock();
// implicit: scope (exit) m.unlock();
i++;
}

Also, if you have a read-write mutex and only need read access, you 
could declare that you only need read access using const:


synchronized(RWMutex) int i;

synchronized(const i)
{
// implicit: i.mutexof.constLock();
// implicit: scope (exit) i.mutexof.constUnlock();
i++; // error, i is const
}

And finally, if you want to use atomic operations, declare it this way:

synchronized(Atomic) int i;

You can't really synchronize on something protected by Atomic:

	syncronized(i) // cannot make sycnronized block, no lock/unlock method 
in Atomic

{}

But you can call operators on it while synchronized, it works for 
anything implemented by Atomic:


synchronized(i)++; // implicit: Atomic.opUnary!"++"(i);

Because the policy object is associated with the variable declaration, 
when locking the mutex you need direct access to the original variable, 
or an alias to it. Same for performing atomic operations. You can't 
pass a reference to some function and have that function perform the 
locking. If that's a problem it can be avoided by having a way to pass 
the mutex to the function, or by passing an alias to a template.


Okay, this syntax probably still has some problems, feel free to point 
them out. I don't really care about the syntax though. The important 
thing is that you need a way to define the policy for accessing the 
shared data in a way the compiler can actually enforce it and that 
programmers can actually reuse it.


Because right now there is no policy. Having to cast things everywhere 
is equivalent to having to redefine the policy everywhere. Same for 
having to write encapsulation types that work with shared for 
everything you want to share: each type has to implement the policy. 
There's nothing worse than constantly rewriting the sharing policies. 
Concurrency error-prone because of all the subtleties; you don't want 
to encourage people to write policies of their own every time they 
invent a new type. You need to reuse existing ones, and the compiler 
can help with that.


[1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/


--
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca/

Re: Something needs to happen with shared, and soon.


On 11/14/12 7:24 PM, Jonathan M Davis wrote:

On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:

I have no idea what we want to do about this situation though. Regardless of
what we do with memory barriers and the like, it has no impact on whether
casts are required. And I think that introducing the shared equivalent of
const would be a huge mistake, because then most code would end up being
written using that attribute, meaning that all code essentially has to be
treated as shared from the standpoint of compiler optimizations. It would
almost be the same as making everything shared by default again. So, as far
as I can see, casting is what we're forced to do.


Actually, I think that what it comes down to is that shared works nicely when
you have a type which is designed to be shared, and it encapsulates everything
that it needs. Where it starts requiring casting is when you need to pass it
to other stuff.

- Jonathan M Davis


TDPL 13.14 explains that inside synchronized classes, top-level shared 
is automatically lifted.


Andrei

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Jonathan M Davis

On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:
> I have no idea what we want to do about this situation though. Regardless of
> what we do with memory barriers and the like, it has no impact on whether
> casts are required. And I think that introducing the shared equivalent of
> const would be a huge mistake, because then most code would end up being
> written using that attribute, meaning that all code essentially has to be
> treated as shared from the standpoint of compiler optimizations. It would
> almost be the same as making everything shared by default again. So, as far
> as I can see, casting is what we're forced to do.

Actually, I think that what it comes down to is that shared works nicely when 
you have a type which is designed to be shared, and it encapsulates everything 
that it needs. Where it starts requiring casting is when you need to pass it 
to other stuff.

- Jonathan M Davis

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Jonathan M Davis

On Thursday, November 15, 2012 04:12:47 Andrej Mitrovic wrote:
> On 11/15/12, Jonathan M Davis  wrote:
> > From what I recall of what TDPL says
> 
> It says (on p.413) reading and writing shared values are guaranteed to
> be atomic, for pointers, arrays, function pointers, delegates, class
> references, and struct types containing exactly one of these types.
> Reals are not supported.
> 
> It also talks about automatically inserting memory barriers on page 414.

Good to know, but none of that really has anything to do with the casting, 
which is what I was responding to. And looking at that list, it sounds 
reasonable that all of that would be guaranteed to be atomic, but I think that 
the fundamental problem that's affecting usability is all of the casting that's 
typically required. And I don't see any way around that other than writing 
code that doesn't need to pass shared objects around or using templates very 
heavily.

- Jonathan M Davis

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Andrej Mitrovic

On 11/15/12, Jonathan M Davis  wrote:
> From what I recall of what TDPL says

It says (on p.413) reading and writing shared values are guaranteed to
be atomic, for pointers, arrays, function pointers, delegates, class
references, and struct types containing exactly one of these types.
Reals are not supported.

It also talks about automatically inserting memory barriers on page 414.

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Jonathan M Davis

On Wednesday, November 14, 2012 18:30:56 Andrei Alexandrescu wrote:
> On 11/11/12 6:30 PM, Walter Bright wrote:
> > 1. ensure single threaded access by aquiring a mutex
> > 2. cast away shared
> > 3. operate on the data
> > 4. cast back to shared
> > 5. release the mutex
> 
> This is very different from how I view we should do things (and how we
> actually agreed to do things and how I wrote in TDPL).
> 
> I can't believe I need to restart this on a cold cache.

Well, this is clearly how things work now, and if you want to use shared with 
much of anything, it's how things generally have to work, because almost 
nothing takes shared. Templated stuff will at least some of the time (though 
it's often untested for it and probably will get screwed by Unqual in quite a 
few cases), but there's no way aside from templates or casting to get shared 
variables to share the same functions as non-shared ones, leading to code 
duplication.

>From what I recall of what TDPL says, this doesn't really contradict it. It's 
just that TDPL doesn't really say much about the fact that almost nothing will 
work with shared, which means that casting is necessary.

I have no idea what we want to do about this situation though. Regardless of 
what we do with memory barriers and the like, it has no impact on whether 
casts are required. And I think that introducing the shared equivalent of 
const would be a huge mistake, because then most code would end up being 
written using that attribute, meaning that all code essentially has to be 
treated as shared from the standpoint of compiler optimizations. It would 
almost be the same as making everything shared by default again. So, as far as 
I can see, casting is what we're forced to do.

- Jonathan M Davis

Re: Something needs to happen with shared, and soon.


On 11/11/12 6:30 PM, Walter Bright wrote:

1. ensure single threaded access by aquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex


This is very different from how I view we should do things (and how we 
actually agreed to do things and how I wrote in TDPL).


I can't believe I need to restart this on a cold cache.


Andrei

Re: Something needs to happen with shared, and soon.


On 11/14/12 4:50 PM, Sean Kelly wrote:

On Nov 14, 2012, at 2:25 PM, Andrei
Alexandrescu  wrote:


On 11/14/12 1:09 PM, Walter Bright wrote:

Yes. And also, I agree that having something typed as "shared"
must prevent the compiler from reordering them. But that's
separate from inserting memory barriers.


It's the same issue at hand: ordering properly and inserting
barriers are two ways to ensure one single goal, sequential
consistency. Same thing.


Sequential consistency is great and all, but it doesn't render
concurrent code correct.  At worst, it provides a false sense of
security that somehow it does accomplish this, and people end up
actually using it as such.


Yah, but the baseline here is acquire-release which has subtle 
differences that are all the more maddening.


Andrei

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Jason House


On Monday, 12 November 2012 at 02:31:05 UTC, Walter Bright wrote:


To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by aquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex


This is a fairly reasonable use of shared, but it is bypassing 
the type system. Once shared is cast away, it is free to be mixed 
with thread local variables. Pieces can be assigned to non-shared 
globals, impure functions can stash reference, weakly pure 
functions can mix their arguments together, etc... If locking 
converts shared(T) to bikeshed(T), I bet some of safeD's logic 
for no escaping references could be used to improve things.


It's also interesting to note that casting away shared after 
taking a lock implicitly means that everything was transitively 
owned by that lock. I wonder how well a library could 
promote/enforce such a thing?

Re: Something needs to happen with shared, and soon.

On Nov 14, 2012, at 2:25 PM, Andrei Alexandrescu 
 wrote:

> On 11/14/12 1:09 PM, Walter Bright wrote:
>> Yes. And also, I agree that having something typed as "shared" must
>> prevent the compiler from reordering them. But that's separate from
>> inserting memory barriers.
> 
> It's the same issue at hand: ordering properly and inserting barriers are two 
> ways to ensure one single goal, sequential consistency. Same thing.

Sequential consistency is great and all, but it doesn't render concurrent code 
correct.  At worst, it provides a false sense of security that somehow it does 
accomplish this, and people end up actually using it as such.

Re: Something needs to happen with shared, and soon.

On Nov 14, 2012, at 2:21 PM, Andrei Alexandrescu 
 wrote:

> On 11/14/12 12:00 PM, Sean Kelly wrote:
>> On Nov 14, 2012, at 6:16 AM, Andrei 
>> Alexandrescu  wrote:
>> 
>>> On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
> If the compiler should/does not add memory barriers, then is there a
> reason for
> having it built into the language? Can a library solution be enough?
 
 Memory barriers can certainly be added using library functions.
>>> 
>>> The compiler must understand the semantics of barriers such as e.g. it 
>>> doesn't hoist code above an acquire barrier or below a release barrier.
>> 
>> That was the point of the now deprecated "volatile" statement.  I still 
>> don't entirely understand why it was deprecated.
> 
> Because it's better to associate volatility with data than with code.

Fair enough.  Though this may mean building a bunch of different forms of 
volatility into the language.  I always saw "volatile" as a library tool 
anyway, so while making it code-related was a bit weird, it was a sufficient 
tool for the job.

Re: Something needs to happen with shared, and soon.


On 11/14/12 1:09 PM, Walter Bright wrote:

Yes. And also, I agree that having something typed as "shared" must
prevent the compiler from reordering them. But that's separate from
inserting memory barriers.


It's the same issue at hand: ordering properly and inserting barriers 
are two ways to ensure one single goal, sequential consistency. Same thing.


Andrei

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread luka8088


On 14.11.2012 20:54, Sean Kelly wrote:

On Nov 13, 2012, at 1:14 AM, luka8088  wrote:


On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:

On 12.11.2012 3:30, Walter Bright wrote:

On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:

It's starting to get outright embarrassing to talk to newcomers about D's
concurrency support because the most fundamental part of it -- the
shared type
qualifier -- does not have well-defined semantics at all.


I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that
type work in a concurrent context, regardless of what is done with
memory barriers. Memory barriers ensure sequential consistency, they do
nothing for race conditions that are sequentially consistent. Remember,
single core CPUs are all sequentially consistent, and still have major
concurrency problems. This also means that having templates accept
shared(T) as arguments and have them magically generate correct
concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to
ever work. Adding barriers has to be done by someone who knows what
they're doing for that particular use case, and the compiler inserting
them is not going to substitute.


However, and this is a big however, having shared as compiler-enforced
self-documentation is immensely useful. It flags where and when data is
being shared. So, your algorithm won't compile when you pass it a shared
type? That is because it is NEVER GOING TO WORK with a shared type. At
least you get a compile time indication of this, rather than random
runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by aquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.



This clarifies a lot, but still a lot of people get confused with:
http://dlang.org/faq.html#shared_memory_barriers
is it a faq error ?

and also with http://dlang.org/faq.html#shared_guarantees said, I come to think 
that the fact that the following code compiles is either lack of 
implementation, a compiler bug or a faq error ?


//

import core.thread;

void main () {
  int i;
  (new Thread({ i++; })).start();
}


It's intentional.  core.thread is for people who know what they're doing, and 
there are legitimate uses along these lines:

void main() {
 int i;
 auto t = new Thread({i++;});
 t.start();
 t.join();
 write(i);
}

This is perfectly safe and has a deterministic result.


Yes, that makes perfect sense... I just wanted to point out the 
misguidance in FAQ because (at least before this forum thread) there is 
not much written about shared and you can get a wrong idea from it (at 
least I did).

Re: Something needs to happen with shared, and soon.


On 11/14/12 1:06 PM, Walter Bright wrote:

On 11/14/2012 3:14 AM, Benjamin Thaut wrote:

A small code example which would break as soon as we allow destructing
of shared
value types would really be nice.


I hate to repeat myself, but:

Thread 1:
1. create shared object
2. pass reference to that object to Thread 2


That should be disallowed at least in safe code. If I had my way I'd 
explore disallowing in all code.


Andrei

Re: Something needs to happen with shared, and soon.


On 11/14/12 12:00 PM, Sean Kelly wrote:

On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu 
 wrote:


On 11/14/12 1:20 AM, Walter Bright wrote:

On 11/13/2012 11:37 PM, Jacob Carlborg wrote:

If the compiler should/does not add memory barriers, then is there a
reason for
having it built into the language? Can a library solution be enough?


Memory barriers can certainly be added using library functions.


The compiler must understand the semantics of barriers such as e.g. it doesn't 
hoist code above an acquire barrier or below a release barrier.


That was the point of the now deprecated "volatile" statement.  I still don't 
entirely understand why it was deprecated.


Because it's better to associate volatility with data than with code.

Andrei

Re: Something needs to happen with shared, and soon.


On 11/14/12 12:04 PM, Sean Kelly wrote:

On Nov 14, 2012, at 9:50 AM, Andrei Alexandrescu 
 wrote:


First, there are more kinds of atomic loads and stores. Then, the fact that the 
calls are not supposed to be reordered must be a guarantee of the language, not 
a speculation about an implementation. We can't argue that a feature works just 
because it so happens an implementation works a specific way.


I've always been a fan of release consistency, and it dovetails well with the 
behavior of mutexes (http://en.wikipedia.org/wiki/Release_consistency).  It 
would be cool if we could sort out transactional memory as well, but that's not 
a short term thing.


I think we should focus on sequential consistency as that's where the 
industry is converging.


Andrei

Re: Something needs to happen with shared, and soon.


On 11/14/12 11:21 AM, Iain Buclaw wrote:

On 14 November 2012 17:50, Andrei Alexandrescu
  wrote:

On 11/14/12 9:15 AM, David Nadlinger wrote:


On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:


On 11/14/12 1:20 AM, Walter Bright wrote:


On 11/13/2012 11:37 PM, Jacob Carlborg wrote:


If the compiler should/does not add memory barriers, then is there a
reason for
having it built into the language? Can a library solution be enough?



Memory barriers can certainly be added using library functions.



The compiler must understand the semantics of barriers such as e.g. it
doesn't hoist code above an acquire barrier or below a release barrier.



Again, this is true, but it would be a fallacy to conclude that
compiler-inserted memory barriers for »shared« are required due to this
(and it is »shared« we are discussing here!).

Simply having compiler intrinsics for atomic loads/stores is enough,
which is hardly »built into the language«.



Compiler intrinsics == built into the language.

Andrei



Not necessarily. For example, printf is a compiler intrinsic for GDC,
but it's not built into the language in the sense of the compiler
*provides* the codegen for it.  Though it is aware of what it is and
what it does, so can perform relevant optimisations around the use of
it.


aware of what it is and what it does == built into the language.

Andrei

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Alex Rønne Petersen


On 14-11-2012 21:15, Sean Kelly wrote:

On Nov 14, 2012, at 12:07 PM, Alex Rønne Petersen  wrote:


On 14-11-2012 21:00, Sean Kelly wrote:

On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu 
 wrote:


On 11/14/12 1:20 AM, Walter Bright wrote:

On 11/13/2012 11:37 PM, Jacob Carlborg wrote:

If the compiler should/does not add memory barriers, then is there a
reason for
having it built into the language? Can a library solution be enough?


Memory barriers can certainly be added using library functions.


The compiler must understand the semantics of barriers such as e.g. it doesn't 
hoist code above an acquire barrier or below a release barrier.


That was the point of the now deprecated "volatile" statement.  I still don't 
entirely understand why it was deprecated.



The volatile statement was too general. All relevant compiler back ends today 
only know of two kinds of volatile operations: Loads and stores. Volatile 
statements couldn't ever be properly implemented in GDC and LDC for example.


Well, the semantics of volatile are that there's an acquire barrier before the 
statement block and a release barrier after the statement block.  Or for a 
first cut just insert a full barrier at the beginning and end of the block.  
Either way, it should be pretty simply for a compiler to handle if the compiler 
supports mutex use.

I do like the idea of built-in load and store intrinsics only because D only 
supports x86 assembler right now.  But really, it would be just as easy to fan 
out a D template function to a bunch of C functions implemented in separate ASM 
code files.  Druntime actually had this for core.atomic on PPC until not too 
long ago.



Well, there's not much point in that when all compilers have intrinsics 
anyway (e.g. GDC has __sync_* and __atomic_* and LDC has some intrinsics 
in ldc.intrinsics that map to certain LLVM instructions).


--
Alex Rønne Petersen
a...@lycus.org
http://lycus.org

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Walter Bright


On 11/14/2012 7:08 AM, Andrei Alexandrescu wrote:

On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:

On 14-11-2012 15:14, Andrei Alexandrescu wrote:

On 11/14/12 1:19 AM, Walter Bright wrote:

On 11/13/2012 11:56 PM, Jonathan M Davis wrote:

Being able to have double-checked locking work would be valuable, and
having
memory barriers would reduce race condition weirdness when locks
aren't used
properly, so I think that it would be desirable to have memory
barriers.


I'm not saying "memory barriers are bad". I'm saying that having the
compiler blindly insert them for shared reads/writes is far from the
right way to do it.


Let's not hasten. That works for Java and C#, and is allowed in C++.

Andrei




I need some clarification here: By memory barrier, do you mean x86's
mfence, sfence, and lfence?


Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing
data with high-level semantics (a short list: acquire, release, acquire+release,
and sequentially-consistent) and THEN (b) implement the needed code generation
appropriately for each architecture. Indeed on x86 there is little need to
insert fence instructions, BUT there is a definite need for the compiler to
prevent certain reorderings. That's why implementing shared data operations
(whether implicit or explicit) as sheer library code is NOT possible.


Because as Walter said, inserting those blindly when unnecessary can
lead to terrible performance because it practically murders
pipelining.


I think at this point we need to develop a better understanding of what's going
on before issuing assessments.


Yes. And also, I agree that having something typed as "shared" must prevent the 
compiler from reordering them. But that's separate from inserting memory barriers.

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Walter Bright


On 11/14/2012 3:14 AM, Benjamin Thaut wrote:

A small code example which would break as soon as we allow destructing of shared
value types would really be nice.


I hate to repeat myself, but:

Thread 1:
1. create shared object
2. pass reference to that object to Thread 2
3. destroy object

Thread 2:
1. manipulate that object

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Michel Fortin


On 2012-11-14 14:30:19 +, Timon Gehr  said:


On 11/14/2012 01:42 PM, Michel Fortin wrote:

On 2012-11-14 10:30:46 +, Timon Gehr  said:


So do I. A thread-local static variable does not imply global state.
(The execution stack is static.) Eg. in a few cases it is sensible to
use static variables as implicit arguments to avoid having to pass
them around by copying them all over the execution stack.

private int x = 0;

int foo(){
 int xold = x;
 scope(exit) x = xold;
 x = new_value;
 bar(); // reads x
 return baz(); // reads x
}


I'd consider that poor style.


I'd consider this a poor statement to make. Universally quantified 
assertions require more rigorous justification.


Indeed. There's not enough context to judge fairly. I can accept the 
idea there are situations where it is really inconvenient or impossible 
to pass the state as an argument.


That said, I disagree that this is not using global state. It might not 
be globally accessible (because x is private), but the state still 
exists globally since variable x exists in all threads irrespective of 
whether they use foo or not.



If done in such a way that it makes refactoring error prone, it is to 
be considered poor style.


I guess we agree.

--
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca/

Re: Something needs to happen with shared, and soon.

On Nov 14, 2012, at 9:50 AM, Andrei Alexandrescu 
 wrote:
> 
> First, there are more kinds of atomic loads and stores. Then, the fact that 
> the calls are not supposed to be reordered must be a guarantee of the 
> language, not a speculation about an implementation. We can't argue that a 
> feature works just because it so happens an implementation works a specific 
> way.

I've always been a fan of release consistency, and it dovetails well with the 
behavior of mutexes (http://en.wikipedia.org/wiki/Release_consistency).  It 
would be cool if we could sort out transactional memory as well, but that's not 
a short term thing.

Re: Something needs to happen with shared, and soon.

On Nov 13, 2012, at 1:14 AM, luka8088  wrote:

> On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
>> On 12.11.2012 3:30, Walter Bright wrote:
>>> On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
>>> 
>>> I think a couple things are clear:
>>> 
>>> 1. Slapping shared on a type is never going to make algorithms on that
>>> type work in a concurrent context, regardless of what is done with
>>> memory barriers. Memory barriers ensure sequential consistency, they do
>>> nothing for race conditions that are sequentially consistent. Remember,
>>> single core CPUs are all sequentially consistent, and still have major
>>> concurrency problems. This also means that having templates accept
>>> shared(T) as arguments and have them magically generate correct
>>> concurrent code is a pipe dream.
>>> 
>>> 2. The idea of shared adding memory barriers for access is not going to
>>> ever work. Adding barriers has to be done by someone who knows what
>>> they're doing for that particular use case, and the compiler inserting
>>> them is not going to substitute.
>>> 
>>> 
>>> However, and this is a big however, having shared as compiler-enforced
>>> self-documentation is immensely useful. It flags where and when data is
>>> being shared. So, your algorithm won't compile when you pass it a shared
>>> type? That is because it is NEVER GOING TO WORK with a shared type. At
>>> least you get a compile time indication of this, rather than random
>>> runtime corruption.
>>> 
>>> To make a shared type work in an algorithm, you have to:
>>> 
>>> 1. ensure single threaded access by aquiring a mutex
>>> 2. cast away shared
>>> 3. operate on the data
>>> 4. cast back to shared
>>> 5. release the mutex
>>> 
>>> Also, all op= need to be disabled for shared types.
>> 
>> 
>> This clarifies a lot, but still a lot of people get confused with:
>> http://dlang.org/faq.html#shared_memory_barriers
>> is it a faq error ?
>> 
>> and also with http://dlang.org/faq.html#shared_guarantees said, I come to 
>> think that the fact that the following code compiles is either lack of 
>> implementation, a compiler bug or a faq error ?
> 
> //
> 
> import core.thread;
> 
> void main () {
>  int i;
>  (new Thread({ i++; })).start();
> }

It's intentional.  core.thread is for people who know what they're doing, and 
there are legitimate uses along these lines:

void main() {
int i;
auto t = new Thread({i++;});
t.start();
t.join();
write(i);
}

This is perfectly safe and has a deterministic result.

Re: Something needs to happen with shared, and soon.

On Nov 14, 2012, at 12:07 PM, Alex Rønne Petersen  wrote:

> On 14-11-2012 21:00, Sean Kelly wrote:
>> On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu 
>>  wrote:
>> 
>>> On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
> If the compiler should/does not add memory barriers, then is there a
> reason for
> having it built into the language? Can a library solution be enough?

 Memory barriers can certainly be added using library functions.
>>> 
>>> The compiler must understand the semantics of barriers such as e.g. it 
>>> doesn't hoist code above an acquire barrier or below a release barrier.
>> 
>> That was the point of the now deprecated "volatile" statement.  I still 
>> don't entirely understand why it was deprecated.
>> 
> 
> The volatile statement was too general. All relevant compiler back ends today 
> only know of two kinds of volatile operations: Loads and stores. Volatile 
> statements couldn't ever be properly implemented in GDC and LDC for example.

Well, the semantics of volatile are that there's an acquire barrier before the 
statement block and a release barrier after the statement block.  Or for a 
first cut just insert a full barrier at the beginning and end of the block.  
Either way, it should be pretty simply for a compiler to handle if the compiler 
supports mutex use.

I do like the idea of built-in load and store intrinsics only because D only 
supports x86 assembler right now.  But really, it would be just as easy to fan 
out a D template function to a bunch of C functions implemented in separate ASM 
code files.  Druntime actually had this for core.atomic on PPC until not too 
long ago.

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Alex Rønne Petersen


On 14-11-2012 21:00, Sean Kelly wrote:

On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu 
 wrote:


On 11/14/12 1:20 AM, Walter Bright wrote:

On 11/13/2012 11:37 PM, Jacob Carlborg wrote:

If the compiler should/does not add memory barriers, then is there a
reason for
having it built into the language? Can a library solution be enough?


Memory barriers can certainly be added using library functions.


The compiler must understand the semantics of barriers such as e.g. it doesn't 
hoist code above an acquire barrier or below a release barrier.


That was the point of the now deprecated "volatile" statement.  I still don't 
entirely understand why it was deprecated.



The volatile statement was too general. All relevant compiler back ends 
today only know of two kinds of volatile operations: Loads and stores. 
Volatile statements couldn't ever be properly implemented in GDC and LDC 
for example.


See also: http://prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP20

--
Alex Rønne Petersen
a...@lycus.org
http://lycus.org

Re: Something needs to happen with shared, and soon.

On Nov 14, 2012, at 12:01 PM, Sean Kelly  wrote:

> On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu 
>  wrote:
>> 
>> This is a simplification of what should be going on. The 
>> core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the 
>> compiler generate sequentially consistent code with them (i.e. not perform 
>> certain reorderings). Then there are loads and stores with weaker 
>> consistency semantics (acquire, release, acquire/release, and consume).
> 
> No.  These functions all contain volatile ask blocks.  If the compiler 
> respected the "volatile" it would be enough.

asm blocks.  Darn auto-correct.

Re: Something needs to happen with shared, and soon.

On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu 
 wrote:
> 
> This is a simplification of what should be going on. The 
> core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the 
> compiler generate sequentially consistent code with them (i.e. not perform 
> certain reorderings). Then there are loads and stores with weaker consistency 
> semantics (acquire, release, acquire/release, and consume).

No.  These functions all contain volatile ask blocks.  If the compiler 
respected the "volatile" it would be enough.

Re: Something needs to happen with shared, and soon.

On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu 
 wrote:

> On 11/14/12 1:20 AM, Walter Bright wrote:
>> On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
>>> If the compiler should/does not add memory barriers, then is there a
>>> reason for
>>> having it built into the language? Can a library solution be enough?
>> 
>> Memory barriers can certainly be added using library functions.
> 
> The compiler must understand the semantics of barriers such as e.g. it 
> doesn't hoist code above an acquire barrier or below a release barrier.

That was the point of the now deprecated "volatile" statement.  I still don't 
entirely understand why it was deprecated.

Re: Something needs to happen with shared, and soon.

On Nov 12, 2012, at 2:57 AM, Johannes Pfau  wrote:

> Am Sun, 11 Nov 2012 18:30:17 -0800
> schrieb Walter Bright :
> 
>> 
>> To make a shared type work in an algorithm, you have to:
>> 
>> 1. ensure single threaded access by aquiring a mutex
>> 2. cast away shared
>> 3. operate on the data
>> 4. cast back to shared
>> 5. release the mutex
>> 
>> Also, all op= need to be disabled for shared types.
> 
> But there are also shared member functions and they're kind of annoying
> right now:
> 
> * You can't call shared methods from non-shared methods or vice versa.
>  This leads to code duplication, you basically have to implement
>  everything twice:
> 
> --
> struct ABC
> {
>Mutext mutex;
>   void a()
>   {
>   aImpl();
>   }
>   shared void a()
>   {
>   synchronized(mutex)
>   aImpl();  //not allowed
>   }
>   private void aImpl()
>   {
>   
>   }
> }
> --
> The only way to avoid this is casting away shared in the shared a
> method, but that really is annoying.

Yes.  You end up having two methods for each function, one as a synchronized 
wrapper that casts away shared and another that does the actual work.

> and then there's also the druntime issue: core.sync doesn't work with
> shared which leads to this schizophrenic situation:
> struct A
> {
>Mutex m;
>void a() //Doesn't compile with shared
>{
>m.lock();  //Compiles, but locks on a TLS mutex!
>m.unlock();
>}
> }

Most of the reason for this was that I didn't like the old implications of 
shared, which was that shared methods would at some time in the future end up 
with memory barriers all over the place.  That's been dropped, but I'm still 
not a fan of the wrapper method for each function.  It makes for a crappy class 
design.

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Jacob Carlborg


On 2012-11-14 18:40, Andrei Alexandrescu wrote:


Memory ordering must be built into the language and understood by the
compiler.


Ok, thanks for the expatiation.

--
/Jacob Carlborg

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Jacob Carlborg


On 2012-11-14 18:36, Andrei Alexandrescu wrote:


The hypothesis that atomic primitives can be implemented as a library.


I don't know these kind of things, that's why I'm asking.

--
/Jacob Carlborg

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread Iain Buclaw

On 14 November 2012 17:50, Andrei Alexandrescu
 wrote:
> On 11/14/12 9:15 AM, David Nadlinger wrote:
>>
>> On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:
>>>
>>> On 11/14/12 1:20 AM, Walter Bright wrote:

 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
>
> If the compiler should/does not add memory barriers, then is there a
> reason for
> having it built into the language? Can a library solution be enough?


 Memory barriers can certainly be added using library functions.
>>>
>>>
>>> The compiler must understand the semantics of barriers such as e.g. it
>>> doesn't hoist code above an acquire barrier or below a release barrier.
>>
>>
>> Again, this is true, but it would be a fallacy to conclude that
>> compiler-inserted memory barriers for »shared« are required due to this
>> (and it is »shared« we are discussing here!).
>>
>> Simply having compiler intrinsics for atomic loads/stores is enough,
>> which is hardly »built into the language«.
>
>
> Compiler intrinsics == built into the language.
>
> Andrei
>

Not necessarily. For example, printf is a compiler intrinsic for GDC,
but it's not built into the language in the sense of the compiler
*provides* the codegen for it.  Though it is aware of what it is and
what it does, so can perform relevant optimisations around the use of
it.


Regards,
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

Re: Something needs to happen with shared, and soon.

2012-11-14 Thread David Nadlinger

On Wednesday, 14 November 2012 at 17:31:07 UTC, David Nadlinger 
wrote:
Thus, »we«, meaning on a language level, don't need to change 
anything about the current situations, […]


Let my clarify that: We don't necessarily need to tuck on any 
extra semantics to the language other than what we currently 
have. However, what we must indeed do is clarifying/specifying 
the implicit consensus on which the current implementations are 
built. We really need a »The D Memory Model«-style document.


David

Re: Something needs to happen with shared, and soon.


On 11/14/12 9:31 AM, David Nadlinger wrote:

On Wednesday, 14 November 2012 at 15:08:35 UTC, Andrei Alexandrescu wrote:

Sorry, I was imprecise. We need to (a) define intrinsics for loading
and storing data with high-level semantics (a short list: acquire,
release, acquire+release, and sequentially-consistent) and THEN (b)
implement the needed code generation appropriately for each
architecture. Indeed on x86 there is little need to insert fence
instructions, BUT there is a definite need for the compiler to prevent
certain reorderings. That's why implementing shared data operations
(whether implicit or explicit) as sheer library code is NOT possible.


Sorry, I didn't see this message of yours before replying (the perils of
threaded news readers…).

You are right about the fact that we need some degree of compiler
support for atomic instructions. My point was that is it already
available, otherwise it would have been impossible to implement
core.atomic.{atomicLoad, atomicStore} (for DMD inline asm is used, which
prohibits compiler code motion).


Yah, the whole point here is that we need something IN THE LANGUAGE 
DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION.


THIS IS VERY IMPORTANT.


Thus, »we«, meaning on a language level, don't need to change anything
about the current situations, with the possible exception of adding
finer-grained control to core.atomic.MemoryOrder/mysnc [1]. It is the
duty of the compiler writers to provide the appropriate means to
implement druntime on their code generation infrastructure – and indeed,
the situation in DMD could be improved, using inline asm is hitting a
fly with a sledgehammer.


That is correct. My point is that compiler implementers would follow 
some specification. That specification would contain informationt hat 
atomicLoad and atomicStore must have special properties that put them 
apart from any other functions.



David


[1] I am not sure where the point of diminishing returns is here,
although it might make sense to provide the same options as C++11. If I
remember correctly, D1/Tango supported a lot more levels of
synchronization.


We could start with sequential consistency and then explore 
riskier/looser policies.



Andrei

Re: Something needs to happen with shared, and soon.


On 11/14/12 9:15 AM, David Nadlinger wrote:

On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:

On 11/14/12 1:20 AM, Walter Bright wrote:

On 11/13/2012 11:37 PM, Jacob Carlborg wrote:

If the compiler should/does not add memory barriers, then is there a
reason for
having it built into the language? Can a library solution be enough?


Memory barriers can certainly be added using library functions.


The compiler must understand the semantics of barriers such as e.g. it
doesn't hoist code above an acquire barrier or below a release barrier.


Again, this is true, but it would be a fallacy to conclude that
compiler-inserted memory barriers for »shared« are required due to this
(and it is »shared« we are discussing here!).

Simply having compiler intrinsics for atomic loads/stores is enough,
which is hardly »built into the language«.


Compiler intrinsics == built into the language.

Andrei

Re: Something needs to happen with shared, and soon.


On 11/14/12 8:59 AM, David Nadlinger wrote:

On Wednesday, 14 November 2012 at 14:32:34 UTC, Andrei Alexandrescu wrote:

On 11/14/12 4:23 AM, David Nadlinger wrote:

On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:

That is what java's volatile do. It have several uses cases, including
valid double check locking (It has to be noted that this idiom is used
incorrectly in druntime ATM, which proves both its usefullness and
that it require language support) and disruptor which I wanted to
implement for message passing in D but couldn't because of lack of
support at the time.


What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't
know whether there might be a weird spec loophole which could
theoretically lead to them being undefined behavior, but I'm sure that
they are guaranteed to produce the right code on all relevant compilers.
You can even specify the memory order semantics if you know what you are
doing (although this used to trigger a template resolution bug in the
frontend, no idea if it works now).

David


This is a simplification of what should be going on. The
core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so
the compiler generate sequentially consistent code with them (i.e. not
perform certain reorderings). Then there are loads and stores with
weaker consistency semantics (acquire, release, acquire/release, and
consume).


Sorry, I don't quite see where I simplified things.


First, there are more kinds of atomic loads and stores. Then, the fact 
that the calls are not supposed to be reordered must be a guarantee of 
the language, not a speculation about an implementation. We can't argue 
that a feature works just because it so happens an implementation works 
a specific way.



Yes, in the
implementation of atomicLoad/atomicStore, one would probably use
compiler intrinsics, as done in LDC's druntime, or inline assembly, as
done for DMD.

But an optimizer will never move instructions across opaque function
calls, because they could have arbitrary side effects.


Nowhere in the language definition is explained what an opaque function 
call is and what optimizations can and cannot be done in the presence of 
such.



So, either we are
fine by definition,


s/definition/happenstance/


or if the compiler inlines the
atomicLoad/atomicStore calls (which is actually possible in LDC), then
its optimizer will detect the presence of inline assembly resp. the
load/store intrinsics, and take care of not reordering the instructions
in an invalid way.

I don't see how this makes my answer to deadalnix (that »volatile« is
not necessary to implement sequentially consistent loads/stores) any
less valid.


Using load/store everywhere would make volatile unneeded (and for us, 
shared). But the advantage there is that you qualify the type/value once 
and then you don't need to remember to only use specific primitives to 
manipulate it.



Andrei

Re: Something needs to happen with shared, and soon.