(moved here from fpc-devel)
On 30 Jun 2011, at 11:31, Hans-Peter Diettrich wrote:
> Vinzent Höfler wrote:
>>> When it's up to every coder to insert explicit synchronization
>>> whenever required, how can one determine the places where explicit
>>> code is required?
>> By careful analysis. Although there may exist tools which detect
>> potentially un-synchronised accesses to shared variables, there will
>> be no tool that inserts synchronisation code automatically for you.
> I wouldn't like such tools, except the compiler itself :-(
Data races can only be detected without false positives/negatives via
dynamic analysis, by intercepting all memory accesses and all
synchronization operations.
This will report all potential data races on the execution path of
that particular run (i.e., not only the cases where some value
actually did get lost due to timing during that particular run, but
all of the ones that could have happened during that run), but on the
other hand it's also *only* for that particular run (so potential data
races on different execution paths will not be caught).
You can read about the principle here: http://escher.elis.ugent.be/publ/Edocs/DOC/P104_116.pdf
(and it contains further references to other papers that go into
even more detail).
If a program contained nothing but FPC-compiled code, it would in
theory be possible to modify the RTL and the compiler to insert all of
the required analysis code, but I would not consider that to be worth
the effort. The reason is that it would only work for programs that
contain no assembler code and perform no system calls (system calls
can also read/write memory).
> Consider the shareable doubly-linked list, where insertion requires
> code like this:
>
>   list.Lock;          // prevent concurrent access
>   ...                 // determine affected list elements
>   new.prev := prev;   // prev must be guaranteed to be valid
>   new.next := next;
>   prev.next := new;
>   next.prev := new;
>   list.Unlock;
>
> What can we expect from the Lock method/instruction - what kind of
> synchronization (memory barrier) can, will or should it provide?
Use a critical section as provided by the FPC RTL, and all necessary
synchronization will be performed (including the required memory
barriers). Manual memory barriers are only required if you use
lock-free multithreading (which I would not recommend unless you are
already an absolute expert in multi-threaded programming) or when
writing your own synchronization primitives.
And to repeat what I mentioned before: the program state cannot become
unsynchronized if a thread switches from one core to another in the
middle of a critical section. If you (I don't mean you personally
here) think that you are observing something like that, you have
another bug. Using the standard synchronization primitives is all you
need to write correct multi-threaded programs.
> My understanding is that a *full* cache synchronization would slow
> down not only the current core and its cache, but all other caches
> as well? If so, would it help to enclose the above instructions in
> e.g.
>
>   Synchronized begin
>     update the links...
>   end;
>
> so that the compiler makes all memory references (at least reads)
> read/write-through inside such a code block?
Even if that were possible, disabling caching would in no way solve
any kind of data race. It would slow down programs without any gain
whatsoever as far as thread safety is concerned.
As Nikolai mentioned, you need a critical section (or a lock-free
equivalent specific to an individual case, but that cannot be
automatically generated -- and even the first published papers with
manually constructed lock-free algorithms were later found to contain
errors).
> After these considerations I'd understand that using Interlocked
> instructions in the code would ensure such read/write-through, but
> merely as a side effect - they also lock the bus for every
> instruction,
They only did so on ancient x86 processors; modern x86 CPUs lock just
the affected cache line rather than the whole bus (unless the operand
straddles a cache-line boundary). On most other architectures they
never locked the bus in the first place.
> We need documentation of the FPC-specific means of cache
> synchronization, with their guaranteed effects on every target[1].
Such documentation is not compiler-specific, but architecture-
specific. The FPC documentation is not a tutorial on computer
architecture, nor a tutorial on multi-threaded programming.
FPC exports routines for all existing kinds of memory barriers
(ReadBarrier, ReadDependencyBarrier, ReadWriteBarrier and
WriteBarrier), as well as for various synchronization primitives
(criticalsection/mutex, event/conditional signal, ...). That is the
job of the compiler/RTL. Language extensions are of course always
possible, but that is unrelated to "documentation of the FPC specific
means of cache synchronization, with their guaranteed effects on every
target" (language constructs by definition would have the same effect
everywhere).
Tutorials and architecture documentation are a separate department.
> Furthermore we need concrete examples[2] of how (and to what extent)
> these special instructions/procedures must be used, in cases like the
> one above.
Yes, many people could use tutorials about computer architecture,
operating system principles and multi-threading. But again, those are
completely unrelated to the development of FPC itself.
A good starter would probably be Module 4 of
http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-823-computer-system-architecture-fall-2005/lecture-notes/
Jonas
_______________________________________________
fpc-other maillist - fpc-other@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-other