On 07/01/2011 03:03 PM, Vincent Snijders wrote:

So how do you expect us to find the description *you* want us to read in
all those mails, if even you cannot find it?
I can't find it on the backlog website. I did find it in my mail store (no idea if this helps, though):

This is Andrew's message of 06/28/2011 02:47 PM:
On Tue, Jun 28, 2011 at 7:40 AM, Michael Schnell<mschn...@lumino.de>  wrote:
>  All references I read today say that pthread_mutex (on which supposedly
>  TCriticalSection is based) and the appropriate Windows stuff does contain an
>  MB. But there might be issues with other OSes and Archs.
>
Yes, any object that requires atomic features will employ a memory barrier.
That is to say, the MB is employed so that the spin count is
accurate across all cores.

>  If they would not do so, the complete plain old threaded application
>  paradigm would be invalid, and tons of applications would need to be
>  trashed.
>  -Michael
You're probably right here.  My engine had worked fine on a triple-core AMD.
  It wasn't until I upgraded to the 6-core system that I had to start
looking into what was causing random problems with pointer
assignments.


------------------------------------------------------------------------

This is Andrew's initial message of 06/23/2011 02:02 PM:
Partially.  Read barriers may not be necessary.  It depends on the
implementation.  By having barriers placed at strategic locations in
the code, you can make an implementation thread safe.

Add, Delete, and Clear would be some.  List iterations would be
difficult if not unsafe outside a lock, but if access goes through a manager
thread you can schedule an operation and it will wait until
conditions are safe to perform it.

Getting an item from the list and setting an inUse boolean to true for
that item, rather than locking the entire list, would allow any
deletion to be deferred until another time (by a manager thread).

With multi-core systems, simply adding a mutex/lock is not enough.  A
manager needs to be in place to safely provision, delegate, and
use items.  IMO, this forces efficiency in a system design.  The basic
logic for a manager thread is that the manager thread accepts
commands, and scheduled execution happens automatically.

Further, you could create a command system for each list so when you
add/remove an item from the list you are merely scheduling that item
for deletion. I would avoid polling for large lists.  The command
should have a completion routine like Item.onComplete(Item) where the
item is passed for use in the application.

This way there would be absolutely no waiting for data in code, and the
system in general stays at rest until needed.

Here he goes on (06/27/2011 04:58 PM):
You're totally underestimating the need for a memory barrier:

"Multithreaded programming and memory visibility

See also: Memory model (computing)
Multithreaded programs usually use synchronization primitives provided
by a high-level programming environment, such as Java and .NET
Framework, or an application programming interface (API) such as POSIX
Threads or Windows API. Primitives such as mutexes and semaphores are
provided to synchronize access to resources from parallel threads of
execution. These primitives are usually implemented with the memory
barriers required to provide the expected memory visibility semantics.
In such environments explicit use of memory barriers is not generally
necessary. "

Focus on: "These primitives are usually implemented with the memory
barriers required to provide the expected memory visibility semantics."

These primitives include TCriticalSection.  It's not enough to protect a
variable if someone from another thread is reading it at will.  And
it would be safe to assume that CPU management systems can switch code
execution to any core at any time.
(06/27/2011 08:03 AM):
On Mon, Jun 27, 2011 at 12:52 PM, Hans-Peter Diettrich
<drdiettri...@aol.com>  wrote:
>  You forget the strength of protection. A MB will disallow immediately any
>  concurrent access to the memory area - no way around. Protection by other
>  means only can work in *perfect* cooperation, a very weak model.
Absolutely incorrect.

entercriticalsection();
loop
   a:=b+c;
end loop;
leavecriticalsection();

thread 2

can read a, b, and c in any state.  If you want an accurate view of
a, b, c you need to employ interlocked statements :-)
(06/28/2011 01:41 AM):
2011/6/27 Malcom Haak<insane...@gmail.com>:
>  Tell me then why any of what you have said is relevant. In fact in cases
>  like this the use of CriticalSections would be sensible and would not cause
>  'tons of wait' as you have all your worker threads off doing things 99% of
>  the time.
Thread 1:
a=b+c
a2=a+c2
SignalEvent(E1)

Thread 2:
   repeat
     WaitForEvent(E1, 120);
     { we can read anything now }
   until terminated or complete

This is the prime example.  On a 6-core system, a can look like one value
to one core and a different value to another core.  It's that simple.  No
getting around this issue.

While a spinlock can block entrance, it cannot guarantee memory
order / code execution order.  Therefore it is good practice to adopt
interlocked assignments to guarantee memory is what it is.


Core X computes a=b+c
Core X+Y computes a2

This is relatively new theory, which requires low-level CPU code to
perform such locks.  It was never needed until the introduction of
multi-core systems, on which I did extensive tests on AMD via
FPC/Lazarus.
(06/28/2011 04:41 AM):
On Tue, Jun 28, 2011 at 9:47 AM, Hans-Peter Diettrich
<drdiettri...@aol.com>  wrote:

>  I don't see anything like memory barriers here.
Compare-and-swap mechanisms aren't quite like memory barriers, but they
do get the CPU to send a "fresh" copy of a variable to all cores'
caches...




_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
