Hello, 

So in this post  i will explain to you what is exactly my invention that is 
my scalable MLock algorithm.. 

If you look the Enter() method of my scalable MLock algorithm you will read 
this(that's easy to undertand the code in Object Pascal)  , and read my 
explanation after: 

You can download the source code here: 

https://sites.google.com/site/aminer68/scalable-mlock 


== 
procedure TMLock.Enter; 
var prev,fcount1:PListEntry; 
k:integer; 
Buffer1:pointer; 

begin 
Buffer1 := AllocMem(SizeOf(TListEntry) + Alignment); 
FCount1 := PListEntry((int(Buffer1) + Alignment - 1) 
and not (Alignment - 1)); 

fcount1^.Data:=0; 
fcount1^.Next:=nil; 
fcount1^.mem:=buffer1; 

long(prev) := LockedExchange(long(m_head), long(fcount1)); 

prev.next := fcount1; 

if (flag=1) 
then 
begin 
if (m_tail=prev) 
then 
if CAS(flag,1,0) 
then 
begin 
fcount1^.Data:=-1; 
exit; 
end; 
end; 
k:=1; 
repeat 

if fcount1^.data=1 
then 
begin 
freemem(fcount1^.mem); 
break; 
end 
else if fcount1^.data=2 
then 
begin 
fcount1^.Data:=-1; 
break 
end; 
if (flag=1) 
then 
begin 
if (m_tail.next.next=fcount1) 
then 
if CAS(flag,1,0) 
then 
begin 
fcount1^.Data:=-1; 
break; 
end; 
end; 
inc(k); 
if (k mod 140)=0 
then 
{$IFDEF FPC} 
ThreadSwitch; 
{$ENDIF} 
{$IFDEF Delphi} 
sleep(0); 
{$ENDIF} 
asm pause end; 
until false; 

end; 
=== 



First i am allocating a Buffer1 struct and using fcount1 that is a pointer 
to a  64 bytes aligned record and i am assigning zero to fcount1^.Data 
cause the threads that enter will have to wait for there turn and for 
fcount1^.Data to become 1 to enter the scalable Lock except for the first 
thread that enters the lock, the first thread that enter the lock will find 
the "flag" variable equal to 1 and notice with me that the CAS(flag,1,0) 
will succeed for the first thread that enters the Enter() method... 

Look that i am using this inside the source code: 

long(prev) := LockedExchange(long(m_head), long(fcount1)); 

prev.next := fcount1; 

this is a waitfree queue and the push() and the pop() inside my scalable 
MLock algorithm are waitfree... 

So notice with me that the threads on the Enter() method will enter a 
repeat-until loop waiting for there turn to enter, and notice 
that the CAS will be touched rarely , i have benchmarked and collected some 
empiric statistics and i have noticed that the CAS inside the repeat-unil 
loop inside the Enter() method will be touched only 1% of the time, so the 
threads will wait 99% of the time for fcount1^.data to become 1 or 
fcount1^.data to become 2 , and notice that fcount1^.data inside the 
Enter() method is not generating cache-lines transfers for each thread 
waiting in the Enter() method, cause fcount1^.data is only touched by its 
correponding threads on its local cache, this is why my MLock algorithm is 
scalable, cause it minimizes efficiently the cache-coherence traffic. 

For the Leave() method it is waitfree and it is easy to prove that, just 
look at the source code and you will notice that even though there is a 
repeat-until loop inside the Leave() method , my algorithm will not loop 
more than 2 times inside the Leave() method, so it makes my algorithm 
waitfree and my algorithm can be used on parallel and realtime safety 
critical systems. 

And also you will notice easily that my scalable MLock is FIFO fair , so 
it's starvation-free. 

This is a node based Lock that is scalable, FIFO fair and starvation-free. 

- Discovered by Amine Moulay Ramdane 

- This lock is scalable 

- It has the same space requirement as the scalable MCS lock 

- Doesn't require a local "queue node" to be passed in as a parameter as is 
doing the MCS and CLH locks. 

- Spins only on local locations on a cache-coherent machine 

- And it's fast. 



Please read also this: 

A bigger problem with the MCS lock is its API. It requires a second 
structure to be passed in addition to the address of the lock. The 
algorithm uses this second structure to store the information which 
describes the queue of threads waiting for the lock. Unfortunately, most 
code written using spinlocks doesn't have this extra information, so the 
fact that the MCS algorithm isn't a drop-in replacement to a standard spin 
lock is a problem. 

An IBM working group found a way to improve the MCS algorithm to remove the 
need to pass the extra structure as a parameter. Instead, on-stack 
information was used instead. The result is the K42 lock algorithm: 

Unfortunately, the K42 algorithm has another problem. It appears that it 
may be patented by IBM. Thus it cannot be used either. (Without perhaps 
paying royalties to IBM.) 

So you have to know that my scalable MLock doesn't require a local "queue 
node" to be passed in as a parameter  as is doing the MCS and CLH locks, my 
scalable MLock doesn't require any parameter to be passed, just call the 
Enter() and Leave() method and that's all. 


You can download my scalable node based Lock that is FIFO fair and waitfree 
from: 


https://sites.google.com/site/aminer68/scalable-mlock 



Thank you, 
Amine Moulay Ramdane. 


-- 

--- 
You received this message because you are subscribed to the Google Groups 
"Scalable Synchronization Algorithms" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/lock-free/0fe20a06-baf4-46a7-8c21-edbd80d142d8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to