is core Mutex lock "fast"?

2021-01-26 Thread ludo via Digitalmars-d-learn

Hi guys,

still working on old D1 code, to be updated to D2. At some point 
the previous dev wrote a FastLock class. The top comment is from 
the dev himself, not me. My question is after the code.


---

class FastLock
{
protected Mutex mutex;
protected int lockCount;
protected Thread owner;

///
this()
{   mutex = new Mutex();
}

/**
	 * This works the same as Tango's Mutex's lock()/unlock except 
provides extra performance in the special case where
	 * a thread calls lock()/unlock() multiple times while it 
already has ownership from a previous call to lock().

 * This is a common case in Yage.
 *
	 * For convenience, lock() and unlock() calls may be nested.  
Subsequent lock() calls will still maintain the lock,
	 * but unlocking will only occur after unlock() has been called 
an equal number of times.

 *
	 * On Windows, Tango's lock() is always faster than D's 
synchronized statement.  */

void lock()
{   auto self = Thread.getThis();
if (self !is owner)
{   mutex.lock();
owner = self;
}
lockCount++;
}
void unlock() /// ditto
{   assert(Thread.getThis() is owner);
lockCount--;
if (!lockCount)
{   owner = null;
mutex.unlock();
}
}
}

---

Now if I look at the doc , in particular Class 
core.sync.mutex.Mutex, I see:

---
 lock () 	If this lock is not already held by the caller, the 
lock is acquired, then the internal counter is incremented by one.

--
Which looks exactly like the behavior of "fastLock". Is it so 
that the old Tango's mutex lock was not keeping count and would 
lock the same object several time? Do we agree that the FastLock 
class is obsolete considering current D core?


cheers


Re: is core Mutex lock "fast"?

2021-01-26 Thread Steven Schveighoffer via Digitalmars-d-learn

On 1/26/21 1:07 PM, ludo wrote:

Hi guys,

still working on old D1 code, to be updated to D2. At some point the 
previous dev wrote a FastLock class. The top comment is from the dev 
himself, not me. My question is after the code.




[snip]

Is it so that the 
old Tango's mutex lock was not keeping count and would lock the same 
object several time? Do we agree that the FastLock class is obsolete 
considering current D core?


Just want to point out that druntime was based off of Tango's runtime. 
So I would expect D2's mutex to have at least as much performance as 
Tango's mutex.


If this is D1 code, it was before this happened (D1 phobos did not share 
a runtime with Tango).


-Steve


Re: is core Mutex lock "fast"?

2021-01-26 Thread IGotD- via Digitalmars-d-learn

On Tuesday, 26 January 2021 at 18:07:06 UTC, ludo wrote:

Hi guys,

still working on old D1 code, to be updated to D2. At some 
point the previous dev wrote a FastLock class. The top comment 
is from the dev himself, not me. My question is after the code.


---

class FastLock
{
protected Mutex mutex;
protected int lockCount;
protected Thread owner;

///
this()
{   mutex = new Mutex();
}

/**
	 * This works the same as Tango's Mutex's lock()/unlock except 
provides extra performance in the special case where
	 * a thread calls lock()/unlock() multiple times while it 
already has ownership from a previous call to lock().

 * This is a common case in Yage.
 *
	 * For convenience, lock() and unlock() calls may be nested.  
Subsequent lock() calls will still maintain the lock,
	 * but unlocking will only occur after unlock() has been 
called an equal number of times.

 *
	 * On Windows, Tango's lock() is always faster than D's 
synchronized statement.  */

void lock()
{   auto self = Thread.getThis();
if (self !is owner)
{   mutex.lock();
owner = self;
}
lockCount++;
}
void unlock() /// ditto
{   assert(Thread.getThis() is owner);
lockCount--;
if (!lockCount)
{   owner = null;
mutex.unlock();
}
}
}

---

Now if I look at the doc , in particular Class 
core.sync.mutex.Mutex, I see:

---
 lock () 	If this lock is not already held by the caller, the 
lock is acquired, then the internal counter is incremented by 
one.

--
Which looks exactly like the behavior of "fastLock". Is it so 
that the old Tango's mutex lock was not keeping count and would 
lock the same object several time? Do we agree that the 
FastLock class is obsolete considering current D core?


cheers


That code isn't thread safe at all (assuming FastLock is used 
from several threads). lockCount isn't atomic which means the 
code will not work with several threads. Also the assignment of 
the variable owner isn't thread safe. As soon you start to 
include more that one supposedly atomic assignment in 
synchronization primitives, things quickly get out of hand.


Normal D Mutex uses pthread_mutex on Linux and the usual 
CriticalSection stuff on Windows. Neither is particularly fast. 
Futex on Linux isn't exactly super fast either. Synchronization 
primitives aren't exactly fast on any system because that's how 
it is. There are ways to makes things faster but adding stuff 
like timeouts and the complexity goes exponential. There so many 
pitfalls with synchronization primitives that it is hardly worth 
it making your own.


Re: is core Mutex lock "fast"?

2021-01-26 Thread Steven Schveighoffer via Digitalmars-d-learn

On 1/26/21 3:56 PM, IGotD- wrote:

That code isn't thread safe at all (assuming FastLock is used from 
several threads). lockCount isn't atomic which means the code will not 
work with several threads.> Also the assignment of the variable owner 
isn't thread safe. As soon you start to include more that one supposedly 
atomic assignment in synchronization primitives, things quickly get out 
of hand.


The only item that is read without being locked is owner. If you change 
that to an atomic read and write, it should be fine (and is likely fine 
on x86* without atomics anyway).


All the other data is protected by the actual mutex, and so should be 
synchronous.


However, I think this is all moot, druntime is the same as Tango.

-Steve


Re: is core Mutex lock "fast"?

2021-01-26 Thread ludo via Digitalmars-d-learn
However, I think this is all moot, druntime is the same as 
Tango.


Moot you mean debatable? Or irrelevant :) Thanks to your 
explanations, I understand now that the dev tried to imitate a 
Tango feature with very old D1 code. This is 2005/2009 code


And as pointed out by IGotD-, better not to mess around with 
synch / reinvent the wheel. I will trash this class. Still 
learning a lot, thank you guys.


Re: is core Mutex lock "fast"?

2021-01-26 Thread IGotD- via Digitalmars-d-learn
On Tuesday, 26 January 2021 at 21:09:34 UTC, Steven Schveighoffer 
wrote:


The only item that is read without being locked is owner. If 
you change that to an atomic read and write, it should be fine 
(and is likely fine on x86* without atomics anyway).


All the other data is protected by the actual mutex, and so 
should be synchronous.


However, I think this is all moot, druntime is the same as 
Tango.


-Steve


Yes, I didn't see the lock would block subsequent threads.

Both pthread_mutex_lock and EnterCriticalSection do exactly the 
same as FastLock and the only difference is the the check is 
"closer" to the running code. Performance increase should be low.


As I was wrong about the thread safety, I will not write here any 
further.


Re: is core Mutex lock "fast"?

2021-01-26 Thread Steven Schveighoffer via Digitalmars-d-learn

On 1/26/21 4:40 PM, ludo wrote:

However, I think this is all moot, druntime is the same as Tango.


Moot you mean debatable? Or irrelevant :)


I *think* it's irrelevant. The comment makes it sound like it's slightly 
different than Tango, but for sure reentrant locks are possible with D2 
phobos. I am *100% sure* that druntime is from Tango, because I was 
there when it was a controversy and this was the solution.


Thanks to your explanations, I 
understand now that the dev tried to imitate a Tango feature with very 
old D1 code. This is 2005/2009 code


And as pointed out by IGotD-, better not to mess around with synch / 
reinvent the wheel. I will trash this class. Still learning a lot, thank 
you guys.


Yes, I agree. I'm no expert on thread atomics, but I know enough to know 
I shouldn't mess with the tried-and-true primitives that are generally used.


Consider that very very smart people have tried to write "lock free" 
stuff and commonly fail (and some dumb people, including me). And when 
you fail here, it's not a failure you see immediately, because it may 
happen once in a blue moon.


Performance is irrelevant if it's not correct.

-Steve