[zfs-code] R/W lock portability issue

2007-11-24 Thread Ricardo M. Correia
Hi,

I am having a problem porting Solaris R/W lock functionality to Linux. 
The problem is that RW_LOCK_HELD() doesn't have an equivalent function 
in pthreads.

This was not a problem before because RW_LOCK_HELD() was only used in 
ASSERT statements. In this case, I was able to make it work by making it 
behave a bit differently (specifically, it would return true if *any* 
thread was holding the lock, and false if there were no threads holding 
the lock. This way ASSERTs never failed).

However, now there is a small bit of code that actually relies on 
RW_LOCK_HELD() working correctly:

boolean_t need_lock = !RW_LOCK_HELD(&dp->dp_config_rwlock);

if (need_lock)
rw_enter(&dp->dp_config_rwlock, RW_READER);

I can't think of a way to emulate this functionality with pthreads 
without incurring a huge overhead when locking...

Any ideas?

Thanks,
Ricardo

-- 


*Ricardo Manuel Correia*

Lustre Engineering
*Sun Microsystems, Inc.*
Portugal

-- next part --
An HTML attachment was scrubbed...
URL: 

-- next part --
A non-text attachment was scrubbed...
Name: 6g_top.gif
Type: image/gif
Size: 1257 bytes
Desc: not available
URL: 



[zfs-code] R/W lock portability issue

2007-11-24 Thread Ricardo M. Correia
Ricardo M. Correia wrote:
> boolean_t need_lock = !RW_LOCK_HELD(&dp->dp_config_rwlock);
>
> if (need_lock)
> rw_enter(&dp->dp_config_rwlock, RW_READER);
>

Maybe I posted to soon.
Am I right that this specific code (in dsl_dataset.c) would work 
correctly if RW_LOCK_HELD() returned true if *another* thread is holding 
the lock as a reader?

Thanks in advance,
Ricardo

-- 


*Ricardo Manuel Correia*

Lustre Engineering
*Sun Microsystems, Inc.*
Portugal

-- next part --
An HTML attachment was scrubbed...
URL: 

-- next part --
A non-text attachment was scrubbed...
Name: 6g_top.gif
Type: image/gif
Size: 1257 bytes
Desc: not available
URL: 



[zfs-code] R/W lock portability issue

2007-11-24 Thread Neil Perrin


Ricardo M. Correia wrote:
> Ricardo M. Correia wrote:
>> boolean_t need_lock = !RW_LOCK_HELD(&dp->dp_config_rwlock);
>>
>> if (need_lock)
>> rw_enter(&dp->dp_config_rwlock, RW_READER);
>>
> 
> Maybe I posted to soon.
> Am I right that this specific code (in dsl_dataset.c) would work 
> correctly if RW_LOCK_HELD() returned true if *another* thread is holding 
> the lock as a reader?

Yes this looks like a bug to me as well.




[zfs-code] R/W lock portability issue

2007-11-25 Thread Frank Batschulat (Home)
Ricardo M. Correia wrote:
> Ricardo M. Correia wrote:
>> boolean_t need_lock = !RW_LOCK_HELD(&dp->dp_config_rwlock);
>>
>> if (need_lock)
>> rw_enter(&dp->dp_config_rwlock, RW_READER);
>>
> 
> Maybe I posted to soon.
> Am I right that this specific code (in dsl_dataset.c) would work 
> correctly if RW_LOCK_HELD() returned true if *another* thread is holding 
> the lock as a reader?

RW_LOCK_HELD() returns true if any thread is holding the lock no matter 
if as reader or writer.

---
frankB



[zfs-code] R/W lock portability issue

2007-11-26 Thread Matthew Ahrens
Neil Perrin wrote:
> 
> Ricardo M. Correia wrote:
>> Ricardo M. Correia wrote:
>>> boolean_t need_lock = !RW_LOCK_HELD(&dp->dp_config_rwlock);
>>>
>>> if (need_lock)
>>> rw_enter(&dp->dp_config_rwlock, RW_READER);
>>>
>> Maybe I posted to soon.
>> Am I right that this specific code (in dsl_dataset.c) would work 
>> correctly if RW_LOCK_HELD() returned true if *another* thread is holding 
>> the lock as a reader?
> 
> Yes this looks like a bug to me as well.

This is not a bug, it is the designed behavior.  To tell if this specific 
thread is holding the lock for reader would require a performance hit (eg, see 
rrwlock.c behavior when rr_writer_wanted is set).

--matt



[zfs-code] R/W lock portability issue

2007-11-26 Thread Neil Perrin
Matthew Ahrens wrote:
> Neil Perrin wrote:
>>
>> Ricardo M. Correia wrote:
>>> Ricardo M. Correia wrote:
 boolean_t need_lock = !RW_LOCK_HELD(&dp->dp_config_rwlock);

 if (need_lock)
 rw_enter(&dp->dp_config_rwlock, RW_READER);

>>> Maybe I posted to soon.
>>> Am I right that this specific code (in dsl_dataset.c) would work 
>>> correctly if RW_LOCK_HELD() returned true if *another* thread is 
>>> holding the lock as a reader?
>>
>> Yes this looks like a bug to me as well.
> 
> This is not a bug, it is the designed behavior.  To tell if this 
> specific thread is holding the lock for reader would require a 
> performance hit (eg, see rrwlock.c behavior when rr_writer_wanted is set).
> 
> --matt

- I'm confused then. The dp_config_rwlock lock must be held as it's
asserted in dsl_prop_get_ds_locked() However, the lock can be
dropped immediately after the test dsl_dataset_open_obj() as it only
checks if anyone is holding it - not just the caller. Am I missing something?

Neil.




[zfs-code] R/W lock portability issue

2007-11-26 Thread Matthew Ahrens
Neil Perrin wrote:
> Matthew Ahrens wrote:
>> Neil Perrin wrote:
>>>
>>> Ricardo M. Correia wrote:
 Ricardo M. Correia wrote:
> boolean_t need_lock = !RW_LOCK_HELD(&dp->dp_config_rwlock);
>
> if (need_lock)
> rw_enter(&dp->dp_config_rwlock, RW_READER);
>
 Maybe I posted to soon.
 Am I right that this specific code (in dsl_dataset.c) would work 
 correctly if RW_LOCK_HELD() returned true if *another* thread is 
 holding the lock as a reader?
>>>
>>> Yes this looks like a bug to me as well.
>>
>> This is not a bug, it is the designed behavior.  To tell if this 
>> specific thread is holding the lock for reader would require a 
>> performance hit (eg, see rrwlock.c behavior when rr_writer_wanted is 
>> set).
>>
>> --matt
> 
> - I'm confused then. The dp_config_rwlock lock must be held as it's
> asserted in dsl_prop_get_ds_locked() However, the lock can be
> dropped immediately after the test dsl_dataset_open_obj() as it only
> checks if anyone is holding it - not just the caller. Am I missing 
> something?

So, we use RW_LOCK_HELD() to mean, "might this thread hold the lock?"  and it 
is generally only used in assertions.  Eg, some routine should only be called 
with the lock held, so we "ASSERT(RW_LOCK_HELD(lock))".  The fact that 
sometimes the implementation of RW_LOCK_HELD() returns TRUE when this thread 
does not in fact hold the lock is fine.

In this case, we are using it a bit differently.  The current thread must 
*not* hold the lock for reader.  We use RW_LOCK_HELD() to determine 
definitively if this thread holds the lock for writer or not.  The behavior we 
want is:  if this thread already holds the lock for writer, we don't need to 
re-grab it.  Otherwise, we grab the lock for READER.

However, now that I go look at the implementation of RW_LOCK_HELD(), it 
doesn't do what this code expects; it should be using RW_WRITE_HELD().

--matt



[zfs-code] R/W lock portability issue

2007-11-26 Thread Chris Kirby

> However, now that I go look at the implementation of RW_LOCK_HELD(), it 
> doesn't do what this code expects; it should be using RW_WRITE_HELD().

I'm about to file a CR for this (and fix it).

-Chris



[zfs-code] R/W lock portability issue

2007-11-27 Thread Chris Kirby
Matthew Ahrens wrote:
>
> So, we use RW_LOCK_HELD() to mean, "might this thread hold the lock?"  and it 
> is generally only used in assertions.  Eg, some routine should only be called 
> with the lock held, so we "ASSERT(RW_LOCK_HELD(lock))".  The fact that 
> sometimes the implementation of RW_LOCK_HELD() returns TRUE when this thread 
> does not in fact hold the lock is fine.
> 
> In this case, we are using it a bit differently.  The current thread must 
> *not* hold the lock for reader.  We use RW_LOCK_HELD() to determine 
> definitively if this thread holds the lock for writer or not.  The behavior 
> we 
> want is:  if this thread already holds the lock for writer, we don't need to 
> re-grab it.  Otherwise, we grab the lock for READER.
> 
> However, now that I go look at the implementation of RW_LOCK_HELD(), it 
> doesn't do what this code expects; it should be using RW_WRITE_HELD().

The problem in this case turns out to be a bit harder. 
dsl_dataset_open_obj can be called with dp_config_rwlock held for
read or write, or not held at all.  An obvious way to fix this would
be to have the callers pass an arg saying whether or not the lock is
held, but that's esp. ugly in this case since there are ~30 callers,
and at least some of them probably don't grab the lock directly.

-Chris



[zfs-code] R/W lock portability issue

2007-11-27 Thread Darren J Moffat
Chris Kirby wrote:
> Matthew Ahrens wrote:
>> So, we use RW_LOCK_HELD() to mean, "might this thread hold the lock?"  and 
>> it 
>> is generally only used in assertions.  Eg, some routine should only be 
>> called 
>> with the lock held, so we "ASSERT(RW_LOCK_HELD(lock))".  The fact that 
>> sometimes the implementation of RW_LOCK_HELD() returns TRUE when this thread 
>> does not in fact hold the lock is fine.
>>
>> In this case, we are using it a bit differently.  The current thread must 
>> *not* hold the lock for reader.  We use RW_LOCK_HELD() to determine 
>> definitively if this thread holds the lock for writer or not.  The behavior 
>> we 
>> want is:  if this thread already holds the lock for writer, we don't need to 
>> re-grab it.  Otherwise, we grab the lock for READER.
>>
>> However, now that I go look at the implementation of RW_LOCK_HELD(), it 
>> doesn't do what this code expects; it should be using RW_WRITE_HELD().
> 
> The problem in this case turns out to be a bit harder. 
> dsl_dataset_open_obj can be called with dp_config_rwlock held for
> read or write, or not held at all.  An obvious way to fix this would
> be to have the callers pass an arg saying whether or not the lock is
> held, but that's esp. ugly in this case since there are ~30 callers,
> and at least some of them probably don't grab the lock directly.

Interesting thread, I know I've seen problems in this area in my ZFS 
crypto work for exactly this lock.

-- 
Darren J Moffat



[zfs-code] R/W lock portability issue

2007-11-27 Thread Mark Maybee
Chris Kirby wrote:
> Matthew Ahrens wrote:
>> So, we use RW_LOCK_HELD() to mean, "might this thread hold the lock?"  and 
>> it 
>> is generally only used in assertions.  Eg, some routine should only be 
>> called 
>> with the lock held, so we "ASSERT(RW_LOCK_HELD(lock))".  The fact that 
>> sometimes the implementation of RW_LOCK_HELD() returns TRUE when this thread 
>> does not in fact hold the lock is fine.
>>
>> In this case, we are using it a bit differently.  The current thread must 
>> *not* hold the lock for reader.  We use RW_LOCK_HELD() to determine 
>> definitively if this thread holds the lock for writer or not.  The behavior 
>> we 
>> want is:  if this thread already holds the lock for writer, we don't need to 
>> re-grab it.  Otherwise, we grab the lock for READER.
>>
>> However, now that I go look at the implementation of RW_LOCK_HELD(), it 
>> doesn't do what this code expects; it should be using RW_WRITE_HELD().
> 
> The problem in this case turns out to be a bit harder. 
> dsl_dataset_open_obj can be called with dp_config_rwlock held for
> read or write, or not held at all.  An obvious way to fix this would
> be to have the callers pass an arg saying whether or not the lock is
> held, but that's esp. ugly in this case since there are ~30 callers,
> and at least some of them probably don't grab the lock directly.
> 
Its actually OK if we re-grab a read lock when we already hold it.  This
will just cause the readers count to increment in the lock, and we will
drop it twice, so the count will be correct in the end.  But we don't
want to try to obtain a read lock if we already have a write lock.  So
it works to just use 'RW_WRITE_HELD()' here.

-Mark



[zfs-code] R/W lock portability issue

2007-11-27 Thread Chris Kirby
Mark Maybee wrote:
> Chris Kirby wrote:
> 
>> Matthew Ahrens wrote:
>>
>>> So, we use RW_LOCK_HELD() to mean, "might this thread hold the 
>>> lock?"  and it is generally only used in assertions.  Eg, some 
>>> routine should only be called with the lock held, so we 
>>> "ASSERT(RW_LOCK_HELD(lock))".  The fact that sometimes the 
>>> implementation of RW_LOCK_HELD() returns TRUE when this thread does 
>>> not in fact hold the lock is fine.
>>>
>>> In this case, we are using it a bit differently.  The current thread 
>>> must *not* hold the lock for reader.  We use RW_LOCK_HELD() to 
>>> determine definitively if this thread holds the lock for writer or 
>>> not.  The behavior we want is:  if this thread already holds the lock 
>>> for writer, we don't need to re-grab it.  Otherwise, we grab the lock 
>>> for READER.
>>>
>>> However, now that I go look at the implementation of RW_LOCK_HELD(), 
>>> it doesn't do what this code expects; it should be using 
>>> RW_WRITE_HELD().
>>
>>
>> The problem in this case turns out to be a bit harder. 
>> dsl_dataset_open_obj can be called with dp_config_rwlock held for
>> read or write, or not held at all.  An obvious way to fix this would
>> be to have the callers pass an arg saying whether or not the lock is
>> held, but that's esp. ugly in this case since there are ~30 callers,
>> and at least some of them probably don't grab the lock directly.
>>
> Its actually OK if we re-grab a read lock when we already hold it.  This
> will just cause the readers count to increment in the lock, and we will
> drop it twice, so the count will be correct in the end.  But we don't
> want to try to obtain a read lock if we already have a write lock.  So
> it works to just use 'RW_WRITE_HELD()' here.


But doesn't the rw_enter code block if there's a waiting
writer?  In other words, can't we cause a deadlock
if we already hold the read lock, there's a waiting writer,
and we try to grab the read lock again?

-Chris



[zfs-code] R/W lock portability issue

2007-11-27 Thread Ricardo M. Correia
Chris Kirby wrote:
> But doesn't the rw_enter code block if there's a waiting
> writer?  In other words, can't we cause a deadlock
> if we already hold the read lock, there's a waiting writer,
> and we try to grab the read lock again?
>   

FYI, the krwlock_t implementation in ZFS's kernel.c specifically asserts 
if a thread tries to grab an rwlock twice as a reader, which indicates 
this is undesired behaviour (although it won't detect if another thread 
also grabs the lock in between those two rw_enter() calls).

Regards,
Ricardo

-- 


*Ricardo Manuel Correia*

Lustre Engineering
*Sun Microsystems, Inc.*
Portugal

-- next part --
An HTML attachment was scrubbed...
URL: 

-- next part --
A non-text attachment was scrubbed...
Name: 6g_top.gif
Type: image/gif
Size: 1257 bytes
Desc: not available
URL: 



[zfs-code] R/W lock portability issue

2007-11-27 Thread Mark Maybee
Chris Kirby wrote:
> Mark Maybee wrote:
>> Chris Kirby wrote:
>>
>>> Matthew Ahrens wrote:
>>>
 So, we use RW_LOCK_HELD() to mean, "might this thread hold the 
 lock?"  and it is generally only used in assertions.  Eg, some 
 routine should only be called with the lock held, so we 
 "ASSERT(RW_LOCK_HELD(lock))".  The fact that sometimes the 
 implementation of RW_LOCK_HELD() returns TRUE when this thread does 
 not in fact hold the lock is fine.

 In this case, we are using it a bit differently.  The current thread 
 must *not* hold the lock for reader.  We use RW_LOCK_HELD() to 
 determine definitively if this thread holds the lock for writer or 
 not.  The behavior we want is:  if this thread already holds the 
 lock for writer, we don't need to re-grab it.  Otherwise, we grab 
 the lock for READER.

 However, now that I go look at the implementation of RW_LOCK_HELD(), 
 it doesn't do what this code expects; it should be using 
 RW_WRITE_HELD().
>>>
>>>
>>> The problem in this case turns out to be a bit harder. 
>>> dsl_dataset_open_obj can be called with dp_config_rwlock held for
>>> read or write, or not held at all.  An obvious way to fix this would
>>> be to have the callers pass an arg saying whether or not the lock is
>>> held, but that's esp. ugly in this case since there are ~30 callers,
>>> and at least some of them probably don't grab the lock directly.
>>>
>> Its actually OK if we re-grab a read lock when we already hold it.  This
>> will just cause the readers count to increment in the lock, and we will
>> drop it twice, so the count will be correct in the end.  But we don't
>> want to try to obtain a read lock if we already have a write lock.  So
>> it works to just use 'RW_WRITE_HELD()' here.
> 
> 
> But doesn't the rw_enter code block if there's a waiting
> writer?  In other words, can't we cause a deadlock
> if we already hold the read lock, there's a waiting writer,
> and we try to grab the read lock again?
> 
Ah right, this isn't the "unfair" version of the rw lock that we use in
the SPA is it?  Yup, you're right, a writer-waiting here could lead to a
deadlock.

-Mark



[zfs-code] R/W lock portability issue

2007-11-27 Thread Neil Perrin


Chris Kirby wrote:
> Mark Maybee wrote:
>> Chris Kirby wrote:
>>
>>> Matthew Ahrens wrote:
>>>
 So, we use RW_LOCK_HELD() to mean, "might this thread hold the 
 lock?"  and it is generally only used in assertions.  Eg, some 
 routine should only be called with the lock held, so we 
 "ASSERT(RW_LOCK_HELD(lock))".  The fact that sometimes the 
 implementation of RW_LOCK_HELD() returns TRUE when this thread does 
 not in fact hold the lock is fine.

 In this case, we are using it a bit differently.  The current thread 
 must *not* hold the lock for reader.  We use RW_LOCK_HELD() to 
 determine definitively if this thread holds the lock for writer or 
 not.  The behavior we want is:  if this thread already holds the lock 
 for writer, we don't need to re-grab it.  Otherwise, we grab the lock 
 for READER.

 However, now that I go look at the implementation of RW_LOCK_HELD(), 
 it doesn't do what this code expects; it should be using 
 RW_WRITE_HELD().
>>>
>>> The problem in this case turns out to be a bit harder. 
>>> dsl_dataset_open_obj can be called with dp_config_rwlock held for
>>> read or write, or not held at all.  An obvious way to fix this would
>>> be to have the callers pass an arg saying whether or not the lock is
>>> held, but that's esp. ugly in this case since there are ~30 callers,
>>> and at least some of them probably don't grab the lock directly.
>>>
>> Its actually OK if we re-grab a read lock when we already hold it.  This
>> will just cause the readers count to increment in the lock, and we will
>> drop it twice, so the count will be correct in the end.  But we don't
>> want to try to obtain a read lock if we already have a write lock.  So
>> it works to just use 'RW_WRITE_HELD()' here.
> 
> 
> But doesn't the rw_enter code block if there's a waiting
> writer?  In other words, can't we cause a deadlock
> if we already hold the read lock, there's a waiting writer,
> and we try to grab the read lock again?
> 
> -Chris

I agree. As a thread, there's no way to determine if I already hold
the read lock and regrabbing it can deadlock as you describe.
The only options I see are for the callers to grab it or pass
down an indication that they have it.



[zfs-code] R/W lock portability issue

2007-11-27 Thread Chris Kirby
Neil Perrin wrote:
> 
> Chris Kirby wrote:
>>But doesn't the rw_enter code block if there's a waiting
>>writer?  In other words, can't we cause a deadlock
>>if we already hold the read lock, there's a waiting writer,
>>and we try to grab the read lock again?
>>
>>-Chris
> 
> 
> I agree. As a thread, there's no way to determine if I already hold
> the read lock and regrabbing it can deadlock as you describe.
> The only options I see are for the callers to grab it or pass
> down an indication that they have it.

It looks like there are currently 30 callers of dsl_dataset_open_obj.

The only callers from open context that don't grab
the lock are dsl_dataset_create_root and
dsl_dataset_snapshot_rename_check.

In sync context, we either have no lock or we have the write lock.

So if we fix up dsl_dataset_create_root to grab the lock; and
change dsl_dataset_snapshot_rename_check not to do any work
unless we're syncing, I think we can say this:

need_lock = !RW_WRITE_HELD && dsl_pool_sync_context(dp);

Does that make sense?  It has to be less painful than
adding an indication that the caller already holds the
lock.

-Chris





[zfs-code] R/W lock portability issue

2007-11-27 Thread Neil Perrin


Chris Kirby wrote:
> Neil Perrin wrote:
>>
>> Chris Kirby wrote:
>>> But doesn't the rw_enter code block if there's a waiting
>>> writer?  In other words, can't we cause a deadlock
>>> if we already hold the read lock, there's a waiting writer,
>>> and we try to grab the read lock again?
>>>
>>> -Chris
>>
>>
>> I agree. As a thread, there's no way to determine if I already hold
>> the read lock and regrabbing it can deadlock as you describe.
>> The only options I see are for the callers to grab it or pass
>> down an indication that they have it.
> 
> It looks like there are currently 30 callers of dsl_dataset_open_obj.
> 
> The only callers from open context that don't grab
> the lock are dsl_dataset_create_root and
> dsl_dataset_snapshot_rename_check.
> 
> In sync context, we either have no lock or we have the write lock.
> 
> So if we fix up dsl_dataset_create_root to grab the lock; and
> change dsl_dataset_snapshot_rename_check not to do any work
> unless we're syncing, I think we can say this:
> 
> need_lock = !RW_WRITE_HELD && dsl_pool_sync_context(dp);
> 
> Does that make sense?  It has to be less painful than
> adding an indication that the caller already holds the
> lock.
> 
> -Chris

Makes sense to me. I don't know the rest of the code well
enough to know if this cane be more cleanly or elegantly fixed.

Neil.



[zfs-code] R/W lock portability issue

2007-12-05 Thread Ricardo M. Correia
Hi Robert,

Robert Milkowski wrote:
>
> >
>
>   
>
> Hi,
>
>
> I am having a problem porting Solaris R/W lock functionality to Linux. 
> The problem is that RW_LOCK_HELD() doesn't have an equivalent function 
> in pthreads.
>
>
>
>
>
> I'm just curious - are you porting ZFS to Linux? If so can you share 
> any details?
>

Last year, as part of the Google Summer of Code program and with the 
mentoring help of Mark Shellenbaum, I've already ported ZFS to Linux as 
a userspace filesystem (implemented on top of the FUSE framework).

You can go to http://www.wizy.org/wiki/ZFS_on_FUSE for details.

The work I was doing was related to ZFS-FUSE and also to the Lustre ZFS 
integration, both of which share most of the code (except for the ZPL 
which is only used in ZFS-FUSE).

As part of this work I need to keep the code synchronized with 
OpenSolaris. I was in fact refactoring the rwlock functionality 
implemented on top of PThreads and ran into that issue, hence my question.

Regards,
Ricardo


-- 


*Ricardo Manuel Correia*

Lustre Engineering
*Sun Microsystems, Inc.*
Portugal

-- next part --
An HTML attachment was scrubbed...
URL: 

-- next part --
A non-text attachment was scrubbed...
Name: 6g_top.gif
Type: image/gif
Size: 1257 bytes
Desc: not available
URL: 



[zfs-code] R/W lock portability issue

2007-12-07 Thread Robert Milkowski
An HTML attachment was scrubbed...
URL: 



[zfs-code] R/W lock portability issue

2007-12-05 Thread Robert Milkowski
An HTML attachment was scrubbed...
URL: