Re: SMP question w.r.t. reading kernel variables
On Tue, Apr 19, 2011 at 12:00:29PM +, freebsd-hackers-requ...@freebsd.org wrote: Subject: Re: SMP question w.r.t. reading kernel variables To: Rick Macklem rmack...@uoguelph.ca Cc: freebsd-hackers@freebsd.org Message-ID: 201104181712.14457@freebsd.org [John Baldwin] On Monday, April 18, 2011 4:22:37 pm Rick Macklem wrote: On Sunday, April 17, 2011 3:49:48 pm Rick Macklem wrote: ... All of this makes sense. What I was concerned about was memory cache consistency and whet (if anything) has to be done to make sure a thread doesn't see a stale cached value for the memory location. Here's a generic example of what I was thinking of: (assume x is a global int and y is a local int on the thread's stack) - time proceeds down the screen thread X on CPU 0 thread Y on CPU 1 x = 0; x = 0; /* 0 for x's location in CPU 1's memory cache */ x = 1; y = x; -- now, is y guaranteed to be 1 or can it get the stale cached 0 value? if not, what needs to be done to guarantee it? Well, the bigger problem is getting the CPU and compiler to order the instructions such that they don't execute out of order, etc. Because of that, even if your code has 'x = 0; x = 1;' as adjacent threads in thread X, the 'x = 1' may actually execute a good bit after the 'y = x' on CPU 1. Actually, as I recall the rules for C, it's worse than that. For this (admittedly simplified scenario), x=0; in thread X may never execute unless it's declared volatile, as the compiler may optimize it out and emit no code for it. Locks force that to sychronize as the CPUs coordinate around the lock cookie (e.g. the 'mtx_lock' member of 'struct mutex'). Also, I see cases of: mtx_lock(np); np-n_attrstamp = 0; mtx_unlock(np); in the regular NFS client. Why is the assignment mutex locked? (I had assumed it was related to the above memory caching issue, but now I'm not so sure.) In general I think writes to data that are protected by locks should always be protected by locks. In some cases you may be able to read data using weaker locking (where no locking can be a form of weaker locking, but also a read/shared lock is weak, and if a variable is protected by multiple locks, then any singe lock is weak, but sufficient for reading while all of the associated locks must be held for writing) than writing, but writing generally requires full locking (write locks, etc.). Oops, I now see that you've differentiated between writing and reading. (I mistakenly just stated that you had recommended a lock for reading. Sorry about my misinterpretation of the above on the first quick read.) What he said. In addition to all that, lock operations generate atomic barriers which a compiler or optimizer is prevented from moving code across. All good and useful comments, thanks. The above example was meant to be contrived, to indicate what I was worried about w.r.t. memory caches. Here's a somewhat simplified version of what my actual problem is: (Mostly fyi, in case you are interested.) Thread X is doing a forced dismount of an NFS volume, it (in dounmount()): - sets MNTK_UNMOUNTF - calls VFS_SYNC()/nfs_sync() - so this doesn't get hung on an unresponsive server it must test for MNTK_UNMOUNTF and return an error it is set. This seems fine, since it is the same thread and in a called function. (I can't imagine that the optimizer could move setting of a global flag to after a function call which might use it.) - calls VFS_UNMOUNT()/nfs_unmount() - now the fun begins... after some other stuff, it calls nfscl_umount() to get rid of the state info (opens/locks...) nfscl_umount() - synchronizes with other threads that will use this state (see below) using the combination of a mutex and a shared/exclusive sleep lock. (Because of various quirks in the code, this shared/exclusive lock is a locally coded version and I happenned to call the shared case a refcnt and the exclusive case just a lock.) Other threads that will use state info (open/lock...) will: -call nfscl_getcl() - this function does two things that are relevant 1 - it allocates a new clientid, as required, while holding the mutex - this case needs to check for MNTK_UNMOUNTF and return error, in case the clientid has already been deleted by nfscl_umount() above. (This happens before #2 because the sleep lock is in the clientd structure.) -- it must see the MNTK_UNMOUNTF set if it happens after (in a temporal sense) being set by dounmount() 2 - while holding the mutex, it acquires the shared lock - if this happens before nfscl_umount() gets the exclusive lock, it is fine, since acquisition of the exclusive lock above will wait for its
Re: SMP question w.r.t. reading kernel variables
On Wed, Apr 20, 2011 at 7:42 AM, Rick Macklem rmack...@uoguelph.ca wrote: On Tue, Apr 19, 2011 at 12:00:29PM +, freebsd-hackers-requ...@freebsd.org wrote: Subject: Re: SMP question w.r.t. reading kernel variables To: Rick Macklem rmack...@uoguelph.ca Cc: freebsd-hackers@freebsd.org Message-ID: 201104181712.14457@freebsd.org [John Baldwin] On Monday, April 18, 2011 4:22:37 pm Rick Macklem wrote: On Sunday, April 17, 2011 3:49:48 pm Rick Macklem wrote: ... All of this makes sense. What I was concerned about was memory cache consistency and whet (if anything) has to be done to make sure a thread doesn't see a stale cached value for the memory location. Here's a generic example of what I was thinking of: (assume x is a global int and y is a local int on the thread's stack) - time proceeds down the screen thread X on CPU 0 thread Y on CPU 1 x = 0; x = 0; /* 0 for x's location in CPU 1's memory cache */ x = 1; y = x; -- now, is y guaranteed to be 1 or can it get the stale cached 0 value? if not, what needs to be done to guarantee it? Well, the bigger problem is getting the CPU and compiler to order the instructions such that they don't execute out of order, etc. Because of that, even if your code has 'x = 0; x = 1;' as adjacent threads in thread X, the 'x = 1' may actually execute a good bit after the 'y = x' on CPU 1. Actually, as I recall the rules for C, it's worse than that. For this (admittedly simplified scenario), x=0; in thread X may never execute unless it's declared volatile, as the compiler may optimize it out and emit no code for it. Locks force that to sychronize as the CPUs coordinate around the lock cookie (e.g. the 'mtx_lock' member of 'struct mutex'). Also, I see cases of: mtx_lock(np); np-n_attrstamp = 0; mtx_unlock(np); in the regular NFS client. Why is the assignment mutex locked? (I had assumed it was related to the above memory caching issue, but now I'm not so sure.) In general I think writes to data that are protected by locks should always be protected by locks. In some cases you may be able to read data using weaker locking (where no locking can be a form of weaker locking, but also a read/shared lock is weak, and if a variable is protected by multiple locks, then any singe lock is weak, but sufficient for reading while all of the associated locks must be held for writing) than writing, but writing generally requires full locking (write locks, etc.). Oops, I now see that you've differentiated between writing and reading. (I mistakenly just stated that you had recommended a lock for reading. Sorry about my misinterpretation of the above on the first quick read.) What he said. In addition to all that, lock operations generate atomic barriers which a compiler or optimizer is prevented from moving code across. All good and useful comments, thanks. The above example was meant to be contrived, to indicate what I was worried about w.r.t. memory caches. Here's a somewhat simplified version of what my actual problem is: (Mostly fyi, in case you are interested.) Thread X is doing a forced dismount of an NFS volume, it (in dounmount()): - sets MNTK_UNMOUNTF - calls VFS_SYNC()/nfs_sync() - so this doesn't get hung on an unresponsive server it must test for MNTK_UNMOUNTF and return an error it is set. This seems fine, since it is the same thread and in a called function. (I can't imagine that the optimizer could move setting of a global flag to after a function call which might use it.) - calls VFS_UNMOUNT()/nfs_unmount() - now the fun begins... after some other stuff, it calls nfscl_umount() to get rid of the state info (opens/locks...) nfscl_umount() - synchronizes with other threads that will use this state (see below) using the combination of a mutex and a shared/exclusive sleep lock. (Because of various quirks in the code, this shared/exclusive lock is a locally coded version and I happenned to call the shared case a refcnt and the exclusive case just a lock.) Other threads that will use state info (open/lock...) will: -call nfscl_getcl() - this function does two things that are relevant 1 - it allocates a new clientid, as required, while holding the mutex - this case needs to check for MNTK_UNMOUNTF and return error, in case the clientid has already been deleted by nfscl_umount() above. (This happens before #2 because the sleep lock is in the clientd structure.) -- it must see the MNTK_UNMOUNTF set if it happens after (in a temporal sense) being set by dounmount() 2 - while holding the mutex, it acquires the shared lock
Re: SMP question w.r.t. reading kernel variables
[good stuff snipped for brevity] 1. Set MNTK_UNMOUNTF 2. Acquire a standard FreeBSD mutex m. 3. Update some data structures. 4. Release mutex m. Then, other threads that acquire m after step 4 has occurred will see MNTK_UNMOUNTF as set. But, other threads that beat thread X to step 2 may or may not see MNTK_UNMOUNTF as set. First off, Alan, thanks for the great explanation. I think it would be nice if this was captured somewhere in the docs, if it isn't already there somewhere (I couldn't spot it, but that doesn't mean anything:-). The question that I have about your specific scenario is concerned with VOP_SYNC(). Do you care if another thread performing nfscl_getcl() after thread X has performed VOP_SYNC() doesn't see MNTK_UNMOUNTF as set? Well, no and yes. It doesn't matter if it doesn't see it after thread X performed nfs_sync(), but it does matter that the threads calling nfscl_getcl() see it before they compete with thread X for the sleep lock. Another relevant question is Does VOP_SYNC() acquire and release the same mutex as nfscl_umount() and nfscl_getcl()? No. So, to get this to work correctly it sounds like I have to do one of the following: 1 - mtx_lock(m); mtx_unlock(m); in nfs_sync(), where m is the mutex used by nfscl_getcl() for the NFS open/lock state. or 2 - mtx_lock(m); mtx_unlock(m); mtx_lock(m); before the point where I care that the threads executing nfscl_getcl() see MNTK_UMOUNTF set in nfscl_umount(). or 3 - mtx_lock(m2); mtx_unlock(m2); in nfscl_getcl(), where m2 is the mutex used by thread X when setting MNTK_UMOUNTF, before mtx_lock(m); and then testing MNTK_UMOUNTF plus acquiring the sleep lock. (By doing it before, I can avoid any LOR issue and do an msleep() without worrying about having two mutex locks.) I think #3 reads the best, so I'll probably do that one. One more question, if you don't mind. Is step 3 in your explanation necessary for this to work? If it is, I can just create some global variable that I assign a value to between mtx_lock(m2); mtx_unlock(m2); but it won't be used for anything, so I thought I'd check if it is necessary? Thanks again for the clear explanation, rick ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
Re: SMP question w.r.t. reading kernel variables
[good stuff snipped for brevity] 1. Set MNTK_UNMOUNTF 2. Acquire a standard FreeBSD mutex m. 3. Update some data structures. 4. Release mutex m. Then, other threads that acquire m after step 4 has occurred will see MNTK_UNMOUNTF as set. But, other threads that beat thread X to step 2 may or may not see MNTK_UNMOUNTF as set. First off, Alan, thanks for the great explanation. I think it would be nice if this was captured somewhere in the docs, if it isn't already there somewhere (I couldn't spot it, but that doesn't mean anything:-). The question that I have about your specific scenario is concerned with VOP_SYNC(). Do you care if another thread performing nfscl_getcl() after thread X has performed VOP_SYNC() doesn't see MNTK_UNMOUNTF as set? Well, no and yes. It doesn't matter if it doesn't see it after thread X performed nfs_sync(), but it does matter that the threads calling nfscl_getcl() see it before they compete with thread X for the sleep lock. Another relevant question is Does VOP_SYNC() acquire and release the same mutex as nfscl_umount() and nfscl_getcl()? No. So, to get this to work correctly it sounds like I have to do one of the following: 1 - mtx_lock(m); mtx_unlock(m); in nfs_sync(), where m is the mutex used by nfscl_getcl() for the NFS open/lock state. or 2 - mtx_lock(m); mtx_unlock(m); mtx_lock(m); before the point where I care that the threads executing nfscl_getcl() see MNTK_UMOUNTF set in nfscl_umount(). or 3 - mtx_lock(m2); mtx_unlock(m2); in nfscl_getcl(), where m2 is the mutex used by thread X when setting MNTK_UMOUNTF, before mtx_lock(m); and then testing MNTK_UMOUNTF plus acquiring the sleep lock. (By doing it before, I can avoid any LOR issue and do an msleep() without worrying about having two mutex locks.) I think #3 reads the best, so I'll probably do that one. One more question, if you don't mind. Is step 3 in your explanation necessary for this to work? If it is, I can just create some global variable that I assign a value to between mtx_lock(m2); mtx_unlock(m2); but it won't be used for anything, so I thought I'd check if it is necessary? Oops, I screwed up this question. For my #3, all that needs to be done in nfscl_getcl() before I care if it sees MNTK_UMOUNTF set is mtx_lock(m2); since that has already gone through your steps 1-4. The question w.r.t. do you really need your step 3 would apply to the cases where I was using m (the mutex nfscl_umount() and nfscl_getcl() already use instead of the one used by thread X). rick ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org