Re: [PATCH] mutex: set owner only once on acquisition
On Thu, Jan 15, 2009 at 09:41:01AM +0100, Johannes Weiner wrote: mutex_lock() sets the lock owner, no need to set it upfront in __mutex_lock_common(). Inside __mutex_lock_common() we can cope with the case where the successful acquirer got preempted by us before setting the owner field: there is an explicit check in the spinning code and the sleeping part doesn't rely on it. The debug code does owner checks only on unlock where the field is garuanteed to be set. Signed-off-by: Johannes Weiner han...@cmpxchg.org --- kernel/mutex.c |2 -- 1 file changed, 2 deletions(-) Just a small observation. Peter said it wouldn't matter much as the write is to a hot cache line. But otoh, why keep it if it's not necessary. :) Damn, I'm really async here, sorry Peter. Just noticed you already picked it up. Hannes -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] adaptive spinning mutexes
* Chris Mason chris.ma...@oracle.com wrote: On Wed, 2009-01-14 at 19:33 +0100, Ingo Molnar wrote: * Peter Zijlstra pet...@infradead.org wrote: Full series, including changelogs available at: http://programming.kicks-ass.net/kernel-patches/mutex-adaptive-spin/ and should shortly appear in a git tree near Ingo :-) Linus, Please pull the adaptive-mutexes-for-linus git tree from: git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git adaptive-mutexes-for-linus I was going to put this into the btrfs tree, but since you have a branch just for adaptive mutexes, is it easier to put there? From: Chris Mason chris.ma...@oracle.com Btrfs: stop spinning on mutex_trylock and let the adaptive code spin for us Mutexes now spin internally and the btrfs spin is no longer required for performance. Signed-off-by: Chris Mason chris.ma...@oracle.com applied it to tip/core/locking, thanks Chris! Ingo -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] adaptive spinning mutexes
You just disproved your own case :( how so? 80% is not enough? I also checked Fedora and it has SCHED_DEBUG=y in its kernel rpms. Ubuntu has CONFIG_SCHED_DEBUG=y as well in their kernels. Debian too: mauer:~/bin# grep CONFIG_SCHED_DEBUG /boot/config-2.6.2* /boot/config-2.6.24-1-amd64:# CONFIG_SCHED_DEBUG is not set /boot/config-2.6.25-2-amd64:CONFIG_SCHED_DEBUG=y /boot/config-2.6.26-1-amd64:CONFIG_SCHED_DEBUG=y $ zgrep DEBUG_MUTEX /proc/config.gz # CONFIG_DEBUG_MUTEXES is not set mauer:~/bin# grep DEBUG_MUTEX /boot/config-2.6.2* /boot/config-2.6.22-3-amd64:# CONFIG_DEBUG_MUTEXES is not set /boot/config-2.6.24-1-amd64:# CONFIG_DEBUG_MUTEXES is not set /boot/config-2.6.25-2-amd64:# CONFIG_DEBUG_MUTEXES is not set /boot/config-2.6.26-1-amd64:# CONFIG_DEBUG_MUTEXES is not set Folkert van Heusden -- Looking for a cheap but fast webhoster with an excellent helpdesk? http://keetweej.vanheusden.com/redir.php?id=1001 -- Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] adaptive spinning mutexes
On Wed, Jan 14, 2009 at 08:28:11PM +0100, Ingo Molnar wrote: [v2.6.14] [v2.6.29] Semaphores | Mutexes -- | no-spin spin | [tmpfs] ops/sec: 50713 | 291038 392865 (+34.9%) [ext3]ops/sec: 45214 | 283291 435674 (+53.7%) A 10x macro-performance improvement on ext3, compared to 2.6.14 :-) While lots of other details got changed meanwhile, i'm sure most of the performance win on this particular VFS workload comes from mutexes. I asked a couple of our benchmarking teams to try -v9. Neither the OLTP benchmark, nor the kernel-perf test suite found any significant performance change. I suspect mutex contention isn't a significant problem for most workloads. Has anyone found a non-synthetic benchmark where this makes a significant difference? Aside from btrfs, I mean. -- Matthew Wilcox Intel Open Source Technology Centre Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] adaptive spinning mutexes
* Matthew Wilcox matt...@wil.cx wrote: On Wed, Jan 14, 2009 at 08:28:11PM +0100, Ingo Molnar wrote: [v2.6.14] [v2.6.29] Semaphores | Mutexes -- | no-spin spin | [tmpfs] ops/sec: 50713 | 291038 392865 (+34.9%) [ext3]ops/sec: 45214 | 283291 435674 (+53.7%) A 10x macro-performance improvement on ext3, compared to 2.6.14 :-) While lots of other details got changed meanwhile, i'm sure most of the performance win on this particular VFS workload comes from mutexes. I asked a couple of our benchmarking teams to try -v9. Neither the OLTP benchmark, nor the kernel-perf test suite found any significant performance change. I suspect mutex contention isn't a significant problem for most workloads. basically only VFS is mutex-bound really, and few of the 'benchmarks' tend to be VFS intense. Maybe things like mail-server benchmarks would do that. Also, -v9 is like two days old code ;-) Old and crufty. The real performance uptick was not even in -v10 but in -v11 (the one we submitted in this thread). Ingo -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] adaptive spinning mutexes
* Linus Torvalds torva...@linux-foundation.org wrote: Has anyone found a non-synthetic benchmark where this makes a significant difference? Aside from btrfs, I mean. Yea, if you have some particular filesystem (or other subsystem) that uses a global mutex, you'll obviously see way more contention. Btrfs may not be _unique_ in this regard, but it's definitely doing something that isn't good. Btw, it's doing something that ext3 also used to do iirc, until we fixed it to use spinlocks instead (the block group lock in particular). Yeah - just double-checked. Commit c12b9866ea52 in the historical Linux archive, from 2003. Which made block allocation protected by a per-group spinlock, rather than lock_super(). btw., i think spin-mutexes have a design advantage here: in a lot of code areas it's quite difficult to use spinlocks - cannot allocate memory, cannot call any code that can sporadically block (but does not _normally_ block), etc. With mutexes those atomicity constraints go away - and the performance profile should now be quite close to that of spinlocks as well. Ingo -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] adaptive spinning mutexes
On Thu, 15 Jan 2009, Ingo Molnar wrote: btw., i think spin-mutexes have a design advantage here: in a lot of code areas it's quite difficult to use spinlocks - cannot allocate memory, cannot call any code that can sporadically block (but does not _normally_ block), etc. With mutexes those atomicity constraints go away - and the performance profile should now be quite close to that of spinlocks as well. Umm. Except if you wrote the code nicely and used spinlocks, you wouldn't hold the lock over all those unnecessary and complex operations. IOW, if you do pre-allocation instead of holding a lock over the allocation, you win. So yes, spin-mutexes makes it easier to write the code, but it also makes it easier to just plain be lazy. Linus -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] adaptive spinning mutexes
* Chris Mason chris.ma...@oracle.com wrote: [ re: pipes, ok I don't know of realistic pipe benchmarks but I'll run them if people can suggest one ] Threaded servers making heavy use of sys_splice() ought to hit the pipe mutex all the time. Ingo -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] adaptive spinning mutexes
On Thu, 15 Jan 2009, Paul E. McKenney wrote: On Thu, Jan 15, 2009 at 10:16:53AM -0800, Linus Torvalds wrote: IOW, if you do pre-allocation instead of holding a lock over the allocation, you win. So yes, spin-mutexes makes it easier to write the code, but it also makes it easier to just plain be lazy. In infrequently invoked code such as some error handling, lazy/simple can be a big win. Sure. I don't disagree at all. On such code we don't even care about locking. If it _really_ is fundamentally very rarely invoked. But if we're talking things like core filesystem locks, it's _really_ irritating when one of those (supposedly rare) allocation delays or the need to do IO then blocks all those (supposedly common) nice cached cases. So I don't dispute at all that mutex with spinning performs better than a mutex, but I _do_ claim that it has some potentially huge downsides compared to a real spinlock. It may perform as well as a spinlock in the nice common case, but then when you hit the non-common case you see the difference between well-written code and badly written code. And sadly, while allocations _usually_ are nice and immediate, and while our caches _usually_ mean that we don't need to do IO, bad behavior when we do need to do IO is what really kills interactive feel. Suddenly everything else is hurting too, because they wanted that lock - even if they didn't need to do IO or allocate anything. Linus -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] adaptive spinning mutexes
On Thu, Jan 15, 2009 at 05:01:32PM -0800, Linus Torvalds wrote: On Thu, 15 Jan 2009, Paul E. McKenney wrote: On Thu, Jan 15, 2009 at 10:16:53AM -0800, Linus Torvalds wrote: IOW, if you do pre-allocation instead of holding a lock over the allocation, you win. So yes, spin-mutexes makes it easier to write the code, but it also makes it easier to just plain be lazy. In infrequently invoked code such as some error handling, lazy/simple can be a big win. Sure. I don't disagree at all. On such code we don't even care about locking. If it _really_ is fundamentally very rarely invoked. But if we're talking things like core filesystem locks, it's _really_ irritating when one of those (supposedly rare) allocation delays or the need to do IO then blocks all those (supposedly common) nice cached cases. Certainly if there was one big mutex covering all the operations, it would indeed be bad. On the other hand, if the filesystem/cache was partitioned (perhaps hashed) so that there was a large number of such locks, then if should be OK. Yes, I am making the perhaps naive assumption that hot spots such as the root inode would be in the cache. And that they would rarely collide with allocation or I/O, which might also be naive. But on this point I must defer to the filesystem folks. So I don't dispute at all that mutex with spinning performs better than a mutex, but I _do_ claim that it has some potentially huge downsides compared to a real spinlock. It may perform as well as a spinlock in the nice common case, but then when you hit the non-common case you see the difference between well-written code and badly written code. And sadly, while allocations _usually_ are nice and immediate, and while our caches _usually_ mean that we don't need to do IO, bad behavior when we do need to do IO is what really kills interactive feel. Suddenly everything else is hurting too, because they wanted that lock - even if they didn't need to do IO or allocate anything. I certainly agree that there are jobs that a spin-mutex is ill-suited for. Thanx, Paul -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] adaptive spinning mutexes
On Thu, Jan 15, 2009 at 10:16:53AM -0800, Linus Torvalds wrote: On Thu, 15 Jan 2009, Ingo Molnar wrote: btw., i think spin-mutexes have a design advantage here: in a lot of code areas it's quite difficult to use spinlocks - cannot allocate memory, cannot call any code that can sporadically block (but does not _normally_ block), etc. With mutexes those atomicity constraints go away - and the performance profile should now be quite close to that of spinlocks as well. Umm. Except if you wrote the code nicely and used spinlocks, you wouldn't hold the lock over all those unnecessary and complex operations. IOW, if you do pre-allocation instead of holding a lock over the allocation, you win. So yes, spin-mutexes makes it easier to write the code, but it also makes it easier to just plain be lazy. Yeah, I agree often it is harder to get the locking right but you end up with a better result. With mutexes, on the off chance you do have t oblock while holding the lock, performance and latency of other threads will tank. -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Btrfs: simplify iteration codes
merge list_for_each and list_entry to list_for_each_entry. Signed-off-by: Qinghuang Feng qhfeng.ker...@gmail.com --- diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index b187b53..70f0248 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -104,10 +104,8 @@ static noinline struct btrfs_device *__find_device(struct list_head *head, u64 devid, u8 *uuid) { struct btrfs_device *dev; - struct list_head *cur; - list_for_each(cur, head) { - dev = list_entry(cur, struct btrfs_device, dev_list); + list_for_each_entry(dev, head, dev_list) { if (dev-devid == devid (!uuid || !memcmp(dev-uuid, uuid, BTRFS_UUID_SIZE))) { return dev; @@ -118,11 +116,9 @@ static noinline struct btrfs_device *__find_device(struct list_head *head, static noinline struct btrfs_fs_devices *find_fsid(u8 *fsid) { - struct list_head *cur; struct btrfs_fs_devices *fs_devices; - list_for_each(cur, fs_uuids) { - fs_devices = list_entry(cur, struct btrfs_fs_devices, list); + list_for_each_entry(fs_devices, fs_uuids, list) { if (memcmp(fsid, fs_devices-fsid, BTRFS_FSID_SIZE) == 0) return fs_devices; } -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html