On 24.10.2012 21:30, Alexander Motin wrote:
On 24.10.2012 22:16, Andre Oppermann wrote:
On 24.10.2012 20:56, Jim Harris wrote:
On Wed, Oct 24, 2012 at 11:41 AM, Adrian Chadd <adr...@freebsd.org> wrote:
On 24 October 2012 11:36, Jim Harris <jimhar...@freebsd.org> wrote:
Pad tdq_lock to avoid false sharing with tdq_load and tdq_cpu_idle.
Ok, but..
struct mtx tdq_lock; /* run queue lock. */
+ char pad[64 - sizeof(struct mtx)];
.. don't we have an existing compile time macro for the cache line
size, which can be used here?
Yes, but I didn't use it for a couple of reasons:
1) struct tdq itself is currently using __aligned(64), so I wanted to
keep it consistent.
2) CACHE_LINE_SIZE is currently defined as 128 on x86, due to
NetBurst-based processors having 128-byte cache sectors a while back.
I had planned to start a separate thread on arch@ about this today on
whether this was still appropriate.
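
To make the padding concrete, here is a minimal userland sketch of the layout effect. The struct names, the stand-in for struct mtx, and the same_line() helper are hypothetical, not the kernel definitions; the 64-byte line size is an assumption that matches the __aligned(64) mentioned above, and the aligned attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE 64 /* assumed cache line size, matching __aligned(64) on struct tdq */

/* Hypothetical stand-in for struct mtx; the real layout is in the kernel headers. */
struct mtx_sketch {
	const char *lo_name;
	volatile uintptr_t mtx_lock;
};

/* Before the change: the lock and the hot counters share one line. */
struct tdq_unpadded {
	struct mtx_sketch tdq_lock;
	volatile int tdq_load;
	volatile int tdq_cpu_idle;
} __attribute__((aligned(LINE)));

/* After the change: pad[] pushes the counters onto the next line. */
struct tdq_padded {
	struct mtx_sketch tdq_lock;
	char pad[LINE - sizeof(struct mtx_sketch)];
	volatile int tdq_load;
	volatile int tdq_cpu_idle;
} __attribute__((aligned(LINE)));

/* The pad size only makes sense while the lock is smaller than one line. */
_Static_assert(sizeof(struct mtx_sketch) < LINE, "lock must fit in one line");

/* Returns nonzero when two byte offsets fall within the same cache line. */
static int
same_line(size_t a, size_t b)
{
	return (a / LINE == b / LINE);
}
```

With these definitions, offsetof(struct tdq_padded, tdq_load) is exactly LINE, while in the unpadded variant tdq_load sits directly behind the lock in the same line.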
See also the discussion on svn-src-all regarding global struct mtx
alignment.
Thank you for proving my point. ;)
Let's go back and see how we can do this the sanest way. These are
the options I see at the moment:
1. sprinkle __aligned(CACHE_LINE_SIZE) all over the place
2. use a macro like MTX_ALIGN that can be SMP/UP aware and in
the future possibly change to a different compiler dependent
align attribute
3. embed __aligned(CACHE_LINE_SIZE) into struct mtx itself so it
automatically gets aligned in all cases, even when dynamically
allocated.
Personally I'm undecided between #2 and #3. #1 is ugly. In favor
of #3 is that there possibly isn't any case where you'd actually
want the mutex to share a cache line with anything else, even a data
structure.
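
A rough sketch of how options 2 and 3 differ in layout, under stated assumptions: the MTX_ALIGN definition, the SKETCH_UP switch, and all struct names are hypothetical, the 64-byte value is assumed rather than taken from CACHE_LINE_SIZE on any real platform, and the attribute syntax assumes GCC or Clang. It only checks struct layout and says nothing about what an allocator returns for dynamically allocated objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LINE 64 /* assumed line size for this sketch */

/* Option 2: a macro that could expand to nothing on UP kernels. */
#ifdef SKETCH_UP
#define MTX_ALIGN
#else
#define MTX_ALIGN __attribute__((aligned(SKETCH_LINE)))
#endif

/* Hypothetical lock type without any alignment of its own. */
struct mtx_plain {
	const char *lo_name;
	volatile uintptr_t mtx_lock;
};

/* Option 3: the alignment is part of the lock type itself. */
struct mtx_embedded {
	const char *lo_name;
	volatile uintptr_t mtx_lock;
} __attribute__((aligned(SKETCH_LINE)));

/* Option 2 user: the lock starts on a line boundary, but its size is unchanged. */
struct user_opt2 {
	struct mtx_plain lock MTX_ALIGN;
	int data; /* still lands right behind the lock */
};

/* Option 3 user: sizeof(struct mtx_embedded) is rounded up to a full line. */
struct user_opt3 {
	struct mtx_embedded lock;
	int data; /* starts on the next line */
};
```

Under these assumptions the per-use attribute aligns where the lock starts but does not keep the following field out of its line, whereas the type-level attribute also rounds the lock's size up to a whole line. That is one reading of why #3 isolates the mutex "in all cases" within a containing structure.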
I'm sorry, could you give me a hint about the theory? I think I can
agree that cache line sharing can be a problem in the case of spin
locks -- a waiting thread will constantly try to access the page
modified by another CPU, which I guess will cause cache line writes to
RAM. But why is it so bad for a non-spin lock to share a line with its
respective data? Won't the benefit of a free, regular prefetch of the
right data while grabbing the lock compensate for the penalty of
relatively rare collisions?
Cliff Click describes it in detail:
http://www.azulsystems.com/blog/cliff/2009-04-14-odds-ends
For a classic mutex it likely doesn't make much difference since the
cache line is exclusive anyway while the lock is held. On LL/SC systems
there may be cache line dirtying on a failed locking attempt.
For spin mutexes it hurts badly as you noted.
Especially on RW mutexes it hurts because a read lock dirties the cache
line for all other CPUs. Here the RW mutex should be on its own cache
line in all cases.
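
To illustrate the read-lock point, here is a deliberately simplified C11 sketch. It is not FreeBSD's rwlock: the names are hypothetical, writer exclusion is omitted entirely, the 64-byte line size is assumed, and the aligned attribute assumes GCC or Clang. It only shows that taking a read lock is itself a store to the lock word, so every reader writes to the lock's cache line even though the protected data is never modified.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RW_LINE 64 /* assumed cache line size */

/* Hypothetical reader-count lock; the type-level alignment gives it its own line. */
struct rw_sketch {
	atomic_uint readers;
} __attribute__((aligned(RW_LINE)));

struct table_sketch {
	struct rw_sketch lock;
	int value; /* read-mostly data, placed on the following line */
};

/* A pure read of the data still performs two atomic writes to the lock word. */
static int
read_value(struct table_sketch *t)
{
	int v;

	/* Read lock: this store is what dirties the lock's line on other CPUs. */
	atomic_fetch_add_explicit(&t->lock.readers, 1, memory_order_acquire);
	v = t->value;
	/* Read unlock: a second store to the same line. */
	atomic_fetch_sub_explicit(&t->lock.readers, 1, memory_order_release);
	return (v);
}
```

Because the lock occupies its own line here, those reader-side stores do not invalidate the line holding value, which can stay shared in every CPU's cache.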
--
Andre