On 07/11/11 12:08 PM, Simon Marlow wrote:
On 10/07/2011 15:01, Karel Gardas wrote:
Hello,
while working on ARM port, I've found bug in WSDeque.c which might
possibly also hurt other architectures -- depending on how weak is their
memory model (i.e. how much load/store reordering it permits).
I'm certainly not sure if it affects x86, but I'm not sure if PowerPC is
safe here, hence I'm writing this.
For fix preview and explanation see:
https://github.com/kgardas/ghc/commit/ac7c59eba9806fc8216f44342c12b784e6680483
Yes - and I'm pretty certain you'll run into other issues like this if
ARM doesn't provide total store ordering.
Ehm, that'll be interesting...
Well done for tracking this
down anyway (was it the testwsdeque.c test case that failed?).
Yes, testwsdeque failed but just threaded1 (debug) way.
Will you submit this patch, or a pull request?
Patch attached. I guess this is easier this way. Whole ARM support will
be probably better by pull request I guess but it's not done yet.
Have you tried running some heavy -N2 tests? e.g. try the benchmarks in
nofib/parallel.
If this is part of normal nofib's `make' then this runs already.
Thanks,
Karel
>From 959518b7ee84566af63d82ee979a6751ef85f2d5 Mon Sep 17 00:00:00 2001
From: Karel Gardas <[email protected]>
Date: Sat, 9 Jul 2011 18:07:00 +0200
Subject: [PATCH 4/4] RTS: fix pushWSDeque to invoke write barrier when element
is added
This patch fixes RTS' pushWSDeque function. We need to invoke
write barrier after element is added to the queue and before moving
bottom. The reason for this is possible write reordering on modern CPUs
(e.g. ARMv7MP) where setting of element might be done later after moving
bottom. When such situation happen other thread might be waiting to steal
data from the queue and when bottom is moved it eagerly steals undefined
data from the queue since setting of element has not happened yet.
---
rts/WSDeque.c | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/rts/WSDeque.c b/rts/WSDeque.c
index 090a549..11ae8a1 100644
--- a/rts/WSDeque.c
+++ b/rts/WSDeque.c
@@ -279,6 +279,15 @@ pushWSDeque (WSDeque* q, void * elem)
}
q->elements[b & sz] = elem;
+ /*
+ KG: we need to put write barrier here since otherwise we might
+ end with elem not added to q->elements, but q->bottom already
+ modified (write reordering) and with stealWSDeque_ failing
+ later when invoked from another thread since it thinks elem is
+ there (in case there is just added element in the queue). This
+ issue concretely hit me on ARMv7 multi-core CPUs
+ */
+ write_barrier();
q->bottom = b + 1;
ASSERT_WSDEQUE_INVARIANTS(q);
--
1.7.4.3
_______________________________________________
Cvs-ghc mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/cvs-ghc