On Mon, Jan 18, 2016 at 8:09 PM, Alex Bennée <alex.ben...@linaro.org> wrote:
>
>
> Alex Bennée <alex.ben...@linaro.org> writes:
>
> > alvise rigo <a.r...@virtualopensystems.com> writes:
> >
> >> On Fri, Jan 15, 2016 at 4:25 PM, Alex Bennée <alex.ben...@linaro.org> wrote:
> >>>
> >>> alvise rigo <a.r...@virtualopensystems.com> writes:
> >>>
> >>>> On Fri, Jan 15, 2016 at 3:51 PM, Alex Bennée <alex.ben...@linaro.org> wrote:
> >>>>>
> >>>>> alvise rigo <a.r...@virtualopensystems.com> writes:
> >>>>>
> <snip>
> >>>> Keep in mind that Linux on arm64 uses the LDXP/STXP instructions that
> >>>> exist solely in aarch64.
> >>>> These instructions are purely emulated now and can potentially write
> >>>> 128 bits of data in a non-atomic fashion.
> >>>
> >>> Sure, but I doubt they are the reason for this hang as the kernel
> >>> doesn't use them.
> >>
> >> The kernel does use them for __cmpxchg_double in
> >> arch/arm64/include/asm/atomic_ll_sc.h.
> >
> > I take it back, if I'd have grepped for "ldxp" instead of "stxp" I would
> > have seen it, sorry about that ;-)
> >
> >> In any case, the normal exclusive instructions are also emulated in
> >> target-arm/translate-a64.c.
> >
> > I'll check on them on Monday. I'd assumed all the stuff was in the
> > helpers as I scanned through and missed the translate.c changes Fred
> > made. Hopefully that will be the last hurdle.
>
> I'm pleased to confirm you were right. I hacked up Fred's helper based
> solution for aarch64 including the ldxp/stxp stuff. It's not
> semantically correct because:
>
>   result = atomic_bool_cmpxchg(p, oldval, (uint8_t)newval) &&
>            atomic_bool_cmpxchg(&p[1], oldval2, (uint8_t)newval2);
>
> won't leave the system as it was before if the race causes the second

Exactly.

> cmpxchg to fail. I assume this won't be a problem in the LL/SC world as
> we'll be able to serialise all accesses to the exclusive page properly?

In LL/SC the idea would be to dedicate one ARM-specific helper (in
target-arm/helper-a64.c) to handle this case.
Once the helper has grabbed the excl mutex, we are allowed to make
accesses of 128 bits or wider.
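
Roughly something like the following. This is only a sketch to show the
idea, not the actual patch: excl_mutex and cmpxchg_pair_locked() are
made-up names, and a plain pthread mutex stands in for whatever lock the
helper would really take. Both halves are compared and stored under the
same lock, so a failed comparison on the second half leaves the first
half untouched.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the exclusive mutex; not QEMU's real lock. */
static pthread_mutex_t excl_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Compare both 64-bit halves against the expected values and store the
 * new pair only if both match. Returns true if the pair was updated.
 * The update is all-or-nothing only with respect to other code that
 * takes the same mutex before touching p[].
 */
static bool cmpxchg_pair_locked(uint64_t p[2],
                                uint64_t oldlo, uint64_t oldhi,
                                uint64_t newlo, uint64_t newhi)
{
    bool ok;

    pthread_mutex_lock(&excl_mutex);
    ok = (p[0] == oldlo) && (p[1] == oldhi);
    if (ok) {
        /* Both halves matched: write the full 128 bits. */
        p[0] = newlo;
        p[1] = newhi;
    }
    pthread_mutex_unlock(&excl_mutex);

    return ok;
}
```

Compared to chaining two independent cmpxchg operations, there is no
window where only the first half has been written.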

>
>
> See:
>
> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r2
>
> >
> > In the meantime if I'm not booting Jessie I can get MTTCG aarch64
> > working with an initrd-based rootfs. Once I've gone through those I'm
> > planning on giving it a good stress test with -fsanitize=thread.
>
> My first pass with this threw up a bunch of errors with the RCU code
> like this:
>
> WARNING: ThreadSanitizer: data race (pid=15387)
>   Atomic write of size 4 at 0x7f59efa51d48 by main thread (mutexes: write M172):
>     #0 __tsan_atomic32_fetch_add <null> (libtsan.so.0+0x000000058e8f)
>     #1 call_rcu1 util/rcu.c:288 (qemu-system-aarch64+0x0000006c3bd0)
>     #2 address_space_update_topology /home/alex/lsrc/qemu/qemu.git/memory.c:806 (qemu-system-aarch64+0x0000001ed9ca)
>     #3 memory_region_transaction_commit /home/alex/lsrc/qemu/qemu.git/memory.c:842 (qemu-system-aarch64+0x0000001ed9ca)
>     #4 address_space_init /home/alex/lsrc/qemu/qemu.git/memory.c:2136 (qemu-system-aarch64+0x0000001f1fa6)
>     #5 memory_map_init /home/alex/lsrc/qemu/qemu.git/exec.c:2344 (qemu-system-aarch64+0x000000196607)
>     #6 cpu_exec_init_all /home/alex/lsrc/qemu/qemu.git/exec.c:2795 (qemu-system-aarch64+0x000000196607)
>     #7 main /home/alex/lsrc/qemu/qemu.git/vl.c:4083 (qemu-system-aarch64+0x0000001829aa)
>
>   Previous read of size 4 at 0x7f59efa51d48 by thread T1:
>     #0 call_rcu_thread util/rcu.c:242 (qemu-system-aarch64+0x0000006c3d92)
>     #1 <null> <null> (libtsan.so.0+0x0000000235f9)
>
>   Location is global 'rcu_call_count' of size 4 at 0x7f59efa51d48 (qemu-system-aarch64+0x0000010f1d48)
>
>   Mutex M172 (0x7f59ef6254e0) created at:
>     #0 pthread_mutex_init <null> (libtsan.so.0+0x000000027ee5)
>     #1 qemu_mutex_init util/qemu-thread-posix.c:55 (qemu-system-aarch64+0x0000006ad747)
>     #2 qemu_init_cpu_loop /home/alex/lsrc/qemu/qemu.git/cpus.c:890 (qemu-system-aarch64+0x0000001d4166)
>     #3 main /home/alex/lsrc/qemu/qemu.git/vl.c:3005 (qemu-system-aarch64+0x0000001820ac)
>
>   Thread T1 (tid=15389, running) created by main thread at:
>     #0 pthread_create <null> (libtsan.so.0+0x0000000274c7)
>     #1 qemu_thread_create util/qemu-thread-posix.c:525 (qemu-system-aarch64+0x0000006ae04d)
>     #2 rcu_init_complete util/rcu.c:320 (qemu-system-aarch64+0x0000006c3d52)
>     #3 rcu_init util/rcu.c:351 (qemu-system-aarch64+0x00000018e288)
>     #4 __libc_csu_init <null> (qemu-system-aarch64+0x0000006c63ec)
>
>
> but I don't know how many are false positives so I'm going to look in more
> detail now.

Umm... I'm not very familiar with the sanitizer options, so I'll let you
follow this lead :).

alvise

>
> <snip>
>
> --
> Alex Bennée
