Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-19 Thread alvise rigo
On Mon, Jan 18, 2016 at 8:09 PM, Alex Bennée  wrote:
>
>
> Alex Bennée  writes:
>
> > alvise rigo  writes:
> >
> >> On Fri, Jan 15, 2016 at 4:25 PM, Alex Bennée  
> >> wrote:
> >>>
> >>> alvise rigo  writes:
> >>>
>  On Fri, Jan 15, 2016 at 3:51 PM, Alex Bennée  
>  wrote:
> >
> > alvise rigo  writes:
> >
> 
>  Keep in mind that Linux on arm64 uses the LDXP/STXP instructions that
>  exist solely in aarch64.
>  These instructions are purely emulated now and can potentially write
>  128 bits of data in a non-atomic fashion.
> >>>
> >>> Sure, but I doubt they are the reason for this hang as the kernel
> >>> doesn't use them.
> >>
> >> The kernel does use them for __cmpxchg_double in
> >> arch/arm64/include/asm/atomic_ll_sc.h.
> >
> > I take it back; if I'd grepped for "ldxp" instead of "stxp" I would
> > have seen it, sorry about that ;-)
> >
> >> In any case, the normal exclusive instructions are also emulated in
> >> target-arm/translate-a64.c.
> >
> > I'll check on them on Monday. I'd assumed all the stuff was in the
> > helpers as I scanned through and missed the translate.c changes Fred
> > made. Hopefully that will be the last hurdle.
>
> I'm pleased to confirm you were right. I hacked up Fred's helper-based
> solution for aarch64, including the ldxp/stxp stuff. It's not
> semantically correct, because:
>
>   result = atomic_bool_cmpxchg(p, oldval, (uint8_t)newval) &&
>            atomic_bool_cmpxchg(&p[1], oldval2, (uint8_t)newval2);
>
> won't leave the system as it was before if the race causes the second

Exactly.

> cmpxchg to fail. I assume this won't be a problem in the LL/SC world as
> we'll be able to serialise all accesses to the exclusive page properly?

In LL/SC the idea would be to dedicate one ARM-specific helper (in
target-arm/helper-a64.c) to handle this case.
Once the helper has grabbed the exclusive mutex, we are allowed to make
128-bit or wider accesses.
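
For illustration, here is a minimal sketch of what such a helper could
look like. The helper name, the mutex and the exact checks are
assumptions made up for this example, not the actual LL/SC patch code:

/* Hypothetical paired-exclusive store helper for aarch64; the names
 * (tcg_excl_mutex, HELPER(paired_stxp)) are illustrative only. */
static QemuMutex tcg_excl_mutex;     /* assumed global exclusive mutex */

uint64_t HELPER(paired_stxp)(CPUARMState *env, uint64_t addr,
                             uint64_t lo, uint64_t hi)
{
    uint64_t fail = 1;

    qemu_mutex_lock(&tcg_excl_mutex);
    /* With the mutex held, no other vCPU going through this helper
     * can observe the two 64-bit halves half-written. */
    if (env->exclusive_addr == addr) {
        cpu_stq_data(env, addr, lo);
        cpu_stq_data(env, addr + 8, hi);
        fail = 0;
    }
    env->exclusive_addr = -1;
    qemu_mutex_unlock(&tcg_excl_mutex);
    return fail;
}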

>
>
> See:
>
> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r2
>
> >
> > In the meantime, even if I can't boot Jessie I can get MTTCG aarch64
> > working with an initrd-based rootfs. Once I've gone through those I'm
> > planning on giving it a good stress test with -fsanitize=threads.
>
> My first pass with this threw up a bunch of errors with the RCU code
> like this:
>
> WARNING: ThreadSanitizer: data race (pid=15387)
>   Atomic write of size 4 at 0x7f59efa51d48 by main thread (mutexes: write 
> M172):
> #0 __tsan_atomic32_fetch_add  (libtsan.so.0+0x00058e8f)
> #1 call_rcu1 util/rcu.c:288 (qemu-system-aarch64+0x006c3bd0)
> #2 address_space_update_topology 
> /home/alex/lsrc/qemu/qemu.git/memory.c:806 
> (qemu-system-aarch64+0x001ed9ca)
> #3 memory_region_transaction_commit 
> /home/alex/lsrc/qemu/qemu.git/memory.c:842 
> (qemu-system-aarch64+0x001ed9ca)
> #4 address_space_init /home/alex/lsrc/qemu/qemu.git/memory.c:2136 
> (qemu-system-aarch64+0x001f1fa6)
> #5 memory_map_init /home/alex/lsrc/qemu/qemu.git/exec.c:2344 
> (qemu-system-aarch64+0x00196607)
> #6 cpu_exec_init_all /home/alex/lsrc/qemu/qemu.git/exec.c:2795 
> (qemu-system-aarch64+0x00196607)
> #7 main /home/alex/lsrc/qemu/qemu.git/vl.c:4083 
> (qemu-system-aarch64+0x001829aa)
>
>   Previous read of size 4 at 0x7f59efa51d48 by thread T1:
> #0 call_rcu_thread util/rcu.c:242 (qemu-system-aarch64+0x006c3d92)
> #1   (libtsan.so.0+0x000235f9)
>
>   Location is global 'rcu_call_count' of size 4 at 0x7f59efa51d48 
> (qemu-system-aarch64+0x010f1d48)
>
>   Mutex M172 (0x7f59ef6254e0) created at:
> #0 pthread_mutex_init  (libtsan.so.0+0x00027ee5)
> #1 qemu_mutex_init util/qemu-thread-posix.c:55 
> (qemu-system-aarch64+0x006ad747)
> #2 qemu_init_cpu_loop /home/alex/lsrc/qemu/qemu.git/cpus.c:890 
> (qemu-system-aarch64+0x001d4166)
> #3 main /home/alex/lsrc/qemu/qemu.git/vl.c:3005 
> (qemu-system-aarch64+0x001820ac)
>
>   Thread T1 (tid=15389, running) created by main thread at:
> #0 pthread_create  (libtsan.so.0+0x000274c7)
> #1 qemu_thread_create util/qemu-thread-posix.c:525 
> (qemu-system-aarch64+0x006ae04d)
> #2 rcu_init_complete util/rcu.c:320 (qemu-system-aarch64+0x006c3d52)
> #3 rcu_init util/rcu.c:351 (qemu-system-aarch64+0x0018e288)
> #4 __libc_csu_init  (qemu-system-aarch64+0x006c63ec)
>
>
> but I don't know how many are false positives so I'm going to look in more
> detail now.

Umm... I'm not very familiar with the sanitizer option; I'll let you
follow this lead :).

alvise

>
> 
>
> --
> Alex Bennée



Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-18 Thread Alex Bennée

Alex Bennée  writes:

> alvise rigo  writes:
>
>> On Fri, Jan 15, 2016 at 4:25 PM, Alex Bennée  wrote:
>>>
>>> alvise rigo  writes:
>>>
 On Fri, Jan 15, 2016 at 3:51 PM, Alex Bennée  
 wrote:
>
> alvise rigo  writes:
>

 Keep in mind that Linux on arm64 uses the LDXP/STXP instructions that
 exist solely in aarch64.
 These instructions are purely emulated now and can potentially write
 128 bits of data in a non-atomic fashion.
>>>
>>> Sure, but I doubt they are the reason for this hang as the kernel
>>> doesn't use them.
>>
>> The kernel does use them for __cmpxchg_double in
>> arch/arm64/include/asm/atomic_ll_sc.h.
>
> I take it back; if I'd grepped for "ldxp" instead of "stxp" I would
> have seen it, sorry about that ;-)
>
>> In any case, the normal exclusive instructions are also emulated in
>> target-arm/translate-a64.c.
>
> I'll check on them on Monday. I'd assumed all the stuff was in the
> helpers as I scanned through and missed the translate.c changes Fred
> made. Hopefully that will be the last hurdle.

I'm pleased to confirm you were right. I hacked up Fred's helper-based
solution for aarch64, including the ldxp/stxp stuff. It's not
semantically correct, because:

  result = atomic_bool_cmpxchg(p, oldval, (uint8_t)newval) &&
           atomic_bool_cmpxchg(&p[1], oldval2, (uint8_t)newval2);

won't leave the system as it was before if the race causes the second
cmpxchg to fail. I assume this won't be a problem in the LL/SC world as
we'll be able to serialise all accesses to the exclusive page properly?

See:

https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r2
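
To make the failure mode concrete, here is a stand-alone sketch
(simplified to 64-bit halves and C11 atomics, not the code from the
branch above) of why chaining two independent compare-and-swaps is not
an atomic 128-bit store:

/* If the second CAS loses a race, the first half has already been
 * committed and nothing rolls it back: the pair is torn. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static bool cmpxchg_pair(_Atomic uint64_t *p,
                         uint64_t old0, uint64_t new0,
                         uint64_t old1, uint64_t new1)
{
    /* First half commits... */
    if (!atomic_compare_exchange_strong(&p[0], &old0, new0)) {
        return false;
    }
    /* ...and if this fails, p[0] already holds new0. */
    return atomic_compare_exchange_strong(&p[1], &old1, new1);
}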

>
> In the meantime, even if I can't boot Jessie I can get MTTCG aarch64
> working with an initrd-based rootfs. Once I've gone through those I'm
> planning on giving it a good stress test with -fsanitize=threads.

My first pass with this threw up a bunch of errors with the RCU code
like this:

WARNING: ThreadSanitizer: data race (pid=15387)
  Atomic write of size 4 at 0x7f59efa51d48 by main thread (mutexes: write M172):
#0 __tsan_atomic32_fetch_add  (libtsan.so.0+0x00058e8f)
#1 call_rcu1 util/rcu.c:288 (qemu-system-aarch64+0x006c3bd0)
#2 address_space_update_topology /home/alex/lsrc/qemu/qemu.git/memory.c:806 
(qemu-system-aarch64+0x001ed9ca)
#3 memory_region_transaction_commit 
/home/alex/lsrc/qemu/qemu.git/memory.c:842 (qemu-system-aarch64+0x001ed9ca)
#4 address_space_init /home/alex/lsrc/qemu/qemu.git/memory.c:2136 
(qemu-system-aarch64+0x001f1fa6)
#5 memory_map_init /home/alex/lsrc/qemu/qemu.git/exec.c:2344 
(qemu-system-aarch64+0x00196607)
#6 cpu_exec_init_all /home/alex/lsrc/qemu/qemu.git/exec.c:2795 
(qemu-system-aarch64+0x00196607)
#7 main /home/alex/lsrc/qemu/qemu.git/vl.c:4083 
(qemu-system-aarch64+0x001829aa)

  Previous read of size 4 at 0x7f59efa51d48 by thread T1:
#0 call_rcu_thread util/rcu.c:242 (qemu-system-aarch64+0x006c3d92)
#1   (libtsan.so.0+0x000235f9)

  Location is global 'rcu_call_count' of size 4 at 0x7f59efa51d48 
(qemu-system-aarch64+0x010f1d48)

  Mutex M172 (0x7f59ef6254e0) created at:
#0 pthread_mutex_init  (libtsan.so.0+0x00027ee5)
#1 qemu_mutex_init util/qemu-thread-posix.c:55 
(qemu-system-aarch64+0x006ad747)
#2 qemu_init_cpu_loop /home/alex/lsrc/qemu/qemu.git/cpus.c:890 
(qemu-system-aarch64+0x001d4166)
#3 main /home/alex/lsrc/qemu/qemu.git/vl.c:3005 
(qemu-system-aarch64+0x001820ac)

  Thread T1 (tid=15389, running) created by main thread at:
#0 pthread_create  (libtsan.so.0+0x000274c7)
#1 qemu_thread_create util/qemu-thread-posix.c:525 
(qemu-system-aarch64+0x006ae04d)
#2 rcu_init_complete util/rcu.c:320 (qemu-system-aarch64+0x006c3d52)
#3 rcu_init util/rcu.c:351 (qemu-system-aarch64+0x0018e288)
#4 __libc_csu_init  (qemu-system-aarch64+0x006c63ec)


but I don't know how many are false positives so I'm going to look in more
detail now.
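
One plausible reading of this first report: call_rcu1 bumps
rcu_call_count with an atomic fetch-add while call_rcu_thread reads it
back with what the compiler emitted as a plain load, and ThreadSanitizer
cannot pair the two. A minimal stand-alone reproduction of that shape
(a mock of the pattern, not the QEMU code):

#include <pthread.h>
#include <stdio.h>

static int call_count;            /* plays the role of rcu_call_count */

static void *rcu_thread(void *arg)
{
    int n = call_count;           /* plain load: TSan reports a race  */
    printf("pending callbacks: %d\n", n);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, rcu_thread, NULL);
    /* Atomic RMW on the same int, as __tsan_atomic32_fetch_add saw. */
    __atomic_fetch_add(&call_count, 1, __ATOMIC_SEQ_CST);
    pthread_join(t, NULL);
    return 0;
}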



--
Alex Bennée



Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-15 Thread Alex Bennée

Pranith Kumar  writes:

> Hi Alex,
>
> On Tue, Jan 12, 2016 at 12:29 PM, Alex Bennée 
> wrote:
>
>>
> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks
>>
>
> I built this branch and ran an arm64 guest. It seems to be failing
> similarly to what I reported earlier:

Hi Pranith,

Can you try this branch:

https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r1

I think I've caught all the things likely to screw up addressing.

--
Alex Bennée



Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-15 Thread alvise rigo
This problem could be related to a missing multi-thread-aware
translation of the atomic instructions.
I'm working on this missing piece; probably next week I will
publish something.

Regards,
alvise

On Fri, Jan 15, 2016 at 3:24 PM, Pranith Kumar  wrote:
> Hi Alex,
>
> On Fri, Jan 15, 2016 at 8:53 AM, Alex Bennée  wrote:
>> Can you try this branch:
>>
>>
>> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r1
>>
>> I think I've caught all the things likely to screw up addressing.
>>
>
> I tried this branch and the boot hangs as follows:
>
> [2.001083] random: systemd-udevd urandom read with 1 bits of entropy
> available
> main-loop: WARNING: I/O thread spun for 1000 iterations
> [   23.778970] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected
> by 0, t=2102 jiffies, g=-165, c=-166, q=83)
> [   23.780265] All QSes seen, last rcu_sched kthread activity 2101
> (4294939656-4294937555), jiffies_till_next_fqs=1, root ->qsmask 0x0
> [   23.781228] swapper/0   R  running task0 0  0
> 0x0080
> [   23.781977] Call trace:
> [   23.782375] [] dump_backtrace+0x0/0x170
> [   23.782852] [] show_stack+0x20/0x2c
> [   23.783279] [] sched_show_task+0x9c/0xf0
> [   23.783746] [] rcu_check_callbacks+0x7b8/0x828
> [   23.784230] [] update_process_times+0x40/0x74
> [   23.784723] [] tick_sched_handle.isra.15+0x38/0x7c
> [   23.785247] [] tick_sched_timer+0x48/0x84
> [   23.785705] [] __run_hrtimer+0x90/0x200
> [   23.786148] [] hrtimer_interrupt+0xec/0x268
> [   23.786612] [] arch_timer_handler_virt+0x38/0x48
> [   23.787120] [] handle_percpu_devid_irq+0x90/0x12c
> [   23.787621] [] generic_handle_irq+0x38/0x54
> [   23.788093] [] __handle_domain_irq+0x68/0xc4
> [   23.788578] [] gic_handle_irq+0x38/0x84
> [   23.789035] Exception stack(0xffc00073bde0 to 0xffc00073bf00)
> [   23.789650] bde0: 00738000 ffc0 0073e71c ffc0 0073bf20 ffc0
> 00086948 ffc0
> [   23.790356] be00: 000d848c ffc0   3ffcdb0c ffc0
>  0100
> [   23.791030] be20: 38b97100 ffc0 0073bea0 ffc0 67f6e000 0005
> 567f1c33 
> [   23.791744] be40: 00748cf0 ffc0 0073be70 ffc0 c1e2e4a0 ffbd
> 3a801148 ffc0
> [   23.792406] be60:  0040 0073e000 ffc0 3a801168 ffc0
> 97bbb588 007f
> [   23.793055] be80: 0021d7e8 ffc0 97b3d6ec 007f c37184d0 007f
> 00738000 ffc0
> [   23.793720] bea0: 0073e71c ffc0 006ff7e8 ffc0 007c8000 ffc0
> 0073e680 ffc0
> [   23.794373] bec0: 0072fac0 ffc0 0001  0073bf30 ffc0
> 0050e9e8 ffc0
> [   23.795025] bee0:   0073bf20 ffc0 00086944 ffc0
> 0073bf20 ffc0
> [   23.795721] [] el1_irq+0x64/0xc0
> [   23.796131] [] cpu_startup_entry+0x130/0x204
> [   23.796605] [] rest_init+0x78/0x84
> [   23.797028] [] start_kernel+0x3a0/0x3b8
> [   23.797528] rcu_sched kthread starved for 2101 jiffies!
>
>
> I will try to debug and see where it is hanging.
>
> Thanks!
> --
> Pranith



Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-15 Thread KONRAD Frederic



On 15/01/2016 15:24, Pranith Kumar wrote:

Hi Alex,

On Fri, Jan 15, 2016 at 8:53 AM, Alex Bennée > wrote:

> Can you try this branch:
>
> 
https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r1

>
> I think I've caught all the things likely to screw up addressing.
>

I tried this branch and the boot hangs as follows:

[2.001083] random: systemd-udevd urandom read with 1 bits of 
entropy available

main-loop: WARNING: I/O thread spun for 1000 iterations
[   23.778970] INFO: rcu_sched detected stalls on CPUs/tasks: {} 
(detected by 0, t=2102 jiffies, g=-165, c=-166, q=83)
[   23.780265] All QSes seen, last rcu_sched kthread activity 2101 
(4294939656-4294937555), jiffies_till_next_fqs=1, root ->qsmask 0x0
[   23.781228] swapper/0   R  running task0 0  0 
0x0080

[   23.781977] Call trace:
[   23.782375] [] dump_backtrace+0x0/0x170
[   23.782852] [] show_stack+0x20/0x2c
[   23.783279] [] sched_show_task+0x9c/0xf0
[   23.783746] [] rcu_check_callbacks+0x7b8/0x828
[   23.784230] [] update_process_times+0x40/0x74
[   23.784723] [] tick_sched_handle.isra.15+0x38/0x7c
[   23.785247] [] tick_sched_timer+0x48/0x84
[   23.785705] [] __run_hrtimer+0x90/0x200
[   23.786148] [] hrtimer_interrupt+0xec/0x268
[   23.786612] [] arch_timer_handler_virt+0x38/0x48
[   23.787120] [] handle_percpu_devid_irq+0x90/0x12c
[   23.787621] [] generic_handle_irq+0x38/0x54
[   23.788093] [] __handle_domain_irq+0x68/0xc4
[   23.788578] [] gic_handle_irq+0x38/0x84
[   23.789035] Exception stack(0xffc00073bde0 to 0xffc00073bf00)
[   23.789650] bde0: 00738000 ffc0 0073e71c ffc0 0073bf20 
ffc0 00086948 ffc0
[   23.790356] be00: 000d848c ffc0   3ffcdb0c 
ffc0  0100
[   23.791030] be20: 38b97100 ffc0 0073bea0 ffc0 67f6e000 
0005 567f1c33 
[   23.791744] be40: 00748cf0 ffc0 0073be70 ffc0 c1e2e4a0 
ffbd 3a801148 ffc0
[   23.792406] be60:  0040 0073e000 ffc0 3a801168 
ffc0 97bbb588 007f
[   23.793055] be80: 0021d7e8 ffc0 97b3d6ec 007f c37184d0 
007f 00738000 ffc0
[   23.793720] bea0: 0073e71c ffc0 006ff7e8 ffc0 007c8000 
ffc0 0073e680 ffc0
[   23.794373] bec0: 0072fac0 ffc0 0001  0073bf30 
ffc0 0050e9e8 ffc0
[   23.795025] bee0:   0073bf20 ffc0 00086944 
ffc0 0073bf20 ffc0

[   23.795721] [] el1_irq+0x64/0xc0
[   23.796131] [] cpu_startup_entry+0x130/0x204
[   23.796605] [] rest_init+0x78/0x84
[   23.797028] [] start_kernel+0x3a0/0x3b8
[   23.797528] rcu_sched kthread starved for 2101 jiffies!


I will try to debug and see where it is hanging.

Thanks!
--
Pranith


Hi Pranith,

I don't have time today to look into that.

But I missed a tb_find_physical which happens while tb_lock is not held.
This hack should fix that (and probably slow things down):

diff --git a/cpu-exec.c b/cpu-exec.c
index 903126f..25a005a 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -252,9 +252,9 @@ static TranslationBlock *tb_find_physical(CPUState *cpu,
 }

 /* Move the TB to the head of the list */
-*ptb1 = tb->phys_hash_next;
-tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
-tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
+//*ptb1 = tb->phys_hash_next;
+//tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
+//tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
 return tb;
 }

Fred


Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-15 Thread Alex Bennée

alvise rigo  writes:

> This problem could be related to a missing multi-thread-aware
> translation of the atomic instructions.
> I'm working on this missing piece; probably next week I will
> publish something.

Maybe. We still have Fred's:

  Use atomic cmpxchg to atomically check the exclusive value in a STREX

Which I think papers over the cracks for both arm and aarch64 in MTTCG
while not being as correct as your work.
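
As described here, the idea of that patch is to replace the emulated
compare-then-store sequence of STREX with a single host compare-and-swap.
A rough sketch of that shape (the function name and the simplified check
are assumptions, not the code from Fred's tree):

#include <stdbool.h>
#include <stdint.h>

/* Succeeds only if *host_addr still holds the value seen by LDREX:
 * one atomic step replaces the racy "compare, then store" pair. */
static inline bool strex_try_store(uint32_t *host_addr,
                                   uint32_t loaded_val, uint32_t new_val)
{
    return __atomic_compare_exchange_n(host_addr, &loaded_val, new_val,
                                       false, __ATOMIC_SEQ_CST,
                                       __ATOMIC_SEQ_CST);
}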

>
> Regards,
> alvise
>
> On Fri, Jan 15, 2016 at 3:24 PM, Pranith Kumar  wrote:
>> Hi Alex,
>>
>> On Fri, Jan 15, 2016 at 8:53 AM, Alex Bennée  wrote:
>>> Can you try this branch:
>>>
>>>
>>> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r1
>>>
>>> I think I've caught all the things likely to screw up addressing.
>>>
>>
>> I tried this branch and the boot hangs as follows:
>>
>> [2.001083] random: systemd-udevd urandom read with 1 bits of entropy
>> available
>> main-loop: WARNING: I/O thread spun for 1000 iterations
>> [   23.778970] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected
>> by 0, t=2102 jiffies, g=-165, c=-166, q=83)

This is just saying the kernel has been waiting for a while and nothing
has happened.

>> I will try to debug and see where it is hanging.

If we knew what the kernel was waiting for, that would be useful.

>>
>> Thanks!
>> --
>> Pranith


--
Alex Bennée



Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-15 Thread Alex Bennée

alvise rigo  writes:

> On Fri, Jan 15, 2016 at 3:51 PM, Alex Bennée  wrote:
>>
>> alvise rigo  writes:
>>
>>> This problem could be related to a missing multi-thread-aware
>>> translation of the atomic instructions.
>>> I'm working on this missing piece; probably next week I will
>>> publish something.
>>
>> Maybe. We still have Fred's:
>>
>>   Use atomic cmpxchg to atomically check the exclusive value in a STREX
>>
>> Which I think papers over the cracks for both arm and aarch64 in MTTCG
>> while not being as correct as your work.
>
> Keep in mind that Linux on arm64 uses the LDXP/STXP instructions that
> exist solely in aarch64.
> These instructions are purely emulated now and can potentially write
> 128 bits of data in a non-atomic fashion.

Sure, but I doubt they are the reason for this hang as the kernel
doesn't use them.

>
> Regards,
> alvise
>
>>
>>>
>>> Regards,
>>> alvise
>>>
>>> On Fri, Jan 15, 2016 at 3:24 PM, Pranith Kumar  
>>> wrote:
 Hi Alex,

 On Fri, Jan 15, 2016 at 8:53 AM, Alex Bennée  
 wrote:
> Can you try this branch:
>
>
> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r1
>
> I think I've caught all the things likely to screw up addressing.
>

 I tried this branch and the boot hangs as follows:

 [2.001083] random: systemd-udevd urandom read with 1 bits of entropy
 available
 main-loop: WARNING: I/O thread spun for 1000 iterations
 [   23.778970] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected
 by 0, t=2102 jiffies, g=-165, c=-166, q=83)
>>
>> This is just saying the kernel has been waiting for a while and nothing
>> has happened.
>>
 I will try to debug and see where it is hanging.
>>
>> If we knew what the kernel was waiting for, that would be useful.
>>

 Thanks!
 --
 Pranith
>>
>>
>> --
>> Alex Bennée


--
Alex Bennée



Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-15 Thread KONRAD Frederic



On 15/01/2016 15:46, Alex Bennée wrote:

KONRAD Frederic  writes:


On 15/01/2016 15:24, Pranith Kumar wrote:

Hi Alex,

On Fri, Jan 15, 2016 at 8:53 AM, Alex Bennée > wrote:

Can you try this branch:



https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r1

I think I've caught all the things likely to screw up addressing.


I tried this branch and the boot hangs as follows:

[2.001083] random: systemd-udevd urandom read with 1 bits of
entropy available
main-loop: WARNING: I/O thread spun for 1000 iterations
[   23.778970] INFO: rcu_sched detected stalls on CPUs/tasks: {}
(detected by 0, t=2102 jiffies, g=-165, c=-166, q=83)
[   23.780265] All QSes seen, last rcu_sched kthread activity 2101
(4294939656-4294937555), jiffies_till_next_fqs=1, root ->qsmask 0x0
[   23.781228] swapper/0   R  running task0 0  0
0x0080
[   23.781977] Call trace:
[   23.782375] [] dump_backtrace+0x0/0x170
[   23.782852] [] show_stack+0x20/0x2c
[   23.783279] [] sched_show_task+0x9c/0xf0
[   23.783746] [] rcu_check_callbacks+0x7b8/0x828
[   23.784230] [] update_process_times+0x40/0x74
[   23.784723] [] tick_sched_handle.isra.15+0x38/0x7c
[   23.785247] [] tick_sched_timer+0x48/0x84
[   23.785705] [] __run_hrtimer+0x90/0x200
[   23.786148] [] hrtimer_interrupt+0xec/0x268
[   23.786612] [] arch_timer_handler_virt+0x38/0x48
[   23.787120] [] handle_percpu_devid_irq+0x90/0x12c
[   23.787621] [] generic_handle_irq+0x38/0x54
[   23.788093] [] __handle_domain_irq+0x68/0xc4
[   23.788578] [] gic_handle_irq+0x38/0x84
[   23.789035] Exception stack(0xffc00073bde0 to 0xffc00073bf00)
[   23.789650] bde0: 00738000 ffc0 0073e71c ffc0 0073bf20
ffc0 00086948 ffc0
[   23.790356] be00: 000d848c ffc0   3ffcdb0c
ffc0  0100
[   23.791030] be20: 38b97100 ffc0 0073bea0 ffc0 67f6e000
0005 567f1c33 
[   23.791744] be40: 00748cf0 ffc0 0073be70 ffc0 c1e2e4a0
ffbd 3a801148 ffc0
[   23.792406] be60:  0040 0073e000 ffc0 3a801168
ffc0 97bbb588 007f
[   23.793055] be80: 0021d7e8 ffc0 97b3d6ec 007f c37184d0
007f 00738000 ffc0
[   23.793720] bea0: 0073e71c ffc0 006ff7e8 ffc0 007c8000
ffc0 0073e680 ffc0
[   23.794373] bec0: 0072fac0 ffc0 0001  0073bf30
ffc0 0050e9e8 ffc0
[   23.795025] bee0:   0073bf20 ffc0 00086944
ffc0 0073bf20 ffc0
[   23.795721] [] el1_irq+0x64/0xc0
[   23.796131] [] cpu_startup_entry+0x130/0x204
[   23.796605] [] rest_init+0x78/0x84
[   23.797028] [] start_kernel+0x3a0/0x3b8
[   23.797528] rcu_sched kthread starved for 2101 jiffies!


I will try to debug and see where it is hanging.

Thanks!
--
Pranith

Hi Pranith,

I don't have time today to look into that.

But I missed a tb_find_physical which happens while tb_lock is not held.
This hack should fix that (and probably slow things down):

diff --git a/cpu-exec.c b/cpu-exec.c
index 903126f..25a005a 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -252,9 +252,9 @@ static TranslationBlock *tb_find_physical(CPUState *cpu,
   }

   /* Move the TB to the head of the list */
-*ptb1 = tb->phys_hash_next;
-tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
-tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
+//*ptb1 = tb->phys_hash_next;
+//tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
+//tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
   return tb;
   }

Hmm, not in my build; cpu_exec does:

 ...
 tb_lock();
 tb = tb_find_fast(cpu);
 ...

Which I think is right. I can see that if it wasn't held, breakage
could occur when you manipulate the lookup, but I think we should keep
the lock there and, if it proves to be a performance hit, come up with a
safe optimisation. I think Paolo talked about using RCU-type locks.

That's definitely a performance hit.
OK, we should talk about that on Monday.

Fred


--
Alex Bennée





Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-15 Thread alvise rigo
On Fri, Jan 15, 2016 at 3:51 PM, Alex Bennée  wrote:
>
> alvise rigo  writes:
>
>> This problem could be related to a missing multi-thread-aware
>> translation of the atomic instructions.
>> I'm working on this missing piece; probably next week I will
>> publish something.
>
> Maybe. We still have Fred's:
>
>   Use atomic cmpxchg to atomically check the exclusive value in a STREX
>
> Which I think papers over the cracks for both arm and aarch64 in MTTCG
> while not being as correct as your work.

Keep in mind that Linux on arm64 uses the LDXP/STXP instructions that
exist solely in aarch64.
These instructions are purely emulated now and can potentially write
128 bits of data in a non-atomic fashion.

Regards,
alvise

>
>>
>> Regards,
>> alvise
>>
>> On Fri, Jan 15, 2016 at 3:24 PM, Pranith Kumar  wrote:
>>> Hi Alex,
>>>
>>> On Fri, Jan 15, 2016 at 8:53 AM, Alex Bennée  wrote:
 Can you try this branch:


 https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r1

 I think I've caught all the things likely to screw up addressing.

>>>
>>> I tried this branch and the boot hangs as follows:
>>>
>>> [2.001083] random: systemd-udevd urandom read with 1 bits of entropy
>>> available
>>> main-loop: WARNING: I/O thread spun for 1000 iterations
>>> [   23.778970] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected
>>> by 0, t=2102 jiffies, g=-165, c=-166, q=83)
>
> This is just saying the kernel has been waiting for a while and nothing
> has happened.
>
>>> I will try to debug and see where it is hanging.
>
> If we knew what the kernel was waiting for, that would be useful.
>
>>>
>>> Thanks!
>>> --
>>> Pranith
>
>
> --
> Alex Bennée



Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-15 Thread Pranith Kumar
Hi Alex,

On Fri, Jan 15, 2016 at 8:53 AM, Alex Bennée  wrote:
> Can you try this branch:
>
>
https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r1
>
> I think I've caught all the things likely to screw up addressing.
>

I tried this branch and the boot hangs as follows:

[2.001083] random: systemd-udevd urandom read with 1 bits of entropy
available
main-loop: WARNING: I/O thread spun for 1000 iterations
[   23.778970] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected
by 0, t=2102 jiffies, g=-165, c=-166, q=83)
[   23.780265] All QSes seen, last rcu_sched kthread activity 2101
(4294939656-4294937555), jiffies_till_next_fqs=1, root ->qsmask 0x0
[   23.781228] swapper/0   R  running task0 0  0
0x0080
[   23.781977] Call trace:
[   23.782375] [] dump_backtrace+0x0/0x170
[   23.782852] [] show_stack+0x20/0x2c
[   23.783279] [] sched_show_task+0x9c/0xf0
[   23.783746] [] rcu_check_callbacks+0x7b8/0x828
[   23.784230] [] update_process_times+0x40/0x74
[   23.784723] [] tick_sched_handle.isra.15+0x38/0x7c
[   23.785247] [] tick_sched_timer+0x48/0x84
[   23.785705] [] __run_hrtimer+0x90/0x200
[   23.786148] [] hrtimer_interrupt+0xec/0x268
[   23.786612] [] arch_timer_handler_virt+0x38/0x48
[   23.787120] [] handle_percpu_devid_irq+0x90/0x12c
[   23.787621] [] generic_handle_irq+0x38/0x54
[   23.788093] [] __handle_domain_irq+0x68/0xc4
[   23.788578] [] gic_handle_irq+0x38/0x84
[   23.789035] Exception stack(0xffc00073bde0 to 0xffc00073bf00)
[   23.789650] bde0: 00738000 ffc0 0073e71c ffc0 0073bf20 ffc0
00086948 ffc0
[   23.790356] be00: 000d848c ffc0   3ffcdb0c ffc0
 0100
[   23.791030] be20: 38b97100 ffc0 0073bea0 ffc0 67f6e000 0005
567f1c33 
[   23.791744] be40: 00748cf0 ffc0 0073be70 ffc0 c1e2e4a0 ffbd
3a801148 ffc0
[   23.792406] be60:  0040 0073e000 ffc0 3a801168 ffc0
97bbb588 007f
[   23.793055] be80: 0021d7e8 ffc0 97b3d6ec 007f c37184d0 007f
00738000 ffc0
[   23.793720] bea0: 0073e71c ffc0 006ff7e8 ffc0 007c8000 ffc0
0073e680 ffc0
[   23.794373] bec0: 0072fac0 ffc0 0001  0073bf30 ffc0
0050e9e8 ffc0
[   23.795025] bee0:   0073bf20 ffc0 00086944 ffc0
0073bf20 ffc0
[   23.795721] [] el1_irq+0x64/0xc0
[   23.796131] [] cpu_startup_entry+0x130/0x204
[   23.796605] [] rest_init+0x78/0x84
[   23.797028] [] start_kernel+0x3a0/0x3b8
[   23.797528] rcu_sched kthread starved for 2101 jiffies!


I will try to debug and see where it is hanging.

Thanks!
-- 
Pranith


Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-15 Thread Alex Bennée

KONRAD Frederic  writes:

> On 15/01/2016 15:24, Pranith Kumar wrote:
>> Hi Alex,
>>
>> On Fri, Jan 15, 2016 at 8:53 AM, Alex Bennée > > wrote:
>> > Can you try this branch:
>> >
>> >
>> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r1
>> >
>> > I think I've caught all the things likely to screw up addressing.
>> >
>>
>> I tried this branch and the boot hangs as follows:
>>
>> [2.001083] random: systemd-udevd urandom read with 1 bits of
>> entropy available
>> main-loop: WARNING: I/O thread spun for 1000 iterations
>> [   23.778970] INFO: rcu_sched detected stalls on CPUs/tasks: {}
>> (detected by 0, t=2102 jiffies, g=-165, c=-166, q=83)
>> [   23.780265] All QSes seen, last rcu_sched kthread activity 2101
>> (4294939656-4294937555), jiffies_till_next_fqs=1, root ->qsmask 0x0
>> [   23.781228] swapper/0   R  running task0 0  0
>> 0x0080
>> [   23.781977] Call trace:
>> [   23.782375] [] dump_backtrace+0x0/0x170
>> [   23.782852] [] show_stack+0x20/0x2c
>> [   23.783279] [] sched_show_task+0x9c/0xf0
>> [   23.783746] [] rcu_check_callbacks+0x7b8/0x828
>> [   23.784230] [] update_process_times+0x40/0x74
>> [   23.784723] [] tick_sched_handle.isra.15+0x38/0x7c
>> [   23.785247] [] tick_sched_timer+0x48/0x84
>> [   23.785705] [] __run_hrtimer+0x90/0x200
>> [   23.786148] [] hrtimer_interrupt+0xec/0x268
>> [   23.786612] [] arch_timer_handler_virt+0x38/0x48
>> [   23.787120] [] handle_percpu_devid_irq+0x90/0x12c
>> [   23.787621] [] generic_handle_irq+0x38/0x54
>> [   23.788093] [] __handle_domain_irq+0x68/0xc4
>> [   23.788578] [] gic_handle_irq+0x38/0x84
>> [   23.789035] Exception stack(0xffc00073bde0 to 0xffc00073bf00)
>> [   23.789650] bde0: 00738000 ffc0 0073e71c ffc0 0073bf20
>> ffc0 00086948 ffc0
>> [   23.790356] be00: 000d848c ffc0   3ffcdb0c
>> ffc0  0100
>> [   23.791030] be20: 38b97100 ffc0 0073bea0 ffc0 67f6e000
>> 0005 567f1c33 
>> [   23.791744] be40: 00748cf0 ffc0 0073be70 ffc0 c1e2e4a0
>> ffbd 3a801148 ffc0
>> [   23.792406] be60:  0040 0073e000 ffc0 3a801168
>> ffc0 97bbb588 007f
>> [   23.793055] be80: 0021d7e8 ffc0 97b3d6ec 007f c37184d0
>> 007f 00738000 ffc0
>> [   23.793720] bea0: 0073e71c ffc0 006ff7e8 ffc0 007c8000
>> ffc0 0073e680 ffc0
>> [   23.794373] bec0: 0072fac0 ffc0 0001  0073bf30
>> ffc0 0050e9e8 ffc0
>> [   23.795025] bee0:   0073bf20 ffc0 00086944
>> ffc0 0073bf20 ffc0
>> [   23.795721] [] el1_irq+0x64/0xc0
>> [   23.796131] [] cpu_startup_entry+0x130/0x204
>> [   23.796605] [] rest_init+0x78/0x84
>> [   23.797028] [] start_kernel+0x3a0/0x3b8
>> [   23.797528] rcu_sched kthread starved for 2101 jiffies!
>>
>>
>> I will try to debug and see where it is hanging.
>>
>> Thanks!
>> --
>> Pranith
>
> Hi Pranith,
>
> I don't have time today to look into that.
>
> But I missed a tb_find_physical which happens while tb_lock is not held.
> This hack should fix that (and probably slow things down):
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 903126f..25a005a 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -252,9 +252,9 @@ static TranslationBlock *tb_find_physical(CPUState *cpu,
>   }
>
>   /* Move the TB to the head of the list */
> -*ptb1 = tb->phys_hash_next;
> -tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
> -tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
> +//*ptb1 = tb->phys_hash_next;
> +//tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
> +//tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
>   return tb;
>   }

Hmm, not in my build; cpu_exec does:

...
tb_lock();
tb = tb_find_fast(cpu);
...

Which I think is right. I can see that if it wasn't held, breakage
could occur when you manipulate the lookup, but I think we should keep
the lock there and, if it proves to be a performance hit, come up with a
safe optimisation. I think Paolo talked about using RCU-type locks.
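
A rough sketch of the RCU direction mentioned above, in which readers
traverse the chain without ever relinking it while writers still take
tb_lock; this is an illustration of the idea against the structures
named in the diff, not code that was merged:

/* Read side: walk the physical hash chain under rcu_read_lock() and
 * never modify it; only writers holding tb_lock relink the chain.
 * The match condition is simplified for the sketch. */
static TranslationBlock *tb_find_physical_rcu(target_ulong pc, unsigned int h)
{
    TranslationBlock *tb;

    rcu_read_lock();
    for (tb = atomic_rcu_read(&tcg_ctx.tb_ctx.tb_phys_hash[h]);
         tb != NULL;
         tb = atomic_rcu_read(&tb->phys_hash_next)) {
        if (tb->pc == pc) {               /* simplified match condition */
            break;
        }
    }
    rcu_read_unlock();
    return tb;
}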

--
Alex Bennée



Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-15 Thread Alex Bennée

alvise rigo  writes:

> On Fri, Jan 15, 2016 at 4:25 PM, Alex Bennée  wrote:
>>
>> alvise rigo  writes:
>>
>>> On Fri, Jan 15, 2016 at 3:51 PM, Alex Bennée  wrote:

 alvise rigo  writes:

> This problem could be related to a missing multi-thread-aware
> translation of the atomic instructions.
> I'm working on this missing piece; probably next week I will
> publish something.

 Maybe. We still have Fred's:

   Use atomic cmpxchg to atomically check the exclusive value in a STREX

 Which I think papers over the cracks for both arm and aarch64 in MTTCG
 while not being as correct as your work.
>>>
>>> Keep in mind that Linux on arm64 uses the LDXP/STXP instructions that
>>> exist solely in aarch64.
>>> These instructions are purely emulated now and can potentially write
>>> 128 bits of data in a non-atomic fashion.
>>
>> Sure, but I doubt they are the reason for this hang as the kernel
>> doesn't use them.
>
> The kernel does use them for __cmpxchg_double in
> arch/arm64/include/asm/atomic_ll_sc.h.

I take it back; if I'd grepped for "ldxp" instead of "stxp" I would
have seen it, sorry about that ;-)

> In any case, the normal exclusive instructions are also emulated in
> target-arm/translate-a64.c.

I'll check on them on Monday. I'd assumed all the stuff was in the
helpers as I scanned through and missed the translate.c changes Fred
made. Hopefully that will be the last hurdle.

In the meantime, even if I can't boot Jessie I can get MTTCG aarch64
working with an initrd-based rootfs. Once I've gone through those I'm
planning on giving it a good stress test with -fsanitize=threads.

>
> alvise
>
>>
>>>
>>> Regards,
>>> alvise
>>>

>
> Regards,
> alvise
>
> On Fri, Jan 15, 2016 at 3:24 PM, Pranith Kumar  
> wrote:
>> Hi Alex,
>>
>> On Fri, Jan 15, 2016 at 8:53 AM, Alex Bennée  
>> wrote:
>>> Can you try this branch:
>>>
>>>
>>> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r1
>>>
>>> I think I've caught all the things likely to screw up addressing.
>>>
>>
>> I tried this branch and the boot hangs as follows:
>>
>> [2.001083] random: systemd-udevd urandom read with 1 bits of entropy
>> available
>> main-loop: WARNING: I/O thread spun for 1000 iterations
>> [   23.778970] INFO: rcu_sched detected stalls on CPUs/tasks: {} 
>> (detected
>> by 0, t=2102 jiffies, g=-165, c=-166, q=83)

 This is just saying the kernel has been waiting for a while and nothing
 has happened.

>> I will try to debug and see where it is hanging.

 If we knew what the kernel was waiting for, that would be useful.

>>
>> Thanks!
>> --
>> Pranith


 --
 Alex Bennée
>>
>>
>> --
>> Alex Bennée


--
Alex Bennée



Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-15 Thread alvise rigo
On Fri, Jan 15, 2016 at 4:25 PM, Alex Bennée  wrote:
>
> alvise rigo  writes:
>
>> On Fri, Jan 15, 2016 at 3:51 PM, Alex Bennée  wrote:
>>>
>>> alvise rigo  writes:
>>>
 This problem could be related to a missing multi-thread-aware
 translation of the atomic instructions.
 I'm working on this missing piece; probably next week I will
 publish something.
>>>
>>> Maybe. We still have Fred's:
>>>
>>>   Use atomic cmpxchg to atomically check the exclusive value in a STREX
>>>
>>> Which I think papers over the cracks for both arm and aarch64 in MTTCG
>>> while not being as correct as your work.
>>
>> Keep in mind that Linux on arm64 uses the LDXP/STXP instructions that
>> exist solely in aarch64.
>> These instructions are purely emulated now and can potentially write
>> 128 bits of data in a non-atomic fashion.
>
> Sure, but I doubt they are the reason for this hang as the kernel
> doesn't use them.

The kernel does use them for __cmpxchg_double in
arch/arm64/include/asm/atomic_ll_sc.h.
In any case, the normal exclusive instructions are also emulated in
target-arm/translate-a64.c.

alvise

>
>>
>> Regards,
>> alvise
>>
>>>

 Regards,
 alvise

 On Fri, Jan 15, 2016 at 3:24 PM, Pranith Kumar  
 wrote:
> Hi Alex,
>
> On Fri, Jan 15, 2016 at 8:53 AM, Alex Bennée  
> wrote:
>> Can you try this branch:
>>
>>
>> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks-r1
>>
>> I think I've caught all the things likely to screw up addressing.
>>
>
> I tried this branch and the boot hangs as follows:
>
> [2.001083] random: systemd-udevd urandom read with 1 bits of entropy
> available
> main-loop: WARNING: I/O thread spun for 1000 iterations
> [   23.778970] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected
> by 0, t=2102 jiffies, g=-165, c=-166, q=83)
>>>
>>> This is just saying the kernel has been waiting for a while and nothing
>>> has happened.
>>>
> I will try to debug and see where it is hanging.
>>>
>>> If we knew what the kernel was waiting for, that would be useful.
>>>
>
> Thanks!
> --
> Pranith
>>>
>>>
>>> --
>>> Alex Bennée
>
>
> --
> Alex Bennée



Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-15 Thread Paolo Bonzini


On 15/01/2016 15:49, KONRAD Frederic wrote:
> 
>/* Move the TB to the head of the list */
> -*ptb1 = tb->phys_hash_next;
> -tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
> -tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
> +//*ptb1 = tb->phys_hash_next;
> +//tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
> +//tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;

This is the right fix.  It's a *huge* performance hit to take the
tb_lock around tb_find_fast.
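
The race the removed lines would otherwise introduce can be seen with a
reduced model of the chain relinking (names simplified from the QEMU
structures):

/* The move-to-head is three dependent stores; a reader walking the
 * same chain concurrently can observe the intermediate states. */
struct tb { struct tb *next; };
struct tb *head;

static void move_to_head(struct tb **prev_link, struct tb *tb)
{
    *prev_link = tb->next;   /* (1) tb is momentarily unreachable...  */
    tb->next = head;         /* (2) ...or the chain transiently forks */
    head = tb;               /* (3) a concurrent lookup can miss tb   */
}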

Paolo



Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-14 Thread KONRAD Frederic



On 14/01/2016 14:10, Alex Bennée wrote:

Alex Bennée  writes:


Pranith Kumar  writes:


Hi Alex,

On Tue, Jan 12, 2016 at 12:29 PM, Alex Bennée 
wrote:

https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks
I built this branch and ran an arm64 guest. It seems to be failing
similarly to what I reported earlier:

#0  0x72211cc9 in __GI_raise (sig=sig@entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x722150d8 in __GI_abort () at abort.c:89
#2  0x5572014c in qemu_ram_addr_from_host_nofail
(ptr=0xffc000187863) at /home/pranith/devops/code/qemu/cputlb.c:357
#3  0x557209dd in get_page_addr_code (env1=0x56702058,
addr=18446743798833248356) at /home/pranith/devops/code/qemu/cputlb.c:568
#4  0x556db98c in tb_find_physical (cpu=0x566f9dd0,
pc=18446743798833248356, cs_base=0, flags=18446744071830503424) at
/home/pranith/devops/code/qemu/cpu-exec.c:224
#5  0x556dbaf4 in tb_find_slow (cpu=0x566f9dd0,
pc=18446743798833248356, cs_base=0, flags=18446744071830503424) at
/home/pranith/devops/code/qemu/cpu-exec.c:268
#6  0x556dbc77 in tb_find_fast (cpu=0x566f9dd0) at
/home/pranith/devops/code/qemu/cpu-exec.c:311
#7  0x556dc0f1 in cpu_arm_exec (cpu=0x566f9dd0) at
/home/pranith/devops/code/qemu/cpu-exec.c:492
#8  0x557050ee in tcg_cpu_exec (cpu=0x566f9dd0) at
/home/pranith/devops/code/qemu/cpus.c:1486
#9  0x557051af in tcg_exec_all (cpu=0x566f9dd0) at
/home/pranith/devops/code/qemu/cpus.c:1515
#10 0x55704800 in qemu_tcg_cpu_thread_fn (arg=0x566f9dd0) at
/home/pranith/devops/code/qemu/cpus.c:1187
#11 0x725a8182 in start_thread (arg=0x7fffd20c8700) at
pthread_create.c:312
#12 0x722d547d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111



Having seen a backtrace of a crash while the other thread was flushing
the TLB entries, I sprinkled a bunch of:

   g_assert(cpu == current_cpu);

In all public functions in cputlb that took a CPU. There are a bunch of
cases that don't defer actions across CPUs which need to be fixed up. I
suspect they don't hit in the arm case because the type of TLB flushing
pattern is different. In aarch64, in my backtrace it was triggered by
tlbi_aa64_vae1is_write:

   7Thread 0x7ffe777fe700 (LWP 32705) "worker" sem_timedwait () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
   6Thread 0x7ffe77fff700 (LWP 32704) "worker" sem_timedwait () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
   5Thread 0x7fff8d9d0700 (LWP 32703) "CPU 1/TCG" 0x5572cc18 in memcpy 
(__len=8, __src=, __dest=)
 at /usr/include/x86_64-linux-gnu/bits/string3.h:51
* 4Thread 0x7fff8e1d1700 (LWP 32702) "CPU 0/TCG" memset () at 
../sysdeps/x86_64/memset.S:94
   3Thread 0x7fff8f1cb700 (LWP 32701) "worker" sem_timedwait () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
   2Thread 0x7fffe45c8700 (LWP 32700) "qemu-system-aar" syscall () at 
../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
   1Thread 0x77f98c00 (LWP 32696) "qemu-system-aar" 0x70ba01ef in 
__GI_ppoll (fds=0x575cb5b0, nfds=8, timeout=,
 timeout@entry=0x7fffdf60, sigmask=sigmask@entry=0x0) at 
../sysdeps/unix/sysv/linux/ppoll.c:56
#0  memset () at ../sysdeps/x86_64/memset.S:94
#1  0x55728bee in memset (__len=32768, __ch=0, __dest=0x56632568) 
at /usr/include/x86_64-linux-gnu/bits/string3.h:84
#2  v_tlb_flush_by_mmuidx (argp=0x7fff8e1d0430, cpu=0x56632380) at 
/home/alex/lsrc/qemu/qemu.git/cputlb.c:136
#3  tlb_flush_page_by_mmuidx (cpu=cpu@entry=0x56632380, 
addr=addr@entry=547976253440) at /home/alex/lsrc/qemu/qemu.git/cputlb.c:243
#4  0x557fcb4a in tlbi_aa64_vae1is_write (env=, ri=, value=)
 at /home/alex/lsrc/qemu/qemu.git/target-arm/helper.c:2757
#5  0x7fffa441dac5 in code_gen_buffer ()
#6  0x556eef4b in cpu_tb_exec (tb_ptr=, 
cpu=0x565eddd0) at /home/alex/lsrc/qemu/qemu.git/cpu-exec.c:157
#7  cpu_arm_exec (cpu=cpu@entry=0x565eddd0) at 
/home/alex/lsrc/qemu/qemu.git/cpu-exec.c:520
#8  0x557108e8 in tcg_cpu_exec (cpu=0x565eddd0) at 
/home/alex/lsrc/qemu/qemu.git/cpus.c:1486
#9  tcg_exec_all (cpu=0x565eddd0) at 
/home/alex/lsrc/qemu/qemu.git/cpus.c:1515
#10 qemu_tcg_cpu_thread_fn (arg=0x565eddd0) at 
/home/alex/lsrc/qemu/qemu.git/cpus.c:1187
#11 0x70e80182 in start_thread (arg=0x7fff8e1d1700) at 
pthread_create.c:312
#12 0x70bad47d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
[Switching to thread 5 (Thread 0x7fff8d9d0700 (LWP 32703))]
#0  0x5572cc18 in memcpy (__len=8, __src=, 
__dest=) at /usr/include/x86_64-linux-gnu/bits/string3.h:51
51return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
#0  0x5572cc18 in memcpy (__len=8, __src=, 
__dest=) at 

Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-14 Thread Alex Bennée

Alex Bennée  writes:

> Pranith Kumar  writes:
>
>> Hi Alex,
>>
>> On Tue, Jan 12, 2016 at 12:29 PM, Alex Bennée 
>> wrote:
>>
>>>
>> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks
>>>
>>
>> I built this branch and ran an arm64 guest. It seems to be failing
>> similarly to what I reported earlier:
>>
>> #0  0x72211cc9 in __GI_raise (sig=sig@entry=6) at
>> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>> #1  0x722150d8 in __GI_abort () at abort.c:89
>> #2  0x5572014c in qemu_ram_addr_from_host_nofail
>> (ptr=0xffc000187863) at /home/pranith/devops/code/qemu/cputlb.c:357
>> #3  0x557209dd in get_page_addr_code (env1=0x56702058,
>> addr=18446743798833248356) at /home/pranith/devops/code/qemu/cputlb.c:568
>> #4  0x556db98c in tb_find_physical (cpu=0x566f9dd0,
>> pc=18446743798833248356, cs_base=0, flags=18446744071830503424) at
>> /home/pranith/devops/code/qemu/cpu-exec.c:224
>> #5  0x556dbaf4 in tb_find_slow (cpu=0x566f9dd0,
>> pc=18446743798833248356, cs_base=0, flags=18446744071830503424) at
>> /home/pranith/devops/code/qemu/cpu-exec.c:268
>> #6  0x556dbc77 in tb_find_fast (cpu=0x566f9dd0) at
>> /home/pranith/devops/code/qemu/cpu-exec.c:311
>> #7  0x556dc0f1 in cpu_arm_exec (cpu=0x566f9dd0) at
>> /home/pranith/devops/code/qemu/cpu-exec.c:492
>> #8  0x557050ee in tcg_cpu_exec (cpu=0x566f9dd0) at
>> /home/pranith/devops/code/qemu/cpus.c:1486
>> #9  0x557051af in tcg_exec_all (cpu=0x566f9dd0) at
>> /home/pranith/devops/code/qemu/cpus.c:1515
>> #10 0x55704800 in qemu_tcg_cpu_thread_fn (arg=0x566f9dd0) at
>> /home/pranith/devops/code/qemu/cpus.c:1187
>> #11 0x725a8182 in start_thread (arg=0x7fffd20c8700) at
>> pthread_create.c:312
>> #12 0x722d547d in clone () at
>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:111


Having seen a backtrace of a crash while the other thread was flushing
the TLB entries, I sprinkled a bunch of:

  g_assert(cpu == current_cpu);

In all public functions in cputlb that took a CPU. There are a bunch of
cases that don't defer actions across CPUs which need to be fixed up. I
suspect they don't hit in the arm case because the type of TLB flushing
pattern is different. In aarch64, in my backtrace it was triggered by
tlbi_aa64_vae1is_write:

  7Thread 0x7ffe777fe700 (LWP 32705) "worker" sem_timedwait () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
  6Thread 0x7ffe77fff700 (LWP 32704) "worker" sem_timedwait () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
  5Thread 0x7fff8d9d0700 (LWP 32703) "CPU 1/TCG" 0x5572cc18 in 
memcpy (__len=8, __src=, __dest=)
at /usr/include/x86_64-linux-gnu/bits/string3.h:51
* 4Thread 0x7fff8e1d1700 (LWP 32702) "CPU 0/TCG" memset () at 
../sysdeps/x86_64/memset.S:94
  3Thread 0x7fff8f1cb700 (LWP 32701) "worker" sem_timedwait () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
  2Thread 0x7fffe45c8700 (LWP 32700) "qemu-system-aar" syscall () at 
../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  1Thread 0x77f98c00 (LWP 32696) "qemu-system-aar" 0x70ba01ef 
in __GI_ppoll (fds=0x575cb5b0, nfds=8, timeout=,
timeout@entry=0x7fffdf60, sigmask=sigmask@entry=0x0) at 
../sysdeps/unix/sysv/linux/ppoll.c:56
#0  memset () at ../sysdeps/x86_64/memset.S:94
#1  0x55728bee in memset (__len=32768, __ch=0, __dest=0x56632568) 
at /usr/include/x86_64-linux-gnu/bits/string3.h:84
#2  v_tlb_flush_by_mmuidx (argp=0x7fff8e1d0430, cpu=0x56632380) at 
/home/alex/lsrc/qemu/qemu.git/cputlb.c:136
#3  tlb_flush_page_by_mmuidx (cpu=cpu@entry=0x56632380, 
addr=addr@entry=547976253440) at /home/alex/lsrc/qemu/qemu.git/cputlb.c:243
#4  0x557fcb4a in tlbi_aa64_vae1is_write (env=, 
ri=, value=)
at /home/alex/lsrc/qemu/qemu.git/target-arm/helper.c:2757
#5  0x7fffa441dac5 in code_gen_buffer ()
#6  0x556eef4b in cpu_tb_exec (tb_ptr=, 
cpu=0x565eddd0) at /home/alex/lsrc/qemu/qemu.git/cpu-exec.c:157
#7  cpu_arm_exec (cpu=cpu@entry=0x565eddd0) at 
/home/alex/lsrc/qemu/qemu.git/cpu-exec.c:520
#8  0x557108e8 in tcg_cpu_exec (cpu=0x565eddd0) at 
/home/alex/lsrc/qemu/qemu.git/cpus.c:1486
#9  tcg_exec_all (cpu=0x565eddd0) at 
/home/alex/lsrc/qemu/qemu.git/cpus.c:1515
#10 qemu_tcg_cpu_thread_fn (arg=0x565eddd0) at 
/home/alex/lsrc/qemu/qemu.git/cpus.c:1187
#11 0x70e80182 in start_thread (arg=0x7fff8e1d1700) at 
pthread_create.c:312
#12 0x70bad47d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
[Switching to thread 5 (Thread 0x7fff8d9d0700 (LWP 32703))]
#0  0x5572cc18 in memcpy (__len=8, __src=, 
__dest=) at /usr/include/x86_64-linux-gnu/bits/string3.h:51
51return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
#0  0x5572cc18 in memcpy 
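
For the cases that do not defer actions across CPUs, one fix direction
is to bounce the flush to the owning vCPU thread with async_run_on_cpu()
so that the assertion above holds. A hedged sketch; the wrapper name
and the full-flush simplification are assumptions:

static void do_deferred_flush(void *opaque)
{
    CPUState *cpu = opaque;

    g_assert(cpu == current_cpu);   /* now runs on the owning thread */
    tlb_flush(cpu, 1);              /* simplified to a full flush    */
}

static void safe_tlb_flush(CPUState *cpu)
{
    if (cpu == current_cpu) {
        tlb_flush(cpu, 1);
    } else {
        async_run_on_cpu(cpu, do_deferred_flush, cpu);
    }
}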

Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-14 Thread Alex Bennée

KONRAD Frederic  writes:

> On 14/01/2016 14:10, Alex Bennée wrote:
>> Alex Bennée  writes:
>>
>>> Pranith Kumar  writes:
>>>
 Hi Alex,

 On Tue, Jan 12, 2016 at 12:29 PM, Alex Bennée 
 wrote:

 https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks
 I built this branch and ran an arm64 guest. It seems to be failing
 similarly to what I reported earlier:

 #0  0x72211cc9 in __GI_raise (sig=sig@entry=6) at
 ../nptl/sysdeps/unix/sysv/linux/raise.c:56
 #1  0x722150d8 in __GI_abort () at abort.c:89
 #2  0x5572014c in qemu_ram_addr_from_host_nofail
 (ptr=0xffc000187863) at /home/pranith/devops/code/qemu/cputlb.c:357
 #3  0x557209dd in get_page_addr_code (env1=0x56702058,
 addr=18446743798833248356) at /home/pranith/devops/code/qemu/cputlb.c:568
 #4  0x556db98c in tb_find_physical (cpu=0x566f9dd0,
 pc=18446743798833248356, cs_base=0, flags=18446744071830503424) at
 /home/pranith/devops/code/qemu/cpu-exec.c:224
 #5  0x556dbaf4 in tb_find_slow (cpu=0x566f9dd0,
 pc=18446743798833248356, cs_base=0, flags=18446744071830503424) at
 /home/pranith/devops/code/qemu/cpu-exec.c:268
 #6  0x556dbc77 in tb_find_fast (cpu=0x566f9dd0) at
 /home/pranith/devops/code/qemu/cpu-exec.c:311
 #7  0x556dc0f1 in cpu_arm_exec (cpu=0x566f9dd0) at
 /home/pranith/devops/code/qemu/cpu-exec.c:492
 #8  0x557050ee in tcg_cpu_exec (cpu=0x566f9dd0) at
 /home/pranith/devops/code/qemu/cpus.c:1486
 #9  0x557051af in tcg_exec_all (cpu=0x566f9dd0) at
 /home/pranith/devops/code/qemu/cpus.c:1515
 #10 0x55704800 in qemu_tcg_cpu_thread_fn (arg=0x566f9dd0) at
 /home/pranith/devops/code/qemu/cpus.c:1187
 #11 0x725a8182 in start_thread (arg=0x7fffd20c8700) at
 pthread_create.c:312
 #12 0x722d547d in clone () at
 ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>> 
>>
>> Having seen a backtrace of a crash while the other thread was flushing
>> the TLB entries, I sprinkled a bunch of:
>>
>>g_assert(cpu == current_cpu);
>>
>> In all public functions in cputlb that took a CPU. There are a bunch of
>> cases that don't defer actions across CPUs which need to be fixed up. I
>> suspect they don't hit in the arm case because the type of TLB flushing
>> pattern is different. In aarch64, in my backtrace it was triggered by
>> tlbi_aa64_vae1is_write:
>>
>>7Thread 0x7ffe777fe700 (LWP 32705) "worker" sem_timedwait () at 
>> ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
>>6Thread 0x7ffe77fff700 (LWP 32704) "worker" sem_timedwait () at 
>> ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
>>5Thread 0x7fff8d9d0700 (LWP 32703) "CPU 1/TCG" 0x5572cc18 in 
>> memcpy (__len=8, __src=, __dest=)
>>  at /usr/include/x86_64-linux-gnu/bits/string3.h:51
>> * 4Thread 0x7fff8e1d1700 (LWP 32702) "CPU 0/TCG" memset () at 
>> ../sysdeps/x86_64/memset.S:94
>>3Thread 0x7fff8f1cb700 (LWP 32701) "worker" sem_timedwait () at 
>> ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
>>2Thread 0x7fffe45c8700 (LWP 32700) "qemu-system-aar" syscall () at 
>> ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
>>1Thread 0x77f98c00 (LWP 32696) "qemu-system-aar" 
>> 0x70ba01ef in __GI_ppoll (fds=0x575cb5b0, nfds=8, 
>> timeout=,
>>  timeout@entry=0x7fffdf60, sigmask=sigmask@entry=0x0) at 
>> ../sysdeps/unix/sysv/linux/ppoll.c:56
>> #0  memset () at ../sysdeps/x86_64/memset.S:94
>> #1  0x55728bee in memset (__len=32768, __ch=0, 
>> __dest=0x56632568) at /usr/include/x86_64-linux-gnu/bits/string3.h:84
>> #2  v_tlb_flush_by_mmuidx (argp=0x7fff8e1d0430, cpu=0x56632380) at 
>> /home/alex/lsrc/qemu/qemu.git/cputlb.c:136
>> #3  tlb_flush_page_by_mmuidx (cpu=cpu@entry=0x56632380, 
>> addr=addr@entry=547976253440) at /home/alex/lsrc/qemu/qemu.git/cputlb.c:243
>> #4  0x557fcb4a in tlbi_aa64_vae1is_write (env=, 
>> ri=, value=)
>>  at /home/alex/lsrc/qemu/qemu.git/target-arm/helper.c:2757
>> #5  0x7fffa441dac5 in code_gen_buffer ()
>> #6  0x556eef4b in cpu_tb_exec (tb_ptr=, 
>> cpu=0x565eddd0) at /home/alex/lsrc/qemu/qemu.git/cpu-exec.c:157
>> #7  cpu_arm_exec (cpu=cpu@entry=0x565eddd0) at 
>> /home/alex/lsrc/qemu/qemu.git/cpu-exec.c:520
>> #8  0x557108e8 in tcg_cpu_exec (cpu=0x565eddd0) at 
>> /home/alex/lsrc/qemu/qemu.git/cpus.c:1486
>> #9  tcg_exec_all (cpu=0x565eddd0) at 
>> /home/alex/lsrc/qemu/qemu.git/cpus.c:1515
>> #10 qemu_tcg_cpu_thread_fn (arg=0x565eddd0) at 
>> /home/alex/lsrc/qemu/qemu.git/cpus.c:1187
>> #11 0x70e80182 in start_thread (arg=0x7fff8e1d1700) at 
>> pthread_create.c:312
>> #12 0x70bad47d in clone 

Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-13 Thread Alex Bennée

Pranith Kumar  writes:

> Hi Alex,
>
> On Tue, Jan 12, 2016 at 12:29 PM, Alex Bennée 
> wrote:
>
>>
> https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks
>>
>
> I built this branch and ran an arm64 guest. It seems to be failing
> similarly to what I reported earlier:
>
> #0  0x72211cc9 in __GI_raise (sig=sig@entry=6) at
> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x722150d8 in __GI_abort () at abort.c:89
> #2  0x5572014c in qemu_ram_addr_from_host_nofail
> (ptr=0xffc000187863) at /home/pranith/devops/code/qemu/cputlb.c:357
> #3  0x557209dd in get_page_addr_code (env1=0x56702058,
> addr=18446743798833248356) at /home/pranith/devops/code/qemu/cputlb.c:568
> #4  0x556db98c in tb_find_physical (cpu=0x566f9dd0,
> pc=18446743798833248356, cs_base=0, flags=18446744071830503424) at
> /home/pranith/devops/code/qemu/cpu-exec.c:224
> #5  0x556dbaf4 in tb_find_slow (cpu=0x566f9dd0,
> pc=18446743798833248356, cs_base=0, flags=18446744071830503424) at
> /home/pranith/devops/code/qemu/cpu-exec.c:268
> #6  0x556dbc77 in tb_find_fast (cpu=0x566f9dd0) at
> /home/pranith/devops/code/qemu/cpu-exec.c:311
> #7  0x556dc0f1 in cpu_arm_exec (cpu=0x566f9dd0) at
> /home/pranith/devops/code/qemu/cpu-exec.c:492
> #8  0x557050ee in tcg_cpu_exec (cpu=0x566f9dd0) at
> /home/pranith/devops/code/qemu/cpus.c:1486
> #9  0x557051af in tcg_exec_all (cpu=0x566f9dd0) at
> /home/pranith/devops/code/qemu/cpus.c:1515
> #10 0x55704800 in qemu_tcg_cpu_thread_fn (arg=0x566f9dd0) at
> /home/pranith/devops/code/qemu/cpus.c:1187
> #11 0x725a8182 in start_thread (arg=0x7fffd20c8700) at
> pthread_create.c:312
> #12 0x722d547d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>
> The arguments I used are as follows:
>
> (gdb) show args
>
> Argument list to give program being debugged when it is started is "-m
> 1024M -M virt -cpu cortex-a57 -global virtio-blk-device.scsi=off -device
> virtio-scsi-device,id=scsi -drive
> file=arm64disk.qcow2,id=coreimg,cache=unsafe,if=none -device
> scsi-hd,drive=coreimg -netdev user,id=unet -device
> virtio-net-device,netdev=unet -kernel vmlinuz -initrd initrd.img -append
> root=/dev/sda2 -display sdl -redir tcp:::22 -smp 2".

With my command line:

/home/alex/lsrc/qemu/qemu.git/aarch64-softmmu/qemu-system-aarch64
-machine type=virt -display none -smp 1 -m 4096 -cpu cortex-a57 -serial
telnet:127.0.0.1: -monitor stdio -netdev
user,id=unet,hostfwd=tcp::-:22 -device virtio-net-device,netdev=unet
-drive
file=/home/alex/lsrc/qemu/images/jessie-arm64.qcow2,id=myblock,index=0,if=none
-device virtio-blk-device,drive=myblock -kernel
/home/alex/lsrc/qemu/images/aarch64-current-linux-kernel-only.img
-append console=ttyAMA0 root=/dev/vda1 -s -name arm,debug-threads=on
-smp 4

I see the bad ram pointer failure in aarch64. It works in plain arm. Time
to dig out the debugging tools again ;-)

>
> Does something look obviously wrong to you in the arg list?
>
> Thanks!


--
Alex Bennée



Re: [Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-12 Thread Pranith Kumar
Hi Alex,

On Tue, Jan 12, 2016 at 12:29 PM, Alex Bennée 
wrote:

>
https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks
>

I built this branch and ran an arm64 guest. It seems to be failing
similarly to what I reported earlier:

#0  0x72211cc9 in __GI_raise (sig=sig@entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x722150d8 in __GI_abort () at abort.c:89
#2  0x5572014c in qemu_ram_addr_from_host_nofail
(ptr=0xffc000187863) at /home/pranith/devops/code/qemu/cputlb.c:357
#3  0x557209dd in get_page_addr_code (env1=0x56702058,
addr=18446743798833248356) at /home/pranith/devops/code/qemu/cputlb.c:568
#4  0x556db98c in tb_find_physical (cpu=0x566f9dd0,
pc=18446743798833248356, cs_base=0, flags=18446744071830503424) at
/home/pranith/devops/code/qemu/cpu-exec.c:224
#5  0x556dbaf4 in tb_find_slow (cpu=0x566f9dd0,
pc=18446743798833248356, cs_base=0, flags=18446744071830503424) at
/home/pranith/devops/code/qemu/cpu-exec.c:268
#6  0x556dbc77 in tb_find_fast (cpu=0x566f9dd0) at
/home/pranith/devops/code/qemu/cpu-exec.c:311
#7  0x556dc0f1 in cpu_arm_exec (cpu=0x566f9dd0) at
/home/pranith/devops/code/qemu/cpu-exec.c:492
#8  0x557050ee in tcg_cpu_exec (cpu=0x566f9dd0) at
/home/pranith/devops/code/qemu/cpus.c:1486
#9  0x557051af in tcg_exec_all (cpu=0x566f9dd0) at
/home/pranith/devops/code/qemu/cpus.c:1515
#10 0x55704800 in qemu_tcg_cpu_thread_fn (arg=0x566f9dd0) at
/home/pranith/devops/code/qemu/cpus.c:1187
#11 0x725a8182 in start_thread (arg=0x7fffd20c8700) at
pthread_create.c:312
#12 0x722d547d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The arguments I used are as follows:

(gdb) show args

Argument list to give program being debugged when it is started is "-m
1024M -M virt -cpu cortex-a57 -global virtio-blk-device.scsi=off -device
virtio-scsi-device,id=scsi -drive
file=arm64disk.qcow2,id=coreimg,cache=unsafe,if=none -device
scsi-hd,drive=coreimg -netdev user,id=unet -device
virtio-net-device,netdev=unet -kernel vmlinuz -initrd initrd.img -append
root=/dev/sda2 -display sdl -redir tcp:::22 -smp 2".

Does something look obviously wrong to you in the arg list?

Thanks!
-- 
Pranith


[Qemu-devel] Status of my hacks on the MTTCG WIP branch

2016-01-12 Thread Alex Bennée

Hi Fred,

OK so I've made progress with the WIP branch you gave me to look at.
Basically I've:

 * re-based to recent master
 * fixed up exit_request/tcg_current_cpu (in tcg: switch on multithread)
 * dropped the TLS exit_request and tcg_current_cpu patch
 * dropped the WIP: lock patch (was breaking virtio due pushing up locks)

And I can boot up and run a -smp 2-4 arm system without any major
issues. For reference see:

  https://github.com/stsquad/qemu/tree/mttcg/multi_tcg_v8_wip_ajb_fix_locks

What issues are you currently looking at with your work?

--
Alex Bennée