On 5/11/24 13:26, Jiaxun Yang wrote:
MTTCG was disabled in a092a9554771 ("configure: disable MTTCG
for MIPS guests") due to test case instability.

I was able to reproduce this issue with in latest QEMU and look
into reason behind that.

What actually happend is kernel's CP0 timer synchronisation
mechanism assumed a consistent latency in memory access between
cores, which TCG can't guarantee. Thus there is a huge drift in
count value between cores, and in early kernel versions CP0 timer
is always used as sched_clock.

sched_clock drift back on some cores triggered RCU watchdog in
some extreme cases.

This can be resolved by setting clocksource to MIPS, which allows
clocksource to drift together with sched_clock. However this will
leed to other problems after boot.

Another option would beupdating kernel to later version, which
will use GIC as sched_clock.

In non-MTTCG build, the execution is slow enough so kernel won't
observe back drifts.

Test results:

With clocksource=MIPS
```
  ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
     -display none -vga none -serial mon:stdio \
     -machine malta -kernel ./vmlinux-4.7.0-rc1.I6400 \
     -cpu I6400 -smp 8 -vga std \
     -append "printk.time=0 clocksource=MIPS console=tty0 console=ttyS0 
panic=-1" \
     --no-reboot

100, 0, PASS, 5.258126, 100, 100, -
Results summary:
0: 100 times (100.00%), avg time 6.508 (55.53 varience/7.45 deviation)
Ran command 100 times, 100 passes
```

With linux-next:
```
  ~/tmp/retry/retry.py -n 100 -c -- ./qemu-system-mips64el \
     -display none -vga none -serial mon:stdio \
     -machine malta -kernel ~/linux-next/vmlinux \
     -cpu I6400 -smp 8 -vga std \
     -append "printk.time=0 console=tty0 console=ttyS0 panic=-1" \
     --no-reboot

100, 0, PASS, 4.507921, 100, 100, -
Results summary:
0: 100 times (100.00%), avg time 4.233 (0.04 varience/0.21 deviation)
Ran command 100 times, 100 passes
```

Signed-off-by: Jiaxun Yang <jiaxun.y...@flygoat.com>
---
I'll leave the test case alone as it's already marked as QEMU_TEST_FLAKY_TESTS
---
  configs/targets/mips64el-softmmu.mak | 1 +
  1 file changed, 1 insertion(+)

diff --git a/configs/targets/mips64el-softmmu.mak 
b/configs/targets/mips64el-softmmu.mak
index 8d9ab3ddc4b1..199b1d909a7d 100644
--- a/configs/targets/mips64el-softmmu.mak
+++ b/configs/targets/mips64el-softmmu.mak
@@ -1,3 +1,4 @@
  TARGET_ARCH=mips64
  TARGET_BASE_ARCH=mips
+TARGET_SUPPORTS_MTTCG=y
  TARGET_NEED_FDT=y

---
base-commit: 248f6f62df073a3b4158fd0093863ab885feabb5
change-id: 20240511-mips_mttcg-47a6b19074b3

Best regards,

Hi Jiaxun,
Thanks for your analysis!

We should see to update concerned test in another series.
I'm not sure which way is preferred between updating kernel used or changing current command line.

Reviewed-by: Pierrick Bouvier <pierrick.bouv...@linaro.org>

Reply via email to