Re: [PATCH v2 0/5] linux-user: Rewrite target_shmat
On Wed, 2024-02-28 at 10:25 -1000, Richard Henderson wrote:
> There are multiple issues with the implementation of shmat().
>
> (1) With reserved_va, which is the default for 32-on-64-bit, we mmap the
>     entire guest address space. Unlike mmap, shmat refuses to replace an
>     existing mapping without setting SHM_REMAP. This is the original
>     subject of issue #115, though it quickly gets distracted by
>     something else.
>
> (2) With target page size > host page size, and a shm area
>     that is not a multiple of the target page size, we leave
>     an unmapped hole that the target expects to be mapped.
>     This is the subject of
>
>     https://lore.kernel.org/qemu-devel/2no4imvz2zrar5kchz2l3oddqbgpj77jgwcuf7aritkn2ok763@i2mvpcihztho/
>
>     wherein qemu itself expects a mapping to exist, and
>     dies in open_self_maps_2.
>
> So: reimplement the thing.
>
> Changes for v2:
>   - Include Ilya's test case, which caught extra errors: Yay!
>   - Include x86_64 /proc/self/maps fix, which the test triggers.
>   - Dropped r-b for the shmat rewrite due to number of changes.

I tested these against our problem with webkitgtk and am happy to report it does solve the segfault we were seeing, thanks!

Cheers,

Richard
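For anyone reading along who has not hit the shmat() quirk behind issue (1) before, it is easy to demonstrate from plain host code. The following is my own minimal sketch, not code from the series:

/* Sketch: unlike mmap(MAP_FIXED), shmat() at an already-mapped address
 * fails (EINVAL) unless SHM_REMAP is passed. Illustration only. */
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>

int main(void)
{
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    void *addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (shmat(id, addr, 0) == (void *) -1) {
        perror("shmat without SHM_REMAP");   /* expected failure */
    }
    if (shmat(id, addr, SHM_REMAP) != (void *) -1) {
        printf("SHM_REMAP replaced the mapping at %p\n", addr);
    }

    shmctl(id, IPC_RMID, NULL);
    return 0;
}

With reserved_va the whole guest address space is mmap'd up front, so every guest shmat() lands on an existing mapping, which is presumably why the rewrite has to go through the SHM_REMAP path internally.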
Re: [PULL 05/13] linux-user: Use walk_memory_regions for open_self_maps
On Mon, 2024-02-05 at 13:05 +1000, Richard Henderson wrote:
> On 1/26/24 23:52, Richard Purdie wrote:
> > On Fri, 2024-01-26 at 16:33 +0300, Michael Tokarev wrote:
> > > 26.01.2024 16:03, Richard Purdie wrote:
> > > > I've run into a problem with this change.
> > > >
> > > > We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we
> > > > started seeing errors cross compiling webkitgtk on x86_64 for x86_64
> > > > during the introspection code which runs under user mode qemu.
> > >
> > > Besides your observations, please be aware there's quite a few issues in 8.2.0.
> > > Please take a look at https://gitlab.com/mjt0k/qemu/-/commits/staging-8.2/
> > > (and https://gitlab.com/qemu-project/qemu/-/commits/staging-8.2/ which is
> > > updated less often) for fixes already queued up, if you haven't looked there
> > > already. 8.2.1 stable/bugfix release is scheduled for the beginning of the
> > > next week.
> >
> > Thanks.
> >
> > I should note that I did test the staging-8.2 branch and nothing there
> > helped. The issue was also present with master as of yesterday.
> >
> > https://bugzilla.yoctoproject.org/show_bug.cgi?id=15367 is the Yocto
> > Project's tracking of the issue which has the commits for master and
> > staging-8.2 that I tested.
>
> The yocto logs referenced here are not helpful for reproducing the problem.

It took me a couple of days I didn't have to work out which commit caused it, which versions showed the issue and how to work around it. It looks host kernel specific since it doesn't happen on some systems, so even with the binaries/command/environment vars, it may not be enough. I was hoping the indication of the cause might help point to the fix as there is quite a bit of work in trying to extract this into a reproducer. The failure is 20 mins into a webkitgtk compile on a remote CI system which no longer has the context on it.

> Please extract a binary to run, inputs, and command-line.

I wish I could say that to the bug reports I get! :)

I'll do my best but finding the time is going to be a challenge.

Cheers,

Richard
Re: [PULL 05/13] linux-user: Use walk_memory_regions for open_self_maps
Hi Michael,

On Fri, 2024-01-26 at 16:33 +0300, Michael Tokarev wrote:
> 26.01.2024 16:03, Richard Purdie wrote:
> > I've run into a problem with this change.
> >
> > We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we
> > started seeing errors cross compiling webkitgtk on x86_64 for x86_64
> > during the introspection code which runs under user mode qemu.
>
> Besides your observations, please be aware there's quite a few issues in 8.2.0.
> Please take a look at https://gitlab.com/mjt0k/qemu/-/commits/staging-8.2/
> (and https://gitlab.com/qemu-project/qemu/-/commits/staging-8.2/ which is
> updated less often) for fixes already queued up, if you haven't looked there
> already. 8.2.1 stable/bugfix release is scheduled for the beginning of the
> next week.

Thanks.

I should note that I did test the staging-8.2 branch and nothing there helped. The issue was also present with master as of yesterday.

https://bugzilla.yoctoproject.org/show_bug.cgi?id=15367 is the Yocto Project's tracking of the issue which has the commits for master and staging-8.2 that I tested.

Cheers,

Richard
Re: [PULL 05/13] linux-user: Use walk_memory_regions for open_self_maps
Hi,

I've run into a problem with this change.

We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we started seeing errors cross compiling webkitgtk on x86_64 for x86_64 during the introspection code which runs under user mode qemu.

The error we see is:

qemu-x86_64: QEMU internal SIGSEGV {code=MAPERR, addr=0x20}
Segmentation fault

e.g. here: https://autobuilder.yoctoproject.org/typhoon/#/builders/40/builds/8488/steps/11/logs/stdio

This usually seems to happen on our debian 11 based autobuilder machines. I took one of the broken builds and bisected it to this change (commit 7b7a3366e142d3baeb3fd1d3660a50e7956c19eb).

There was a change in output from commit 7dfd3ca8d95f9962cdd2ebdfcdd699279b98fa18, before that it was:

ERROR:../git/accel/tcg/cpu-exec.c:532:cpu_exec_longjmp_cleanup: assertion failed: (cpu == current_cpu)
Bail out! ERROR:../git/accel/tcg/cpu-exec.c:532:cpu_exec_longjmp_cleanup: assertion failed: (cpu == current_cpu)

After digging into the code and trying to work out what is going on, I realised that n is NULL when it fails so this makes the problem "go away":

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index e384e14248..2577fb770d 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8085,6 +8085,9 @@ static int open_self_maps_2(void *opaque, target_ulong guest_start,
     while (1) {
         IntervalTreeNode *n =
             interval_tree_iter_first(d->host_maps, host_start, host_start);
+        if (!n) {
+            return 0;
+        }
         MapInfo *mi = container_of(n, MapInfo, itree);
         uintptr_t this_hlast = MIN(host_last, n->last);
         target_ulong this_gend = h2g(this_hlast) + 1;

I'm hoping that might be enough to give you an idea of what is going on and what the correct fix may be? I haven't managed to make an isolated test to reproduce the issue yet.

Cheers,

Richard
Re: mips system emulation failure with virtio
On Wed, 2023-09-06 at 17:50 +0200, Philippe Mathieu-Daudé wrote:
> +rth/pm215/dhildenb
>
> On 5/9/23 16:50, Richard Purdie wrote:
> > On Tue, 2023-09-05 at 14:59 +0100, Alex Bennée wrote:
> > > Richard Purdie writes:
> > >
> > > > With qemu 8.1.0 we see boot hangs for x86-64 targets.
> > > >
> > > > These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu:
> > > > Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and
> > > > mips64 break, hanging at boot unable to find a rootfs.
> > >
> > > (Widen CC list)
> > >
> > > > We use virtio for network and disk and both of those change in the
> > > > bootlog from messages like:
> > > >
> > > > [ 1.726118] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
> > > > [ 1.728864] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
> > > > [ 1.729948] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
> > > > ...
> > > > [ 2.162148] virtio_blk virtio2: 1/0/0 default/read/poll queues
> > > > [ 2.168311] virtio_blk virtio2: [vda] 1184242 512-byte logical
> > > >
> > > > to:
> > > >
> > > > [ 1.777051] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
> > > > [ 1.779822] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
> > > > [ 1.780926] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
> > > > ...
> > > > [ 1.894852] virtio_rng: probe of virtio1 failed with error -28
> > > > ...
> > > > [ 2.063553] virtio_blk virtio2: 1/0/0 default/read/poll queues
> > > > [ 2.064260] virtio_blk: probe of virtio2 failed with error -28
> > > > [ 2.069080] virtio_net: probe of virtio0 failed with error -28
> > > >
> > > > i.e. the virtio drivers no longer work.
> > >
> > > Interesting, as you say this seems to be VirtIO specific as the baseline
> > > tests (using IDE) work fine:
> > >
> > >   ➜ ./tests/venv/bin/avocado run ./tests/avocado/tuxrun_baselines.py:test_mips64
> > >   JOB ID     : 71f3e3b7080164b78ef1c8c1bb6bc880932d8c9b
> > >   JOB LOG    : /home/alex/avocado/job-results/job-2023-09-05T15.01-71f3e3b/job.log
> > >   (1/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64: PASS (12.19 s)
> > >   (2/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64el: PASS (11.78 s)
> > >   RESULTS    : PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
> > >   JOB TIME   : 24.79 s
> > >
> > > > I tested with current qemu master
> > > > (17780edd81d27fcfdb7a802efc870a99788bd2fc) and mips is still broken
> > > > there.
> > > >
> > > > Is this issue known about?
> > >
> > > Could you raise a bug at:
> > >
> > >   https://gitlab.com/qemu-project/qemu/-/issues
> >
> > Done, https://gitlab.com/qemu-project/qemu/-/issues/1866
> >
> > > I'm curious why MIPS VirtIO is affected but nothing else is...
> >
> > Me too, it seems there is a real code issue somewhere in this...
>
> This seems to fix the issue for me, but I'm not really sure what
> I'm doing after various hours debugging, so sharing here before
> I take some rest:
>
> -- >8 --
> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
> index 18277ddd67..ec31ebcb56 100644
> --- a/softmmu/physmem.c
> +++ b/softmmu/physmem.c
> @@ -2517,7 +2517,7 @@ static void tcg_commit(MemoryListener *listener)
>       * That said, the listener is also called during realize, before
>       * all of the tcg machinery for run-on is initialized: thus halt_cond.
>       */
> -    if (cpu->halt_cond) {
> +    if (cpu->halt_cond && !qemu_cpu_is_self(cpu)) {
>          async_run_on_cpu(cpu, tcg_commit_cpu, RUN_ON_CPU_HOST_PTR(cpuas));
>      } else {
>          tcg_commit_cpu(cpu, RUN_ON_CPU_HOST_PTR(cpuas));

I tested with the above and confirmed it does fix 8.1.0 for the mips test I was using.

Thanks!

Richard
Re: mips system emulation failure with virtio
On Tue, 2023-09-05 at 18:46 +0200, Philippe Mathieu-Daudé wrote: > On 5/9/23 17:53, Richard Purdie wrote: > > On Tue, 2023-09-05 at 17:12 +0200, Philippe Mathieu-Daudé wrote: > > > Hi Richard, > > > > > > On 5/9/23 16:50, Richard Purdie wrote: > > > > On Tue, 2023-09-05 at 14:59 +0100, Alex Bennée wrote: > > > > > Richard Purdie writes: > > > > > > > > > > > With qemu 8.1.0 we see boot hangs fox x86-64 targets. > > > > > > > > > > > > These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b > > > > > > (softmmu: > > > > > > Use async_run_on_cpu in tcg_commit) but if I add that commit, mips > > > > > > and > > > > > > mips64 break, hanging at boot unable to find a rootfs. > > > > > > Are you testing mipsel / mips64el? > > > > No, it was mips/mips64, i.e. big endian. > > Sorry my question was not clear. I meant: Do you also > test mipsel / mips64el guests, and if so, do they work? > (IOW, is this bug only big-endian guest specific?) Sorry, I misunderstood. We don't test mipsel/mips64el so I don't know if that is working or not unfortunately. Cheers, Richard
Re: mips system emulation failure with virtio
On Tue, 2023-09-05 at 17:12 +0200, Philippe Mathieu-Daudé wrote: > Hi Richard, > > On 5/9/23 16:50, Richard Purdie wrote: > > On Tue, 2023-09-05 at 14:59 +0100, Alex Bennée wrote: > > > Richard Purdie writes: > > > > > > > With qemu 8.1.0 we see boot hangs fox x86-64 targets. > > > > > > > > These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu: > > > > Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and > > > > mips64 break, hanging at boot unable to find a rootfs. > > Are you testing mipsel / mips64el? No, it was mips/mips64, i.e. big endian. Cheers, Richard
Re: mips system emulation failure with virtio
On Tue, 2023-09-05 at 14:59 +0100, Alex Bennée wrote: > Richard Purdie writes: > > > With qemu 8.1.0 we see boot hangs fox x86-64 targets. > > > > These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu: > > Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and > > mips64 break, hanging at boot unable to find a rootfs. > > (Widen CC list) > > > > > We use virtio for network and disk and both of those change in the > > bootlog from messages like: > > > > [1.726118] virtio-pci :00:13.0: enabling device ( -> 0003) > > [1.728864] virtio-pci :00:14.0: enabling device ( -> 0003) > > [1.729948] virtio-pci :00:15.0: enabling device ( -> 0003) > > ... > > [2.162148] virtio_blk virtio2: 1/0/0 default/read/poll queues > > [2.168311] virtio_blk virtio2: [vda] 1184242 512-byte logical > > > > to: > > > > [1.777051] virtio-pci :00:13.0: enabling device ( -> 0003) > > [1.779822] virtio-pci :00:14.0: enabling device ( -> 0003) > > [1.780926] virtio-pci :00:15.0: enabling device ( -> 0003) > > ... > > [1.894852] virtio_rng: probe of virtio1 failed with error -28 > > ... > > [2.063553] virtio_blk virtio2: 1/0/0 default/read/poll queues > > [2.064260] virtio_blk: probe of virtio2 failed with error -28 > > [2.069080] virtio_net: probe of virtio0 failed with error -28 > > > > > > i.e. the virtio drivers no longer work. > > Interesting, as you say this seems to be VirtIO specific as the baseline > tests (using IDE) work fine: > > ➜ ./tests/venv/bin/avocado run > ./tests/avocado/tuxrun_baselines.py:test_mips64 > JOB ID : 71f3e3b7080164b78ef1c8c1bb6bc880932d8c9b > JOB LOG: > /home/alex/avocado/job-results/job-2023-09-05T15.01-71f3e3b/job.log >(1/2) ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64: > PASS (12.19 s) >(2/2) > ./tests/avocado/tuxrun_baselines.py:TuxRunBaselineTest.test_mips64el: PASS > (11.78 s) > RESULTS: PASS 2 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | > CANCEL 0 > JOB TIME : 24.79 s > > > I tested with current qemu master > > (17780edd81d27fcfdb7a802efc870a99788bd2fc) and mips is still broken > > there. > > > > Is this issue known about? > > Could you raise a bug at: > > https://gitlab.com/qemu-project/qemu/-/issues Done, https://gitlab.com/qemu-project/qemu/-/issues/1866 > I'm curious why MIPS VirtIO is affected but nothing else is... Me too, it seems there is a real code issue somewhere in this... Cheers, Richard
mips system emulation failure with virtio
With qemu 8.1.0 we see boot hangs for x86-64 targets.

These are fixed by 0d58c660689f6da1e3feff8a997014003d928b3b (softmmu: Use async_run_on_cpu in tcg_commit) but if I add that commit, mips and mips64 break, hanging at boot unable to find a rootfs.

We use virtio for network and disk and both of those change in the bootlog from messages like:

[ 1.726118] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
[ 1.728864] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
[ 1.729948] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
...
[ 2.162148] virtio_blk virtio2: 1/0/0 default/read/poll queues
[ 2.168311] virtio_blk virtio2: [vda] 1184242 512-byte logical

to:

[ 1.777051] virtio-pci 0000:00:13.0: enabling device (0000 -> 0003)
[ 1.779822] virtio-pci 0000:00:14.0: enabling device (0000 -> 0003)
[ 1.780926] virtio-pci 0000:00:15.0: enabling device (0000 -> 0003)
...
[ 1.894852] virtio_rng: probe of virtio1 failed with error -28
...
[ 2.063553] virtio_blk virtio2: 1/0/0 default/read/poll queues
[ 2.064260] virtio_blk: probe of virtio2 failed with error -28
[ 2.069080] virtio_net: probe of virtio0 failed with error -28

i.e. the virtio drivers no longer work.

I tested with current qemu master (17780edd81d27fcfdb7a802efc870a99788bd2fc) and mips is still broken there.

Is this issue known about?

Cheers,

Richard
qemu-system-ppc failures with glibc 2.38
Hi,

Yocto Project's CI is noticing a lot of issues with qemu-system-ppc emulation on loaded systems after we switched glibc to 2.38. This is manifesting as hangs in the emulated system and, for example, systemd units then timeout and restart. If we have long running commands running over ssh (e.g. configure or kernel module builds), these are then disconnected.

It does appear the system does 'freeze' periodically. It also seems related to the load on the system running the emulation. On an otherwise idle system it is fine; when there is other load, we hit the freezes, watchdog resets and resulting failures.

We do sometimes see kernel rcu stalls too; the frequency of them may be higher with the new glibc, it is hard to tell.

Can anyone think of something that changed in glibc 2.38 which might affect qemu-system-ppc in this way? Everything else we test seems ok.

I've tested with qemu 8.0.0, 8.0.3, 8.0.4 and 8.1.0 with similar results. The latter has other problems unfortunately which we're still trying to debug (x86 hangs; we then tried the "softmmu: Assert data in bounds in iotlb_to_section" patches but that breaks mips).

Cheers,

Richard
Re: [PULL 00/10] ppc queue
On Mon, 2023-05-29 at 16:30 +1000, Nicholas Piggin wrote:
> On Mon May 29, 2023 at 4:01 PM AEST, Michael Tokarev wrote:
> > 29.05.2023 05:18, Nicholas Piggin wrote:
> > ..
> > > > 01/10 target/ppc: Fix fallback to MFFS for MFFS* instructions on pre 3.0 ISAs
> > > > 02/10 target/ppc: Fix width of some 32-bit SPRs
> > > > 03/10 target/ppc: Alignment faults do not set DSISR in ISA v3.0 onward
> > > > 05/10 hw/ppc/prep: Fix wiring of PIC -> CPU interrupt
> > > >
> > > > Or are these not important for -stable? Or maybe there are other changes
> > > > which should be picked too?
> > >
> > > They certainly fix some parts of target emulation, but what is the
> > > guidance for backporting those type of fixes? Most of the patches I sent
> > > including 2,3 were just found from inspection or new test code and not
> > > real software failing.
> > >
> > > Should just simple ones go in? 32-bit SPRs do not fix entirely the
> > > behaviour of all SPRs, just one aspect. In another fix I had (that
> > > didn't make it in this merge), was a bit more complicated and the
> > > first iteration caused a deadlock that didn't show up in basic test
> > > like booting Linux.
> > >
> > > My guess is that fixes that correct an issue with real software running
> > > on the target should be ported to stable. Perhaps "obviously correct"
> > > small fixes as well. But not sure about larger changes.
> >
> > This is exactly why I asked, - because I don't clearly understand how
> > important these to have in -stable. And also to remind that -stable
> > exist, just in case.. ;)
>
> Ah okay, makes sense. I was just clarifying myself since I wasn't
> too sure.
>
> > So be it, no actual issue so not applying to -stable.
>
> I will think about it and try to keep -stable in mind. Of my patches
> there are one or two coming up that could probably go in there, if
> not these ones.

1/10 from me (fallback to MFFS) did fix software failures for Yocto Project so might be a good candidate for stable. We're carrying that patch against the last release for now.

Cheers,

Richard
[PATCH v3] target/ppc: Fix fallback to MFFS for MFFS* instructions on pre 3.0 ISAs
The following commits changed the code such that the fallback to MFFS for MFFSCRN, MFFSCRNI, MFFSCE and MFFSL on pre 3.0 ISAs was removed and they became an illegal instruction:

bf8adfd88b547680aa857c46098f3a1e94373160 - target/ppc: Move mffscrn[i] to decodetree
394c2e2fda70da722f20fb60412d6c0ca4bfaa03 - target/ppc: Move mffsce to decodetree
3e5bce70efe6bd1f684efbb21fd2a316cbf0657e - target/ppc: Move mffsl to decodetree

The hardware will handle them as an MFFS instruction as the code did previously. This meant applications were segfaulting under qemu when encountering these instructions, which are used in glibc libm functions for example.

The fallback for MFFSCDRN and MFFSCDRNI added in a later patch was also missing.

This patch restores the fallback to MFFS for these instructions on pre 3.0 ISAs as the hardware decoder would, fixing the segfaulting libm code. It doesn't have the fallback for 3.0 onwards, to match hardware behaviour.

Signed-off-by: Richard Purdie
---
 target/ppc/insn32.decode           | 20 +++++++++++++-------
 target/ppc/translate/fp-impl.c.inc | 22 ++++++++++++++++++----
 2 files changed, 29 insertions(+), 13 deletions(-)

v3 - drop fallback to MFFS for 3.0 ISA to match hardware
v2 - switch to use decodetree pattern groups per feedback

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index f8f589e9fd..4fcf3af8d0 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -390,13 +390,19 @@ SETNBCR         011111 ..... ..... ----- 0111100000 -   @X_bi
 
 ### Move To/From FPSCR
 
-MFFS            111111 ..... 00000 ----- 1001000111 .   @X_t_rc
-MFFSCE          111111 ..... 00001 ----- 1001000111 -   @X_t
-MFFSCRN         111111 ..... 10110 ..... 1001000111 -   @X_tb
-MFFSCDRN        111111 ..... 10100 ..... 1001000111 -   @X_tb
-MFFSCRNI        111111 ..... 10111 ---.. 1001000111 -   @X_imm2
-MFFSCDRNI       111111 ..... 10101 --... 1001000111 -   @X_imm3
-MFFSL           111111 ..... 11000 ----- 1001000111 -   @X_t
+{
+  # Before Power ISA v3.0, MFFS bits 11~15 were reserved and should be ignored
+  MFFS_ISA207   111111 ..... ----- ----- 1001000111 .   @X_t_rc
+  [
+    MFFS        111111 ..... 00000 ----- 1001000111 .   @X_t_rc
+    MFFSCE      111111 ..... 00001 ----- 1001000111 -   @X_t
+    MFFSCRN     111111 ..... 10110 ..... 1001000111 -   @X_tb
+    MFFSCDRN    111111 ..... 10100 ..... 1001000111 -   @X_tb
+    MFFSCRNI    111111 ..... 10111 ---.. 1001000111 -   @X_imm2
+    MFFSCDRNI   111111 ..... 10101 --... 1001000111 -   @X_imm3
+    MFFSL       111111 ..... 11000 ----- 1001000111 -   @X_t
+  ]
+}
 
 ### Decimal Floating-Point Arithmetic Instructions

diff --git a/target/ppc/translate/fp-impl.c.inc b/target/ppc/translate/fp-impl.c.inc
index 57d8437851..874774eade 100644
--- a/target/ppc/translate/fp-impl.c.inc
+++ b/target/ppc/translate/fp-impl.c.inc
@@ -568,6 +568,22 @@ static void store_fpscr_masked(TCGv_i64 fpscr, uint64_t clear_mask,
     gen_helper_store_fpscr(cpu_env, fpscr_masked, st_mask);
 }
 
+static bool trans_MFFS_ISA207(DisasContext *ctx, arg_X_t_rc *a)
+{
+    if (!(ctx->insns_flags2 & PPC2_ISA300)) {
+        /*
+         * Before Power ISA v3.0, MFFS bits 11~15 were reserved, any
+         * instruction with OPCD=63 and XO=583 should be decoded as MFFS.
+         */
+        return trans_MFFS(ctx, a);
+    }
+    /*
+     * For Power ISA v3.0+, return false and let the pattern group
+     * select the correct instruction.
+     */
+    return false;
+}
+
 static bool trans_MFFS(DisasContext *ctx, arg_X_t_rc *a)
 {
     REQUIRE_FPU(ctx);
@@ -584,7 +600,6 @@ static bool trans_MFFSCE(DisasContext *ctx, arg_X_t *a)
 {
     TCGv_i64 fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
     REQUIRE_FPU(ctx);
 
     gen_reset_fpstatus();
@@ -597,7 +612,6 @@ static bool trans_MFFSCRN(DisasContext *ctx, arg_X_tb *a)
 {
     TCGv_i64 t1, fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
     REQUIRE_FPU(ctx);
 
     t1 = tcg_temp_new_i64();
@@ -614,7 +628,6 @@ static bool trans_MFFSCDRN(DisasContext *ctx, arg_X_tb *a)
 {
     TCGv_i64 t1, fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
     REQUIRE_FPU(ctx);
 
     t1 = tcg_temp_new_i64();
@@ -631,7 +644,6 @@ static bool trans_MFFSCRNI(DisasContext *ctx, arg_X_imm2 *a)
 {
     TCGv_i64 t1, fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
     REQUIRE_FPU(ctx);
 
     t1 = tcg_temp_new_i64();
@@ -647,7 +659,6 @@ static bool trans_MFFSCDRNI(DisasContext *ctx, arg_X_imm3 *a)
 {
     TCGv_i64 t1, fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
     REQUIRE_FPU(ctx);
 
     t1 = tcg_temp_new_i64();
@@ -661,7 +672,6 @@ static bool trans_MFFSCDRNI(DisasContext *ctx, arg_X_imm3 *a)
 
 static bool trans_MFFSL(DisasContext *ctx, arg_X_t *a)
 {
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
     REQUIRE_FPU(ctx);
 
     gen_reset_fpstatus();
-- 
2.39.2
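A brief note on the decodetree syntax both v2 and v3 rely on, since it is easy to misread: in insn32.decode, square brackets group patterns that must not overlap with each other, while curly braces form an overlap group whose members are tried in order, with a translator returning false allowing decode to fall through to the next member. That is how the broad MFFS_ISA207 pattern in v3 can sit alongside the more specific v3.0 encodings: its trans function claims the instruction on pre-3.0 CPUs and returns false otherwise.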
[PATCH v2] target/ppc: Fix fallback to MFFS for MFFS* instructions on pre 3.0 ISAs
The following commits changed the code such that the fallback to MFFS for MFFSCRN, MFFSCRNI, MFFSCE and MFFSL on pre 3.0 ISAs was removed and they became an illegal instruction:

bf8adfd88b547680aa857c46098f3a1e94373160 - target/ppc: Move mffscrn[i] to decodetree
394c2e2fda70da722f20fb60412d6c0ca4bfaa03 - target/ppc: Move mffsce to decodetree
3e5bce70efe6bd1f684efbb21fd2a316cbf0657e - target/ppc: Move mffsl to decodetree

The hardware will handle them as an MFFS instruction as the code did previously. This meant applications were segfaulting under qemu when encountering these instructions, which are used in glibc libm functions for example.

The fallback for MFFSCDRN and MFFSCDRNI added in a later patch was also missing.

This patch restores the fallback to MFFS for these instructions on pre 3.0 ISAs as the hardware decoder would, fixing the segfaulting libm code. It also ensures the MFFS instruction is used for currently reserved bits, to handle other potential ISA additions more correctly.

Signed-off-by: Richard Purdie
---
 target/ppc/insn32.decode           | 19 ++++++++++++-------
 target/ppc/translate/fp-impl.c.inc | 30 ++++++++++++++++++++++++------
 2 files changed, 36 insertions(+), 13 deletions(-)

v2 - switch to use decodetree pattern groups per feedback

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index f8f589e9fd..3c4e2c2fc2 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -390,13 +390,18 @@ SETNBCR         011111 ..... ..... ----- 0111100000 -   @X_bi
 
 ### Move To/From FPSCR
 
-MFFS            111111 ..... 00000 ----- 1001000111 .   @X_t_rc
-MFFSCE          111111 ..... 00001 ----- 1001000111 -   @X_t
-MFFSCRN         111111 ..... 10110 ..... 1001000111 -   @X_tb
-MFFSCDRN        111111 ..... 10100 ..... 1001000111 -   @X_tb
-MFFSCRNI        111111 ..... 10111 ---.. 1001000111 -   @X_imm2
-MFFSCDRNI       111111 ..... 10101 --... 1001000111 -   @X_imm3
-MFFSL           111111 ..... 11000 ----- 1001000111 -   @X_t
+{
+  # Before Power ISA v3.0, MFFS bits 11~15 were reserved and should be ignored
+  [
+    MFFSCE      111111 ..... 00001 ----- 1001000111 -   @X_t
+    MFFSCRN     111111 ..... 10110 ..... 1001000111 -   @X_tb
+    MFFSCDRN    111111 ..... 10100 ..... 1001000111 -   @X_tb
+    MFFSCRNI    111111 ..... 10111 ---.. 1001000111 -   @X_imm2
+    MFFSCDRNI   111111 ..... 10101 --... 1001000111 -   @X_imm3
+    MFFSL       111111 ..... 11000 ----- 1001000111 -   @X_t
+  ]
+  MFFS          111111 ..... ----- ----- 1001000111 .   @X_t_rc
+}
 
 ### Decimal Floating-Point Arithmetic Instructions

diff --git a/target/ppc/translate/fp-impl.c.inc b/target/ppc/translate/fp-impl.c.inc
index 57d8437851..10dfd91aa4 100644
--- a/target/ppc/translate/fp-impl.c.inc
+++ b/target/ppc/translate/fp-impl.c.inc
@@ -584,7 +584,10 @@ static bool trans_MFFSCE(DisasContext *ctx, arg_X_t *a)
 {
     TCGv_i64 fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+    if (!(ctx->insns_flags2 & PPC2_ISA300)) {
+        return false;
+    }
+
     REQUIRE_FPU(ctx);
 
     gen_reset_fpstatus();
@@ -597,7 +600,10 @@ static bool trans_MFFSCRN(DisasContext *ctx, arg_X_tb *a)
 {
     TCGv_i64 t1, fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+    if (!(ctx->insns_flags2 & PPC2_ISA300)) {
+        return false;
+    }
+
     REQUIRE_FPU(ctx);
 
     t1 = tcg_temp_new_i64();
@@ -614,7 +620,10 @@ static bool trans_MFFSCDRN(DisasContext *ctx, arg_X_tb *a)
 {
     TCGv_i64 t1, fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+    if (!(ctx->insns_flags2 & PPC2_ISA300)) {
+        return false;
+    }
+
     REQUIRE_FPU(ctx);
 
     t1 = tcg_temp_new_i64();
@@ -631,7 +640,10 @@ static bool trans_MFFSCRNI(DisasContext *ctx, arg_X_imm2 *a)
 {
     TCGv_i64 t1, fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+    if (!(ctx->insns_flags2 & PPC2_ISA300)) {
+        return false;
+    }
+
     REQUIRE_FPU(ctx);
 
     t1 = tcg_temp_new_i64();
@@ -647,7 +659,10 @@ static bool trans_MFFSCDRNI(DisasContext *ctx, arg_X_imm3 *a)
 {
     TCGv_i64 t1, fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+    if (!(ctx->insns_flags2 & PPC2_ISA300)) {
+        return false;
+    }
+
     REQUIRE_FPU(ctx);
 
     t1 = tcg_temp_new_i64();
@@ -661,7 +676,10 @@ static bool trans_MFFSCDRNI(DisasContext *ctx, arg_X_imm3 *a)
 
 static bool trans_MFFSL(DisasContext *ctx, arg_X_t *a)
 {
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+    if (!(ctx->insns_flags2 & PPC2_ISA300)) {
+        return false;
+    }
+
     REQUIRE_FPU(ctx);
 
     gen_reset_fpstatus();
-- 
2.39.2
[PATCH] target/ppc: Fix fallback to MFFS for MFFSCRN, MFFSCRNI, MFFSCE and MFFSL
The following commits changed the code such that these instructions became invalid on pre 3.0 ISAs:

bf8adfd88b547680aa857c46098f3a1e94373160 - target/ppc: Move mffscrn[i] to decodetree
394c2e2fda70da722f20fb60412d6c0ca4bfaa03 - target/ppc: Move mffsce to decodetree
3e5bce70efe6bd1f684efbb21fd2a316cbf0657e - target/ppc: Move mffsl to decodetree

The hardware will handle them as an MFFS instruction as the code did previously. Restore that behaviour. This means applications that were segfaulting under qemu when encountering these instructions now operate correctly. The instruction is used in glibc libm functions for example.

Signed-off-by: Richard Purdie
---
 target/ppc/translate/fp-impl.c.inc | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/target/ppc/translate/fp-impl.c.inc b/target/ppc/translate/fp-impl.c.inc
index 57d8437851..cb86381c3f 100644
--- a/target/ppc/translate/fp-impl.c.inc
+++ b/target/ppc/translate/fp-impl.c.inc
@@ -584,7 +584,10 @@ static bool trans_MFFSCE(DisasContext *ctx, arg_X_t *a)
 {
     TCGv_i64 fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+    if (unlikely(!(ctx->insns_flags2 & PPC2_ISA300))) {
+        return trans_MFFS(ctx, a);
+    }
+
     REQUIRE_FPU(ctx);
 
     gen_reset_fpstatus();
@@ -597,7 +600,10 @@ static bool trans_MFFSCRN(DisasContext *ctx, arg_X_tb *a)
 {
     TCGv_i64 t1, fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+    if (unlikely(!(ctx->insns_flags2 & PPC2_ISA300))) {
+        return trans_MFFS(ctx, a);
+    }
+
     REQUIRE_FPU(ctx);
 
     t1 = tcg_temp_new_i64();
@@ -631,7 +637,10 @@ static bool trans_MFFSCRNI(DisasContext *ctx, arg_X_imm2 *a)
 {
     TCGv_i64 t1, fpscr;
 
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+    if (unlikely(!(ctx->insns_flags2 & PPC2_ISA300))) {
+        return trans_MFFS(ctx, a);
+    }
+
     REQUIRE_FPU(ctx);
 
     t1 = tcg_temp_new_i64();
@@ -661,7 +670,10 @@ static bool trans_MFFSCDRNI(DisasContext *ctx, arg_X_imm3 *a)
 
 static bool trans_MFFSL(DisasContext *ctx, arg_X_t *a)
 {
-    REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+    if (unlikely(!(ctx->insns_flags2 & PPC2_ISA300))) {
+        return trans_MFFS(ctx, a);
+    }
+
     REQUIRE_FPU(ctx);
 
     gen_reset_fpstatus();
Re: VMs hanging with rcu stall problems
On Thu, 2021-06-03 at 11:02 +0100, Richard Purdie wrote:
> We're seeing intermittent rcu stalls in qemu system mode emulation which is
> causing CI issues for us (Yocto Project). We're struggling to understand
> the cause and have tried several things to mitigate without much success.
> The stalls are odd in that the log messages look incomplete. An example
> from last night:

To answer my own question, the issue is a kernel bug in RCU stall handling.

The fix:

https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/commit/?h=rcu/next&id=c583bcb8f5ed

and the patch which introduced the problem:

https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/commit/?h=rcu/next&id=c583bcb8f5ed

Cheers,

Richard
VMs hanging with rcu stall problems
Hi,

We're seeing intermittent rcu stalls in qemu system mode emulation which is causing CI issues for us (Yocto Project). We're struggling to understand the cause and have tried several things to mitigate without much success. The stalls are odd in that the log messages look incomplete. An example from last night:

[ 133.333475] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 133.337109] (detected by 2, t=25864 jiffies, g=1529, q=10)
[ 133.339025] rcu: All QSes seen, last rcu_preempt kthread activity 4865 (4294800423-4294795558), jiffies_till_next_fqs=3, root ->qsmask 0x0
[ 133.343445] rcu: rcu_preempt kthread starved for 4870 jiffies! g1529 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=2
[ 133.346976] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 133.350262] rcu: RCU grace-period kthread stack dump:
[ 133.352704] task:rcu_preempt state:R stack:0 pid: 13 ppid: 2 flags:0x4000
[ 133.355581] Call Trace:
[ 133.356488] __schedule+0x1dc/0x570
[ 133.357693] ? __mod_timer+0x220/0x3c0
[ 133.359018] schedule+0x68/0xe0
[ 133.36] schedule_timeout+0x8f/0x160
[ 133.361267] ? force_qs_rnp+0x8d/0x1c0
[ 133.362515] ? __next_timer_interrupt+0x100/0x100
[ 133.364264] rcu_gp_kthread+0x55f/0xba0
[ 133.365701] ? note_gp_changes+0x70/0x70
[ 133.367356] kthread+0x145/0x170
[ 133.368597] ? kthread_associate_blkcg+0xc0/0xc0
[ 133.370686] ret_from_fork+0x22/0x30
[ 133.371976] BUG: scheduling while atomic: swapper/2/0/0x0002
[ 133.374066] Modules linked in: bnep
[ 133.375324] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.10.41-yocto-standard #1
[ 133.377813] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 133.381882] Call Trace:
[ 133.382744] dump_stack+0x5e/0x74
[ 133.384027] __schedule_bug.cold+0x4b/0x59
[ 133.385362] __schedule+0x3f6/0x570
[ 133.386655] schedule_idle+0x2c/0x40
[ 133.388033] do_idle+0x15a/0x250
[ 133.389257] ? complete+0x3f/0x50
[ 133.390406] cpu_startup_entry+0x20/0x30
[ 133.391827] start_secondary+0xf1/0x100
[ 133.393143] secondary_startup_64_no_verify+0xc2/0xcb
[ 191.482302] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 255.155323] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:

(full kernel+boot log is below)

What strikes me as odd is the scheduling whilst atomic (which we don't normally see) and the lack of task info (which is never there). We have some evidence that the qemus are using say 350% of cpu when this happens so it doesn't actually appear to be starved of cpu in reality. Unfortunately catching that data is hard.

We only see this issue intermittently; it does seem to coincide with load but equally, the kernel messages seem odd and ps output suggests the qemu processes are getting ample cpu time.

This is for x86_64 and using kvm but this isn't kvm on kvm. We have many different base distros (Centos, Debian, OpenSUSE, Ubuntu), it happens on them all. We're using qemu 6.0.0 although we saw the same issue with 5.2.0. Our kernel is mainly a 5.10 although we do have a 5.4 kernel and have also tested a 5.12 and they all seem to do this.

We can have periods of significant load on the autobuilders from build processes however we have taken steps to mitigate this. The qemu processes are priority -5, the builds are at 5 and we also set ioprio 2,0 for qemu. We don't use cgroups since not all of our builders have support for v2 and it complicates permission issues as our builds are not virtualised and don't run as root. I'm not convinced it would really help.

In an effort to try and mitigate issues we moved the images we're booting onto a tmpfs. We have also ended up "pre-caching" all the mmaped libraries used by the binary just in case something was stalling on IO whilst qemu was running. To do that we start with qmp, wait for the port, read through the files in /proc/<pid>/map_files. We also did try disabling kvmclock but that doesn't appear to help.

We also did change the systems from a single cpu to having 4 cpus. This changed the trace output to be multi cpu but the basic issue seems to remain. We wanted to use multiple cores anyway so will keep that but it doesn't help this issue.

The systems running these qemu images are 48 core dual socket Xeons E5-2697 with 128GB memory. We have seen some stalls in arm/mips/ppc qemu system testing without KVM too although not since the above mitigations were added. I've suspected there are multiple causes.

The corresponding build in CI was this: https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/3538

I've detailed more about the background on how we're using this below.

Does anyone have any idea what might be causing this and how we might fix or mitigate it? We're at the point where a component of our CI fails on pretty much every run and I'm struggling with ideas on how to proceed at this point other than reducing t
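For reference, the pre-caching trick described above boils down to something like the sketch below. This is my own reconstruction for illustration (our real tooling differs, and access to /proc/<pid>/map_files may need extra privileges depending on kernel version):

/* Walk a process's /proc/<pid>/map_files and read every backing file,
 * so the mmap'd libraries are warm in the page cache. Sketch only. */
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

static void precache_mapped_files(pid_t pid)
{
    char path[64], buf[4096];
    struct dirent *de;
    DIR *dir;

    snprintf(path, sizeof(path), "/proc/%d/map_files", (int) pid);
    dir = opendir(path);
    if (!dir) {
        return;
    }
    while ((de = readdir(dir)) != NULL) {
        if (de->d_name[0] == '.') {
            continue;
        }
        /* each entry is a symlink to the mapped file; open and read it */
        int fd = openat(dirfd(dir), de->d_name, O_RDONLY);
        if (fd < 0) {
            continue;
        }
        while (read(fd, buf, sizeof(buf)) > 0) {
            /* the data only needs to pass through the page cache */
        }
        close(fd);
    }
    closedir(dir);
}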
[PATCH v2] linux-user/mmap: Return EFAULT/EINVAL for invalid addresses
When using qemu-i386 to build qemux86 webkitgtk on musl, it sits in an infinite loop of mremap calls of ever decreasing/increasing addresses. I suspect something in the musl memory allocation code loops indefinitely if it only sees ENOMEM and only exits when it hits other errors such as EFAULT or EINVAL. According to the docs, trying to mremap outside the address space can/should return EFAULT and changing this allows the build to succeed. A better return value for the other cases of invalid addresses is EINVAL rather than ENOMEM so adjust the other part of the test to this. Signed-off-by: Richard Purdie
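As a cross-check of the errno values the patch picks, a small host-side probe (my own sketch, not part of the patch) shows what Linux itself returns for the two cases:

/* Probe mremap() failure modes: an unmapped old_address gives EFAULT,
 * a zero/invalid new_size gives EINVAL. Illustration only. */
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    /* old_address range not mapped: expect EFAULT */
    if (mremap((char *) p + 64 * 4096, 4096, 8192, 0) == MAP_FAILED) {
        printf("unmapped old_addr: %s\n", strerror(errno));
    }

    /* new_size of zero is invalid: expect EINVAL */
    if (mremap(p, 4096, 0, 0) == MAP_FAILED) {
        printf("zero new_size: %s\n", strerror(errno));
    }

    munmap(p, 4096);
    return 0;
}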
Re: [RFC PATCH] linux-user/mmap: Return EFAULT for invalid addresses
On Sat, 2021-02-13 at 18:40 +0100, Laurent Vivier wrote:
> On 08/01/2021 at 18:46, Richard Purdie wrote:
> > When using qemu-i386 to run gobject introspection parts of a webkitgtk
> > build using musl as libc on a 64 bit host, it sits in an infinite loop
> > of mremap calls of ever decreasing/increasing addresses.
> >
> > I suspect something in the musl memory allocation code loops indefinitely
> > if it only sees ENOMEM and only exits when it hits EFAULT.
> >
> > According to the docs, trying to mremap outside the address space
> > can/should return EFAULT and changing this allows the build to succeed.
> >
> > There was previous discussion of this as it used to work before qemu 2.11
> > and we've carried hacks to work around it since, this appears to be a
> > better fix of the real issue?
> >
> > Signed-off-by: Richard Purdie
> >
> > Index: qemu-5.2.0/linux-user/mmap.c
> > ===================================================================
> > --- qemu-5.2.0.orig/linux-user/mmap.c
> > +++ qemu-5.2.0/linux-user/mmap.c
> > @@ -727,7 +727,7 @@ abi_long target_mremap(abi_ulong old_add
> >          !guest_range_valid(new_addr, new_size)) ||
> >          ((flags & MREMAP_MAYMOVE) == 0 &&
> >           !guest_range_valid(old_addr, new_size))) {
> > -        errno = ENOMEM;
> > +        errno = EFAULT;
> >          return -1;
> >      }
>
> I agree with that, the ENOMEM is returned when there is not enough virtual
> memory (the mmap_find_vma() case).
>
> According to the manpage, EFAULT is returned when old_addr and
> old_addr + old_size is an invalid address space.
>
> So:
>
>     if (!guest_range_valid(old_addr, old_size)) {
>         errno = EFAULT;
>         return -1;
>     }
>
> But in the case of new_size and new_addr, it seems the good value to use
> is EINVAL.
>
> So:
>
>     if (((flags & MREMAP_FIXED) && !guest_range_valid(new_addr, new_size)) ||
>         ((flags & MREMAP_MAYMOVE) == 0 && !guest_range_valid(old_addr, new_size))) {
>         errno = EINVAL;
>         return -1;
>     }
>
> Did you try that?

It's taken me a short while to reproduce the test environment but I did so and can confirm that using EINVAL works just as well as EFAULT in the test case we have.

The above would therefore seem to make sense to me and would fix the case we found.

Cheers,

Richard
Re: [Qemu-devel] [PATCH v2 07/11] chardev: Let IOReadHandler use unsigned type
On Fri, 2021-01-22 at 14:55 +0100, Philippe Mathieu-Daudé wrote:
> Hi Prasad, Richard.
>
> On 1/22/21 12:52 PM, P J P wrote:
> > +-- On Fri, 22 Jan 2021, Richard Purdie wrote --+
> > > If so can anyone point me at that change?
> > >
> > > I ask since CVE-2018-18438 is marked as affecting all qemu versions
> > > (https://nvd.nist.gov/vuln/detail/CVE-2018-18438).
> > >
> > > If it was fixed, the version mask could be updated. If the fix wasn't
> > > deemed worthwhile for some reason that is also fine and I can mark this
> > > one as such in our system. I'm being told we only need one of the patches
> > > in this series which I also don't believe as I suspect we either need the
> > > set or none of them!
> > >
> > > Any info would be most welcome.
> >
> > -> https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg02239.html
> > -> https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg02231.html
> >
> > * Yes, the type change fix had come up during patch reviews above, and this
> >   series implemented the change.
> >
> > * Series is required IIUC, didn't realise it's not merged.
>
> Audit from Marc-André pointed that this is unlikely, we asked the
> reporter for a reproducer and got no news, and eventually closed
> this as NOTABUG (not even WONTFIX):
> https://bugzilla.redhat.com/show_bug.cgi?id=1609015

I guessed there was some resolution like this but couldn't find it, thanks for the pointer. It's now clear in the archives and I can appropriately reject carrying those patches, thanks!

Cheers,

Richard
Re: [Qemu-devel] [PATCH v2 07/11] chardev: Let IOReadHandler use unsigned type
On Fri, 2018-10-12 at 02:22 +0200, Philippe Mathieu-Daudé wrote: > The number of bytes can not be negative nor zero. > > Fixed 2 format string: > - hw/char/spapr_vty.c > - hw/usb/ccid-card-passthru.c > > Suggested-by: Paolo Bonzini > Signed-off-by: Philippe Mathieu-Daudé > Acked-by: Alberto Garcia Sorry to drag up an old patch series. As far as I can see this series was never applied. I suspect a better way of solving the issue may have been found? If so can anyone point me at that change? I ask since CVE-2018-18438 is marked as affecting all qemu versions (https://nvd.nist.gov/vuln/detail/CVE-2018-18438). If it was fixed, the version mask could be updated. If the fix wasn't deemed worthwhile for some reason that is also fine and I can mark this one as such in our system. I'm being told we only need one of the patches in this series which I also don't believe as I suspect we either need the set or none of them! Any info would be most welcome. Cheers, Richard
Re: [RFC PATCH] linux-user/mmap: Return EFAULT for invalid addresses
On Fri, 2021-01-08 at 17:46 +0000, Richard Purdie wrote:
> When using qemu-i386 to run gobject introspection parts of a webkitgtk
> build using musl as libc on a 64 bit host, it sits in an infinite loop
> of mremap calls of ever decreasing/increasing addresses.
>
> I suspect something in the musl memory allocation code loops indefinitely
> if it only sees ENOMEM and only exits when it hits EFAULT.
>
> According to the docs, trying to mremap outside the address space
> can/should return EFAULT and changing this allows the build to succeed.
>
> There was previous discussion of this as it used to work before qemu 2.11
> and we've carried hacks to work around it since, this appears to be a
> better fix of the real issue?
>
> Signed-off-by: Richard Purdie
>
> Index: qemu-5.2.0/linux-user/mmap.c
> ===================================================================
> --- qemu-5.2.0.orig/linux-user/mmap.c
> +++ qemu-5.2.0/linux-user/mmap.c
> @@ -727,7 +727,7 @@ abi_long target_mremap(abi_ulong old_add
>          !guest_range_valid(new_addr, new_size)) ||
>          ((flags & MREMAP_MAYMOVE) == 0 &&
>           !guest_range_valid(old_addr, new_size))) {
> -        errno = ENOMEM;
> +        errno = EFAULT;
>          return -1;
>      }

Any comments on this? I believe it's a valid issue that needs fixing and multiple distros appear to be carrying fixes in this area related to this.

Cheers,

Richard
Re: [PATCH] linux-user/mmap: Avoid asserts for out of range mremap calls
On Fri, 2021-01-08 at 17:42 +0000, Richard Purdie wrote:
> If mremap() is called without the MREMAP_MAYMOVE flag with a start address
> just before the end of memory (reserved_va) where new_size would exceed
> it (and GUEST_ADDR_MAX), the assert(end - 1 <= GUEST_ADDR_MAX) in
> page_set_flags() would trigger.
>
> Add an extra guard to the guest_range_valid() checks to prevent this and
> avoid asserting binaries when reserved_va is set.
>
> This meant a bug I was seeing locally now gives the same behaviour
> regardless of whether reserved_va is set or not.
>
> Signed-off-by: Richard Purdie
>
> Index: qemu-5.2.0/linux-user/mmap.c
> ===================================================================
> --- qemu-5.2.0.orig/linux-user/mmap.c
> +++ qemu-5.2.0/linux-user/mmap.c
> @@ -727,7 +727,9 @@ abi_long target_mremap(abi_ulong old_add
>
>      if (!guest_range_valid(old_addr, old_size) ||
>          ((flags & MREMAP_FIXED) &&
> -         !guest_range_valid(new_addr, new_size))) {
> +         !guest_range_valid(new_addr, new_size)) ||
> +        ((flags & MREMAP_MAYMOVE) == 0 &&
> +         !guest_range_valid(old_addr, new_size))) {
>          errno = ENOMEM;
>          return -1;
>      }

Any comments on this? I believe it's a valid issue that needs fixing and multiple distros appear to be carrying fixes in this area related to this.

Cheers,

Richard
[RFC PATCH] linux-user/mmap: Return EFAULT for invalid addresses
When using qemu-i386 to run gobject introspection parts of a webkitgtk build using musl as libc on a 64 bit host, it sits in an infinite loop of mremap calls of ever decreasing/increasing addresses. I suspect something in the musl memory allocation code loops indefinitely if it only sees ENOMEM and only exits when it hits EFAULT. According to the docs, trying to mremap outside the address space can/should return EFAULT and changing this allows the build to succeed. There was previous discussion of this as it used to work before qemu 2.11 and we've carried hacks to work around it since, this appears to be a better fix of the real issue? Signed-off-by: Richard Purdie
[PATCH] linux-user/mmap: Avoid asserts for out of range mremap calls
If mremap() is called without the MREMAP_MAYMOVE flag with a start address just before the end of memory (reserved_va) where new_size would exceed it (and GUEST_ADDR_MAX), the assert(end - 1 <= GUEST_ADDR_MAX) in page_set_flags() would trigger. Add an extra guard to the guest_range_valid() checks to prevent this and avoid asserting binaries when reserved_va is set. This meant a bug I was seeing locally now gives the same behaviour regardless of whether reserved_va is set or not. Signed-off-by: Richard Purdie
Re: [RFC PATCH 0/3] target/mips: Make the number of TLB entries a CPU property
On Tue, 2020-10-13 at 19:22 -0700, Richard Henderson wrote:
> On 10/13/20 4:11 PM, Richard Henderson wrote:
> > On 10/13/20 6:25 AM, Philippe Mathieu-Daudé wrote:
> > > Yocto developers have expressed interest in running MIPS32
> > > CPU with custom number of TLB:
> > > https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg03428.html
> > >
> > > Help them by making the number of TLB entries a CPU property,
> > > keeping our set of CPU definitions in sync with real hardware.
> >
> > You mean keeping the 34kf model within qemu in sync, rather than
> > creating a nonsense model that doesn't exist.
> >
> > Question: is this cpu parameter useful for anything else?
> >
> > Because the ideal solution for a CI loop is to use one of the mips cpu
> > models that has the hw page table walker (CP0C3_PW). Having qemu being
> > able to refill the tlb itself is massively faster.
> >
> > We do not currently implement a mips cpu that has the PW. When I downloaded
>
> Bah, "mips32 cpu".
>
> We do have the P5600 that does has it, though the code is wrapped up
> in TARGET_MIPS64. I'll also note that the code could be better placed [*]
>
> > (1) anyone know if the PW incompatible with mips32?
>
> I've since found a copy of the mips32-pra in the wayback machine and
> have answered this as "no" -- PW is documented for mips32.
>
> > (2) if not, was there any mips32 hw built with PW
> >     that we could model?
>
> But I still don't know about this.
>
> A further question for the Yocto folks: could you make use of a 64-bit
> kernel in testing a 32-bit userspace?

We run testing of 64 bit kernel with 64 bit userspace and 32 bit kernel with 32 bit userspace, we've tested that for years. I know some of our users do use 64 bit kernels with 32 bit userspace and we do limited testing of that for multilib support.

I think we did try testing an R2 setup but found little performance change and I think it may have been unreliable so we didn't make the switch. We did move to 34kf relatively recently as that did perform marginally better and matched qemu's recommendations.

We've also run into a lot of problems with 32 bit mips in general if we go over 256MB memory since that seems to trigger highmem and hangs regularly for us. We're working on infrastructure to save out those hung VMs to help us debug such issues but don't have that yet. It's not regular enough and we don't have the expertise to debug it properly as yet unfortunately.

There is a question of how valid a 32 bit kernel is these days, particularly given the memory issues we seem to run into there with larger images.

Cheers,

Richard
Re: [RFC PATCH 0/3] target/mips: Make the number of TLB entries a CPU property
On Wed, 2020-10-14 at 01:36 +, Victor Kamensky (kamensky) wrote: > Thank you very much for looking at this. I gave a spin to > your 3 patch series in original setup, and as expected with > '-cpu 34Kf,tlb-entries=64' option it works great. > > If nobody objects, and your patches could be merged, we > would greatly appreciate it. Speaking as one of the Yocto Project maintainers, this is really helpful for us, thanks! qemumips is one of our slowest platforms for automated testing so this performance improvement helps a lot. Cheers, Richard
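For readers wondering what "making the number of TLB entries a CPU property" looks like mechanically, the usual qdev shape is roughly the fragment below. This is a hypothetical sketch, not Philippe's actual patch; the struct field name and default value are invented for illustration:

/* Fragment only: assumes a MIPSCPU struct field (here 'cfg_tlb_entries',
 * an invented name) that the TLB init code reads during realize. */
#include "hw/qdev-properties.h"

static Property mips_cpu_properties[] = {
    /* overridable on the command line: -cpu 34Kf,tlb-entries=64 */
    DEFINE_PROP_UINT32("tlb-entries", MIPSCPU, cfg_tlb_entries, 16),
    DEFINE_PROP_END_OF_LIST(),
};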
Re: [Qemu-devel] [Qemu-ppc] [PATCH v2] target/ppc: Fix system lockups caused by interrupt_request state corruption
On Mon, 2017-12-04 at 12:44 +1100, David Gibson wrote: > On Mon, Dec 04, 2017 at 12:00:40PM +1100, David Gibson wrote: > > > > On Fri, Dec 01, 2017 at 03:49:07PM +, Richard Purdie wrote: > > > > > > Occasionally in Linux guests on x86_64 we're seeing logs like: > > > > > > ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending > > > 0100req 0004 > > > > > > when they should read: > > > > > > ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending > > > 0100req 0002 > > > > > > The "0004" is CPU_INTERRUPT_EXITTB yet the code calls > > > cpu_interrupt(cs, CPU_INTERRUPT_HARD) ("0002") in this > > > function > > > just before the log message. Something is causing the HARD bit > > > setting > > > to get lost. > > > > > > The knock on effect of losing that bit is the decrementer timer > > > interrupts > > > don't get delivered which causes the guest to sit idle in its > > > idle handler > > > and 'hang'. > > > > > > The issue occurs due to races from code which sets > > > CPU_INTERRUPT_EXITTB. > > > > > > Rather than poking directly into cs->interrupt_request, that code > > > needs to: > > > > > > a) hold BQL > > > b) use the cpu_interrupt() helper > > > > > > This patch fixes the call sites to do this, fixing the hang. > > > > > > Signed-off-by: Richard Purdie > > > > > I strongly suspect there's a better way to do this long term - a > > lot > > of that old ppc TCG code is really crufty. But as best I can tell, > > this is certainly a fix over what we had. So, applied to > > ppc-for-2.11. > I take that back. Running make check with this patch results in: > > GTESTER check-qtest-ppc64 > ** > ERROR:/home/dwg/src/qemu/cpus.c:1582:qemu_mutex_lock_iothread: > assertion failed: (!qemu_mutex_iothread_locked()) > Broken pipe > qemu-system-ppc64: RP: Received invalid message 0x length 0x > GTester: last random seed: R02S895b0f4813776bf68c147bf987e73f7b > make: *** [/home/dwg/src/qemu/tests/Makefile.include:852: check- > qtest-ppc64] Error 1 > > So, I've reverted it. Sorry about that. I tried to stress the code paths and no issues showed up in our testing but hadn't realised those tests existed. Given there do seem to be mixed locked and unlocked code paths I've taken a different approach and sent a v3 which passes those tests. I do agree with you that there is probably a better way but that would need someone with a better understanding of the bigger picture than I have. This patch does stop our image tests locking up so does seem to fix a valid/real problem which people can run into. Cheers, Richard
[Qemu-devel] [PATCH v3] target/ppc: Fix system lockups caused by interrupt_request state corruption
Occasionally in Linux guests on x86_64 we're seeing logs like:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100 req 00000004

when they should read:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100 req 00000002

The "00000004" is CPU_INTERRUPT_EXITTB yet the code calls cpu_interrupt(cs, CPU_INTERRUPT_HARD) ("00000002") in this function just before the log message. Something is causing the HARD bit setting to get lost.

The knock on effect of losing that bit is the decrementer timer interrupts don't get delivered which causes the guest to sit idle in its idle handler and 'hang'.

The issue occurs due to races from code which sets CPU_INTERRUPT_EXITTB. Rather than poking directly into cs->interrupt_request, that code needs to:

a) hold BQL
b) use the cpu_interrupt() helper

This patch fixes the call sites to do this, fixing the hang. The calls are made from a variety of contexts so a helper function is added to handle the necessary locking. This can likely be improved and optimised in the future but it ensures the code is correct and doesn't lock up as it stands today.

Signed-off-by: Richard Purdie
---
 target/ppc/excp_helper.c |  7 +++----
 target/ppc/helper_regs.h | 17 +++++++++++++---
 2 files changed, 18 insertions(+), 6 deletions(-)

v3: Fix make check failures
v2: Fixes a compile issue with master and ensures BQL is held in one case where it potentially wasn't.

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index e6009e70e5..37d2410726 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -207,7 +207,7 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp)
                           "Entering checkstop state\n");
         }
         cs->halted = 1;
-        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+        cpu_interrupt_exittb(cs);
     }
     if (env->msr_mask & MSR_HVB) {
         /* ISA specifies HV, but can be delivered to guest with HV clear
@@ -940,7 +940,7 @@ void helper_store_msr(CPUPPCState *env, target_ulong val)
 
     if (excp != 0) {
         CPUState *cs = CPU(ppc_env_get_cpu(env));
-        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+        cpu_interrupt_exittb(cs);
         raise_exception(env, excp);
     }
 }
@@ -995,8 +995,7 @@ static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong msr)
     /* No need to raise an exception here,
      * as rfi is always the last insn of a TB
      */
-    cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
-
+    cpu_interrupt_exittb(cs);
     /* Reset the reservation */
     env->reserve_addr = -1;
 
diff --git a/target/ppc/helper_regs.h b/target/ppc/helper_regs.h
index 2627a70176..84fd30c2db 100644
--- a/target/ppc/helper_regs.h
+++ b/target/ppc/helper_regs.h
@@ -20,6 +20,8 @@
 #ifndef HELPER_REGS_H
 #define HELPER_REGS_H
 
+#include "qemu/main-loop.h"
+
 /* Swap temporary saved registers with GPRs */
 static inline void hreg_swap_gpr_tgpr(CPUPPCState *env)
 {
@@ -96,6 +98,17 @@ static inline void hreg_compute_hflags(CPUPPCState *env)
     env->hflags |= env->hflags_nmsr;
 }
 
+static inline void cpu_interrupt_exittb(CPUState *cs)
+{
+    if (!qemu_mutex_iothread_locked()) {
+        qemu_mutex_lock_iothread();
+        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+        qemu_mutex_unlock_iothread();
+    } else {
+        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+    }
+}
+
 static inline int hreg_store_msr(CPUPPCState *env, target_ulong value,
                                  int alter_hv)
 {
@@ -114,11 +127,11 @@ static inline int hreg_store_msr(CPUPPCState *env, target_ulong value,
     }
     if (((value >> MSR_IR) & 1) != msr_ir ||
         ((value >> MSR_DR) & 1) != msr_dr) {
-        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+        cpu_interrupt_exittb(cs);
     }
     if ((env->mmu_model & POWERPC_MMU_BOOKE) &&
         ((value >> MSR_GS) & 1) != msr_gs) {
-        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+        cpu_interrupt_exittb(cs);
     }
     if (unlikely((env->flags & POWERPC_FLAG_TGPR) &&
                  ((value ^ env->msr) & (1 << MSR_TGPR)))) {
-- 
2.14.1
[Qemu-devel] [PATCH v2] target/ppc: Fix system lockups caused by interrupt_request state corruption
Occasionally in Linux guests on x86_64 we're seeing logs like:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100 req 00000004

when they should read:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100 req 00000002

The "00000004" is CPU_INTERRUPT_EXITTB yet the code calls cpu_interrupt(cs, CPU_INTERRUPT_HARD) ("00000002") in this function just before the log message. Something is causing the HARD bit setting to get lost.

The knock on effect of losing that bit is the decrementer timer interrupts don't get delivered which causes the guest to sit idle in its idle handler and 'hang'.

The issue occurs due to races from code which sets CPU_INTERRUPT_EXITTB. Rather than poking directly into cs->interrupt_request, that code needs to:

a) hold BQL
b) use the cpu_interrupt() helper

This patch fixes the call sites to do this, fixing the hang.

Signed-off-by: Richard Purdie
---
 target/ppc/excp_helper.c | 16 +++++++++++++---
 target/ppc/helper_regs.h | 10 ++++++++--
 2 files changed, 21 insertions(+), 5 deletions(-)

v2: Fixes a compile issue with master and ensures BQL is held in one case where it potentially wasn't.

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index e6009e7..8040277 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -207,7 +207,13 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp)
                           "Entering checkstop state\n");
         }
         cs->halted = 1;
-        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+        if (!qemu_mutex_iothread_locked()) {
+            qemu_mutex_lock_iothread();
+            cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+            qemu_mutex_unlock_iothread();
+        } else {
+            cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+        }
     }
     if (env->msr_mask & MSR_HVB) {
         /* ISA specifies HV, but can be delivered to guest with HV clear
@@ -940,7 +946,9 @@ void helper_store_msr(CPUPPCState *env, target_ulong val)
 
     if (excp != 0) {
         CPUState *cs = CPU(ppc_env_get_cpu(env));
-        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+        qemu_mutex_lock_iothread();
+        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+        qemu_mutex_unlock_iothread();
         raise_exception(env, excp);
     }
 }
@@ -995,7 +1003,9 @@ static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong msr)
     /* No need to raise an exception here,
      * as rfi is always the last insn of a TB
      */
-    cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+    qemu_mutex_lock_iothread();
+    cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+    qemu_mutex_unlock_iothread();
 
     /* Reset the reservation */
     env->reserve_addr = -1;

diff --git a/target/ppc/helper_regs.h b/target/ppc/helper_regs.h
index 2627a70..0beaad5 100644
--- a/target/ppc/helper_regs.h
+++ b/target/ppc/helper_regs.h
@@ -20,6 +20,8 @@
 #ifndef HELPER_REGS_H
 #define HELPER_REGS_H
 
+#include "qemu/main-loop.h"
+
 /* Swap temporary saved registers with GPRs */
 static inline void hreg_swap_gpr_tgpr(CPUPPCState *env)
 {
@@ -114,11 +116,15 @@ static inline int hreg_store_msr(CPUPPCState *env, target_ulong value,
     }
     if (((value >> MSR_IR) & 1) != msr_ir ||
         ((value >> MSR_DR) & 1) != msr_dr) {
-        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+        qemu_mutex_lock_iothread();
+        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+        qemu_mutex_unlock_iothread();
     }
     if ((env->mmu_model & POWERPC_MMU_BOOKE) &&
         ((value >> MSR_GS) & 1) != msr_gs) {
-        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+        qemu_mutex_lock_iothread();
+        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+        qemu_mutex_unlock_iothread();
     }
     if (unlikely((env->flags & POWERPC_FLAG_TGPR) &&
                  ((value ^ env->msr) & (1 << MSR_TGPR)))) {
-- 
2.7.4
Re: [Qemu-devel] qemu-system-ppc hangs
On Tue, 2017-11-21 at 09:55 +0000, Alex Bennée wrote: > Richard Purdie writes: > > At this point I therefore wanted to seek advice on what the real > > issue > > is here and how to fix it! > Code should be using cpu_interrupt() to change things but we have > plenty > of examples in the code of messing with cpu->interrupt_request > directly > which is often why the assert() in cpu_interrupt() doesn't get a > chance > to fire. > > The two most prolific direct users seems to be i386 and ppc. > > The i386 cases are all most likely OK as it tends to be in KVM > emulation > code where by definition the BQL is already held by the time you get > there. For PPC it will depend on how you got there. The > multi-thread-tcg.txt document describes the approach: > > Emulated hardware state > --- > > Currently thanks to KVM work any access to IO memory is > automatically > protected by the global iothread mutex, also known as the BQL (Big > Qemu Lock). Any IO region that doesn't use global mutex is expected > to > do its own locking. > > However IO memory isn't the only way emulated hardware state can be > modified. Some architectures have model specific registers that > trigger hardware emulation features. Generally any translation > helper > that needs to update more than a single vCPUs of state should take > the > BQL. > > As the BQL, or global iothread mutex is shared across the system we > push the use of the lock as far down into the TCG code as possible > to > minimise contention. > > (Current solution) > > MMIO access automatically serialises hardware emulation by way of > the > BQL. Currently ARM targets serialise all ARM_CP_IO register > accesses > and also defer the reset/startup of vCPUs to the vCPU context by > way > of async_run_on_cpu(). > > Updates to interrupt state are also protected by the BQL as they > can > often be cross vCPU. > > So basically it comes down to the call-path into your final > cpu_interrupt() call which is why I guess your doing the: > > if (!qemu_mutex_iothread_locked()) { > qemu_mutex_lock_iothread(); > cpu_interrupt(cs, CPU_INTERRUPT_EXITTB); > qemu_mutex_unlock_iothread(); > } > > dance. I suspect the helper functions are called both from device > emulation (where BQL is held) and from vCPU helpers (where it is > currently not). > > So I suggest the fix is: > > 1. Convert sites doing their own manipulation of > cpu->interrupt_request() to call the helper instead > 2. If helper directly called from TCG code: > - take BQL before calling cpu_interrupt(), release after > Else if helper shared between MMIO/TCG paths > - take BQL from TCG path, call helper, release after > > It might be there is some sensible re-factoring that could be done to > make things clearer but I'll leave that to the PPC experts to weigh > in > on. > > Hope that helps!

It does indeed, thanks. I've sent out a proposed patch which does the above so people can review it and see if it's the right thing to do. Certainly it's working fine locally.

Cheers,

Richard
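For illustration, the refactoring Alex hints at could pull the conditional-BQL dance into a single helper so each call site stays one line. This is only a sketch; the helper name is hypothetical and not taken from the QEMU tree, though the calls inside it (qemu_mutex_iothread_locked(), qemu_mutex_lock_iothread() and cpu_interrupt()) are the real APIs used in the patch above.

static void ppc_cpu_interrupt_exittb(CPUState *cs)
{
    /* Take the BQL only if this path does not already hold it. */
    if (!qemu_mutex_iothread_locked()) {
        qemu_mutex_lock_iothread();
        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
        qemu_mutex_unlock_iothread();
    } else {
        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
    }
}

Each of the five call sites would then collapse to ppc_cpu_interrupt_exittb(cs);, encoding the locking policy once instead of five times.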
[Qemu-devel] [PATCH] target/ppc: Fix system lockups caused by interrupt_request state corruption
Occasionally in Linux guests on x86_64 we're seeing logs like:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 0100 req 0004

when they should read:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 0100 req 0002

The "0004" is CPU_INTERRUPT_EXITTB, yet the code calls cpu_interrupt(cs, CPU_INTERRUPT_HARD) ("0002") in this function just before the log message. Something is causing the HARD bit setting to get lost.

The knock-on effect of losing that bit is that the decrementer timer interrupts don't get delivered, which causes the guest to sit idle in its idle handler and 'hang'.

The issue occurs due to races from code which sets CPU_INTERRUPT_EXITTB. Rather than poking directly into cs->interrupt_request, that code needs to:

a) hold the BQL
b) use the cpu_interrupt() helper

This patch fixes the call sites to do this, fixing the hang.

Signed-off-by: Richard Purdie
---
 target/ppc/excp_helper.c | 12 +++++++++---
 target/ppc/helper_regs.h |  8 ++++++--
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index e6009e7..f175c21 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -207,7 +207,9 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp)
                      "Entering checkstop state\n");
         }
         cs->halted = 1;
-        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+        qemu_mutex_lock_iothread();
+        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+        qemu_mutex_unlock_iothread();
     }
     if (env->msr_mask & MSR_HVB) {
         /* ISA specifies HV, but can be delivered to guest with HV clear
@@ -940,7 +942,9 @@ void helper_store_msr(CPUPPCState *env, target_ulong val)
     if (excp != 0) {
         CPUState *cs = CPU(ppc_env_get_cpu(env));

-        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+        qemu_mutex_lock_iothread();
+        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+        qemu_mutex_unlock_iothread();
         raise_exception(env, excp);
     }
 }
@@ -995,7 +999,9 @@ static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong msr)
     /* No need to raise an exception here,
      * as rfi is always the last insn of a TB
      */
-    cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+    qemu_mutex_lock_iothread();
+    cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+    qemu_mutex_unlock_iothread();

     /* Reset the reservation */
     env->reserve_addr = -1;
diff --git a/target/ppc/helper_regs.h b/target/ppc/helper_regs.h
index 2627a70..13dd0b8 100644
--- a/target/ppc/helper_regs.h
+++ b/target/ppc/helper_regs.h
@@ -114,11 +114,15 @@ static inline int hreg_store_msr(CPUPPCState *env, target_ulong value,
     }
     if (((value >> MSR_IR) & 1) != msr_ir ||
         ((value >> MSR_DR) & 1) != msr_dr) {
-        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+        qemu_mutex_lock_iothread();
+        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+        qemu_mutex_unlock_iothread();
     }
     if ((env->mmu_model & POWERPC_MMU_BOOKE) &&
         ((value >> MSR_GS) & 1) != msr_gs) {
-        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+        qemu_mutex_lock_iothread();
+        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
+        qemu_mutex_unlock_iothread();
     }
     if (unlikely((env->flags & POWERPC_FLAG_TGPR) &&
                  ((value ^ env->msr) & (1 << MSR_TGPR)))) {
--
2.7.4
Re: [Qemu-devel] qemu-system-ppc hangs
On Tue, 2017-11-21 at 07:50 +0000, Mark Cave-Ayland wrote: > On 21/11/17 00:00, Richard Purdie wrote: > > I work on the Yocto Project and we use qemu to test boot our Linux > > images and run tests against them. We've been noticing some > > instability > > for ppc where the images sometimes hang, usually around udevd bring > > up > > time so just after booting into userspace. > > > > To cut a long story short, I've tracked down what I think is the > > problem. I believe the decrementer timer stops receiving interrupts > > so > > tasks in our images hang indefinitely as the timer stopped. > > > > It can be summed up with this line of debug: > > > > ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 0100 req > > 0004 > > > > It should normally read: > > > > ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 0100 req > > 0002 > > > > The question is why CPU_INTERRUPT_EXITTB ends up being set when the > > lines above this log message clearly set CPU_INTERRUPT_HARD (via > > cpu_interrupt() ). > > > > I note in cpu.h: > > > > /* updates protected by BQL */ > > uint32_t interrupt_request; > > > > (for struct CPUState) > > > > The ppc code does "cs->interrupt_request |= CPU_INTERRUPT_EXITTB" > > in 5 > > places, 3 in excp_helper.c and 2 in helper_regs.h. In all cases, > > g_assert(qemu_mutex_iothread_locked()); fails. If I do something > > like: > > > > if (!qemu_mutex_iothread_locked()) { > > qemu_mutex_lock_iothread(); > > cpu_interrupt(cs, CPU_INTERRUPT_EXITTB); > > qemu_mutex_unlock_iothread(); > > } else { > > cpu_interrupt(cs, CPU_INTERRUPT_EXITTB); > > } > > > > in these call sites then I can no longer lock qemu up with my test > > case. > > > > I suspect the _HARD setting gets overwritten which stops the > > decrementer interrupts being delivered. > > > > I don't know if taking this lock in these situations is going to be > > bad > > for performance and whether such a patch would be right/wrong. > > > > At this point I therefore wanted to seek advice on what the real > > issue > > is here and how to fix it! > > Thanks for the report - given that a lot of work has been done on > MTTCG and iothread over the past few releases, it wouldn't be a > complete surprise if something had crept in here. > > Firstly let's start off with some basics: what is your host > architecture, QEMU version and full command line being used to launch > QEMU?

I'm running this on x86_64, I'm using qemu 2.10.1 and the command line being used for qemu is:

qemu-system-ppc -device virtio-net-pci,netdev=net0,mac=52:54:00:12:34:02 -netdev tap,id=net0,ifname=tap0,script=no,downscript=no -drive file=./core-image-sato-qemuppc.rootfs.ext4,if=virtio,format=raw -show-cursor -device virtio-rng-pci -nographic -pidfile /tmp/zzqemu.1.pid -d unimp,guest_errors,int -D /tmp/qemu.1 -monitor pty -machine mac99 -cpu G4 -m 256 -snapshot -serial mon:stdio -serial null -kernel /tmp/repro/vmlinux-qemuppc.bin -append 'root=/dev/vda rw highres=off console=ttyS0 mem=256M ip=192.168.7.2::192.168.7.1:255.255.255.0 console=tty console=ttyS0 udev.log-priority=debug powersave=off'

> Would it also be possible for you to make your test image available > for other people to see if they can recreate the same issue?

I've shared the image, kernel and my "reproducer" script:

http://www.rpsys.net/wp/rp/qemuppc-hang-reproducer.tgz

We needed to find a way to reproduce this at will and it doesn't seem to happen often. The scripts in there are partly extracted from our test setup and partly ugly brute forcing.
To run them you'd do something like:

cc tunctl.c -o tunctl
sudo ./runqemu-gen-tapdevs 1000 1000 50

(This sets up tap0-tap49 accessible by user/group 1000/1000; you only need to do this once - it's how we enable easy networking without needing sudo on our test infrastructure.)

vi core-image-sato-qemuppc.qemuboot.conf
[set the last three lines to point at where qemu-system-ppc lives]

vi ./runqemu-parallel.py
[set mydir to wherever you extracted it to]

python3 ./runqemu-parallel.py

This will launch 50 copies of qemu, dumping logging and output into /tmp/qemu.X and /tmp/*runqemu* files, and then monitor the logs to see which, if any, "stall". It's normal for the image to stall for a few seconds towards the end of boot, but if any are printing stalled messages for a minute, they've properly hung. You'll see a:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 0100 req 0004

in the logs of a hung qemu. The image output would usually stop with an ep_poll4. The kernel I've provided there is a very verbose debug kernel which is handy to tell when it's hung, if a bit slower to boot. I didn't promise it was neat, sorry :)

Cheers,

Richard
[Qemu-devel] qemu-system-ppc hangs
Hi,

I work on the Yocto Project and we use qemu to test boot our Linux images and run tests against them. We've been noticing some instability for ppc where the images sometimes hang, usually around udevd bring up time, so just after booting into userspace.

To cut a long story short, I've tracked down what I think is the problem. I believe the decrementer timer stops receiving interrupts so tasks in our images hang indefinitely as the timer stopped.

It can be summed up with this line of debug:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 0100 req 0004

It should normally read:

ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 0100 req 0002

The question is why CPU_INTERRUPT_EXITTB ends up being set when the lines above this log message clearly set CPU_INTERRUPT_HARD (via cpu_interrupt()).

I note in cpu.h:

/* updates protected by BQL */
uint32_t interrupt_request;

(for struct CPUState)

The ppc code does "cs->interrupt_request |= CPU_INTERRUPT_EXITTB" in 5 places, 3 in excp_helper.c and 2 in helper_regs.h. In all cases, g_assert(qemu_mutex_iothread_locked()); fails. If I do something like:

if (!qemu_mutex_iothread_locked()) {
    qemu_mutex_lock_iothread();
    cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
    qemu_mutex_unlock_iothread();
} else {
    cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
}

in these call sites then I can no longer lock qemu up with my test case.

I suspect the _HARD setting gets overwritten which stops the decrementer interrupts being delivered.

I don't know if taking this lock in these situations is going to be bad for performance and whether such a patch would be right/wrong.

At this point I therefore wanted to seek advice on what the real issue is here and how to fix it!

Cheers,

Richard
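The suspected lost update can be made concrete with a small standalone program (illustrative only, not QEMU code; the bit values merely mirror CPU_INTERRUPT_HARD and CPU_INTERRUPT_EXITTB). One thread ORs in a bit the way cpu_interrupt() would under the BQL; the other ORs in a second bit with no lock, as the ppc helpers currently do:

#include <pthread.h>
#include <stdio.h>

#define BIT_HARD   0x0002u  /* stands in for CPU_INTERRUPT_HARD */
#define BIT_EXITTB 0x0004u  /* stands in for CPU_INTERRUPT_EXITTB */

static volatile unsigned interrupt_request;
static volatile int stop;

/* Unlocked "interrupt_request |= BIT_EXITTB": the |= compiles to a
 * separate load, OR and store, so its store can be based on a stale
 * load that predates another thread's update. */
static void *exittb_thread(void *arg)
{
    (void)arg;
    while (!stop) {
        interrupt_request |= BIT_EXITTB;
    }
    return NULL;
}

int main(void)
{
    pthread_t t;
    long i, lost = 0;

    pthread_create(&t, NULL, exittb_thread, NULL);
    for (i = 0; i < 50000000; i++) {
        interrupt_request = 0;           /* fresh pending state */
        interrupt_request |= BIT_HARD;   /* what cpu_interrupt() sets */
        if (!(interrupt_request & BIT_HARD)) {
            lost++;                      /* clobbered by a stale store */
        }
    }
    stop = 1;
    pthread_join(t, NULL);
    printf("HARD bit lost %ld times\n", lost);
    return 0;
}

Built with cc -O2 -pthread, on a multicore host this typically reports a non-zero count, which is the same pattern as the log above: the 0002 bit vanishes and only 0004 remains.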
Re: [Qemu-devel] [RFC PATCH v1 0/8] QOM prop overloading + ARM MPCore CPUs
On Fri, 2015-09-18 at 11:14 -0700, Peter Crosthwaite wrote: > On Fri, Sep 18, 2015 at 10:23 AM, Richard Purdie > wrote: > > On Fri, 2015-09-18 at 09:46 -0700, Peter Crosthwaite wrote: > >> >> My biggest fear is testing of the changes for the affected boards. > >> >> Peter, do you much coverage of these boards in your regressions? Do you > >> >> have automated tests in a git repo somewhere? > >> > > >> > The answers to these questions are "nowhere near enough" and > >> > "unfortunately not"... > >> > > >> > >> How hard would it be to do something Yocto powered? AFAIK Yocto only > >> supports the one ARM board (Vexpress), three (+ZynqMP, +Zynq) with the > >> Meta-Xilinx layer and there may be more with other layers (anything in > >> meta-linaro?). Can we bitbake something that builds out a large number > >> of ARM machines and tests them all on QEMU? > > > > Running our standard ARM board tests is a case of: > > > > git clone http://git.yoctoproject.org/git/poky > > cd poky > > source oe-init-build-env > > echo 'INHERIT += "testimage"' >> ./conf/local.conf > > MACHINE=qemuarm bitbake core-image-sato > > MACHINE=qemuarm bitbake core-image-sato -c testimage > > > > So qemuarm is implicitly vexpress, I guess we would want to add more, > such as qemuarm-zynq, qemuarm-zaurus, qemuarm-virt etc. Can a single > bitbake core-image-foo build out multiple MACHINEs or does it not work > like that?

You'd usually just MACHINE=X bitbake <image>; MACHINE=Y bitbake <image>. If the configuration shares a common set of compiler optimisation flags, it will reuse the image binaries.

> > You could replace core-image-sato -> core-image-minimal for a smaller > > image and fewer tests or try core-image-sato-sdk or core-image-lsb-sdk > > for more. > > > > The Quick Start guide is at > > http://www.yoctoproject.org/docs/1.8/yocto-project-qs/yocto-project-qs.html > > and has various things like precanned lists of prerequisites for the > > package manager. > > > > Not sure which other boards you could try booting but I know the Zaurus > > machines did work a long time ago as we submitted the qemu code. They > > are now in their own layer and I've not tried them in a long time. > Do these multiple vendor layers conflict with each other and is > merging all the different ARM machines to poky mainline feasible?

The layer model is intentionally like a plugin architecture. OE was once a monolithic repository; it grew too large and painful to work with. We therefore now have layers which have separate maintainership. The quality does vary a bit but it did with the monolithic repo too. In theory you can just plug the layers you want into the core and use them; they shouldn't conflict.

> Something else that is on topic, is we should consider changing > (subject to backwards compat) the default qemuarm machine to virt, as > this machine is well maintained and intended for use as a pure virtual > machine (which is intent of Yocto qemu specific target IIUC).

That would need some wider discussion with the OE community and our kernel maintainers since I know we build a specific qemuarm kernel, but in principle it could be done. There are also qemux86, qemux86-64, qemuppc, qemumips, qemuarm64 and qemumips64 fwiw. I'm not sure we make the best use of qemu in these so I'd be interested in any opinions on what we're doing there... :) FWIW we did leave qemuarm as an armv5 cpu since we tend to test a lot of armv7 on real hardware.

Cheers,

Richard
Re: [Qemu-devel] [RFC PATCH v1 0/8] QOM prop overloading + ARM MPCore CPUs
On Fri, 2015-09-18 at 09:46 -0700, Peter Crosthwaite wrote: > >> My biggest fear is testing of the changes for the affected boards. > >> Peter, do you much coverage of these boards in your regressions? Do you > >> have automated tests in a git repo somewhere? > > > > The answers to these questions are "nowhere near enough" and > > "unfortunately not"... > > > > How hard would it be to do something Yocto powered? AFAIK Yocto only > supports the one ARM board (Vexpress), three (+ZynqMP, +Zynq) with the > Meta-Xilinx layer and there may be more with other layers (anything in > meta-linaro?). Can we bitbake something that builds out a large number > of ARM machines and tests them all on QEMU?

Running our standard ARM board tests is a case of:

git clone http://git.yoctoproject.org/git/poky
cd poky
source oe-init-build-env
echo 'INHERIT += "testimage"' >> ./conf/local.conf
MACHINE=qemuarm bitbake core-image-sato
MACHINE=qemuarm bitbake core-image-sato -c testimage

You could replace core-image-sato -> core-image-minimal for a smaller image and fewer tests, or try core-image-sato-sdk or core-image-lsb-sdk for more.

The Quick Start guide is at http://www.yoctoproject.org/docs/1.8/yocto-project-qs/yocto-project-qs.html and has various things like precanned lists of prerequisites for the package manager.

Not sure which other boards you could try booting but I know the Zaurus machines did work a long time ago as we submitted the qemu code. They are now in their own layer and I've not tried them in a long time.

The above will build its own qemu-native as there are some patches we rely on (like the network fixes). You can point the qemu recipe at different source easily enough.

Cheers,

Richard
Re: [Qemu-devel] [RFT PATCH v1 0/3] net: smc91c111 can_receive fixes
On Tue, 2015-09-15 at 11:02 -0700, Peter Crosthwaite wrote: > On Mon, Sep 14, 2015 at 2:09 PM, Richard Purdie > wrote: > > Hi Peter, > > > > On Thu, 2015-09-10 at 21:23 -0700, Peter Crosthwaite wrote: > >> This should hopefully fix your bug, while addressing the extra concern > >> I raised. > >> > >> There was also inconsistent behaviour with corking packets through a > >> soft reset which I notice and fixed. > >> > >> Please let me know if this works for you. > > > > I've run it through a few builds/tests in place of my own patches and so > > far it seems to be working, thanks! > > > > Can we take that as a formal Tested-by? :) Tested-by: Richard Purdie if that helps :) Cheers, Richard
Re: [Qemu-devel] [RFT PATCH v1 0/3] net: smc91c111 can_receive fixes
Hi Peter, On Thu, 2015-09-10 at 21:23 -0700, Peter Crosthwaite wrote: > This should hopefully fix your bug, while addressing the extra concern > I raised. > > There was also inconsistent behaviour with corking packets through a > soft reset which I notice and fixed. > > Please let me know if this works for you. I've run it through a few builds/tests in place of my own patches and so far it seems to be working, thanks! Cheers, Richard
Re: [Qemu-devel] Segfault using qemu-system-arm in smc91c111
On Sun, 2015-09-06 at 17:48 -0700, Peter Crosthwaite wrote: > This doesn't sound right. There are other network controllers that > rely of can_receive catching all cases properly. Is this a regression? > Looking at logs, I see some refactoring of QEMU net framework around > June timeframe, if you rewind to QEMU 2.3 (or earlier) does the bug go > away?

I did find an interesting comment in this commit:

http://git.qemu.org/?p=qemu.git;a=commitdiff;h=625de449fc5597f2e1aff9cb586e249e198f03c9

"""
Since commit 6e99c63 "net/socket: Drop net_socket_can_send" and friends, net queues need to be explicitly flushed after qemu_can_send_packet() returns false, because the netdev side will disable the polling of fd.
"""

smc91c111 is calling flush functions when it knows can_receive would/should return false. I believe that is the bug here. I suspect the driver needs:

* can_receive to actually return the right value
* the locations of the flush calls to be when there is receive space

This could explain what changed to break this and why moving the flush calls works in my patch.

Cheers,

Richard
Re: [Qemu-devel] Segfault using qemu-system-arm in smc91c111
On Sun, 2015-09-06 at 17:48 -0700, Peter Crosthwaite wrote: > On Sun, Sep 6, 2015 at 4:26 PM, Richard Purdie > wrote: > > On Sun, 2015-09-06 at 11:37 -0700, Peter Crosthwaite wrote: > > I tested an assert in _receive() which confirms it can be called when > > can_receive() says it isn't ready. > > > > A backtrace of this would be handy.

This is the trace with my assert against smc91c111_can_receive(nc) being false when we're in receive(), before we allocate_packet:

#0  0x7f355f276267 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
#1  0x7f355f277eca in __GI_abort () at abort.c:89
#2  0x7f355f26f03d in __assert_fail_base (fmt=0x7f355f3d1028 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f3562158ed7 "smc91c111_can_receive(nc)", file=file@entry=0x7f3562158dc8 "/media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/hw/net/smc91c111.c", line=line@entry=680, function=function@entry=0x7f35621591b0 <__PRETTY_FUNCTION__.26130> "smc91c111_receive") at assert.c:92
#3  0x7f355f26f0f2 in __GI___assert_fail (assertion=assertion@entry=0x7f3562158ed7 "smc91c111_can_receive(nc)", file=file@entry=0x7f3562158dc8 "/media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/hw/net/smc91c111.c", line=line@entry=680, function=function@entry=0x7f35621591b0 <__PRETTY_FUNCTION__.26130> "smc91c111_receive") at assert.c:101
#4  0x7f3561fca4d0 in smc91c111_receive (nc=0x7f3563604d10, buf=0x7f353c09d028 "RT", size=1514) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/hw/net/smc91c111.c:680
#5  0x7f356203058b in qemu_deliver_packet (sender=<optimized out>, flags=<optimized out>, data=data@entry=0x7f353c09d028 "RT", size=<optimized out>, opaque=0x7f3563604d10) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/net/net.c:577
#6  0x7f3562031eaa in qemu_net_queue_deliver (size=<optimized out>, data=<optimized out>, flags=<optimized out>, sender=<optimized out>, queue=0x7f3563604e70) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/net/queue.c:157
#7  qemu_net_queue_flush (queue=0x7f3563604e70) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/net/queue.c:254
#8  0x7f356202fc7c in qemu_flush_or_purge_queued_packets (nc=0x7f3563604d10, purge=<optimized out>) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/net/net.c:606
#9  0x7f3561fcacec in smc91c111_writeb (opaque=0x7f35635178f0, offset=<optimized out>, value=128) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/hw/net/smc91c111.c:382
#10 0x7f3561fcaf84 in smc91c111_writew (opaque=0x7f35635178f0, offset=0, value=128) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/hw/net/smc91c111.c:612
#11 0x7f3561e52cc5 in memory_region_oldmmio_write_accessor (mr=<optimized out>, addr=<optimized out>, value=<optimized out>, size=<optimized out>, shift=<optimized out>, mask=<optimized out>, attrs=...) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/memory.c:434
#12 0x7f3561e5227d in access_with_adjusted_size (addr=addr@entry=0, value=value@entry=0x7f35572d93f8, size=size@entry=2, access_size_min=<optimized out>, access_size_max=<optimized out>, access=0x7f3561e52c90, mr=0x7f356351bc80, attrs=...) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/memory.c:506
#13 0x7f3561e53d5b in memory_region_dispatch_write (mr=mr@entry=0x7f356351bc80, addr=0, data=128, size=size@entry=2, attrs=attrs@entry=...) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/memory.c:1171
#14 0x7f3561e1fc51 in address_space_rw (as=<optimized out>, addr=268500992, attrs=..., buf=buf@entry=0x7f35572d94c0 "\200", len=2, is_write=is_write@entry=true) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/exec.c:2445
#15 0x7f3561e1fda0 in address_space_write (len=<optimized out>, buf=0x7f35572d94c0 "\200", attrs=..., addr=<optimized out>, as=<optimized out>) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/exec.c:2521
#16 subpage_write (opaque=<optimized out>, addr=<optimized out>, value=<optimized out>, len=<optimized out>, attrs=...) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/exec.c:2081
#17 0x7f3561e5227d in access_with_adjusted_size (addr=addr@entry=0, value=value@entry=0x7f35572d9568, size=size@entry=2, access_size_min=<optimized out>, access_size_max=<optimized out>, access=0x7f3561e521a0, mr=0x7f356375e360, attrs=...) at /media/build1/poky/build/tmp-lsb/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/memory.c:506
#18 0x7f3561e53d5b in memory_region_dispatch_write (mr=0x7f356375e360, addr=0, data=128, size=2, attrs=...) at /media/build1/poky/build/tmp-l
Re: [Qemu-devel] Segfault using qemu-system-arm in smc91c111
On Sun, 2015-09-06 at 17:48 -0700, Peter Crosthwaite wrote: > On Sun, Sep 6, 2015 at 4:26 PM, Richard Purdie > wrote: > > On Sun, 2015-09-06 at 11:37 -0700, Peter Crosthwaite wrote: > > It seems can_receive isn't enough, we'd need to put some checks into > > receive itself since once can_receive says "yes", multiple packets can > > arrive to _receive without further checks of can_receive. > > This doesn't sound right. There are other network controllers that > rely of can_receive catching all cases properly. Is this a regression? > Looking at logs, I see some refactoring of QEMU net framework around > June timeframe, if you rewind to QEMU 2.3 (or earlier) does the bug go > away? We weren't seeing this problem until we upgraded to 2.4. > > > I've either > > messed up my previous test or been lucky. > > > > I tested an assert in _recieve() which confirms it can be called when > > can_receive() says it isn't ready. > > > > A backtrace of this would be handy. > > What is your replicator? I have core-image-minimal handy but it has no > scp or sshd so all I can think of to stress network is wget, but that > doesn't replicate. I've been using a core-image-sato and using the "bitbake core-image-sato -c testimage" which runs a set of tests against the target image. It invariably crashes on the scp test when I put an assert in receive(). To make it simpler, if I just "runqemu qemuarm" to boot the core-image-sato, then scp a 5MB file consisting of zeros into the image, it hits the assert after around 2% transferred. Cheers, Richard
Re: [Qemu-devel] Segfault using qemu-system-arm in smc91c111
On Sun, 2015-09-06 at 11:37 -0700, Peter Crosthwaite wrote: > On Sun, Sep 6, 2015 at 7:21 AM, Richard Purdie > wrote: > > On Sat, 2015-09-05 at 13:30 -0700, Peter Crosthwaite wrote: > >> On Fri, Sep 4, 2015 at 10:30 AM, Peter Maydell > >> wrote: > >> > On 4 September 2015 at 18:20, Richard Purdie > >> > wrote: > >> >> On Fri, 2015-09-04 at 13:43 +0100, Richard Purdie wrote: > >> >>> On Fri, 2015-09-04 at 12:31 +0100, Peter Maydell wrote: > >> >>> > On 4 September 2015 at 12:24, Richard Purdie > >> >>> > wrote: > >> >>> > > So just based on that, yes, seems that the rx_fifo looks to be > >> >>> > > overrunning. I can add the asserts but I think it would just > >> >>> > > confirm > >> >>> > > this. > >> >>> > > >> >>> > Yes, the point of adding assertions is to confirm a hypothesis. > >> >>> > >> >>> I've now confirmed that it does indeed trigger the assert in > >> >>> smc91c111_receive(). > >> >> > >> >> I just tried an experiment where I put: > >> >> > >> >> if (s->rx_fifo_len >= NUM_PACKETS) > >> >> return -1; > >> >> > >> >> into smc91c111_receive() and my reproducer stops reproducing the > >> >> problem. > >> > >> Does it just stop the crash or does it eliminate the problem > >> completely with a fully now-working network? > > > > It stops the crash, the network works great. > > > >> >> I also noticed can_receive() could also have a check on buffer > >> >> availability. Would one of these changes be the correct fix here? > >> > > >> > The interesting question is why smc91c111_allocate_packet() doesn't > >> > fail in this situation. We only have NUM_PACKETS worth of storage, > >> > shared between the tx and rx buffers, so how could we both have > >> > already filled the rx_fifo and have a spare packet for the allocate > >> > function to return? > >> > >> Maybe this: > >> > >> case 5: /* Release. */ > >> smc91c111_release_packet(s, s->packet_num); > >> break; > >> > >> The guest is able to free an allocated packet without the accompanying > >> pop of tx/rx fifo. This may suggest some sort of guest error? > >> > >> The fix depends on the behaviour of the real hardware. If that MMIO op > >> is supposed to dequeue the corresponding queue entry then we may need > >> to patch that logic to do search the queues and dequeue it. Otherwise > >> we need to find out the genuine length of the rx queue, and clamp it > >> without something like Richards patch. There are a few other bits and > >> pieces that suggest the guest can have independent control of the > >> queues and allocated buffers but i'm confused as to how the rx fifo > >> length can get up to 10 in any case. > > > > I think I have a handle on what is going on. smc91c111_release_packet() > > changes s->allocated() but not rx_fifo. can_receive() only looks at > > s->allocated. We can trigger new network packets to arrive from > > smc91c111_release_packet() which calls qemu_flush_queued_packets() > > *before* we change rx_fifo and this can loop. > > > > The patch below which explicitly orders the qemu_flush_queued_packets() > > call resolved the test case I was able to reproduce this problem in. > > > > So there are three ways to fix this, either can_receive() needs to check > > both s->allocated() and rx_fifo, > > This is probably the winner for me. > > > or the code is more explicit about when > > qemu_flush_queued_packets() is called (as per my patch below), or the > > case 4 where smc91c111_release_packet() and then > > smc91c111_pop_rx_fifo(s) is called is reversed. 
> > I also tested the latter which also works, albeit with more ugly code.

It seems can_receive isn't enough; we'd need to put some checks into receive itself since once can_receive says "yes", multiple packets can arrive at _receive without further checks of can_receive. I've either messed up my previous test or been lucky.

I tested an assert in _receive() which confirms it can be called when can_receive() says it isn't ready.

If we return -1 in _receive, the code will stop sending packets and all works as it should; it recovers just fine. So I think that is looking like the correct fix. I'd note that it already effectively has half this check in the allocate_packet call, it's just missing the rx_fifo_len one.

Cheers,

Richard
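For reference, the shape of that fix as a fragment against hw/net/smc91c111.c (placement and context lines from memory, so treat this as a sketch rather than the final patch):

--- a/hw/net/smc91c111.c
+++ b/hw/net/smc91c111.c
@@ static ssize_t smc91c111_receive(NetClientState *nc, const uint8_t *buf, size_t size)
     smc91c111_state *s = qemu_get_nic_opaque(nc);
+
+    /* A free packet buffer alone is not enough: the RX FIFO must also
+     * have room. Returning -1 makes the net layer queue the packet
+     * and retry once the guest has popped an entry. */
+    if (s->rx_fifo_len >= NUM_PACKETS) {
+        return -1;
+    }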
Re: [Qemu-devel] Segfault using qemu-system-arm in smc91c111
On Sat, 2015-09-05 at 13:30 -0700, Peter Crosthwaite wrote: > On Fri, Sep 4, 2015 at 10:30 AM, Peter Maydell > wrote: > > On 4 September 2015 at 18:20, Richard Purdie > > wrote: > >> On Fri, 2015-09-04 at 13:43 +0100, Richard Purdie wrote: > >>> On Fri, 2015-09-04 at 12:31 +0100, Peter Maydell wrote: > >>> > On 4 September 2015 at 12:24, Richard Purdie > >>> > wrote: > >>> > > So just based on that, yes, seems that the rx_fifo looks to be > >>> > > overrunning. I can add the asserts but I think it would just confirm > >>> > > this. > >>> > > >>> > Yes, the point of adding assertions is to confirm a hypothesis. > >>> > >>> I've now confirmed that it does indeed trigger the assert in > >>> smc91c111_receive(). > >> > >> I just tried an experiment where I put: > >> > >> if (s->rx_fifo_len >= NUM_PACKETS) > >> return -1; > >> > >> into smc91c111_receive() and my reproducer stops reproducing the > >> problem. > > Does it just stop the crash or does it eliminate the problem > completely with a fully now-working network?

It stops the crash; the network works great.

> >> I also noticed can_receive() could also have a check on buffer > >> availability. Would one of these changes be the correct fix here? > > > > The interesting question is why smc91c111_allocate_packet() doesn't > > fail in this situation. We only have NUM_PACKETS worth of storage, > > shared between the tx and rx buffers, so how could we both have > > already filled the rx_fifo and have a spare packet for the allocate > > function to return? > > Maybe this: > > case 5: /* Release. */ > smc91c111_release_packet(s, s->packet_num); > break; > > The guest is able to free an allocated packet without the accompanying > pop of tx/rx fifo. This may suggest some sort of guest error? > > The fix depends on the behaviour of the real hardware. If that MMIO op > is supposed to dequeue the corresponding queue entry then we may need > to patch that logic to do search the queues and dequeue it. Otherwise > we need to find out the genuine length of the rx queue, and clamp it > without something like Richards patch. There are a few other bits and > pieces that suggest the guest can have independent control of the > queues and allocated buffers but i'm confused as to how the rx fifo > length can get up to 10 in any case.

I think I have a handle on what is going on. smc91c111_release_packet() changes s->allocated but not rx_fifo. can_receive() only looks at s->allocated. We can trigger new network packets to arrive from smc91c111_release_packet() which calls qemu_flush_queued_packets() *before* we change rx_fifo, and this can loop.

The patch below, which explicitly orders the qemu_flush_queued_packets() call, resolved the test case I was able to reproduce this problem in.

So there are three ways to fix this: either can_receive() needs to check both s->allocated and rx_fifo, or the code is more explicit about when qemu_flush_queued_packets() is called (as per my patch below), or the case 4 where smc91c111_release_packet() and then smc91c111_pop_rx_fifo(s) is called is reversed. I also tested the latter, which also works, albeit with more ugly code.

The problem is much more reproducible with the assert btw; booting a qemu image with this and hitting the network interface with scp of a few large files is usually enough.

So which patch would be preferred?
:)

Cheers,

Richard

Index: qemu-2.4.0/hw/net/smc91c111.c
===================================================================
--- qemu-2.4.0.orig/hw/net/smc91c111.c
+++ qemu-2.4.0/hw/net/smc91c111.c
@@ -185,7 +185,6 @@ static void smc91c111_release_packet(smc
     s->allocated &= ~(1 << packet);
     if (s->tx_alloc == 0x80)
         smc91c111_tx_alloc(s);
-    qemu_flush_queued_packets(qemu_get_queue(s->nic));
 }

 /* Flush the TX FIFO.  */
@@ -237,9 +236,11 @@ static void smc91c111_do_tx(smc91c111_st
         }
     }
 #endif
-    if (s->ctr & CTR_AUTO_RELEASE)
+    if (s->ctr & CTR_AUTO_RELEASE) {
         /* Race?  */
         smc91c111_release_packet(s, packetnum);
+        qemu_flush_queued_packets(qemu_get_queue(s->nic));
+    }
     else if (s->tx_fifo_done_len < NUM_PACKETS)
         s->tx_fifo_done[s->tx_fifo_done_len++] = packetnum;
     qemu_send_packet(qemu_get_queue(s->nic), p, len);
@@ -379,9 +380,11 @@ static void smc91c111_writeb(void *opaqu
             smc91c111_release_packet(s, s->rx_fifo[0]);
         }
         smc91c111_pop_rx_fifo(s);
+        qemu_flush_queued_packets(qemu_get_queue(s->nic));
         break;
     case 5: /* Release.  */
         smc91c111_release_packet(s, s->packet_num);
+        qemu_flush_queued_packets(qemu_get_queue(s->nic));
         break;
     case 6: /* Add to TX FIFO.  */
         smc91c111_queue_tx(s, s->packet_num);
Re: [Qemu-devel] Segfault using qemu-system-arm in smc91c111
On Fri, 2015-09-04 at 13:43 +0100, Richard Purdie wrote: > On Fri, 2015-09-04 at 12:31 +0100, Peter Maydell wrote: > > On 4 September 2015 at 12:24, Richard Purdie > > wrote: > > > So just based on that, yes, seems that the rx_fifo looks to be > > > overrunning. I can add the asserts but I think it would just confirm > > > this. > > > > Yes, the point of adding assertions is to confirm a hypothesis. > > I've now confirmed that it does indeed trigger the assert in > smc91c111_receive().

I just tried an experiment where I put:

if (s->rx_fifo_len >= NUM_PACKETS)
    return -1;

into smc91c111_receive() and my reproducer stops reproducing the problem.

I also noticed can_receive() could also have a check on buffer availability. Would one of these changes be the correct fix here?

(Still working on a reproducer; ended up fixing the other test continuation issues so the failure is more obvious.)

Cheers,

Richard
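The can_receive() variant mentioned there might look like this. A sketch only: the soft-reset/RXEN handling is paraphrased from the 2.4-era source rather than copied, so the exact register tests may differ:

static int smc91c111_can_receive(NetClientState *nc)
{
    smc91c111_state *s = qemu_get_nic_opaque(nc);

    /* While reset or disabled the device silently discards packets,
     * so claim readiness and drop them in receive(). */
    if ((s->rcr & RCR_SOFT_RST) || !(s->rcr & RCR_RXEN)) {
        return 1;
    }
    /* Ready only if both a packet buffer and an RX FIFO slot exist. */
    return s->allocated != (1 << NUM_PACKETS) - 1 &&
           s->rx_fifo_len < NUM_PACKETS;
}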
Re: [Qemu-devel] Segfault using qemu-system-arm in smc91c111
On Fri, 2015-09-04 at 12:31 +0100, Peter Maydell wrote: > On 4 September 2015 at 12:24, Richard Purdie > wrote: > > So just based on that, yes, seems that the rx_fifo looks to be > > overrunning. I can add the asserts but I think it would just confirm > > this. > > Yes, the point of adding assertions is to confirm a hypothesis. I've now confirmed that it does indeed trigger the assert in smc91c111_receive(). > >> Also, do you have a more specific reproduce case so I can try > >> to replicate the problem here? > > > > Not sure how familiar you are with the yocto project? > > I don't really know about the Yocto project, no. What I need > is a set of instructions I can follow ("download this, run this > command line", etc) so I can reproduce it on my machine. I'm running some tests at the moment to see which environments this is reproducible in and will try and distil reproduction steps based on that. Cheers, Richard
Re: [Qemu-devel] Segfault using qemu-system-arm in smc91c111
On Fri, 2015-09-04 at 11:45 +0100, Peter Maydell wrote: > On 4 September 2015 at 11:25, Richard Purdie > wrote: > > We're seeing repeated segfaults in qemu-system-arm when we heavily use > > the network. I have a coredump backtrace: > > > (gdb) print s->tx_fifo_done > > $1 = {99614720, 99614720, 99614720, 99614720} > > (gdb) print s->tx_fifo_done_len > > $2 = 99614719 > > > > so it looks like tx_fifo_done_len has been corrupted, going beyond that > > is harder for me to figure out. Does anyone happen to know what might be > > going on here? This is with qemu 2.4.0. > > That would suggest the rx_fifo buffer is overrunning (assuming > none of the other fields in the struct look like they've > been corrupted). Can you try putting > assert(s->rx_fifo_len < NUM_PACKETS); > before > s->rx_fifo[s->rx_fifo_len++] = packetnum; > in smc91c111_receive(), and see if you hit that assertion?

(gdb) print s->tx_fifo_len
$2 = 0
(gdb) print s->rx_fifo_len
$3 = 10

So just based on that, yes, it seems that the rx_fifo is overrunning. I can add the asserts but I think it would just confirm this.

> Also, do you have a more specific reproduce case so I can try > to replicate the problem here?

Not sure how familiar you are with the yocto project? Basically we build a core-image-sato rootfs image, then boot it under qemu and run some tests against it. This seems to reliably fail for arm, particularly on our debian8 autobuilder for reasons as yet unknown. The build logs for a couple of example failures are:

https://autobuilder.yoctoproject.org/main/builders/nightly-oecore/builds/477
https://autobuilder.yoctoproject.org/main/builders/nightly-arm-lsb/builds/465

There is an issue where the tests don't stop running after qemu segfaults; they continue to try and connect to it, which is an issue we'll work on separately. There is a segfault/coredump showing the same backtrace for both the above builds.

So if you had an OE build environment, you could download (or build) a core-image-sato, then just run the tests against it (bitbake core-image-sato -c testimage). We've yet to figure out exactly which environments trigger it but it does seem to fail fairly regularly (>50% of the time) when running these tests.

I appreciate it's not exactly an easy reproducer but the setup is designed to be replicated and you did ask! :)

Cheers,

Richard
[Qemu-devel] Segfault using qemu-system-arm in smc91c111
We're seeing repeated segfaults in qemu-system-arm when we heavily use the network. I have a coredump backtrace:

Reading symbols from /home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/sysroots/x86_64-linux/usr/bin/qemu-system-arm...done.
[New LWP 4536]
[New LWP 4534]
[New LWP 4530]
[New LWP 4537]
[New LWP 6396]
warning: Corrupted shared library list: 0x7f8d5f27e540 != 0x619822507f8d
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  smc91c111_pop_tx_fifo_done (s=0x7f8d6158b560) at /home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/hw/net/smc91c111.c:179
179    s->tx_fifo_done[i] = s->tx_fifo_done[i + 1];
(gdb) bt
#0  smc91c111_pop_tx_fifo_done (s=0x7f8d6158b560) at /home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/hw/net/smc91c111.c:179
#1  smc91c111_writeb (opaque=0x7f8d6158b560, offset=12, value=<optimized out>) at /home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/hw/net/smc91c111.c:431
#2  0x7f8d5ecacd65 in memory_region_oldmmio_write_accessor (mr=<optimized out>, addr=<optimized out>, value=<optimized out>, size=<optimized out>, shift=<optimized out>, mask=<optimized out>, attrs=...) at /home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/memory.c:434
#3  0x7f8d5ecac5dd in access_with_adjusted_size (addr=140245200319840, addr@entry=12, value=0xc, value@entry=0x7f8d52ac63e8, size=1, access_size_min=2031671516, access_size_max=32, access=0x7f8d5ecacd30, mr=0x7f8d6158f8f0, attrs=...) at /home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/memory.c:506
#4  0x7f8d5ecae08b in memory_region_dispatch_write (mr=mr@entry=0x7f8d6158f8f0, addr=12, data=2, size=size@entry=1, attrs=attrs@entry=...) at /home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/memory.c:1171
#5  0x7f8d5ec7b78f in address_space_rw (as=0x7f8d5f408600, addr=268501004, attrs=..., buf=buf@entry=0x7f8d52ac64b0 "\002", len=1, is_write=is_write@entry=true) at /home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/exec.c:2451
#6  0x7f8d5ec7b9e0 in address_space_write (len=<optimized out>, buf=0x7f8d52ac64b0 "\002", attrs=..., addr=<optimized out>, as=<optimized out>) at /home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/exec.c:2521
#7  subpage_write (opaque=<optimized out>, addr=<optimized out>, value=<optimized out>, len=<optimized out>, attrs=...) at /home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/exec.c:2081
#8  0x7f8d5ecac5dd in access_with_adjusted_size (addr=140245200319840, addr@entry=12, value=0xc, value@entry=0x7f8d52ac6558, size=1, access_size_min=2031671516, access_size_max=32, access=0x7f8d5ecac500, mr=0x7f8d618d5750, attrs=...) at /home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/memory.c:506
#9  0x7f8d5ecae08b in memory_region_dispatch_write (mr=0x7f8d618d5750, addr=12, data=2, size=1, attrs=...) at /home/pokybuild/yocto-autobuilder/yocto-worker/nightly-arm-lsb/build/build/tmp/work/x86_64-linux/qemu-native/2.4.0-r1/qemu-2.4.0/memory.c:1171
#10 0x7f8d5584b512 in ?? ()
(gdb) print s->tx_fifo_done
$1 = {99614720, 99614720, 99614720, 99614720}
(gdb) print s->tx_fifo_done_len
$2 = 99614719

so it looks like tx_fifo_done_len has been corrupted; going beyond that is harder for me to figure out. Does anyone happen to know what might be going on here? This is with qemu 2.4.0.

Cheers,

Richard
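Those values are consistent with the RX FIFO array overrunning into its neighbours. In outline (field order abbreviated from hw/net/smc91c111.c; treat the exact layout as an assumption):

typedef struct {
    /* ... */
    int rx_fifo_len;
    int rx_fifo[NUM_PACKETS];       /* NUM_PACKETS == 4 */
    int tx_fifo_done_len;           /* clobbered by a write to rx_fifo[4] */
    int tx_fifo_done[NUM_PACKETS];  /* clobbered by rx_fifo[5..8] */
    /* ... */
} smc91c111_state;

/* With rx_fifo_len already at NUM_PACKETS, the push in
 * smc91c111_receive():
 *
 *     s->rx_fifo[s->rx_fifo_len++] = packetnum;
 *
 * stores past the end of rx_fifo and corrupts whatever follows,
 * matching the rx_fifo_len of 10 seen elsewhere in this thread. */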
[Qemu-devel] [PATCH] hw/usb/wacom: Add missing HID descriptor
The USB wacom device is missing a HID descriptor, which causes it to fail to operate with recent kernels (e.g. 3.17).

This patch adds a HID descriptor to the device, based upon one from a real wacom device, and allows the device to work properly again.

Signed-off-by: Richard Purdie

Index: qemu-2.1.0/hw/usb/dev-wacom.c
===================================================================
--- qemu-2.1.0.orig/hw/usb/dev-wacom.c	2014-08-01 15:12:17.000000000 +0100
+++ qemu-2.1.0/hw/usb/dev-wacom.c	2014-10-12 12:13:30.540306042 +0100
@@ -68,6 +68,89 @@
     [STR_SERIALNUMBER] = "1",
 };

+static const uint8_t qemu_tablet_hid_report_descriptor[] = {
+    0x05, 0x01,       /* Usage Page (Generic Desktop) */
+    0x09, 0x02,       /* Usage (Mouse) */
+    0xa1, 0x01,       /* Collection (Application) */
+    0x85, 0x01,       /* Report ID (1) */
+    0x09, 0x01,       /* Usage (Pointer) */
+    0xa1, 0x00,       /* Collection (Physical) */
+    0x05, 0x09,       /* Usage Page (Button) */
+    0x19, 0x01,       /* Usage Minimum (1) */
+    0x29, 0x05,       /* Usage Maximum (5) */
+    0x15, 0x00,       /* Logical Minimum (0) */
+    0x25, 0x01,       /* Logical Maximum (1) */
+    0x95, 0x05,       /* Report Count (5) */
+    0x75, 0x01,       /* Report Size (1) */
+    0x81, 0x02,       /* Input (Data, Variable, Absolute) */
+    0x95, 0x01,       /* Report Count (1) */
+    0x75, 0x03,       /* Report Size (3) */
+    0x81, 0x01,       /* Input (Constant) */
+    0x05, 0x01,       /* Usage Page (Generic Desktop) */
+    0x09, 0x30,       /* Usage (X) */
+    0x09, 0x31,       /* Usage (Y) */
+    0x15, 0x81,       /* Logical Minimum (-127) */
+    0x25, 0x7f,       /* Logical Maximum (127) */
+    0x75, 0x08,       /* Report Size (8) */
+    0x95, 0x02,       /* Report Count (2) */
+    0x81, 0x06,       /* Input (Data, Variable, Relative) */
+    0xc0,             /* End Collection */
+    0xc0,             /* End Collection */
+    0x05, 0x0d,       /* Usage Page (Digitizer) */
+    0x09, 0x01,       /* Usage (Digitizer) */
+    0xa1, 0x01,       /* Collection (Application) */
+    0x85, 0x02,       /* Report ID (2) */
+    0xa1, 0x00,       /* Collection (Physical) */
+    0x06, 0x00, 0xff, /* Usage Page (Vendor 0xff00) */
+    0x09, 0x01,       /* Usage (Digitizer) */
+    0x15, 0x00,       /* Logical Minimum (0) */
+    0x26, 0xff, 0x00, /* Logical Maximum (255) */
+    0x75, 0x08,       /* Report Size (8) */
+    0x95, 0x08,       /* Report Count (8) */
+    0x81, 0x02,       /* Input (Data, Variable, Absolute) */
+    0xc0,             /* End Collection */
+    0x09, 0x01,       /* Usage (Digitizer) */
+    0x85, 0x02,       /* Report ID (2) */
+    0x95, 0x01,       /* Report Count (1) */
+    0xb1, 0x02,       /* FEATURE (2) */
+    0xc0,             /* End Collection */
+    0x06, 0x00, 0xff, /* Usage Page (Vendor 0xff00) */
+    0x09, 0x01,       /* Usage (Digitizer) */
+    0xa1, 0x01,       /* Collection (Application) */
+    0x85, 0x02,       /* Report ID (2) */
+    0x05, 0x0d,       /* Usage Page (Digitizer) */
+    0x09, 0x22,       /* Usage (Finger) */
+    0xa1, 0x00,       /* Collection (Physical) */
+    0x06, 0x00, 0xff, /* Usage Page (Vendor 0xff00) */
+    0x09, 0x01,       /* Usage (Digitizer) */
+    0x15, 0x00,       /* Logical Minimum (0) */
+    0x26, 0xff, 0x00, /* Logical Maximum */
+    0x75, 0x08,       /* Report Size (8) */
+    0x95, 0x02,       /* Report Count (2) */
+    0x81, 0x02,       /* Input (Data, Variable, Absolute) */
+    0x05, 0x01,       /* Usage Page (Generic Desktop) */
+    0x09, 0x30,       /* Usage (X) */
+    0x35, 0x00,       /* Physical Minimum */
+    0x46, 0xe0, 0x2e, /* Physical Maximum */
+    0x26, 0xe0, 0x01, /* Logical Maximum */
+    0x75, 0x10,       /* Report Size (16) */
+    0x95, 0x01,       /* Report Count (1) */
+    0x81, 0x02,       /* Input (Data, Variable, Absolute) */
+    0x09, 0x31,       /* Usage (Y) */
+    0x46, 0x40, 0x1f, /* Physical Maximum */
+    0x26, 0x40, 0x01, /* Logical Maximum */
+    0x81, 0x02,       /* Input (Data, Variable, Absolute) */
+    0x06, 0x00, 0xff, /* Usage Page (Vendor 0xff00) */
+    0x09, 0x01,       /* Usage (Digitizer) */
+    0x26, 0xff, 0x00, /* Logical Maximum */
+
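A descriptor like this only takes effect if the device also answers the HID GET_DESCRIPTOR request for report type 0x22. As a hedged sketch, modelled on QEMU's hw/usb/dev-hid.c rather than copied from the wacom patch itself, the relevant case in usb_wacom_handle_control() would be along these lines:

    case InterfaceRequest | USB_REQ_GET_DESCRIPTOR:
        switch (value >> 8) {
        case 0x22:  /* HID report descriptor */
            memcpy(data, qemu_tablet_hid_report_descriptor,
                   sizeof(qemu_tablet_hid_report_descriptor));
            p->actual_length = sizeof(qemu_tablet_hid_report_descriptor);
            break;
        default:
            goto fail;  /* other descriptor types handled as before */
        }
        break;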
Re: [Qemu-devel] [BUG/PATCH] Fix i386 SSE status flag corruption
On Mon, 2013-09-30 at 22:20 +0100, Richard Purdie wrote: > This is a combination of bug report and patch. I'm not sure if you'll want to fix it > like this but it illustrates the problem and should be easy to fix based on > this. > > When we restore the mxcsr register with FXRSTOR, we need to update the various SSE > status flags in CPUX86State by calling update_sse_status(). If we don't, we end up > using the status bits from some other context with interesting results. > > I used a function prototype since it makes the fix clear, some code rearrangement > might be needed ultimately. > > Signed-off-by: Richard Purdie

Ping? This seems to be quite nasty SSE register corruption and it would be nice to get it fixed. Happy to rework the patch if needed, am just not sure the best way to handle moving the function if that is what is needed.

Cheers,

Richard

> Index: qemu-1.5.0/target-i386/fpu_helper.c
> ===================================================================
> --- qemu-1.5.0.orig/target-i386/fpu_helper.c	2013-09-30 18:46:39.283377648 +0000
> +++ qemu-1.5.0/target-i386/fpu_helper.c	2013-09-30 18:46:56.895377232 +0000
> @@ -1149,6 +1149,8 @@
>      }
>  }
>
> +static void update_sse_status(CPUX86State *env);
> +
>  void helper_fxrstor(CPUX86State *env, target_ulong ptr, int data64)
>  {
>      int i, fpus, fptag, nb_xmm_regs;
> @@ -1180,6 +1182,7 @@
>      if (env->cr[4] & CR4_OSFXSR_MASK) {
>          /* XXX: finish it */
>          env->mxcsr = cpu_ldl_data(env, ptr + 0x18);
> +        update_sse_status(env);
>          /* cpu_ldl_data(env, ptr + 0x1c); */
>          if (env->hflags & HF_CS64_MASK) {
>              nb_xmm_regs = 16;
[Qemu-devel] [BUG/PATCH] Fix i386 SSE status flag corruption
This is a combination of bug report and patch. I'm not sure if you'll want to fix it like this, but it illustrates the problem and should be easy to fix based on this.

When we restore the mxcsr register with FXRSTOR, we need to update the various SSE status flags in CPUX86State by calling update_sse_status(). If we don't, we end up using the status bits from some other context with interesting results.

I used a function prototype since it makes the fix clear; some code rearrangement might be needed ultimately.

Signed-off-by: Richard Purdie

Index: qemu-1.5.0/target-i386/fpu_helper.c
===================================================================
--- qemu-1.5.0.orig/target-i386/fpu_helper.c	2013-09-30 18:46:39.283377648 +0000
+++ qemu-1.5.0/target-i386/fpu_helper.c	2013-09-30 18:46:56.895377232 +0000
@@ -1149,6 +1149,8 @@
     }
 }

+static void update_sse_status(CPUX86State *env);
+
 void helper_fxrstor(CPUX86State *env, target_ulong ptr, int data64)
 {
     int i, fpus, fptag, nb_xmm_regs;
@@ -1180,6 +1182,7 @@
     if (env->cr[4] & CR4_OSFXSR_MASK) {
         /* XXX: finish it */
         env->mxcsr = cpu_ldl_data(env, ptr + 0x18);
+        update_sse_status(env);
         /* cpu_ldl_data(env, ptr + 0x1c); */
         if (env->hflags & HF_CS64_MASK) {
             nb_xmm_regs = 16;
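For context, update_sse_status() is what rebuilds the softfloat state from mxcsr, so skipping it after the FXRSTOR load leaves env->sse_status carrying another context's rounding and flush-to-zero settings. Paraphrased from memory of target-i386/fpu_helper.c (not a verbatim copy), it does roughly:

static void update_sse_status(CPUX86State *env)
{
    int rnd_type;

    /* The RC bits of mxcsr select the softfloat rounding mode. */
    switch (env->mxcsr & SSE_RC_MASK) {
    default:
    case SSE_RC_NEAR:
        rnd_type = float_round_nearest_even;
        break;
    case SSE_RC_DOWN:
        rnd_type = float_round_down;
        break;
    case SSE_RC_UP:
        rnd_type = float_round_up;
        break;
    case SSE_RC_CHOP:
        rnd_type = float_round_to_zero;
        break;
    }
    set_float_rounding_mode(rnd_type, &env->sse_status);

    /* Denormals-are-zero and flush-to-zero follow the DAZ/FZ bits. */
    set_flush_inputs_to_zero((env->mxcsr & SSE_DAZ) ? 1 : 0,
                             &env->sse_status);
    set_flush_to_zero((env->mxcsr & SSE_FZ) ? 1 : 0, &env->sse_status);
}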
[Qemu-devel] Fix writev syscall emulation
Hi,

OpenEmbedded/Poky use qemu for locale generation when cross compiling. When we upgraded to qemu 0.9.1 it started giving locale generation errors on all 64 bit machines and some 32 bit ones.

I've traced it to the writev syscall failing. localedef passes several { NULL, 0 } iovec entries which trip up lock_iovec(). That function is returning an error for these, but the return value isn't checked. The syscall is therefore always made, but sometimes with an iovec that has only been half copied. If the total writes exceed size_t, EINVAL is returned by the kernel and glibc code emulates the call with a write, which is most likely to happen on 32 bit, so it sometimes works. size_t is unlikely to be exceeded on 64 bit, so that returns an EFAULT and always corrupts/fails.

Anyhow, it seems 0 length iovec entries are allowed and we shouldn't care about the addresses in those cases. The patch below is one way to fix this. Ideally the return value of lock_iovec() needs to be checked too.

Regards,

Richard

---
 linux-user/syscall.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: qemu-0.9.1/linux-user/syscall.c
===================================================================
--- qemu-0.9.1.orig/linux-user/syscall.c	2008-02-03 00:00:00.000000000 +0000
+++ qemu-0.9.1/linux-user/syscall.c	2008-02-03 00:00:38.000000000 +0000
@@ -1048,7 +1048,7 @@ static abi_long lock_iovec(int type, str
         base = tswapl(target_vec[i].iov_base);
         vec[i].iov_len = tswapl(target_vec[i].iov_len);
         vec[i].iov_base = lock_user(type, base, vec[i].iov_len, copy);
-        if (!vec[i].iov_base)
+        if (!vec[i].iov_base && vec[i].iov_len)
             goto fail;
     }
     unlock_user (target_vec, target_addr, 0);
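On the "return value needs to be checked too" point, the caller-side fix might look roughly like this. This is a sketch against the 0.9.1-era syscall.c: the lock_iovec() argument order (type, host vector, target address, count, copy flag) is inferred from the function above, so treat the exact signature and variable names as assumptions:

    case TARGET_NR_writev:
        {
            int count = arg3;
            struct iovec vec[count];

            /* Refuse the syscall rather than handing the host a
             * half-initialised vector. */
            if (lock_iovec(VERIFY_READ, vec, arg2, count, 1) < 0) {
                ret = -TARGET_EFAULT;
                break;
            }
            ret = get_errno(writev(arg1, vec, count));
            unlock_iovec(vec, arg2, count, 0);
        }
        break;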