Re: [Qemu-devel] master: intermittent acpi-test failures
On 12 January 2015 at 17:56, Peter Maydell peter.mayd...@linaro.org wrote:
> On 30 November 2014 at 15:12, Michael S. Tsirkin m...@redhat.com wrote:
>> On Sat, Nov 29, 2014 at 05:39:01PM +, Peter Maydell wrote:
>>> On 29 November 2014 at 17:36, Michael S. Tsirkin m...@redhat.com wrote:
>>>> My guess is VM fails to boot from disk for some reason.
>>>> Could you trigger a screenshot after this happens?
>>>
>>> Sure, if you can provide instructions (this is all from
>>> make check so there's no display by default and extracting
>>> a standalone qemu command line from make check is pretty
>>> tedious IME).
>>
>> It's probably easiest to simply drop -nographic from test code
>> to run with a display.
>
> OK, I did this, and the result is that there is just a black
> screen with no graphic output ever.

This turns out to be a bug in RTH's patchset which I was testing
which coincidentally had the same symptoms.

-- PMM
Re: [Qemu-devel] master: intermittent acpi-test failures
On 12 January 2015 at 19:49, Peter Maydell peter.mayd...@linaro.org wrote:
> This turns out to be a bug in RTH's patchset which I was testing
> which coincidentally had the same symptoms.

More generally, I think this ACPI test is the first (only?) time
we try to actually run guest code in make check, so if we break
64-on-32 TCG then it manifests as this ACPI test times out...

-- PMM
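For readers unfamiliar with the term: "64-on-32" here means a 64-bit guest emulated on a 32-bit host, where each 64-bit guest value has to be carried as two 32-bit halves. The following is only an illustrative C sketch of that idea, not QEMU code; the names pair32, add2_32 and join are made up for this example.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, as a 32-bit host would hold it. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} pair32;

/* Add two split 64-bit values, propagating the carry out of the low half. */
static pair32 add2_32(pair32 a, pair32 b)
{
    pair32 r;
    r.lo = a.lo + b.lo;                  /* may wrap around */
    r.hi = a.hi + b.hi + (r.lo < a.lo);  /* r.lo < a.lo means the low add carried */
    return r;
}

/* Rejoin the halves so the result can be compared with a native 64-bit add. */
static uint64_t join(pair32 v)
{
    return ((uint64_t)v.hi << 32) | v.lo;
}
```

Any mistake in how the halves are produced or combined leaves guest registers with wrong contents, which is the kind of breakage that only shows up once real guest code runs.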
Re: [Qemu-devel] master: intermittent acpi-test failures
On 12 January 2015 at 18:08, Peter Maydell peter.mayd...@linaro.org wrote:
> So we're just sat in a loop which never finishes. This seems to
> be because the first time in to it we set the loop counter EBP
> to 0x5b207801.

Looking further up the trace we seem to be mistranslating movsbl:

IN:
0x000f195e:  movsbl (%ebx),%eax
0x000f1961:  lea    -0x30(%eax),%edx
0x000f1964:  cmp    $0x9,%dl
0x000f1967:  ja     0xf1984

OP:
 ld_i32 tmp18,env,$0xfff4
 movi_i32 tmp19,$0x0
 brcond_i32 tmp18,tmp19,ne,$0x0

 0xf195e
 mov_i32 tmp4,rbx_0
 mov_i32 tmp5,rbx_1
 movi_i32 tmp5,$0x0
 qemu_ld_i32 tmp0,tmp4,tmp5,leul,$0x4
 movi_i32 tmp18,$0x1f
 sar_i32 tmp1,tmp0,tmp18
 mov_i32 rax_0,tmp0
 movi_i32 rax_1,$0x0

 0xf1961
 movi_i32 tmp20,$0xffd0
 movi_i32 tmp21,$0x
 add2_i32 tmp4,tmp5,rax_0,rax_1,tmp20,tmp21
 movi_i32 tmp5,$0x0
 mov_i32 rdx_0,tmp4
 movi_i32 rdx_1,$0x0
 [etc]

movsbl should be a signed byte load, but we seem to have emitted
a qemu_ld_i32 tmp0,tmp4,tmp5,leul,$0x4, which is a 32 bit load (leul),
and then sign extended 32-64 bits.

[the insn bytes here are 0x0f 0xbe 0x03.]

-- PMM
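To make the difference concrete, here is a small C sketch contrasting what movsbl is supposed to do (load one byte and sign-extend it) with the 32-bit little-endian load the trace suggests was emitted instead. The function names load_s8 and load_le32 are invented for illustration and are not taken from QEMU.

```c
#include <stdint.h>

/* Intended behaviour of movsbl (%ebx),%eax: one byte, sign-extended to 32 bits. */
static int32_t load_s8(const uint8_t *p)
{
    return (int32_t)(int8_t)p[0];
}

/* Behaviour suggested by the trace: a full 32-bit little-endian load. */
static int32_t load_le32(const uint8_t *p)
{
    uint32_t v = (uint32_t)p[0]
               | ((uint32_t)p[1] << 8)
               | ((uint32_t)p[2] << 16)
               | ((uint32_t)p[3] << 24);
    return (int32_t)v;
}
```

For the bytes 0x30 0x38 0x78 0x00, load_s8 gives 0x30 while load_le32 gives 0x00783830. The low byte is the same in both, so the following cmp $0x9,%dl range check still passes, but any later use of the full register sees the three extra bytes. That would be consistent with a garbage loop counter, though the thread does not spell out that chain.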
Re: [Qemu-devel] master: intermittent acpi-test failures
On Mon, Jan 12, 2015 at 08:04:31PM +, Peter Maydell wrote:
> On 12 January 2015 at 19:49, Peter Maydell peter.mayd...@linaro.org wrote:
>> This turns out to be a bug in RTH's patchset which I was testing
>> which coincidentally had the same symptoms.
>
> More generally, I think this ACPI test is the first (only?) time
> we try to actually run guest code in make check, so if we break
> 64-on-32 TCG then it manifests as this ACPI test times out...
>
> -- PMM

Though I have plans to add tests like that for device ROMs and PXE as well.
So more tests will fail :)

-- MST
Re: [Qemu-devel] master: intermittent acpi-test failures
On 12 January 2015 at 17:56, Peter Maydell peter.mayd...@linaro.org wrote:
> ...but I don't see why that call 0xf106f takes us to f1064,
> which the trace says it does

I think the trace is just confusing. Attaching in gdb we see:

=> 0xf1133:  test   %ebp,%ebp
   0xf1135:  jle    0xf1144
   0xf1137:  mov    (%esp),%edx
   0xf113a:  mov    %esi,%eax
   0xf113c:  call   0xf106f
=> 0xf106f:  mov    %eax,%ecx
   0xf1071:  movsbl %dl,%edx
   0xf1074:  call   *(%ecx)
=> 0xf1064:  mov    %edx,%eax
   0xf1066:  mov    0xf68fc,%dx
   0xf106d:  out    %al,(%dx)
   0xf106e:  ret
=> 0xf1076:  ret
=> 0xf1141:  dec    %ebp
   0xf1142:  jmp    0xf1133

So we're just sat in a loop which never finishes. This seems to
be because the first time in to it we set the loop counter EBP
to 0x5b207801.

-- PMM
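The shape of that loop can be sketched in C as below. This is only a reading of the disassembly above, with the call to 0xf106f (which ends in an out to an I/O port) reduced to a comment; count_iterations is a hypothetical helper that reports how many passes the loop makes for a given starting counter.

```c
#include <stdint.h>

/* Mirrors: test %ebp,%ebp; jle exit; ...; call 0xf106f; dec %ebp; jmp top. */
static uint64_t count_iterations(int32_t counter)
{
    uint64_t n = 0;
    while (counter > 0) {   /* jle leaves the loop once counter <= 0 */
        /* call 0xf106f: writes one byte to an I/O port in the guest code */
        counter--;          /* dec %ebp */
        n++;
    }
    return n;
}
```

With a starting counter of 0x5b207801 the loop needs 1,528,854,529 passes, each of which performs a port write under emulation, so "never finishes" is effectively accurate for a test with a one-minute limit.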
Re: [Qemu-devel] master: intermittent acpi-test failures
On 30 November 2014 at 15:12, Michael S. Tsirkin m...@redhat.com wrote:
> On Sat, Nov 29, 2014 at 05:39:01PM +, Peter Maydell wrote:
>> On 29 November 2014 at 17:36, Michael S. Tsirkin m...@redhat.com wrote:
>>> My guess is VM fails to boot from disk for some reason.
>>> Could you trigger a screenshot after this happens?
>>
>> Sure, if you can provide instructions (this is all from
>> make check so there's no display by default and extracting
>> a standalone qemu command line from make check is pretty
>> tedious IME).
>
> It's probably easiest to simply drop -nographic from test code
> to run with a display.

OK, I did this, and the result is that there is just a black
screen with no graphic output ever.

The guest seems to be stuck in a loop:

Trace 0x74e38bc0 [000f1076]
Trace 0x74e3f430 [000f1141]
Trace 0x74e38a80 [000f1064]
Trace 0x74e38bc0 [000f1076]
Trace 0x74e3f430 [000f1141]
Trace 0x74e38a80 [000f1064]

which I think is:

0x000f1064:  mov    %edx,%eax
0x000f1066:  mov    0xf68fc,%dx
0x000f106d:  out    %al,(%dx)
0x000f106e:  ret

0x000f1076:  ret

0x000f1141:  dec    %ebp
0x000f1142:  jmp    0xf1133

0x000f1133:  test   %ebp,%ebp
0x000f1135:  jle    0xf1144    [not taken]
0x000f1137:  mov    (%esp),%edx
0x000f113a:  mov    %esi,%eax
0x000f113c:  call   0xf106f

...but I don't see why that call 0xf106f takes us to f1064,
which the trace says it does:

Trace 0x74e3f300 [000f1137]
EAX=0030 EBX=0007 ECX=000f64c0 EDX=0402
ESI=000f64c0 EDI=0800 EBP=5b207800 ESP=6f4c
EIP=000f1137 EFL=0006 [-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00cf9300 DPL=0 DS [-WA]
CS =0008 00cf9b00 DPL=0 CS32 [-RA]
SS =0010 00cf9300 DPL=0 DS [-WA]
DS =0010 00cf9300 DPL=0 DS [-WA]
FS =0010 00cf9300 DPL=0 DS [-WA]
GS =0010 00cf9300 DPL=0 DS [-WA]
LDT= 8200 DPL=0 LDT
TR = 8b00 DPL=0 TSS32-busy
GDT= 000f6be8 6be80037
IDT= 000f6c26 6c26
CR0=6011 CR2= CR3= CR4=
DR0= DR1= DR2= DR3=
DR6=0ff0 DR7=0400
CCS=0004 CCD=5b207800 CCO=EFLAGS
EFER=
Trace 0x74e38a80 [000f1064]
EAX=000f64c0 EBX=0007 ECX=000f64c0 EDX=0030
ESI=000f64c0 EDI=0800 EBP=5b207800 ESP=6f44
EIP=000f1064 EFL=0006 [-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00cf9300 DPL=0 DS [-WA]
CS =0008 00cf9b00 DPL=0 CS32 [-RA]
SS =0010 00cf9300 DPL=0 DS [-WA]
DS =0010 00cf9300 DPL=0 DS [-WA]
FS =0010 00cf9300 DPL=0 DS [-WA]
GS =0010 00cf9300 DPL=0 DS [-WA]
LDT= 8200 DPL=0 LDT
TR = 8b00 DPL=0 TSS32-busy
GDT= 000f6be8 6be80037
IDT= 000f6c26 6c26
CR0=6011 CR2= CR3= CR4=
DR0= DR1= DR2= DR3=
DR6=0ff0 DR7=0400
CCS=0004 CCD=5b207800 CCO=EFLAGS
EFER=

Full trace (300MB!) at:
http://people.linaro.org/~peter.maydell/bios-test.log

-- PMM
Re: [Qemu-devel] master: intermittent acpi-test failures
On Sun, Nov 30, 2014 at 05:12:55PM +0200, Michael S. Tsirkin wrote: On Sat, Nov 29, 2014 at 05:39:01PM +, Peter Maydell wrote: On 29 November 2014 at 17:36, Michael S. Tsirkin m...@redhat.com wrote: On Fri, Nov 28, 2014 at 01:34:33PM +, Peter Maydell wrote: These failures are back after a long period of not being a problem :-( My guess is VM fails to boot from disk for some reason. Could you trigger a screenshot after this happens? Sure, if you can provide instructions (this is all from make check so there's no display by default and extracting a standalone qemu command line from make check is pretty tedious IME). -- PMM It's probably easiest to simply drop -nographic from test code to run with a display. To trigger a screenshot, just give screendump /path/to/file on hmp. Another idea is to configure debugging in seabios. -- MST
Re: [Qemu-devel] master: intermittent acpi-test failures
On Fri, Nov 28, 2014 at 01:34:33PM +, Peter Maydell wrote:
> On 27 May 2014 at 22:38, Peter Maydell peter.mayd...@linaro.org wrote:
>> I'm seeing this test failure intermittently on 'make check':
>>
>> ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed (signature == SIGNATURE): (0x == 0xdead)
>> GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053
>>
>> (32 bit ARM host, in case that makes a difference.)
>>
>> Any ideas? It looks from the test as if this may just be that
>> the test is coded to assume a faster machine, which is a bit
>> unfortunate.
>
> These failures are back after a long period of not being a problem :-(
>
> -- PMM

My guess is VM fails to boot from disk for some reason.
Could you trigger a screenshot after this happens?

-- MST
Re: [Qemu-devel] master: intermittent acpi-test failures
On 29 November 2014 at 17:36, Michael S. Tsirkin m...@redhat.com wrote:
> On Fri, Nov 28, 2014 at 01:34:33PM +, Peter Maydell wrote:
>> These failures are back after a long period of not being a problem :-(
>
> My guess is VM fails to boot from disk for some reason.
> Could you trigger a screenshot after this happens?

Sure, if you can provide instructions (this is all from
make check so there's no display by default and extracting
a standalone qemu command line from make check is pretty
tedious IME).

-- PMM
Re: [Qemu-devel] master: intermittent acpi-test failures
On 27 May 2014 at 22:38, Peter Maydell peter.mayd...@linaro.org wrote:
> I'm seeing this test failure intermittently on 'make check':
>
> ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed (signature == SIGNATURE): (0x == 0xdead)
> GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053
>
> (32 bit ARM host, in case that makes a difference.)
>
> Any ideas? It looks from the test as if this may just be that
> the test is coded to assume a faster machine, which is a bit
> unfortunate.

These failures are back after a long period of not being a problem :-(

-- PMM
Re: [Qemu-devel] master: intermittent acpi-test failures
On Tue, May 27, 2014 at 10:38:17PM +0100, Peter Maydell wrote:
> I'm seeing this test failure intermittently on 'make check':
>
> ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed (signature == SIGNATURE): (0x == 0xdead)
> GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053
>
> (32 bit ARM host, in case that makes a difference.)
>
> Any ideas? It looks from the test as if this may just be that
> the test is coded to assume a faster machine, which is a bit
> unfortunate.
>
> thanks
> -- PMM

We have a timeout of 1 minute there.
Since all VM has to do is run BIOS initialization and then
write out the signature, this seems ample.
I'm reluctant to wait forever there, that would make
debugging harder in case of failures.

Does it help if you increase TEST_DELAY?
If it does we can do this, though I suspect this is merely a
work-around, there's probably something that causes QEMU to
pause execution during early BIOS boot.
Could you try strace to see what it is?

-- MST
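The test structure being discussed is a bounded polling loop: check for the signature, sleep for a fixed delay, and give up after a fixed number of cycles. The sketch below shows that pattern in C under stated assumptions. It is not the code from tests/acpi-test.c; wait_for_signature, the callback types and the fake_guest stand-in are all hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callbacks: read the guest's signature, and sleep one interval. */
typedef uint16_t (*read_signature_fn)(void *opaque);
typedef void (*delay_fn)(void *opaque);

/*
 * Poll until the expected signature appears or max_cycles polls have been made.
 * Returns true on success; *cycles_used reports how many polls were made.
 */
static bool wait_for_signature(read_signature_fn read_sig, delay_fn delay,
                               void *opaque, uint16_t expected,
                               unsigned max_cycles, unsigned *cycles_used)
{
    unsigned i;

    for (i = 0; i < max_cycles; i++) {
        if (read_sig(opaque) == expected) {
            *cycles_used = i + 1;
            return true;
        }
        delay(opaque);
    }
    *cycles_used = max_cycles;
    return false;
}

/* Toy stand-in for a guest: the signature shows up after ready_after polls. */
struct fake_guest {
    unsigned polls;
    unsigned ready_after;
};

static uint16_t fake_read(void *opaque)
{
    struct fake_guest *g = opaque;

    g->polls++;
    return g->polls > g->ready_after ? 0xdead : 0;
}

static void fake_delay(void *opaque)
{
    (void)opaque;   /* a real test would sleep here */
}
```

With a limit of 600 cycles and a one-minute budget, each cycle would correspond to a delay of about 100 ms; that figure is inferred from the numbers quoted in this thread rather than read from the test source. A guest that is stuck never produces the signature, so lengthening the delay only postpones the same failure.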
Re: [Qemu-devel] master: intermittent acpi-test failures
On 8 June 2014 08:37, Michael S. Tsirkin m...@redhat.com wrote:
> On Tue, May 27, 2014 at 10:38:17PM +0100, Peter Maydell wrote:
>> I'm seeing this test failure intermittently on 'make check':
>>
>> ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed (signature == SIGNATURE): (0x == 0xdead)
>> GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053
>>
>> (32 bit ARM host, in case that makes a difference.)
>>
>> Any ideas? It looks from the test as if this may just be that
>> the test is coded to assume a faster machine, which is a bit
>> unfortunate.
>
> We have a timeout of 1 minute there.
> Since all VM has to do is run BIOS initialization and then
> write out the signature, this seems ample.

See my earlier email -- when the test completes it does so
within 8 or 9 loops (where the max is set at 600); so I
don't think raising the timeout will help -- something
has got stuck.

> If it does we can do this, though I suspect this is merely a
> work-around, there's probably something that causes QEMU to
> pause execution during early BIOS boot.
> Could you try strace to see what it is?

I'll give this a try.

thanks
-- PMM
Re: [Qemu-devel] master: intermittent acpi-test failures
On Sun, Jun 08, 2014 at 09:48:39AM +0100, Peter Maydell wrote:
> On 8 June 2014 08:37, Michael S. Tsirkin m...@redhat.com wrote:
>> On Tue, May 27, 2014 at 10:38:17PM +0100, Peter Maydell wrote:
>>> I'm seeing this test failure intermittently on 'make check':
>>>
>>> ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed (signature == SIGNATURE): (0x == 0xdead)
>>> GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053
>>>
>>> (32 bit ARM host, in case that makes a difference.)
>>>
>>> Any ideas? It looks from the test as if this may just be that
>>> the test is coded to assume a faster machine, which is a bit
>>> unfortunate.
>>
>> We have a timeout of 1 minute there.
>> Since all VM has to do is run BIOS initialization and then
>> write out the signature, this seems ample.
>
> See my earlier email -- when the test completes it does so
> within 8 or 9 loops (where the max is set at 600); so I
> don't think raising the timeout will help -- something
> has got stuck.
>
>> If it does we can do this, though I suspect this is merely a
>> work-around, there's probably something that causes QEMU to
>> pause execution during early BIOS boot.
>> Could you try strace to see what it is?
>
> I'll give this a try.
>
> thanks
> -- PMM

We have a use after free memory corruption ATM,
I don't see why it would trigger on this path but
can't hurt to try Paolo's patch.
Re: [Qemu-devel] master: intermittent acpi-test failures
On 27 May 2014 22:38, Peter Maydell peter.mayd...@linaro.org wrote:
> I'm seeing this test failure intermittently on 'make check':
>
> ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed (signature == SIGNATURE): (0x == 0xdead)
> GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053
>
> (32 bit ARM host, in case that makes a difference.)
>
> Any ideas? It looks from the test as if this may just be that
> the test is coded to assume a faster machine, which is a bit
> unfortunate.

Well, I put some diagnostic printing in, and it looks like
that guess was wrong. Most of the time we complete well
within the timeout limit for the test:

TEST: tests/acpi-test... (pid=10639)
  /i386/acpi/tcg: : looped for 8 cycles total (limit 600)
main-loop: WARNING: I/O thread spun for 1000 iterations
. : looped for 9 cycles total (limit 600)
main-loop: WARNING: I/O thread spun for 1000 iterations
OK
PASS: tests/acpi-test

But occasionally we don't:

TEST: tests/acpi-test... (pid=10679)
  /i386/acpi/tcg: : looped for 8 cycles total (limit 600)
main-loop: WARNING: I/O thread spun for 1000 iterations
: looped for 600 cycles total (limit 600)
**
ERROR:/root/qemu/tests/acpi-test.c:620:test_acpi_one: assertion failed (signature == SIGNATURE): (0x == 0xdead)
FAIL
GTester: last random seed: R02S1fd7be17c4dc962a399b016f9153e15a
(pid=10688)
FAIL: tests/acpi-test

thanks
-- PMM
Re: [Qemu-devel] master: intermittent acpi-test failures
On Wed, May 28, 2014 at 12:29:43PM +0100, Peter Maydell wrote:
> On 27 May 2014 22:38, Peter Maydell peter.mayd...@linaro.org wrote:
>> I'm seeing this test failure intermittently on 'make check':
>>
>> ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed (signature == SIGNATURE): (0x == 0xdead)
>> GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053
>>
>> (32 bit ARM host, in case that makes a difference.)
>>
>> Any ideas? It looks from the test as if this may just be that
>> the test is coded to assume a faster machine, which is a bit
>> unfortunate.
>
> Well, I put some diagnostic printing in, and it looks like
> that guess was wrong. Most of the time we complete well
> within the timeout limit for the test:
>
> TEST: tests/acpi-test... (pid=10639)
>   /i386/acpi/tcg: : looped for 8 cycles total (limit 600)
> main-loop: WARNING: I/O thread spun for 1000 iterations
> . : looped for 9 cycles total (limit 600)
> main-loop: WARNING: I/O thread spun for 1000 iterations
> OK
> PASS: tests/acpi-test
>
> But occasionally we don't:
>
> TEST: tests/acpi-test... (pid=10679)
>   /i386/acpi/tcg: : looped for 8 cycles total (limit 600)
> main-loop: WARNING: I/O thread spun for 1000 iterations
> : looped for 600 cycles total (limit 600)
> **
> ERROR:/root/qemu/tests/acpi-test.c:620:test_acpi_one: assertion failed (signature == SIGNATURE): (0x == 0xdead)
> FAIL
> GTester: last random seed: R02S1fd7be17c4dc962a399b016f9153e15a
> (pid=10688)
> FAIL: tests/acpi-test
>
> thanks
> -- PMM

I suspect a memory ordering issue. Looking at it.

-- MST
[Qemu-devel] master: intermittent acpi-test failures
I'm seeing this test failure intermittently on 'make check':

ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed (signature == SIGNATURE): (0x == 0xdead)
GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053

(32 bit ARM host, in case that makes a difference.)

Any ideas? It looks from the test as if this may just be that
the test is coded to assume a faster machine, which is a bit
unfortunate.

thanks
-- PMM