Re: [Qemu-devel] master: intermittent acpi-test failures

2015-01-12 Thread Peter Maydell
On 12 January 2015 at 17:56, Peter Maydell peter.mayd...@linaro.org wrote:
 On 30 November 2014 at 15:12, Michael S. Tsirkin m...@redhat.com wrote:
 On Sat, Nov 29, 2014 at 05:39:01PM +, Peter Maydell wrote:
 On 29 November 2014 at 17:36, Michael S. Tsirkin m...@redhat.com wrote:
  My guess is VM fails to boot from disk for some reason.
  Could you trigger a screenshot after this happens?

 Sure, if you can provide instructions (this is all from
 make check so there's no display by default and
 extracting a standalone qemu command line from make
 check is pretty tedious IME).

 It's probably easiest to simply drop -nographic
 from test code to run with a display.

 OK, I did this, and the result is that there is just a black
 screen with no graphic output ever.

This turns out to be a bug in RTH's patchset, which I was testing
and which coincidentally had the same symptoms.

-- PMM



Re: [Qemu-devel] master: intermittent acpi-test failures

2015-01-12 Thread Peter Maydell
On 12 January 2015 at 19:49, Peter Maydell peter.mayd...@linaro.org wrote:
 This turns out to be a bug in RTH's patchset which I was testing
 which coincidentally had the same symptoms.

More generally, I think this ACPI test is the first (only?) time we
try to actually run guest code in make check, so if we break 64-on-32
TCG then it manifests as this ACPI test timing out...

-- PMM



Re: [Qemu-devel] master: intermittent acpi-test failures

2015-01-12 Thread Peter Maydell
On 12 January 2015 at 18:08, Peter Maydell peter.mayd...@linaro.org wrote:
 So we're just sat in a loop which never finishes. This
 seems to be because the first time into it we set
 the loop counter EBP to 0x5b207801.

Looking further up the trace we seem to be mistranslating movsbl:
IN:
0x000f195e:  movsbl (%ebx),%eax
0x000f1961:  lea    -0x30(%eax),%edx
0x000f1964:  cmp    $0x9,%dl
0x000f1967:  ja     0xf1984

OP:
 ld_i32 tmp18,env,$0xfff4
 movi_i32 tmp19,$0x0
 brcond_i32 tmp18,tmp19,ne,$0x0

  0xf195e
 mov_i32 tmp4,rbx_0
 mov_i32 tmp5,rbx_1
 movi_i32 tmp5,$0x0
 qemu_ld_i32 tmp0,tmp4,tmp5,leul,$0x4
 movi_i32 tmp18,$0x1f
 sar_i32 tmp1,tmp0,tmp18
 mov_i32 rax_0,tmp0
 movi_i32 rax_1,$0x0

  0xf1961
 movi_i32 tmp20,$0xffd0
 movi_i32 tmp21,$0x
 add2_i32 tmp4,tmp5,rax_0,rax_1,tmp20,tmp21
 movi_i32 tmp5,$0x0
 mov_i32 rdx_0,tmp4
 movi_i32 rdx_1,$0x0

[etc]

movsbl should be a signed byte load, but we seem to have
emitted a qemu_ld_i32 tmp0,tmp4,tmp5,leul,$0x4, which is a
32-bit load (leul), and then sign-extended the result from 32 to 64 bits.

[the insn bytes here are 0x0f 0xbe 0x03.]
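
To make the difference concrete, here is a minimal Python sketch. It
decodes the ModRM byte 0x03 that follows the 0x0f 0xbe opcode and then
contrasts a sign-extended byte load with a 32-bit little-endian load.
The helper names and the four bytes of "guest memory" are hypothetical
and chosen only for illustration; they are not taken from the trace.

```python
import struct

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_modrm(modrm):
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return (modrm >> 6) & 0x3, (modrm >> 3) & 0x7, modrm & 0x7


def load_signed_byte(mem, addr):
    """What movsbl should do: load one byte, sign-extend it to 32 bits."""
    (value,) = struct.unpack_from("<b", mem, addr)
    return value & 0xFFFFFFFF


def load_le_u32(mem, addr):
    """What a 'leul' load does: read four bytes as little-endian unsigned."""
    (value,) = struct.unpack_from("<I", mem, addr)
    return value


# 0x0f 0xbe is MOVSX r32, r/m8; with mod=0 and rm=3 the source is (%ebx).
mod, reg, rm = decode_modrm(0x03)
print(mod, REG32[reg], REG32[rm])        # 0 eax ebx

# Hypothetical memory contents, for illustration only.
mem = bytes([0x80, 0x12, 0x34, 0x56])
print(hex(load_signed_byte(mem, 0)))     # 0xffffff80
print(hex(load_le_u32(mem, 0)))          # 0x56341280
```

With a correct translation only the first byte would contribute to the
result; the 32-bit load pulls in the three following bytes as well.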

-- PMM



Re: [Qemu-devel] master: intermittent acpi-test failures

2015-01-12 Thread Michael S. Tsirkin
On Mon, Jan 12, 2015 at 08:04:31PM +, Peter Maydell wrote:
 On 12 January 2015 at 19:49, Peter Maydell peter.mayd...@linaro.org wrote:
  This turns out to be a bug in RTH's patchset which I was testing
  which coincidentally had the same symptoms.
 
 More generally, I think this ACPI test is the first (only?) time we
 try to actually run guest code in make check, so if we break 64-on-32
 TCG then it manifests as this ACPI test timing out...
 
 -- PMM

Though I have plans to add tests like that for device ROMs and PXE as
well.  So more tests will fail :)

-- 
MST



Re: [Qemu-devel] master: intermittent acpi-test failures

2015-01-12 Thread Peter Maydell
On 12 January 2015 at 17:56, Peter Maydell peter.mayd...@linaro.org wrote:
 ...but I don't see why that call 0xf106f takes
 us to f1064, which the trace says it does

I think the trace is just confusing. Attaching in gdb we see:

=> 0xf1133: test   %ebp,%ebp
   0xf1135: jle    0xf1144
   0xf1137: mov    (%esp),%edx
   0xf113a: mov    %esi,%eax
   0xf113c: call   0xf106f

=> 0xf106f: mov    %eax,%ecx
   0xf1071: movsbl %dl,%edx
   0xf1074: call   *(%ecx)

=> 0xf1064: mov    %edx,%eax
   0xf1066: mov    0xf68fc,%dx
   0xf106d: out    %al,(%dx)
   0xf106e: ret

=> 0xf1076: ret

=> 0xf1141: dec    %ebp
   0xf1142: jmp    0xf1133


So we're just sat in a loop which never finishes. This
seems to be because the first time into it we set
the loop counter EBP to 0x5b207801.
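
As a rough sketch of why this looks like a hang, the loop shape above
(test %ebp,%ebp; jle exit; body; dec %ebp; jmp top) can be modelled in
Python. The function name is hypothetical; the only input taken from
this thread is the counter value 0x5b207801.

```python
def iterations(ebp):
    """Number of times the loop body runs for a given 32-bit EBP.

    'test %ebp,%ebp; jle exit' leaves the loop when EBP, read as a
    signed 32-bit value, is <= 0; otherwise the body runs and EBP is
    decremented by one.
    """
    signed = ebp - (1 << 32) if ebp & 0x80000000 else ebp
    return max(signed, 0)


print(iterations(0x5b207801))   # 1528854529
print(iterations(1))            # 1
```

That is about 1.5 billion trips through the loop, each performing a
port write, so under TCG it is unlikely to finish before the test gives
up. It also matches the EBP=5b207800 seen in the register dump after
the first decrement.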

-- PMM



Re: [Qemu-devel] master: intermittent acpi-test failures

2015-01-12 Thread Peter Maydell
On 30 November 2014 at 15:12, Michael S. Tsirkin m...@redhat.com wrote:
 On Sat, Nov 29, 2014 at 05:39:01PM +, Peter Maydell wrote:
 On 29 November 2014 at 17:36, Michael S. Tsirkin m...@redhat.com wrote:
  My guess is VM fails to boot from disk for some reason.
  Could you trigger a screenshot after this happens?

 Sure, if you can provide instructions (this is all from
 make check so there's no display by default and
 extracting a standalone qemu command line from make
 check is pretty tedious IME).

 It's probably easiest to simply drop -nographic
 from test code to run with a display.

OK, I did this, and the result is that there is just a black
screen with no graphic output ever.

The guest seems to be stuck in a loop:

Trace 0x74e38bc0 [000f1076]
Trace 0x74e3f430 [000f1141]
Trace 0x74e38a80 [000f1064]
Trace 0x74e38bc0 [000f1076]
Trace 0x74e3f430 [000f1141]
Trace 0x74e38a80 [000f1064]

which I think is:

0x000f1064:  mov    %edx,%eax
0x000f1066:  mov    0xf68fc,%dx
0x000f106d:  out    %al,(%dx)
0x000f106e:  ret

0x000f1076:  ret

0x000f1141:  dec    %ebp
0x000f1142:  jmp    0xf1133

0x000f1133:  test   %ebp,%ebp
0x000f1135:  jle    0xf1144    [not taken]
0x000f1137:  mov    (%esp),%edx
0x000f113a:  mov    %esi,%eax
0x000f113c:  call   0xf106f

...but I don't see why that call 0xf106f takes
us to f1064, which the trace says it does:

Trace 0x74e3f300 [000f1137]
EAX=0030 EBX=0007 ECX=000f64c0 EDX=0402
ESI=000f64c0 EDI=0800 EBP=5b207800 ESP=6f4c
EIP=000f1137 EFL=0006 [-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010   00cf9300 DPL=0 DS   [-WA]
CS =0008   00cf9b00 DPL=0 CS32 [-RA]
SS =0010   00cf9300 DPL=0 DS   [-WA]
DS =0010   00cf9300 DPL=0 DS   [-WA]
FS =0010   00cf9300 DPL=0 DS   [-WA]
GS =0010   00cf9300 DPL=0 DS   [-WA]
LDT=   8200 DPL=0 LDT
TR =   8b00 DPL=0 TSS32-busy
GDT= 000f6be8 6be80037
IDT= 000f6c26 6c26
CR0=6011 CR2= CR3= CR4=
DR0= DR1= DR2=
DR3=
DR6=0ff0 DR7=0400
CCS=0004 CCD=5b207800 CCO=EFLAGS
EFER=
Trace 0x74e38a80 [000f1064]
EAX=000f64c0 EBX=0007 ECX=000f64c0 EDX=0030
ESI=000f64c0 EDI=0800 EBP=5b207800 ESP=6f44
EIP=000f1064 EFL=0006 [-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010   00cf9300 DPL=0 DS   [-WA]
CS =0008   00cf9b00 DPL=0 CS32 [-RA]
SS =0010   00cf9300 DPL=0 DS   [-WA]
DS =0010   00cf9300 DPL=0 DS   [-WA]
FS =0010   00cf9300 DPL=0 DS   [-WA]
GS =0010   00cf9300 DPL=0 DS   [-WA]
LDT=   8200 DPL=0 LDT
TR =   8b00 DPL=0 TSS32-busy
GDT= 000f6be8 6be80037
IDT= 000f6c26 6c26
CR0=6011 CR2= CR3= CR4=
DR0= DR1= DR2=
DR3=
DR6=0ff0 DR7=0400
CCS=0004 CCD=5b207800 CCO=EFLAGS
EFER=
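
One way to read the two dumps is to diff the general-purpose registers.
The following Python sketch does that; the function names are
hypothetical, and the register values are copied exactly as they appear
in the dumps above (only the first three lines of each are used).

```python
import re


def parse_regs(dump):
    """Parse NAME=hex pairs from a CPU state dump into a dict of ints."""
    pairs = re.findall(r"([A-Z][A-Z0-9]*)=([0-9a-f]+)", dump)
    return {name: int(value, 16) for name, value in pairs}


def changed_regs(before, after):
    """Return {name: (old, new)} for registers whose value differs."""
    return {name: (before[name], after[name])
            for name in before
            if name in after and before[name] != after[name]}


BEFORE = """EAX=0030 EBX=0007 ECX=000f64c0 EDX=0402
ESI=000f64c0 EDI=0800 EBP=5b207800 ESP=6f4c
EIP=000f1137"""

AFTER = """EAX=000f64c0 EBX=0007 ECX=000f64c0 EDX=0030
ESI=000f64c0 EDI=0800 EBP=5b207800 ESP=6f44
EIP=000f1064"""

before = parse_regs(BEFORE)
after = parse_regs(AFTER)
for name, (old, new) in changed_regs(before, after).items():
    print(f"{name}: {old:#x} -> {new:#x}")
print("ESP dropped by", before["ESP"] - after["ESP"], "bytes")
```

ESP drops by 8 bytes between the two entries, which is the size of two
32-bit return addresses. That would be consistent with a second call
having happened between 0xf113c and 0xf1064 without showing up in the
trace.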

Full trace (300MB!) at:
http://people.linaro.org/~peter.maydell/bios-test.log

-- PMM



Re: [Qemu-devel] master: intermittent acpi-test failures

2014-11-30 Thread Michael S. Tsirkin
On Sun, Nov 30, 2014 at 05:12:55PM +0200, Michael S. Tsirkin wrote:
 On Sat, Nov 29, 2014 at 05:39:01PM +, Peter Maydell wrote:
  On 29 November 2014 at 17:36, Michael S. Tsirkin m...@redhat.com wrote:
   On Fri, Nov 28, 2014 at 01:34:33PM +, Peter Maydell wrote:
   These failures are back after a long period of not
   being a problem :-(
  
   My guess is VM fails to boot from disk for some reason.
   Could you trigger a screenshot after this happens?
  
  Sure, if you can provide instructions (this is all from
  make check so there's no display by default and
  extracting a standalone qemu command line from make
  check is pretty tedious IME).
  
  -- PMM
 
 It's probably easiest to simply drop -nographic
 from test code to run with a display.
 
 To trigger a screenshot, just give
 screendump /path/to/file
 on hmp.

Another idea is to configure debugging in seabios.

 -- 
 MST



Re: [Qemu-devel] master: intermittent acpi-test failures

2014-11-29 Thread Michael S. Tsirkin
On Fri, Nov 28, 2014 at 01:34:33PM +, Peter Maydell wrote:
 On 27 May 2014 at 22:38, Peter Maydell peter.mayd...@linaro.org wrote:
  I'm seeing this test failure intermittently on 'make check':
 
  ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed
  (signature == SIGNATURE): (0x == 0xdead)
  GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053
 
  (32 bit ARM host, in case that makes a difference.)
 
  Any ideas? It looks from the test as if this may just be
  that the test is coded to assume a faster machine, which
  is a bit unfortunate.
 
 These failures are back after a long period of not
 being a problem :-(
 
 -- PMM

My guess is VM fails to boot from disk for some reason.
Could you trigger a screenshot after this happens?

-- 
MST



Re: [Qemu-devel] master: intermittent acpi-test failures

2014-11-29 Thread Peter Maydell
On 29 November 2014 at 17:36, Michael S. Tsirkin m...@redhat.com wrote:
 On Fri, Nov 28, 2014 at 01:34:33PM +, Peter Maydell wrote:
 These failures are back after a long period of not
 being a problem :-(

 My guess is VM fails to boot from disk for some reason.
 Could you trigger a screenshot after this happens?

Sure, if you can provide instructions (this is all from
make check so there's no display by default and
extracting a standalone qemu command line from make
check is pretty tedious IME).

-- PMM



Re: [Qemu-devel] master: intermittent acpi-test failures

2014-11-28 Thread Peter Maydell
On 27 May 2014 at 22:38, Peter Maydell peter.mayd...@linaro.org wrote:
 I'm seeing this test failure intermittently on 'make check':

 ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed
 (signature == SIGNATURE): (0x == 0xdead)
 GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053

 (32 bit ARM host, in case that makes a difference.)

 Any ideas? It looks from the test as if this may just be
 that the test is coded to assume a faster machine, which
 is a bit unfortunate.

These failures are back after a long period of not
being a problem :-(

-- PMM



Re: [Qemu-devel] master: intermittent acpi-test failures

2014-06-08 Thread Michael S. Tsirkin
On Tue, May 27, 2014 at 10:38:17PM +0100, Peter Maydell wrote:
 I'm seeing this test failure intermittently on 'make check':
 
 ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed
 (signature == SIGNATURE): (0x == 0xdead)
 GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053
 
 (32 bit ARM host, in case that makes a difference.)
 
 Any ideas? It looks from the test as if this may just be
 that the test is coded to assume a faster machine, which
 is a bit unfortunate.
 
 thanks
 -- PMM

We have a timeout of 1 minute there.
Since all the VM has to do is run BIOS initialization
and then write out the signature, this seems ample.
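
For readers unfamiliar with the pattern, a bounded polling loop of this
kind can be sketched as follows. This is a Python illustration, not the
code in tests/acpi-test.c: the function and parameter names are
hypothetical, the 600-cycle limit comes from the diagnostic output
elsewhere in this thread, and the 0.1 second delay is an assumption
derived from dividing the one-minute timeout by that limit.

```python
import time


def wait_for_signature(read_signature, expected,
                       max_cycles=600, delay=0.1, sleep=time.sleep):
    """Poll read_signature() until it returns expected or cycles run out.

    Returns (found, cycles_used) so a caller can report how close to
    the limit the guest came.
    """
    for cycle in range(1, max_cycles + 1):
        if read_signature() == expected:
            return True, cycle
        sleep(delay)
    return False, max_cycles


# Usage with a fake guest that writes the signature on the third poll.
values = iter([0, 0, 0xdead])
found, cycles = wait_for_signature(lambda: next(values), 0xdead,
                                   sleep=lambda seconds: None)
print(found, cycles)   # True 3
```

Returning the cycle count alongside the result makes it easy to tell a
slow but progressing guest (count near the limit) from one that
normally finishes in a handful of cycles and has simply got stuck.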

I'm reluctant to wait forever there, since that would make
debugging harder in case of failures.
Does it help if you increase TEST_DELAY?

If it does, we can do this, though I suspect it is merely
a workaround: there's probably something that
causes QEMU to pause execution during early BIOS boot.
Could you try strace to see what it is?

-- 
MST



Re: [Qemu-devel] master: intermittent acpi-test failures

2014-06-08 Thread Peter Maydell
On 8 June 2014 08:37, Michael S. Tsirkin m...@redhat.com wrote:
 On Tue, May 27, 2014 at 10:38:17PM +0100, Peter Maydell wrote:
 I'm seeing this test failure intermittently on 'make check':

 ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed
 (signature == SIGNATURE): (0x == 0xdead)
 GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053

 (32 bit ARM host, in case that makes a difference.)

 Any ideas? It looks from the test as if this may just be
 that the test is coded to assume a faster machine, which
 is a bit unfortunate.

 We have a timeout of 1 minute there.
 Since all VM has to do is run BIOS initialization
 and then write out the signature, this seems ample.

See my earlier email -- when the test completes it does
so within 8 or 9 loops (where the max is set at 600);
so I don't think raising the timeout will help -- something
has got stuck.

 If it does we can do this, though I suspect this is merely
 a work-around, there's probably something that
 causes QEMU to pause execution during early BIOS boot.
 Could you try strace to see what it is?

I'll give this a try.

thanks
-- PMM



Re: [Qemu-devel] master: intermittent acpi-test failures

2014-06-08 Thread Michael S. Tsirkin
On Sun, Jun 08, 2014 at 09:48:39AM +0100, Peter Maydell wrote:
 On 8 June 2014 08:37, Michael S. Tsirkin m...@redhat.com wrote:
  On Tue, May 27, 2014 at 10:38:17PM +0100, Peter Maydell wrote:
  I'm seeing this test failure intermittently on 'make check':
 
  ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed
  (signature == SIGNATURE): (0x == 0xdead)
  GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053
 
  (32 bit ARM host, in case that makes a difference.)
 
  Any ideas? It looks from the test as if this may just be
  that the test is coded to assume a faster machine, which
  is a bit unfortunate.
 
  We have a timeout of 1 minute there.
  Since all VM has to do is run BIOS initialization
  and then write out the signature, this seems ample.
 
 See my earlier email -- when the test completes it does
 so within 8 or 9 loops (where the max is set at 600);
 so I don't think raising the timeout will help -- something
 has got stuck.
 
  If it does we can do this, though I suspect this is merely
  a work-around, there's probably something that
  causes QEMU to pause execution during early BIOS boot.
  Could you try strace to see what it is?
 
 I'll give this a try.
 
 thanks
 -- PMM

We have a use-after-free memory corruption at the moment.
I don't see why it would trigger on this path,
but it can't hurt to try Paolo's patch.




Re: [Qemu-devel] master: intermittent acpi-test failures

2014-05-28 Thread Peter Maydell
On 27 May 2014 22:38, Peter Maydell peter.mayd...@linaro.org wrote:
 I'm seeing this test failure intermittently on 'make check':

 ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed
 (signature == SIGNATURE): (0x == 0xdead)
 GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053

 (32 bit ARM host, in case that makes a difference.)

 Any ideas? It looks from the test as if this may just be
 that the test is coded to assume a faster machine, which
 is a bit unfortunate.

Well, I put some diagnostic printing in, and it looks like that
guess was wrong. Most of the time we complete well within
the timeout limit for the test:

TEST: tests/acpi-test... (pid=10639)
  /i386/acpi/tcg:
 : looped for 8 cycles total (limit 600)
main-loop: WARNING: I/O thread spun for 1000 iterations
. : looped for 9 cycles total (limit 600)
main-loop: WARNING: I/O thread spun for 1000 iterations
OK
PASS: tests/acpi-test

But occasionally we don't:

TEST: tests/acpi-test... (pid=10679)
  /i386/acpi/tcg:
 : looped for 8 cycles total (limit 600)
main-loop: WARNING: I/O thread spun for 1000 iterations

: looped for 600 cycles total (limit 600)
**
ERROR:/root/qemu/tests/acpi-test.c:620:test_acpi_one: assertion failed
(signature == SIGNATURE): (0x == 0xdead)
FAIL
GTester: last random seed: R02S1fd7be17c4dc962a399b016f9153e15a
(pid=10688)
FAIL: tests/acpi-test

thanks
-- PMM



Re: [Qemu-devel] master: intermittent acpi-test failures

2014-05-28 Thread Michael S. Tsirkin
On Wed, May 28, 2014 at 12:29:43PM +0100, Peter Maydell wrote:
 On 27 May 2014 22:38, Peter Maydell peter.mayd...@linaro.org wrote:
  I'm seeing this test failure intermittently on 'make check':
 
  ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed
  (signature == SIGNATURE): (0x == 0xdead)
  GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053
 
  (32 bit ARM host, in case that makes a difference.)
 
  Any ideas? It looks from the test as if this may just be
  that the test is coded to assume a faster machine, which
  is a bit unfortunate.
 
 Well, I put some diagnostic printing in, and it looks like that
 guess was wrong. Most of the time we complete well within
 the timeout limit for the test:
 
 TEST: tests/acpi-test... (pid=10639)
   /i386/acpi/tcg:
  : looped for 8 cycles total (limit 600)
 main-loop: WARNING: I/O thread spun for 1000 iterations
 . : looped for 9 cycles total (limit 600)
 main-loop: WARNING: I/O thread spun for 1000 iterations
 OK
 PASS: tests/acpi-test
 
 But occasionally we don't:
 
 TEST: tests/acpi-test... (pid=10679)
   /i386/acpi/tcg:
  : looped for 8 cycles total (limit 600)
 main-loop: WARNING: I/O thread spun for 1000 iterations
 
 : looped for 600 cycles total (limit 600)
 **
 ERROR:/root/qemu/tests/acpi-test.c:620:test_acpi_one: assertion failed
 (signature == SIGNATURE): (0x == 0xdead)
 FAIL
 GTester: last random seed: R02S1fd7be17c4dc962a399b016f9153e15a
 (pid=10688)
 FAIL: tests/acpi-test
 
 thanks
 -- PMM

I suspect a memory ordering issue.
Looking at it.

-- 
MST



[Qemu-devel] master: intermittent acpi-test failures

2014-05-27 Thread Peter Maydell
I'm seeing this test failure intermittently on 'make check':

ERROR:/root/qemu/tests/acpi-test.c:618:test_acpi_one: assertion failed
(signature == SIGNATURE): (0x == 0xdead)
GTester: last random seed: R02S8d0d60963e4442ce284a81d20ce32053

(32 bit ARM host, in case that makes a difference.)

Any ideas? It looks from the test as if this may just be
that the test is coded to assume a faster machine, which
is a bit unfortunate.

thanks
-- PMM