Re: [Qemu-devel] [PATCH 1.1] scsi: Add assertion for use-after-free errors

2012-05-04 Thread Paolo Bonzini
On 03/05/2012 22:58, Stefan Weil wrote:
 On 03.05.2012 19:36, Stefan Weil wrote:
 The QEMU emulation which is currently used with Raspberry Pi images
 (qemu-system-arm -M versatilepb ...) accesses memory which was freed.

 Valgrind output (extract):

 ==17857== Invalid write of size 4
 ==17857== at 0x24EB06: scsi_req_unref (scsi-bus.c:1273)
 ==17857== by 0x24FFAE: scsi_read_complete (scsi-disk.c:277)
 ==17857== by 0x152ACC: bdrv_co_em_bh (block.c:3363)
 ==17857== by 0x13D49C: qemu_bh_poll (async.c:71)
 ==17857== by 0x211A8C: main_loop_wait (main-loop.c:503)
 ==17857== by 0x207954: main_loop (vl.c:1555)
 ==17857== by 0x20E9C9: main (vl.c:3653)
 ==17857== Address 0x1c54383c is 12 bytes inside a block of size 260 free'd
 ==17857== at 0x4824B3A: free (vg_replace_malloc.c:366)
 ==17857== by 0x20ADFA: free_and_trace (vl.c:2250)
 ==17857== by 0x4899FC5: g_free (in /lib/libglib-2.0.so.0.2400.1)
 ==17857== by 0x24EB3B: scsi_req_unref (scsi-bus.c:1277)
 ==17857== by 0x24F003: scsi_req_complete (scsi-bus.c:1383)
 ==17857== by 0x25022A: scsi_read_data (scsi-disk.c:334)
 ==17857== by 0x24EB9F: scsi_req_continue (scsi-bus.c:1289)
 ==17857== by 0x1C7787: lsi_do_dma (lsi53c895a.c:575)
 ==17857== by 0x1C8CDA: lsi_execute_script (lsi53c895a.c:1147)
 ==17857== by 0x1C74EA: lsi_resume_script (lsi53c895a.c:510)
 ==17857== by 0x1C7ECD: lsi_transfer_data (lsi53c895a.c:746)
 ==17857== by 0x24EC90: scsi_req_data (scsi-bus.c:1307)

Yes, this was reported by David Gibson too.  Interesting that
virtio-scsi doesn't show it; it's probably the sglist support that hides
it.  I queued the fix and will send the pull request in a matter of
minutes.  The patch is a good addition, so I queued it too. Thanks.

Paolo



[Qemu-devel] [PATCH 1.1] scsi: Add assertion for use-after-free errors

2012-05-03 Thread Stefan Weil
The QEMU emulation which is currently used with Raspberry Pi images
(qemu-system-arm -M versatilepb ...) accesses memory which was freed.

Valgrind output (extract):

==17857== Invalid write of size 4
==17857==    at 0x24EB06: scsi_req_unref (scsi-bus.c:1273)
==17857==    by 0x24FFAE: scsi_read_complete (scsi-disk.c:277)
==17857==    by 0x152ACC: bdrv_co_em_bh (block.c:3363)
==17857==    by 0x13D49C: qemu_bh_poll (async.c:71)
==17857==    by 0x211A8C: main_loop_wait (main-loop.c:503)
==17857==    by 0x207954: main_loop (vl.c:1555)
==17857==    by 0x20E9C9: main (vl.c:3653)
==17857==  Address 0x1c54383c is 12 bytes inside a block of size 260 free'd
==17857==    at 0x4824B3A: free (vg_replace_malloc.c:366)
==17857==    by 0x20ADFA: free_and_trace (vl.c:2250)
==17857==    by 0x4899FC5: g_free (in /lib/libglib-2.0.so.0.2400.1)
==17857==    by 0x24EB3B: scsi_req_unref (scsi-bus.c:1277)
==17857==    by 0x24F003: scsi_req_complete (scsi-bus.c:1383)
==17857==    by 0x25022A: scsi_read_data (scsi-disk.c:334)
==17857==    by 0x24EB9F: scsi_req_continue (scsi-bus.c:1289)
==17857==    by 0x1C7787: lsi_do_dma (lsi53c895a.c:575)
==17857==    by 0x1C8CDA: lsi_execute_script (lsi53c895a.c:1147)
==17857==    by 0x1C74EA: lsi_resume_script (lsi53c895a.c:510)
==17857==    by 0x1C7ECD: lsi_transfer_data (lsi53c895a.c:746)
==17857==    by 0x24EC90: scsi_req_data (scsi-bus.c:1307)

(There are some more similar messages.)

This patch adds an assertion which also detects those errors:

Calling scsi_req_unref is not allowed when the previous call
of that function has decremented refcount to 0, because in this
case req was freed.

Signed-off-by: Stefan Weil s...@weilnetz.de
---

This patch may break some test scenarios,
but that is intentional: we should not pretend that there are
no errors when there are some.

The Raspberry Pi emulation with QEMU is currently used by
a lot of people.

Please apply this patch for the tests of QEMU 1.1.

Of course we should also fix the problem which triggers the
assertion. I still don't know whether it is caused by
lsi53c895a.c or by the scsi code.

Thanks,

Stefan Weil


 hw/scsi-bus.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index dbdb99c..62779c7 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -1270,6 +1270,7 @@ SCSIRequest *scsi_req_ref(SCSIRequest *req)
 
 void scsi_req_unref(SCSIRequest *req)
 {
+    assert(req->refcount > 0);
     if (--req->refcount == 0) {
         if (req->ops->free_req) {
             req->ops->free_req(req);
-- 
1.7.9
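
For readers outside the QEMU tree, the idea behind the one-line assertion can be sketched in isolation. The following is only an illustration under stated assumptions: DemoReq, demo_req_new, demo_req_ref, demo_req_unref and demo_free_count are hypothetical names invented for this sketch and are not part of hw/scsi-bus.c. Only the shape of the unref function (assert, decrement, free on zero) mirrors the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for SCSIRequest; not the QEMU definition. */
typedef struct DemoReq {
    uint32_t refcount;
} DemoReq;

/* Hypothetical counter: how many objects have been released so far. */
static int demo_free_count;

/* Allocate an object that starts with one reference held by the caller. */
static DemoReq *demo_req_new(void)
{
    DemoReq *req = calloc(1, sizeof(*req));
    assert(req != NULL);
    req->refcount = 1;
    return req;
}

/* Take an additional reference. */
static DemoReq *demo_req_ref(DemoReq *req)
{
    req->refcount++;
    return req;
}

/* Drop a reference; free the object when the last one is gone. */
static void demo_req_unref(DemoReq *req)
{
    /*
     * Same check as in the patch: a count of 0 on entry means an
     * earlier unref already dropped the last reference.
     */
    assert(req->refcount > 0);
    if (--req->refcount == 0) {
        demo_free_count++;
        free(req);
    }
}
```

With balanced calls (one demo_req_new, one demo_req_ref, two demo_req_unref) the object is freed exactly once and the assertion never fires. Note that in the unbalanced case the check itself reads memory that was already freed, so it is a best-effort debugging aid: it fires only while the freed block still holds a zero count. Valgrind remains the more reliable detector.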




Re: [Qemu-devel] [PATCH 1.1] scsi: Add assertion for use-after-free errors

2012-05-03 Thread Stefan Weil

On 03.05.2012 19:36, Stefan Weil wrote:

The QEMU emulation which is currently used with Raspberry Pi images
(qemu-system-arm -M versatilepb ...) accesses memory which was freed.

Valgrind output (extract):

==17857== Invalid write of size 4
==17857== at 0x24EB06: scsi_req_unref (scsi-bus.c:1273)
==17857== by 0x24FFAE: scsi_read_complete (scsi-disk.c:277)
==17857== by 0x152ACC: bdrv_co_em_bh (block.c:3363)
==17857== by 0x13D49C: qemu_bh_poll (async.c:71)
==17857== by 0x211A8C: main_loop_wait (main-loop.c:503)
==17857== by 0x207954: main_loop (vl.c:1555)
==17857== by 0x20E9C9: main (vl.c:3653)
==17857== Address 0x1c54383c is 12 bytes inside a block of size 260 free'd
==17857== at 0x4824B3A: free (vg_replace_malloc.c:366)
==17857== by 0x20ADFA: free_and_trace (vl.c:2250)
==17857== by 0x4899FC5: g_free (in /lib/libglib-2.0.so.0.2400.1)
==17857== by 0x24EB3B: scsi_req_unref (scsi-bus.c:1277)
==17857== by 0x24F003: scsi_req_complete (scsi-bus.c:1383)
==17857== by 0x25022A: scsi_read_data (scsi-disk.c:334)
==17857== by 0x24EB9F: scsi_req_continue (scsi-bus.c:1289)
==17857== by 0x1C7787: lsi_do_dma (lsi53c895a.c:575)
==17857== by 0x1C8CDA: lsi_execute_script (lsi53c895a.c:1147)
==17857== by 0x1C74EA: lsi_resume_script (lsi53c895a.c:510)
==17857== by 0x1C7ECD: lsi_transfer_data (lsi53c895a.c:746)
==17857== by 0x24EC90: scsi_req_data (scsi-bus.c:1307)



Hi Paolo,

this is the result of a bisect to narrow down the source of the problem:

ac6684264642f1aea7cba5c0c3907409b1f7f904 is the first bad commit
commit ac6684264642f1aea7cba5c0c3907409b1f7f904
Author: Paolo Bonzini pbonz...@redhat.com
Date:   Thu Apr 19 11:55:28 2012 +0200

    scsi: support FUA on reads

    To force unit access on reads, flush the cache *before* doing the read.

    Signed-off-by: Paolo Bonzini pbonz...@redhat.com

Regards,

Stefan




(There are some more similar messages.)

This patch adds an assertion which also detects those errors:

Calling scsi_req_unref is not allowed when the previous call
of that function has decremented refcount to 0, because in this
case req was freed.

Signed-off-by: Stefan Weil s...@weilnetz.de
---

This patch may break some test scenarios,
but that is intentional: we should not pretend that there are
no errors when there are some.

The Raspberry Pi emulation with QEMU is currently used by
a lot of people.

Please apply this patch for the tests of QEMU 1.1.

Of course we should also fix the problem which triggers the
assertion. I still don't know whether it is caused by
lsi53c895a.c or by the scsi code.


It is the scsi code, see git bisect result.