Re: [Qemu-devel] Block Migration Assertion in qemu-kvm 1.2.0
On 19.09.2012 07:49, Peter Lieven wrote:
> On 09/18/12 12:31, Kevin Wolf wrote:
>> On 18.09.2012 12:28, Peter Lieven wrote:
>>> On 09/17/12 22:12, Peter Lieven wrote:
>>>> On 09/17/12 10:41, Kevin Wolf wrote:
>>>>> On 16.09.2012 12:13, Peter Lieven wrote:
>>>>>> Hi,
>>>>>>
>>>>>> when trying to block migrate a VM from one node to another, the source VM crashed with the following assertion:
>>>>>> block.c:3829: bdrv_set_in_use: Assertion `bs->in_use != in_use' failed.
>>>>>>
>>>>>> Is this sth already addressed/known?
>>>>> Not that I'm aware of, at least. Block migration doesn't seem to check whether the device is already in use, maybe this is the problem. Not sure why it would be in use, though, and in my quick test it didn't crash.
>>>>>
>>>>> So we need some more information: What's your command line, did you do anything specific in the monitor with block devices, what does the stacktrace look like, etc.?
>>> kevin, it seems that i can very easily force a crash if I cancel a running block migration. if I understand correctly what happens, there are aio callbacks coming in after blk_mig_cleanup() has been called.
>>>
>>> what is the proper way to detect this in blk_mig_read_cb()?
>> You could try this, it doesn't detect the situation in blk_mig_read_cb(), but ensures that all callbacks happen before we do the actual cleanup (completely untested):
> after testing it for half an hour i can say, it seems to fix the problem. no segfaults and also no other assertions.
>
> while searching I have seen that the queues blk_list and bmds_list are initialized at qemu startup. wouldn't it be better to initialize them at init_blk_migration or at least check that they are really empty? i have also seen that prev_time_offset is not initialized.

Probably. If you send this as a proper patch with a SoB, I wouldn't reject it, but considering that block migration is deprecated anyway, I won't bother myself as long as there's no real bug.

Kevin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] Block Migration Assertion in qemu-kvm 1.2.0
On 09/17/12 22:12, Peter Lieven wrote:
> On 09/17/12 10:41, Kevin Wolf wrote:
>> On 16.09.2012 12:13, Peter Lieven wrote:
>>> Hi,
>>>
>>> when trying to block migrate a VM from one node to another, the source VM crashed with the following assertion:
>>> block.c:3829: bdrv_set_in_use: Assertion `bs->in_use != in_use' failed.
>>>
>>> Is this sth already addressed/known?
>> Not that I'm aware of, at least. Block migration doesn't seem to check whether the device is already in use, maybe this is the problem. Not sure why it would be in use, though, and in my quick test it didn't crash.
>>
>> So we need some more information: What's your command line, did you do anything specific in the monitor with block devices, what does the stacktrace look like, etc.?

kevin,

it seems that i can very easily force a crash if I cancel a running block migration. if I understand correctly what happens, there are aio callbacks coming in after blk_mig_cleanup() has been called.

what is the proper way to detect this in blk_mig_read_cb()?

Thanks,
Peter
Re: [Qemu-devel] Block Migration Assertion in qemu-kvm 1.2.0
On 18.09.2012 12:28, Peter Lieven wrote:
> On 09/17/12 22:12, Peter Lieven wrote:
>> On 09/17/12 10:41, Kevin Wolf wrote:
>>> On 16.09.2012 12:13, Peter Lieven wrote:
>>>> Hi,
>>>>
>>>> when trying to block migrate a VM from one node to another, the source VM crashed with the following assertion:
>>>> block.c:3829: bdrv_set_in_use: Assertion `bs->in_use != in_use' failed.
>>>>
>>>> Is this sth already addressed/known?
>>> Not that I'm aware of, at least. Block migration doesn't seem to check whether the device is already in use, maybe this is the problem. Not sure why it would be in use, though, and in my quick test it didn't crash.
>>>
>>> So we need some more information: What's your command line, did you do anything specific in the monitor with block devices, what does the stacktrace look like, etc.?
> kevin, it seems that i can very easily force a crash if I cancel a running block migration. if I understand correctly what happens, there are aio callbacks coming in after blk_mig_cleanup() has been called.
>
> what is the proper way to detect this in blk_mig_read_cb()?

You could try this, it doesn't detect the situation in blk_mig_read_cb(), but ensures that all callbacks happen before we do the actual cleanup (completely untested):

diff --git a/block-migration.c b/block-migration.c
index 7def8ab..ed93301 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -519,6 +519,8 @@ static void blk_mig_cleanup(void)
     BlkMigDevState *bmds;
     BlkMigBlock *blk;
 
+    bdrv_drain_all();
+
     set_dirty_tracking(0);
 
     while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
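The idea behind the patch above is that every outstanding completion callback must run before the state it touches is torn down. The following is only a minimal sketch of that ordering, not QEMU code: `struct mig_state`, `submit_read()`, `drain_all()` and `mig_cleanup()` are hypothetical stand-ins, and `drain_all()` merely plays the role that `bdrv_drain_all()` has in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not QEMU's types. */
#define MAX_PENDING 16

typedef void (*completion_fn)(void *opaque);

struct pending_io {
    completion_fn cb;
    void *opaque;
};

struct mig_state {
    int submitted;  /* reads issued whose callback has not run yet */
    int read_done;  /* reads completed and waiting to be sent */
    int active;     /* 1 while a migration is running */
};

static struct pending_io pending[MAX_PENDING];
static size_t n_pending;

/* Completion callback: only valid while the migration state is alive. */
static void read_cb(void *opaque)
{
    struct mig_state *s = opaque;

    assert(s->active);  /* would fail if cleanup had already happened */
    s->submitted--;
    s->read_done++;
}

/* Queue one asynchronous read; its callback runs later. */
static int submit_read(struct mig_state *s)
{
    if (n_pending == MAX_PENDING) {
        return -1;
    }
    pending[n_pending].cb = read_cb;
    pending[n_pending].opaque = s;
    n_pending++;
    s->submitted++;
    return 0;
}

/* Run every outstanding completion before returning. */
static void drain_all(void)
{
    for (size_t i = 0; i < n_pending; i++) {
        pending[i].cb(pending[i].opaque);
    }
    n_pending = 0;
}

/* Cleanup drains first, so no callback can arrive after the reset. */
static void mig_cleanup(struct mig_state *s)
{
    drain_all();
    assert(s->submitted == 0);
    s->read_done = 0;
    s->active = 0;
}
```

In this model, dropping the `drain_all()` call from `mig_cleanup()` would leave callbacks queued that later run against reset state, which is the kind of late callback described in the message above.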
Re: [Qemu-devel] Block Migration Assertion in qemu-kvm 1.2.0
On 09/18/12 12:31, Kevin Wolf wrote:
> On 18.09.2012 12:28, Peter Lieven wrote:
>> On 09/17/12 22:12, Peter Lieven wrote:
>>> On 09/17/12 10:41, Kevin Wolf wrote:
>>>> On 16.09.2012 12:13, Peter Lieven wrote:
>>>>> Hi,
>>>>>
>>>>> when trying to block migrate a VM from one node to another, the source VM crashed with the following assertion:
>>>>> block.c:3829: bdrv_set_in_use: Assertion `bs->in_use != in_use' failed.
>>>>>
>>>>> Is this sth already addressed/known?
>>>> Not that I'm aware of, at least. Block migration doesn't seem to check whether the device is already in use, maybe this is the problem. Not sure why it would be in use, though, and in my quick test it didn't crash.
>>>>
>>>> So we need some more information: What's your command line, did you do anything specific in the monitor with block devices, what does the stacktrace look like, etc.?
>> kevin, it seems that i can very easily force a crash if I cancel a running block migration. if I understand correctly what happens, there are aio callbacks coming in after blk_mig_cleanup() has been called.
>>
>> what is the proper way to detect this in blk_mig_read_cb()?
> You could try this, it doesn't detect the situation in blk_mig_read_cb(), but ensures that all callbacks happen before we do the actual cleanup (completely untested):

after testing it for half an hour i can say, it seems to fix the problem. no segfaults and also no other assertions.

while searching I have seen that the queues blk_list and bmds_list are initialized at qemu startup. wouldn't it be better to initialize them at init_blk_migration or at least check that they are really empty? i have also seen that prev_time_offset is not initialized.

thank you,
peter

sth like this:

--- qemu-kvm-1.2.0/block-migration.c.orig 2012-09-17 21:14:44.458429855 +0200
+++ qemu-kvm-1.2.0/block-migration.c 2012-09-17 21:15:40.599736962 +0200
@@ -311,8 +311,12 @@ static void init_blk_migration(QEMUFile
     block_mig_state.prev_progress = -1;
     block_mig_state.bulk_completed = 0;
     block_mig_state.total_time = 0;
+    block_mig_state.prev_time_offset = 0;
     block_mig_state.reads = 0;
 
+    QSIMPLEQ_INIT(&block_mig_state.bmds_list);
+    QSIMPLEQ_INIT(&block_mig_state.blk_list);
+
     bdrv_iterate(init_blk_migration_it, NULL);
 }
 
@@ -760,9 +764,6 @@ SaveVMHandlers savevm_block_handlers = {
 
 void blk_mig_init(void)
 {
-    QSIMPLEQ_INIT(&block_mig_state.bmds_list);
-    QSIMPLEQ_INIT(&block_mig_state.blk_list);
-
     register_savevm_live(NULL, "block", 0, 1, &savevm_block_handlers,
                          &block_mig_state);
 }

> diff --git a/block-migration.c b/block-migration.c
> index 7def8ab..ed93301 100644
> --- a/block-migration.c
> +++ b/block-migration.c
> @@ -519,6 +519,8 @@ static void blk_mig_cleanup(void)
>      BlkMigDevState *bmds;
>      BlkMigBlock *blk;
>  
> +    bdrv_drain_all();
> +
>      set_dirty_tracking(0);
>  
>      while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
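The proposal above moves the queue and counter initialization from process startup to the start of each migration, so a second run never inherits leftovers from a cancelled first run. A minimal sketch of that pattern follows; it assumes a hand-rolled singly linked queue rather than QEMU's `QSIMPLEQ` macros, and `struct mig_state`, `queue_push()`, `queue_len()` and `mig_start()` are illustrative names only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not QEMU's data structures. */
struct blk {
    struct blk *next;
};

struct mig_state {
    struct blk *head;       /* queue of blocks waiting to be sent */
    struct blk **tail;      /* points at the last next-pointer */
    long prev_time_offset;
    int reads;
    int read_done;
};

static void queue_init(struct mig_state *s)
{
    s->head = NULL;
    s->tail = &s->head;
}

static void queue_push(struct mig_state *s, struct blk *b)
{
    b->next = NULL;
    *s->tail = b;
    s->tail = &b->next;
}

static size_t queue_len(const struct mig_state *s)
{
    size_t n = 0;

    for (const struct blk *b = s->head; b != NULL; b = b->next) {
        n++;
    }
    return n;
}

/* Called at the start of every migration run, not once at startup. */
static void mig_start(struct mig_state *s)
{
    queue_init(s);
    s->prev_time_offset = 0;
    s->reads = 0;
    s->read_done = 0;
}
```

Re-initializing a queue that still holds entries silently drops them, so real code would either free the entries during cleanup or, as the message suggests as an alternative, assert that the queues are really empty before starting.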
Re: [Qemu-devel] Block Migration Assertion in qemu-kvm 1.2.0
On 16.09.2012 12:13, Peter Lieven wrote:
> Hi,
>
> when trying to block migrate a VM from one node to another, the source VM crashed with the following assertion:
> block.c:3829: bdrv_set_in_use: Assertion `bs->in_use != in_use' failed.
>
> Is this sth already addressed/known?

Not that I'm aware of, at least. Block migration doesn't seem to check whether the device is already in use, maybe this is the problem. Not sure why it would be in use, though, and in my quick test it didn't crash.

So we need some more information: What's your command line, did you do anything specific in the monitor with block devices, what does the stacktrace look like, etc.?

Kevin
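The assertion text shows that the setter expects the flag to change on every call, so marking a device as in use twice aborts the process. A minimal sketch of that contract, and of a caller that checks first, could look like the following; `struct device`, `start_block_migration()` and `finish_block_migration()` are hypothetical names, and only the asserted condition is taken from the report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the block device state. */
struct device {
    bool in_use;
};

/* Strict setter mirroring the failing check: the flag must change. */
static void set_in_use(struct device *d, bool in_use)
{
    assert(d->in_use != in_use);
    d->in_use = in_use;
}

/* Defensive caller: report an error instead of tripping the assertion. */
static int start_block_migration(struct device *d)
{
    if (d->in_use) {
        return -1;  /* already claimed, e.g. by an unfinished earlier run */
    }
    set_in_use(d, true);
    return 0;
}

static void finish_block_migration(struct device *d)
{
    set_in_use(d, false);
}
```

With this shape, a second start without an intervening finish returns an error that the caller can report, rather than killing the source VM.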
Re: [Qemu-devel] Block Migration Assertion in qemu-kvm 1.2.0
On 09/17/12 10:41, Kevin Wolf wrote:
> On 16.09.2012 12:13, Peter Lieven wrote:
>> Hi,
>>
>> when trying to block migrate a VM from one node to another, the source VM crashed with the following assertion:
>> block.c:3829: bdrv_set_in_use: Assertion `bs->in_use != in_use' failed.
>>
>> Is this sth already addressed/known?
> Not that I'm aware of, at least. Block migration doesn't seem to check whether the device is already in use, maybe this is the problem. Not sure why it would be in use, though, and in my quick test it didn't crash.

It seems that it only happens if a vServer that has been block migrated earlier is block migrated the next time.

> So we need some more information: What's your command line, did you do anything specific in the monitor with block devices, what does the stacktrace look like, etc.?

Here is my cmdline:

/usr/bin/qemu-kvm-1.2.0 -net tap,vlan=164,script=no,downscript=no,ifname=tap0 -net nic,vlan=164,model=e1000,macaddr=52:54:00:ff:01:19 -drive format=host_device,file=/dev/7cf58855099771c2/lieven-storage-migration-t-hd0,if=virtio,cache=none,aio=native -m 2048 -smp 2,sockets=1,cores=2,threads=1 -monitor tcp:0:4001,server,nowait -vnc :1 -qmp tcp:0:3001,server,nowait -name 'lieven-storage-migration-test' -boot order=dc,menu=off -k de -incoming tcp:172.21.55.34:5001 -pidfile /var/run/qemu/vm-254.pid -mem-path /hugepages -mem-prealloc -rtc base=utc -usb -usbdevice tablet -no-hpet -vga cirrus -cpu host,+x2apic,model_id='Intel(R) Xeon(R) CPU L5640 @ 2.27GHz',-tsc

I have seen other errors as well in the meantime:

block-migration.c:471: flush_blks: Assertion `block_mig_state.read_done >= 0' failed.

qemu-kvm-1.2.0[27851]: segfault at 7f00746e78d7 ip 7f67eca6226d sp 7fff56ae3340 error 4 in qemu-system-x86_64[7f67ec9e9000+418000]

I will now try to catch the situation in the debugger.

Thanks,
Peter

> Kevin
Re: [Qemu-devel] Block Migration Assertion in qemu-kvm 1.2.0
On 09/17/12 10:41, Kevin Wolf wrote:
> On 16.09.2012 12:13, Peter Lieven wrote:
>> Hi,
>>
>> when trying to block migrate a VM from one node to another, the source VM crashed with the following assertion:
>> block.c:3829: bdrv_set_in_use: Assertion `bs->in_use != in_use' failed.
>>
>> Is this sth already addressed/known?
> Not that I'm aware of, at least. Block migration doesn't seem to check whether the device is already in use, maybe this is the problem. Not sure why it would be in use, though, and in my quick test it didn't crash.
>
> So we need some more information: What's your command line, did you do anything specific in the monitor with block devices, what does the stacktrace look like, etc.?

i was also able to reproduce a

flush_blks: Assertion `block_mig_state.read_done >= 0' failed.

by cancelling a block migration and restarting it afterwards. however, how can I grab a stack trace after an assert?

thanks,
peter

> Kevin