Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
* Wen Congyang (we...@cn.fujitsu.com) wrote: On 07/02/2015 02:42 AM, Dr. David Alan Gilbert wrote: * Wen Congyang (we...@cn.fujitsu.com) wrote: On 07/01/2015 04:11 PM, Dr. David Alan Gilbert wrote: * Wen Congyang (we...@cn.fujitsu.com) wrote: On 07/01/2015 03:01 AM, Dr. David Alan Gilbert wrote: * Wen Congyang (we...@cn.fujitsu.com) wrote: On 06/27/2015 03:03 AM, Dr. David Alan Gilbert wrote:

[snip]

Ah, I hadn't realised you could do that; so do you just do:

    migrate_set_parameter colo on
    migrate -d -b tcp:otherhost:port

How does the secondary know to feed that data straight into the disk without recording all the old data into the hidden disk?

The hidden disk and active disk will be made empty when block replication starts.

Hmm, yes - I think I need to update to your current world; in the version from the end of May, I get an 'error while loading state for instance 0x0 of device block' if I try to use migrate -d -b (the bdrv_write fails).

Can you give me both the primary and secondary qemu command lines? I think the command line is wrong, and that is why disk migration fails.

Primary:

    ./try/bin/qemu-system-x86_64 -enable-kvm -nographic \
     -boot c -m 4096 -smp 4 -S \
     -name debug-threads=on -trace events=trace-file \
     -netdev tap,id=hn0,script=$PWD/ifup-prim,\
    downscript=no,colo_script=$PWD/qemu/scripts/colo-proxy-script.sh,colo_nicname=em4 \
     -device e1000,mac=9c:da:4d:1c:b5:89,id=net-pci0,netdev=hn0 \
     -drive if=virtio,driver=quorum,read-pattern=fifo,no-connect=on,\
    cache=none,aio=native,\
    children.0.file.filename=./bugzilla.raw,\
    children.0.driver=raw,\
    children.1.file.driver=nbd,\
    children.1.file.host=ibpair,\
    children.1.file.port=8889,\
    children.1.file.export=colo1,\
    children.1.driver=replication,\
    children.1.mode=primary,\
    children.1.ignore-errors=on

Add id=nbd_target1 to the primary disk option, and try it. Disk migration needs the same id on both sides to sync the disk.

Thank you! That worked nicely. The only odd thing was that the hidden disk on the secondary went up to ~2GB in size during the disk copy (this is still on the version you posted at the end of May). I don't really understand why it was 2GB - the disk was 40GB, and qemu-img tells me that only 2.6GB of it were used. Still, it would be good to avoid the overhead of going through the hidden disk on the secondary for the initial replication.

Yes, I have fixed it in v7: the backup job is started later.

Thanks, I'll try it.

Secondary:

    ./try/bin/qemu-system-x86_64 -enable-kvm -nographic \
     -boot c -m 4096 -smp 4 -S \
     -name debug-threads=on -trace events=trace-file \
     -netdev tap,id=hn0,script=$PWD/ifup-slave,\
    downscript=no,colo_script=$PWD/qemu/scripts/colo-proxy-script.sh,colo_nicname=em4 \
     -device e1000,mac=9c:da:4d:1c:b5:89,id=net-pci0,netdev=hn0 \
     -drive if=none,driver=raw,file=bugzilla.raw,id=nbd_target1,cache=none,aio=native \
     -drive if=virtio,driver=replication,mode=secondary,export=colo1,throttling.bps-total-max=7000,\
    file.file.filename=/run/colo-active-disk.qcow2,\
    file.driver=qcow2,\
    file.backing_reference.drive_id=nbd_target1,\
    file.backing_reference.hidden-disk.file.filename=/run/colo-hidden-disk.qcow2,\
    file.backing_reference.hidden-disk.driver=qcow2,\
    file.backing_reference.hidden-disk.allow-write-backing-file=on \
     -incoming tcp:0:

If the user uses a mirror job, we don't cancel the mirror job now.

It would be good to get it to work with mirror; that seems preferred these days to the old block migration.

In normal migration, is the mirror job created and cancelled by libvirt?

Yes, I think so; you should be able to turn on full logging on libvirt and watch the QMP commands it sends.

Supporting the mirror job is on my TODO list now. But I think we should focus on the basic function first.

Thanks
Wen Congyang

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
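(To watch the QMP traffic as suggested above: one way - a sketch assuming a reasonably recent libvirt, not something taken from this thread - is to raise the log level for libvirt's QEMU monitor code in /etc/libvirt/libvirtd.conf, then grep the log for drive-mirror / block-job-cancel during a migration:)

    # /etc/libvirt/libvirtd.conf - log all QMP commands and replies
    log_filters="1:qemu_monitor_json"
    log_outputs="1:file:/var/log/libvirt/libvirtd.log"

Restart libvirtd afterwards so the settings take effect.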
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On 07/02/2015 03:54 PM, Dr. David Alan Gilbert wrote: * Wen Congyang (we...@cn.fujitsu.com) wrote:

[snip]

Still, it would be good to avoid the overhead of going through the hidden disk on the secondary for the initial replication.

Yes, I have fixed it in v7: the backup job is started later.

Thanks, I'll try it.

Note: the usage has changed, and you can get the new usage from the wiki: http://wiki.qemu.org/Features/COLO

Thanks
Wen Congyang

[snip]
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On 07/01/2015 04:11 PM, Dr. David Alan Gilbert wrote: * Wen Congyang (we...@cn.fujitsu.com) wrote:

[snip]

Can you give me both the primary and secondary qemu command lines? I think the command line is wrong, and that is why disk migration fails.

Primary: [snip - the full command line is quoted in the first message above]

Add id=nbd_target1 to the primary disk option, and try it. Disk migration needs the same id on both sides to sync the disk.

Thanks
Wen Congyang

[snip]
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
* Wen Congyang (we...@cn.fujitsu.com) wrote:

[snip]

Add id=nbd_target1 to the primary disk option, and try it. Disk migration needs the same id on both sides to sync the disk.

Thank you! That worked nicely. The only odd thing was that the hidden disk on the secondary went up to ~2GB in size during the disk copy (this is still on the version you posted at the end of May). I don't really understand why it was 2GB - the disk was 40GB, and qemu-img tells me that only 2.6GB of it were used. Still, it would be good to avoid the overhead of going through the hidden disk on the secondary for the initial replication.

Dave

[snip]

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
* Wen Congyang (we...@cn.fujitsu.com) wrote:

[snip]

Can you give me both the primary and secondary qemu command lines? I think the command line is wrong, and that is why disk migration fails.

Primary and Secondary: [snip - both command lines are quoted in full in the first message above]

Thanks, Dave

[snip]

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On 07/02/2015 02:42 AM, Dr. David Alan Gilbert wrote:

[snip]

Thank you! That worked nicely. The only odd thing was that the hidden disk on the secondary went up to ~2GB in size during the disk copy. Still, it would be good to avoid the overhead of going through the hidden disk on the secondary for the initial replication.

Yes, I have fixed it in v7: the backup job is started later.

Thanks
Wen Congyang

[snip]
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On 07/01/2015 03:01 AM, Dr. David Alan Gilbert wrote:

[snip]

Hmm, yes - I think I need to update to your current world; in the version from the end of May, I get an 'error while loading state for instance 0x0 of device block' if I try to use migrate -d -b (the bdrv_write fails).

Can you give me both the primary and secondary qemu command lines? I think the command line is wrong, and that is why disk migration fails.

[snip]

Supporting the mirror job is on my TODO list now. But I think we should focus on the basic function first.

Thanks
Wen Congyang
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
* Wen Congyang (we...@cn.fujitsu.com) wrote: On 06/27/2015 03:03 AM, Dr. David Alan Gilbert wrote:

[snip]

The hidden disk and active disk will be made empty when block replication starts.

Hmm, yes - I think I need to update to your current world; in the version from the end of May, I get an 'error while loading state for instance 0x0 of device block' if I try to use migrate -d -b (the bdrv_write fails).

If the user uses a mirror job, we don't cancel the mirror job now.

It would be good to get it to work with mirror; that seems preferred these days to the old block migration.

In normal migration, is the mirror job created and cancelled by libvirt?

Yes, I think so; you should be able to turn on full logging on libvirt and watch the QMP commands it sends.

Dave

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On 06/27/2015 03:03 AM, Dr. David Alan Gilbert wrote: * Wen Congyang (we...@cn.fujitsu.com) wrote: On 06/24/2015 10:07 PM, Dr. David Alan Gilbert wrote: * Wen Congyang (ghost...@gmail.com) wrote: At 2015/6/19 18:49, Stefan Hajnoczi wrote: On Fri, Jun 19, 2015 at 08:54:56AM +0800, Wen Congyang wrote: On 06/19/2015 12:06 AM, Stefan Hajnoczi wrote: On Thu, Jun 18, 2015 at 10:36:39PM +0800, Wen Congyang wrote: At 2015/6/18 20:55, Stefan Hajnoczi wrote: On Thu, Jun 18, 2015 at 04:49:12PM +0800, Wen Congyang wrote:

    +void bdrv_connect(BlockDriverState *bs, Error **errp)
    +{
    +    BlockDriver *drv = bs->drv;
    +
    +    if (drv && drv->bdrv_connect) {
    +        drv->bdrv_connect(bs, errp);
    +    } else if (bs->file) {
    +        bdrv_connect(bs->file, errp);
    +    } else {
    +        error_setg(errp, "this feature or command is not currently supported");
    +    }
    +}
    +
    +void bdrv_disconnect(BlockDriverState *bs)
    +{
    +    BlockDriver *drv = bs->drv;
    +
    +    if (drv && drv->bdrv_disconnect) {
    +        drv->bdrv_disconnect(bs);
    +    } else if (bs->file) {
    +        bdrv_disconnect(bs->file);
    +    }
    +}

Please add doc comments describing the semantics of these commands.

Where should it be documented? In the header file?

block.h doesn't document prototypes in the header file; please document the function definitions in block.c. (QEMU is not consistent here, some places do it the other way around.)

Why are these operations needed when there is already a bs->drv == NULL case, which means the BDS is not ready for read/write?

The purpose is: don't connect to the nbd server when opening an nbd client; connect/disconnect to the nbd server when we need to do it. IIUC, if bs->drv is NULL, it means that the driver is ejected? Here, connect/disconnect means connecting to/disconnecting from the remote target (the target may be on another host).

Connect/disconnect puts something on the QEMU command line that isn't ready at startup time. How about using monitor commands to add objects when needed instead? That is cleaner because it doesn't introduce a new state (which is only implemented for nbd).

The problem is that the nbd client is one child of quorum, and quorum must have more than one child. The nbd server is not ready until COLO is running.

A monitor command to hot add/remove quorum children solves this problem and could also be used in other scenarios (e.g. the user decides to take a quorum child offline).

For the replication case, we always do checkpoints again and again after migration. If the disk is not synced before migration, we will use disk migration or a mirror job to sync it.

Can you document the way that you use disk migration or mirror, together with your COLO setup? I think it would make it easier to understand this restriction. At the moment I don't understand how you can switch from doing a disk migration into COLO mode - there seems to be a gap between the end of disk migration and the start of COLO.

In my test, I use disk migration. After migration and before starting COLO, we disable disk migration:

    +    /* Disable block migration */
    +    s->params.blk = 0;
    +    s->params.shared = 0;
    +    qemu_savevm_state_begin(trans, &s->params);

Ah, I hadn't realised you could do that; so do you just do:

    migrate_set_parameter colo on
    migrate -d -b tcp:otherhost:port

How does the secondary know to feed that data straight into the disk without recording all the old data into the hidden disk?

The hidden disk and active disk will be made empty when block replication starts.

If the user uses a mirror job, we don't cancel the mirror job now.

It would be good to get it to work with mirror; that seems preferred these days to the old block migration.

In normal migration, is the mirror job created and cancelled by libvirt?

With block migration or mirror, that pretty much allows you to replace a failed secondary without restarting, doesn't it? The only thing stopping you is that the NBD client needs to point to the address of the new secondary. Stefan's suggestion of being able to modify the quorum on the fly would seem to fix that problem.

Yes, I am implementing adding/removing a quorum child now.

(The other thing I can see is that the secondary wouldn't have the conntrack data for open connections; but that's a separate problem.) A related topic was that a colleague was asking about starting qemu up and only then setting the NBD target (he's trying to wire it into OpenStack); I suggested that instead of specifying the drive on the command line, it might work to use a drive_add to add the whole quorum/drive stack before starting the guest; however, again, something to modify the quorum would be easier. Finally, the last good reason I can see for being able to modify the quorum on the fly is that you'll need it for continuous FT, to be able to transmute a secondary into a new primary.

Yes, I think adding a filter on top is also needed.

Thanks
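(As a sketch of the quorum hot-add/remove idea under discussion: at the QMP level it might look like the following. The command and argument names here are hypothetical illustrations, not an interface from this patch series:)

    { "execute": "x-quorum-child-add",
      "arguments": { "device": "quorum0", "node": "nbd-client0" } }

    { "execute": "x-quorum-child-del",
      "arguments": { "device": "quorum0", "child": "children.1" } }

Replacing a failed secondary would then be a del of the dead NBD child followed by an add pointing at the new secondary's NBD server, with no guest restart.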
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
* Wen Congyang (we...@cn.fujitsu.com) wrote:

[snip]

In my test, I use disk migration. After migration and before starting COLO, we disable disk migration. If the user uses a mirror job, we don't cancel the mirror job now.

It would be good to get it to work with mirror; that seems preferred these days to the old block migration.

With block migration or mirror, that pretty much allows you to replace a failed secondary without restarting, doesn't it? The only thing stopping you is that the NBD client needs to point to the address of the new secondary. Stefan's suggestion of being able to modify the quorum on the fly would seem to fix that problem.

(The other thing I can see is that the secondary wouldn't have the conntrack data for open connections; but that's a separate problem.) A related topic was that a colleague was asking about starting qemu up and only then setting the NBD target (he's trying to wire it into OpenStack); I suggested that instead of specifying the drive on the command line, it might work to use a drive_add to add the whole quorum/drive stack before starting the guest; however, again, something to modify the quorum would be easier. Finally, the last good reason I can see for being able to modify the quorum on the fly is that you'll need it for continuous FT, to be able to transmute a secondary into a new primary.

Dave

[snip]

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
* Wen Congyang (ghost...@gmail.com) wrote:

[snip]

So we cannot start block replication while migration is running. We need the nbd client to be not ready while migration is running, and ready between the point migration ends and the first checkpoint begins. Using a monitor command to add the nbd client would cause larger downtime. So if the nbd client has already been added (only not yet connected to the nbd server), we can connect to the nbd server automatically.

Can you document the way that you use disk migration or mirror, together with your COLO setup? I think it would make it easier to understand this restriction. At the moment I don't understand how you can switch from doing a disk migration into COLO mode - there seems to be a gap between the end of disk migration and the start of COLO.

Without the disk migration or mirror, I can't see the need for the delayed bdrv_connect. I can see that you want to do a disconnect at failover; however, you need to take care, because if the network is broken at the point of failover you need to make sure nothing blocks.

Dave

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
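(The "make sure nothing blocks" point can be made concrete. A minimal sketch in plain POSIX C, assuming the NBD client holds a TCP socket fd - an illustration of the failover concern, not code from the series: an orderly NBD goodbye can hang on a dead peer, so the failover path should tear the socket down without waiting for the peer to answer.)

    #include <sys/socket.h>
    #include <unistd.h>

    /* Tear down the link to a possibly-dead peer without blocking.
     * Do NOT send NBD_CMD_DISC and wait for a reply here: if the
     * network is already broken at failover time, that wait never
     * ends.  shutdown() makes any in-flight I/O fail immediately. */
    static void nbd_teardown_nonblocking(int sockfd)
    {
        shutdown(sockfd, SHUT_RDWR);
        close(sockfd);
    }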
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On 06/24/2015 10:07 PM, Dr. David Alan Gilbert wrote:

[snip]

Can you document the way that you use disk migration or mirror, together with your COLO setup? I think it would make it easier to understand this restriction. At the moment I don't understand how you can switch from doing a disk migration into COLO mode - there seems to be a gap between the end of disk migration and the start of COLO.

In my test, I use disk migration. After migration and before starting COLO, we disable disk migration:

    +    /* Disable block migration */
    +    s->params.blk = 0;
    +    s->params.shared = 0;
    +    qemu_savevm_state_begin(trans, &s->params);

If the user uses a mirror job, we don't cancel the mirror job now.

[snip]

Without the disk migration or mirror, I can't see the need for the delayed bdrv_connect. I can see that you want to do a disconnect at failover; however, you need to take care, because if the network is broken at the point of failover you need to make sure nothing blocks.

Disk migration is needed if the disk is not synced before migration.

Thanks
Wen Congyang
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
At 2015/6/23 21:42, Stefan Hajnoczi wrote: On Mon, Jun 22, 2015 at 09:56:24PM +0800, Wen Congyang wrote:

[snip]

The NBD connect should not be performed during downtime, regardless of whether a monitor command is used or not.

Why?

NBD connection establishment takes unbounded time. It can extend the downtime period and cause users to notice that the guest is unavailable. The point of COLO is fault tolerance and high availability. In order to achieve that, COLO must minimize downtime. If an operation can be performed outside the migration downtime phase, then it should be done outside.

Yes, you are right. I will think about it.

Thanks
Wen Congyang
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On Mon, Jun 22, 2015 at 09:56:24PM +0800, Wen Congyang wrote:

[snip]

The NBD connect should not be performed during downtime, regardless of whether a monitor command is used or not.

Why?

NBD connection establishment takes unbounded time. It can extend the downtime period and cause users to notice that the guest is unavailable. The point of COLO is fault tolerance and high availability. In order to achieve that, COLO must minimize downtime. If an operation can be performed outside the migration downtime phase, then it should be done outside.

Stefan
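(One way to honour this, sketched in plain POSIX C under the assumption that the secondary's NBD address is known before the downtime window - a hypothetical helper, not code from the series: dial the connection from a helper thread while the guest is still running, and only hand the already-connected fd to the replication code at checkpoint time.)

    #include <netdb.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    struct preconnect { const char *host, *port; int fd; };

    /* Runs in a helper thread (e.g. via pthread_create) while the
     * guest is still executing, so connect() latency never lands in
     * the migration downtime window. */
    static void *preconnect_run(void *opaque)
    {
        struct preconnect *p = opaque;
        struct addrinfo hints, *res;

        p->fd = -1;
        memset(&hints, 0, sizeof(hints));
        hints.ai_socktype = SOCK_STREAM;
        if (getaddrinfo(p->host, p->port, &hints, &res) == 0) {
            int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
            if (fd >= 0 && connect(fd, res->ai_addr, res->ai_addrlen) == 0) {
                p->fd = fd;   /* handed over at checkpoint time */
            } else if (fd >= 0) {
                close(fd);
            }
            freeaddrinfo(res);
        }
        return NULL;
    }

The thread would be joined before the VM is paused; if fd is still -1, the checkpoint is delayed or aborted instead of blocking the guest.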
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On 06/19/2015 12:06 AM, Stefan Hajnoczi wrote:

[snip]

Connect/disconnect puts something on the QEMU command line that isn't ready at startup time. How about using monitor commands to add objects when needed instead? That is cleaner because it doesn't introduce a new state (which is only implemented for nbd).

We have decided to use this approach; see: http://lists.gnu.org/archive/html/qemu-devel/2015-04/msg02975.html

Thanks
Wen Congyang
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On 06/19/2015 06:49 PM, Stefan Hajnoczi wrote:

[snip]

The problem is that the nbd client is one child of quorum, and quorum must have more than one child. The nbd server is not ready until COLO is running.

A monitor command to hot add/remove quorum children solves this problem and could also be used in other scenarios (e.g. the user decides to take a quorum child offline).

CCing Alberto Garcia for the quorum block driver.
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On Sat, Jun 20, 2015 at 11:31:52AM +0800, Wen Congyang wrote:

[snip]

For the replication case, we always do checkpoints again and again after migration. If the disk is not synced before migration, we will use disk migration or a mirror job to sync it. So we cannot start block replication while migration is running. We need the nbd client to be not ready while migration is running, and ready between the point migration ends and the first checkpoint begins. Using a monitor command to add the nbd client would cause larger downtime. So if the nbd client has already been added (only not yet connected to the nbd server), we can connect to the nbd server automatically.

I'm not sure I understand you. Can you rephrase this?

The NBD connect should not be performed during downtime, regardless of whether a monitor command is used or not.

Stefan
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
At 2015/6/22 20:39, Stefan Hajnoczi wrote:

[snip]

I'm not sure I understand you. Can you rephrase this?

We only use block replication after migration, because we assume that the disk has been synced before block replication is started; so we only forward new write requests via nbd.

The NBD connect should not be performed during downtime, regardless of whether a monitor command is used or not.

Why?

Thanks
Wen Congyang
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
At 2015/6/19 18:49, Stefan Hajnoczi wrote:

[snip]

A monitor command to hot add/remove quorum children solves this problem and could also be used in other scenarios (e.g. the user decides to take a quorum child offline).

For the replication case, we always do checkpoints again and again after migration. If the disk is not synced before migration, we will use disk migration or a mirror job to sync it. So we cannot start block replication while migration is running. We need the nbd client to be not ready while migration is running, and ready between the point migration ends and the first checkpoint begins. Using a monitor command to add the nbd client would cause larger downtime. So if the nbd client has already been added (only not yet connected to the nbd server), we can connect to the nbd server automatically.

Thanks
Wen Congyang
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On Fri, Jun 19, 2015 at 08:54:56AM +0800, Wen Congyang wrote:
> On 06/19/2015 12:06 AM, Stefan Hajnoczi wrote:
>> On Thu, Jun 18, 2015 at 10:36:39PM +0800, Wen Congyang wrote:
>>> At 2015/6/18 20:55, Stefan Hajnoczi wrote:
>>>> On Thu, Jun 18, 2015 at 04:49:12PM +0800, Wen Congyang wrote:
>>>>> +void bdrv_connect(BlockDriverState *bs, Error **errp)
>>>>> +{
>>>>> +    BlockDriver *drv = bs->drv;
>>>>> +
>>>>> +    if (drv && drv->bdrv_connect) {
>>>>> +        drv->bdrv_connect(bs, errp);
>>>>> +    } else if (bs->file) {
>>>>> +        bdrv_connect(bs->file, errp);
>>>>> +    } else {
>>>>> +        error_setg(errp, "this feature or command is not currently supported");
>>>>> +    }
>>>>> +}
>>>>> +
>>>>> +void bdrv_disconnect(BlockDriverState *bs)
>>>>> +{
>>>>> +    BlockDriver *drv = bs->drv;
>>>>> +
>>>>> +    if (drv && drv->bdrv_disconnect) {
>>>>> +        drv->bdrv_disconnect(bs);
>>>>> +    } else if (bs->file) {
>>>>> +        bdrv_disconnect(bs->file);
>>>>> +    }
>>>>> +}
>>>>
>>>> Please add doc comments describing the semantics of these commands.
>>>
>>> Where should it be documented? In the header file?
>>
>> block.h doesn't document prototypes in the header file; please document
>> the function definition in block.c. (QEMU is not consistent here, some
>> places do it the other way around.)
>>
>>>> Why are these operations needed when there is already a bs->drv == NULL
>>>> case, which means the BDS is not ready for read/write?
>>>
>>> The purpose is: don't connect to the NBD server when opening the NBD
>>> client; connect/disconnect to the NBD server only when we need to do it.
>>>
>>> IIUC, if bs->drv is NULL, it means that the driver is ejected? Here,
>>> connect/disconnect means connecting to/disconnecting from the remote
>>> target (the target may be on another host).
>>
>> Connect/disconnect puts something on the QEMU command-line that isn't
>> ready at startup time.
>>
>> How about using monitor commands to add objects when needed instead?
>>
>> That is cleaner because it doesn't introduce a new state (which is only
>> implemented for nbd).
>
> The problem is that the NBD client is one child of the quorum, and the
> quorum must have more than one child. The NBD server is not ready until
> COLO is running.

A monitor command to hot add/remove quorum children solves this problem
and could also be used in other scenarios (e.g. the user decides to take
a quorum child offline).
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On Thu, Jun 18, 2015 at 04:49:12PM +0800, Wen Congyang wrote:
> +void bdrv_connect(BlockDriverState *bs, Error **errp)
> +{
> +    BlockDriver *drv = bs->drv;
> +
> +    if (drv && drv->bdrv_connect) {
> +        drv->bdrv_connect(bs, errp);
> +    } else if (bs->file) {
> +        bdrv_connect(bs->file, errp);
> +    } else {
> +        error_setg(errp, "this feature or command is not currently supported");
> +    }
> +}
> +
> +void bdrv_disconnect(BlockDriverState *bs)
> +{
> +    BlockDriver *drv = bs->drv;
> +
> +    if (drv && drv->bdrv_disconnect) {
> +        drv->bdrv_disconnect(bs);
> +    } else if (bs->file) {
> +        bdrv_disconnect(bs->file);
> +    }
> +}

Please add doc comments describing the semantics of these commands.

Why are these operations needed when there is already a bs->drv == NULL
case, which means the BDS is not ready for read/write?
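One possible shape for the requested doc comments, sketched in my own wording rather than taken from the series (and which, per the follow-up in this thread, would sit above the definitions in block.c rather than in block.h):

    /**
     * bdrv_connect:
     * @bs: the block driver state
     * @errp: pointer to a NULL-initialized Error object
     *
     * Connect @bs to its remote target (for example an NBD server,
     * possibly on another host).  If the driver does not implement
     * .bdrv_connect, the request is forwarded to bs->file; a driver
     * with no notion of a remote target sets an error instead.
     */
    void bdrv_connect(BlockDriverState *bs, Error **errp);

    /**
     * bdrv_disconnect:
     * @bs: the block driver state
     *
     * Disconnect @bs from its remote target.  Drivers that neither
     * implement .bdrv_disconnect nor have a bs->file treat this as a
     * no-op.
     */
    void bdrv_disconnect(BlockDriverState *bs);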
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
At 2015/6/18 20:55, Stefan Hajnoczi wrote:
> On Thu, Jun 18, 2015 at 04:49:12PM +0800, Wen Congyang wrote:
>> +void bdrv_connect(BlockDriverState *bs, Error **errp)
>> +{
>> +    BlockDriver *drv = bs->drv;
>> +
>> +    if (drv && drv->bdrv_connect) {
>> +        drv->bdrv_connect(bs, errp);
>> +    } else if (bs->file) {
>> +        bdrv_connect(bs->file, errp);
>> +    } else {
>> +        error_setg(errp, "this feature or command is not currently supported");
>> +    }
>> +}
>> +
>> +void bdrv_disconnect(BlockDriverState *bs)
>> +{
>> +    BlockDriver *drv = bs->drv;
>> +
>> +    if (drv && drv->bdrv_disconnect) {
>> +        drv->bdrv_disconnect(bs);
>> +    } else if (bs->file) {
>> +        bdrv_disconnect(bs->file);
>> +    }
>> +}
>
> Please add doc comments describing the semantics of these commands.

Where should it be documented? In the header file?

> Why are these operations needed when there is already a bs->drv == NULL
> case, which means the BDS is not ready for read/write?

The purpose is: don't connect to the NBD server when opening the NBD
client; connect/disconnect to the NBD server only when we need to do it.

IIUC, if bs->drv is NULL, it means that the driver is ejected? Here,
connect/disconnect means connecting to/disconnecting from the remote
target (the target may be on another host).

Thanks
Wen Congyang
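To spell out the distinction Wen is drawing, here is a small sketch under stated assumptions: ensure_usable() is invented, and bdrv_is_connected() is a hypothetical query helper that the series does not add, used here only to name the new "configured but unconnected" state.

    #include "block/block.h"
    #include "qapi/error.h"

    static bool bdrv_is_connected(BlockDriverState *bs);  /* hypothetical */

    static void ensure_usable(BlockDriverState *bs, Error **errp)
    {
        if (!bs->drv) {
            /* Existing case: no driver attached at all (e.g. the medium
             * was ejected); the BDS cannot serve any I/O. */
            error_setg(errp, "no driver attached");
            return;
        }
        if (!bdrv_is_connected(bs)) {
            /* New state from this patch: the driver is configured, but
             * its remote target (possibly on another host) is down. */
            bdrv_connect(bs, errp);
        }
    }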
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On Thu, Jun 18, 2015 at 10:36:39PM +0800, Wen Congyang wrote:
> At 2015/6/18 20:55, Stefan Hajnoczi wrote:
>> On Thu, Jun 18, 2015 at 04:49:12PM +0800, Wen Congyang wrote:
>>> +void bdrv_connect(BlockDriverState *bs, Error **errp)
>>> +{
>>> +    BlockDriver *drv = bs->drv;
>>> +
>>> +    if (drv && drv->bdrv_connect) {
>>> +        drv->bdrv_connect(bs, errp);
>>> +    } else if (bs->file) {
>>> +        bdrv_connect(bs->file, errp);
>>> +    } else {
>>> +        error_setg(errp, "this feature or command is not currently supported");
>>> +    }
>>> +}
>>> +
>>> +void bdrv_disconnect(BlockDriverState *bs)
>>> +{
>>> +    BlockDriver *drv = bs->drv;
>>> +
>>> +    if (drv && drv->bdrv_disconnect) {
>>> +        drv->bdrv_disconnect(bs);
>>> +    } else if (bs->file) {
>>> +        bdrv_disconnect(bs->file);
>>> +    }
>>> +}
>>
>> Please add doc comments describing the semantics of these commands.
>
> Where should it be documented? In the header file?

block.h doesn't document prototypes in the header file; please document
the function definition in block.c. (QEMU is not consistent here, some
places do it the other way around.)

>> Why are these operations needed when there is already a bs->drv == NULL
>> case, which means the BDS is not ready for read/write?
>
> The purpose is: don't connect to the NBD server when opening the NBD
> client; connect/disconnect to the NBD server only when we need to do it.
>
> IIUC, if bs->drv is NULL, it means that the driver is ejected? Here,
> connect/disconnect means connecting to/disconnecting from the remote
> target (the target may be on another host).

Connect/disconnect puts something on the QEMU command-line that isn't
ready at startup time.

How about using monitor commands to add objects when needed instead?

That is cleaner because it doesn't introduce a new state (which is only
implemented for nbd).
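For a sense of what the monitor-command alternative might look like, a hypothetical sketch: no such command existed at the time of this thread, qmp_quorum_child_add() and quorum_add_child() are invented names, and the driver-level plumbing they imply would still have to be written. Only bdrv_find_node() is an existing QEMU helper.

    #include "block/block.h"
    #include "qapi/error.h"

    /* Invented driver-level hook; quorum has no such function here. */
    void quorum_add_child(BlockDriverState *quorum, BlockDriverState *child,
                          Error **errp);

    /* Hypothetical QMP handler: attach an existing node as a new quorum
     * child, so the NBD client could be created and wired in only once
     * the NBD server is actually ready. */
    void qmp_quorum_child_add(const char *parent, const char *child,
                              Error **errp)
    {
        BlockDriverState *quorum = bdrv_find_node(parent);
        BlockDriverState *new_child = bdrv_find_node(child);

        if (!quorum || !new_child) {
            error_setg(errp, "node not found");
            return;
        }
        quorum_add_child(quorum, new_child, errp);
    }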
Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
On 06/19/2015 12:06 AM, Stefan Hajnoczi wrote:
> On Thu, Jun 18, 2015 at 10:36:39PM +0800, Wen Congyang wrote:
>> At 2015/6/18 20:55, Stefan Hajnoczi wrote:
>>> On Thu, Jun 18, 2015 at 04:49:12PM +0800, Wen Congyang wrote:
>>>> +void bdrv_connect(BlockDriverState *bs, Error **errp)
>>>> +{
>>>> +    BlockDriver *drv = bs->drv;
>>>> +
>>>> +    if (drv && drv->bdrv_connect) {
>>>> +        drv->bdrv_connect(bs, errp);
>>>> +    } else if (bs->file) {
>>>> +        bdrv_connect(bs->file, errp);
>>>> +    } else {
>>>> +        error_setg(errp, "this feature or command is not currently supported");
>>>> +    }
>>>> +}
>>>> +
>>>> +void bdrv_disconnect(BlockDriverState *bs)
>>>> +{
>>>> +    BlockDriver *drv = bs->drv;
>>>> +
>>>> +    if (drv && drv->bdrv_disconnect) {
>>>> +        drv->bdrv_disconnect(bs);
>>>> +    } else if (bs->file) {
>>>> +        bdrv_disconnect(bs->file);
>>>> +    }
>>>> +}
>>>
>>> Please add doc comments describing the semantics of these commands.
>>
>> Where should it be documented? In the header file?
>
> block.h doesn't document prototypes in the header file; please document
> the function definition in block.c. (QEMU is not consistent here, some
> places do it the other way around.)
>
>>> Why are these operations needed when there is already a bs->drv == NULL
>>> case, which means the BDS is not ready for read/write?
>>
>> The purpose is: don't connect to the NBD server when opening the NBD
>> client; connect/disconnect to the NBD server only when we need to do it.
>>
>> IIUC, if bs->drv is NULL, it means that the driver is ejected? Here,
>> connect/disconnect means connecting to/disconnecting from the remote
>> target (the target may be on another host).
>
> Connect/disconnect puts something on the QEMU command-line that isn't
> ready at startup time.
>
> How about using monitor commands to add objects when needed instead?
>
> That is cleaner because it doesn't introduce a new state (which is only
> implemented for nbd).

The problem is that the NBD client is one child of the quorum, and the
quorum must have more than one child. The NBD server is not ready until
COLO is running.

Any suggestion is welcome.

Thanks
Wen Congyang