Re: [Qemu-devel] [PATCH v5 08/11] qcow2: Rebuild refcount structure during check
On 11.10.2014 at 20:56, Benoît Canet wrote:
+int64_t first_free_cluster = 0, rt_ofs = -1, cluster = 0;
+int64_t rb_ofs, rb_start, rb_index;
Every time a few days pass and I read rb_ofs and rt_ofs again, I find these names obfuscated. I know Linus says that C is a spartan language, but these look odd. Maybe refcount_block_offset, refcount_block_* and refcount_table_offset would be better. I would use longer names if there were no line length limit. ;-) I'll try and see how it looks. Max
Re: [Qemu-devel] [PATCH] qcow2: fix double-free of Qcow2DiscardRegion in qcow2_process_discards
On 11.10.2014 at 09:14, Zhang Haoyu wrote: In qcow2_update_snapshot_refcount -> qcow2_process_discards() -> bdrv_discard(), bdrv_discard() may free the Qcow2DiscardRegion which is referenced by the next pointer in qcow2_process_discards(); in the next iteration, d = next, so g_free(d) will double-free this Qcow2DiscardRegion.
qcow2_snapshot_delete
|- qcow2_update_snapshot_refcount
|-- qcow2_process_discards
|--- bdrv_discard
     aio_poll
     |- aio_dispatch
     |-- bdrv_co_io_em_complete
     |--- qemu_coroutine_enter(co->coroutine, NULL); <=== coroutine entry is bdrv_co_do_rw
|--- g_free(d) <== freeing the first Qcow2DiscardRegion is okay
|--- d = next; <== this assignment is done in the QTAILQ_FOREACH_SAFE() macro
|--- g_free(d); <== double-free will happen if, during the previous iteration, bdrv_discard had freed this object
Do you have a reproducer for this, or did code review lead you to this? At the moment I can't see how bdrv_discard(bs->file) could ever free a Qcow2DiscardRegion of bs, as it's working on a completely different BlockDriverState (which usually won't even be a qcow2 one).
bdrv_co_do_rw
|- bdrv_co_do_writev
|-- bdrv_co_do_pwritev
|--- bdrv_aligned_pwritev
     qcow2_co_writev
     |- qcow2_alloc_cluster_link_l2
     |-- qcow2_free_any_clusters
     |--- qcow2_free_clusters
          update_refcount
          |- qcow2_process_discards
          |-- g_free(d) <== in the next iteration, this Qcow2DiscardRegion will be double-freed
This shouldn't happen in a nested call either, as s->lock can't be taken recursively. Kevin
Re: [Qemu-devel] [PATCH] oslib-posix: change free to g_free
On 11.10.2014 at 05:23, Eric Blake wrote: On 10/10/2014 08:54 PM, arei.gong...@huawei.com wrote: From: Gonglei arei.gong...@huawei.com The caller of qemu_vfree() may not check whether the ptr parameter is NULL or not, such as vpc_open(). Using g_free() is safer. NACK. g_free() is only safe for pointers allocated by g_malloc(). qemu_vfree() is for use on pointers allocated by qemu_try_memalign() and friends (matching the name valloc, which is an older spelling of posix_memalign), which are NOT allocated by g_malloc(). Furthermore, free(NULL) is safe. I second that. Strong NACK. Kevin
Re: [Qemu-devel] [PATCH v2] libvixl: a64: Skip -Wunused-variable for gcc 5.0.0 or higher
On 12 October 2014 01:32, Chen Gang gang.chen.5...@gmail.com wrote: On 10/12/14 5:25, Peter Maydell wrote: Some other approaches to this that would confine the fix to the makefiles rather than requiring us to modify the vixl source itself: a) add a -Wno- option for the affected .o files. It is one way, but it may also affect gcc 4, and it applies to the whole file, which is broader than the current approach. b) use -isystem rather than -I to include the libvixl directory on the include path. It sounds good to me, although for me it is not related to the current issue. -isystem disables a bunch of gcc warnings automatically, which is why I suggested it. I'm not overall sure it's a great idea though. -- PMM
Re: [Qemu-devel] [Bug?] qemu abort when trying to passthrough BCM5719 Gigabit Ethernet
On Sat, Oct 11, 2014 at 01:41:48AM -0600, Alex Williamson wrote: On Sat, 2014-10-11 at 13:58 +0800, zhanghailiang wrote: Hi all, When I try to pass through a BCM5719 Gigabit Ethernet card to a guest using the qemu master branch, it aborts and shows kvm_set_phys_mem: error registering slot: Bad Address.
qemu command:
#./qemu/qemu/x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 -vnc :99 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -drive file=/home/suse11_sp3_64,if=none,id=drive-scsi0-0-0-0,format=raw,cache=none,aio=native -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 -device pci-assign,host=01:00.1,id=mydevice -net none
info about guest and host:
host OS: 3.16.5
guest OS: Novell SuSE Linux Enterprise Server 11 SP3
#cat /proc/cpuinfo
processor: 31
vendor_id: GenuineIntel
cpu family: 6
model: 62
model name: Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz
stepping: 4
microcode: 0x416
cpu MHz: 1926.875
cache size: 20480 KB
physical id: 1
siblings: 16
core id: 7
cpu cores: 8
apicid: 47
initial apicid: 47
fpu: yes
fpu_exception: yes
cpuid level: 13
wp: yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips: 4005.35
clflush size: 64
cache_alignment: 64
address sizes: 46 bits physical, 48 bits virtual
power management:
gdb info:
(gdb) bt
#0  0x71ce9989 in raise () from /usr/lib64/libc.so.6
#1  0x71ceb098 in abort () from /usr/lib64/libc.so.6
#2  0x556275cf in kvm_set_phys_mem (section=0x7fffedcea790, add=true) at /home/qemu/qemu/kvm-all.c:711
#3  0x5562980f in address_space_update_topology_pass (as=as@entry=0x55d01ca0 <address_space_memory>, adding=adding@entry=true, new_view=<optimized out>, new_view=<optimized out>, old_view=0x7fffe8022a90, old_view=0x7fffe8022a90) at /home/qemu/qemu/memory.c:752
#4  0x5562b910 in address_space_update_topology (as=0x55d01ca0 <address_space_memory>) at /home/qemu/qemu/memory.c:767
#5  memory_region_transaction_commit () at /home/qemu/qemu/memory.c:808
#6  0x557a75b4 in pci_update_mappings (d=0x562ba9f0) at hw/pci/pci.c:1113
#7  0x557a7932 in pci_default_write_config (d=d@entry=0x562ba9f0, addr=addr@entry=20, val_in=val_in@entry=4294967295, l=l@entry=4) at hw/pci/pci.c:1165
#8  0x5566c17e in assigned_dev_pci_write_config (pci_dev=0x562ba9f0, address=20, val=4294967295, len=4) at /home/qemu/qemu/hw/i386/kvm/pci-assign.c:1196
#9  0x55628fea in access_with_adjusted_size (addr=addr@entry=0, value=value@entry=0x7fffedceaae0, size=size@entry=4, access_size_min=<optimized out>, access_size_max=<optimized out>, access=0x55629160 <memory_region_write_accessor>, mr=0x56231f00) at /home/qemu/qemu/memory.c:480
#10 0x5562dbf7 in memory_region_dispatch_write (size=4, data=18446744073709551615, addr=0, mr=0x56231f00) at /home/qemu/qemu/memory.c:1122
#11 io_mem_write (mr=mr@entry=0x56231f00, addr=0, val=<optimized out>, size=4) at /home/qemu/qemu/memory.c:1958
#12 0x555f8963 in address_space_rw (as=0x55d01d80 <address_space_io>, addr=addr@entry=3324, buf=0x77fec000 "\377\377\377\377", len=len@entry=4, is_write=is_write@entry=true) at /home/qemu/qemu/exec.c:2145
#13 0x55628491 in kvm_handle_io (count=1, size=4, direction=<optimized out>, data=<optimized out>, port=3324) at /home/qemu/qemu/kvm-all.c:1614
#14 kvm_cpu_exec (cpu=cpu@entry=0x5620e610) at /home/qemu/qemu/kvm-all.c:1771
#15 0x55617182 in qemu_kvm_cpu_thread_fn (arg=0x5620e610) at /home/qemu/qemu/cpus.c:953
#16 0x76ba2df3 in start_thread () from /usr/lib64/libpthread.so.0
#17 0x71daa3dd in clone () from /usr/lib64/libc.so.6
messages log:
Oct 10 07:43:18 localhost kernel: kvm: zapping shadow pages for mmio generation wraparound
Oct 10 07:43:27 localhost kernel: kvm [13251]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xabcd
Oct 10 07:43:28 localhost kernel: intel_iommu_map: iommu width (48) is not sufficient for the mapped address (fe001000)
Oct 10 07:43:28 localhost kernel: kvm_iommu_map_address: iommu failed to map pfn=94880
Re: [Qemu-devel] [PATCH] qcow2: fix double-free of Qcow2DiscardRegion in qcow2_process_discards
On 2014-10-12 15:34, Kevin Wolf wrote: On 11.10.2014 at 09:14, Zhang Haoyu wrote: In qcow2_update_snapshot_refcount -> qcow2_process_discards() -> bdrv_discard(), bdrv_discard() may free the Qcow2DiscardRegion which is referenced by the next pointer in qcow2_process_discards(); in the next iteration, d = next, so g_free(d) will double-free this Qcow2DiscardRegion.
qcow2_snapshot_delete
|- qcow2_update_snapshot_refcount
|-- qcow2_process_discards
|--- bdrv_discard
     aio_poll
     |- aio_dispatch
     |-- bdrv_co_io_em_complete
     |--- qemu_coroutine_enter(co->coroutine, NULL); <=== coroutine entry is bdrv_co_do_rw
|--- g_free(d) <== freeing the first Qcow2DiscardRegion is okay
|--- d = next; <== this assignment is done in the QTAILQ_FOREACH_SAFE() macro
|--- g_free(d); <== double-free will happen if, during the previous iteration, bdrv_discard had freed this object
Do you have a reproducer for this, or did code review lead you to this? This problem can be reproduced with a loop of savevm -> delvm -> savevm -> delvm ..., in about 4 hours. When I deleted the VM snapshot, qemu crashed with a core file; I debugged the core file and found the double-free and the stack. So I added a breakpoint at g_free(d); and found that indeed a double-free happened: the same address was freed twice. Only the first discard region was not double-freed. At the moment I can't see how bdrv_discard(bs->file) could ever free a Qcow2DiscardRegion of bs, as it's working on a completely different BlockDriverState (which usually won't even be a qcow2 one). I think the aio_context in bdrv_discard -> aio_poll(aio_context, true) is the qemu_aio_context no matter whether bs or bs->file is passed to bdrv_discard, so aio_poll(aio_context) will poll all of the aio.
bdrv_co_do_rw
|- bdrv_co_do_writev
|-- bdrv_co_do_pwritev
|--- bdrv_aligned_pwritev
     qcow2_co_writev
     |- qcow2_alloc_cluster_link_l2
     |-- qcow2_free_any_clusters
     |--- qcow2_free_clusters
          update_refcount
          |- qcow2_process_discards
          |-- g_free(d) <== in the next iteration, this Qcow2DiscardRegion will be double-freed
This shouldn't happen in a nested call either, as s->lock can't be taken recursively. Could you detail how s->lock prevents that? The above stack is from gdb, when I added a breakpoint at g_free(d). Thanks, Zhang Haoyu Kevin
Re: [Qemu-devel] [Xen-devel] [PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough
On Thu, Oct 09, 2014 at 01:53:16PM +0800, Chen, Tiejun wrote: On 2014/10/7 15:26, Michael S. Tsirkin wrote: On Tue, Sep 30, 2014 at 10:43:09AM +0800, Chen, Tiejun wrote: On 2014/9/29 18:01, Michael S. Tsirkin wrote: On Sun, Sep 28, 2014 at 10:59:05AM +0800, Chen, Tiejun wrote: On 2014/9/3 9:40, Kay, Allen M wrote: -----Original Message----- From: Chen, Tiejun Sent: Monday, September 01, 2014 12:50 AM To: Michael S. Tsirkin Cc: xen-de...@lists.xensource.com; Kay, Allen M; qemu-devel@nongnu.org; Konrad Rzeszutek Wilk Subject: Re: [Qemu-devel] [Xen-devel] [PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough On 2014/9/1 14:05, Michael S. Tsirkin wrote: On Mon, Sep 01, 2014 at 10:50:37AM +0800, Chen, Tiejun wrote: On 2014/8/31 16:58, Michael S. Tsirkin wrote: On Fri, Aug 29, 2014 at 09:28:50AM +0800, Chen, Tiejun wrote: On 2014/8/28 8:56, Chen, Tiejun wrote:
+ */
+dev = pci_create_simple(bus, PCI_DEVFN(0x1f, 0),
+                        "xen-igd-passthrough-isa-bridge");
+if (dev) {
+    r = xen_host_pci_device_get(&hdev, 0, 0, PCI_DEVFN(0x1f, 0), 0);
+    if (!r) {
+        pci_config_set_vendor_id(dev->config, hdev.vendor_id);
+        pci_config_set_device_id(dev->config, hdev.device_id);
Can you, instead, implement the reverse logic, probing the card and supplying the correct device id for the PCH? Here, what is your so-called reverse logic, as I already asked you previously? Do you mean I should list all PCHs with a combo illustrated with the vendor/device id in advance? Then look up if we can find a Michael, Ping. Thanks Tiejun Could you explain this exactly? Then I can try to follow up on your idea ASAP if this is necessary and possible. Michael, Could you give us some explanation of your reverse logic when you're free? Thanks Tiejun So future drivers will look at the device ID of the card and figure out how things should work from there.
Old drivers still poke at the device id of the chipset for this, but maybe qemu can do what newer drivers do: look at the card and figure out what the guest should do, then present the appropriate chipset id. This is based on what Jesse said: Practically speaking, we could probably assume specific GPU/PCH combos, as I don't think they're generally mixed across generations, though SNB and IVB did have compatible sockets, so there is the possibility of mixing CPT and PPT PCHs, but those are register identical as far as the graphics driver is concerned, so even that should be safe. Michael, Thanks for your explanation. So the idea is to have a reverse table mapping GPU back to PCH. Present to the guest the ID that will let it assume the correct GPU. I guess you mean we should create and maintain such a table: [GPU Card: device_id(s), PCH: device_id] Then each time, instead of exposing the real PCH device id directly, qemu first gets the real GPU device id and looks it up in this table to present an appropriate PCH device id to the guest. And it looks like that appropriate PCH device id is not always the real PCH's device id. Right? If I'm wrong please correct me. Exactly: we don't really care what the PCH ID is - we only need it for the guest driver to do the right thing. Agreed. I need to ask Allen to check if one given GPU card device id always corresponds to one given PCH on both HSW and BDW currently. If yes, I can do this quickly. Otherwise I need some time to establish that sort of connection. Michael, Sorry for the delayed response, but please keep in mind we really are in this process. You know this involves different GPU components, so we have to take some time to communicate and even discuss internally. Now I have draft code; could you take a look at it? I hope the comment can help us understand something, but if you have any question, we can respond further inline.
---
typedef struct {
    uint16_t gpu_device_id;
    uint16_t pch_device_id;
} XenIGDDeviceIDInfo;

/* In the real world different GPUs should have different PCHs. But actually
 * the different PCH DIDs likely map to different PCH SKUs. We do the
 * same thing for the GPU. For the PCH, the different SKUs are going to be
 * all the same silicon design and implementation, just with different
 * features turned on and off with fuses. The SW interfaces should be
 * consistent across all SKUs in a given family (e.g. LPT). But just the
 * same, some features may not be supported.
 *
 * Most of these different PCH features probably don't matter to the
 * Gfx driver, but obviously any difference in display port connections
 * will, so it should be fine with any PCH in case of passthrough.
 *
 * So currently use one PCH version, 0x8c4e, to cover all HSW scenarios,
 * and 0x9cc3 for BDW.
 */
Please write Haswell and Broadwell out in full in the comment. Are you saying I should list all PCH
Re: [Qemu-devel] qemu drive-mirror to rbd storage : no sparse rbd image
Just a wild guess - Alexandre, did you try the detect-zeroes blk option for mirroring targets? Hi, yes, I have also tried detect-zeroes (on or discard) with virtio and virtio-scsi; it doesn't help. (I'm not sure it is implemented in drive-mirror.) As a workaround currently, after drive-mirror I do fstrim inside the guest (with virtio-scsi + discard), and like this I can free space on the rbd storage.
- Original Mail -
From: Andrey Korolyov and...@xdel.ru
To: Fam Zheng f...@redhat.com
Cc: Alexandre DERUMIER aderum...@odiso.com, qemu-devel qemu-devel@nongnu.org, Ceph Devel ceph-de...@vger.kernel.org
Sent: Saturday, 11 October 2014 10:30:40
Subject: Re: [Qemu-devel] qemu drive-mirror to rbd storage : no sparse rbd image
On Sat, Oct 11, 2014 at 12:25 PM, Fam Zheng f...@redhat.com wrote: On Sat, 10/11 10:00, Alexandre DERUMIER wrote: What is the source format? If the zero clusters are actually unallocated in the source image, drive-mirror will not write those clusters either. I.e. with drive-mirror sync=top, both source and target should have the same qemu-img map output. Thanks for your reply, I had tried drive-mirror (sync=full) with
raw file (sparse) -> rbd (not sparse)
rbd (sparse) -> rbd (not sparse)
raw file (sparse) -> qcow2 on ext4 (sparse)
rbd (sparse) -> raw on ext4 (sparse)
Also I see that I have the same problem with the target file format on xfs.
raw file (sparse) -> qcow2 on xfs (not sparse)
rbd (sparse) -> raw on xfs (not sparse)
These don't tell me much. Maybe it's better to show the actual commands and how you tell sparse from not sparse? Does qcow2 -> qcow2 work for you on xfs? I only have this problem with drive-mirror; qemu-img convert seems to simply skip zero blocks. Or maybe this is because I'm using sync=full? What is the difference between full and top? sync: what parts of the disk image should be copied to the destination; possibilities include full for all the disk, top for only the sectors allocated in the topmost image. (what is the topmost image?)
For sync=top, only the clusters allocated in the image itself are copied; for full, all the clusters allocated in the image itself, and its backing image, and its backing's backing image, ..., are copied. The image itself, having a backing image or not, is called the topmost image. Fam -- Just a wild guess - Alexandre, did you try the detect-zeroes blk option for mirroring targets?
Re: [Qemu-devel] qemu drive-mirror to rbd storage : no sparse rbd image
These don't tell me much. Maybe it's better to show the actual commands and how you tell sparse from not sparse? Well, I create 2 empty source image files of 10G (source.qcow2 and source.raw), then:
du -sh source.qcow2 : 2M
du -sh source.raw : 0M
Then I convert them with qemu-img convert (source.qcow2 -> target.qcow2, source.raw -> target.raw):
du -sh target.qcow2 : 2M
du -sh target.raw : 0M
(So it's ok here.) But if I convert them with drive-mirror:
du -sh target.qcow2 : 11G
du -sh target.raw : 11G
I have also double-checked with #df, and I see space allocated on the filesystem when using drive-mirror. I have the same behavior if the target is rbd storage. Also I have tested against ext3 and ext4, and I see the same problem as with xfs. I'm pretty sure that drive-mirror copies zero blocks; qemu-img convert takes around 2s to convert the empty file (because it's skipping zero blocks), and drive-mirror takes around 5 min.
- Original Mail -
From: Fam Zheng f...@redhat.com
To: Alexandre DERUMIER aderum...@odiso.com
Cc: qemu-devel qemu-devel@nongnu.org, Ceph Devel ceph-de...@vger.kernel.org
Sent: Saturday, 11 October 2014 10:25:35
Subject: Re: [Qemu-devel] qemu drive-mirror to rbd storage : no sparse rbd image
On Sat, 10/11 10:00, Alexandre DERUMIER wrote: What is the source format? If the zero clusters are actually unallocated in the source image, drive-mirror will not write those clusters either. I.e. with drive-mirror sync=top, both source and target should have the same qemu-img map output. Thanks for your reply, I had tried drive-mirror (sync=full) with
raw file (sparse) -> rbd (not sparse)
rbd (sparse) -> rbd (not sparse)
raw file (sparse) -> qcow2 on ext4 (sparse)
rbd (sparse) -> raw on ext4 (sparse)
Also I see that I have the same problem with the target file format on xfs.
raw file (sparse) -> qcow2 on xfs (not sparse)
rbd (sparse) -> raw on xfs (not sparse)
These don't tell me much. Maybe it's better to show the actual commands and how you tell sparse from not sparse?
Does qcow2 -> qcow2 work for you on xfs? I only have this problem with drive-mirror; qemu-img convert seems to simply skip zero blocks. Or maybe this is because I'm using sync=full? What is the difference between full and top? sync: what parts of the disk image should be copied to the destination; possibilities include full for all the disk, top for only the sectors allocated in the topmost image. (what is the topmost image?) For sync=top, only the clusters allocated in the image itself are copied; for full, all the clusters allocated in the image itself, and its backing image, and its backing's backing image, ..., are copied. The image itself, having a backing image or not, is called the topmost image. Fam
Re: [Qemu-devel] [PATCH v2] libvixl: a64: Skip -Wunused-variable for gcc 5.0.0 or higher
On 10/12/14 15:50, Peter Maydell wrote: On 12 October 2014 01:32, Chen Gang gang.chen.5...@gmail.com wrote: On 10/12/14 5:25, Peter Maydell wrote: Some other approaches to this that would confine the fix to the makefiles rather than requiring us to modify the vixl source itself: a) add a -Wno- option for the affected .o files. It is one way, but it may also affect gcc 4, and it applies to the whole file, which is broader than the current approach. b) use -isystem rather than -I to include the libvixl directory on the include path. It sounds good to me, although for me it is not related to the current issue. -isystem disables a bunch of gcc warnings automatically, which is why I suggested it. I'm not overall sure it's a great idea though. OK, thanks. -isystem really can skip this warning; originally I did not notice that. :-) But unluckily, other files within 'libvixl' which also include this header file report this warning too. So for me, it is not a good idea to use '-isystem' for libvixl's own source files. Next, I shall first confirm on the gcc upstream mailing list whether it is a gcc 5.0 (g++) issue or not. - If it is a gcc 5.0 issue, I shall try to fix it within this month. - Else, for me, this patch v2 can still continue. Thanks. -- Chen Gang Open, share, and attitude like air, water, and life which God blessed
Re: [Qemu-devel] qemu drive-mirror to rbd storage : no sparse rbd image
On 08/10/2014 13:15, Alexandre DERUMIER wrote: Hi, I'm currently planning to migrate our storage to ceph/rbd through qemu drive-mirror, and it seems that drive-mirror with the rbd block driver doesn't create a sparse image (all zeros are copied to the target rbd). Also note that it's working fine with qemu-img convert; the rbd volume is sparse after conversion. Could it be related to the bdrv_co_write_zeroes feature missing in block/rbd.c? (It's available in other block drivers (scsi, gluster, raw-aio), and I don't have this problem with these block drivers.) Lack of bdrv_co_write_zeroes is why detect-zeroes does not work. Lack of bdrv_get_block_status is why sparse -> sparse does not work without detect-zeroes. Paolo
Re: [Qemu-devel] [PATCH] qcow2: fix leak of Qcow2DiscardRegion in update_refcount_discard
On 11.10.2014 at 10:35, Zhang Haoyu wrote: When the Qcow2DiscardRegion is adjacent to another one referenced by d, free the Qcow2DiscardRegion metadata referenced by p after it has been removed from the s->discards queue. Signed-off-by: Zhang Haoyu zhan...@sangfor.com --- block/qcow2-refcount.c | 1 + 1 file changed, 1 insertion(+) Reviewed-by: Max Reitz mre...@redhat.com
Re: [Qemu-devel] [question] is it possible that big-endian l1 table offset referenced by other I/O while updating l1 table offset in qcow2_update_snapshot_refcount?
On 10.10.2014 at 03:54, Zhang Haoyu wrote: Hi, I encounter a problem: after deleting a snapshot, the qcow2 image size displayed by the ls command is much larger than it should be, but the virtual disk size is okay via qemu-img info. I suspect that during the update of the l1 table offset, another I/O job references the big-endian l1 table offset (a very large value), so the file is truncated to a very large size. Not quite. Rather, all the data that the snapshot used to occupy is still consuming holes in the file; the maximum offset of the file is still unchanged, even if the file is no longer using as many referenced clusters. Recent changes have gone in to sparsify the file when possible (punching holes if your kernel and file system are new enough to support that), so that it is not consuming the amount of disk space that a mere ls reports. But if what you are asking for is a way to compact the file back down, then you'll need to submit a patch. The idea of having an online defragmenter for qcow2 files has been kicked around before, but it is complex enough that no one has attempted a patch yet. Sorry, I didn't state the problem clearly. In qcow2_update_snapshot_refcount(), in the code below,
/* Update L1 only if it isn't deleted anyway (addend = -1) */
if (ret == 0 && addend >= 0 && l1_modified) {
    for (i = 0; i < l1_size; i++) {
        cpu_to_be64s(&l1_table[i]);
    }
    ret = bdrv_pwrite_sync(bs->file, l1_table_offset, l1_table, l1_size2);
    for (i = 0; i < l1_size; i++) {
        be64_to_cpus(&l1_table[i]);
    }
}
between cpu_to_be64s(&l1_table[i]); and be64_to_cpus(&l1_table[i]);, is it possible that other I/O references this interim l1 table whose entries contain the big-endian l2 table offsets? The big-endian l2 table offset may be a very large value, hundreds of TB is possible, and then the qcow2 file will be truncated to far larger than its normal size. So we'll see the huge size of the qcow2 file with ls -hl, but the size is still normal as displayed by qemu-img info.
If the possibility mentioned above exists, the raw code below may fix it:
if (ret == 0 && addend >= 0 && l1_modified) {
    tmp_l1_table = g_malloc0(l1_size * sizeof(uint64_t));
    memcpy(tmp_l1_table, l1_table, l1_size * sizeof(uint64_t));
    for (i = 0; i < l1_size; i++) {
        cpu_to_be64s(&tmp_l1_table[i]);
    }
    ret = bdrv_pwrite_sync(bs->file, l1_table_offset, tmp_l1_table, l1_size2);
    g_free(tmp_l1_table);
}
l1_table is already a local variable (local to qcow2_update_snapshot_refcount()), so I can't really imagine how introducing another local buffer should mitigate the problem, if there is any. Max
Re: [Qemu-devel] [Patch v4 6/8] target_arm: Change the reset values based on the ELF entry
On Wed, Oct 8, 2014 at 1:03 AM, Martin Galvan martin.gal...@tallertechnologies.com wrote: On Tue, Oct 7, 2014 at 11:13 AM, Alistair Francis alistai...@gmail.com wrote: The Netduino 2 machine won't run unless the reset_pc is based on the ELF entry point. Signed-off-by: Alistair Francis alistai...@gmail.com Signed-off-by: Peter Crosthwaite crosthwaite.pe...@gmail.com --- V2: - Malloc straight away, thanks to Peter C
hw/arm/armv7m.c | 19 ++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index 7169027..07b36e2 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -155,11 +155,19 @@ static void armv7m_bitband_init(void)
 /* Board init. */
+typedef struct ARMV7MResetArgs {
+    ARMCPU *cpu;
+    uint32_t reset_pc;
+} ARMV7MResetArgs;
+
 static void armv7m_reset(void *opaque)
 {
-    ARMCPU *cpu = opaque;
+    ARMV7MResetArgs *args = opaque;
+
+    cpu_reset(CPU(args->cpu));
-    cpu_reset(CPU(cpu));
+    args->cpu->env.thumb = args->reset_pc & 1;
+    args->cpu->env.regs[15] = args->reset_pc & ~1;
 }
 /* Init CPU and memory for a v7-M based board. */
@@ -180,6 +188,7 @@ qemu_irq *armv7m_init(MemoryRegion *system_memory, int mem_size, int num_irq,
     int i;
     int big_endian;
     MemoryRegion *hack = g_new(MemoryRegion, 1);
+    ARMV7MResetArgs *reset_args = g_new0(ARMV7MResetArgs, 1);
     if (cpu_model == NULL) {
         cpu_model = "cortex-m3";
@@ -234,7 +243,11 @@ qemu_irq *armv7m_init(MemoryRegion *system_memory, int mem_size, int num_irq,
     vmstate_register_ram_global(hack);
     memory_region_add_subregion(system_memory, 0xfffff000, hack);
-    qemu_register_reset(armv7m_reset, cpu);
+    *reset_args = (ARMV7MResetArgs) {
+        .cpu = cpu,
+        .reset_pc = entry,
+    };
+    qemu_register_reset(armv7m_reset, reset_args);
     return pic;
 }
How does this differ from what's being done in arm_cpu_reset for ARMv7-M? What about the initial MSP? So the problem I was having was that the standard linker script meant that the reset_handler wasn't being called. I have made some changes to the linker file so now it calls the reset_handler.
This patch is no longer required. Sorry it took me so long to figure out what the problem was. Thanks, Alistair -- Martín Galván Software Engineer Taller Technologies Argentina San Lorenzo 47, 3rd Floor, Office 5 Córdoba, Argentina Phone: +54 351 4217888 / +54 351 4218211
Re: [Qemu-devel] [question] is it possible that big-endian l1 table offset referenced by other I/O while updating l1 table offset in qcow2_update_snapshot_refcount?
Hi, I encounter a problem: after deleting a snapshot, the qcow2 image size displayed by the ls command is much larger than it should be, but the virtual disk size is okay via qemu-img info. I suspect that during the update of the l1 table offset, another I/O job references the big-endian l1 table offset (a very large value), so the file is truncated to a very large size. Not quite. Rather, all the data that the snapshot used to occupy is still consuming holes in the file; the maximum offset of the file is still unchanged, even if the file is no longer using as many referenced clusters. Recent changes have gone in to sparsify the file when possible (punching holes if your kernel and file system are new enough to support that), so that it is not consuming the amount of disk space that a mere ls reports. But if what you are asking for is a way to compact the file back down, then you'll need to submit a patch. The idea of having an online defragmenter for qcow2 files has been kicked around before, but it is complex enough that no one has attempted a patch yet. Sorry, I didn't state the problem clearly. In qcow2_update_snapshot_refcount(), in the code below,
/* Update L1 only if it isn't deleted anyway (addend = -1) */
if (ret == 0 && addend >= 0 && l1_modified) {
    for (i = 0; i < l1_size; i++) {
        cpu_to_be64s(&l1_table[i]);
    }
    ret = bdrv_pwrite_sync(bs->file, l1_table_offset, l1_table, l1_size2);
    for (i = 0; i < l1_size; i++) {
        be64_to_cpus(&l1_table[i]);
    }
}
between cpu_to_be64s(&l1_table[i]); and be64_to_cpus(&l1_table[i]);, is it possible that other I/O references this interim l1 table whose entries contain the big-endian l2 table offsets? The big-endian l2 table offset may be a very large value, hundreds of TB is possible, and then the qcow2 file will be truncated to far larger than its normal size. So we'll see the huge size of the qcow2 file with ls -hl, but the size is still normal as displayed by qemu-img info.
If the possibility mentioned above exists, the raw code below may fix it:
if (ret == 0 && addend >= 0 && l1_modified) {
    tmp_l1_table = g_malloc0(l1_size * sizeof(uint64_t));
    memcpy(tmp_l1_table, l1_table, l1_size * sizeof(uint64_t));
    for (i = 0; i < l1_size; i++) {
        cpu_to_be64s(&tmp_l1_table[i]);
    }
    ret = bdrv_pwrite_sync(bs->file, l1_table_offset, tmp_l1_table, l1_size2);
    g_free(tmp_l1_table);
}
l1_table is already a local variable (local to qcow2_update_snapshot_refcount()), so I can't really imagine how introducing another local buffer should mitigate the problem, if there is any. l1_table is not necessarily a local variable of qcow2_update_snapshot_refcount; it depends on the condition if (l1_table_offset != s->l1_table_offset): if the condition is not true, l1_table = s->l1_table. Thanks, Zhang Haoyu Max
Re: [Qemu-devel] [PATCH V2] net: don't use set/get_pointer() in set/get_netdev()
On 10/10/2014 09:03 PM, Markus Armbruster wrote:
Jason Wang <jasow...@redhat.com> writes:

Commit 1ceef9f27359cbe92ef124bf74de6f792e71f6fb ("net: multiqueue support") tries to use set_pointer() and get_pointer() to set and get NICPeers, which is not a pointer defined in DEFINE_PROP_NETDEV. This trick works but results in an unclean and fragile implementation (e.g. print_netdev and parse_netdev). This patch solves the issue by not using set/get_pointer() and instead setting and getting the netdev directly in set_netdev() and get_netdev(). After this, parse_netdev() and print_netdev() are no longer used and are dropped from the source.

Cc: Markus Armbruster <arm...@redhat.com>
Cc: Stefan Hajnoczi <stefa...@redhat.com>
Cc: Peter Maydell <peter.mayd...@linaro.org>
Signed-off-by: Jason Wang <jasow...@redhat.com>
---
Changes from V1:
- validate ncs pointers before accessing them; this fixes the qtest failure on arm.
---
 hw/core/qdev-properties-system.c | 71 ++--
 1 file changed, 39 insertions(+), 32 deletions(-)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index ae0900f..6939ea5 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -176,41 +176,67 @@ PropertyInfo qdev_prop_chr = {
 };
 
 /* --- netdev device --- */
 
+static void get_netdev(Object *obj, Visitor *v, void *opaque,
+                       const char *name, Error **errp)
+{
+    DeviceState *dev = DEVICE(obj);
+    Property *prop = opaque;
+    NICPeers *peers_ptr = qdev_get_prop_ptr(dev, prop);
+    char *p = g_strdup(peers_ptr->ncs[0] ? peers_ptr->ncs[0]->name : "");

Can ->ncs[0]->name ever be null?

Seems not, id is mandatory for netdev.

-static int parse_netdev(DeviceState *dev, const char *str, void **ptr)
+    visit_type_str(v, &p, name, errp);
+    g_free(p);
+}
+
+static void set_netdev(Object *obj, Visitor *v, void *opaque,
+                       const char *name, Error **errp)
 {
-    NICPeers *peers_ptr = (NICPeers *)ptr;
+    DeviceState *dev = DEVICE(obj);
+    Property *prop = opaque;
+    NICPeers *peers_ptr = qdev_get_prop_ptr(dev, prop);
     NetClientState **ncs = peers_ptr->ncs;
     NetClientState *peers[MAX_QUEUE_NUM];
-    int queues, i = 0;
-    int ret;
+    Error *local_err = NULL;
+    int err, queues, i = 0;
+    char *str;
+
+    if (dev->realized) {
+        qdev_prop_set_after_realize(dev, name, errp);
+        return;
+    }
+
+    visit_type_str(v, &str, name, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
 
     queues = qemu_find_net_clients_except(str, peers,
                                           NET_CLIENT_OPTIONS_KIND_NIC,
                                           MAX_QUEUE_NUM);
     if (queues == 0) {
-        ret = -ENOENT;
+        err = -ENOENT;
         goto err;
     }
 
     if (queues > MAX_QUEUE_NUM) {
-        ret = -E2BIG;
+        err = -E2BIG;

error_set_from_qdev_prop_error() does not accept -E2BIG. You could call error_setg(...) directly instead.

Ok.

         goto err;
     }
 
     for (i = 0; i < queues; i++) {
         if (peers[i] == NULL) {
-            ret = -ENOENT;
+            err = -ENOENT;
             goto err;
         }
 
         if (peers[i]->peer) {
-            ret = -EEXIST;
+            err = -EEXIST;
             goto err;
         }
 
         if (ncs[i]) {
-            ret = -EINVAL;
+            err = -EINVAL;

error_set_from_qdev_prop_error() does not accept -EINVAL, either.

Ok.

             goto err;
         }
 
@@ -219,31 +245,12 @@ static int parse_netdev(DeviceState *dev, const char *str, void **ptr)
     }
 
     peers_ptr->queues = queues;
-
-    return 0;
+    g_free(str);
+    return;
 
 err:

Label err clashes with local variable err. Harmless, but maybe you'd like to rename one of them.

It was used in error_set_from_qdev_prop_error(errp, err, dev, prop, str);

-    return ret;
-}
-
-static char *print_netdev(void *ptr)
-{
-    NetClientState *netdev = ptr;
-    const char *val = netdev->name ? netdev->name : "";
-
-    return g_strdup(val);
-}
-
-static void get_netdev(Object *obj, Visitor *v, void *opaque,
-                       const char *name, Error **errp)
-{
-    get_pointer(obj, v, opaque, print_netdev, name, errp);
-}
-
-static void set_netdev(Object *obj, Visitor *v, void *opaque,
-                       const char *name, Error **errp)
-{
-    set_pointer(obj, v, opaque, parse_netdev, name, errp);
+    error_set_from_qdev_prop_error(errp, err, dev, prop, str);
+    g_free(str);
 }
 
 PropertyInfo qdev_prop_netdev = {

Instead of

    g_free(str);
    return;

err:
    error_set_from_qdev_prop_error(errp, err, dev, prop, str);
    g_free(str);
}

you could exploit that error_set_from_qdev_prop_error() does nothing when err is 0:

out:
Re: [Qemu-devel] [PATCH V2] net: don't use set/get_pointer() in set/get_netdev()
On 10/13/2014 11:31 AM, Jason Wang wrote:
On 10/10/2014 09:03 PM, Markus Armbruster wrote:
Jason Wang <jasow...@redhat.com> writes:

Commit 1ceef9f27359cbe92ef124bf74de6f792e71f6fb ("net: multiqueue support") tries to use set_pointer() and get_pointer() to set and get NICPeers, which is not a pointer defined in DEFINE_PROP_NETDEV. This trick works but results in an unclean and fragile implementation (e.g. print_netdev and parse_netdev). This patch solves the issue by not using set/get_pointer() and instead setting and getting the netdev directly in set_netdev() and get_netdev(). After this, parse_netdev() and print_netdev() are no longer used and are dropped from the source.

Cc: Markus Armbruster <arm...@redhat.com>
Cc: Stefan Hajnoczi <stefa...@redhat.com>
Cc: Peter Maydell <peter.mayd...@linaro.org>
Signed-off-by: Jason Wang <jasow...@redhat.com>
---
Changes from V1:
- validate ncs pointers before accessing them; this fixes the qtest failure on arm.
---
 hw/core/qdev-properties-system.c | 71 ++--
 1 file changed, 39 insertions(+), 32 deletions(-)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index ae0900f..6939ea5 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -176,41 +176,67 @@ PropertyInfo qdev_prop_chr = {
 };

[...]

         goto err;
     }
 
     for (i = 0; i < queues; i++) {
         if (peers[i] == NULL) {
-            ret = -ENOENT;
+            err = -ENOENT;
             goto err;
         }
 
         if (peers[i]->peer) {
-            ret = -EEXIST;
+            err = -EEXIST;
             goto err;
         }
 
         if (ncs[i]) {
-            ret = -EINVAL;
+            err = -EINVAL;

error_set_from_qdev_prop_error() does not accept -EINVAL, either.

Ok.

Just had a check: error_set_from_qdev_prop_error() can in fact accept -EINVAL, so I will only call error_setg() directly for -E2BIG.
[Qemu-devel] [PATCH v3 0/2] qcow2: Patch for shrinking qcow2 disk image
This is the third version of qcow2 shrinking. This version fixes a host cluster leak when shrinking a disk image. For example:

Step 1,
# /opt/qemu-git-arm/bin/qemu-img info /home/lijun/Work/tmp/shrink.qcow2
image: /home/lijun/Work/tmp/shrink.qcow2
file format: qcow2
virtual size: 2.0G (2147483648 bytes)
disk size: 2.0G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    corrupt: false

Step 2,
# /opt/qemu-git-arm/bin/qemu-img resize /home/lijun/Work/tmp/shrink.qcow2 -100M
Image resized.

Step 3,
# /opt/qemu-git-arm/bin/qemu-img info /home/lijun/Work/tmp/shrink.qcow2
image: /home/lijun/Work/tmp/shrink.qcow2
file format: qcow2
virtual size: 1.9G (2042626048 bytes)
disk size: 1.9G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    corrupt: false

Step 4,
# /opt/qemu-git-arm/bin/qemu-img check /home/lijun/Work/tmp/shrink.qcow2
No errors were found on the image.
31005/31168 = 99.48% allocated, 0.79% fragmented, 0.00% compressed clusters
Image end offset: 2033844224

As shown above, in Step 1 the disk size and virtual size are both 2.0G. After Step 3, the disk size and virtual size are both 1.9G.

BTW, the above also serves as a simple test. I will submit a test case for qemu-iotests later, but I am not yet familiar with how to write qemu-iotests.

In block/qcow2.c, ftruncate is used to fix the host cluster leak. In block/qcow2-cluster.c, qcow2_shrink_l1_and_l2_table is largely modelled on qcow2_grow_l1_table. In block/qcow2-refcount.c, update_refcount is also updated to handle self-describing refcount blocks.

Thanks.

Jun Li (2):
  qcow2: Add qcow2_shrink_l1_and_l2_table for qcow2 shrinking
  qcow2: add update refcount table realization for update_refcount

 block/qcow2-cluster.c  | 173 +
 block/qcow2-refcount.c |  44 +
 block/qcow2.c          |  40 ++--
 block/qcow2.h          |   2 +
 4 files changed, 255 insertions(+), 4 deletions(-)

-- 
1.9.3
[Qemu-devel] [PATCH v3 1/2] qcow2: Add qcow2_shrink_l1_and_l2_table for qcow2 shrinking
This patch is the realization of the new function qcow2_shrink_l1_and_l2_table. This function shrinks/discards the L1 and L2 tables when shrinking a qcow2 image.

Signed-off-by: Jun Li <junm...@gmail.com>
---
Compared to v2, v3 fixes a host cluster leak.
---
 block/qcow2-cluster.c | 173 ++
 block/qcow2.c         |  40 ++--
 block/qcow2.h         |   2 +
 3 files changed, 211 insertions(+), 4 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index f7dd8c0..2ac3536 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -29,6 +29,9 @@
 #include "block/qcow2.h"
 #include "trace.h"
 
+static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
+                   uint64_t **l2_table);
+
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                         bool exact_size)
 {
@@ -135,6 +138,176 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
     return ret;
 }
 
+int qcow2_shrink_l1_and_l2_table(BlockDriverState *bs, uint64_t new_l1_size,
+                                 int new_l2_index, bool exact_size)
+{
+    BDRVQcowState *s = bs->opaque;
+    int new_l1_size2, ret, i;
+    uint64_t *new_l1_table;
+    int64_t new_l1_table_offset;
+    int64_t old_l1_table_offset, old_l1_size;
+    uint8_t data[12];
+
+    new_l1_size2 = sizeof(uint64_t) * new_l1_size;
+    new_l1_table = qemu_try_blockalign(bs->file,
+                                       align_offset(new_l1_size2, 512));
+    if (new_l1_table == NULL) {
+        return -ENOMEM;
+    }
+    memset(new_l1_table, 0, align_offset(new_l1_size2, 512));
+
+    /* shrinking l1 table */
+    memcpy(new_l1_table, s->l1_table, new_l1_size2);
+
+    /* write new table (align to cluster) */
+    BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ALLOC_TABLE);
+    new_l1_table_offset = qcow2_alloc_clusters(bs, new_l1_size2);
+    if (new_l1_table_offset < 0) {
+        qemu_vfree(new_l1_table);
+        return new_l1_table_offset;
+    }
+
+    ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    /* the L1 position has not yet been updated, so these clusters must
+     * indeed be completely free */
+    ret = qcow2_pre_write_overlap_check(bs, 0, new_l1_table_offset,
+                                        new_l1_size2);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_WRITE_TABLE);
+
+    for (i = 0; i < new_l1_size; i++) {
+        new_l1_table[i] = cpu_to_be64(new_l1_table[i]);
+    }
+
+    ret = bdrv_pwrite_sync(bs->file, new_l1_table_offset,
+                           new_l1_table, new_l1_size2);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    for (i = 0; i < new_l1_size; i++) {
+        new_l1_table[i] = be64_to_cpu(new_l1_table[i]);
+    }
+
+    /* set new table */
+    BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ACTIVATE_TABLE);
+    cpu_to_be32w((uint32_t *)data, new_l1_size);
+    stq_be_p(data + 4, new_l1_table_offset);
+    ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size),
+                           data, sizeof(data));
+    if (ret < 0) {
+        goto fail;
+    }
+
+    old_l1_table_offset = s->l1_table_offset;
+    s->l1_table_offset = new_l1_table_offset;
+    uint64_t *old_l1_table = s->l1_table;
+    s->l1_table = new_l1_table;
+    old_l1_size = s->l1_size;
+    s->l1_size = new_l1_size;
+
+    int num = old_l1_size - s->l1_size;
+
+    while (num >= 0) {
+        uint64_t l2_offset;
+        int ret;
+        uint64_t *l2_table, l2_entry;
+        int last_free_cluster = 0;
+
+        l2_offset = old_l1_table[s->l1_size + num - 1] & L1E_OFFSET_MASK;
+        if (l2_offset == 0) {
+            goto retry;
+        }
+
+        if (num == 0) {
+            if (new_l2_index == 0) {
+                goto retry;
+            }
+            last_free_cluster = new_l2_index + 1;
+        }
+
+        /* load l2_table into cache */
+        ret = l2_load(bs, l2_offset, &l2_table);
+        if (ret < 0) {
+            goto fail;
+        }
+
+        for (i = s->l2_size; i > last_free_cluster; i--) {
+            l2_entry = be64_to_cpu(l2_table[i - 1]);
+
+            switch (qcow2_get_cluster_type(l2_entry)) {
+            case QCOW2_CLUSTER_UNALLOCATED:
+                if (!bs->backing_hd) {
+                    continue;
+                }
+                break;
+
+            case QCOW2_CLUSTER_ZERO:
+                continue;
+
+            case QCOW2_CLUSTER_NORMAL:
+            case QCOW2_CLUSTER_COMPRESSED:
+                break;
+
+            default:
+                abort();
+            }
+
+            qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+            if (s->qcow_version >= 3) {
+                l2_table[i - 1] = cpu_to_be64(QCOW_OFLAG_ZERO);
+            } else {
+                l2_table[i - 1] = cpu_to_be64(0);
+            }
+
+            /* before discard the specify item of l2_entry,
+             * set this entry with 0.
[Qemu-devel] [PATCH v3 2/2] qcow2: add update refcount table realization for update_refcount
When every entry of a refcount block is zero, free the refcount block and clear the corresponding entry of the refcount table. At the same time, point s->free_cluster_index at this cluster so it can be reused.

Signed-off-by: Jun Li <junm...@gmail.com>
---
This version adds handling of self-describing refcount blocks.
---
 block/qcow2-refcount.c | 44 
 1 file changed, 44 insertions(+)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 2bcaaf9..8594ffd 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -28,6 +28,7 @@
 #include "qemu/range.h"
 
 static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
+static int write_reftable_entry(BlockDriverState *bs, int rt_index);
 static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
                                                    int64_t offset,
                                                    int64_t length, int addend,
                                                    enum qcow2_discard_type type);
@@ -599,6 +600,49 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
         if (refcount == 0 && s->discard_passthrough[type]) {
             update_refcount_discard(bs, cluster_offset, s->cluster_size);
         }
+
+        unsigned int refcount_table_index;
+        refcount_table_index = cluster_index >> (s->cluster_bits -
+                                                 REFCOUNT_SHIFT);
+
+        uint64_t refcount_block_offset =
+            s->refcount_table[refcount_table_index] & REFT_OFFSET_MASK;
+
+        int64_t self_block_index = refcount_block_offset >> s->cluster_bits;
+        int self_block_index2 = (refcount_block_offset >> s->cluster_bits) &
+                                ((1 << (s->cluster_bits - REFCOUNT_SHIFT)) - 1);
+
+        /* When refcount block is NULL, update refcount table */
+        if (!be16_to_cpu(refcount_block[0]) || self_block_index2 == 0) {
+            int k;
+            int refcount_block_entries = s->cluster_size /
+                                         sizeof(refcount_block[0]);
+            for (k = 0; k < refcount_block_entries; k++) {
+                if (k == self_block_index2) {
+                    k++;
+                }
+
+                if (be16_to_cpu(refcount_block[k])) {
+                    break;
+                }
+            }
+
+            if (k == refcount_block_entries) {
+                if (self_block_index < s->free_cluster_index) {
+                    s->free_cluster_index = self_block_index;
+                }
+
+                /* update refcount table */
+                s->refcount_table[refcount_table_index] = 0;
+                ret = write_reftable_entry(bs, refcount_table_index);
+                if (ret < 0) {
+                    fprintf(stderr, "Could not update refcount table: %s\n",
+                            strerror(-ret));
+                    goto fail;
+                }
+
+            }
+        }
     }
 
     ret = 0;
-- 
1.9.3
[Qemu-devel] [PATCH V3] net: don't use set/get_pointer() in set/get_netdev()
Commit 1ceef9f27359cbe92ef124bf74de6f792e71f6fb ("net: multiqueue support") tries to use set_pointer() and get_pointer() to set and get NICPeers, which is not a pointer defined in DEFINE_PROP_NETDEV. This trick works but results in an unclean and fragile implementation (e.g. print_netdev and parse_netdev). This patch solves the issue by not using set/get_pointer() and instead setting and getting the netdev directly in set_netdev() and get_netdev(). After this, parse_netdev() and print_netdev() are no longer used and are dropped from the source.

Cc: Markus Armbruster <arm...@redhat.com>
Cc: Stefan Hajnoczi <stefa...@redhat.com>
Cc: Peter Maydell <peter.mayd...@linaro.org>
Signed-off-by: Jason Wang <jasow...@redhat.com>
---
Changes from V2:
- Use error_setg() instead of error_set_from_qdev_prop_error() for the E2BIG error.
- Clean up the return part of set_netdev(), since error_set_from_qdev_prop_error() does nothing when err is 0.
Changes from V1:
- validate ncs pointers before accessing them; this fixes the qtest failure on arm.
---
 hw/core/qdev-properties-system.c | 70 ++--
 1 file changed, 38 insertions(+), 32 deletions(-)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 84caa1d..03a6e68 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -176,41 +176,68 @@ PropertyInfo qdev_prop_chr = {
 };
 
 /* --- netdev device --- */
 
+static void get_netdev(Object *obj, Visitor *v, void *opaque,
+                       const char *name, Error **errp)
+{
+    DeviceState *dev = DEVICE(obj);
+    Property *prop = opaque;
+    NICPeers *peers_ptr = qdev_get_prop_ptr(dev, prop);
+    char *p = g_strdup(peers_ptr->ncs[0] ? peers_ptr->ncs[0]->name : "");
 
-static int parse_netdev(DeviceState *dev, const char *str, void **ptr)
+    visit_type_str(v, &p, name, errp);
+    g_free(p);
+}
+
+static void set_netdev(Object *obj, Visitor *v, void *opaque,
+                       const char *name, Error **errp)
 {
-    NICPeers *peers_ptr = (NICPeers *)ptr;
+    DeviceState *dev = DEVICE(obj);
+    Property *prop = opaque;
+    NICPeers *peers_ptr = qdev_get_prop_ptr(dev, prop);
     NetClientState **ncs = peers_ptr->ncs;
     NetClientState *peers[MAX_QUEUE_NUM];
-    int queues, i = 0;
-    int ret;
+    Error *local_err = NULL;
+    int queues, err = 0, i = 0;
+    char *str;
+
+    if (dev->realized) {
+        qdev_prop_set_after_realize(dev, name, errp);
+        return;
+    }
+
+    visit_type_str(v, &str, name, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
 
     queues = qemu_find_net_clients_except(str, peers,
                                           NET_CLIENT_OPTIONS_KIND_NIC,
                                           MAX_QUEUE_NUM);
     if (queues == 0) {
-        ret = -ENOENT;
+        err = -ENOENT;
         goto err;
     }
 
     if (queues > MAX_QUEUE_NUM) {
-        ret = -E2BIG;
+        error_setg(errp, "queues of backend '%s'(%d) exceeds QEMU limitation(%d)",
+                   str, queues, MAX_QUEUE_NUM);
         goto err;
     }
 
     for (i = 0; i < queues; i++) {
         if (peers[i] == NULL) {
-            ret = -ENOENT;
+            err = -ENOENT;
             goto err;
         }
 
         if (peers[i]->peer) {
-            ret = -EEXIST;
+            err = -EEXIST;
             goto err;
         }
 
         if (ncs[i]) {
-            ret = -EINVAL;
+            err = -EINVAL;
             goto err;
         }
 
@@ -220,30 +247,9 @@ static int parse_netdev(DeviceState *dev, const char *str, void **ptr)
     }
 
     peers_ptr->queues = queues;
 
-    return 0;
-
 err:
-    return ret;
-}
-
-static char *print_netdev(void *ptr)
-{
-    NetClientState *netdev = ptr;
-    const char *val = netdev->name ? netdev->name : "";
-
-    return g_strdup(val);
-}
-
-static void get_netdev(Object *obj, Visitor *v, void *opaque,
-                       const char *name, Error **errp)
-{
-    get_pointer(obj, v, opaque, print_netdev, name, errp);
-}
-
-static void set_netdev(Object *obj, Visitor *v, void *opaque,
-                       const char *name, Error **errp)
-{
-    set_pointer(obj, v, opaque, parse_netdev, name, errp);
+    error_set_from_qdev_prop_error(errp, err, dev, prop, str);
+    g_free(str);
 }
 
 PropertyInfo qdev_prop_netdev = {
-- 
1.8.3.1