Re: [Qemu-devel] [PATCH v5 08/11] qcow2: Rebuild refcount structure during check

2014-10-12 Thread Max Reitz

On 11.10.2014 at 20:56, Benoît Canet wrote:

+int64_t first_free_cluster = 0, rt_ofs = -1, cluster = 0;
+int64_t rb_ofs, rb_start, rb_index;

Every time a few days pass and I read rb_ofs and rt_ofs again, I find these names
obfuscated. I know Linus says that C is a spartan language, but these look odd.
Maybe refcount_block_offset, refcount_block_* and refcount_table_offset would be
better.


I would use longer names if there was no line length limit. ;-)

I'll try and see how it looks.

Max



Re: [Qemu-devel] [PATCH] qcow2: fix double-free of Qcow2DiscardRegion in qcow2_process_discards

2014-10-12 Thread Kevin Wolf
On 11.10.2014 at 09:14, Zhang Haoyu wrote:
 In qcow2_update_snapshot_refcount -> qcow2_process_discards() ->
 bdrv_discard(),
 bdrv_discard() may free the Qcow2DiscardRegion that is currently referenced by
 the next pointer in qcow2_process_discards(); in the next iteration, d = next,
 so g_free(d) will double-free this Qcow2DiscardRegion.
 
 qcow2_snapshot_delete
 |- qcow2_update_snapshot_refcount
 |-- qcow2_process_discards
 |--- bdrv_discard
 |---- aio_poll
 |----- aio_dispatch
 |------ bdrv_co_io_em_complete
 |------- qemu_coroutine_enter(co->coroutine, NULL); <=== coroutine entry is bdrv_co_do_rw
 |--- g_free(d)   <== freeing the first Qcow2DiscardRegion is okay
 |--- d = next;   <== this assignment is done by the QTAILQ_FOREACH_SAFE() macro
 |--- g_free(d);  <== double-free happens if bdrv_discard already freed this object during the previous iteration
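For reference, here is a small self-contained sketch of the loop pattern being described (a hand-rolled stand-in for QTAILQ_FOREACH_SAFE over a toy list; the Region type and FOREACH_SAFE macro are illustrative only, not QEMU code). The "safe" variant only protects against the body freeing the current element; if the body indirectly drains the same list, the saved next pointer goes stale and the following iteration frees it a second time:

#include <stdio.h>
#include <stdlib.h>

typedef struct Region {
    struct Region *next;
    int id;
} Region;

/* Same idea as QTAILQ_FOREACH_SAFE: remember the next element before the
 * body runs, so the body is allowed to free the current element. */
#define FOREACH_SAFE(d, head, next_d) \
    for ((d) = (head); (d) && (((next_d) = (d)->next), 1); (d) = (next_d))

int main(void)
{
    Region *head = NULL, *d, *next;

    /* build a short list: 2 -> 1 -> 0 */
    for (int i = 0; i < 3; i++) {
        Region *r = malloc(sizeof(*r));
        r->id = i;
        r->next = head;
        head = r;
    }

    FOREACH_SAFE(d, head, next) {
        /* If code called from this body (the nested qcow2_process_discards
         * in the report) also walked the list and freed the element that
         * 'next' already points at, the free(d) of the next iteration
         * would be a double-free. */
        printf("freeing region %d\n", d->id);
        free(d);
    }
    return 0;
}

Whether such a nested drain can actually happen here is what Kevin questions below.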

Do you have a reproducer for this or did code review lead you to this?

At the moment I can't see how bdrv_discard(bs->file) could ever free a
Qcow2DiscardRegion of bs, as it's working on a completely different
BlockDriverState (which usually won't even be a qcow2 one).

 bdrv_co_do_rw
 |- bdrv_co_do_writev
 |-- bdrv_co_do_pwritev
 |--- bdrv_aligned_pwritev
 |---- qcow2_co_writev
 |----- qcow2_alloc_cluster_link_l2
 |------ qcow2_free_any_clusters
 |------- qcow2_free_clusters
 |-------- update_refcount
 |--------- qcow2_process_discards
 |---------- g_free(d)  <== in the next iteration of the outer loop, this
 Qcow2DiscardRegion will be double-freed

This shouldn't happen in a nested call either, as s->lock can't be taken
recursively.

Kevin



Re: [Qemu-devel] [PATCH] oslib-posix: change free to g_free

2014-10-12 Thread Kevin Wolf
On 11.10.2014 at 05:23, Eric Blake wrote:
 On 10/10/2014 08:54 PM, arei.gong...@huawei.com wrote:
  From: Gonglei arei.gong...@huawei.com
  
  The caller of qemu_vfree() may not check whether the ptr parameter
  is NULL, as in vpc_open().
  Using g_free() is safer.
 
 NACK.  g_free is only safe for pointers allocated by g_malloc.
 qemu_vfree is for use on pointers allocated by qemu_try_memalign and
 friends (matching the name valloc which is an older spelling of
 posix_memalign), which are NOT allocated by g_malloc.  Furthermore,
 free(NULL) is safe.
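To illustrate the pairing rule, here is a standalone sketch: posix_memalign() and free() stand in for qemu_try_memalign() and qemu_vfree(), which are QEMU wrappers not used here (build with pkg-config glib-2.0 for g_strdup/g_free):

#include <glib.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* aligned allocation, the kind of buffer qemu_vfree() is meant for */
    void *aligned = NULL;
    if (posix_memalign(&aligned, 4096, 64 * 1024) != 0) {
        aligned = NULL;             /* allocation failed */
    }

    /* glib allocation, the kind of pointer g_free() is meant for */
    char *name = g_strdup("footer");

    /* Each pointer must go back to the allocator family it came from. */
    free(aligned);                  /* matches posix_memalign(); free(NULL) is a no-op */
    g_free(name);                   /* matches g_malloc()/g_strdup() */

    /* Swapping them (g_free(aligned) or free(name)) is undefined behaviour,
     * which is why qemu_vfree() callers must not be changed to g_free(). */
    return 0;
}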

I second that. Strong NACK.

Kevin



Re: [Qemu-devel] [PATCH v2] libvixl: a64: Skip -Wunused-variable for gcc 5.0.0 or higher

2014-10-12 Thread Peter Maydell
On 12 October 2014 01:32, Chen Gang gang.chen.5...@gmail.com wrote:
 On 10/12/14 5:25, Peter Maydell wrote:
 Some other approaches to this that would confine the
 fix to the makefiles rather than requiring us to modify
 the vixl source itself:
  a) add a -Wno- option for the affected .o files

  It is one way, but it may also affect gcc 4 versions, and it affects
  the whole file, which is broader than the current approach.

  b) use -isystem rather than -I to include the libvixl
 directory on the include path


  It sounds good to me, although to me it is not related to the current
  issue.

-isystem disables a bunch of gcc warnings automatically,
which is why I suggested it. I'm not overall sure it's
a great idea though.

-- PMM



Re: [Qemu-devel] [Bug?] qemu abort when trying to passthrough BCM5719 Gigabit Ethernet

2014-10-12 Thread Michael S. Tsirkin
On Sat, Oct 11, 2014 at 01:41:48AM -0600, Alex Williamson wrote:
 On Sat, 2014-10-11 at 13:58 +0800, zhanghailiang wrote:
  Hi all,
  
  When I try to pass through a BCM5719 Gigabit Ethernet device to a guest using
  the qemu master branch, it aborts
  and shows kvm_set_phys_mem: error registering slot: Bad Address.
  
  qemu command:
  #./qemu/qemu/x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 
  -vnc :99 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3  -drive 
  file=/home/suse11_sp3_64,if=none,id=drive-scsi0-0-0-0,format=raw,cache=none,aio=native
   -device scsi-hd,bus=scsi0.0,channel=0,scsi-
  id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 -device 
  pci-assign,host=01:00.1,id=mydevice -net none
  
  info about guest and host:
  host OS: 3.16.5
  *guest OS: Novell SuSE Linux Enterprise Server 11 SP3*
  #cat /proc/cpuinfo
  processor   : 31
  vendor_id   : GenuineIntel
  cpu family  : 6
  model   : 62
  model name  : Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz
  stepping: 4
  microcode   : 0x416
  cpu MHz : 1926.875
  cache size  : 20480 KB
  physical id : 1
  siblings: 16
  core id : 7
  cpu cores   : 8
  apicid  : 47
  initial apicid  : 47
  fpu : yes
  fpu_exception   : yes
  cpuid level : 13
  wp  : yes
  flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
  cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
  pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl 
  xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor 
  ds_cpl vmx
  smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt 
  tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt 
  pln
  pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
  bogomips: 4005.35
  clflush size: 64
  cache_alignment : 64
  address sizes   : 46 bits physical, 48 bits virtual
  power management:
  
  gdb info:
  (gdb) bt
  #0  0x71ce9989 in raise () from /usr/lib64/libc.so.6
  #1  0x71ceb098 in abort () from /usr/lib64/libc.so.6
  #2  0x556275cf in kvm_set_phys_mem (section=0x7fffedcea790, 
  add=true) at /home/qemu/qemu/kvm-all.c:711
  #3  0x5562980f in address_space_update_topology_pass 
  (as=as@entry=0x55d01ca0 <address_space_memory>, 
  adding=adding@entry=true,
   new_view=<optimized out>, new_view=<optimized out>, 
  old_view=0x7fffe8022a90, old_view=0x7fffe8022a90) at 
  /home/qemu/qemu/memory.c:752
  #4  0x5562b910 in address_space_update_topology (as=0x55d01ca0 
  <address_space_memory>) at /home/qemu/qemu/memory.c:767
  #5  memory_region_transaction_commit () at /home/qemu/qemu/memory.c:808
  #6  0x557a75b4 in pci_update_mappings (d=0x562ba9f0) at 
  hw/pci/pci.c:1113
  #7  0x557a7932 in pci_default_write_config 
  (d=d@entry=0x562ba9f0, addr=addr@entry=20, 
  val_in=val_in@entry=4294967295, l=l@entry=4)
   at hw/pci/pci.c:1165
  #8  0x5566c17e in assigned_dev_pci_write_config 
  (pci_dev=0x562ba9f0, address=20, val=4294967295, len=4)
   at /home/qemu/qemu/hw/i386/kvm/pci-assign.c:1196
  #9  0x55628fea in access_with_adjusted_size (addr=addr@entry=0, 
  value=value@entry=0x7fffedceaae0, size=size@entry=4,
   access_size_min=<optimized out>, access_size_max=<optimized out>, 
  access=0x55629160 <memory_region_write_accessor>, mr=0x56231f00)
   at /home/qemu/qemu/memory.c:480
  #10 0x5562dbf7 in memory_region_dispatch_write (size=4, 
  data=18446744073709551615, addr=0, mr=0x56231f00) at 
  /home/qemu/qemu/memory.c:1122
  #11 io_mem_write (mr=mr@entry=0x56231f00, addr=0, val=<optimized out>, 
  size=4) at /home/qemu/qemu/memory.c:1958
  #12 0x555f8963 in address_space_rw (as=0x55d01d80 
  <address_space_io>, addr=addr@entry=3324, buf=0x77fec000 
  "\377\377\377\377",
   len=len@entry=4, is_write=is_write@entry=true) at 
  /home/qemu/qemu/exec.c:2145
  #13 0x55628491 in kvm_handle_io (count=1, size=4, 
  direction=<optimized out>, data=<optimized out>, port=3324) at 
  /home/qemu/qemu/kvm-all.c:1614
  /home/qemu/qemu/kvm-all.c:1614
  #14 kvm_cpu_exec (cpu=cpu@entry=0x5620e610) at 
  /home/qemu/qemu/kvm-all.c:1771
  #15 0x55617182 in qemu_kvm_cpu_thread_fn (arg=0x5620e610) at 
  /home/qemu/qemu/cpus.c:953
  #16 0x76ba2df3 in start_thread () from /usr/lib64/libpthread.so.0
  #17 0x71daa3dd in clone () from /usr/lib64/libc.so.6
  
  messages log:
  Oct 10 07:43:18 localhost kernel: kvm: zapping shadow pages for mmio 
  generation wraparound
  Oct 10 07:43:27 localhost kernel: kvm [13251]: vcpu0 disabled perfctr 
  wrmsr: 0xc1 data 0xabcd
  Oct 10 07:43:28 localhost kernel: intel_iommu_map: iommu width (48) is not 
  sufficient for the mapped address (fe001000)
  Oct 10 07:43:28 localhost kernel: kvm_iommu_map_address:iommu failed to map 
  pfn=94880
  
  

Re: [Qemu-devel] [PATCH] qcow2: fix double-free of Qcow2DiscardRegion in qcow2_process_discards

2014-10-12 Thread Zhang Haoyu


On 2014-10-12 15:34, Kevin Wolf wrote:

On 11.10.2014 at 09:14, Zhang Haoyu wrote:

In qcow2_update_snapshot_refcount -> qcow2_process_discards() -> bdrv_discard(),
bdrv_discard() may free the Qcow2DiscardRegion that is currently referenced by
the next pointer in qcow2_process_discards(); in the next iteration, d = next,
so g_free(d) will double-free this Qcow2DiscardRegion.

qcow2_snapshot_delete
|- qcow2_update_snapshot_refcount
|-- qcow2_process_discards
|--- bdrv_discard
|---- aio_poll
|----- aio_dispatch
|------ bdrv_co_io_em_complete
|------- qemu_coroutine_enter(co->coroutine, NULL); <=== coroutine entry is bdrv_co_do_rw
|--- g_free(d)   <== freeing the first Qcow2DiscardRegion is okay
|--- d = next;   <== this assignment is done by the QTAILQ_FOREACH_SAFE() macro
|--- g_free(d);  <== double-free happens if bdrv_discard already freed this object during the previous iteration

Do you have a reproducer for this or did code review lead you to this?
This problem can be reproduced with a loop of savevm -> delvm -> savevm ->
delvm ...; it takes about 4 hours.

When I deleted the VM snapshot, qemu crashed with a core file;
I debugged the core file and found the double-free and the stack above.

So I added a breakpoint at g_free(d); and found that a double-free indeed
happened: two frees of the same address.

Only the first discard region was not double-freed.


At the moment I can't see how bdrv_discard(bs->file) could ever free a
Qcow2DiscardRegion of bs, as it's working on a completely different
BlockDriverState (which usually won't even be a qcow2 one).
I think the aio_context in bdrv_discard -> aio_poll(aio_context, true)
is the qemu_aio_context,
no matter whether bs or bs->file is passed to bdrv_discard, so
aio_poll(aio_context) will poll all of the pending aio.



bdrv_co_do_rw
|- bdrv_co_do_writev
|-- bdrv_co_do_pwritev
|--- bdrv_aligned_pwritev
|---- qcow2_co_writev
|----- qcow2_alloc_cluster_link_l2
|------ qcow2_free_any_clusters
|------- qcow2_free_clusters
|-------- update_refcount
|--------- qcow2_process_discards
|---------- g_free(d)  <== in the next iteration of the outer loop, this
Qcow2DiscardRegion will be double-freed

This shouldn't happen in a nested call either, as s->lock can't be taken
recursively.
Could you detail how s->lock prevents that? The stack above is from gdb,
taken when I added a breakpoint at g_free(d).


Thanks,
Zhang Haoyu


Kevin







Re: [Qemu-devel] [Xen-devel] [PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough

2014-10-12 Thread Michael S. Tsirkin
On Thu, Oct 09, 2014 at 01:53:16PM +0800, Chen, Tiejun wrote:
 
 
 On 2014/10/7 15:26, Michael S. Tsirkin wrote:
 On Tue, Sep 30, 2014 at 10:43:09AM +0800, Chen, Tiejun wrote:
 On 2014/9/29 18:01, Michael S. Tsirkin wrote:
 On Sun, Sep 28, 2014 at 10:59:05AM +0800, Chen, Tiejun wrote:
 On 2014/9/3 9:40, Kay, Allen M wrote:
 
 
 -Original Message-
 From: Chen, Tiejun
 Sent: Monday, September 01, 2014 12:50 AM
 To: Michael S. Tsirkin
 Cc: xen-de...@lists.xensource.com; Kay, Allen M; qemu-devel@nongnu.org;
 Konrad Rzeszutek Wilk
 Subject: Re: [Qemu-devel] [Xen-devel] [PATCH 2/2] xen:i386:pc_piix: 
 create
 isa bridge specific to IGD passthrough
 
 On 2014/9/1 14:05, Michael S. Tsirkin wrote:
 On Mon, Sep 01, 2014 at 10:50:37AM +0800, Chen, Tiejun wrote:
 On 2014/8/31 16:58, Michael S. Tsirkin wrote:
 On Fri, Aug 29, 2014 at 09:28:50AM +0800, Chen, Tiejun wrote:
 
 
 On 2014/8/28 8:56, Chen, Tiejun wrote:
 + */
  +dev = pci_create_simple(bus, PCI_DEVFN(0x1f, 0),
  +                        "xen-igd-passthrough-isa-bridge");
  +if (dev) {
  +    r = xen_host_pci_device_get(&hdev, 0, 0, PCI_DEVFN(0x1f, 0), 0);
  +    if (!r) {
  +        pci_config_set_vendor_id(dev->config, hdev.vendor_id);
  +        pci_config_set_device_id(dev->config, hdev.device_id);
 
 Can you, instead, implement the reverse logic, probing the card
 and supplying the correct device id for PCH?
 
 
  What exactly is your so-called reverse logic, as I already asked you
  previously? Do you mean I should list all PCHs with a combo
  illustrated with the vendor/device id in advance? Then look up
  if we can find a
 
 Michael,
 
 
 Ping.
 
 Thanks
 Tiejun
 
  Could you explain this exactly? Then I can try to follow up on your
  idea ASAP if this is necessary and possible.
 
  Michael,
 
 Could you give us some explanation for your reverse logic when
 you're free?
 
 Thanks
 Tiejun
 
  So future drivers will look at the device ID of the card and figure out
  how things should work from there.
  Old drivers still poke at the device id of the chipset for this, but
  maybe qemu can do what newer drivers do:
  look at the card and figure out what the guest should do, then present
  the appropriate chipset id.
 
 This is based on what Jesse said:
  Practically speaking, we could probably assume specific GPU/PCH
 combos,
  as I don't think they're generally mixed across generations, 
  though
 SNB
  and IVB did have compatible sockets, so there is the 
  possibility of
  mixing CPT and PPT PCHs, but those are register identical as 
  far as the
  graphics driver is concerned, so even that should be safe.
 
 
 Michael,
 
 Thanks for your explanation.
 
 so the idea is to have a reverse table mapping GPU back to PCH.
 Present to guest the ID that will let it assume the correct GPU.
 
  I guess you mean we should create and maintain such a table:
  
  [GPU card: device_id(s), PCH: device_id]
  
  Then each time, instead of exposing the real PCH device id directly,
  qemu can first get the real GPU device id and look it up in this table to
  present an appropriate PCH device id to the guest.
  
  And it looks like that appropriate PCH device id is not always the
  real PCH's device id. Right? If I'm wrong please correct me.
 
 Exactly: we don't really care what the PCH ID is - we only need it for
 the guest driver to do the right thing.
 
 Agreed.
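(For illustration only, a sketch of the reverse mapping being discussed. The struct matches the draft code later in this message; the GPU device ids in the table are placeholders, not a verified GPU/PCH mapping — 0x8c4e and 0x9cc3 are the Haswell/Broadwell PCH ids mentioned further down.)

#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint16_t gpu_device_id;
    uint16_t pch_device_id;
} XenIGDDeviceIDInfo;

/* Placeholder entries: a real table would list every supported IGD id. */
static const XenIGDDeviceIDInfo xen_igd_combo_id_infos[] = {
    { 0x0412, 0x8c4e },   /* example Haswell GPU id  -> Haswell PCH id   */
    { 0x1616, 0x9cc3 },   /* example Broadwell GPU id -> Broadwell PCH id */
};

/* Given the host GPU device id, return the PCH device id to present to the
 * guest; 0 means "unknown, fall back to the host's real PCH id". */
static uint16_t igd_pch_id_for_gpu(uint16_t gpu_device_id)
{
    for (unsigned i = 0;
         i < sizeof(xen_igd_combo_id_infos) / sizeof(xen_igd_combo_id_infos[0]);
         i++) {
        if (xen_igd_combo_id_infos[i].gpu_device_id == gpu_device_id) {
            return xen_igd_combo_id_infos[i].pch_device_id;
        }
    }
    return 0;
}

int main(void)
{
    printf("PCH id for GPU 0x0412: 0x%04x\n", igd_pch_id_for_gpu(0x0412));
    return 0;
}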
 
  I need to ask Allen to check whether one given GPU card device id always
  corresponds to one given PCH on both HSW and BDW currently. If yes, I can
  do this quickly. Otherwise I need some time to establish that sort
  of connection.
 
 Michael,
 
  Sorry for the delayed response, but please keep in mind we really are
  working on this.
  
  You know this involves different GPU components, so we have to take some
  time to communicate or even discuss internally.
  
  Now I have some draft code; could you take a look at it? I hope the comment
  can help us understand things, but if you have any question, we can
  respond further inline.
 
 ---
 typedef struct {
  uint16_t gpu_device_id;
  uint16_t pch_device_id;
 } XenIGDDeviceIDInfo;
 
 /* In real world different GPU should have different PCH. But actually
   * the different PCH DIDs likely map to different PCH SKUs. We do the
   * same thing for the GPU. For PCH, the different SKUs are going to be
   * all the same silicon design and implementation, just different
   * features turn on and off with fuses. The SW interfaces should be
   * consistent across all SKUs in a given family (eg LPT). But just same
   * features may not be supported.
   *
   * Most of these different PCH features probably don't matter to the
   * Gfx driver, but obviously any difference in display port connections
   * will so it should be fine with any PCH in case of passthrough.
   *
   * So currently use one PCH version, 0x8c4e, to cover all HSW scenarios,
   * 0x9cc3 for BDW.
 
 Pls write Haswell and Broadwell in full in comment.
 
 Are you saying I should list all PCH 

Re: [Qemu-devel] qemu drive-mirror to rbd storage : no sparse rbd image

2014-10-12 Thread Alexandre DERUMIER
Just a wild guess - Alexandre, did you try the detect-zeroes blk option
for mirroring targets?

Hi, yes, I have also tried detect-zeroes (on or discard) with virtio and
virtio-scsi; it doesn't help. (I'm not sure it is implemented in drive-mirror.)

As a workaround currently, after drive-mirror I do fstrim inside the guest
(with virtio-scsi + discard),
and this way I can free space on the rbd storage.



- Original Message -

From: Andrey Korolyov and...@xdel.ru
To: Fam Zheng f...@redhat.com
Cc: Alexandre DERUMIER aderum...@odiso.com, qemu-devel 
qemu-devel@nongnu.org, Ceph Devel ceph-de...@vger.kernel.org
Sent: Saturday, 11 October 2014 10:30:40
Subject: Re: [Qemu-devel] qemu drive-mirror to rbd storage : no sparse rbd image

On Sat, Oct 11, 2014 at 12:25 PM, Fam Zheng f...@redhat.com wrote:
 On Sat, 10/11 10:00, Alexandre DERUMIER wrote:
 What is the source format? If the zero clusters are actually unallocated 
 in the
 source image, drive-mirror will not write those clusters either. I.e. with
 drive-mirror sync=top, both source and target should have the same 
 qemu-img
 map output.

 Thanks for your reply,

 I had tried drive mirror (sync=full) with

 raw file (sparse) -> rbd (no sparse)
 rbd (sparse) -> rbd (no sparse)
 raw file (sparse) -> qcow2 on ext4 (sparse)
 rbd (sparse) -> raw on ext4 (sparse)

 Also I see that I have the same problem with the target file format on xfs. 

 raw file (sparse) -> qcow2 on xfs (no sparse)
 rbd (sparse) -> raw on xfs (no sparse)


 These don't tell me much. Maybe it's better to show the actual commands and 
 how
 you tell sparse from no sparse?

 Does qcow2 -> qcow2 work for you on xfs?


 I only have this problem with drive-mirror; qemu-img convert seems to simply 
 skip zero blocks.


 Or maybe this is because I'm using sync=full ?

 What is the difference between full and top ?

 sync: what parts of the disk image should be copied to the destination;
 possibilities include full for all the disk, top for only the sectors
 allocated in the topmost image.

 (what is topmost image ?)

 For sync=top, only the clusters allocated in the image itself are copied; for
 full, all the clusters allocated in the image itself, and its backing
 image, and its backing's backing image, ..., are copied.

 The image itself, having a backing image or not, is called the topmost image.

 Fam
 --

Just a wild guess - Alexandre, did you try the detect-zeroes blk option
for mirroring targets?



Re: [Qemu-devel] qemu drive-mirror to rbd storage : no sparse rbd image

2014-10-12 Thread Alexandre DERUMIER
These don't tell me much. Maybe it's better to show the actual commands and 
how
you tell sparse from no sparse?


Well,

I create 2 empty source image files of 10G (source.qcow2 and source.raw).

then:

du -sh source.qcow2  : 2M
du -sh source.raw  : 0M


then I convert them with qemu-img convert: (source.qcow2 -> target.qcow2,
source.raw -> target.raw)

du -sh target.qcow2  : 2M
du -sh target.raw  : 0M

(So it's ok here)

But If I convert them with drive-mirror:

du -sh target.qcow2  : 11G
du -sh target.raw  : 11G



I have also double-checked with df, and I see space allocated on the filesystem when
using drive-mirror.

I have the same behavior if the target is rbd storage.

Also I have done tests against ext3 and ext4, and I see the same problem as with xfs.


I'm pretty sure that drive-mirror copies zero blocks: qemu-img convert takes around
2s to convert the empty file
(because it skips zero blocks), while drive-mirror takes around 5 min.
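For what it's worth, the timing difference matches what a copy loop does with all-zero chunks. A minimal standalone sketch of the skip-zero-chunks idea (an illustration only, not QEMU's convert or mirror code; the 64k chunk size is arbitrary):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define CHUNK (64 * 1024)

static int buffer_is_zero(const unsigned char *buf, ssize_t len)
{
    for (ssize_t i = 0; i < len; i++) {
        if (buf[i]) {
            return 0;
        }
    }
    return 1;
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s SRC DST\n", argv[0]);
        return 1;
    }
    int in = open(argv[1], O_RDONLY);
    int out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (in < 0 || out < 0) {
        perror("open");
        return 1;
    }

    unsigned char buf[CHUNK];
    ssize_t n;
    off_t total = 0;
    while ((n = read(in, buf, sizeof(buf))) > 0) {
        if (buffer_is_zero(buf, n)) {
            /* seek past the zero chunk instead of writing it: the
             * destination stays sparse (this is the part a plain
             * read/write mirror loop does not do) */
            lseek(out, n, SEEK_CUR);
        } else if (write(out, buf, n) != n) {
            perror("write");
            return 1;
        }
        total += n;
    }
    /* give the destination its full logical size even if it ends in zeros */
    ftruncate(out, total);
    close(in);
    close(out);
    return 0;
}

A convert-like path takes the lseek branch and stays sparse; a loop that blindly writes every chunk allocates everything, which would explain the behaviour described above.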

- Original Message -

From: Fam Zheng f...@redhat.com
To: Alexandre DERUMIER aderum...@odiso.com
Cc: qemu-devel qemu-devel@nongnu.org, Ceph Devel 
ceph-de...@vger.kernel.org
Sent: Saturday, 11 October 2014 10:25:35
Subject: Re: [Qemu-devel] qemu drive-mirror to rbd storage : no sparse rbd image

On Sat, 10/11 10:00, Alexandre DERUMIER wrote:
 What is the source format? If the zero clusters are actually unallocated in 
 the
 source image, drive-mirror will not write those clusters either. I.e. with
 drive-mirror sync=top, both source and target should have the same 
 qemu-img
 map output.

 Thanks for your reply,

 I had tried drive mirror (sync=full) with

 raw file (sparse) -> rbd (no sparse)
 rbd (sparse) -> rbd (no sparse)
 raw file (sparse) -> qcow2 on ext4 (sparse)
 rbd (sparse) -> raw on ext4 (sparse)

 Also I see that I have the same problem with the target file format on xfs.

 raw file (sparse) -> qcow2 on xfs (no sparse)
 rbd (sparse) -> raw on xfs (no sparse)


These don't tell me much. Maybe it's better to show the actual commands and how
you tell sparse from no sparse?

Does qcow2 -> qcow2 work for you on xfs?


 I only have this problem with drive-mirror; qemu-img convert seems to simply 
 skip zero blocks.


 Or maybe this is because I'm using sync=full ?

 What is the difference between full and top ?

 sync: what parts of the disk image should be copied to the destination;
 possibilities include full for all the disk, top for only the sectors
 allocated in the topmost image.

 (what is topmost image ?)

For sync=top, only the clusters allocated in the image itself are copied; for
full, all the clusters allocated in the image itself, and its backing
image, and its backing's backing image, ..., are copied.

The image itself, having a backing image or not, is called the topmost image.

Fam



Re: [Qemu-devel] [PATCH v2] libvixl: a64: Skip -Wunused-variable for gcc 5.0.0 or higher

2014-10-12 Thread Chen Gang
On 10/12/14 15:50, Peter Maydell wrote:
 On 12 October 2014 01:32, Chen Gang gang.chen.5...@gmail.com wrote:
 On 10/12/14 5:25, Peter Maydell wrote:
 Some other approaches to this that would confine the
 fix to the makefiles rather than requiring us to modify
 the vixl source itself:
  a) add a -Wno- option for the affected .o files

  It is one way, but it may also affect gcc 4 versions, and it affects
  the whole file, which is broader than the current approach.

  b) use -isystem rather than -I to include the libvixl
 directory on the include path


  It sounds good to me, although to me it is not related to the current
  issue.
 
 -isystem disables a bunch of gcc warnings automatically,
 which is why I suggested it. I'm not overall sure it's
 a great idea though.
 

OK, thanks. -isystem really can skip this warning; originally I did
not notice that. :-)

But unluckily, other files within 'libvixl' which also include this header
file report this warning too. So for me, it is not a good idea to rely on
'-isystem' for libvixl's own source files.

Next, I shall first confirm whether or not it is a gcc 5.0 (g++) issue
on the gcc upstream mailing list.

 - if it is a gcc 5.0 issue, I shall try to fix it within this month.

 - else, for me, this patch v2 can still continue.


Thanks.
-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed



Re: [Qemu-devel] qemu drive-mirror to rbd storage : no sparse rbd image

2014-10-12 Thread Paolo Bonzini
On 08/10/2014 13:15, Alexandre DERUMIER wrote:
 Hi,
 
 I'm currently planning to migrate our storage to ceph/rbd through qemu
 drive-mirror,
 
 and it seems that drive-mirror with the rbd block driver doesn't create a
 sparse image (all zeros are copied to the target rbd).
 
 Also note that it's working fine with qemu-img convert; the rbd volume is
 sparse after conversion.
 
 
 Could it be related to the bdrv_co_write_zeroes feature missing in
 block/rbd.c?
 
 (It's available in other block drivers (scsi, gluster, raw-aio), and I don't
 have this problem with those block drivers.)

Lack of bdrv_co_write_zeroes is why detect-zeroes does not work.

Lack of bdrv_get_block_status is why sparse->sparse does not work
without detect-zeroes.

Paolo



Re: [Qemu-devel] [PATCH] qcow2: fix leak of Qcow2DiscardRegion in update_refcount_discard

2014-10-12 Thread Max Reitz

On 11.10.2014 at 10:35, Zhang Haoyu wrote:

When the Qcow2DiscardRegion is adjacent to another one referenced by "d",
free this Qcow2DiscardRegion metadata referenced by "p" after
it has been removed from the s->discards queue.

Signed-off-by: Zhang Haoyu zhan...@sangfor.com
---
  block/qcow2-refcount.c | 1 +
  1 file changed, 1 insertion(+)


Reviewed-by: Max Reitz mre...@redhat.com



Re: [Qemu-devel] [question] is it possible that big-endian l1 table offset referenced by other I/O while updating l1 table offset in qcow2_update_snapshot_refcount?

2014-10-12 Thread Max Reitz

On 10.10.2014 at 03:54, Zhang Haoyu wrote:

Hi,
I encountered a problem: after deleting a snapshot, the qcow2 image size
displayed by the ls command is much larger than it should be,
but the virtual disk size is okay via qemu-img info.
I suspect that while the l1 table offset is being updated, some other I/O job
references the big-endian l1 table offset (a very large value),
so the file is truncated to a very large size.

Not quite.  Rather, all the data that the snapshot used to occupy is
still consuming holes in the file; the maximum offset of the file is
still unchanged, even if the file is no longer using as many referenced
clusters.  Recent changes have gone in to sparsify the file when
possible (punching holes if your kernel and file system is new enough to
support that), so that it is not consuming the amount of disk space that
a mere ls reports.  But if what you are asking for is a way to compact
the file back down, then you'll need to submit a patch.  The idea of
having an online defragmenter for qcow2 files has been kicked around
before, but it is complex enough that no one has attempted a patch yet.
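The "punching holes" mentioned above is the FALLOC_FL_PUNCH_HOLE operation; a minimal Linux-only sketch (the file name and byte range are made up for illustration):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("disk.qcow2", O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Release the blocks backing bytes [1 MiB, 2 MiB) while keeping the
     * logical file size unchanged: ls -l stays the same, du shrinks. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  1024 * 1024, 1024 * 1024) < 0) {
        perror("fallocate");
    }

    close(fd);
    return 0;
}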

Sorry, I didn't state the problem clearly.
In qcow2_update_snapshot_refcount(), in the code below,
    /* Update L1 only if it isn't deleted anyway (addend = -1) */
    if (ret == 0 && addend >= 0 && l1_modified) {
        for (i = 0; i < l1_size; i++) {
            cpu_to_be64s(&l1_table[i]);
        }

        ret = bdrv_pwrite_sync(bs->file, l1_table_offset, l1_table, l1_size2);

        for (i = 0; i < l1_size; i++) {
            be64_to_cpus(&l1_table[i]);
        }
    }
between cpu_to_be64s(&l1_table[i]); and be64_to_cpus(&l1_table[i]);,
is it possible that other I/O references this interim l1 table, whose
entries contain the big-endian l2 table offsets?
A big-endian l2 table offset may be a very large value (hundreds of TB is possible),
and then the qcow2 file would be truncated to far larger than its normal size.
So we'd see the huge size of the qcow2 file with ls -hl, but the size would still
be displayed as normal by qemu-img info.

If the possibility mentioned above exists, the rough code below may fix it:
    if (ret == 0 && addend >= 0 && l1_modified) {
        uint64_t *tmp_l1_table = g_malloc0(l1_size * sizeof(uint64_t));
        memcpy(tmp_l1_table, l1_table, l1_size * sizeof(uint64_t));
        for (i = 0; i < l1_size; i++) {
            cpu_to_be64s(&tmp_l1_table[i]);
        }
        ret = bdrv_pwrite_sync(bs->file, l1_table_offset, tmp_l1_table,
                               l1_size2);

        g_free(tmp_l1_table);
    }


l1_table is already a local variable (local to 
qcow2_update_snapshot_refcount()), so I can't really imagine how 
introducing another local buffer should mitigate the problem, if there 
is any.


Max



Re: [Qemu-devel] [Patch v4 6/8] target_arm: Change the reset values based on the ELF entry

2014-10-12 Thread Alistair Francis
On Wed, Oct 8, 2014 at 1:03 AM, Martin Galvan
martin.gal...@tallertechnologies.com wrote:
 On Tue, Oct 7, 2014 at 11:13 AM, Alistair Francis alistai...@gmail.com 
 wrote:
 The Netduino 2 machine won't run unless the reset_pc is based
 on the ELF entry point.

 Signed-off-by: Alistair Francis alistai...@gmail.com
 Signed-off-by: Peter Crosthwaite crosthwaite.pe...@gmail.com
 ---
 V2:
  - Malloc straight away, thanks to Peter C

  hw/arm/armv7m.c | 19 ---
  1 file changed, 16 insertions(+), 3 deletions(-)

 diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
 index 7169027..07b36e2 100644
 --- a/hw/arm/armv7m.c
 +++ b/hw/arm/armv7m.c
 @@ -155,11 +155,19 @@ static void armv7m_bitband_init(void)

  /* Board init.  */

 +typedef struct ARMV7MResetArgs {
 +ARMCPU *cpu;
 +uint32_t reset_pc;
 +} ARMV7MResetArgs;
 +
  static void armv7m_reset(void *opaque)
  {
 -ARMCPU *cpu = opaque;
 +ARMV7MResetArgs *args = opaque;
 +
 +cpu_reset(CPU(args->cpu));

 -cpu_reset(CPU(cpu));
 +args->cpu->env.thumb = args->reset_pc & 1;
 +args->cpu->env.regs[15] = args->reset_pc & ~1;
  }

  /* Init CPU and memory for a v7-M based board.
 @@ -180,6 +188,7 @@ qemu_irq *armv7m_init(MemoryRegion *system_memory, int 
 mem_size, int num_irq,
  int i;
  int big_endian;
  MemoryRegion *hack = g_new(MemoryRegion, 1);
 +ARMV7MResetArgs *reset_args = g_new0(ARMV7MResetArgs, 1);

  if (cpu_model == NULL) {
 cpu_model = cortex-m3;
 @@ -234,7 +243,11 @@ qemu_irq *armv7m_init(MemoryRegion *system_memory, int 
 mem_size, int num_irq,
  vmstate_register_ram_global(hack);
  memory_region_add_subregion(system_memory, 0xf000, hack);

 -qemu_register_reset(armv7m_reset, cpu);
 +*reset_args = (ARMV7MResetArgs) {
 +.cpu = cpu,
 +.reset_pc = entry,
 +};
 +qemu_register_reset(armv7m_reset, reset_args);
  return pic;
  }

 How does this differ from what's being done in arm_cpu_reset for
 ARMv7-M? What about the initial MSP?

So the problem I was having is that the standard linker script meant that the
reset_handler wasn't being called. I have made some changes to the
linker script, so now it calls the reset_handler. This patch is no longer
required. Sorry it took me so long to figure out what the problem was.

Thanks,

Alistair


 --

 Martín Galván

 Software Engineer

 Taller Technologies Argentina


 San Lorenzo 47, 3rd Floor, Office 5

 Córdoba, Argentina

 Phone: 54 351 4217888 / +54 351 4218211



Re: [Qemu-devel] [question] is it possible that big-endian l1 tableoffset referenced by other I/O while updating l1 table offset in qcow2_update_snapshot_refcount?

2014-10-12 Thread Zhang Haoyu
 Hi,
 I encountered a problem: after deleting a snapshot, the qcow2 image size
 displayed by the ls command is much larger than it should be,
 but the virtual disk size is okay via qemu-img info.
 I suspect that while the l1 table offset is being updated, some other I/O job
 references the big-endian l1 table offset (a very large value),
 so the file is truncated to a very large size.
 Not quite.  Rather, all the data that the snapshot used to occupy is
 still consuming holes in the file; the maximum offset of the file is
 still unchanged, even if the file is no longer using as many referenced
 clusters.  Recent changes have gone in to sparsify the file when
 possible (punching holes if your kernel and file system is new enough to
 support that), so that it is not consuming the amount of disk space that
 a mere ls reports.  But if what you are asking for is a way to compact
 the file back down, then you'll need to submit a patch.  The idea of
 having an online defragmenter for qcow2 files has been kicked around
 before, but it is complex enough that no one has attempted a patch yet.
 Sorry, I didn't state the problem clearly.
 In qcow2_update_snapshot_refcount(), in the code below,
     /* Update L1 only if it isn't deleted anyway (addend = -1) */
     if (ret == 0 && addend >= 0 && l1_modified) {
         for (i = 0; i < l1_size; i++) {
             cpu_to_be64s(&l1_table[i]);
         }

         ret = bdrv_pwrite_sync(bs->file, l1_table_offset, l1_table,
                                l1_size2);

         for (i = 0; i < l1_size; i++) {
             be64_to_cpus(&l1_table[i]);
         }
     }
 between cpu_to_be64s(&l1_table[i]); and be64_to_cpus(&l1_table[i]);,
 is it possible that other I/O references this interim l1 table, whose
 entries contain the big-endian l2 table offsets?
 A big-endian l2 table offset may be a very large value (hundreds of TB is
 possible),
 and then the qcow2 file would be truncated to far larger than its normal size.
 So we'd see the huge size of the qcow2 file with ls -hl, but the size would
 still be displayed as normal by qemu-img info.

 If the possibility mentioned above exists, the rough code below may fix it:
     if (ret == 0 && addend >= 0 && l1_modified) {
         uint64_t *tmp_l1_table = g_malloc0(l1_size * sizeof(uint64_t));
         memcpy(tmp_l1_table, l1_table, l1_size * sizeof(uint64_t));
         for (i = 0; i < l1_size; i++) {
             cpu_to_be64s(&tmp_l1_table[i]);
         }
         ret = bdrv_pwrite_sync(bs->file, l1_table_offset, tmp_l1_table,
                                l1_size2);

         g_free(tmp_l1_table);
     }

l1_table is already a local variable (local to 
qcow2_update_snapshot_refcount()), so I can't really imagine how 
introducing another local buffer should mitigate the problem, if there 
is any.

l1_table is not necessarily local to qcow2_update_snapshot_refcount();
it depends on the condition if (l1_table_offset != s->l1_table_offset):
if the condition is not true, l1_table = s->l1_table.

Thanks,
Zhang Haoyu
Max




Re: [Qemu-devel] [PATCH V2] net: don't use set/get_pointer() in set/get_netdev()

2014-10-12 Thread Jason Wang
On 10/10/2014 09:03 PM, Markus Armbruster wrote:
 Jason Wang jasow...@redhat.com writes:

 Commit 1ceef9f27359cbe92ef124bf74de6f792e71f6fb (net: multiqueue
 support) tries to use set_pointer() and get_pointer() to set and get
  NICPeers, which is not a pointer defined in DEFINE_PROP_NETDEV. This
  trick works but results in an unclean and fragile implementation (e.g.
  print_netdev and parse_netdev).

  This patch solves this issue by not using set/get_pointer() and setting and
  getting the netdev directly in set_netdev() and get_netdev(). After this,
  parse_netdev() and print_netdev() are no longer used and are dropped from
  the source.

 Cc: Markus Armbruster arm...@redhat.com
 Cc: Stefan Hajnoczi stefa...@redhat.com
 Cc: Peter Maydell peter.mayd...@linaro.org
 Signed-off-by: Jason Wang jasow...@redhat.com
 ---
 Changes from V1:
 - validate ncs pointer before accessing them, this fixes the qtest failure
   on arm.
 ---
  hw/core/qdev-properties-system.c | 71 
 ++--
  1 file changed, 39 insertions(+), 32 deletions(-)

 diff --git a/hw/core/qdev-properties-system.c 
 b/hw/core/qdev-properties-system.c
 index ae0900f..6939ea5 100644
 --- a/hw/core/qdev-properties-system.c
 +++ b/hw/core/qdev-properties-system.c
 @@ -176,41 +176,67 @@ PropertyInfo qdev_prop_chr = {
  };
  
  /* --- netdev device --- */
 +static void get_netdev(Object *obj, Visitor *v, void *opaque,
 +   const char *name, Error **errp)
 +{
 +DeviceState *dev = DEVICE(obj);
 +Property *prop = opaque;
 +NICPeers *peers_ptr = qdev_get_prop_ptr(dev, prop);
 +char *p = g_strdup(peers_ptr->ncs[0] ? peers_ptr->ncs[0]->name : "");
 Can ->ncs[0]->name ever be null?

Seems not, id is mandatory for netdev.

  
 -static int parse_netdev(DeviceState *dev, const char *str, void **ptr)
 +visit_type_str(v, &p, name, errp);
 +g_free(p);
 +}
 +
 +static void set_netdev(Object *obj, Visitor *v, void *opaque,
 +   const char *name, Error **errp)
  {
 -NICPeers *peers_ptr = (NICPeers *)ptr;
 +DeviceState *dev = DEVICE(obj);
 +Property *prop = opaque;
 +NICPeers *peers_ptr = qdev_get_prop_ptr(dev, prop);
  NetClientState **ncs = peers_ptr-ncs;
  NetClientState *peers[MAX_QUEUE_NUM];
 -int queues, i = 0;
 -int ret;
 +Error *local_err = NULL;
 +int err, queues, i = 0;
 +char *str;
 +
 +if (dev->realized) {
 +qdev_prop_set_after_realize(dev, name, errp);
 +return;
 +}
 +
 +visit_type_str(v, &str, name, &local_err);
 +if (local_err) {
 +error_propagate(errp, local_err);
 +return;
 +}
  
  queues = qemu_find_net_clients_except(str, peers,
NET_CLIENT_OPTIONS_KIND_NIC,
MAX_QUEUE_NUM);
  if (queues == 0) {
 -ret = -ENOENT;
 +err = -ENOENT;
  goto err;
  }
  
  if (queues > MAX_QUEUE_NUM) {
 -ret = -E2BIG;
 +err = -E2BIG;
 error_set_from_qdev_prop_error() does not accept -E2BIG.

 You could call error_setg(...) directly instead.

Ok.
  goto err;
  }
  
  for (i = 0; i < queues; i++) {
  if (peers[i] == NULL) {
 -ret = -ENOENT;
 +err = -ENOENT;
  goto err;
  }
  
  if (peers[i]->peer) {
 -ret = -EEXIST;
 +err = -EEXIST;
  goto err;
  }
  
  if (ncs[i]) {
 -ret = -EINVAL;
 +err = -EINVAL;
 error_set_from_qdev_prop_error() does not accept -EINVAL, either.

Ok.
  goto err;
  }
  
 @@ -219,31 +245,12 @@ static int parse_netdev(DeviceState *dev, const char 
 *str, void **ptr)
  }
  
  peers_ptr->queues = queues;
 -
 -return 0;
 +g_free(str);
 +return;
  
  err:
 Label err clashes with local variable err.  Harmless, but maybe you'd
 like to rename one of them.

It was used in error_set_from_qdev_prop_error(errp, err, dev, prop, str);
 -return ret;
 -}
 -
 -static char *print_netdev(void *ptr)
 -{
 -NetClientState *netdev = ptr;
 -const char *val = netdev->name ? netdev->name : "";
 -
 -return g_strdup(val);
 -}
 -
 -static void get_netdev(Object *obj, Visitor *v, void *opaque,
 -   const char *name, Error **errp)
 -{
 -get_pointer(obj, v, opaque, print_netdev, name, errp);
 -}
 -
 -static void set_netdev(Object *obj, Visitor *v, void *opaque,
 -   const char *name, Error **errp)
 -{
 -set_pointer(obj, v, opaque, parse_netdev, name, errp);
 +error_set_from_qdev_prop_error(errp, err, dev, prop, str);
 +g_free(str);
  }
  
  PropertyInfo qdev_prop_netdev = {
 Instead of

 g_free(str);
 return;

 err:
 error_set_from_qdev_prop_error(errp, err, dev, prop, str);
 g_free(str);
 }

 You could exploit that error_set_from_qdev_prop_error() does nothing
 when err is 0:

 out:
 

Re: [Qemu-devel] [PATCH V2] net: don't use set/get_pointer() in set/get_netdev()

2014-10-12 Thread Jason Wang
On 10/13/2014 11:31 AM, Jason Wang wrote:
 On 10/10/2014 09:03 PM, Markus Armbruster wrote:
 Jason Wang jasow...@redhat.com writes:

 Commit 1ceef9f27359cbe92ef124bf74de6f792e71f6fb (net: multiqueue
 support) tries to use set_pointer() and get_pointer() to set and get
  NICPeers, which is not a pointer defined in DEFINE_PROP_NETDEV. This
  trick works but results in an unclean and fragile implementation (e.g.
  print_netdev and parse_netdev).

  This patch solves this issue by not using set/get_pointer() and setting and
  getting the netdev directly in set_netdev() and get_netdev(). After this,
  parse_netdev() and print_netdev() are no longer used and are dropped from
  the source.

 Cc: Markus Armbruster arm...@redhat.com
 Cc: Stefan Hajnoczi stefa...@redhat.com
 Cc: Peter Maydell peter.mayd...@linaro.org
 Signed-off-by: Jason Wang jasow...@redhat.com
 ---
 Changes from V1:
 - validate ncs pointer before accessing them, this fixes the qtest failure
   on arm.
 ---
  hw/core/qdev-properties-system.c | 71 
 ++--
  1 file changed, 39 insertions(+), 32 deletions(-)

 diff --git a/hw/core/qdev-properties-system.c 
 b/hw/core/qdev-properties-system.c
 index ae0900f..6939ea5 100644
 --- a/hw/core/qdev-properties-system.c
 +++ b/hw/core/qdev-properties-system.c
 @@ -176,41 +176,67 @@ PropertyInfo qdev_prop_chr = {
  };
  
[...]
  goto err;
  }
  
  for (i = 0; i  queues; i++) {
  if (peers[i] == NULL) {
 -ret = -ENOENT;
 +err = -ENOENT;
  goto err;
  }
  
  if (peers[i]->peer) {
 -ret = -EEXIST;
 +err = -EEXIST;
  goto err;
  }
  
  if (ncs[i]) {
 -ret = -EINVAL;
 +err = -EINVAL;
 error_set_from_qdev_prop_error() does not accept -EINVAL, either.
 Ok.

I just checked: error_set_from_qdev_prop_error() can accept -EINVAL,
so I will only call error_setg() directly for -E2BIG.



[Qemu-devel] [PATCH v3 0/2] qcow2: Patch for shrinking qcow2 disk image

2014-10-12 Thread Jun Li
This is the third version of qcow2 shrinking. This version fixes a host
cluster leak when shrinking a disk image. For example:
Step 1,
# /opt/qemu-git-arm/bin/qemu-img info /home/lijun/Work/tmp/shrink.qcow2
image: /home/lijun/Work/tmp/shrink.qcow2
file format: qcow2
virtual size: 2.0G (2147483648 bytes)
disk size: 2.0G
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
corrupt: false

Step 2,
# /opt/qemu-git-arm/bin/qemu-img resize /home/lijun/Work/tmp/shrink.qcow2 -100M
Image resized.

Step 3,
# /opt/qemu-git-arm/bin/qemu-img info /home/lijun/Work/tmp/shrink.qcow2
image: /home/lijun/Work/tmp/shrink.qcow2
file format: qcow2
virtual size: 1.9G (2042626048 bytes)
disk size: 1.9G
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
corrupt: false

Step 4,
# /opt/qemu-git-arm/bin/qemu-img check /home/lijun/Work/tmp/shrink.qcow2
No errors were found on the image.
31005/31168 = 99.48% allocated, 0.79% fragmented, 0.00% compressed clusters
Image end offset: 2033844224

As shown above, in step 1 the disk size and virtual size are both 2.0G.
After step 3, the disk size and virtual size are both 1.9G.

BTW, the above is also a simple test. I will submit a test case for
qemu-iotests later, but I am not so familiar with how to write qemu-iotests yet.

In block/qcow2.c, ftruncate is used to fix the host cluster leak.

In block/qcow2-cluster.c, qcow2_grow_l1_table is copied and adapted to
implement qcow2_shrink_l1_and_l2_table.

In block/qcow2-refcount.c, the implementation is also updated to handle
self-describing
refcount blocks in the update_refcount function.
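As a rough standalone illustration of that ftruncate step (the real change operates on bs->file inside block/qcow2.c; the file name below and the assumption that the image end offset is already known are for illustration only):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("shrink.qcow2", O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* After the L1/L2 tables and refcounts have been shrunk, everything
     * past the last used cluster is leaked host space; cut it off so the
     * host file system gets it back (2033844224 is the image end offset
     * reported by qemu-img check above). */
    if (ftruncate(fd, 2033844224) < 0) {
        perror("ftruncate");
        return 1;
    }

    close(fd);
    return 0;
}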

Thanks.


Jun Li (2):
  qcow2: Add qcow2_shrink_l1_and_l2_table for qcow2 shrinking
  qcow2: add update refcount table realization for update_refcount

 block/qcow2-cluster.c  | 173 +
 block/qcow2-refcount.c |  44 +
 block/qcow2.c  |  40 ++--
 block/qcow2.h  |   2 +
 4 files changed, 255 insertions(+), 4 deletions(-)

-- 
1.9.3




[Qemu-devel] [PATCH v3 1/2] qcow2: Add qcow2_shrink_l1_and_l2_table for qcow2 shrinking

2014-10-12 Thread Jun Li
This patch is the implementation of the new function qcow2_shrink_l1_and_l2_table.
This function shrinks/discards the l1 and l2 tables when doing qcow2 shrinking.

Signed-off-by: Jun Li junm...@gmail.com
---
Compared to v2, v3 fixed host cluster leak.
---
 block/qcow2-cluster.c | 173 ++
 block/qcow2.c |  40 ++--
 block/qcow2.h |   2 +
 3 files changed, 211 insertions(+), 4 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index f7dd8c0..2ac3536 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -29,6 +29,9 @@
 #include "block/qcow2.h"
 #include "trace.h"
 
+static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
+   uint64_t **l2_table);
+
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
 bool exact_size)
 {
@@ -135,6 +138,176 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t 
min_size,
 return ret;
 }
 
+int qcow2_shrink_l1_and_l2_table(BlockDriverState *bs, uint64_t new_l1_size,
+ int new_l2_index, bool exact_size)
+{
+BDRVQcowState *s = bs-opaque;
+int new_l1_size2, ret, i;
+uint64_t *new_l1_table;
+int64_t new_l1_table_offset;
+int64_t old_l1_table_offset, old_l1_size;
+uint8_t data[12];
+
+new_l1_size2 = sizeof(uint64_t) * new_l1_size;
+new_l1_table = qemu_try_blockalign(bs->file,
+   align_offset(new_l1_size2, 512));
+if (new_l1_table == NULL) {
+return -ENOMEM;
+}
+memset(new_l1_table, 0, align_offset(new_l1_size2, 512));
+
+/* shrinking l1 table */
+memcpy(new_l1_table, s-l1_table, new_l1_size2);
+
+/* write new table (align to cluster) */
+BLKDBG_EVENT(bs-file, BLKDBG_L1_GROW_ALLOC_TABLE);
+new_l1_table_offset = qcow2_alloc_clusters(bs, new_l1_size2);
+if (new_l1_table_offset < 0) {
+qemu_vfree(new_l1_table);
+return new_l1_table_offset;
+}
+
+ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+if (ret < 0) {
+goto fail;
+}
+
+/* the L1 position has not yet been updated, so these clusters must
+ * indeed be completely free */
+ret = qcow2_pre_write_overlap_check(bs, 0, new_l1_table_offset,
+new_l1_size2);
+if (ret < 0) {
+goto fail;
+}
+
+BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_WRITE_TABLE);
+
+for (i = 0; i < new_l1_size; i++) {
+new_l1_table[i] = cpu_to_be64(new_l1_table[i]);
+}
+
+ret = bdrv_pwrite_sync(bs->file, new_l1_table_offset,
+   new_l1_table, new_l1_size2);
+if (ret < 0) {
+goto fail;
+}
+
+for (i = 0; i < new_l1_size; i++) {
+new_l1_table[i] = be64_to_cpu(new_l1_table[i]);
+}
+
+/* set new table */
+BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_ACTIVATE_TABLE);
+cpu_to_be32w((uint32_t *)data, new_l1_size);
+stq_be_p(data + 4, new_l1_table_offset);
+ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, l1_size),
+   data, sizeof(data));
+if (ret < 0) {
+goto fail;
+}
+
+old_l1_table_offset = s->l1_table_offset;
+s->l1_table_offset = new_l1_table_offset;
+uint64_t *old_l1_table = s->l1_table;
+s->l1_table = new_l1_table;
+old_l1_size = s->l1_size;
+s->l1_size = new_l1_size;
+
+int num = old_l1_size - s->l1_size;
+
+while (num >= 0) {
+uint64_t l2_offset;
+int ret;
+uint64_t *l2_table, l2_entry;
+int last_free_cluster = 0;
+
+l2_offset = old_l1_table[s->l1_size + num - 1] & L1E_OFFSET_MASK;
+if (l2_offset == 0) {
+goto retry;
+}
+
+if (num == 0) {
+if (new_l2_index == 0) {
+goto retry;
+}
+last_free_cluster = new_l2_index + 1;
+}
+
+/* load l2_table into cache */
+ret = l2_load(bs, l2_offset, &l2_table);
+
+if (ret < 0) {
+goto fail;
+}
+
+for (i = s->l2_size; i > last_free_cluster; i--) {
+l2_entry = be64_to_cpu(l2_table[i - 1]);
+
+switch (qcow2_get_cluster_type(l2_entry)) {
+case QCOW2_CLUSTER_UNALLOCATED:
+if (!bs->backing_hd) {
+continue;
+}
+break;
+
+case QCOW2_CLUSTER_ZERO:
+continue;
+
+case QCOW2_CLUSTER_NORMAL:
+case QCOW2_CLUSTER_COMPRESSED:
+break;
+
+default:
+abort();
+}
+
+qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
+if (s->qcow_version >= 3) {
+l2_table[i - 1] = cpu_to_be64(QCOW_OFLAG_ZERO);
+} else {
+l2_table[i - 1] = cpu_to_be64(0);
+}
+
+/* before discard the specify item of l2_entry,
+ * set this entry with 0.
+  

[Qemu-devel] [PATCH v3 2/2] qcow2: add update refcount table realization for update_refcount

2014-10-12 Thread Jun Li
When every item of a refcount block is NULL, free the refcount block and reset the
corresponding item of the refcount table to NULL. At the same time, put this
cluster into s->free_cluster_index.

Signed-off-by: Jun Li junm...@gmail.com
---
This version adds handling of self-describing refcount blocks.
---
 block/qcow2-refcount.c | 44 
 1 file changed, 44 insertions(+)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 2bcaaf9..8594ffd 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -28,6 +28,7 @@
 #include "qemu/range.h"
 
 static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
+static int write_reftable_entry(BlockDriverState *bs, int rt_index);
 static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
 int64_t offset, int64_t length,
 int addend, enum qcow2_discard_type type);
@@ -599,6 +600,49 @@ static int QEMU_WARN_UNUSED_RESULT 
update_refcount(BlockDriverState *bs,
 if (refcount == 0 && s->discard_passthrough[type]) {
 update_refcount_discard(bs, cluster_offset, s->cluster_size);
 }
+
+unsigned int refcount_table_index;
+refcount_table_index = cluster_index >> (s->cluster_bits -
+   REFCOUNT_SHIFT);
+
+uint64_t refcount_block_offset =
+s->refcount_table[refcount_table_index] & REFT_OFFSET_MASK;
+
+int64_t self_block_index = refcount_block_offset >> s->cluster_bits;
+int self_block_index2 = (refcount_block_offset >> s->cluster_bits) &
+((1 << (s->cluster_bits - REFCOUNT_SHIFT)) - 1);
+
+/* When refcount block is NULL, update refcount table */
+if (!be16_to_cpu(refcount_block[0]) || self_block_index2 == 0) {
+int k;
+int refcount_block_entries = s->cluster_size /
+ sizeof(refcount_block[0]);
+for (k = 0; k < refcount_block_entries; k++) {
+if (k == self_block_index2) {
+k++;
+}
+
+if (be16_to_cpu(refcount_block[k])) {
+break;
+}
+}
+
+if (k == refcount_block_entries) {
+if (self_block_index < s->free_cluster_index) {
+s->free_cluster_index = self_block_index;
+}
+
+/* update refcount table */
+s->refcount_table[refcount_table_index] = 0;
+ret = write_reftable_entry(bs, refcount_table_index);
+if (ret < 0) {
+fprintf(stderr, "Could not update refcount table: %s\n",
+strerror(-ret));
+goto fail;
+}
+
+}
+}
 }
 
 ret = 0;
-- 
1.9.3




[Qemu-devel] [PATCH V3] net: don't use set/get_pointer() in set/get_netdev()

2014-10-12 Thread Jason Wang
Commit 1ceef9f27359cbe92ef124bf74de6f792e71f6fb (net: multiqueue
support) tries to use set_pointer() and get_pointer() to set and get
NICPeers, which is not a pointer defined in DEFINE_PROP_NETDEV. This
trick works but results in an unclean and fragile implementation (e.g.
print_netdev and parse_netdev).

This patch solves this issue by not using set/get_pointer() and setting and
getting the netdev directly in set_netdev() and get_netdev(). After this,
parse_netdev() and print_netdev() are no longer used and are dropped from
the source.

Cc: Markus Armbruster arm...@redhat.com
Cc: Stefan Hajnoczi stefa...@redhat.com
Cc: Peter Maydell peter.mayd...@linaro.org
Signed-off-by: Jason Wang jasow...@redhat.com
---
Changes from V2:
- Use error_setg() instead of error_set_from_qdev_prop_error() for E2BIG error.
- Clean up the return part of set_netdev(), since
  error_set_from_qdev_prop_error() does nothing when err is 0.
Changes from V1:
- validate ncs pointer before accessing them, this fixes the qtest failure
  on arm.
---
 hw/core/qdev-properties-system.c | 70 ++--
 1 file changed, 38 insertions(+), 32 deletions(-)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 84caa1d..03a6e68 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -176,41 +176,68 @@ PropertyInfo qdev_prop_chr = {
 };
 
 /* --- netdev device --- */
+static void get_netdev(Object *obj, Visitor *v, void *opaque,
+   const char *name, Error **errp)
+{
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+NICPeers *peers_ptr = qdev_get_prop_ptr(dev, prop);
+char *p = g_strdup(peers_ptr->ncs[0] ? peers_ptr->ncs[0]->name : "");
 
-static int parse_netdev(DeviceState *dev, const char *str, void **ptr)
+visit_type_str(v, &p, name, errp);
+g_free(p);
+}
+
+static void set_netdev(Object *obj, Visitor *v, void *opaque,
+   const char *name, Error **errp)
 {
-NICPeers *peers_ptr = (NICPeers *)ptr;
+DeviceState *dev = DEVICE(obj);
+Property *prop = opaque;
+NICPeers *peers_ptr = qdev_get_prop_ptr(dev, prop);
 NetClientState **ncs = peers_ptr-ncs;
 NetClientState *peers[MAX_QUEUE_NUM];
-int queues, i = 0;
-int ret;
+Error *local_err = NULL;
+int queues, err = 0, i = 0;
+char *str;
+
+if (dev->realized) {
+qdev_prop_set_after_realize(dev, name, errp);
+return;
+}
+
+visit_type_str(v, &str, name, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
 
 queues = qemu_find_net_clients_except(str, peers,
   NET_CLIENT_OPTIONS_KIND_NIC,
   MAX_QUEUE_NUM);
 if (queues == 0) {
-ret = -ENOENT;
+err = -ENOENT;
 goto err;
 }
 
 if (queues > MAX_QUEUE_NUM) {
-ret = -E2BIG;
+error_setg(errp, "queues of backend '%s'(%d) exceeds QEMU 
limitation(%d)",
+   str, queues, MAX_QUEUE_NUM);
 goto err;
 }
 
 for (i = 0; i < queues; i++) {
 if (peers[i] == NULL) {
-ret = -ENOENT;
+err = -ENOENT;
 goto err;
 }
 
 if (peers[i]->peer) {
-ret = -EEXIST;
+err = -EEXIST;
 goto err;
 }
 
 if (ncs[i]) {
-ret = -EINVAL;
+err = -EINVAL;
 goto err;
 }
 
@@ -220,30 +247,9 @@ static int parse_netdev(DeviceState *dev, const char *str, 
void **ptr)
 
 peers_ptr->queues = queues;
 
-return 0;
-
 err:
-return ret;
-}
-
-static char *print_netdev(void *ptr)
-{
-NetClientState *netdev = ptr;
-const char *val = netdev->name ? netdev->name : "";
-
-return g_strdup(val);
-}
-
-static void get_netdev(Object *obj, Visitor *v, void *opaque,
-   const char *name, Error **errp)
-{
-get_pointer(obj, v, opaque, print_netdev, name, errp);
-}
-
-static void set_netdev(Object *obj, Visitor *v, void *opaque,
-   const char *name, Error **errp)
-{
-set_pointer(obj, v, opaque, parse_netdev, name, errp);
+error_set_from_qdev_prop_error(errp, err, dev, prop, str);
+g_free(str);
 }
 
 PropertyInfo qdev_prop_netdev = {
-- 
1.8.3.1