Re: [Qemu-devel] [PATCH v1 17/17] arm: boot: Support big-endian elfs

2016-02-27 Thread Peter Crosthwaite
On Tue, Jan 19, 2016 at 10:06 AM, Peter Maydell
 wrote:
> On 18 January 2016 at 07:12, Peter Crosthwaite
>  wrote:
>> Support ARM big-endian ELF files in system-mode emulation. When loading
>> an elf, determine the endianness mode expected by the elf, and set the
>> relevant CPU state accordingly.
>>
>> With this, big-endian modes are now fully supported via system-mode LE,
>> so there is no need to restrict the elf loading to the TARGET
>> endianness so the ifdeffery on TARGET_WORDS_BIGENDIAN goes away.
>>
>> Signed-off-by: Peter Crosthwaite 
>> ---
>>
>>  hw/arm/boot.c| 96 
>> ++--
>>  include/hw/arm/arm.h |  9 +
>>  2 files changed, 88 insertions(+), 17 deletions(-)
>>
>> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
>> index 0de4269..053c9e8 100644
>> --- a/hw/arm/boot.c
>> +++ b/hw/arm/boot.c
>> @@ -465,9 +465,34 @@ static void do_cpu_reset(void *opaque)
>>  cpu_reset(cs);
>>  if (info) {
>>  if (!info->is_linux) {
>> +int i;
>>  /* Jump to the entry point.  */
>>  uint64_t entry = info->entry;
>>
>> +switch (info->endianness) {
>> +case ARM_ENDIANNESS_LE:
>> +env->cp15.sctlr_el[1] &= ~SCTLR_E0E;
>> +for (i = 1; i < 4; ++i) {
>> +env->cp15.sctlr_el[i] &= ~SCTLR_EE;
>> +}
>> +env->uncached_cpsr &= ~CPSR_E;
>> +break;
>> +case ARM_ENDIANNESS_BE8:
>> +env->cp15.sctlr_el[1] |= SCTLR_E0E;
>> +for (i = 1; i < 4; ++i) {
>> +env->cp15.sctlr_el[i] |= SCTLR_EE;
>> +}
>> +env->uncached_cpsr |= CPSR_E;
>> +break;
>> +case ARM_ENDIANNESS_BE32:
>> +env->cp15.sctlr_el[1] |= SCTLR_B;
>> +break;
>> +case ARM_ENDIANNESS_UNKNOWN:
>> +break; /* Board's decision */
>> +default:
>> +g_assert_not_reached();
>> +}
>
> Do we really want this much magic for non-linux images?

I think so ...

>  I would
> expect that the image would be intended to run with whatever the
> state the board puts the CPU in from reset (ie the CPU has suitable
> QOM properties for its initial endianness state, corresponding to
> real hardware reset-config signals like the A15's CFGEND/CFGTE).
>

As with this you can handle two use cases:

1: Elfs for firmware development
2: Elfs for a real-world loadable guest.

Firmware elfs should match the hardwired (QOM properties) settings
anyway, but the more interesting case is the real-world loadable guest
ELF. That is, you have some elf-capable bootloader in real HW which
will DTRT based on the elf header before handoff. This will emulate
that case without needing to demote your elf to raw binary data (and
then explicitly load your bootloader).

>> +
>>  if (!env->aarch64) {
>>  env->thumb = info->entry & 1;
>>  entry &= 0xfffe;
>> @@ -589,16 +614,23 @@ static void arm_load_kernel_notify(Notifier *notifier, 
>> void *data)
>>  int kernel_size;
>>  int initrd_size;
>>  int is_linux = 0;
>> +
>>  uint64_t elf_entry, elf_low_addr, elf_high_addr;
>>  int elf_machine;
>> +bool elf_is64;
>> +union {
>> +Elf32_Ehdr h32;
>> +Elf64_Ehdr h64;
>> +} elf_header;
>> +
>>  hwaddr entry, kernel_load_offset;
>> -int big_endian;
>>  static const ARMInsnFixup *primary_loader;
>>  ArmLoadKernelNotifier *n = DO_UPCAST(ArmLoadKernelNotifier,
>>   notifier, notifier);
>>  ARMCPU *cpu = n->cpu;
>>  struct arm_boot_info *info =
>>  container_of(n, struct arm_boot_info, load_kernel_notifier);
>> +Error *err = NULL;
>>
>>  /* The board code is not supposed to set secure_board_setup unless
>>   * running its code in secure mode is actually possible, and KVM
>> @@ -678,12 +710,6 @@ static void arm_load_kernel_notify(Notifier *notifier, 
>> void *data)
>>  if (info->nb_cpus == 0)
>>  info->nb_cpus = 1;
>>
>> -#ifdef TARGET_WORDS_BIGENDIAN
>> -big_endian = 1;
>> -#else
>> -big_endian = 0;
>> -#endif
>
> Was this code ever built with TARGET_WORDS_BIGENDIAN defined?
>

No I believe not. I can do the dead code cleanup as a separate patch?

>> -
>>  /* We want to put the initrd far enough into RAM that when the
>>   * kernel is uncompressed it will not clobber the initrd. However
>>   * on boards without much RAM we must ensure that we still leave
>> @@ -698,16 +724,52 @@ static void arm_load_kernel_notify(Notifier *notifier, 
>> void *data)
>>  MIN(info->ram_size / 2, 128 * 1024 * 1024);
>>
>>  /* Assume that raw images are linux kernels, and ELF images are not.  */
>> -kernel_size = load_elf(info->kernel_filename, NULL, NULL, &elf_entry,
>> -  

Re: [Qemu-devel] [PATCH v1 16/17] loader: Add data swap option to load-elf

2016-02-27 Thread Peter Crosthwaite
On Tue, Jan 19, 2016 at 9:53 AM, Peter Maydell  wrote:
> On 18 January 2016 at 07:12, Peter Crosthwaite
>  wrote:
>> Some CPUs are of an opposite data-endianness to other components in the
>> system. Sometimes elfs have the data sections layed out with this CPU
>> data-endianess accounting for when loaded via the CPU, byte swaps
>> (relative to other system components) will occur.
>>
>> The leading example, is ARM's BE32 mode, which is is basically LE with
>> address manipulation on half-word and byte accesses to access the
>> hw/byte reversed address. This means that word data is invariant
>> accross LE and BE32. This also means that instructions are still LE.
>> The expectation is that the elf will be loaded via the CPU in this
>> endianness scheme, which means the data in the elf is reversed at
>> compile time.
>>
>> As QEMU loads via the system memory directly, rather than the CPU, we
>> need a mechanism to reverse elf data endianness to implement this
>> possibility.
>>
>> Signed-off-by: Peter Crosthwaite 
>
>> diff --git a/include/hw/loader.h b/include/hw/loader.h
>> index 33067f8..e542575 100644
>> --- a/include/hw/loader.h
>> +++ b/include/hw/loader.h
>> @@ -35,7 +35,7 @@ const char *load_elf_strerror(int error);
>>  int load_elf(const char *filename, uint64_t (*translate_fn)(void *, 
>> uint64_t),
>>   void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
>>   uint64_t *highaddr, int big_endian, int elf_machine,
>> - int clear_lsb);
>> + int clear_lsb, int data_swab);
>>  void load_elf_hdr(const char *filename, void *hdr, bool *is64, Error 
>> **errp);
>>  int load_aout(const char *filename, hwaddr addr, int max_sz,
>>int bswap_needed, hwaddr target_page_size);
>
> Can we have a doc comment so we have something that defines what
> values data_swab accepts? (it's not just a bool).
>

This is difficult to capture without writing to whole documentation
for load_elf. So here goes:

/** load_elf:
 * @filename: Path of ELF file
 * @translate_fn: optional function to translate load addresses
 * @translate_opaque: opaque data passed to @translate_fn
 * @pentry: Populated with program entry point. Ignored if NULL.
 * @lowaddr: Populated with lowest loaded address. Ignored if NULL.
 * @highaddr: Populated with highest loaded address. Ignored if NULL.
 * @bigendian: Expected ELF endianness. 0 for LE otherwise BE
 * @elf_machine: Expected ELF machine type
 * @clear_lsb: Set to mask off LSB of addresses (Some arch's use this
   for non-address data)
 * @data_swab: Set to order of byte swapping for data. 0 for no swap, 1
 * for swapping bytes within halfwords, 2 for bytes within
 * words and 3 for within doublewords.
 *
 * Load an ELF file's contents to the emulated systems address space.
 * Clients may optionally specify a callback to perform address
 * translations. @pentry, @lowaddr and @highaddr are optional pointers
 * which will be populated with various load information. @bigendian and
 * @elf_machine give the expected endianness and machine for the ELF the
 * load will fail it the target ELF does not match. Some architectures
 * have some arch-specific behaviours that come into effect when their
 * particular values for @elf_machine are set.
 */

Regards,
Peter

> Otherwise
> Reviewed-by: Peter Maydell 
>
> thanks
> -- PMM



Re: [Qemu-devel] [PATCH v1 15/17] loader: add API to load elf header

2016-02-27 Thread Peter Crosthwaite
On Tue, Jan 19, 2016 at 9:50 AM, Peter Maydell  wrote:
> On 18 January 2016 at 07:12, Peter Crosthwaite
>  wrote:
>> Add an API to load an elf header header from a file. Populates a
>> buffer with the header contents, as well as a boolean for whether the
>> elf is 64b or not. Both arguments are optional.
>>
>> Signed-off-by: Peter Crosthwaite 
>> ---
>>
>>  hw/core/loader.c| 48 
>>  include/hw/loader.h |  1 +
>>  2 files changed, 49 insertions(+)
>>
>> diff --git a/hw/core/loader.c b/hw/core/loader.c
>> index 6b69852..28da8e2 100644
>> --- a/hw/core/loader.c
>> +++ b/hw/core/loader.c
>> @@ -331,6 +331,54 @@ const char *load_elf_strerror(int error)
>>  }
>>  }
>>
>> +void load_elf_hdr(const char *filename, void *hdr, bool *is64, Error **errp)
>> +{
>> +int fd;
>> +uint8_t e_ident[EI_NIDENT];
>> +size_t hdr_size, off = 0;
>> +bool is64l;
>> +
>> +fd = open(filename, O_RDONLY | O_BINARY);
>> +if (fd < 0) {
>> +error_setg_errno(errp, errno, "Fail to open file");
>
> "Failed" (also below).
>

Fixed (x2).

> I don't think we end up with the filename anywhere in the
> error message; it would be helpful if we could include it.
>

Fixed (x4)

>> +return;
>> +}
>> +if (read(fd, e_ident, sizeof(e_ident)) != sizeof(e_ident)) {
>> +error_setg_errno(errp, errno, "Fail to read file");
>> +goto fail;
>> +}
>> +if (e_ident[0] != ELFMAG0 ||
>> +e_ident[1] != ELFMAG1 ||
>> +e_ident[2] != ELFMAG2 ||
>> +e_ident[3] != ELFMAG3) {
>> +error_setg(errp, "Bad ELF magic");
>> +goto fail;
>> +}
>> +
>> +is64l = e_ident[EI_CLASS] == ELFCLASS64;
>> +hdr_size = is64l ? sizeof(Elf64_Ehdr) : sizeof(Elf32_Ehdr);
>> +if (is64) {
>> +*is64 = is64l;
>> +}
>> +
>> +lseek(fd, 0, SEEK_SET);
>
> You're not checking this lseek for failure (and you don't
> need it anyway, because you could just copy the magic bytes
> into *hdr and read four fewer bytes).
>

OK, so I have optimised it away. What I am doing now is always reading
to straight to hdr[], and if the caller passes hdr == NULL, then hdr
is set to a local buffer (and the full header read is still skipped as
per current logic).

>> +while (hdr && off < hdr_size) {
>> +size_t br = read(fd, hdr + off, hdr_size - off);
>> +switch (br) {
>> +case 0:
>> +error_setg(errp, "File too short");
>> +goto fail;
>> +case -1:
>> +error_setg_errno(errp, errno, "Failed to read file");
>> +goto fail;
>> +}
>> +off += br;
>> +}
>> +
>> +fail:
>> +close(fd);
>> +}
>> +
>>  /* return < 0 if error, otherwise the number of bytes loaded in memory */
>>  int load_elf(const char *filename, uint64_t (*translate_fn)(void *, 
>> uint64_t),
>>   void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
>> diff --git a/include/hw/loader.h b/include/hw/loader.h
>> index f7b43ab..33067f8 100644
>> --- a/include/hw/loader.h
>> +++ b/include/hw/loader.h
>> @@ -36,6 +36,7 @@ int load_elf(const char *filename, uint64_t 
>> (*translate_fn)(void *, uint64_t),
>>   void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
>>   uint64_t *highaddr, int big_endian, int elf_machine,
>>   int clear_lsb);
>> +void load_elf_hdr(const char *filename, void *hdr, bool *is64, Error 
>> **errp);
>
> Doc comment, please.
>

Added:

+
+/** load_elf_hdr:
+ * @filename: Path of ELF file
+ * @hdr: Buffer to populate with header data. Header data will not be
+ * filled if set to NULL.
+ * @is64: Set to true if the ELF is 64bit. Ignored if set to NULL
+ * @errp: Populated with an error in failure cases
+ *
+ * Inspect as ELF file's header. Read its full header contents into a
+ * buffer and/or determine if the ELF is 64bit.
+ */


Regards,
Peter

>>  int load_aout(const char *filename, hwaddr addr, int max_sz,
>>int bswap_needed, hwaddr target_page_size);
>>  int load_uimage(const char *filename, hwaddr *ep,
>> --
>> 1.9.1
>
> thanks
> -- PMM



Re: [Qemu-devel] [PATCH v1 13/17] arm: linux-user: don't set CPSR.E in BE32 mode

2016-02-27 Thread Peter Crosthwaite
On Tue, Jan 19, 2016 at 9:35 AM, Peter Maydell  wrote:
> On 19 January 2016 at 17:26, Peter Maydell  wrote:
>> On 18 January 2016 at 07:12, Peter Crosthwaite
>>  wrote:
>>> Don't set CPSR.E for BE32 linux-user mode. As linux-user mode models
>>> BE32, using normal BE (and system mode will not), a special case is
>>> needed for user-mode where if sctlr.b is set, the CPU identifies as BE.
>>>
>>> Signed-off-by: Peter Crosthwaite 
>>> ---
>>>
>>>  linux-user/main.c |  2 --
>>>  target-arm/cpu.h  | 12 +++-
>>>  2 files changed, 11 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/linux-user/main.c b/linux-user/main.c
>>> index d481458..60375fb 100644
>>> --- a/linux-user/main.c
>>> +++ b/linux-user/main.c
>>> @@ -4496,8 +4496,6 @@ int main(int argc, char **argv, char **envp)
>>>  env->uncached_cpsr |= CPSR_E;
>>>  } else {
>>>  env->cp15.sctlr_el[1] |= SCTLR_B;
>>> -/* We model BE32 as regular BE, so set CPSR_E */
>>> -env->uncached_cpsr |= CPSR_E;
>>
>> ...this is undoing what we just did in the previous patch and
>> which I reviewed as being the wrong thing there...
>>
>>>  }
>>>  #endif
>>>  }
>>> diff --git a/target-arm/cpu.h b/target-arm/cpu.h
>>> index 3edd56b..96b1e99 100644
>>> --- a/target-arm/cpu.h
>>> +++ b/target-arm/cpu.h
>>> @@ -1812,7 +1812,17 @@ static bool arm_cpu_is_big_endian(CPUARMState *env)
>>>
>>>  /* In 32bit endianness is determined by looking at CPSR's E bit */
>>>  if (!is_a64(env)) {
>>> -return (env->uncached_cpsr & CPSR_E) ? 1 : 0;
>>> +return
>>> +#ifdef CONFIG_USER_ONLY
>>> +/* In user mode, BE32 data accesses are just modelled as
>>> + * regular BE access. In system mode, BE32 is modelled as
>>> + * little endian, with the appropriate address translations on
>>> + * non-word accesses. So sctlr.b only affects overall
>>> + * endianness in user mode
>>> + */
>>> +arm_sctlr_b(env) ||
>>> +#endif
>>> +((env->uncached_cpsr & CPSR_E) ? 1 : 0);
>>>  }
>>
>> This doesn't seem quite right -- for system emulation we currently
>> pick MO_BE or MO_LE based on the TB flag which is set according
>> to (arm_cpu_is_big_endian(env). So if we ignore SCTLR.B in
>> system mode then we'll still try to do LE accesses.
>
> Ah, no, looking at the next patch this is correct, it's just the
> comment is a touch confusing. I suggest
>
>  /* In system mode, BE32 is modelled in line with the architecture
>   * (as word-invariant big-endianness), where loads and stores are done
>   * little endian but from addresses which are adjusted by XORing
>   * with the appropriate constant. So the endianness to use for the
>   * raw data access is not affected by SCTLR.B.
>   * In user mode, however, we model BE32 as byte-invariant big-endianness
>   * (because user-only code cannot tell the difference), and so we
>   * need to use a data access endianness that depends on SCTLR.B.
>   */
>

Comment updated.

Regards,
Peter

> thanks
> -- PMM



Re: [Qemu-devel] [PATCH v1 10/17] target-arm: implement setend

2016-02-27 Thread Peter Crosthwaite
On Tue, Jan 19, 2016 at 8:29 AM, Peter Maydell  wrote:
> On 18 January 2016 at 07:12, Peter Crosthwaite
>  wrote:
>> From: Paolo Bonzini 
>>
>> Since this is not a high-performance path, just use a helper to
>> flip the E bit and force a lookup in the hash table since the
>> flags have changed.
>>
>> Signed-off-by: Paolo Bonzini 
>> Signed-off-by: Peter Crosthwaite 
>> ---
>>
>>  target-arm/helper.h|  1 +
>>  target-arm/op_helper.c |  5 +
>>  target-arm/translate.c | 16 
>>  3 files changed, 14 insertions(+), 8 deletions(-)
>>
>> diff --git a/target-arm/helper.h b/target-arm/helper.h
>> index c2a85c7..2315a9c 100644
>> --- a/target-arm/helper.h
>> +++ b/target-arm/helper.h
>> @@ -48,6 +48,7 @@ DEF_HELPER_FLAGS_3(sel_flags, TCG_CALL_NO_RWG_SE,
>> i32, i32, i32, i32)
>>  DEF_HELPER_2(exception_internal, void, env, i32)
>>  DEF_HELPER_4(exception_with_syndrome, void, env, i32, i32, i32)
>> +DEF_HELPER_1(setend, void, env)
>>  DEF_HELPER_1(wfi, void, env)
>>  DEF_HELPER_1(wfe, void, env)
>>  DEF_HELPER_1(yield, void, env)
>> diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
>> index e42d287..2a4bc67 100644
>> --- a/target-arm/op_helper.c
>> +++ b/target-arm/op_helper.c
>> @@ -295,6 +295,11 @@ uint32_t HELPER(usat16)(CPUARMState *env, uint32_t x, 
>> uint32_t shift)
>>  return res;
>>  }
>>
>> +void HELPER(setend)(CPUARMState *env)
>> +{
>> +env->uncached_cpsr ^= CPSR_E;
>> +}
>> +
>>  /* Function checks whether WFx (WFI/WFE) instructions are set up to be 
>> trapped.
>>   * The function returns the target EL (1-3) if the instruction is to be 
>> trapped;
>>   * otherwise it returns 0 indicating it is not trapped.
>> diff --git a/target-arm/translate.c b/target-arm/translate.c
>> index cb925ef..192a5d6 100644
>> --- a/target-arm/translate.c
>> +++ b/target-arm/translate.c
>> @@ -7726,10 +7726,10 @@ static void disas_arm_insn(DisasContext *s, unsigned 
>> int insn)
>>  if ((insn & 0x0dff) == 0x0101) {
>>  ARCH(6);
>>  /* setend */
>> -if (((insn >> 9) & 1) != s->bswap_code) {
>> -/* Dynamic endianness switching not implemented. */
>> -qemu_log_mask(LOG_UNIMP, "arm: unimplemented setend\n");
>> -goto illegal_op;
>> +if (((insn >> 9) & 1) != !!(s->mo_endianness == MO_BE)) {
>> +gen_helper_setend(cpu_env);
>> +gen_set_pc_im(s, s->pc);
>> +s->is_jmp = DISAS_JUMP;
>
> Rather than manually calling set_pc_im and using DISAS_JUMP, better
> to use DISAS_UPDATE, which will do the gen_set_pc_im() call for you.
>
> (Ditto in the other hunk.)
>

Fixed.

Regards,
Peter

> thanks
> -- PMM



[Qemu-devel] [Bug 1550743] [NEW] connect low speed host devices to qemu ehci does not work

2016-02-27 Thread Bert Brecht
Public bug reported:

$ qemu-system-i386 -hda my_x86.img -device ich9-usb-ehci1,id=ehci -device 
usb-host,vendorid=0x045e,productid=0x071d -serial stdio
qemu-system-i386: Warning: speed mismatch trying to attach usb device 
"Microsoft? 2.4GHz Transceiver V" ( speed) to bus "ehci.0", port "1" (high 
speed)
qemu-system-i386: Warning: speed mismatch trying to attach usb device 
"Microsoft? 2.4GHz Transceiver V" ( speed) to bus "ehci.0", port "1" (high 
speed)
qemu-system-i386: Warning: speed mismatch trying to attach usb device 
"Microsoft? 2.4GHz Transceiver V" ( speed) to bus "ehci.0", port "1" (high 
speed)

Which is obviously wrong. The ehci specification states:

Low-speed device, release ownership of port <= Table 2-16.

Table 2-6:

Number of Companion Controller (N_CC). This field indicates the number of
companion controllers associated with this USB 2.0 host controller.
A zero in this field indicates there are no companion host controllers. 
Port-ownership
hand-off is not supported. Only high-speed devices are supported on the host 
controller
root ports.
A value larger than zero in this field indicates there are companion USB 1.1 
host
controller(s). Port-ownership hand-offs are supported. High, Full- and Low-speed
devices are supported on the host controller root ports.

Which is not longer true, as for example skylake and baytrail offers a
dual usb stack of ehci and xhci. In that case, EHCI handles the low
speed device as well.

brgds,
Bert

** Affects: qemu
 Importance: Undecided
 Status: New


** Tags: ehci usb

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1550743

Title:
  connect low speed host devices to qemu ehci does not work

Status in QEMU:
  New

Bug description:
  $ qemu-system-i386 -hda my_x86.img -device ich9-usb-ehci1,id=ehci -device 
usb-host,vendorid=0x045e,productid=0x071d -serial stdio
  qemu-system-i386: Warning: speed mismatch trying to attach usb device 
"Microsoft? 2.4GHz Transceiver V" ( speed) to bus "ehci.0", port "1" (high 
speed)
  qemu-system-i386: Warning: speed mismatch trying to attach usb device 
"Microsoft? 2.4GHz Transceiver V" ( speed) to bus "ehci.0", port "1" (high 
speed)
  qemu-system-i386: Warning: speed mismatch trying to attach usb device 
"Microsoft? 2.4GHz Transceiver V" ( speed) to bus "ehci.0", port "1" (high 
speed)

  Which is obviously wrong. The ehci specification states:

  Low-speed device, release ownership of port <= Table 2-16.

  Table 2-6:

  Number of Companion Controller (N_CC). This field indicates the number of
  companion controllers associated with this USB 2.0 host controller.
  A zero in this field indicates there are no companion host controllers. 
Port-ownership
  hand-off is not supported. Only high-speed devices are supported on the host 
controller
  root ports.
  A value larger than zero in this field indicates there are companion USB 1.1 
host
  controller(s). Port-ownership hand-offs are supported. High, Full- and 
Low-speed
  devices are supported on the host controller root ports.

  Which is not longer true, as for example skylake and baytrail offers a
  dual usb stack of ehci and xhci. In that case, EHCI handles the low
  speed device as well.

  brgds,
  Bert

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1550743/+subscriptions



Re: [Qemu-devel] [PATCH v1 08/17] target-arm: cpu: Move cpu_is_big_endian to header

2016-02-27 Thread Peter Crosthwaite
On Tue, Jan 19, 2016 at 8:11 AM, Peter Maydell  wrote:
> On 18 January 2016 at 07:12, Peter Crosthwaite
>  wrote:
>> From: Peter Crosthwaite 
>>
>> There is a CPU data endianness test that is used to drive the
>> virtio_big_endian test.
>>
>> Move this up to the header so it can be more generally used for endian
>> tests. The KVM specific cpu_syncronize_state call is left behind in the
>> virtio specific function.
>>
>> Signed-off-by: Peter Crosthwaite 
>> ---
>>
>>  target-arm/cpu.c | 19 +++
>>  target-arm/cpu.h | 19 +++
>>  2 files changed, 22 insertions(+), 16 deletions(-)
>>
>> diff --git a/target-arm/cpu.c b/target-arm/cpu.c
>> index 35a1f12..d3b73bf 100644
>> --- a/target-arm/cpu.c
>> +++ b/target-arm/cpu.c
>> @@ -368,26 +368,13 @@ static void arm_cpu_kvm_set_irq(void *opaque, int irq, 
>> int level)
>>  #endif
>>  }
>>
>> -static bool arm_cpu_is_big_endian(CPUState *cs)
>> +static bool arm_cpu_virtio_is_big_endian(CPUState *cs)
>>  {
>>  ARMCPU *cpu = ARM_CPU(cs);
>>  CPUARMState *env = &cpu->env;
>> -int cur_el;
>>
>>  cpu_synchronize_state(cs);
>> -
>> -/* In 32bit guest endianness is determined by looking at CPSR's E bit */
>> -if (!is_a64(env)) {
>> -return (env->uncached_cpsr & CPSR_E) ? 1 : 0;
>> -}
>> -
>> -cur_el = arm_current_el(env);
>> -
>> -if (cur_el == 0) {
>> -return (env->cp15.sctlr_el[1] & SCTLR_E0E) != 0;
>> -}
>> -
>> -return (env->cp15.sctlr_el[cur_el] & SCTLR_EE) != 0;
>> +return arm_cpu_is_big_endian(env);
>>  }
>>
>>  #endif
>> @@ -1420,7 +1407,7 @@ static void arm_cpu_class_init(ObjectClass *oc, void 
>> *data)
>>  cc->do_unaligned_access = arm_cpu_do_unaligned_access;
>>  cc->get_phys_page_debug = arm_cpu_get_phys_page_debug;
>>  cc->vmsd = &vmstate_arm_cpu;
>> -cc->virtio_is_big_endian = arm_cpu_is_big_endian;
>> +cc->virtio_is_big_endian = arm_cpu_virtio_is_big_endian;
>>  #endif
>>  cc->gdb_num_core_regs = 26;
>>  cc->gdb_core_xml_file = "arm-core.xml";
>> diff --git a/target-arm/cpu.h b/target-arm/cpu.h
>> index f83070a..54675c7 100644
>> --- a/target-arm/cpu.h
>> +++ b/target-arm/cpu.h
>> @@ -1795,6 +1795,25 @@ static inline bool arm_singlestep_active(CPUARMState 
>> *env)
>>  && arm_generate_debug_exceptions(env);
>>  }
>>
>> +/* Return true if the processor is in big-endian mode. */
>> +static bool arm_cpu_is_big_endian(CPUARMState *env)
>> +{
>
> No problems code-wise, but can we call the function
> arm_cpu_data_is_big_endian() or something?

Fixed.

Regards,
Peter

> This is returning the
> endianness to use for data accesses; there isn't an overall
> "big-endian mode" that affects everything except for the
> obsolete BE32.
>
> thanks
> -- PMM



Re: [Qemu-devel] [PATCH v1 04/17] target-arm: implement SCTLR.EE

2016-02-27 Thread Peter Crosthwaite
On Tue, Jan 19, 2016 at 7:58 AM, Peter Maydell  wrote:
> On 18 January 2016 at 07:12, Peter Crosthwaite
>  wrote:
>> From: Peter Crosthwaite 
>>
>> Implement SCTLR.EE bit which controls data endianess for exceptions
>> and page table translations. SCTLR.EE is mirrored to the CPSR.E bit
>> on exception entry.
>>
>> Signed-off-by: Peter Crosthwaite 
>> ---
>>
>>  target-arm/helper.c | 42 --
>>  1 file changed, 32 insertions(+), 10 deletions(-)
>>
>> diff --git a/target-arm/helper.c b/target-arm/helper.c
>> index 59d5a41..afac1b2 100644
>> --- a/target-arm/helper.c
>> +++ b/target-arm/helper.c
>> @@ -5889,7 +5889,10 @@ void arm_cpu_do_interrupt(CPUState *cs)
>>  /* Clear IT bits.  */
>>  env->condexec_bits = 0;
>>  /* Switch to the new mode, and to the correct instruction set.  */
>> -env->uncached_cpsr = (env->uncached_cpsr & ~CPSR_M) | new_mode;
>> +env->uncached_cpsr = (env->uncached_cpsr & ~(CPSR_M)) | new_mode;
>
> Why change this line?
>

Reverted

>> +/* Set new mode endianess */
>
> "endianness"
>

Fixed

>> +env->uncached_cpsr = (env->uncached_cpsr & ~(CPSR_E)) |
>> +(env->cp15.sctlr_el[arm_current_el(env)] & SCTLR_EE ? CPSR_E : 0);
>
> This is a bit confusing. I think just splitting it into
> multiple statements would help:
>env->uncached_cpsr &= ~CPSR_E;
>if (env->cp15.sctlr_el[arm_current_el(env)] & SCTLR_EE) {
>env->uncached_cpsr |= CPSR_E;
>}
>

Fixed.

>>  env->daif |= mask;
>>  /* this is a lie, as the was no c1_sys on V4T/V5, but who cares
>>   * and we should just guard the thumb mode on V4 */
>> @@ -5958,6 +5961,12 @@ static inline bool 
>> regime_translation_disabled(CPUARMState *env,
>>  return (regime_sctlr(env, mmu_idx) & SCTLR_M) == 0;
>>  }
>>
>> +static inline bool regime_translation_big_endian(CPUARMState *env,
>> + ARMMMUIdx mmu_idx)
>> +{
>> +return (regime_sctlr(env, mmu_idx) & SCTLR_EE) != 0;
>> +}
>> +
>>  /* Return the TCR controlling this translation regime */
>>  static inline TCR *regime_tcr(CPUARMState *env, ARMMMUIdx mmu_idx)
>>  {
>> @@ -6263,7 +6272,7 @@ static hwaddr S1_ptw_translate(CPUARMState *env, 
>> ARMMMUIdx mmu_idx,
>>   */
>>  static uint32_t arm_ldl_ptw(CPUState *cs, hwaddr addr, bool is_secure,
>>  ARMMMUIdx mmu_idx, uint32_t *fsr,
>> -ARMMMUFaultInfo *fi)
>> +ARMMMUFaultInfo *fi, bool be)
>>  {
>>  ARMCPU *cpu = ARM_CPU(cs);
>>  CPUARMState *env = &cpu->env;
>> @@ -6274,12 +6283,16 @@ static uint32_t arm_ldl_ptw(CPUState *cs, hwaddr 
>> addr, bool is_secure,
>>  if (fi->s1ptw) {
>>  return 0;
>>  }
>> -return address_space_ldl(cs->as, addr, attrs, NULL);
>> +if (be) {
>> +return address_space_ldl_be(cs->as, addr, attrs, NULL);
>> +} else {
>> +return address_space_ldl_le(cs->as, addr, attrs, NULL);
>> +}
>>  }
>
> Why not just call regime_translation_big_endian() inside arm_ldl_ptw()
> and arm_ldq_ptw(), rather than having every call site making the call
> and passing in the result?
>

Fixed.

> PS: this patch will conflict with the multi-ases series but only
> fairly trivially.
>

Resolved.

Regards,
Peter

> thanks
> -- PMM



Re: [Qemu-devel] [PATCH v2 6/9] i.MX: Add i.MX6 System Reset Controller device.

2016-02-27 Thread Peter Maydell
On 27 February 2016 at 16:57, Jean-Christophe DUBOIS
 wrote:
> Hi Peter and Peter,
>
> I need to test that the changes I did for PSCI (factor out on/off code) do
> not introduce any regression.
>
> On which QEMU target should I test my changes to PSCI to check I didn't mess
> up anything?

The 'virt' board uses PSCI.

thanks
-- PMM



Re: [Qemu-devel] [PATCH v2 6/9] i.MX: Add i.MX6 System Reset Controller device.

2016-02-27 Thread Jean-Christophe DUBOIS

Hi Peter and Peter,

I need to test that the changes I did for PSCI (factor out on/off code) 
do not introduce any regression.


On which QEMU target should I test my changes to PSCI to check I didn't 
mess up anything?


Thanks

JC

Le 16/02/2016 16:35, Peter Maydell a écrit :

On 8 February 2016 at 22:08, Jean-Christophe Dubois  
wrote:

This controller is also present in i.MX5X devices but they are not
yet emulated by QEMU.

Signed-off-by: Jean-Christophe Dubois 
@@ -0,0 +1,353 @@
+/*
+ * IMX6 System Reset Controller
+ *
+ * Copyright (c) 2015 Jean-Christophe Dubois 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/misc/imx6_src.h"
+#include "sysemu/sysemu.h"
+#include "qemu/bitops.h"

#include "qemu/osdep.h" as first #include line.


+static void imx6_src_reset(DeviceState *dev)
+{
+IMX6SRCState *s = IMX6_SRC(dev);
+
+DPRINTF("\n");
+
+/*
+ * We only clear the first registers as all GPR registers are preserved
+ * over resets
+ */
+memset(s->regs, 0, SRC_MAX * sizeof(uint32_t));

Comment doesn't seem to match code?


+/* Set reset values */
+s->regs[SRC_SCR] = 0x521;
+s->regs[SRC_SRSR] = 0x1;
+s->regs[SRC_SIMR] = 0x1F;
+}
+
+static CPUState *imx6_src_get_cpu_by_id(uint32_t id)
+{
+CPUState *cpu;
+
+DPRINTF("cpu %d\n", id);
+
+CPU_FOREACH(cpu) {
+ARMCPU *armcpu = ARM_CPU(cpu);
+
+if (armcpu->mp_affinity == id) {
+return cpu;
+}
+}
+
+qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: Resquesting unknown CPU %d\n",
+  TYPE_IMX6_SRC, __func__, id);
+
+return NULL;
+}
+
+static void imx6_src_cpu_on(uint32_t cpuid, uint32_t entry, uint32_t 
context_id)
+{
+CPUState *target_cpu_state;
+ARMCPU *target_cpu;
+CPUClass *target_cpu_class;
+
+DPRINTF("cpu %d @ 0x%08x with R0 = 0x%08x\n", cpuid, entry, context_id);
+
+/* change to the cpu we are powering up */
+target_cpu_state = imx6_src_get_cpu_by_id(cpuid);
+if (!target_cpu_state) {
+return;
+}
+target_cpu = ARM_CPU(target_cpu_state);
+if (!target_cpu->powered_off) {
+qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: CPU %d is already running\n",
+  TYPE_IMX6_SRC, __func__, cpuid);
+return;
+}
+target_cpu_class = CPU_GET_CLASS(target_cpu);
+
+/* Initialize the cpu we are turning on */
+cpu_reset(target_cpu_state);
+target_cpu->powered_off = false;
+target_cpu_state->halted = 0;
+
+target_cpu->env.regs[0] = context_id;
+target_cpu->env.thumb = entry & 1;
+
+target_cpu_class->set_pc(target_cpu_state, entry);
+}
+
+static void imx6_src_cpu_off(uint32_t cpuid)
+{
+CPUState *target_cpu_state;
+ARMCPU *target_cpu;
+
+DPRINTF("cpu %d\n", cpuid);
+
+/* change to the cpu we are powering up */
+target_cpu_state = imx6_src_get_cpu_by_id(cpuid);
+if (!target_cpu_state) {
+return;
+}
+target_cpu = ARM_CPU(target_cpu_state);
+if (target_cpu->powered_off) {
+qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: CPU %d is already off\n",
+  TYPE_IMX6_SRC, __func__, cpuid);
+return;
+}
+
+target_cpu->powered_off = true;
+target_cpu_state->halted = 1;
+target_cpu_state->exception_index = EXCP_HLT;
+cpu_loop_exit(target_cpu_state);
+}

+static void imx6_src_cpu_reset(uint32_t cpuid)
+{
+CPUState *target_cpu_state;
+ARMCPU *target_cpu;
+
+DPRINTF("cpu %d\n", cpuid);
+
+/* change to the cpu we are powering up */
+target_cpu_state = imx6_src_get_cpu_by_id(cpuid);
+if (!target_cpu_state) {
+return;
+}
+target_cpu = ARM_CPU(target_cpu_state);
+if (target_cpu->powered_off) {
+qemu_log_mask(LOG_GUEST_ERROR, "[%s]%s: CPU %d is off\n",
+  TYPE_IMX6_SRC, __func__, cpuid);
+return;
+}
+
+/* Reset the cpu we are turning on */
+cpu_reset(target_cpu_state);
+}

The code that is messing about with target CPUs to power them up
and down needs to be abstracted out into a public API in target-arm/,
so it can be used both by your device and by target-arm/psci.c.
I don't want variations on this code to duplicate into various
devices, because chances are good it will need to change for
multithreaded TCG.


+static void imx6_src_realize(DeviceState *dev, Error **errp)
+{
+IMX6SRCState *s = IMX6_SRC(dev);
+
+memory_region_init_io(&s->iomem, OBJECT(dev), &imx6_src_ops, s,
+  TYPE_IMX6_SRC, 0x1000);
+sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->iomem);
+}
+
+static void imx6_src_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+
+dc->realize = imx6_src_realize;
+dc->reset = imx6_src_reset;
+dc->vmsd = &vmstate_imx6_src;
+dc->desc = "i.MX6 System Reset Controller";
+}
+/* SRC_SCR */
+#define CORE3_ENABLE_SHIFT (24)

Re: [Qemu-devel] [Bug 1550503] [NEW] target-arm/helper.c:5493: bad test ?

2016-02-27 Thread Peter Maydell
On 26 February 2016 at 20:07, dcb <1550...@bugs.launchpad.net> wrote:
> Public bug reported:
>
> [qemu/target-arm/helper.c:5493]: (style) Expression '(X & 0x1f) !=
> 0xf80f' is always true.
>
> Source code is
>
> (env->uncached_cpsr & CPSR_M) != CPSR_USER &&
>
> but
>
> ./qemu/target-arm/cpu.h:#define CPSR_M (0x1fU)
>
> ./qemu/target-arm/cpu.h:#define CPSR_USER (CPSR_NZCV | CPSR_Q | CPSR_GE)

Yeah, that's a bug. Should be ARM_CPU_MODE_USR, not CPSR_USER.

thanks
-- PMM



Re: [Qemu-devel] [PATCH COLO-Frame v15 29/38] COLO: Separate the process of saving/loading ram and device state

2016-02-27 Thread Hailiang Zhang

On 2016/2/26 21:16, Dr. David Alan Gilbert wrote:

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:

We separate the process of saving/loading ram and device state when do
checkpoint, we add new helpers for save/load ram/device. With this change,
we can directly transfer ram from master to slave without using
QEMUSizeBufferas as assistant, which also reduce the size of extra memory
been used during checkpoint.

Besides, we move the colo_flush_ram_cache to the proper position after the
above change.

Signed-off-by: zhanghailiang 
Signed-off-by: Li Zhijian 
---
v14:
- split two new patches from this patch
- Some minor fixes from Dave
v13:
- Re-use some existed helper functions to realize saving/loading
   ram and device.
v11:
- Remove load configuration section in qemu_loadvm_state_begin()
---
  migration/colo.c   | 48 ++--
  migration/ram.c|  5 -
  migration/savevm.c |  5 +
  3 files changed, 43 insertions(+), 15 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 16bada6..300fa54 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -288,21 +288,37 @@ static int colo_do_checkpoint_transaction(MigrationState 
*s,
  goto out;
  }

+colo_put_cmd(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
+if (local_err) {
+goto out;
+}
+
  /* Disable block migration */
  s->params.blk = 0;
  s->params.shared = 0;
-qemu_savevm_state_header(trans);
-qemu_savevm_state_begin(trans, &s->params);
+qemu_savevm_state_begin(s->to_dst_file, &s->params);
+ret = qemu_file_get_error(s->to_dst_file);
+if (ret < 0) {
+error_report("Save vm state begin error");
+goto out;
+}
+
  qemu_mutex_lock_iothread();
-qemu_savevm_state_complete_precopy(trans, false);
+/*
+* Only save VM's live state, which not including device state.
+* TODO: We may need a timeout mechanism to prevent COLO process
+* to be blocked here.
+*/
+qemu_savevm_live_state(s->to_dst_file);
+/* Note: device state is saved into buffer */
+ret = qemu_save_device_state(trans);
  qemu_mutex_unlock_iothread();


Yes, I still worry a little about what can hang under that lock, but I think


Hmm, we have some other places in COLO taking this lock too. Some of them are
OK with holding it. But holding it while sending/receiving date from other side
in COLO is dangerous. One solution is to apply timeout for QEMUFile operation,
But it is not a good choice.


it's the best we've got at the moment; we probably need to understand what
the rules are about what actually needs the lock!



Yes, this is another way to solve the problem. Don't hold the lock while
sending/receiving date.
For qemu_savevm_live_state() here, IMHO, we can send the remained RAM data
without holding the global lock. Here we hold the lock, because we
call cpu_synchronize_all_states() in it. (I'm not sure ...)

Thanks,
Hailiang


Other than that,

Reviewed-by: Dr. David Alan Gilbert 


-
-qemu_fflush(trans);
-
-colo_put_cmd(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
-if (local_err) {
+if (ret < 0) {
+error_report("Save device state error");
  goto out;
  }
+qemu_fflush(trans);
+
  /* we send the total size of the vmstate first */
  size = qsb_get_length(buffer);
  colo_put_cmd_value(s->to_dst_file, COLO_MESSAGE_VMSTATE_SIZE,
@@ -573,6 +589,16 @@ void *colo_process_incoming_thread(void *opaque)
  goto out;
  }

+ret = qemu_loadvm_state_begin(mis->from_src_file);
+if (ret < 0) {
+error_report("Load vm state begin error, ret=%d", ret);
+goto out;
+}
+ret = qemu_loadvm_state_main(mis->from_src_file, mis);
+if (ret < 0) {
+error_report("Load VM's live state (ram) error");
+goto out;
+}
  /* read the VM state total size first */
  value = colo_get_cmd_value(mis->from_src_file,
   COLO_MESSAGE_VMSTATE_SIZE, &local_err);
@@ -605,8 +631,10 @@ void *colo_process_incoming_thread(void *opaque)
  qemu_mutex_lock_iothread();
  qemu_system_reset(VMRESET_SILENT);
  vmstate_loading = true;
-if (qemu_loadvm_state(fb) < 0) {
-error_report("COLO: loadvm failed");
+colo_flush_ram_cache();
+ret = qemu_load_device_state(fb);
+if (ret < 0) {
+error_report("COLO: load device state failed");
  qemu_mutex_unlock_iothread();
  goto out;
  }
diff --git a/migration/ram.c b/migration/ram.c
index 891f3b2..8f416d5 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2465,7 +2465,6 @@ static int ram_load(QEMUFile *f, void *opaque, int 
version_id)
   * be atomic
   */
  bool postcopy_running = postcopy_state_get() >= 
POSTCOPY_INCOMING_LISTENING;
-bool need_flush = false;

  seq_i

[Qemu-devel] [PATCH v3 12/15] hbitmap: serialization

2016-02-27 Thread Fam Zheng
From: Vladimir Sementsov-Ogievskiy 

Functions to serialize / deserialize(restore) HBitmap. HBitmap should be
saved to linear sequence of bits independently of endianness and bitmap
array element (unsigned long) size. Therefore Little Endian is chosen.

These functions are appropriate for dirty bitmap migration, restoring
the bitmap in several steps is available. To save performance, every
step writes only the last level of the bitmap. All other levels are
restored by hbitmap_deserialize_finish() as a last step of restoring.
So, HBitmap is inconsistent while restoring.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
[Fix left shift operand to 1UL; add "finish" parameter. - Fam]
Signed-off-by: Fam Zheng 
---
 include/qemu/hbitmap.h |  79 
 util/hbitmap.c | 137 +
 2 files changed, 216 insertions(+)

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index f8ed058..26cac7d 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -146,6 +146,85 @@ void hbitmap_reset_all(HBitmap *hb);
 bool hbitmap_get(const HBitmap *hb, uint64_t item);
 
 /**
+ * hbitmap_serialization_granularity:
+ * @hb: HBitmap to operate on.
+ *
+ * Granularity of serialization chunks, used by other serialization functions.
+ * For every chunk:
+ * 1. Chunk start should be aligned to this granularity.
+ * 2. Chunk size should be aligned too, except for last chunk (for which
+ *  start + count == hb->size)
+ */
+uint64_t hbitmap_serialization_granularity(const HBitmap *hb);
+
+/**
+ * hbitmap_serialization_size:
+ * @hb: HBitmap to operate on.
+ * @start: Starting bit
+ * @count: Number of bits
+ *
+ * Return number of bytes hbitmap_(de)serialize_part needs
+ */
+uint64_t hbitmap_serialization_size(const HBitmap *hb,
+uint64_t start, uint64_t count);
+
+/**
+ * hbitmap_serialize_part
+ * @hb: HBitmap to operate on.
+ * @buf: Buffer to store serialized bitmap.
+ * @start: First bit to store.
+ * @count: Number of bits to store.
+ *
+ * Stores HBitmap data corresponding to given region. The format of saved data
+ * is linear sequence of bits, so it can be used by hbitmap_deserialize_part
+ * independently of endianness and size of HBitmap level array elements
+ */
+void hbitmap_serialize_part(const HBitmap *hb, uint8_t *buf,
+uint64_t start, uint64_t count);
+
+/**
+ * hbitmap_deserialize_part
+ * @hb: HBitmap to operate on.
+ * @buf: Buffer to restore bitmap data from.
+ * @start: First bit to restore.
+ * @count: Number of bits to restore.
+ * @finish: Whether to call hbitmap_deserialize_finish automatically.
+ *
+ * Restores HBitmap data corresponding to given region. The format is the same
+ * as for hbitmap_serialize_part.
+ *
+ * If @finish is false, caller must call hbitmap_serialize_finish before using
+ * the bitmap.
+ */
+void hbitmap_deserialize_part(HBitmap *hb, uint8_t *buf,
+  uint64_t start, uint64_t count,
+  bool finish);
+
+/**
+ * hbitmap_deserialize_zeroes
+ * @hb: HBitmap to operate on.
+ * @start: First bit to restore.
+ * @count: Number of bits to restore.
+ * @finish: Whether to call hbitmap_deserialize_finish automatically.
+ *
+ * Fills the bitmap with zeroes.
+ *
+ * If @finish is false, caller must call hbitmap_serialize_finish before using
+ * the bitmap.
+ */
+void hbitmap_deserialize_zeroes(HBitmap *hb, uint64_t start, uint64_t count,
+bool finish);
+
+/**
+ * hbitmap_deserialize_finish
+ * @hb: HBitmap to operate on.
+ *
+ * Repair HBitmap after calling hbitmap_deserialize_data. Actually, all HBitmap
+ * layers are restored here.
+ */
+void hbitmap_deserialize_finish(HBitmap *hb);
+
+/**
  * hbitmap_free:
  * @hb: HBitmap to operate on.
  *
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 2d3d04c..5f02c17 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -395,6 +395,143 @@ bool hbitmap_get(const HBitmap *hb, uint64_t item)
 return (hb->levels[HBITMAP_LEVELS - 1][pos >> BITS_PER_LEVEL] & bit) != 0;
 }
 
+uint64_t hbitmap_serialization_granularity(const HBitmap *hb)
+{
+/* Require at least 64 bit granularity to be safe on both 64 bit and 32 bit
+ * hosts. */
+return 64 << hb->granularity;
+}
+
+/* Start should be aligned to serialization granularity, chunk size should be
+ * aligned to serialization granularity too, except for last chunk.
+ */
+static void serialization_chunk(const HBitmap *hb,
+uint64_t start, uint64_t count,
+unsigned long **first_el, size_t *el_count)
+{
+uint64_t last = start + count - 1;
+uint64_t gran = hbitmap_serialization_granularity(hb);
+
+assert((start & (gran - 1)) == 0);
+assert((last >> hb->granularity) < hb->size);
+if ((last >> hb->granularity) != hb->size - 1) {
+assert((count & (gran - 1)) == 0);
+}
+
+sta

[Qemu-devel] [PATCH v3 13/15] block: BdrvDirtyBitmap serialization interface

2016-02-27 Thread Fam Zheng
From: Vladimir Sementsov-Ogievskiy 

Several functions to provide necessary access to BdrvDirtyBitmap for
block-migration.c

Signed-off-by: Vladimir Sementsov-Ogievskiy 
[Add the "finish" parameters. - Fam]
Signed-off-by: Fam Zheng 
Reviewed-by: John Snow 
---
 block/dirty-bitmap.c | 37 +
 include/block/dirty-bitmap.h | 14 ++
 2 files changed, 51 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 27d33e7..ef1c49c 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -452,6 +452,43 @@ void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, 
HBitmap *in)
 hbitmap_free(tmp);
 }
 
+uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
+  uint64_t start, uint64_t count)
+{
+return hbitmap_serialization_size(bitmap->bitmap, start, count);
+}
+
+uint64_t bdrv_dirty_bitmap_serialization_align(const BdrvDirtyBitmap *bitmap)
+{
+return hbitmap_serialization_granularity(bitmap->bitmap);
+}
+
+void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
+  uint8_t *buf, uint64_t start,
+  uint64_t count)
+{
+hbitmap_serialize_part(bitmap->bitmap, buf, start, count);
+}
+
+void bdrv_dirty_bitmap_deserialize_part(BdrvDirtyBitmap *bitmap,
+uint8_t *buf, uint64_t start,
+uint64_t count, bool finish)
+{
+hbitmap_deserialize_part(bitmap->bitmap, buf, start, count, finish);
+}
+
+void bdrv_dirty_bitmap_deserialize_zeroes(BdrvDirtyBitmap *bitmap,
+  uint64_t start, uint64_t count,
+  bool finish)
+{
+hbitmap_deserialize_zeroes(bitmap->bitmap, start, count, finish);
+}
+
+void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap)
+{
+hbitmap_deserialize_finish(bitmap->bitmap);
+}
+
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
 int nr_sectors)
 {
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 2e70a7e..9b838f3 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -56,4 +56,18 @@ void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *hbi, int64_t 
sector_num);
 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
 
+uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
+  uint64_t start, uint64_t count);
+uint64_t bdrv_dirty_bitmap_serialization_align(const BdrvDirtyBitmap *bitmap);
+void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
+  uint8_t *buf, uint64_t start,
+  uint64_t count);
+void bdrv_dirty_bitmap_deserialize_part(BdrvDirtyBitmap *bitmap,
+uint8_t *buf, uint64_t start,
+uint64_t count, bool finish);
+void bdrv_dirty_bitmap_deserialize_zeroes(BdrvDirtyBitmap *bitmap,
+  uint64_t start, uint64_t count,
+  bool finish);
+void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap);
+
 #endif
-- 
2.4.3




[Qemu-devel] [PATCH v3 14/15] tests: Add test code for hbitmap serialization

2016-02-27 Thread Fam Zheng
Acked-by: John Snow 
Signed-off-by: Fam Zheng 
---
 tests/test-hbitmap.c | 139 +++
 1 file changed, 139 insertions(+)

diff --git a/tests/test-hbitmap.c b/tests/test-hbitmap.c
index c00c2b5..8f64941 100644
--- a/tests/test-hbitmap.c
+++ b/tests/test-hbitmap.c
@@ -12,6 +12,7 @@
 #include "qemu/osdep.h"
 #include 
 #include "qemu/hbitmap.h"
+#include "qemu/bitmap.h"
 #include "block/block.h"
 
 #define LOG_BITS_PER_LONG  (BITS_PER_LONG == 32 ? 5 : 6)
@@ -738,6 +739,16 @@ static void test_hbitmap_meta_one(TestHBitmapData *data, 
const void *unused)
 }
 }
 
+static void test_hbitmap_serialize_granularity(TestHBitmapData *data,
+   const void *unused)
+{
+int r;
+
+hbitmap_test_init(data, L3 * 2, 3);
+r = hbitmap_serialization_granularity(data->hb);
+g_assert_cmpint(r, ==, BITS_PER_LONG << 3);
+}
+
 static void test_hbitmap_meta_zero(TestHBitmapData *data, const void *unused)
 {
 hbitmap_test_init_meta(data, 0, 0, 1);
@@ -745,6 +756,125 @@ static void test_hbitmap_meta_zero(TestHBitmapData *data, 
const void *unused)
 hbitmap_check_meta(data, 0, 0);
 }
 
+static void hbitmap_test_serialize_range(TestHBitmapData *data,
+ uint8_t *buf, size_t buf_size,
+ uint64_t pos, uint64_t count)
+{
+size_t i;
+
+assert(hbitmap_granularity(data->hb) == 0);
+hbitmap_reset_all(data->hb);
+memset(buf, 0, buf_size);
+if (count) {
+hbitmap_set(data->hb, pos, count);
+}
+hbitmap_serialize_part(data->hb, buf, 0, data->size);
+for (i = 0; i < data->size; i++) {
+int is_set = test_bit(i, (unsigned long *)buf);
+if (i >= pos && i < pos + count) {
+g_assert(is_set);
+} else {
+g_assert(!is_set);
+}
+}
+hbitmap_reset_all(data->hb);
+hbitmap_deserialize_part(data->hb, buf, 0, data->size, true);
+
+for (i = 0; i < data->size; i++) {
+int is_set = hbitmap_get(data->hb, i);
+if (i >= pos && i < pos + count) {
+g_assert(is_set);
+} else {
+g_assert(!is_set);
+}
+}
+}
+
+static void test_hbitmap_serialize_basic(TestHBitmapData *data,
+ const void *unused)
+{
+int i, j;
+size_t buf_size;
+uint8_t *buf;
+uint64_t positions[] = { 0, 1, L1 - 1, L1, L2 - 1, L2, L2 + 1, L3 - 1 };
+int num_positions = sizeof(positions) / sizeof(positions[0]);
+
+hbitmap_test_init(data, L3, 0);
+buf_size = hbitmap_serialization_size(data->hb, 0, data->size);
+buf = g_malloc0(buf_size);
+
+for (i = 0; i < num_positions; i++) {
+for (j = 0; j < num_positions; j++) {
+hbitmap_test_serialize_range(data, buf, buf_size,
+ positions[i],
+ MIN(positions[j], L3 - positions[i]));
+}
+}
+
+g_free(buf);
+}
+
+static void test_hbitmap_serialize_part(TestHBitmapData *data,
+const void *unused)
+{
+int i, j, k;
+size_t buf_size;
+uint8_t *buf;
+uint64_t positions[] = { 0, 1, L1 - 1, L1, L2 - 1, L2, L2 + 1, L3 - 1 };
+int num_positions = sizeof(positions) / sizeof(positions[0]);
+
+hbitmap_test_init(data, L3, 0);
+buf_size = L2;
+buf = g_malloc0(buf_size);
+
+for (i = 0; i < num_positions; i++) {
+hbitmap_set(data->hb, positions[i], 1);
+}
+
+for (i = 0; i < data->size; i += buf_size) {
+hbitmap_serialize_part(data->hb, buf, i, buf_size);
+for (j = 0; j < buf_size; j++) {
+bool should_set = false;
+for (k = 0; k < num_positions; k++) {
+if (positions[k] == j + i) {
+should_set = true;
+break;
+}
+}
+g_assert_cmpint(should_set, ==, test_bit(j, (unsigned long *)buf));
+}
+}
+
+g_free(buf);
+}
+
+static void test_hbitmap_serialize_zeroes(TestHBitmapData *data,
+  const void *unused)
+{
+int i;
+HBitmapIter iter;
+int64_t next;
+uint64_t positions[] = { 0, L1, L2, L3 - L1};
+int num_positions = sizeof(positions) / sizeof(positions[0]);
+
+hbitmap_test_init(data, L3, 0);
+
+for (i = 0; i < num_positions; i++) {
+hbitmap_set(data->hb, positions[i], L1);
+}
+
+for (i = 0; i < num_positions; i++) {
+hbitmap_deserialize_zeroes(data->hb, positions[i], L1, true);
+hbitmap_iter_init(&iter, data->hb, 0);
+next = hbitmap_iter_next(&iter);
+if (i == num_positions - 1) {
+g_assert_cmpint(next, ==, -1);
+} else {
+g_assert_cmpint(next, ==, positions[i + 1]);
+}
+}
+}
+
 static void hbitmap_test_add(const char *testpath,
   

[Qemu-devel] [PATCH v3 09/15] block: Support meta dirty bitmap

2016-02-27 Thread Fam Zheng
The added group of operations enables tracking of the changed bits in
the dirty bitmap.

Signed-off-by: Fam Zheng 
---
 block/dirty-bitmap.c | 51 
 include/block/dirty-bitmap.h |  9 
 2 files changed, 60 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 16f73b2..5f19320 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -38,6 +38,7 @@
  */
 struct BdrvDirtyBitmap {
 HBitmap *bitmap;/* Dirty sector bitmap implementation */
+HBitmap *meta;  /* Meta dirty bitmap */
 BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
 char *name; /* Optional non-empty unique ID */
 int64_t size;   /* Size of the bitmap (Number of sectors) */
@@ -103,6 +104,56 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState 
*bs,
 return bitmap;
 }
 
+/* bdrv_create_meta_dirty_bitmap
+ *
+ * Create a meta dirty bitmap that tracks the changes of bits in @bitmap. I.e.
+ * when a dirty status bit in @bitmap is changed (either from reset to set or
+ * the other way around), its respective meta dirty bitmap bit will be marked
+ * dirty as well.
+ *
+ * @bitmap: the block dirty bitmap for which to create a meta dirty bitmap.
+ * @chunk_size: how many bytes of bitmap data does each bit in the meta bitmap
+ * track.
+ */
+void bdrv_create_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap,
+   int chunk_size)
+{
+assert(!bitmap->meta);
+bitmap->meta = hbitmap_create_meta(bitmap->bitmap,
+   chunk_size * BITS_PER_BYTE);
+}
+
+void bdrv_release_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+{
+assert(bitmap->meta);
+hbitmap_free_meta(bitmap->bitmap);
+bitmap->meta = NULL;
+}
+
+int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
+   BdrvDirtyBitmap *bitmap, int64_t sector,
+   int nb_sectors)
+{
+uint64_t i;
+int gran = bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS;
+
+/* To optimize: we can make hbitmap to internally check the range in a
+ * coarse level, or at least do it word by word. */
+for (i = sector; i < sector + nb_sectors; i += gran) {
+if (hbitmap_get(bitmap->meta, i)) {
+return true;
+}
+}
+return false;
+}
+
+void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
+  BdrvDirtyBitmap *bitmap, int64_t sector,
+  int nb_sectors)
+{
+hbitmap_reset(bitmap->meta, sector, nb_sectors);
+}
+
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
 {
 return bitmap->successor;
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index e1dbd8e..3b27742 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -9,6 +9,9 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
   uint32_t granularity,
   const char *name,
   Error **errp);
+void bdrv_create_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap,
+   int chunk_size);
+void bdrv_release_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
Error **errp);
@@ -37,6 +40,12 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int nr_sectors);
 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
  int64_t cur_sector, int nr_sectors);
+int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
+   BdrvDirtyBitmap *bitmap, int64_t sector,
+   int nb_sectors);
+void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
+  BdrvDirtyBitmap *bitmap, int64_t sector,
+  int nb_sectors);
 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
  uint64_t first_sector);
 void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter);
-- 
2.4.3




[Qemu-devel] [PATCH v3 10/15] block: Add two dirty bitmap getters

2016-02-27 Thread Fam Zheng
For dirty bitmap users to get the size and the name of a
BdrvDirtyBitmap.

Signed-off-by: Fam Zheng 
Reviewed-by: John Snow 
---
 block/dirty-bitmap.c | 10 ++
 include/block/dirty-bitmap.h |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 5f19320..a0c5acb 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -154,6 +154,16 @@ void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
 hbitmap_reset(bitmap->meta, sector, nb_sectors);
 }
 
+int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap)
+{
+return bitmap->size;
+}
+
+const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap)
+{
+return bitmap->name;
+}
+
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
 {
 return bitmap->successor;
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 3b27742..2e70a7e 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -33,6 +33,8 @@ uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState 
*bs);
 uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
+const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
+int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap);
 DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap);
 int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
int64_t sector);
-- 
2.4.3




[Qemu-devel] [PATCH v3 08/15] tests: Add test code for meta bitmap

2016-02-27 Thread Fam Zheng
Signed-off-by: Fam Zheng 
Reviewed-by: John Snow 
---
 tests/test-hbitmap.c | 116 +++
 1 file changed, 116 insertions(+)

diff --git a/tests/test-hbitmap.c b/tests/test-hbitmap.c
index abe1427..c00c2b5 100644
--- a/tests/test-hbitmap.c
+++ b/tests/test-hbitmap.c
@@ -12,6 +12,7 @@
 #include "qemu/osdep.h"
 #include 
 #include "qemu/hbitmap.h"
+#include "block/block.h"
 
 #define LOG_BITS_PER_LONG  (BITS_PER_LONG == 32 ? 5 : 6)
 
@@ -21,6 +22,7 @@
 
 typedef struct TestHBitmapData {
 HBitmap   *hb;
+HBitmap   *meta;
 unsigned long *bits;
 size_t size;
 size_t old_size;
@@ -92,6 +94,14 @@ static void hbitmap_test_init(TestHBitmapData *data,
 }
 }
 
+static void hbitmap_test_init_meta(TestHBitmapData *data,
+   uint64_t size, int granularity,
+   int meta_chunk)
+{
+hbitmap_test_init(data, size, granularity);
+data->meta = hbitmap_create_meta(data->hb, meta_chunk);
+}
+
 static inline size_t hbitmap_test_array_size(size_t bits)
 {
 size_t n = (bits + BITS_PER_LONG - 1) / BITS_PER_LONG;
@@ -134,6 +144,9 @@ static void hbitmap_test_teardown(TestHBitmapData *data,
   const void *unused)
 {
 if (data->hb) {
+if (data->meta) {
+hbitmap_free_meta(data->hb);
+}
 hbitmap_free(data->hb);
 data->hb = NULL;
 }
@@ -635,6 +648,103 @@ static void 
test_hbitmap_truncate_shrink_large(TestHBitmapData *data,
 hbitmap_test_truncate(data, size, -diff, 0);
 }
 
+static void hbitmap_check_meta(TestHBitmapData *data,
+   int64_t start, int count)
+{
+int64_t i;
+
+for (i = 0; i < data->size; i++) {
+if (i >= start && i < start + count) {
+g_assert(hbitmap_get(data->meta, i));
+} else {
+g_assert(!hbitmap_get(data->meta, i));
+}
+}
+}
+
+static void hbitmap_test_meta(TestHBitmapData *data,
+  int64_t start, int count,
+  int64_t check_start, int check_count)
+{
+hbitmap_reset_all(data->hb);
+hbitmap_reset_all(data->meta);
+
+/* Test "unset" -> "unset" will not update meta. */
+hbitmap_reset(data->hb, start, count);
+hbitmap_check_meta(data, 0, 0);
+
+/* Test "unset" -> "set" will update meta */
+hbitmap_set(data->hb, start, count);
+hbitmap_check_meta(data, check_start, check_count);
+
+/* Test "set" -> "set" will not update meta */
+hbitmap_reset_all(data->meta);
+hbitmap_set(data->hb, start, count);
+hbitmap_check_meta(data, 0, 0);
+
+/* Test "set" -> "unset" will update meta */
+hbitmap_reset_all(data->meta);
+hbitmap_reset(data->hb, start, count);
+hbitmap_check_meta(data, check_start, check_count);
+}
+
+static void hbitmap_test_meta_do(TestHBitmapData *data, int chunk_size)
+{
+uint64_t size = chunk_size * 100;
+hbitmap_test_init_meta(data, size, 0, chunk_size);
+
+hbitmap_test_meta(data, 0, 1, 0, chunk_size);
+hbitmap_test_meta(data, 0, chunk_size, 0, chunk_size);
+hbitmap_test_meta(data, chunk_size - 1, 1, 0, chunk_size);
+hbitmap_test_meta(data, chunk_size - 1, 2, 0, chunk_size * 2);
+hbitmap_test_meta(data, chunk_size - 1, chunk_size + 1, 0, chunk_size * 2);
+hbitmap_test_meta(data, chunk_size - 1, chunk_size + 2, 0, chunk_size * 3);
+hbitmap_test_meta(data, 7 * chunk_size - 1, chunk_size + 2,
+  6 * chunk_size, chunk_size * 3);
+hbitmap_test_meta(data, size - 1, 1, size - chunk_size, chunk_size);
+hbitmap_test_meta(data, 0, size, 0, size);
+}
+
+static void test_hbitmap_meta_byte(TestHBitmapData *data, const void *unused)
+{
+hbitmap_test_meta_do(data, BITS_PER_BYTE);
+}
+
+static void test_hbitmap_meta_word(TestHBitmapData *data, const void *unused)
+{
+hbitmap_test_meta_do(data, BITS_PER_LONG);
+}
+
+static void test_hbitmap_meta_sector(TestHBitmapData *data, const void *unused)
+{
+hbitmap_test_meta_do(data, BDRV_SECTOR_SIZE * BITS_PER_BYTE);
+}
+
+/**
+ * Create an HBitmap and test set/unset.
+ */
+static void test_hbitmap_meta_one(TestHBitmapData *data, const void *unused)
+{
+int i;
+int64_t offsets[] = {
+0, 1, L1 - 1, L1, L1 + 1, L2 - 1, L2, L2 + 1, L3 - 1, L3, L3 + 1
+};
+
+hbitmap_test_init_meta(data, L3 * 2, 0, 1);
+for (i = 0; i < ARRAY_SIZE(offsets); i++) {
+hbitmap_test_meta(data, offsets[i], 1, offsets[i], 1);
+hbitmap_test_meta(data, offsets[i], L1, offsets[i], L1);
+hbitmap_test_meta(data, offsets[i], L2, offsets[i], L2);
+}
+}
+
+static void test_hbitmap_meta_zero(TestHBitmapData *data, const void *unused)
+{
+hbitmap_test_init_meta(data, 0, 0, 1);
+
+hbitmap_check_meta(data, 0, 0);
+}
+
 static void hbitmap_test_add(const char *testpath,
void (*test_func)(TestHBitm

[Qemu-devel] [PATCH v3 04/15] block: Move block dirty bitmap code to separate files

2016-02-27 Thread Fam Zheng
The only code change is making bdrv_dirty_bitmap_truncate public. It is
used in block.c.

Also two long lines (bdrv_get_dirty) are wrapped.

Signed-off-by: Fam Zheng 
Reviewed-by: John Snow 
---
 block.c  | 360 
 block/Makefile.objs  |   2 +-
 block/dirty-bitmap.c | 387 +++
 include/block/block.h|  35 +---
 include/block/dirty-bitmap.h |  45 +
 5 files changed, 434 insertions(+), 395 deletions(-)
 create mode 100644 block/dirty-bitmap.c
 create mode 100644 include/block/dirty-bitmap.h

diff --git a/block.c b/block.c
index ba24b8e..3dbaf11 100644
--- a/block.c
+++ b/block.c
@@ -53,23 +53,6 @@
 #include 
 #endif
 
-/**
- * A BdrvDirtyBitmap can be in three possible states:
- * (1) successor is NULL and disabled is false: full r/w mode
- * (2) successor is NULL and disabled is true: read only mode ("disabled")
- * (3) successor is set: frozen mode.
- * A frozen bitmap cannot be renamed, deleted, anonymized, cleared, set,
- * or enabled. A frozen bitmap can only abdicate() or reclaim().
- */
-struct BdrvDirtyBitmap {
-HBitmap *bitmap;/* Dirty sector bitmap implementation */
-BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
-char *name; /* Optional non-empty unique ID */
-int64_t size;   /* Size of the bitmap (Number of sectors) */
-bool disabled;  /* Bitmap is read-only */
-QLIST_ENTRY(BdrvDirtyBitmap) list;
-};
-
 #define NOT_DONE 0x7fff /* used while emulated sync operation in progress 
*/
 
 struct BdrvStates bdrv_states = QTAILQ_HEAD_INITIALIZER(bdrv_states);
@@ -88,9 +71,6 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const 
char *filename,
  BlockDriverState *parent,
  const BdrvChildRole *child_role, Error **errp);
 
-static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
-static void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs);
-
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
@@ -3431,346 +3411,6 @@ void bdrv_lock_medium(BlockDriverState *bs, bool locked)
 }
 }
 
-BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
-{
-BdrvDirtyBitmap *bm;
-
-assert(name);
-QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
-if (bm->name && !strcmp(name, bm->name)) {
-return bm;
-}
-}
-return NULL;
-}
-
-void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
-{
-assert(!bdrv_dirty_bitmap_frozen(bitmap));
-g_free(bitmap->name);
-bitmap->name = NULL;
-}
-
-BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
-  uint32_t granularity,
-  const char *name,
-  Error **errp)
-{
-int64_t bitmap_size;
-BdrvDirtyBitmap *bitmap;
-uint32_t sector_granularity;
-
-assert((granularity & (granularity - 1)) == 0);
-
-if (name && bdrv_find_dirty_bitmap(bs, name)) {
-error_setg(errp, "Bitmap already exists: %s", name);
-return NULL;
-}
-sector_granularity = granularity >> BDRV_SECTOR_BITS;
-assert(sector_granularity);
-bitmap_size = bdrv_nb_sectors(bs);
-if (bitmap_size < 0) {
-error_setg_errno(errp, -bitmap_size, "could not get length of device");
-errno = -bitmap_size;
-return NULL;
-}
-bitmap = g_new0(BdrvDirtyBitmap, 1);
-bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(sector_granularity));
-bitmap->size = bitmap_size;
-bitmap->name = g_strdup(name);
-bitmap->disabled = false;
-QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
-return bitmap;
-}
-
-bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
-{
-return bitmap->successor;
-}
-
-bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
-{
-return !(bitmap->disabled || bitmap->successor);
-}
-
-DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
-{
-if (bdrv_dirty_bitmap_frozen(bitmap)) {
-return DIRTY_BITMAP_STATUS_FROZEN;
-} else if (!bdrv_dirty_bitmap_enabled(bitmap)) {
-return DIRTY_BITMAP_STATUS_DISABLED;
-} else {
-return DIRTY_BITMAP_STATUS_ACTIVE;
-}
-}
-
-/**
- * Create a successor bitmap destined to replace this bitmap after an 
operation.
- * Requires that the bitmap is not frozen and has no successor.
- */
-int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
-   BdrvDirtyBitmap *bitmap, Error **errp)
-{
-uint64_t granularity;
-BdrvDirtyBitmap *child;
-
-if (bdrv_dirty_bitmap_frozen(bitmap)) {
-error_setg(errp, "Cannot create a successor for a bitmap that is "
-   "currently frozen");
-return -1;
-}
-assert(!bitmap->su

[Qemu-devel] [PATCH v3 06/15] block: Hide HBitmap in block dirty bitmap interface

2016-02-27 Thread Fam Zheng
HBitmap is an implementation detail of block dirty bitmap that should be hidden
from users. Introduce a BdrvDirtyBitmapIter to encapsulate the underlying
HBitmapIter.

A small difference in the interface is, before, an HBitmapIter is initialized
in place, now the new BdrvDirtyBitmapIter must be dynamically allocated because
the structure definition is in block.c.

Two current users are converted too.

Signed-off-by: Fam Zheng 
Reviewed-by: John Snow 
---
 block/backup.c   | 14 --
 block/dirty-bitmap.c | 39 +--
 block/mirror.c   | 14 --
 include/block/dirty-bitmap.h |  7 +--
 include/qemu/typedefs.h  |  1 +
 5 files changed, 55 insertions(+), 20 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index c6c1b81..16ce6ad 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -324,14 +324,14 @@ static int coroutine_fn 
backup_run_incremental(BackupBlockJob *job)
 int64_t end;
 int64_t last_cluster = -1;
 BlockDriverState *bs = job->common.bs;
-HBitmapIter hbi;
+BdrvDirtyBitmapIter *dbi;
 
 granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
 clusters_per_iter = MAX((granularity / BACKUP_CLUSTER_SIZE), 1);
-bdrv_dirty_iter_init(job->sync_bitmap, &hbi);
+dbi = bdrv_dirty_iter_new(job->sync_bitmap, 0);
 
 /* Find the next dirty sector(s) */
-while ((sector = hbitmap_iter_next(&hbi)) != -1) {
+while ((sector = bdrv_dirty_iter_next(dbi)) != -1) {
 cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
 
 /* Fake progress updates for any clusters we skipped */
@@ -343,7 +343,7 @@ static int coroutine_fn 
backup_run_incremental(BackupBlockJob *job)
 for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
 do {
 if (yield_and_check(job)) {
-return ret;
+goto out;
 }
 ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
 BACKUP_SECTORS_PER_CLUSTER, &error_is_read,
@@ -351,7 +351,7 @@ static int coroutine_fn 
backup_run_incremental(BackupBlockJob *job)
 if ((ret < 0) &&
 backup_error_action(job, error_is_read, -ret) ==
 BLOCK_ERROR_ACTION_REPORT) {
-return ret;
+goto out;
 }
 } while (ret < 0);
 }
@@ -359,7 +359,7 @@ static int coroutine_fn 
backup_run_incremental(BackupBlockJob *job)
 /* If the bitmap granularity is smaller than the backup granularity,
  * we need to advance the iterator pointer to the next cluster. */
 if (granularity < BACKUP_CLUSTER_SIZE) {
-bdrv_set_dirty_iter(&hbi, cluster * BACKUP_SECTORS_PER_CLUSTER);
+bdrv_set_dirty_iter(dbi, cluster * BACKUP_SECTORS_PER_CLUSTER);
 }
 
 last_cluster = cluster - 1;
@@ -371,6 +371,8 @@ static int coroutine_fn 
backup_run_incremental(BackupBlockJob *job)
 job->common.offset += ((end - last_cluster - 1) * BACKUP_CLUSTER_SIZE);
 }
 
+out:
+bdrv_dirty_iter_free(dbi);
 return ret;
 }
 
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 556e1d1..16f73b2 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -42,9 +42,15 @@ struct BdrvDirtyBitmap {
 char *name; /* Optional non-empty unique ID */
 int64_t size;   /* Size of the bitmap (Number of sectors) */
 bool disabled;  /* Bitmap is read-only */
+int active_iterators;   /* How many iterators are active */
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
+struct BdrvDirtyBitmapIter {
+HBitmapIter hbi;
+BdrvDirtyBitmap *bitmap;
+};
+
 BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
 {
 BdrvDirtyBitmap *bm;
@@ -212,6 +218,7 @@ void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
 
 QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
 assert(!bdrv_dirty_bitmap_frozen(bitmap));
+assert(!bitmap->active_iterators);
 hbitmap_truncate(bitmap->bitmap, size);
 bitmap->size = size;
 }
@@ -224,6 +231,7 @@ static void 
bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
 BdrvDirtyBitmap *bm, *next;
 QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
 if ((!bitmap || bm == bitmap) && (!only_named || bm->name)) {
+assert(!bitmap->active_iterators);
 assert(!bdrv_dirty_bitmap_frozen(bm));
 QLIST_REMOVE(bm, list);
 hbitmap_free(bm->bitmap);
@@ -320,9 +328,29 @@ uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap 
*bitmap)
 return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
 }
 
-void bdrv_dirty_iter_init(BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
+BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
+  

[Qemu-devel] [PATCH v3 15/15] block: More operations for meta dirty bitmap

2016-02-27 Thread Fam Zheng
Callers can create an iterator of meta bitmap with
bdrv_dirty_meta_iter_new(), then use the bdrv_dirty_iter_* operations on
it. Meta iterators are also counted by bitmap->active_iterators.

Also add a couple of functions to retrieve granularity and count.

Signed-off-by: Fam Zheng 

---

Vladimir, are these interfaces enough for your migration series? If it
is more convenient to you, you can adjust and take it into your series.
---
 block/dirty-bitmap.c | 19 +++
 include/block/dirty-bitmap.h |  3 +++
 2 files changed, 22 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index ef1c49c..eff4369 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -392,6 +392,11 @@ uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap 
*bitmap)
 return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
 }
 
+uint32_t bdrv_dirty_bitmap_meta_granularity(BdrvDirtyBitmap *bitmap)
+{
+return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->meta);
+}
+
 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
  uint64_t first_sector)
 {
@@ -402,6 +407,15 @@ BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap 
*bitmap,
 return iter;
 }
 
+BdrvDirtyBitmapIter *bdrv_dirty_meta_iter_new(BdrvDirtyBitmap *bitmap)
+{
+BdrvDirtyBitmapIter *iter = g_new(BdrvDirtyBitmapIter, 1);
+hbitmap_iter_init(&iter->hbi, bitmap->meta, 0);
+iter->bitmap = bitmap;
+bitmap->active_iterators++;
+return iter;
+}
+
 void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter)
 {
 if (!iter) {
@@ -513,3 +527,8 @@ int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap)
 {
 return hbitmap_count(bitmap->bitmap);
 }
+
+int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap)
+{
+return hbitmap_count(bitmap->meta);
+}
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 9b838f3..d9b7437 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -31,6 +31,7 @@ void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
 uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
+uint32_t bdrv_dirty_bitmap_meta_granularity(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
 const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
@@ -48,12 +49,14 @@ int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
 void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
   BdrvDirtyBitmap *bitmap, int64_t sector,
   int nb_sectors);
+BdrvDirtyBitmapIter *bdrv_dirty_meta_iter_new(BdrvDirtyBitmap *bitmap);
 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
  uint64_t first_sector);
 void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter);
 int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter);
 void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *hbi, int64_t sector_num);
 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
+int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
 
 uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
-- 
2.4.3




[Qemu-devel] [PATCH v3 05/15] block: Remove unused typedef of BlockDriverDirtyHandler

2016-02-27 Thread Fam Zheng
Signed-off-by: Fam Zheng 
Reviewed-by: John Snow 
---
 include/block/block.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 3fbd70d..ff1933a 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -322,8 +322,6 @@ BlockDriverState *check_to_replace_node(BlockDriverState 
*parent_bs,
 const char *node_name, Error **errp);
 
 /* async block I/O */
-typedef void BlockDriverDirtyHandler(BlockDriverState *bs, int64_t sector,
- int sector_num);
 BlockAIOCB *bdrv_aio_readv(BlockDriverState *bs, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque);
-- 
2.4.3




[Qemu-devel] [PATCH v3 07/15] HBitmap: Introduce "meta" bitmap to track bit changes

2016-02-27 Thread Fam Zheng
Upon each bit toggle, the corresponding bit in the meta bitmap will be
set.

Signed-off-by: Fam Zheng 
Reviewed-by: John Snow 
---
 include/qemu/hbitmap.h | 17 +
 util/hbitmap.c | 66 ++
 2 files changed, 68 insertions(+), 15 deletions(-)

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index e29188c..f8ed058 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -178,6 +178,23 @@ void hbitmap_iter_init(HBitmapIter *hbi, const HBitmap 
*hb, uint64_t first);
  */
 unsigned long hbitmap_iter_skip_words(HBitmapIter *hbi);
 
+/* hbitmap_create_meta:
+ * Create a "meta" hbitmap to track dirtiness of the bits in this HBitmap.
+ * The caller owns the created bitmap and must call hbitmap_free_meta(hb) to
+ * free it.
+ *
+ * @hb: The HBitmap to operate on.
+ * @chunk_size: How many bits in @hb does one bit in the meta track.
+ */
+HBitmap *hbitmap_create_meta(HBitmap *hb, int chunk_size);
+
+/* hbitmap_free_meta:
+ * Free the meta bitmap of @hb.
+ *
+ * @hb: The HBitmap whose meta bitmap should be freed.
+ */
+void hbitmap_free_meta(HBitmap *hb);
+
 /**
  * hbitmap_iter_next:
  * @hbi: HBitmapIter to operate on.
diff --git a/util/hbitmap.c b/util/hbitmap.c
index b22b87d..2d3d04c 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -79,6 +79,9 @@ struct HBitmap {
  */
 int granularity;
 
+/* A meta dirty bitmap to track the dirtiness of bits in this HBitmap. */
+HBitmap *meta;
+
 /* A number of progressively less coarse bitmaps (i.e. level 0 is the
  * coarsest).  Each bit in level N represents a word in level N+1 that
  * has a set bit, except the last level where each bit represents the
@@ -210,25 +213,27 @@ static uint64_t hb_count_between(HBitmap *hb, uint64_t 
start, uint64_t last)
 }
 
 /* Setting starts at the last layer and propagates up if an element
- * changes from zero to non-zero.
+ * changes.
  */
 static inline bool hb_set_elem(unsigned long *elem, uint64_t start, uint64_t 
last)
 {
 unsigned long mask;
-bool changed;
+unsigned long old;
 
 assert((last >> BITS_PER_LEVEL) == (start >> BITS_PER_LEVEL));
 assert(start <= last);
 
 mask = 2UL << (last & (BITS_PER_LONG - 1));
 mask -= 1UL << (start & (BITS_PER_LONG - 1));
-changed = (*elem == 0);
+old = *elem;
 *elem |= mask;
-return changed;
+return old != *elem;
 }
 
-/* The recursive workhorse (the depth is limited to HBITMAP_LEVELS)... */
-static void hb_set_between(HBitmap *hb, int level, uint64_t start, uint64_t 
last)
+/* The recursive workhorse (the depth is limited to HBITMAP_LEVELS)...
+ * Returns true if at least one bit is changed. */
+static bool hb_set_between(HBitmap *hb, int level, uint64_t start,
+   uint64_t last)
 {
 size_t pos = start >> BITS_PER_LEVEL;
 size_t lastpos = last >> BITS_PER_LEVEL;
@@ -257,22 +262,27 @@ static void hb_set_between(HBitmap *hb, int level, 
uint64_t start, uint64_t last
 if (level > 0 && changed) {
 hb_set_between(hb, level - 1, pos, lastpos);
 }
+return changed;
 }
 
 void hbitmap_set(HBitmap *hb, uint64_t start, uint64_t count)
 {
 /* Compute range in the last layer.  */
+uint64_t first, n;
 uint64_t last = start + count - 1;
 
 trace_hbitmap_set(hb, start, count,
   start >> hb->granularity, last >> hb->granularity);
 
-start >>= hb->granularity;
+first = start >> hb->granularity;
 last >>= hb->granularity;
-count = last - start + 1;
+n = last - first + 1;
 
-hb->count += count - hb_count_between(hb, start, last);
-hb_set_between(hb, HBITMAP_LEVELS - 1, start, last);
+hb->count += n - hb_count_between(hb, first, last);
+if (hb_set_between(hb, HBITMAP_LEVELS - 1, first, last) &&
+hb->meta) {
+hbitmap_set(hb->meta, start, count);
+}
 }
 
 /* Resetting works the other way round: propagate up if the new
@@ -293,8 +303,10 @@ static inline bool hb_reset_elem(unsigned long *elem, 
uint64_t start, uint64_t l
 return blanked;
 }
 
-/* The recursive workhorse (the depth is limited to HBITMAP_LEVELS)... */
-static void hb_reset_between(HBitmap *hb, int level, uint64_t start, uint64_t 
last)
+/* The recursive workhorse (the depth is limited to HBITMAP_LEVELS)...
+ * Returns true if at least one bit is changed. */
+static bool hb_reset_between(HBitmap *hb, int level, uint64_t start,
+ uint64_t last)
 {
 size_t pos = start >> BITS_PER_LEVEL;
 size_t lastpos = last >> BITS_PER_LEVEL;
@@ -337,21 +349,28 @@ static void hb_reset_between(HBitmap *hb, int level, 
uint64_t start, uint64_t la
 if (level > 0 && changed) {
 hb_reset_between(hb, level - 1, pos, lastpos);
 }
+
+return changed;
+
 }
 
 void hbitmap_reset(HBitmap *hb, uint64_t start, uint64_t count)
 {
 /* Compute range in the last layer.  */
+uint64_t first;
 uint64_t last = start + count

[Qemu-devel] [PATCH v3 03/15] typedefs: Add BdrvDirtyBitmap

2016-02-27 Thread Fam Zheng
Following patches to refactor and move block dirty bitmap code could use
this.

Signed-off-by: Fam Zheng 
Reviewed-by: John Snow 
---
 include/block/block.h   | 1 -
 include/qemu/typedefs.h | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/block/block.h b/include/block/block.h
index 00827f7..5ee9b8f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -476,7 +476,6 @@ void *qemu_try_blockalign(BlockDriverState *bs, size_t 
size);
 void *qemu_try_blockalign0(BlockDriverState *bs, size_t size);
 bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov);
 
-typedef struct BdrvDirtyBitmap BdrvDirtyBitmap;
 BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
   uint32_t granularity,
   const char *name,
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index 6ed91b4..0a2206d 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -10,6 +10,7 @@ typedef struct AddressSpace AddressSpace;
 typedef struct AioContext AioContext;
 typedef struct AllwinnerAHCIState AllwinnerAHCIState;
 typedef struct AudioState AudioState;
+typedef struct BdrvDirtyBitmap BdrvDirtyBitmap;
 typedef struct BlockBackend BlockBackend;
 typedef struct BlockBackendRootState BlockBackendRootState;
 typedef struct BlockDriverState BlockDriverState;
-- 
2.4.3




[Qemu-devel] [PATCH v3 11/15] block: Assert that bdrv_release_dirty_bitmap succeeded

2016-02-27 Thread Fam Zheng
We use a loop over bs->dirty_bitmaps to make sure the caller is
only releasing a bitmap owned by bs. Let's also assert that in this case
the caller is releasing a bitmap that does exist.

Signed-off-by: Fam Zheng 
---
 block/dirty-bitmap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index a0c5acb..27d33e7 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -304,6 +304,9 @@ static void 
bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
 }
 }
 }
+if (bitmap) {
+abort();
+}
 }
 
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
-- 
2.4.3




[Qemu-devel] [PATCH v3 02/15] block: Include hbitmap.h in block.h

2016-02-27 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 include/block/block.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/block/block.h b/include/block/block.h
index 1c4f4d8..00827f7 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -8,6 +8,7 @@
 #include "block/accounting.h"
 #include "qapi/qmp/qobject.h"
 #include "qapi-types.h"
+#include "qemu/hbitmap.h"
 
 /* block.c */
 typedef struct BlockDriver BlockDriver;
@@ -475,7 +476,6 @@ void *qemu_try_blockalign(BlockDriverState *bs, size_t 
size);
 void *qemu_try_blockalign0(BlockDriverState *bs, size_t size);
 bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov);
 
-struct HBitmapIter;
 typedef struct BdrvDirtyBitmap BdrvDirtyBitmap;
 BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
   uint32_t granularity,
-- 
2.4.3




[Qemu-devel] [PATCH v3 01/15] backup: Use Bitmap to replace "s->bitmap"

2016-02-27 Thread Fam Zheng
"s->bitmap" tracks done sectors, we only check bit states without using any
iterator which HBitmap is good for. Switch to "Bitmap" which is simpler and
more memory efficient.

Meanwhile, rename it to done_bitmap, to reflect the intention.

Signed-off-by: Fam Zheng 
Reviewed-by: John Snow 
---
 block/backup.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 00cafdb..c6c1b81 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -20,6 +20,7 @@
 #include "qapi/qmp/qerror.h"
 #include "qemu/ratelimit.h"
 #include "sysemu/block-backend.h"
+#include "qemu/bitmap.h"
 
 #define BACKUP_CLUSTER_BITS 16
 #define BACKUP_CLUSTER_SIZE (1 << BACKUP_CLUSTER_BITS)
@@ -45,7 +46,7 @@ typedef struct BackupBlockJob {
 BlockdevOnError on_target_error;
 CoRwlock flush_rwlock;
 uint64_t sectors_read;
-HBitmap *bitmap;
+unsigned long *done_bitmap;
 QLIST_HEAD(, CowRequest) inflight_reqs;
 } BackupBlockJob;
 
@@ -111,7 +112,7 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
 cow_request_begin(&cow_request, job, start, end);
 
 for (; start < end; start++) {
-if (hbitmap_get(job->bitmap, start)) {
+if (test_bit(start, job->done_bitmap)) {
 trace_backup_do_cow_skip(job, start);
 continue; /* already copied */
 }
@@ -162,7 +163,7 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
 goto out;
 }
 
-hbitmap_set(job->bitmap, start, 1);
+set_bit(start, job->done_bitmap);
 
 /* Publish progress, guest I/O counts as progress too.  Note that the
  * offset field is an opaque progress value, it is not a disk offset.
@@ -392,7 +393,7 @@ static void coroutine_fn backup_run(void *opaque)
 start = 0;
 end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
 
-job->bitmap = hbitmap_alloc(end, 0);
+job->done_bitmap = bitmap_new(end);
 
 bdrv_set_enable_write_cache(target, true);
 if (target->blk) {
@@ -473,7 +474,7 @@ static void coroutine_fn backup_run(void *opaque)
 /* wait until pending backup_do_cow() calls have completed */
 qemu_co_rwlock_wrlock(&job->flush_rwlock);
 qemu_co_rwlock_unlock(&job->flush_rwlock);
-hbitmap_free(job->bitmap);
+g_free(job->done_bitmap);
 
 if (target->blk) {
 blk_iostatus_disable(target->blk);
-- 
2.4.3




[Qemu-devel] [PATCH v3 00/15] Dirty bitmap changes for migration/persistence work

2016-02-27 Thread Fam Zheng
v2: Rebase and address comments from John, Eric and Vladimir:

[02/15] block: Include hbitmap.h in block.h
Split from the next patch. [Eric]

[04/15] block: Move block dirty bitmap code to separate files
Fix long line to keep checkpatch.pl happy. [Vladimir]
Fix comment (HBitmapIter -> BdrvDirtyBitmapIter). [Vladimir]

[06/15] block: Hide HBitmap in block dirty bitmap interface
Add John's rev-by.

[09/15] block: Support meta dirty bitmap
granularity -> chunk_size

[10/15] block: Add two dirty bitmap getters
Add "const" qualifier in function parameters. [Vladimir]

[11/15] block: Assert that bdrv_release_dirty_bitmap succeeded
Rebase and reword commit message.

[12/15] hbitmap: serialization
Fix typos and add comment. [John]

[13/15] block: BdrvDirtyBitmap serialization interface
Add John's rev-by.

[14/15] tests: Add test code for hbitmap serialization
Add John's ack-by.

Two major features are added to block dirty bitmap (and underlying HBitmap) in
this series: meta bitmap and serialization, together with all other supportive
patches.

Both operations are common in dirty bitmap migration and persistence: they need
to find whether and which part of the dirty bitmap in question has changed with
meta dirty bitmap, and they need to write it to the target with serialization.


Fam Zheng (13):
  backup: Use Bitmap to replace "s->bitmap"
  block: Include hbitmap.h in block.h
  typedefs: Add BdrvDirtyBitmap
  block: Move block dirty bitmap code to separate files
  block: Remove unused typedef of BlockDriverDirtyHandler
  block: Hide HBitmap in block dirty bitmap interface
  HBitmap: Introduce "meta" bitmap to track bit changes
  tests: Add test code for meta bitmap
  block: Support meta dirty bitmap
  block: Add two dirty bitmap getters
  block: Assert that bdrv_release_dirty_bitmap succeeded
  tests: Add test code for hbitmap serialization
  block: More operations for meta dirty bitmap

Vladimir Sementsov-Ogievskiy (2):
  hbitmap: serialization
  block: BdrvDirtyBitmap serialization interface

 block.c  | 360 -
 block/Makefile.objs  |   2 +-
 block/backup.c   |  25 +-
 block/dirty-bitmap.c | 534 +++
 block/mirror.c   |  14 +-
 include/block/block.h|  40 +---
 include/block/dirty-bitmap.h |  76 ++
 include/qemu/hbitmap.h   |  96 
 include/qemu/typedefs.h  |   2 +
 tests/test-hbitmap.c | 255 +
 util/hbitmap.c   | 203 ++--
 11 files changed, 1176 insertions(+), 431 deletions(-)
 create mode 100644 block/dirty-bitmap.c
 create mode 100644 include/block/dirty-bitmap.h

-- 
2.4.3




Re: [Qemu-devel] [PATCH v2 11/13] hbitmap: serialization

2016-02-27 Thread Fam Zheng
On Tue, 01/26 12:01, John Snow wrote:
> 
> 
> On 01/20/2016 01:11 AM, Fam Zheng wrote:
> > From: Vladimir Sementsov-Ogievskiy 
> > 
> > Functions to serialize / deserialize(restore) HBitmap. HBitmap should be
> > saved to linear sequence of bits independently of endianness and bitmap
> > array element (unsigned long) size. Therefore Little Endian is chosen.
> > 
> > These functions are appropriate for dirty bitmap migration, restoring
> > the bitmap in several steps is available. To save performance, every
> > step writes only the last level of the bitmap. All other levels are
> > restored by hbitmap_deserialize_finish() as a last step of restoring.
> > So, HBitmap is inconsistent while restoring.
> > 
> > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > [Fix left shift operand to 1UL; add "finish" parameter. - Fam]
> > Signed-off-by: Fam Zheng 
> > ---
> >  include/qemu/hbitmap.h |  78 
> >  util/hbitmap.c | 135 
> > +
> >  2 files changed, 213 insertions(+)
> > 
> > diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
> > index 595ad65..00dbb60 100644
> > --- a/include/qemu/hbitmap.h
> > +++ b/include/qemu/hbitmap.h
> > @@ -149,6 +149,84 @@ void hbitmap_reset_all(HBitmap *hb);
> >  bool hbitmap_get(const HBitmap *hb, uint64_t item);
> >  
> >  /**
> > + * hbitmap_serialization_granularity:
> > + * @hb: HBitmap to operate on.
> > + *
> > + * Grunularity of serialization chunks, used by other serializetion 
> > functions.
> 
> "Granularity," "serialization."
> 
> Perhaps we should be consistent with "hbitmap" vs "HBitmap" as well, too.
> 
> > + * For every chunk:
> > + * 1. Chunk start should be aligned to this granularity.
> > + * 2. Chunk size should be aligned too, except for last chunk (for which
> > + *  start + count == hb->size)
> > + */
> > +uint64_t hbitmap_serialization_granularity(const HBitmap *hb);
> > +
> > +/**
> > + * hbitmap_data_size:
> > + * @hb: HBitmap to operate on.
> > + * @count: Number of bits
> > + *
> > + * Return amount of bytes hbitmap_(de)serialize_part needs
> > + */
> 
> "number of bytes" -- amount is for uncountable nouns (like "water.")
> 
> > +uint64_t hbitmap_serialization_size(const HBitmap *hb,
> > +uint64_t start, uint64_t count);
> > +
> > +/**
> > + * hbitmap_serialize_part
> > + * @hb: HBitmap to operate on.
> > + * @buf: Buffer to store serialized bitmap.
> > + * @start: First bit to store.
> > + * @count: Number of bits to store.
> > + *
> > + * Stores HBitmap data corresponding to given region. The format of saved 
> > data
> > + * is linear sequence of bits, so it can be used by 
> > hbitmap_deserialize_part
> > + * independently of endianness and size of HBitmap level array elements
> > + */
> > +void hbitmap_serialize_part(const HBitmap *hb, uint8_t *buf,
> > +uint64_t start, uint64_t count);
> > +
> > +/**
> > + * hbitmap_deserialize_part
> > + * @hb: HBitmap to operate on.
> > + * @buf: Buffer to restore bitmap data from.
> > + * @start: First bit to restore.
> > + * @count: Number of bits to restore.
> > + * @finish: Whether to call hbitmap_deserialize_finish automatically.
> > + *
> > + * Restores HBitmap data corresponding to given region. The format is the 
> > same
> > + * as for hbitmap_serialize_part.
> > + *
> > + * If @finish is false, caller must call hbitmap_serialize_finish before 
> > using
> > + * the bitmap.
> > + */
> > +void hbitmap_deserialize_part(HBitmap *hb, uint8_t *buf,
> > +  uint64_t start, uint64_t count,
> > +  bool finish);
> > +
> > +/**
> > + * hbitmap_deserialize_zeroes
> > + * @hb: HBitmap to operate on.
> > + * @start: First bit to restore.
> > + * @count: Number of bits to restore.
> > + * @finish: Whether to call hbitmap_deserialize_finish automatically.
> > + *
> > + * Fills the bitmap with zeroes.
> > + *
> > + * If @finish is false, caller must call hbitmap_serialize_finish before 
> > using
> > + * the bitmap.
> > + */
> > +void hbitmap_deserialize_zeroes(HBitmap *hb, uint64_t start, uint64_t 
> > count,
> > +bool finish);
> > +
> > +/**
> > + * hbitmap_deserialize_finish
> > + * @hb: HBitmap to operate on.
> > + *
> > + * Repair HBitmap after calling hbitmap_deserialize_data. Actually, all 
> > HBitmap
> > + * layers are restored here.
> > + */
> > +void hbitmap_deserialize_finish(HBitmap *hb);
> > +
> > +/**
> >   * hbitmap_free:
> >   * @hb: HBitmap to operate on.
> >   *
> > diff --git a/util/hbitmap.c b/util/hbitmap.c
> > index 79f6236..1e49ab7 100644
> > --- a/util/hbitmap.c
> > +++ b/util/hbitmap.c
> > @@ -397,6 +397,141 @@ bool hbitmap_get(const HBitmap *hb, uint64_t item)
> >  return (hb->levels[HBITMAP_LEVELS - 1][pos >> BITS_PER_LEVEL] & bit) 
> > != 0;
> >  }
> >  
> > +uint64_t hbitmap_serialization_granularity(const HBitmap *hb)
> > +{
> > +return 64 << 

Re: [Qemu-devel] [PATCH v2 14/13] block: More operations for meta dirty bitmap

2016-02-27 Thread Fam Zheng
On Wed, 01/27 17:57, Vladimir Sementsov-Ogievskiy wrote:
> On 26.01.2016 10:26, Fam Zheng wrote:
> >Callers can create an iterator of meta bitmap with
> >bdrv_dirty_meta_iter_new(), then use the bdrv_dirty_iter_* operations on
> >it. Meta iterators are also counted by bitmap->active_iterators.
> >
> >Also add a couple of functions to retrieve granularity and count.
> >
> >Signed-off-by: Fam Zheng 
> >
> >---
> >
> >Vladimir, are these interfaces enough for your migration series? If it
> >is more convenient to you, you can adjust and take it into your series.
> 
> This patch is ok (the only thing - see comment below), thanks. The
> only thing is that my migration series may be changed to use
> post-copy migration instead of meta-bitmaps..
> 
> >---
> >  block/dirty-bitmap.c | 19 +++
> >  include/block/dirty-bitmap.h |  3 +++
> >  2 files changed, 22 insertions(+)
> >
> >diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> >index bf84b17..1aa7f76 100644
> >--- a/block/dirty-bitmap.c
> >+++ b/block/dirty-bitmap.c
> >@@ -369,6 +369,11 @@ uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap 
> >*bitmap)
> >  return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
> >  }
> >+uint32_t bdrv_dirty_bitmap_meta_granularity(BdrvDirtyBitmap *bitmap)
> >+{
> >+return hbitmap_granularity(bitmap->meta);
> >+}
> 
> I think it should be more like bdrv_dirty_bitmap_granularity, with
> "BDRV_SECTOR_SIZE << ".

Yes, will change.

Fam



Re: [Qemu-devel] [PATCH v2 05/13] block: Hide HBitmap in block dirty bitmap interface

2016-02-27 Thread Fam Zheng
On Tue, 01/26 19:02, Vladimir Sementsov-Ogievskiy wrote:
> 
> why? The old way works for meta iters. The new code - doesn't. I suppose
> 
> assert(iter->hbi.hb);
> hbitmap_iter_init(&iter->hbi, iter->hbi.hb, sector_num);

Okay, will change.

Fam