Re: [Qemu-devel] [PATCH] cpu_physical_memory_sync_dirty_bitmap: Another alignment fix

2018-01-04 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> On 03/01/2018 19:33, Dr. David Alan Gilbert (git) wrote:
> > The optimised version operates on 'longs', dealing with (typically) 64
> > pages at a time, replacing the whole long by a 0 and counting the bits.
> > If the RAMBlock is less than 64 pages (one long's worth of bitmap) in
> > length, that long can contain bits representing two different RAMBlocks,
> > but the code will update the bmap belonging to the first RAMBlock only,
> > while having updated the total dirty page count for both.
> 
> The patch is obviously correct, but would it make sense also to align
> the RAMBlocks' initial ram_addr_t to a multiple of BITS_PER_LONG <<
> TARGET_PAGE_BITS?

Yes, I can do that as a separate patch.  The alignment starts getting
a little silly - say a 4k target page and 64-bit longs, so aligning a 4k
RAMBlock to a 256KiB boundary - but I think it's OK.
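
For concreteness, a minimal standalone sketch of that arithmetic, assuming
4k target pages and 64-bit longs (the two constants are hard-coded
stand-ins for QEMU's TARGET_PAGE_BITS and BITS_PER_LONG, and the rounding
line is just the generic power-of-two idiom, not QEMU code):

#include <stdio.h>

/* Assumed values; QEMU gets these from its own headers. */
#define TARGET_PAGE_BITS 12
#define BITS_PER_LONG    64

int main(void)
{
    /* One long of dirty bits covers BITS_PER_LONG pages:
     * 64 << 12 = 262144 bytes = 256KiB. */
    unsigned long align = (unsigned long)BITS_PER_LONG << TARGET_PAGE_BITS;
    printf("alignment = %lu bytes (%lu KiB)\n", align, align / 1024);

    /* Rounding an address up to that (power-of-two) boundary. */
    unsigned long addr = 0x1234;
    printf("0x%lx rounds up to 0x%lx\n", addr,
           (addr + align - 1) & ~(align - 1));
    return 0;
}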

Dave
P.S. I'd be careful of saying 'obviously correct' given how many small
fixes this function has had recently!

> Thanks,
> 
> Paolo
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH] cpu_physical_memory_sync_dirty_bitmap: Another alignment fix

2018-01-03 Thread Juan Quintela
"Dr. David Alan Gilbert (git)"  wrote:
> From: "Dr. David Alan Gilbert" 
>
> This code has an optimised, word-aligned version, and a boring
> unaligned version. My commit f70d345 fixed one alignment issue, but
> there's another.
>
> The optimised version operates on 'longs', dealing with (typically) 64
> pages at a time, replacing the whole long by a 0 and counting the bits.
> If the RAMBlock is less than 64 pages (one long's worth of bitmap) in
> length, that long can contain bits representing two different RAMBlocks,
> but the code will update the bmap belonging to the first RAMBlock only,
> while having updated the total dirty page count for both.
>
> This probably didn't matter prior to 6b6712ef, which split the dirty
> bitmap by RAMBlock, but now that the bitmaps are per-RAMBlock we end up
> with a count that doesn't match the state in the bitmaps.
>
> Symptom:
>   Migration showing a few dirty pages left to be sent constantly
>   Seen on aarch64 and x86 with x86+ovmf
>
> Signed-off-by: Dr. David Alan Gilbert 
> Reported-by: Wei Huang 
> Fixes: 6b6712efccd383b48a909bee0b29e079a57601ec

Reviewed-by: Juan Quintela 




Re: [Qemu-devel] [PATCH] cpu_physical_memory_sync_dirty_bitmap: Another alignment fix

2018-01-03 Thread Paolo Bonzini
On 03/01/2018 19:33, Dr. David Alan Gilbert (git) wrote:
> The optimised version operates on 'longs', dealing with (typically) 64
> pages at a time, replacing the whole long by a 0 and counting the bits.
> If the RAMBlock is less than 64 pages (one long's worth of bitmap) in
> length, that long can contain bits representing two different RAMBlocks,
> but the code will update the bmap belonging to the first RAMBlock only,
> while having updated the total dirty page count for both.

The patch is obviously correct, but would it make sense also to align
the RAMBlocks' initial ram_addr_t to a multiple of BITS_PER_LONG <<
TARGET_PAGE_BITS?

Thanks,

Paolo



Re: [Qemu-devel] [PATCH] cpu_physical_memory_sync_dirty_bitmap: Another alignment fix

2018-01-03 Thread Wei Huang


On 01/03/2018 12:33 PM, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> This code has an optimised, word-aligned version, and a boring
> unaligned version. My commit f70d345 fixed one alignment issue, but
> there's another.
>
> The optimised version operates on 'longs', dealing with (typically) 64
> pages at a time, replacing the whole long by a 0 and counting the bits.
> If the RAMBlock is less than 64 pages (one long's worth of bitmap) in
> length, that long can contain bits representing two different RAMBlocks,
> but the code will update the bmap belonging to the first RAMBlock only,
> while having updated the total dirty page count for both.
>
> This probably didn't matter prior to 6b6712ef, which split the dirty
> bitmap by RAMBlock, but now that the bitmaps are per-RAMBlock we end up
> with a count that doesn't match the state in the bitmaps.
> 
> Symptom:
>   Migration showing a few dirty pages left to be sent constantly
>   Seen on aarch64 and x86 with x86+ovmf
> 
> Signed-off-by: Dr. David Alan Gilbert 
> Reported-by: Wei Huang 
> Fixes: 6b6712efccd383b48a909bee0b29e079a57601ec

This solves the failure I saw in the migration test case.

Acked-by: Wei Huang 

> ---
>  include/exec/ram_addr.h | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 6cbc02aa0f..7633ef6342 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -391,9 +391,10 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
>  uint64_t num_dirty = 0;
>  unsigned long *dest = rb->bmap;
>  
> -/* start address is aligned at the start of a word? */
> +/* start address and length are aligned at the start of a word? */
>  if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) ==
> - (start + rb->offset)) {
> + (start + rb->offset) &&
> +!(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
>  int k;
>  int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
>  unsigned long * const *src;
> 
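
For anyone puzzling over the new condition: masking with
((BITS_PER_LONG << TARGET_PAGE_BITS) - 1) is the usual power-of-two
alignment test. A minimal standalone sketch, with the constants hard-coded
as stand-ins for QEMU's and a hypothetical helper name:

#include <assert.h>
#include <stdbool.h>

#define TARGET_PAGE_BITS 12   /* assumed: 4k target pages */
#define BITS_PER_LONG    64   /* assumed: 64-bit longs */

/* True iff 'length' covers a whole number of longs' worth of pages.
 * Because the boundary is a power of two, being aligned is equivalent
 * to all the bits below it being zero. */
static bool length_word_aligned(unsigned long length)
{
    return !(length & (((unsigned long)BITS_PER_LONG << TARGET_PAGE_BITS) - 1));
}

int main(void)
{
    assert(length_word_aligned(64ul << TARGET_PAGE_BITS)); /* 256KiB: yes */
    assert(!length_word_aligned(1ul << TARGET_PAGE_BITS)); /* one page: no */
    return 0;
}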



[Qemu-devel] [PATCH] cpu_physical_memory_sync_dirty_bitmap: Another alignment fix

2018-01-03 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

This code has an optimised, word-aligned version, and a boring
unaligned version. My commit f70d345 fixed one alignment issue, but
there's another.

The optimised version operates on 'longs', dealing with (typically) 64
pages at a time, replacing the whole long by a 0 and counting the bits.
If the RAMBlock is less than 64 pages (one long's worth of bitmap) in
length, that long can contain bits representing two different RAMBlocks,
but the code will update the bmap belonging to the first RAMBlock only,
while having updated the total dirty page count for both.

This probably didn't matter prior to 6b6712ef, which split the dirty
bitmap by RAMBlock, but now that the bitmaps are per-RAMBlock we end up
with a count that doesn't match the state in the bitmaps.

Symptom:
  Migration showing a few dirty pages left to be sent constantly
  Seen on aarch64 and x86 with x86+ovmf

Signed-off-by: Dr. David Alan Gilbert 
Reported-by: Wei Huang 
Fixes: 6b6712efccd383b48a909bee0b29e079a57601ec
---
 include/exec/ram_addr.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 6cbc02aa0f..7633ef6342 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -391,9 +391,10 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
 uint64_t num_dirty = 0;
 unsigned long *dest = rb->bmap;
 
-/* start address is aligned at the start of a word? */
+/* start address and length are aligned at the start of a word? */
 if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) ==
- (start + rb->offset)) {
+ (start + rb->offset) &&
+!(length & ((BITS_PER_LONG << TARGET_PAGE_BITS) - 1))) {
 int k;
 int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
 unsigned long * const *src;
-- 
2.14.3
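
To make the mechanism in the commit message concrete: below is a minimal,
standalone sketch of the long-at-a-time pattern. It is not QEMU's actual
cpu_physical_memory_sync_dirty_bitmap (the real code walks the global
dirty memory blocks and clears each long with an atomic exchange); the
function name and test values here are purely illustrative.

#include <stdint.h>
#include <stdio.h>

/* Grab one long of dirty bits at a time, clear it at the source, merge
 * it into the destination bitmap, and count the pages that newly became
 * dirty there.  The bug being fixed: if a RAMBlock's bitmap ends in the
 * middle of a long, bits belonging to the *next* block are counted here
 * but merged into the wrong (or no) destination bitmap. */
static uint64_t sync_dirty_words(unsigned long *src, unsigned long *dest,
                                 int nr_words)
{
    uint64_t num_dirty = 0;

    for (int k = 0; k < nr_words; k++) {
        if (src[k]) {
            unsigned long bits = src[k];
            src[k] = 0;                          /* "replace the long by 0" */
            unsigned long new_dirty = bits & ~dest[k];
            dest[k] |= bits;
            /* "count the bits": GCC/Clang population-count builtin. */
            num_dirty += __builtin_popcountl(new_dirty);
        }
    }
    return num_dirty;
}

int main(void)
{
    unsigned long src[2]  = { 0xfful, 0x1ul };
    unsigned long dest[2] = { 0x0ful, 0x0ul };

    /* 5 pages newly become dirty: 4 in word 0 plus 1 in word 1. */
    printf("newly dirty pages: %llu\n",
           (unsigned long long)sync_dirty_words(src, dest, 2));
    return 0;
}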