On 06/27/2012 04:34 AM, Orit Wasserman wrote:
> Signed-off-by: Orit Wasserman <owass...@redhat.com>
> ---
>  docs/xbzrle.txt |  142 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 142 insertions(+), 0 deletions(-)
>  create mode 100644 docs/xbzrle.txt
> 

> +Format
> +=======
> +
> +The compression format preforms a XOR between the previous and current content

s/preforms/performs/

> +of the page, where zero represents an unchanged value.
> +The page data delta is represented by zero and non zero runs.
> +A zero run is represented by its length (in bytes).
> +A non zero run is represented by its length (in bytes) and the new data.
> +The run length is encoded using ULEB128 (http://en.wikipedia.org/wiki/LEB128)
> +

Maybe mention that there is more than one valid encoding, and that the
sender may send a longer encoding if the computation cost of determining
the shortest representation is not worthwhile (to make it clear that we
may or may not decide to optimize the case of one unchanged byte
splitting two nzruns or being inlined into a single nzrun).
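
To illustrate the trade-off, here is a rough Python sketch (mine, not
code from the patch) of an encoder with a hypothetical inline_gap knob:
an unchanged gap of at most that many bytes between two changed bytes is
folded into the surrounding nzrun instead of splitting it.  It assumes
runs alternate zrun, nzrun, starting with a possibly empty zrun, and
that an unchanged tail is left implicit.  The names uleb128,
xbzrle_encode and inline_gap are made up for the example.

```python
def uleb128(value):
    """Encode a non-negative integer as ULEB128, low 7 bits first."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def xbzrle_encode(old, new, inline_gap=0):
    """Sketch of a zrun/nzrun delta encoder (assumed layout, see lead-in).

    inline_gap is a hypothetical tuning knob: unchanged gaps of at most
    this many bytes that sit between changed bytes stay inside the nzrun.
    """
    assert len(old) == len(new)
    n = len(new)
    out = bytearray()
    i = 0
    while i < n:
        # Zero run: bytes that did not change.
        start = i
        while i < n and old[i] == new[i]:
            i += 1
        if i == n:
            break  # unchanged tail is not encoded
        out += uleb128(i - start)

        # Non-zero run: changed bytes, plus any short gaps we inline.
        start = i
        while i < n:
            if old[i] != new[i]:
                i += 1
                continue
            j = i
            while j < n and old[j] == new[j]:
                j += 1
            if j < n and j - i <= inline_gap:
                i = j  # gap is short and followed by a change: inline it
            else:
                break
        out += uleb128(i - start)
        out += new[start:i]
    return bytes(out)


old = bytes(3)
new = bytes([0x01, 0x00, 0x02])
for gap in (0, 1):
    enc = xbzrle_encode(old, new, inline_gap=gap)
    print(gap, " ".join(f"{b:02x}" for b in enc))
```

With old = 00 00 00 and new = 01 00 02, inline_gap=0 gives
00 01 01 01 01 02 (6 bytes) and inline_gap=1 gives 00 03 01 00 02
(5 bytes).  Both describe the same page, so a receiver has to accept
either.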

> +On the sender side XBZRLE is used as a compact delta encoding of page updates,
> +retrieving the old page content from the cache (default size of 512 MB). The
> +receiving side uses the existing page's content and XBZRLE to decode the new
> +page's content.
> +
> +This is a more compact way to store the deltas than the previous version.

What previous version?  Oh, you mentioned it in the next paragraph (the
XBRLE algorithm used by Benoit, Svard, Tordsson, and Elmroth); although
I had been assuming you were comparing it to the previous way that qemu
sent changed pages (that is, uncompressed).  I think you could either
delete this one-liner paragraph with no loss in information, or use this
more generic wording instead:

This typically results in a more compact representation of a changed page.

> +
> +Example
> +new buffer:
> +1100 zeros
> +1 2 3 4 5 6 7 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f

> 
> +old buffer:
> +1100 zeros
> +5 6 7 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22

> 
> +encoded buffer:
> +
> +encoded length 118
> +
> +e8 7 70 0 1 2 3 4 5 6 7 8 9 a b c d e f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c

Huh?  If I'm doing my math right, this starts with a zero run of 1100
bytes, which is 0x44c, and that is encoded as 0xcc 0x08 (that is,
((0x44c&0x7f)|0x80) for the first byte, and (0x44c>>7) for the second
byte).  But you listed the zero run as 0xe8 0x07 (which decodes to
0x3e8, or 1000).
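
For anyone who wants to double-check the arithmetic, here is a small
Python sketch of ULEB128 as described on the Wikipedia page linked
above.  The function names are my own, not anything from the patch.

```python
def uleb128_encode(value):
    """Encode a non-negative integer as ULEB128, low 7 bits first."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def uleb128_decode(data):
    """Decode one ULEB128 value from the start of data."""
    value = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            break
    return value


print(" ".join(f"{b:02x}" for b in uleb128_encode(1100)))  # cc 08
print(uleb128_decode(bytes([0xE8, 0x07])))                 # 1000
```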

Also, I would list the old buffer before the new one.  And I'd pad out
the buffer so that all bytes are two digits (use a leading zero) and the
columns are therefore aligned.  You can also use a shorter string to
still get the point across.  Maybe something like:

old buffer, 4096 bytes:
1100 zeros
01 02 03 04 05 06 00 08 09 0a 0b 0c
2984 zeros

new buffer, 4096 bytes:
1100 zeros
02 03 04 04 05 06 07 08 00 00 09 0a 00 0b 0c 0d
2980 zeros

one possible encoded buffer, 18 bytes:
cc 08 03 02 03 04 03 0a 07 08 00 00 09 0a 00 0b 0c 0d
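
As a sanity check on that example, here is a minimal Python decoder
sketch.  It assumes the stream alternates zrun, nzrun (starting with a
zrun), that an nzrun carries the new bytes verbatim, and that anything
past the last run is unchanged.  The helper names are hypothetical and
this is not the code from the patch.

```python
def read_uleb128(buf, pos):
    """Return (value, next_pos) for the ULEB128 integer at buf[pos]."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def xbzrle_decode(old, encoded):
    """Apply an encoded delta to the old page and return the new page."""
    page = bytearray(old)
    src = 0  # read position in the encoded buffer
    dst = 0  # write position in the page
    while src < len(encoded):
        zrun, src = read_uleb128(encoded, src)
        dst += zrun  # unchanged bytes: keep the old content
        if src >= len(encoded):
            break
        nzrun, src = read_uleb128(encoded, src)
        page[dst:dst + nzrun] = encoded[src:src + nzrun]
        src += nzrun
        dst += nzrun
    return bytes(page)


old = bytes(1100) + bytes.fromhex("01 02 03 04 05 06 00 08 09 0a 0b 0c") + bytes(2984)
new = bytes(1100) + bytes.fromhex("02 03 04 04 05 06 07 08 00 00 09 0a 00 0b 0c 0d") + bytes(2980)
encoded = bytes.fromhex("cc 08 03 02 03 04 03 0a 07 08 00 00 09 0a 00 0b 0c 0d")

print(len(encoded), xbzrle_decode(old, encoded) == new)
```

Note that the final nzrun of 10 bytes deliberately includes two
unchanged bytes (the 08 and the lone 00), which is the "inlined into a
single nzrun" choice I mentioned earlier.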

-- 
Eric Blake   ebl...@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
