Re: [Qemu-devel] [PATCH v2] persistent dirty bitmap: add QDB file spec.
On Fri, Nov 28, 2014 at 04:28:57PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 21.11.2014 19:55, Stefan Hajnoczi wrote:
> > Active dirty bitmaps should migrate too. I'm thinking now that the
> > appropriate thing is to add live migration of dirty bitmaps to QEMU
> > (regardless of whether they are active or not).
>
> I think we should migrate named dirty bitmaps that are not in use at the
> moment. If some external mechanism is using the bitmap (for example,
> backup), we can't actually migrate that process, because we would need to
> restore the whole backup structure, including its pointer to the bitmap,
> which is too hard and involves more than bitmap migration. So a named
> bitmap that is enabled but not in use (only bdrv_aligned_pwritev writes
> to it) can be migrated. For this I see the following solutions:
>
> 1) Save the corresponding pieces of all named bitmaps with every migrated
> block. The block size is 1 MB, so the extra overhead per block would be
> 2 bytes for a bitmap with 64 KB granularity and 256 bytes for a bitmap
> with 512-byte granularity. This approach needs additional fields in
> BlkMigBlock for the bitmap pieces.

block-migration.c is not used for all live migration. So it's important
not to tie dirty bitmap migration to block-migration.c; at the very least
there needs to be a way to skip actually copying disk contents in
block-migration.c.

(When there is shared storage that both the source and destination hosts
can access, block-migration.c is not used. Also, there is a newer
non-shared storage migration mechanism that is used instead of
block-migration.c and is not tied into the live migration data stream, so
block-migration.c is optional.)

> 2) Add a DIRTY flag to the migrated block flags to mark blocks that
> became dirty during migration. Save all the bitmaps separately, and also
> update them in block_load when a block arrives with the DIRTY flag set.
> Some information will be lost: the migrated dirty bitmaps may be dirtier
> than the original ones. This approach needs an additional field,
> bool dirty, in BlkMigBlock, and this flag must be saved in blk_send.
>
> These solutions don't depend on the persistence of dirty bitmaps or on
> the persistent bitmap file format.

That's an important characteristic since we probably want to migrate
named dirty bitmaps, whether they are persistent or not.

Stefan
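To make the information loss in option 2 concrete, here is a toy model of the receive side. It is only a sketch under stated assumptions: ToyBitmap, toy_block_load and toy_count_dirty are hypothetical names, not QEMU functions, and the 64 KB granularity is just one of the values mentioned in the thread. Because the DIRTY flag is per 1 MB block, the destination has to mark every granule of that block.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE     (1u << 20)   /* 1 MB migration block, as in the thread */
#define GRANULARITY    (64u << 10)  /* assumed 64 KB bitmap granularity */
#define BITS_PER_BLOCK (BLOCK_SIZE / GRANULARITY)
#define NB_BLOCKS      4u

/* Toy bitmap: one byte per bit, for readability only. */
typedef struct {
    uint8_t bit[NB_BLOCKS * BITS_PER_BLOCK];
} ToyBitmap;

/* Hypothetical receive-side hook, called once per incoming block. */
static void toy_block_load(ToyBitmap *bm, unsigned block, bool dirty_flag)
{
    if (dirty_flag) {
        /* The flag covers the whole block, so all its granules get marked. */
        memset(&bm->bit[block * BITS_PER_BLOCK], 1, BITS_PER_BLOCK);
    }
}

/* Number of granules currently marked dirty. */
static unsigned toy_count_dirty(const ToyBitmap *bm)
{
    unsigned n = 0;
    for (unsigned i = 0; i < NB_BLOCKS * BITS_PER_BLOCK; i++) {
        n += bm->bit[i];
    }
    return n;
}
```

If the guest dirtied a single 64 KB granule on the source, the destination ends up with 16 dirty granules for that block, which is the over-approximation the mail describes.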
Re: [Qemu-devel] [PATCH v2] persistent dirty bitmap: add QDB file spec.
On 21.11.2014 19:55, Stefan Hajnoczi wrote:
> Active dirty bitmaps should migrate too. I'm thinking now that the
> appropriate thing is to add live migration of dirty bitmaps to QEMU
> (regardless of whether they are active or not).

I think we should migrate named dirty bitmaps that are not in use at the
moment. If some external mechanism is using the bitmap (for example,
backup), we can't actually migrate that process, because we would need to
restore the whole backup structure, including its pointer to the bitmap,
which is too hard and involves more than bitmap migration.

So a named bitmap that is enabled but not in use (only
bdrv_aligned_pwritev writes to it) can be migrated. For this I see the
following solutions:

1) Save the corresponding pieces of all named bitmaps with every migrated
block. The block size is 1 MB, so the extra overhead per block would be
2 bytes for a bitmap with 64 KB granularity and 256 bytes for a bitmap
with 512-byte granularity. This approach needs additional fields in
BlkMigBlock for the bitmap pieces.

2) Add a DIRTY flag to the migrated block flags to mark blocks that became
dirty during migration. Save all the bitmaps separately, and also update
them in block_load when a block arrives with the DIRTY flag set. Some
information will be lost: the migrated dirty bitmaps may be dirtier than
the original ones. This approach needs an additional field, bool dirty, in
BlkMigBlock, and this flag must be saved in blk_send.

These solutions don't depend on the persistence of dirty bitmaps or on the
persistent bitmap file format.

Best regards,
Vladimir
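The per-block overhead figures quoted for option 1 (2 bytes at 64 KB granularity, 256 bytes at 512-byte granularity) follow from one bit per granule. A minimal sketch of that arithmetic, where bitmap_piece_bytes is an illustrative helper and not part of QEMU:

```c
#include <stdint.h>

/*
 * Bytes of bitmap needed to cover region_bytes of disk when each bit
 * tracks granularity_bytes. Illustrative helper, not a QEMU function.
 */
static uint64_t bitmap_piece_bytes(uint64_t region_bytes,
                                   uint64_t granularity_bytes)
{
    /* One bit per granule, rounding a partial granule up. */
    uint64_t bits = (region_bytes + granularity_bytes - 1) / granularity_bytes;

    /* Eight bits per byte, rounding a partial byte up. */
    return (bits + 7) / 8;
}
```

For a 1 MB block this gives 16 bits (2 bytes) at 64 KB granularity and 2048 bits (256 bytes) at 512-byte granularity, matching the numbers in the mail.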
Re: [Qemu-devel] [PATCH v2] persistent dirty bitmap: add QDB file spec.
> I'm thinking now that the appropriate thing is to add live migration of
> dirty bitmaps to QEMU (regardless of whether they are active or not).

Digging around in the code, I've found this: in mig_save_device_dirty,
which is one iteration of live block migration, after sending a sector we
need to clear the corresponding bit in the migration dirty bitmap
(bmds->dirty_bitmap). But we clear those bits in all bitmaps associated
with the device:

    bdrv_reset_dirty(bmds->bs, sector, nr_sectors);

which is

    void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
                          int nr_sectors)
    {
        BdrvDirtyBitmap *bitmap;
        QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
            hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
        }
    }

I don't know why this is so, but with such an approach we can't talk about
dirty bitmap migration. In fact, all other dirty bitmaps, unrelated to this
migration, are broken because of it. Either this is a mistake, or I don't
understand the concept of several dirty bitmaps per device in QEMU. I
thought they were separate entities maintained by QEMU, and that other
subsystems such as backup or migration could each create a bitmap for
themselves and use it without touching the other bitmaps. Am I wrong?

Best regards,
Vladimir
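The concern about bdrv_reset_dirty can be shown with a toy model. This is a sketch only: ToyDirtyBitmap, toy_reset_all and toy_reset_one are hypothetical stand-ins, and the per-bitmap reset is just one possible alternative, not something the thread settles on.

```c
#include <stdint.h>

/* Toy stand-in for BdrvDirtyBitmap: one 64-bit word covers 64 sectors. */
typedef struct {
    uint64_t bits;
} ToyDirtyBitmap;

/* Mask for sectors [first, first + count); requires first + count <= 64. */
static uint64_t sector_mask(int first, int count)
{
    uint64_t m = (count == 64) ? UINT64_MAX : ((UINT64_C(1) << count) - 1);
    return m << first;
}

/* Behaviour described in the mail: every bitmap on the device is cleared. */
static void toy_reset_all(ToyDirtyBitmap *bitmaps, int n, int first, int count)
{
    for (int i = 0; i < n; i++) {
        bitmaps[i].bits &= ~sector_mask(first, count);
    }
}

/* Possible alternative: the caller clears only the bitmap it owns. */
static void toy_reset_one(ToyDirtyBitmap *bitmap, int first, int count)
{
    bitmap->bits &= ~sector_mask(first, count);
}
```

With two bitmaps on a device, toy_reset_one() on the migration bitmap leaves the second (say, backup) bitmap intact, while toy_reset_all() silently drops the second bitmap's dirty bits for the same sectors.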
Re: [Qemu-devel] [PATCH v2] persistent dirty bitmap: add QDB file spec.
> Active dirty bitmaps should migrate too. I'm thinking now that the
> appropriate thing is to add live migration of dirty bitmaps to QEMU
> (regardless of whether they are active or not).

Only for persistent bitmaps, or for all named bitmaps? If it is for all
named bitmaps, then this migration should not be tied to the bitmap file
and its format.

Best regards,
Vladimir
Re: [Qemu-devel] [PATCH v2] persistent dirty bitmap: add QDB file spec.
> There is a constraint if we want to get live migration for free: The
> bitmap contents must be accessible with bdrv_read() and
> bdrv_get_block_status() to skip zero regions.

Hm. I'm afraid it still will not be free. If the bitmap is active, its
current version is in memory. To migrate the bitmap file like a disk image,
we would have to start syncing it on every write to the corresponding disk,
doubling the number of I/O operations.

Moreover, we have normal dirty bitmaps, which have no name or file. Do we
migrate them? What if, for example, the migration occurs while a backup is
in progress? Active bitmaps should be migrated in the same way whether they
are persistent, named or normal. I can't find it in the QEMU source: is
there any bitmap migration?

Or are you talking about migrating disabled bitmaps? Hm. We should sync the
bitmap file on bitmap_disable. A disabled persistent bitmap is just a
static file of about 30 MB; we can easily migrate it without sharing a
procedure with COW or anything like that.

Best regards,
Vladimir
Re: [Qemu-devel] [PATCH v2] persistent dirty bitmap: add QDB file spec.
> The metadata (bitmap name, granularity, etc) doesn't need to be stored in
> the image file because management tools must be aware of it anyway.

Which tools do you mean? In my opinion a dirty bitmap should exist as a
separate object. If it exists, it should be loaded together with its drive
image and maintained by QEMU (loaded and enabled as a BdrvDirtyBitmap).

If we use the qcow2 format for dirty bitmaps, we can store the metadata in
a header extension. Snapshots could also be used to store several bitmaps
for the case where the server shuts down during a backup and we need to
store both the current active bitmap and the snapshot of it used by the
backup.

Best regards,
Vladimir
Re: [Qemu-devel] [PATCH v2] persistent dirty bitmap: add QDB file spec.
On Fri, Nov 21, 2014 at 01:27:40PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > There is a constraint if we want to get live migration for free: The
> > bitmap contents must be accessible with bdrv_read() and
> > bdrv_get_block_status() to skip zero regions.
>
> Hm. I'm afraid it still will not be free. If the bitmap is active, its
> current version is in memory. To migrate the bitmap file like a disk
> image, we would have to start syncing it on every write to the
> corresponding disk, doubling the number of I/O operations.

It would be possible to drive-mirror the persistent dirty bitmap and then
flush it like all drives when the guest vCPUs are paused for migration.

After thinking more about it though, this approach places more I/O into
the critical guest downtime phase. In other words, slow disk I/O could
lead to long guest downtimes while QEMU tries to write out the dirty
bitmap.

> Moreover, we have normal dirty bitmaps, which have no name or file. Do we
> migrate them? What if, for example, the migration occurs while a backup
> is in progress? Active bitmaps should be migrated in the same way whether
> they are persistent, named or normal. I can't find it in the QEMU source:
> is there any bitmap migration?

bs->dirty_bitmaps is not migrated, in fact none of BlockDriverState is
migrated.

QEMU only migrates emulated device state (e.g. the hardware registers and
associated state). It does not migrate host state that the guest cannot
see, like the dirty bitmap.

> Or are you talking about migrating disabled bitmaps? Hm. We should sync
> the bitmap file on bitmap_disable. A disabled persistent bitmap is just a
> static file of about 30 MB; we can easily migrate it without sharing a
> procedure with COW or anything like that.

Active dirty bitmaps should migrate too. I'm thinking now that the
appropriate thing is to add live migration of dirty bitmaps to QEMU
(regardless of whether they are active or not).

Stefan
[Qemu-devel] [PATCH v2] persistent dirty bitmap: add QDB file spec.
A QDB file stores dirty bitmaps. The specification is based on the qcow2
specification.

Saving several bitmaps is necessary when the server shuts down during a
backup. In this case two tables are available for each disk: one collected
for the previous period and one active. This feature is open to
discussion, though.

Big-endian format and the Standard Cluster Descriptor are used to simplify
integration with qcow2, so that internal bitmaps for qcow2 can be supported
in the future. The idea is that the same procedure that writes the data to
a QDB file could do the same for qcow2. The only difference is the cluster
refcount table; whether it should be used here is still an open question.

Signed-off-by: Vladimir Sementsov-Ogievskiy vsement...@parallels.com
---
 docs/specs/qdb.txt | 132 +
 1 file changed, 132 insertions(+)
 create mode 100644 docs/specs/qdb.txt

diff --git a/docs/specs/qdb.txt b/docs/specs/qdb.txt
new file mode 100644
index 000..d570a69
--- /dev/null
+++ b/docs/specs/qdb.txt
@@ -0,0 +1,132 @@
+== General ==
+
+QDB means Qemu Dirty Bitmaps. QDB file can store several dirty bitmaps.
+QDB file is organized in units of constant size, which are called clusters.
+
+All numbers in QDB are stored in Big Endian byte order.
+
+== Header ==
+
+The first cluster of a QDB image contains the file header:
+
+    Byte  0 -  3:   magic
+                    QDB magic string (QDB\0)
+
+          4 -  7:   version
+                    Version number (valid value is 1)
+
+          8 - 11:   cluster_bits
+                    Number of bits that are used for addressing an offset
+                    within a cluster (1 << cluster_bits is the cluster size).
+                    Must not be less than 9 (i.e. 512 byte clusters).
+
+         12 - 15:   nb_bitmaps
+                    Number of bitmaps contained in the file
+
+         16 - 23:   bitmaps_offset
+                    Offset into the QDB file at which the bitmap table starts.
+                    Must be aligned to a cluster boundary.
+
+         24 - 27:   header_length
+                    Length of the header structure in bytes.
+
+Like in qcow2, directly after the image header, optional sections called header extensions can
+be stored. Each extension has a structure like the following:
+
+    Byte  0 -  3:   Header extension type:
+                        0x - End of the header extension area
+                        other - Unknown header extension, can be safely
+                                ignored
+
+          4 -  7:   Length of the header extension data
+
+          8 -  n:   Header extension data
+
+          n -  m:   Padding to round up the header extension size to the next
+                    multiple of 8.
+
+Unless stated otherwise, each header extension type shall appear at most once
+in the same image.
+
+== Cluster mapping ==
+
+QDB uses a ONE-level structure for the mapping of
+bitmaps to host clusters. It is called L1 table.
+
+The L1 table has a variable size (stored in the Bitmap table entry) and may
+use multiple clusters, however it must be contiguous in the QDB file.
+
+Given an offset into the bitmap, the offset into the QDB file can be
+obtained as follows:
+
+    offset = l1_table[offset / cluster_size] + (offset % cluster_size)
+
+L1 table entry:
+
+    Bit  0 - 61:    Cluster descriptor
+
+        62 - 63:    Reserved
+
+Standard Cluster Descriptor (the same as in qcow2):
+
+    Bit       0:    If set to 1, the cluster reads as all zeros. The host
+                    cluster offset can be used to describe a preallocation,
+                    but it won't be used for reading data from this cluster,
+                    nor is data read from the backing file if the cluster is
+                    unallocated.
+
+         1 -  8:    Reserved (set to 0)
+
+         9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
+                    cluster boundary. If the offset is 0, the cluster is
+                    unallocated.
+
+        56 - 61:    Reserved (set to 0)
+
+If a cluster is unallocated, read requests shall read zero.
+
+== Bitmap table ==
+
+QDB supports storing of several bitmaps.
+
+A directory of all bitmaps is stored in the bitmap table, a contiguous area
+in the QDB file, whose starting offset and length are given by the header
+fields bitmaps_offset and nb_bitmaps. The entries of the bitmap table
+have variable length, depending on the length of name and extra data.
+
+Bitmap table entry:
+
+    Byte  0 -  7:   Offset into the QDB file at which the L1 table for the
+                    bitmap starts. Must be aligned to a cluster boundary.
+
+          8 - 11:   Number of entries in the L1 table of the bitmap
+
+         12 - 15:   Bitmap granularity
+                    As represented in HBitmap structure. Given a granularity of
+                    G, each bit in the bitmap will actually represent a group
+                    of 2^G bytes.
+
+         16 - 23:   Bitmap size
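As a reading aid for the header layout in the patch, a parser for the fixed fields could look roughly like the sketch below. QdbHeader and qdb_parse_header are hypothetical names, the magic is taken as the four bytes "QDB\0" as printed in the spec, and the upper bound of 63 on cluster_bits is purely an assumption of this sketch (the spec only gives a minimum).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* In-memory view of the fixed header fields listed in the spec. */
typedef struct {
    uint32_t version;
    uint32_t cluster_bits;
    uint32_t nb_bitmaps;
    uint64_t bitmaps_offset;
    uint32_t header_length;
} QdbHeader;

/* Read a big-endian 32-bit value. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a big-endian 64-bit value. */
static uint64_t be64(const uint8_t *p)
{
    return ((uint64_t)be32(p) << 32) | be32(p + 4);
}

/* Returns 0 on success, -1 if buf is not a plausible QDB header. */
static int qdb_parse_header(const uint8_t *buf, size_t len, QdbHeader *h)
{
    uint64_t cluster_size;

    if (len < 28 || memcmp(buf, "QDB\0", 4) != 0) {
        return -1;
    }
    h->version = be32(buf + 4);
    h->cluster_bits = be32(buf + 8);
    h->nb_bitmaps = be32(buf + 12);
    h->bitmaps_offset = be64(buf + 16);
    h->header_length = be32(buf + 24);

    /* Version must be 1; clusters must be at least 512 bytes. */
    if (h->version != 1 || h->cluster_bits < 9) {
        return -1;
    }
    /* Assumption: cap cluster_bits so the shift below stays defined. */
    if (h->cluster_bits > 63) {
        return -1;
    }
    /* The bitmap table must start on a cluster boundary. */
    cluster_size = UINT64_C(1) << h->cluster_bits;
    if (h->bitmaps_offset & (cluster_size - 1)) {
        return -1;
    }
    return 0;
}
```

The sketch deliberately ignores header extensions and header_length validation, since the thread leaves those rules open.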
Re: [Qemu-devel] [PATCH v2] persistent dirty bitmap: add QDB file spec.
Also, it may be better to make this a qcow2 extension. The bitmap would
then be saved in a separate qcow2 file that contains only the bitmap(s)
and no other data (no disk, no snapshots).

Best regards,
Vladimir
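The L1 lookup from the proposed spec, offset = l1_table[offset / cluster_size] + (offset % cluster_size), combined with the Standard Cluster Descriptor it shares with qcow2, might be implemented along these lines. This is a sketch: qdb_map_offset is a hypothetical name, the L1 entries are assumed to be already converted to host byte order, and treating an out-of-range index as unmapped is an assumption of this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define QDB_ZERO_FLAG   UINT64_C(1)                   /* descriptor bit 0 */
#define QDB_OFFSET_MASK UINT64_C(0x00fffffffffffe00)  /* descriptor bits 9-55 */

/*
 * Translate an offset inside the bitmap into an offset in the QDB file.
 * Returns false when there is no data to read (zero flag set, cluster
 * unallocated, or index beyond the L1 table); *file_offset is then left
 * untouched and the caller should treat the cluster as zeros.
 */
static bool qdb_map_offset(const uint64_t *l1_table, uint32_t l1_entries,
                           uint32_t cluster_bits, uint64_t bitmap_offset,
                           uint64_t *file_offset)
{
    uint64_t cluster_size = UINT64_C(1) << cluster_bits;
    uint64_t index = bitmap_offset >> cluster_bits;  /* offset / cluster_size */
    uint64_t entry, host;

    if (index >= l1_entries) {
        return false;
    }
    entry = l1_table[index];
    host = entry & QDB_OFFSET_MASK;
    if ((entry & QDB_ZERO_FLAG) || host == 0) {
        return false;
    }
    /* Host cluster start plus offset % cluster_size. */
    *file_offset = host + (bitmap_offset & (cluster_size - 1));
    return true;
}
```

Returning "reads as zeros" instead of a file offset is what would let a generic reader skip unallocated regions, which is the property the thread cares about for migration.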
Re: [Qemu-devel] [PATCH v2] persistent dirty bitmap: add QDB file spec.
On Thu, Nov 20, 2014 at 01:41:14PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Also, it may be better to make this a qcow2 extension. The bitmap would
> then be saved in a separate qcow2 file that contains only the bitmap(s)
> and no other data (no disk, no snapshots).

I think you are on to something with the idea of making the persistent
dirty bitmap itself a disk image.

That way drive-mirror and other commands can be used to live migrate the
dirty bitmap along with the guest's disks. This allows both QEMU and
management tools to reuse existing code.

(We may need to allow multiple block jobs per BlockDriverState to make
this work but in theory that can be done.)

There is a constraint if we want to get live migration for free: The
bitmap contents must be accessible with bdrv_read() and
bdrv_get_block_status() to skip zero regions.

Putting the dirty bitmap into its own data structure in qcow2, where it is
not accessible through BlockDriverState bdrv_read(), means custom code
must be written to migrate the dirty bitmap.

So I suggest putting the bitmap contents into a disk image that can be
accessed as a BlockDriverState with bdrv_read(). The metadata (bitmap
name, granularity, etc) doesn't need to be stored in the image file
because management tools must be aware of it anyway.

The only thing besides the data that really needs to be stored is the
up-to-date flag to decide whether this dirty bitmap was synced cleanly. A
much simpler format would do for that.

Stefan
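One way to picture the "up-to-date flag" idea is the small lifecycle below. Everything here is hypothetical (ToyBitmapFile and the toy_* functions are not QEMU code), and falling back to a fully dirty bitmap when the flag is clear is an assumption of this sketch rather than something stated in the mail.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NB_BYTES 8

/* Toy model of the persisted file: a flag plus raw bitmap contents. */
typedef struct {
    bool up_to_date;
    uint8_t data[NB_BYTES];
} ToyBitmapFile;

/* Called before the in-memory bitmap starts diverging from the file. */
static void toy_begin_tracking(ToyBitmapFile *f)
{
    f->up_to_date = false;
}

/* Called once the in-memory bitmap has been written out completely. */
static void toy_sync(ToyBitmapFile *f, const uint8_t *mem)
{
    memcpy(f->data, mem, NB_BYTES);
    f->up_to_date = true;
}

/* Load the bitmap; a stale file yields an all-dirty bitmap (assumption). */
static void toy_load(const ToyBitmapFile *f, uint8_t *mem)
{
    if (f->up_to_date) {
        memcpy(mem, f->data, NB_BYTES);
    } else {
        memset(mem, 0xff, NB_BYTES);
    }
}
```

A crash between toy_begin_tracking() and toy_sync() leaves the flag clear, so the next toy_load() does not trust the stored contents.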
Re: [Qemu-devel] [PATCH v2] persistent dirty bitmap: add QDB file spec.
On 11/20/2014 03:34 AM, Vladimir Sementsov-Ogievskiy wrote:
> A QDB file stores dirty bitmaps. The specification is based on the qcow2
> specification.
>
> Saving several bitmaps is necessary when the server shuts down during a
> backup. In this case two tables are available for each disk: one
> collected for the previous period and one active. This feature is open
> to discussion, though.
>
> Big-endian format and the Standard Cluster Descriptor are used to
> simplify integration with qcow2, so that internal bitmaps for qcow2 can
> be supported in the future. The idea is that the same procedure that
> writes the data to a QDB file could do the same for qcow2. The only
> difference is the cluster refcount table; whether it should be used here
> is still an open question.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy vsement...@parallels.com
> ---
>  docs/specs/qdb.txt | 132 +
>  1 file changed, 132 insertions(+)
>  create mode 100644 docs/specs/qdb.txt

No comment on whether the approach itself makes sense - just a high-level
review of this document in isolation.

> diff --git a/docs/specs/qdb.txt b/docs/specs/qdb.txt
> new file mode 100644
> index 000..d570a69
> --- /dev/null
> +++ b/docs/specs/qdb.txt
> @@ -0,0 +1,132 @@
> +== General ==

Missing a copyright notice. Yeah, you've got a lot of bad examples in this
directory (in docs/* in general), but there ARE a few of the newer files
that are starting to buck the trend and use a copyright/license blurb.

> +
> +QDB means Qemu Dirty Bitmaps. QDB file can store several dirty bitmaps.
> +QDB file is organized in units of constant size, which are called clusters.
> +
> +All numbers in QDB are stored in Big Endian byte order.
> +
> +== Header ==
> +
> +The first cluster of a QDB image contains the file header:
> +
> +    Byte  0 -  3:   magic
> +                    QDB magic string (QDB\0)
> +
> +          4 -  7:   version
> +                    Version number (valid value is 1)
> +
> +          8 - 11:   cluster_bits
> +                    Number of bits that are used for addressing an offset
> +                    within a cluster (1 << cluster_bits is the cluster size).
> +                    Must not be less than 9 (i.e. 512 byte clusters).

Is there a maximum?

> +
> +         12 - 15:   nb_bitmaps
> +                    Number of bitmaps contained in the file
> +
> +         16 - 23:   bitmaps_offset
> +                    Offset into the QDB file at which the bitmap table starts.
> +                    Must be aligned to a cluster boundary.
> +
> +         24 - 27:   header_length
> +                    Length of the header structure in bytes.

Does that include the length of all extensions? Should we enforce a
maximum header length of one cluster?

> +
> +Like in qcow2, directly after the image header, optional sections called header extensions can
> +be stored. Each extension has a structure like the following:
> +
> +    Byte  0 -  3:   Header extension type:
> +                        0x - End of the header extension area
> +                        other - Unknown header extension, can be safely
> +                                ignored
> +
> +          4 -  7:   Length of the header extension data
> +
> +          8 -  n:   Header extension data
> +
> +          n -  m:   Padding to round up the header extension size to the next
> +                    multiple of 8.
> +
> +Unless stated otherwise, each header extension type shall appear at most once
> +in the same image.

I like how qcow2 v3 has a header extension for listing the name of each
header extension, for nicer error messages. Also, I think that declaring
all unknown extensions as ignorable may be dangerous, since you lack a
capability bitmask. Maybe it would be wise to copy the qcow2 v3
capabilities (including flags for ignorable vs. mandatory support of given
features, where a client can sanely decide what to do if it does not
recognize a feature).

> +
> +         26 - 27:   Size of the bitmap name
> +
> +         36 - 39:   Size of extra data in the table entry (used for future
> +                    extensions of the format)
> +
> +        variable:   Extra data for future extensions. Unknown fields must be
> +                    ignored.

This block is width 0 if bytes 36-39 is 0? How are extensions identified?
Are they required to be done like overall file headers, with an id,
length, and then variable data, so that it is possible to scan to the end
of each unknown extension to see if the next extension is known? This is
where capability bits in the overall header may make more sense.

> +
> +        variable:   Name of the bitmap (not null terminated)

The length of this block is determined by bytes 26-27?

> +
> +        variable:   Padding to round up the bitmap table entry size to the
> +                    next multiple of 8.
> +
> +The fields size, granularity, enabled and name are corresponding with
> +the fields in struct BdrvDirtyBitmap.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library
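The question of scanning past unknown extensions can be made concrete for the header extension area, which already carries a type and a length. The sketch below walks such an area and skips every extension by its padded length. It rests on assumptions: qdb_count_extensions is a hypothetical name, the end-of-area type is taken to be 0 as in qcow2 (the archived text truncates the value), and a missing end marker is treated as an error.

```c
#include <stddef.h>
#include <stdint.h>

#define QDB_EXT_END 0u  /* assumption: same end marker value as qcow2 */

/* Read a big-endian 32-bit value. */
static uint32_t ext_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Count the extensions in buf[0..len) that precede the end marker,
 * skipping each one by its length without interpreting its contents.
 * Returns -1 if an extension is truncated or no end marker is found.
 */
static int qdb_count_extensions(const uint8_t *buf, size_t len)
{
    size_t pos = 0;
    int count = 0;

    while (len - pos >= 8) {
        uint32_t type = ext_be32(buf + pos);
        uint32_t data_len = ext_be32(buf + pos + 4);
        uint64_t padded;

        if (type == QDB_EXT_END) {
            return count;
        }
        /* Data is padded up to the next multiple of 8 bytes. */
        padded = ((uint64_t)data_len + 7) & ~UINT64_C(7);
        if (padded > len - pos - 8) {
            return -1;
        }
        pos += 8 + (size_t)padded;
        count++;
    }
    return -1;
}
```

Skipping by length works for any unknown type, but as the review points out it cannot tell an ignorable extension from a mandatory one; that would need something like the qcow2 v3 feature bits.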