On Thu, Jan 05, 2012 at 01:46:08PM -0200, Marcelo Tosatti wrote:
> On Thu, Dec 29, 2011 at 05:36:59PM +0800, Dong Xu Wang wrote:
> > From: Dong Xu Wang <wdon...@linux.vnet.ibm.com>
> >
> > Introduce a new file format: add-cow. The usage can be found in add-cow.txt of
> > this patch.
> >
> > CC: Kevin Wolf <kw...@redhat.com>
> > CC: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com>
> > Signed-off-by: Dong Xu Wang <wdon...@linux.vnet.ibm.com>
> > ---
> > After applying this patch, qemu might not compile; this patch needs to be
> > applied first:
> > http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg02527.html
> >
> >  Makefile.objs          |    1 +
> >  block.c                |    2 +-
> >  block.h                |    1 +
> >  block/add-cow.c        |  429 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  block_int.h            |    1 +
> >  docs/specs/add-cow.txt |   72 ++++++++
> >  6 files changed, 505 insertions(+), 1 deletions(-)
> >  create mode 100644 block/add-cow.c
> >  create mode 100644 docs/specs/add-cow.txt
> >
> >
> > +    s->bitmap_size = ((bs->total_sectors + 7) >> 3);
> > +    s->bitmap = qemu_blockalign(bs, s->bitmap_size);
> > +
> > +    ret = bdrv_pread(bs->file, sizeof(header), s->bitmap,
> > +                     s->bitmap_size);
> > +    if (ret != s->bitmap_size) {
> > +        goto fail;
> > +    }
>
> As noted previously, it is not acceptable to read the entire bitmap into
> memory since it might be very large. A cache, which limits the in-memory
> size of the bitmap, must be created. In the qcow2-cache.c file you can
> find an example (that's the qcow2 metadata cache). You can divide the
> bitmap into chunks of, say, 4k, and have:
>
> int is_bit_set(int64_t bitnum, BlockDriverState *bs)
> {
>     int64_t bitmap_entry = bitnum >> bits_per_entry;
>
>     if (!is_in_bitmap_cache(bs, bitmap_entry))
>         read_from_disk(bs, bitmap_entry);
>
>     return lookup_bitmap_cache(bs, bitnum);
> }
>
> And then limit the cache to a few megabytes.
>
> Also, when setting a bit you must update the cache and write
> to disk.
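
Marcelo's is_bit_set() example above boils down to something like the
following. This is only a standalone sketch - BitmapCache, CHUNK_SIZE and
the other names are hypothetical and not from the patch, it keeps just one
cached chunk instead of an LRU over a few megabytes, and a real driver
would go through bdrv_pread()/bdrv_pwrite() and a structure along the lines
of block/qcow2-cache.c rather than plain pread()/pwrite():

#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_SIZE      4096                  /* bytes of bitmap per cache entry */
#define BITS_PER_CHUNK  (CHUNK_SIZE * 8)

typedef struct BitmapCache {
    int     fd;                 /* image file descriptor */
    off_t   bitmap_offset;      /* file offset where the bitmap starts */
    int64_t cached_chunk;       /* chunk index currently in memory, -1 if none */
    uint8_t data[CHUNK_SIZE];
} BitmapCache;

static void bitmap_cache_init(BitmapCache *c, int fd, off_t bitmap_offset)
{
    c->fd = fd;
    c->bitmap_offset = bitmap_offset;
    c->cached_chunk = -1;
    memset(c->data, 0, sizeof(c->data));
}

/* Load the chunk containing @bitnum if it is not already cached. */
static int bitmap_cache_load(BitmapCache *c, int64_t bitnum)
{
    int64_t chunk = bitnum / BITS_PER_CHUNK;

    if (chunk != c->cached_chunk) {
        off_t offset = c->bitmap_offset + chunk * CHUNK_SIZE;

        if (pread(c->fd, c->data, CHUNK_SIZE, offset) < 0) {
            return -1;
        }
        c->cached_chunk = chunk;
    }
    return 0;
}

static bool bitmap_cache_is_set(BitmapCache *c, int64_t bitnum)
{
    if (bitmap_cache_load(c, bitnum) < 0) {
        return false;   /* error handling elided in this sketch */
    }
    return c->data[(bitnum % BITS_PER_CHUNK) / 8] & (1 << (bitnum % 8));
}

/* Set a bit, then write the dirty chunk back, as described above. */
static int bitmap_cache_set(BitmapCache *c, int64_t bitnum)
{
    if (bitmap_cache_load(c, bitnum) < 0) {
        return -1;
    }
    c->data[(bitnum % BITS_PER_CHUNK) / 8] |= 1 << (bitnum % 8);

    return pwrite(c->fd, c->data, CHUNK_SIZE,
                  c->bitmap_offset + c->cached_chunk * CHUNK_SIZE) < 0 ? -1 : 0;
}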
I suspect it's also better to increase the bitmap granularity. The bitmap
should track allocation at a larger "cluster" size, for example 64 KB. That
way we reduce the number of I/O operations required to update metadata: it
cuts the amount of metadata by a factor of 65536 / 512 = 128 compared with
per-sector tracking.

If you imagine a random write workload with a 4 KB block size, there is an
advantage to a 64 KB cluster size: later I/Os may require no bitmap update
at all when an earlier operation has already allocated the cluster they
touch.

The downside of a larger bitmap granularity is that writes to an
unallocated cluster are expanded to 64 KB, but if you run a benchmark I
guess there is a threshold around 32 or 64 KB where the reduction in I/O
operations makes up for the larger I/O size. It depends on your disks.
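
To make the size arithmetic concrete, a quick back-of-the-envelope sketch
(the 100 GB image size, the offset and the variable names are only
illustrative, nothing here is from the patch):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define SECTOR_SIZE   512            /* current bitmap granularity */
#define CLUSTER_SIZE  (64 * 1024)    /* proposed bitmap granularity */

int main(void)
{
    int64_t image_size = 100LL * 1024 * 1024 * 1024;    /* example: 100 GB image */

    /* one bit per 512-byte sector vs one bit per 64 KB cluster, rounded up to bytes */
    int64_t per_sector_bitmap  = (image_size / SECTOR_SIZE  + 7) / 8;
    int64_t per_cluster_bitmap = (image_size / CLUSTER_SIZE + 7) / 8;

    printf("per-sector bitmap:  %" PRId64 " bytes\n", per_sector_bitmap);   /* 26214400, ~25 MB */
    printf("per-cluster bitmap: %" PRId64 " bytes\n", per_cluster_bitmap);  /* 204800, 200 KB */

    /* with cluster granularity the bit to test for a guest offset is simply: */
    int64_t guest_offset = 5 * 1024 * 1024;              /* arbitrary 5 MB offset */
    printf("offset %" PRId64 " -> bit %" PRId64 "\n",
           guest_offset, guest_offset / CLUSTER_SIZE);   /* bit 80 */
    return 0;
}

The 25 MB vs 200 KB difference is exactly the factor of 128 mentioned
above, and 200 KB is small enough that the cache rarely needs to evict
anything.

Stefan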