Il 28/07/2012 15:26, Eric Blake ha scritto: > On 07/24/2012 05:04 AM, Paolo Bonzini wrote: >> HBitmaps provide an array of bits. The bits are stored as usual in an >> array of unsigned longs, but HBitmap is also optimized to provide fast >> iteration over set bits; going from one bit to the next is O(logB n) >> worst case, with B = sizeof(long) * CHAR_BIT: the result is low enough >> that the number of levels is in fact fixed. >> > >> +++ b/hbitmap.c >> + * So it is easy to move up simply by shifting the index right by >> + * log2(BITS_PER_LONG) bits. To move down, you shift the index left >> + * similarly, and add the word index within the group. Iteration uses >> + * ffs (find first set bit) to find the next word to examine; this >> + * operation can be done in constant time in most current architectures. > > Technically, ffs and friends can be done in constant time in ALL > architectures. There are some well-known bit-twiddling algorithms for > computing it in straightline C code with no conditionals (and therefore > in constant time, just with a larger constant than on platforms that > lack the builtin single instruction).
Technically, those are O(log2 BITS_PER_LONG). :) >> + >> +struct HBitmap { >> + /* Number of total bits in the bottom level. */ >> + uint64_t size; >> + >> + /* Number of set bits in the bottom level. */ >> + uint64_t count; >> + >> + /* A scaling factor. When setting or resetting bits, the bitmap will >> + * scale bit numbers right by this amount of bits. When iterating, >> + * the bitmap will scale bit numbers left by this amonut of bits. > > s/amonut/amount/ > >> + * Example of operations in a size-16, granularity-1 HBitmap: >> + * >> + * initial state 00000000 >> + * set(start=0, count=9) 11111000 (iter: 0, 2, 4, 6, 8) >> + * reset(start=1, count=3) 00111000 (iter: 4, 6, 8) >> + * set(start=9, count=2) 00111100 (iter: 4, 6, 8, 10) >> + * reset(start=5, count=5) 00000000 > > Thanks. That helps me understand tremendously over the previous version > of this patch. Basically, you are stating that rather than storing 16 > bits, you compress and only store information on 8 pages, and each > operation on a range of bits first rounds the bits to determine which > page they land in then affect the entire page; while iteration only > visits the first bit of each page. A granularity of 1 means pages are > 1<<1 or every 2 bit numbers. Yes. > Typical values of granularity will > probably then be 0 (every bit is important) or 12 (operating on 4k > pages, where touching any byte in the page is sufficient to track that > entire page as affected in the bitmap). In our case, bit indices are sector numbers, so a common value of the granularity will be for example 7=16-9 (for a 64k-coarse dirty bitmap). >> + */ >> + int granularity; >> + >> + /* A number of progressively less coarse bitmaps (i.e. level 0 is the >> + * coarsest). Each bit in level N represents a word in level N+1 that >> + * has a set bit, except the last level where each bit represents the >> + * actual bitmap. >> + */ >> + unsigned long *levels[HBITMAP_LEVELS]; > > That is, even allocating a 1-bit bitmap will still allocate > HBITMAP_LEVELS arrays, rather than trying to dynamically optimize and > reduce the number of levels necessary to hold the requested size. Fair > enough. Also because the HBitmapIter would become even more complex than it already is. >> + >> +static int hbi_count_towards(HBitmapIter *hbi, uint64_t last) >> +{ >> + uint64_t next = hbi_next_internal(hbi); >> + int n; >> + >> + /* Take it easy with the last few bits. */ >> + if (next >= (last & -BITS_PER_LONG)) { >> + return (next > last ? 0 : 1); > > You could write this as: > return next <= last; > but probably not worth the obfuscation. You probably guessed why I wrote it that way. It's at the top of the function, and such a return may give the wrong idea that the function returns a boolean. >> +void hbitmap_iter_init(HBitmapIter *hbi, HBitmap *hb, uint64_t first) >> +{ >> + int i, bit; >> + size_t pos; >> + >> + hbi->hb = hb; >> + pos = first; >> + for (i = HBITMAP_LEVELS; --i >= 0; ) { >> + bit = pos & (BITS_PER_LONG - 1); >> + pos >>= BITS_PER_LEVEL; >> + >> + /* Drop bits representing items before first. */ >> + hbi->cur[i] = hb->levels[i][pos] & ~((1UL << bit) - 1); >> + >> + /* We have already added level i+1, so the lowest set bit has >> + * been processed. Clear it. >> + */ >> + if (i != HBITMAP_LEVELS - 1) { >> + hbi->cur[i] &= ~(1UL << bit); >> + } >> + } >> + >> + hbi->pos = first >> BITS_PER_LEVEL; >> + hbi->granularity = hb->granularity; > > Do we really need hbi->granularity, or can the code get by with > hbi->hb->granularity? It's probably prematurely optimized---this way the fast path does not need to access hb at all. But it's also a little bit helpful in gdb, and it is practically free, so I'd rather leave it. >> +HBitmap *hbitmap_alloc(uint64_t size, int granularity) >> +{ >> + HBitmap *hb = g_malloc0(sizeof(struct HBitmap)); >> + int i; >> + >> + assert(granularity >= 0 && granularity < 64); > > Shouldn't this be granularity < BITS_PER_LONG? Yep, thanks. >> +++ b/hbitmap.h > >> +#define BITS_PER_LEVEL (BITS_PER_LONG == 32 ? 5 : 6) >> + >> +/* For 32-bit, the largest that fits in a 4 GiB address space. >> + * For 64-bit, the number of sectors in 1 PiB. Good luck, in >> + * either case... :) >> + */ >> +#define HBITMAP_LOG_MAX_SIZE (BITS_PER_LONG == 32 ? 34 : 41) >> + >> +/* Leave an extra bit for a sentinel. */ >> +#define HBITMAP_LEVELS ((HBITMAP_LOG_MAX_SIZE / BITS_PER_LEVEL) + 1) > > Interesting that this picks 7 levels for both 32-bit and 64-bit long > (hmm, that's why you capped HBITMAP_LOG_MAX_SIZE to the number of > sectors in 1 PiB, rather than covering the entire address space :) > >> + >> +struct HBitmapIter { >> + HBitmap *hb; >> + size_t pos; >> + int granularity; > > Again, do you really need granularity here? > >> + unsigned long cur[HBITMAP_LEVELS]; >> +}; >> + > > I did a much closer read of the code this time around, and I'm happy > that your implementation looks sound by inspection as well as has an > accompanying testsuite. Yeah, no way to be confident about this without a testsuite. Thanks for the review! Paolo >> +++ b/tests/test-hbitmap.c > >> +static void test_hbitmap_iter_granularity(TestHBitmapData *data, >> + const void *unused) >> +{ >> + HBitmapIter hbi; >> + >> + /* Note that hbitmap_test_check has to be invoked manually in this >> test. */ >> + hbitmap_test_init(data, 131072 << 7, 7); >> + hbitmap_iter_init(&hbi, data->hb, 0); >> + g_assert_cmpint(hbitmap_iter_next(&hbi), <, 0); >> + hbitmap_test_set(data, ((L2 + L1 + 1) << 7) + 8, 8); >> + hbitmap_iter_init(&hbi, data->hb, 0); >> + g_assert_cmpint(hbitmap_iter_next(&hbi), ==, (L2 + L1 + 1) << 7); > > Misleading comment, since you didn't call hbitmap_test_check. >