Now, in response to your question about some other 3-pass inducing
pattern, let's think back to v1, where you questioned what would happen
if a hole in the reftable gets turned into data due to a later
allocation. Let's see if I can come up with a scenario for that...
Let's stick with a cluster size of 512, and use 32-bit and 64-bit widths
as our two sizes. If we downsize from 64 to 32 bits, then every two
refblock clusters in the old table results in one call to operation()
for the new table; conversely, if we upsize, then every refblock cluster
in the old table gives two calls to operation() in the new table. The
trick at hand is to come up with some image where we punch a hole so
that on the first pass, we call operation() with refblock_empty true for
one iteration (necessarily a call later than the first, since the image
header guarantees the first refblock is not empty), but where we have
data after the hole, where it is the later data that triggers the
allocation that will finally start to fill the hole.
How about starting with an image that occupies between 1.5 and 2
refblocks worth of 32-width clusters (so an image anywhere between 193
and 256 clusters, or between 98816 and 131072 bytes). You should be
able to figure out how many clusters this consumes for L1, L2, plus 1
for header, reftable, and 2 for refblocks, in order to figure out how
many remaining clusters are dedicated to data; ideally, the data
clusters are contiguous, and occupy a swath that covers at least
clusters 126 through 192. Widening to 64-bit width will require 4
refblocks instead of 2, if all refblocks are needed. But the whole idea
of punching a hole is that we don't need a refblock if it will be
all-zero entries. So take this original image, and discard the data
clusters from physical index 126 through 192, (this is NOT the data
visible at guest offset 31744, but whatever actual offset of guest data
that maps to physical offset 31744). The old reftable now looks like {
refblock_o1 [0-125 occupied, 126 and 127 empty], refblock_o2 [128-191
empty, 192-whatever occupied, tail empty] }. With no allocations
required, this would in turn would map to the following new refblocks: {
refblock_n1 [0-64 occupied], refblock_n2 [65-125 occupied, 126-127
empty], NULL, refblock_n4 [192-whatever occupied] }. Note that we do
not need to allocate refblock_n3 because of the hole in the old
refblock; we DO end up allocating three refblocks, but in the sequence
of things, refblock_n1 and refblock_n2 are allocated while we are
visiting refblock_o1 and still fit in refblock_o1, while refblock_n4 is
not allocated until after we have already passed over the first half of
refblock_o2.
Thus, the second walk over the image will see that we need to allocate
refblock_n3 because it now contains entries (in particular, the entry
for refblock_n4, but also the 1-cluster entry for the proposed reftable
that is allocated between the walks). So, while your test used the
allocation of the reftable as the spillover point, my scenario here uses
the allocation of later refblocks as the spillover point that got missed
during the first iteration.
which means the reftable now looks like { refblock1, NULL, refblock3,
NULL... }; and where refblock1 now has at least two free entries
(possibly three, if the just-freed refblock2 happened to live before
cluster 62). is we can also free refblock2