On Tue, May 16, 2023 at 10:55:25AM -0500, Eric Blake wrote:
> > +static int
> > +evil_config (nbdkit_next_config *next, nbdkit_backend *nxdata,
> > +             const char *key, const char *value)
> > +{
> > +  if (strcmp (key, "evil") == 0 || strcmp (key, "evil-mode") == 0) {
> > +    if (strcmp (value, "cosmic-rays") == 0 ||
> > +        strcmp (value, "cosmic") == 0) {
>
> Do we want strcasecmp on these? Do we want to allow _ as a synonym to -?

We're a bit random about this. There are a few places where we use
ascii_strcasecmp, but mostly we use strcmp ...

[...]

> > +  /* Choose the block size based on the probability, so that at least
> > +   * 100 bits are expected to be corrupted in the block. Block size
> > +   * must be a power of 2.
> > +   */
> > +  block_size = next_power_of_2 ((uint64_t) (100. / evil_probability));
>
> ...and I'm worried that your block_size computation reaches a point
> where it does not do what we want if the probability gets too close
> to zero. Even if it is not division by zero, division by a subnormal
> float can easily produce overflow to infinity, which converts to
> UINT64_MAX, and next_power_of_2(UINT64_MAX) was untested in patch 3's
> unit tests, but appears like it will be '1 << (64 - 64)' == 1, which
> isn't really desirable.

Yes, this is definitely a bug. I meant to avoid division by zero, and
did in one place, but not everywhere.

I think it's worthwhile allowing the probability to be zero, not least
because it avoids a special case for the user. However, I will need to
add a test to make sure I've avoided all the problems.

Also, as you say, very small but non-zero probabilities can make
block_size blow up.

Let me fix all of that & add tests.
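Roughly what I have in mind (a sketch only, not the final patch:
choose_block_size is a hypothetical helper and MAX_BLOCK_SIZE an
illustrative cap, while next_power_of_2 and evil_probability are the
names from the patch):

/* Sketch: guard the block size calculation. Probability <= 0 is
 * allowed and simply means no corruption, so no division happens in
 * that case; very small probabilities are clamped so the division
 * cannot overflow to infinity before the uint64_t conversion.
 * MAX_BLOCK_SIZE is an illustrative cap, not a value from the patch.
 */
#define MAX_BLOCK_SIZE (64 * 1024 * 1024)

static uint64_t
choose_block_size (double evil_probability)
{
  double bs;

  if (evil_probability <= 0)
    return MAX_BLOCK_SIZE;        /* unused: no bits will be corrupted */

  bs = 100. / evil_probability;   /* expect >= 100 corrupt bits per block */
  if (bs > (double) MAX_BLOCK_SIZE)
    bs = (double) MAX_BLOCK_SIZE; /* clamp before converting to uint64_t */

  return next_power_of_2 ((uint64_t) bs);
}

With the clamp in place next_power_of_2 never sees the result of an
overflowed division, and probability 0 short-circuits before any
division at all.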
> > +
> > +    uint64_t offs, intvl, i, rand;
> > +    const uint64_t dinvp = (uint64_t) (2.0 * (1.0 / evil_probability));
>
> [1] Ah, I see - you did mean multiplication, not exponentiation. Like
> you commented elsewhere, it is not quite a true uniform distribution,
> but certainly seems sane enough for the intent.

I will clarify the comment.

> > +++ b/tests/test-evil-cosmic.sh
> > +
> > +f="test-evil-cosmic.out"
> > +rm -f $f
> > +cleanup_fn rm -f $f
> > +
> > +# 80 million zero bits in the backing disk, and the filter will
> > +# randomly flip (ie. set high) 1 in 800,000 bits, or about 100.
> > +
> > +# XXX Actually the number of set bits clusters around 80. There could
> > +# be a mistake in my calculations or the interval algorithm we use
> > +# might be biased.
>
> I have not closely inspected the math yet to see if there's an obvious
> bias (treat this as a first-pass quick review looking for obvious
> code/shell bugs, as opposed to more insidious deep analysis bugs).
> But that comment does mean we probably ought to double-check things
> prior to v2.

Interestingly, the other tests are nicely distributed around the
expected values; it's just this one test where things are out of whack.
I'll add a bit of debugging to the interval calculation etc. to see if
I can see what's going on.
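As a starting point for that debugging, here is a standalone
simulation of the interval idea as I understand it (assumptions: the
10M byte disk and 1:800,000 probability from test-evil-cosmic.sh;
rand64 is a throwaway xorshift for the simulation, not the PRNG the
filter really uses):

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

static uint64_t seed = 0x123456789abcdefULL;

/* Quick xorshift64 PRNG, for the simulation only. */
static uint64_t
rand64 (void)
{
  seed ^= seed << 13;
  seed ^= seed >> 7;
  seed ^= seed << 17;
  return seed;
}

int
main (void)
{
  const double p = 1.0 / 800000;      /* evil-probability */
  const uint64_t nr_bits = 80000000;  /* 10M byte disk = 80 Mbits */
  const uint64_t dinvp = (uint64_t) (2.0 * (1.0 / p));
  uint64_t offs, count = 0;

  /* Gaps between corrupted bits are 1 + uniform [0, dinvp), ie. mean
   * (dinvp+1)/2 ~= 1/p, so about nr_bits * p ~= 100 bits should be
   * counted as flipped.
   */
  for (offs = rand64 () % dinvp; offs < nr_bits;
       offs += 1 + rand64 () % dinvp)
    count++;

  printf ("flipped %" PRIu64 " bits, expected about %g\n",
          count, nr_bits * p);
  return 0;
}

Drawing the gap as 1 + uniform [0, dinvp) gives a mean gap of
(dinvp + 1)/2, almost exactly 1/p, so if this simulation comes out
around 100 while the filter clusters around 80 then the bias is
presumably somewhere else (eg. in the first offset or the per-block
boundary handling).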
> > +export f
> > +nbdkit -U - null 10000000 \
> > +       --filter=evil --filter=noextents \
> > +       evil=cosmic-rays evil-probability=1/800000 \
> > +       --run 'nbdcopy "$uri" $f'
> > +
> > +# This will give an approximate count of the number of set bits.
> > +
> > +zbytes="$( od -A n -w1 -v -t x1 < $f | sort | uniq -c |
> > +           $SED -n -E -e 's/([0-9]+)[[:space:]]+00[[:space:]]*$/\1/p' )"
> > +nzbits=$(( 10000000 - zbytes )); # looks wrong but actually correct ...
>
> Interesting computation!

I can probably change this to use Python code. Because I wanted to use
nbdcopy (to allow the filter to be hit by multiple threads), I wasn't
using nbdsh, so Python wasn't immediately available ...

> > +# Expect about 50 stuck-high bits.
> > +buf = h.pread(10000000, 0)
> > +bits = count_bits(buf)
> > +print("stuck high bits: %d (expected 50)" % bits)
> > +assert(bits > 20 and bits < 80)
> > +
> > +# If we read subsets they should match the contents of the buffer.
> > +buf1 = h.pread(1000, 1000)
> > +assert(buf1 == buf[1000:2000])
> > +
> > +buf1 = h.pread(10000, 999)
> > +assert(buf1 == buf[999:10999])
>
> Should we also assert that there is at least one stuck bit in those
> two ranges, and/or pick a different or larger range to improve the
> chances of that being the case?

So I think you mean pick a range where we know there is a stuck bit
before doing the read + assert. I will see. With a 1:800,000 chance,
it's not likely that the ranges as written would have stuck bits, so
we're comparing zeroes ...

> > +++ b/tests/test-evil-stuck-wires.sh
> > +
> > +# Run nbdkit with the evil filter.
> > +start_nbdkit -P $pidfile -U $sock \
> > +    --filter=evil --filter=noextents \
> > +    null 1G evil=stuck-wires evil-probability=1/10000
> > +
> > +# Reads from the filter should have 1:10,000 bits stuck high or low.
> > +# However we don't see stuck low bits are we are always reading
>
> s/are we are/since we are/

Will fix.

> Overall looks like an interesting idea for a filter.

Thanks,

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit