On Wednesday 24 Sep 2014 at 10:44:14 (+0100), Stefan Hajnoczi wrote:
> The blkdebug block driver is undocumented. Documenting it is worthwhile
> since it offers powerful error injection features that are used by
> qemu-iotests test cases.
>
> This document will make it easier for people to learn about and use
> blkdebug.
>
> Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
> ---
> v3:
>  * Fix tab space damage [Eric]
>  * Rephrase event_names[] as full list of events [Eric]
>  * Explain that blkdebug state is not observable from outside [Eric]
>  * Clarify state 0 and state 1 [Eric]
>
> v2:
>  * Added GPL v2 or later license and Red Hat copyright [Eric]
>  * Expanded ini rules file explanation [Paolo]
>  * Added note that errno values depend on the host [Eric]
>
>  docs/blkdebug.txt | 161 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 161 insertions(+)
>  create mode 100644 docs/blkdebug.txt
>
> diff --git a/docs/blkdebug.txt b/docs/blkdebug.txt
> new file mode 100644
> index 0000000..5dde072
> --- /dev/null
> +++ b/docs/blkdebug.txt
> @@ -0,0 +1,161 @@
> +Block I/O error injection using blkdebug
> +----------------------------------------
> +Copyright (C) 2014 Red Hat Inc
> +
> +This work is licensed under the terms of the GNU GPL, version 2 or later. See
> +the COPYING file in the top-level directory.
> +
> +The blkdebug block driver is a rule-based error injection engine. It can be
> +used to exercise error code paths in block drivers including ENOSPC (out of
> +space) and EIO.
> +
> +This document gives an overview of the features available in blkdebug.
> +
> +Background
> +----------
> +Block drivers have many error code paths that handle I/O errors. Image formats
> +are especially complex since metadata I/O errors during cluster allocation or
> +while updating tables happen halfway through request processing and require
> +discipline to keep image files consistent.
> +
> +Error injection allows test cases to trigger I/O errors at specific points.
> +This way, all error paths can be tested to make sure they are correct.
> +
> +Rules
> +-----
> +The blkdebug block driver takes a list of "rules" that tell the error injection
> +engine when to fail an I/O request.
> +
> +Each I/O request is evaluated against the rules. If a rule matches the request
> +then its "action" is executed.
> +
> +Rules can be placed in a configuration file; the configuration file
> +follows the same .ini-like format used by QEMU's -readconfig option, and
> +each section of the file represents a rule.
> +
> +The following configuration file defines a single rule:
> +
> +  $ cat blkdebug.conf
> +  [inject-error]
> +  event = "read_aio"
> +  errno = "28"
> +
> +This rule fails all aio read requests with ENOSPC (28). Note that the errno
> +value depends on the host. On Linux, see
> +/usr/include/asm-generic/errno-base.h for errno values.
> +
> +Invoke QEMU as follows:
> +
> +  $ qemu-system-x86_64 \
> +        -drive if=none,cache=none,file=blkdebug:blkdebug.conf:test.img,id=drive0 \
> +        -device virtio-blk-pci,drive=drive0,id=virtio-blk-pci0
> +
> +Rules support the following attributes:
> +
> +  event - which type of operation to match (e.g. read_aio, write_aio,
> +          flush_to_os, flush_to_disk). See the "Events" section for
> +          information on events.
> +
> +  state - (optional) the engine must be in this state number in order for this
> +          rule to match. See the "State transitions" section for information
> +          on states.
> +
> +  errno - the numeric errno value to return when a request matches this rule.
> +          The errno values depend on the host since the numeric values are not
> +          standardized in the POSIX specification.
> +
> +  sector - (optional) a sector number that the request must overlap in order to
> +           match this rule
> +
> +  once - (optional, default "off") only execute this action on the first
> +         matching request
> +
> +  immediately - (optional, default "off") return a NULL BlockDriverAIOCB
> +                pointer and fail without an errno instead. This exercises the
> +                code path where BlockDriverAIOCB fails and the caller's
> +                BlockDriverCompletionFunc is not invoked.
> +
> +Events
> +------
> +Block drivers provide information about the type of I/O request they are about
> +to make so rules can match specific types of requests. For example, the qcow2
> +block driver tells blkdebug when it accesses the L1 table so rules can match
> +only L1 table accesses and not other metadata or guest data requests.
> +
> +The core events are:
> +
> +  read_aio - guest data read
> +
> +  write_aio - guest data write
> +
> +  flush_to_os - write out unwritten block driver state (e.g. cached metadata)
> +
> +  flush_to_disk - flush the host block device's disk cache
> +
> +See block/blkdebug.c:event_names[] for the full list of events. You may need
> +to grep block driver source code to understand the meaning of specific events.
> +
> +State transitions
> +-----------------
> +There are cases where more power is needed to match a particular I/O request in
> +a longer sequence of requests. For example:
> +
> +  write_aio
> +  flush_to_disk
> +  write_aio
> +
> +How do we match the 2nd write_aio but not the first? This is where state
> +transitions come in.
> +
> +The error injection engine has an integer called the "state" that always starts
> +initialized to 1. The state integer is internal to blkdebug and cannot be
> +observed from outside but rules can interact with it for powerful matching
> +behavior.
> +
> +Rules can be conditional on the current state and they can transition to a new
> +state.
> +
> +When a rule's "state" attribute is non-zero then the current state must equal
> +the attribute in order for the rule to match.
> +
> +For example, to match the 2nd write_aio:
> +
> +  [set-state]
> +  event = "write_aio"
> +  state = "1"
> +  new_state = "2"
> +
> +  [inject-error]
> +  event = "write_aio"
> +  state = "2"
> +  errno = "5"
> +
> +The first write_aio request matches the set-state rule and transitions from
> +state 1 to state 2. Once state 2 has been entered, the set-state rule no
> +longer matches since it requires state 1. But the inject-error rule now
> +matches the next write_aio request and injects EIO (5).
> +
> +State transition rules support the following attributes:
> +
> +  event - which type of operation to match (e.g. read_aio, write_aio,
> +          flush_to_os, flush_to_disk). See the "Events" section for
> +          information on events.
> +
> +  state - (optional) the engine must be in this state number in order for this
> +          rule to match
> +
> +  new_state - transition to this state number
> +
> +Suspend and resume
> +------------------
> +Exercising code paths in block drivers may require specific ordering amongst
> +concurrent requests. The "breakpoint" feature allows requests to be halted on
> +a blkdebug event and resumed later. This makes it possible to achieve
> +deterministic ordering when multiple requests are in flight.
> +
> +Breakpoints on blkdebug events are associated with a user-defined "tag" string.
> +This tag serves as an identifier by which the request can be resumed at a later
> +point.
> +
> +See the qemu-io(1) break, resume, remove_break, and wait_break commands for
> +details.
> --
> 1.9.3
>
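For anyone following along, the matching described in the "State transitions"
section quoted above can be sketched as a small Python model. This is only a
toy illustration of the documented behaviour, not QEMU's implementation: the
Rule class and run_events() function are hypothetical names, and applying a
transition only after all rules for an event have been checked is an
assumption of the model. errno.EIO is taken from the host's errno table
since, as the document notes, the numeric values depend on the host.

```python
import errno
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    """Hypothetical stand-in for one section of a blkdebug rules file."""
    kind: str           # "inject-error" or "set-state"
    event: str          # e.g. "write_aio"
    state: int = 0      # 0 means "match in any state"
    new_state: int = 0  # only used by set-state rules
    err: int = 0        # only used by inject-error rules


def run_events(rules: List[Rule], events: List[str]) -> List[Optional[int]]:
    """Return, per event, the injected errno or None if no rule fired."""
    state = 1  # the document says the state starts at 1
    results: List[Optional[int]] = []
    for event in events:
        injected = None
        next_state = state
        for rule in rules:
            if rule.event != event:
                continue
            # A non-zero "state" attribute must equal the current state.
            if rule.state != 0 and rule.state != state:
                continue
            if rule.kind == "inject-error" and injected is None:
                injected = rule.err
            elif rule.kind == "set-state":
                next_state = rule.new_state
        state = next_state  # assumption: transition applies after the event
        results.append(injected)
    return results


# The two rules from the quoted example: fail only the 2nd write_aio.
rules = [
    Rule(kind="set-state", event="write_aio", state=1, new_state=2),
    Rule(kind="inject-error", event="write_aio", state=2, err=errno.EIO),
]
trace = ["write_aio", "flush_to_disk", "write_aio"]
print(run_events(rules, trace))
```

With the request sequence from the document, only the second write_aio gets
EIO injected; the first write_aio and the flush_to_disk pass through.
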
I won't be able to spellcheck or help clarify it better than Eric, but it's
nice that this is getting documented since it's a powerful and useful feature.

Best regards

Benoît