Re: [PATCH] tests/qemu-iotests: Rework the checks and spots using GNU sed
On Wed, Feb 16, 2022 at 12:39:06PM +0100, Thomas Huth wrote: > > > -$SED -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} > > > [0-9]{2}:[0-9]{2}:[0-9]{2}/-mm-dd hh:mm:ss/' > > > +gsed -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} > > > [0-9]{2}:[0-9]{2}:[0-9]{2}/-mm-dd hh:mm:ss/' > > > > GNU sed recommends spelling it 'sed -E', not 'sed -r', when using > > extended regex. Older POSIX did not support either spelling, but the > > next version will require -E, as many implementations have it now: > > https://www.austingroupbugs.net/view.php?id=528 > > Thanks for the pointer ... I originally checked "man 1p sed" on > my system and did not see -r or -E in there, so I thought that > this must be really something specific to GNU sed. But now that > you've mentioned this, I just double-checked the build environments > that we support with QEMU, and seems like -E should be supported > everywhere: Yay. > > So I think it should be safe to change these spots that are > using "-r" to "sed -E" (and not use gsed here). > > > Other than the fact that this was easier to write with ERE, I'm not > > seeing any other GNU-only extensions in use here; but I'd recommend > > that as long as we're touching the line, we spell it 'gsed -Ee' > > instead of -re (here, and in several other places). > > > > > _filter_qom_path() > > > { > > > -$SED -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g' > > > +gsed -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g' > > > > Here, it is our use of \+ that is a GNU sed extension, although it is > > fairly easy (but verbose) to translate that one to portable sed > > (\+ is the same as *). So gsed is correct. Then again, since we claim 'sed -E' is portable, we can get the + operator everywhere by using ERE instead of BRE (and with fewer leaning toothpicks, another reason I like ERE better than BRE). On the > > other hand, the use of [[0-9]\+\] looks ugly - it probably does NOT > > match what we meant (we have a bracket expression '[...]' that matches > > the 11 characters [ and 0-9, then '\+' to match that bracket > > expression 1 or more times, then '\]' which in its context is > > identical to ']' to match a closing ], since only opening [ needs \ > > escaping for literal treatment). What we probably meant is: > > > > '/Attached to:/s/\device\[[0-9][0-9]*]/device[N]/g' > > > > at which point normal sed would do. > > Ok ... but I'd prefer to clean such spots rather in a second step, > to make sure not to introduce bugs and to make the review easier. Yeah, fixing bugs and cleaning up consistent use of sed/gsed/$SED are worth separating. > > > _filter_qemu_io() > > > { > > > -_filter_win32 | $SED -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* > > > [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX > > > YYY\/sec and XXX ops\/sec)/" \ > > > +_filter_win32 | gsed -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* > > > [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX > > > YYY\/sec and XXX ops\/sec)/" \ > > > -e "s/: line [0-9][0-9]*: *[0-9][0-9]*\( Aborted\| > > > Killed\)/:\1/" \ > > > -e "s/qemu-io> //g" > > > > I'm not seeing anything specific to GNU sed in this (long) sed script; > > can we relax this one to plain 'sed'? Use of s#some/text## might be > > easier than having to s/some\/text//, but that's not essential. > > If I switch that to plain sed, I'm getting errors like this on NetBSD: > > --- /home/qemu/qemu-test.is2SLq/src/tests/qemu-iotests/046.out > +++ 11296-046.out.bad > @@ -66,6 +66,7 @@ > 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > wrote 65536/65536 bytes at offset 2031616 > 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > + > Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=6442450944 > backing_file=TEST_DIR/t.IMGFMT.base backing_fmt=IMGFMT Huh; not sure what happened that I didn't see. But I trust your tests as a more canonical version of "it worked on this platform's sed" than my "I don't see anything blantantly non-POSIX" ;) > > == Some concurrent requests touching the same cluster == > > So I'll keep gsed here for now - we need it for _filter_qemu_io > anyway since it's calling _filter_win32 that currently needs > gsed, too. Yeah, I think your patch is big enough to prove there are places where it really is easier to rely on gsed than to try and be portable. > > > > @@ -142,7 +142,7 @@ _do_filter_img_create() > > > # precedes ", fmt=") and the options part ($options, which starts > > > # with "fmt=") > > > # (And just echo everything before the first "^Formatting") > > > -readarray formatting_line < <($SED -e 's/, fmt=/\n/') > > > +readarray formatting_line < <(gsed -e 's/, fmt=/\n/') > > > > This one looks like it should work with plain 'sed'. > > Using normal sed does not really work for me here. For example > with NetBSD, I get errors like this: > > ---
Re: [PATCH] tests/qemu-iotests: Rework the checks and spots using GNU sed
On 15/02/2022 23.10, Eric Blake wrote: On Tue, Feb 15, 2022 at 02:20:31PM +0100, Thomas Huth wrote: Instead of failing the iotests if GNU sed is not available (or skipping them completely in the check-block.sh script), it would be better to simply skip the bash-based tests that rely on GNU sed, so that the other tests could still be run. Thus we now explicitely use "gsed" (either as direct program or as a wrapper around "sed" if it's the GNU version) in the spots that rely on the GNU sed behavior. Then we also remove the sed checks from the check-block.sh script, so that "make check-block" can now be run on systems without GNU sed, too. ... diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter index 935217aa65..a3b1708adc 100644 --- a/tests/qemu-iotests/common.filter +++ b/tests/qemu-iotests/common.filter @@ -21,58 +21,58 @@ _filter_date() { -$SED -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}/-mm-dd hh:mm:ss/' +gsed -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}/-mm-dd hh:mm:ss/' GNU sed recommends spelling it 'sed -E', not 'sed -r', when using extended regex. Older POSIX did not support either spelling, but the next version will require -E, as many implementations have it now: https://www.austingroupbugs.net/view.php?id=528 Thanks for the pointer ... I originally checked "man 1p sed" on my system and did not see -r or -E in there, so I thought that this must be really something specific to GNU sed. But now that you've mentioned this, I just double-checked the build environments that we support with QEMU, and seems like -E should be supported everywhere: macOS 11: $ sed --help sed: illegal option -- - usage: sed script [-Ealnru] [-i extension] [file ...] sed [-Ealnu] [-i extension] [-e script] ... [-f script_file] ... [file ...] NetBSD 9.2: $ sed --help sed: unknown option -- - Usage: sed [-aElnru] command [file ...] sed [-aElnru] [-e command] [-f command_file] [-I[extension]] [-i[extension]] [file ...] FreeBSD 12.3: $ sed --help sed: illegal option -- - usage: sed script [-Ealnru] [-i extension] [file ...] sed [-Ealnu] [-i extension] [-e script] ... [-f script_file] ... [file ...] OpenBSD 7.0: $ sed --help sed: unknown option -- - usage: sed [-aEnru] [-i[extension]] command [file ...] sed [-aEnru] [-e command] [-f command_file] [-i[extension]] [file ...] Illumos: Has -E according to https://illumos.org/man/1/sed Busybox: Has -E according to https://www.commandlinux.com/man-page/man1/busybox.1.html Haiku: Seems to use GNU sed, so -E is available. We likely never will run the iotests on Windows, so I did not check those (but I assume MSYS and friends are using GNU sed, too). So I think it should be safe to change these spots that are using "-r" to "sed -E" (and not use gsed here). Other than the fact that this was easier to write with ERE, I'm not seeing any other GNU-only extensions in use here; but I'd recommend that as long as we're touching the line, we spell it 'gsed -Ee' instead of -re (here, and in several other places). _filter_qom_path() { -$SED -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g' +gsed -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g' Here, it is our use of \+ that is a GNU sed extension, although it is fairly easy (but verbose) to translate that one to portable sed (\+ is the same as *). So gsed is correct. On the other hand, the use of [[0-9]\+\] looks ugly - it probably does NOT match what we meant (we have a bracket expression '[...]' that matches the 11 characters [ and 0-9, then '\+' to match that bracket expression 1 or more times, then '\]' which in its context is identical to ']' to match a closing ], since only opening [ needs \ escaping for literal treatment). What we probably meant is: '/Attached to:/s/\device\[[0-9][0-9]*]/device[N]/g' at which point normal sed would do. Ok ... but I'd prefer to clean such spots rather in a second step, to make sure not to introduce bugs and to make the review easier. # Removes \r from messages _filter_win32() { -$SED -e 's/\r//g' +gsed -e 's/\r//g' Yep, \r is another GNU sed extension. } # sanitize qemu-io output _filter_qemu_io() { -_filter_win32 | $SED -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX YYY\/sec and XXX ops\/sec)/" \ +_filter_win32 | gsed -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX YYY\/sec and XXX ops\/sec)/" \ -e "s/: line [0-9][0-9]*: *[0-9][0-9]*\( Aborted\| Killed\)/:\1/" \ -e "s/qemu-io> //g" I'm not seeing anything specific to GNU sed in this (long) sed script; can we relax this one to plain 'sed'? Use of s#some/text## might be easier than having to s/some\/text//, but that's not essential. If I switch that to plain
Re: [PATCH] tests/qemu-iotests: Rework the checks and spots using GNU sed
On Tue, Feb 15, 2022 at 02:20:31PM +0100, Thomas Huth wrote: > Instead of failing the iotests if GNU sed is not available (or skipping > them completely in the check-block.sh script), it would be better to > simply skip the bash-based tests that rely on GNU sed, so that the other > tests could still be run. Thus we now explicitely use "gsed" (either as > direct program or as a wrapper around "sed" if it's the GNU version) > in the spots that rely on the GNU sed behavior. Then we also remove the > sed checks from the check-block.sh script, so that "make check-block" > can now be run on systems without GNU sed, too. > > Signed-off-by: Thomas Huth > --- > I've checked that this works fine with: > make vm-build-netbsd TARGET_LIST=x86_64-softmmu BUILD_TARGET=check-block > make vm-build-freebsd TARGET_LIST=x86_64-softmmu BUILD_TARGET=check-block > and with the macOS targets in our CI. > > tests/check-block.sh | 12 -- > tests/qemu-iotests/271 | 2 +- > tests/qemu-iotests/common.filter | 74 > tests/qemu-iotests/common.rc | 45 +-- > 4 files changed, 61 insertions(+), 72 deletions(-) > > +++ b/tests/qemu-iotests/271 > @@ -896,7 +896,7 @@ _make_test_img -o extended_l2=on 1M > # Second and third writes in _concurrent_io() are independent and may finish > in > # different order. So, filter offset out to match both possible variants. > _concurrent_io | $QEMU_IO | _filter_qemu_io | \ > -$SED -e 's/\(20480\|40960\)/OFFSET/' > +sed -e 's/\(20480\|40960\)/OFFSET/' Looks like a portable sed script, so 'sed' instead of 'gsed' here is fine. > _concurrent_verify | $QEMU_IO | _filter_qemu_io > > # success, all done > diff --git a/tests/qemu-iotests/common.filter > b/tests/qemu-iotests/common.filter > index 935217aa65..a3b1708adc 100644 > --- a/tests/qemu-iotests/common.filter > +++ b/tests/qemu-iotests/common.filter > @@ -21,58 +21,58 @@ > > _filter_date() > { > -$SED -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} > [0-9]{2}:[0-9]{2}:[0-9]{2}/-mm-dd hh:mm:ss/' > +gsed -re 's/[0-9]{4}-[0-9]{2}-[0-9]{2} > [0-9]{2}:[0-9]{2}:[0-9]{2}/-mm-dd hh:mm:ss/' GNU sed recommends spelling it 'sed -E', not 'sed -r', when using extended regex. Older POSIX did not support either spelling, but the next version will require -E, as many implementations have it now: https://www.austingroupbugs.net/view.php?id=528 Other than the fact that this was easier to write with ERE, I'm not seeing any other GNU-only extensions in use here; but I'd recommend that as long as we're touching the line, we spell it 'gsed -Ee' instead of -re (here, and in several other places). > _filter_qom_path() > { > -$SED -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g' > +gsed -e '/Attached to:/s/\device[[0-9]\+\]/device[N]/g' Here, it is our use of \+ that is a GNU sed extension, although it is fairly easy (but verbose) to translate that one to portable sed (\+ is the same as *). So gsed is correct. On the other hand, the use of [[0-9]\+\] looks ugly - it probably does NOT match what we meant (we have a bracket expression '[...]' that matches the 11 characters [ and 0-9, then '\+' to match that bracket expression 1 or more times, then '\]' which in its context is identical to ']' to match a closing ], since only opening [ needs \ escaping for literal treatment). What we probably meant is: '/Attached to:/s/\device\[[0-9][0-9]*]/device[N]/g' at which point normal sed would do. > } > > # replace occurrences of the actual TEST_DIR value with TEST_DIR > _filter_testdir() > { > -$SED -e "s#$TEST_DIR/#TEST_DIR/#g" \ > - -e "s#$SOCK_DIR/#SOCK_DIR/#g" \ > - -e "s#SOCK_DIR/fuse-#TEST_DIR/#g" > +sed -e "s#$TEST_DIR/#TEST_DIR/#g" \ > +-e "s#$SOCK_DIR/#SOCK_DIR/#g" \ > +-e "s#SOCK_DIR/fuse-#TEST_DIR/#g" And this one indeed looks portable to POSIX (unless $TEST_DIR contains weird stuff by accident). > # Removes \r from messages > _filter_win32() > { > -$SED -e 's/\r//g' > +gsed -e 's/\r//g' Yep, \r is another GNU sed extension. > } > > # sanitize qemu-io output > _filter_qemu_io() > { > -_filter_win32 | $SED -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* > [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX > YYY\/sec and XXX ops\/sec)/" \ > +_filter_win32 | gsed -e "s/[0-9]* ops\; [0-9/:. sec]* ([0-9/.inf]* > [EPTGMKiBbytes]*\/sec and [0-9/.inf]* ops\/sec)/X ops\; XX:XX:XX.X (XXX > YYY\/sec and XXX ops\/sec)/" \ > -e "s/: line [0-9][0-9]*: *[0-9][0-9]*\( Aborted\| Killed\)/:\1/" \ > -e "s/qemu-io> //g" I'm not seeing anything specific to GNU sed in this (long) sed script; can we relax this one to plain 'sed'? Use of s#some/text## might be easier than having to s/some\/text//, but that's not essential. > } > @@ -80,7 +80,7 @@ _filter_qemu_io() > # replace occurrences of QEMU_PROG with "qemu" > _filter_qemu() > { > -