Re: [Qemu-devel] [PATCH 5/5] iotests: Let 233 run concurrently

Max Reitz Tue, 07 May 2019 13:50:08 -0700

On 07.05.19 22:38, Eric Blake wrote:
> On 5/7/19 1:36 PM, Max Reitz wrote:
>> common.nbd's nbd_server_set_tcp_port() tries to find a free port, and
>> then uses it for the whole test run.  However, this is racey because
> 
> racy
> 
>> even if the port was free at the beginning, there is no guarantee it
>> will continue to be available.  Therefore, 233 currently cannot reliably
>> be run concurrently with other NBD TCP tests.
>>
>> This patch addresses the problem by dropping nbd_server_set_tcp_port(),
>> and instead finding a new port every time nbd_server_start_tcp_socket()
>> is invoked.  For this, we run qemu-nbd with --fork and on error evaluate
>> the output to see whether it contains "Address already in use".  If so,
>> we try the next port.
>>
>> On success, we still want to continually redirect the output from
>> qemu-nbd to stderr.  To achieve both, we redirect qemu-nbd's stderr to a
>> FIFO that we then open in bash.  If the parent process exits with status
>> 0 (which means that the server has started successfully), we launch a
>> background cat process that copies the FIFO to stderr.  On failure, we
>> read the whole content into a variable and then evaluate it.
>>
>> While at it, use --fork in nbd_server_start_unix_socket(), too.  Doing
>> so allows us to drop nbd_server_wait_for_*_socket().
>>
>> Note that the reason common.nbd did not use --fork before is that
>> qemu-nbd did not have --pid-file.
>>
>> Signed-off-by: Max Reitz <mre...@redhat.com>
>> ---
>>  tests/qemu-iotests/233        |  1 -
>>  tests/qemu-iotests/common.nbd | 93 ++++++++++++++++-------------------
>>  2 files changed, 42 insertions(+), 52 deletions(-)
>>
> 
>> @@ -34,76 +39,62 @@ nbd_server_stop()
>>          fi
>>      fi
>>      rm -f "$nbd_unix_socket"
>> -}
>> -
>> -nbd_server_wait_for_unix_socket()
>> -{
> ...
>> -    echo "Failed in check of unix socket created by qemu-nbd"
>> -    exit 1
>> +    rm -f "$nbd_stderr_fifo"
> 
> You could use a single 'rm -f "$nbd_unix_socket" "$nbd_stderr_fifo"'.
> That's cosmetic, though.
> 
> Are we sure that even on failure, our fifo will not fill up and cause
> deadlock? If the failing qemu-nbd has so much output as to be non-atomic
> so that it blocks waiting for a reader, but we don't read anything until
> after qemu-nbd exits after forking the daemon, then we have deadlock.


Hm, right.  I don’t think it will happen, but if it does, it won’t be
because of an “Address already in use”.  So if it did happen, the test
should fail anyway.

Of course, a hang is not the nicest way to fail a test, but I think as
long as we don’t think it will be a problem, it should be fine.

(The alternative I can think of would be to start a background cat that
copies data over to a log file, and then kill it after the qemu-nbd
parent process has exited.  On error, we read the log; on success, we
print it to stderr and then start the cat from nbd_stderr_fifo to stderr.)

> But in the common case, I don't think qemu-nbd ever spits out that much
> in errors, even when it fails to start whether due to a socket in use or
> for other reasons.  And even if it does hang, it is our testsuite (and
> our CI tools will probably notice it), rather than our main code.
> 
> Otherwise, it's a lot of shell code with quite a few bash-isms, but we
> already require bash, and I didn't spot anything blatantly wrong.
> 
> Reviewed-by: Eric Blake <ebl...@redhat.com>

Thanks again!

I’ll prepare the v2.

Max

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 5/5] iotests: Let 233 run concurrently

Reply via email to