On Thu, Nov 11, 2021 at 06:52:30PM +0100, Kevin Wolf wrote:
> On 11.11.2021 at 16:33, Roman Kagan wrote:
> > vhost-user-blk realize only attempts to reconnect if the previous
> > connection attempt failed on "a problem with the connection and not an
> > error related to the content (which would fail again the same way in the
> > next attempt)".
> >
> > However this distinction is very subtle, and may be inadvertently broken
> > if the code changes somewhere deep down the stack and a new error gets
> > propagated up to here.
> >
> > OTOH now that the number of reconnection attempts is limited it seems
> > harmless to try reconnecting on any error.
> >
> > So relax the condition of whether to retry connecting to check for any
> > error.
> >
> > This patch amends a527e312b5 "vhost-user-blk: Implement reconnection
> > during realize".
> >
> > Signed-off-by: Roman Kagan <rvka...@yandex-team.ru>
>
> It results in less than perfect error messages. With a modified export
> that just crashes qemu-storage-daemon during get_features, I get:
>
> qemu-system-x86_64: -device vhost-user-blk-pci,chardev=c: Failed to read msg header. Read 0 instead of 12. Original request 1.
> qemu-system-x86_64: -device vhost-user-blk-pci,chardev=c: Reconnecting after error: vhost_backend_init failed: Protocol error
> qemu-system-x86_64: -device vhost-user-blk-pci,chardev=c: Reconnecting after error: Failed to connect to '/tmp/vsock': Connection refused
> qemu-system-x86_64: -device vhost-user-blk-pci,chardev=c: Reconnecting after error: Failed to connect to '/tmp/vsock': Connection refused
> qemu-system-x86_64: -device vhost-user-blk-pci,chardev=c: Failed to connect to '/tmp/vsock': Connection refused

This patch doesn't change any error messages.  Which ones specifically
became less than perfect as a result of this patch?

> I guess this might be tolerable. On the other hand, the patch doesn't
> really fix anything either, but just gets rid of possible subtleties.

The remaining patches in the series make other errors beside -EPROTO
propagate up to this point, and some (most) of them are retryable.  This
was the reason to include this patch at the beginning of the series (I
guess I should've mentioned that in the patch log).

Thanks,
Roman.
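
For readers following the thread, here is a rough, standalone sketch of the two retry policies being discussed: retrying only on -EPROTO versus retrying on any error, both bounded by a fixed number of attempts. It is not the QEMU code. The names realize_connect, retry_on_eproto_only, retry_on_any_error and always_fail, and the limit of 3 retries, are all assumptions made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumed retry limit, chosen only for this illustration. */
#define REALIZE_CONNECT_RETRIES 3

/* Hypothetical stand-in for one connection attempt: 0 or -errno. */
typedef int (*connect_fn)(void *opaque);

/* Stricter policy: retry only when the error looks like -EPROTO. */
static bool retry_on_eproto_only(int ret)
{
    return ret == -EPROTO;
}

/* Relaxed policy: retry on any failure. */
static bool retry_on_any_error(int ret)
{
    return ret < 0;
}

/*
 * Try to connect, retrying while the policy allows it and retries
 * remain. Returns the result of the last attempt and stores the
 * number of attempts made in *attempts_out.
 */
static int realize_connect(connect_fn try_connect, void *opaque,
                           bool (*should_retry)(int), int *attempts_out)
{
    int retries = REALIZE_CONNECT_RETRIES;
    int attempts = 0;
    int ret;

    for (;;) {
        ret = try_connect(opaque);
        attempts++;
        if (ret == 0 || !should_retry(ret) || retries == 0) {
            break;
        }
        retries--;
        fprintf(stderr, "Reconnecting after error: %d\n", ret);
    }

    *attempts_out = attempts;
    return ret;
}

/* Test double: every attempt fails with the error code in *opaque. */
static int always_fail(void *opaque)
{
    return *(int *)opaque;
}
```

With a backend that always refuses the connection, the relaxed policy makes one initial attempt plus three retries before giving up, while the -EPROTO-only policy gives up after the first attempt. Because the retry count is bounded, retrying on a non-retryable error costs only a few extra attempts.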