On Tue, Apr 06, 2021 at 06:51:14PM +0300, Vladimir Sementsov-Ogievskiy wrote: > If on nbd_close() we detach the thread (in > nbd_co_establish_connection_cancel() thr->state becomes > CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use > s->connect_thread (which is set to NULL), as running thread may free it > at any time. > > Still nbd_co_establish_connection() does exactly this: it saves > s->connect_thread to local variable (just for better code style) and > use it even after yield point, when thread may be already detached. > > Fix that. Also check thr to be non-NULL on > nbd_co_establish_connection() start for safety. > > After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes > impossible in the second switch in nbd_co_establish_connection(). > Still, don't add extra abort() just before the release. If it somehow > possible to reach this "case:" it won't hurt. Anyway, good refactoring > of all this reconnect mess will come soon. > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> > --- > > Hi all! I faced a crash, just running 277 iotest in a loop. I can't > reproduce it on master, it reproduces only on my branch with nbd > reconnect refactorings. > > Still, it seems very possible that it may crash under some conditions. > So I propose this patch for 6.0. It's written so that it's obvious that > it will not hurt: > > pre-patch, on first hunk we'll just crash if thr is NULL, > on second hunk it's safe to return -1, and using thr when > s->connect_thread is already zeroed is obviously wrong. > > block/nbd.c | 11 +++++++++++ > 1 file changed, 11 insertions(+)
Can we please get it merged in 6.0 as it's a genuine crasher fix? Reviewed-by: Roman Kagan <rvka...@yandex-team.ru>