06.04.2021 19:20, Vladimir Sementsov-Ogievskiy wrote:
06.04.2021 18:51, Vladimir Sementsov-Ogievskiy wrote:
If on nbd_close() we detach the thread (in
nbd_co_establish_connection_cancel() thr->state becomes
CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use
s->connect_thread (which is set to NULL), as the running thread may free
it at any time.
Still, nbd_co_establish_connection() does exactly this: it saves
s->connect_thread to a local variable (just for better code style) and
uses it even after the yield point, when the thread may already be detached.
Fix that. Also check that thr is non-NULL at the start of
nbd_co_establish_connection(), for safety.
After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes
impossible in the second switch in nbd_co_establish_connection().
Still, don't add an extra abort() just before the release. If it is somehow
possible to reach this "case:", it won't hurt. Anyway, a good refactoring
of all this reconnect mess will come soon.
Signed-off-by: Vladimir Sementsov-Ogievskiy<vsement...@virtuozzo.com>
---
Hi all! I hit a crash just running iotest 277 in a loop. I can't
reproduce it on master; it reproduces only on my branch with the nbd
reconnect refactorings.
Still, it seems very possible that it may crash under some conditions.
So I propose this patch for 6.0. It's written so that it's obvious that
it will not hurt:
pre-patch, on first hunk we'll just crash if thr is NULL,
on second hunk it's safe to return -1, and using thr when
s->connect_thread is already zeroed is obviously wrong.
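
To illustrate the two hunks described above, here is a minimal sketch in plain C. It is not the QEMU code: ConnectThread, ClientState, the YieldHook callback (standing in for the coroutine yield point) and establish_connection() are hypothetical stand-ins, and the detach hook frees the object immediately to model "the thread may free it at any time".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real QEMU structures. */
typedef struct ConnectThread {
    int result;                     /* pretend outcome of the connect attempt */
} ConnectThread;

typedef struct ClientState {
    ConnectThread *connect_thread;  /* set to NULL once the thread is detached */
} ClientState;

/* Stand-in for the yield point: whatever runs while we are suspended. */
typedef void (*YieldHook)(ClientState *s);

/* Nothing happens while we are suspended. */
static void hook_none(ClientState *s)
{
    (void)s;
}

/*
 * Models cancel-with-detach: ownership passes to the thread, which may
 * free the object at any time. Here it is freed right away.
 */
static void hook_detach(ClientState *s)
{
    free(s->connect_thread);
    s->connect_thread = NULL;
}

static int establish_connection(ClientState *s, YieldHook while_yielded)
{
    ConnectThread *thr = s->connect_thread;

    if (!thr) {
        /* First hunk: already detached, nothing to wait for. */
        return -1;
    }

    while_yielded(s);               /* yield point */

    if (!s->connect_thread) {
        /* Second hunk: detached while we slept, thr may be dangling. */
        return -1;
    }
    /* The pointer may only change to NULL, never to a new object. */
    assert(thr == s->connect_thread);

    return thr->result;
}
```

The local thr is only dereferenced after s->connect_thread has been re-read, so a detach that happens during the yield turns into a plain -1 instead of a use-after-free.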
Ha, I accidentally reinvented what Roman already does in "[PATCH 1/7] block/nbd:
avoid touching freed connect_thread".
My additional first hunk is actually not needed, as nbd_co_establish_connection() is
called after if (!nbd_client_connecting(s)) { return; }, so we should not get here
after nbd_co_establish_connection_cancel(bs, true);, which is called with
s->state set to NBD_CLIENT_QUIT.
So, it would be more honest to take Roman's patch "[PATCH 1/7] block/nbd: avoid
touching freed connect_thread" :)
Still, I like my variant, because it makes it obvious that s->connect_thread may
change only to NULL, not to some new pointer.
Eric, could you take a look? If there are no more pending block patches, I can try
to send a pull request myself.
Kevin, I see you've staged several patches for rc3. This one is quite simple,
could you add it too?
--
Best regards,
Vladimir