06.04.2021 19:20, Vladimir Sementsov-Ogievskiy wrote:
06.04.2021 18:51, Vladimir Sementsov-Ogievskiy wrote:
If on nbd_close() we detach the thread (in
nbd_co_establish_connection_cancel() thr->state becomes
CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use
s->connect_thread (which is set to NULL), as the running thread may free it
at any time.

Still, nbd_co_establish_connection() does exactly this: it saves
s->connect_thread to a local variable (just for better code style) and
uses it even after the yield point, when the thread may already be detached.

Fix that. Also check that thr is non-NULL at the start of
nbd_co_establish_connection(), for safety.

After this patch, "case CONNECT_THREAD_RUNNING_DETACHED" becomes
impossible in the second switch in nbd_co_establish_connection().
Still, don't add an extra abort() just before the release. If it is somehow
possible to reach this "case:", it won't hurt. Anyway, a good refactoring
of all this reconnect mess will come soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy<vsement...@virtuozzo.com>
---

Hi all! I hit a crash just running iotest 277 in a loop. I can't
reproduce it on master; it reproduces only on my branch with nbd
reconnect refactorings.

Still, it seems very possible that it may crash under some conditions.
So I propose this patch for 6.0. It's written so that it's obvious that
it will not hurt:

  pre-patch, on first hunk we'll just crash if thr is NULL,
  on second hunk it's safe to return -1, and using thr when
  s->connect_thread is already zeroed is obviously wrong.
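
For illustration, the pattern the two hunks guard against can be modeled in
isolation. This is a minimal sketch, not the QEMU code: the State and
ConnectThread types, establish(), and the yield_point callback (standing in
for the coroutine yield) are all hypothetical simplifications. It only shows
why the pointer has to be re-read after the yield.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the real QEMU definitions. */
typedef enum {
    CONNECT_THREAD_NONE,
    CONNECT_THREAD_RUNNING,
    CONNECT_THREAD_RUNNING_DETACHED,
    CONNECT_THREAD_SUCCESS,
} ConnectThreadState;

typedef struct ConnectThread {
    ConnectThreadState state;
} ConnectThread;

typedef struct State {
    ConnectThread *connect_thread; /* may only ever change to NULL */
} State;

/*
 * Models a close that detaches the thread while we are "yielded".
 * After detaching, the thread owns the object and may free it at any
 * time; freeing immediately models the worst case.
 */
void detach_during_yield(State *s)
{
    ConnectThread *thr = s->connect_thread;

    assert(thr != NULL);
    thr->state = CONNECT_THREAD_RUNNING_DETACHED;
    s->connect_thread = NULL;
    free(thr);
}

/* Models the thread finishing successfully while we are "yielded". */
void succeed_during_yield(State *s)
{
    s->connect_thread->state = CONNECT_THREAD_SUCCESS;
}

/* Returns 0 on success, -1 on failure or if the thread was detached. */
int establish(State *s, void (*yield_point)(State *))
{
    ConnectThread *thr = s->connect_thread;

    if (!thr) {
        return -1; /* analogue of the first hunk: detached before we started */
    }
    thr->state = CONNECT_THREAD_RUNNING;

    yield_point(s); /* arbitrary code may run here, including a detach */

    /* Analogue of the second hunk: the old local copy may be dangling. */
    thr = s->connect_thread;
    if (!thr) {
        return -1;
    }
    return thr->state == CONNECT_THREAD_SUCCESS ? 0 : -1;
}
```

Calling establish(&s, detach_during_yield) returns -1 without touching the
freed object; dropping the re-read after yield_point() would turn the final
state check into a use-after-free.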

Ha, I accidentally reinvented what Roman already does in "[PATCH 1/7] block/nbd:
avoid touching freed connect_thread".

My additional first hunk is actually not needed, as nbd_co_establish_connection() is
called after if (!nbd_client_connecting(s)) { return; }, so we should not get here
after nbd_co_establish_connection_cancel(bs, true), which is called with
s->state set to NBD_CLIENT_QUIT.
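
To spell out that ordering argument, here is a small model of it. Everything
in it is a hypothetical simplification (the Client struct, the CLIENT_* enum,
and the function names are made up for the sketch and are not the QEMU code);
it only assumes that close sets the quit state before detaching, and that the
reconnect path checks the state before trying to establish a connection.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simplified client states, loosely modeled on the thread. */
typedef enum {
    CLIENT_CONNECTING_WAIT,
    CLIENT_CONNECTING_NOWAIT,
    CLIENT_CONNECTED,
    CLIENT_QUIT,
} ClientState;

typedef struct Client {
    ClientState state;
    bool thread_detached; /* set once close has detached the thread */
    int establish_calls;  /* how many times we tried to connect */
} Client;

bool client_connecting(const Client *c)
{
    return c->state == CLIENT_CONNECTING_WAIT ||
           c->state == CLIENT_CONNECTING_NOWAIT;
}

/* Close: the state becomes QUIT first, only then is the thread detached. */
void client_close(Client *c)
{
    c->state = CLIENT_QUIT;
    c->thread_detached = true;
}

/* Reconnect path: the state guard runs before any connection attempt. */
void reconnect_attempt(Client *c)
{
    if (!client_connecting(c)) {
        return;
    }
    /* Invariant from the guard: we never get here after a detach. */
    assert(!c->thread_detached);
    c->establish_calls++;
}
```

In this model, a reconnect_attempt() after client_close() returns at the guard,
so the establish step never sees a detached thread at its start.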

So, it would be more honest to take Roman's patch "[PATCH 1/7] block/nbd: avoid 
touching freed connect_thread" :)

Still, I like my variant, because it makes it obvious that s->connect_thread may
change only to NULL, not to some new pointer.


Eric, could you take a look? If there are no more pending block patches, I can try
to send a pull request myself.


Kevin, I see you've staged several patches for rc3. This one is quite simple;
could you add it too?

--
Best regards,
Vladimir
