Re: [PATCH for-6.0] block/nbd: fix possible use after free of s->connect_thread
On 13.04.21 14:19, Vladimir Sementsov-Ogievskiy wrote: 13.04.2021 14:53, Max Reitz wrote: On 06.04.21 17:51, Vladimir Sementsov-Ogievskiy wrote: If on nbd_close() we detach the thread (in nbd_co_establish_connection_cancel() thr->state becomes CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use s->connect_thread (which is set to NULL), as running thread may free it at any time. Still nbd_co_establish_connection() does exactly this: it saves s->connect_thread to local variable (just for better code style) and use it even after yield point, when thread may be already detached. Fix that. Also check thr to be non-NULL on nbd_co_establish_connection() start for safety. After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes impossible in the second switch in nbd_co_establish_connection(). Still, don't add extra abort() just before the release. If it somehow possible to reach this "case:" it won't hurt. Anyway, good refactoring of all this reconnect mess will come soon. Signed-off-by: Vladimir Sementsov-Ogievskiy --- Hi all! I faced a crash, just running 277 iotest in a loop. I can't reproduce it on master, it reproduces only on my branch with nbd reconnect refactorings. Still, it seems very possible that it may crash under some conditions. So I propose this patch for 6.0. It's written so that it's obvious that it will not hurt: pre-patch, on first hunk we'll just crash if thr is NULL, on second hunk it's safe to return -1, and using thr when s->connect_thread is already zeroed is obviously wrong. block/nbd.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/block/nbd.c b/block/nbd.c index c26dc5a54f..1d4668d42d 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -443,6 +443,11 @@ nbd_co_establish_connection(BlockDriverState *bs, Error **errp) BDRVNBDState *s = bs->opaque; NBDConnectThread *thr = s->connect_thread; + if (!thr) { + /* detached */ + return -1; + } + qemu_mutex_lock(&thr->mutex); switch (thr->state) { First, it is a bit strange not to set *errp in these cases. Oops, right! ashamed) OK, so who cares. It wouldn’t do anything anyway. Apart from that, all the changes do is to turn use after frees or immediate NULL dereferences into clean errors. I can’t see any resources that should be cleaned up, so I hope Coverity won’t hate me for taking this patch. And then we’ll see whether Peter will take the pull request... Max
Re: [PATCH for-6.0] block/nbd: fix possible use after free of s->connect_thread
13.04.2021 14:53, Max Reitz wrote: On 06.04.21 17:51, Vladimir Sementsov-Ogievskiy wrote: If on nbd_close() we detach the thread (in nbd_co_establish_connection_cancel() thr->state becomes CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use s->connect_thread (which is set to NULL), as running thread may free it at any time. Still nbd_co_establish_connection() does exactly this: it saves s->connect_thread to local variable (just for better code style) and use it even after yield point, when thread may be already detached. Fix that. Also check thr to be non-NULL on nbd_co_establish_connection() start for safety. After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes impossible in the second switch in nbd_co_establish_connection(). Still, don't add extra abort() just before the release. If it somehow possible to reach this "case:" it won't hurt. Anyway, good refactoring of all this reconnect mess will come soon. Signed-off-by: Vladimir Sementsov-Ogievskiy --- Hi all! I faced a crash, just running 277 iotest in a loop. I can't reproduce it on master, it reproduces only on my branch with nbd reconnect refactorings. Still, it seems very possible that it may crash under some conditions. So I propose this patch for 6.0. It's written so that it's obvious that it will not hurt: pre-patch, on first hunk we'll just crash if thr is NULL, on second hunk it's safe to return -1, and using thr when s->connect_thread is already zeroed is obviously wrong. block/nbd.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/block/nbd.c b/block/nbd.c index c26dc5a54f..1d4668d42d 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -443,6 +443,11 @@ nbd_co_establish_connection(BlockDriverState *bs, Error **errp) BDRVNBDState *s = bs->opaque; NBDConnectThread *thr = s->connect_thread; + if (!thr) { + /* detached */ + return -1; + } + qemu_mutex_lock(&thr->mutex); switch (thr->state) { First, it is a bit strange not to set *errp in these cases. Oops, right! ashamed) I don’t think it’d make a difference anyway, but now that I look into it... The error would be propagated to s->connect_err, but that isn’t used anywhere, so... What’s the point? Yes it's unused.. That's to be improved later. Anyway. What I really wanted to ask is: nbd_co_establish_connection() may create a thread, which accesses the thread. Is that a problem? Like, can this happen: - nbd_co_establish_connection(): thr->state := THREAD_RUNNING - nbd_co_establish_connection(): thread is created - nbd_co_establish_connection(): thr->mutex unlocked - connect_thread_func(): thr->mutex locked - connect_thread_func(): thr->state := something else - connect_thread_func(): thr->mutex unlocked - nbd_co_establish_connection(): yields - (As I understood it, this yield leads to nbd_co_establish_connection_cancel() continuing) - nbd_co_EC_cancel(): thr->mutex locked - nbd_co_EC_cancel(): do_free true - nbd_co_EC_cancel(): nbd_free_connect_thread() - connect_thread_func(): nbd_free_connect_thread() I think not. Here we are safe: connect_thread_func will only do free if thread in CONNECT_THREAD_RUNNING_DETACHED state. Thread becomes CONNECT_THREAD_RUNNING_DETACHED only on one code path in nbd_co_establish_connection_cancel(), and on that path do_free is false. OK, what you say is possible if we call nbd_co_establish_connection_cancel() twice with detach=true. But that should not be possible if it is called only from .bdrv_close (indirectly) of nbd driver. The problem is that nbd_co_EC_cancel() may free the thread state, and then nbd_co_establish_connection() reuses saved thr local varible after yield. Still (as I noted before), I've never reproduced it on master, only on my branch with some modifications. Still it seems possible anyway. -- Best regards, Vladimir
Re: [PATCH for-6.0] block/nbd: fix possible use after free of s->connect_thread
On 06.04.21 17:51, Vladimir Sementsov-Ogievskiy wrote: If on nbd_close() we detach the thread (in nbd_co_establish_connection_cancel() thr->state becomes CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use s->connect_thread (which is set to NULL), as running thread may free it at any time. Still nbd_co_establish_connection() does exactly this: it saves s->connect_thread to local variable (just for better code style) and use it even after yield point, when thread may be already detached. Fix that. Also check thr to be non-NULL on nbd_co_establish_connection() start for safety. After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes impossible in the second switch in nbd_co_establish_connection(). Still, don't add extra abort() just before the release. If it somehow possible to reach this "case:" it won't hurt. Anyway, good refactoring of all this reconnect mess will come soon. Signed-off-by: Vladimir Sementsov-Ogievskiy --- Hi all! I faced a crash, just running 277 iotest in a loop. I can't reproduce it on master, it reproduces only on my branch with nbd reconnect refactorings. Still, it seems very possible that it may crash under some conditions. So I propose this patch for 6.0. It's written so that it's obvious that it will not hurt: pre-patch, on first hunk we'll just crash if thr is NULL, on second hunk it's safe to return -1, and using thr when s->connect_thread is already zeroed is obviously wrong. block/nbd.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/block/nbd.c b/block/nbd.c index c26dc5a54f..1d4668d42d 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -443,6 +443,11 @@ nbd_co_establish_connection(BlockDriverState *bs, Error **errp) BDRVNBDState *s = bs->opaque; NBDConnectThread *thr = s->connect_thread; +if (!thr) { +/* detached */ +return -1; +} + qemu_mutex_lock(&thr->mutex); switch (thr->state) { First, it is a bit strange not to set *errp in these cases. I don’t think it’d make a difference anyway, but now that I look into it... The error would be propagated to s->connect_err, but that isn’t used anywhere, so... What’s the point? Anyway. What I really wanted to ask is: nbd_co_establish_connection() may create a thread, which accesses the thread. Is that a problem? Like, can this happen: - nbd_co_establish_connection(): thr->state := THREAD_RUNNING - nbd_co_establish_connection(): thread is created - nbd_co_establish_connection(): thr->mutex unlocked - connect_thread_func(): thr->mutex locked - connect_thread_func(): thr->state := something else - connect_thread_func(): thr->mutex unlocked - nbd_co_establish_connection(): yields - (As I understood it, this yield leads to nbd_co_establish_connection_cancel() continuing) - nbd_co_EC_cancel(): thr->mutex locked - nbd_co_EC_cancel(): do_free true - nbd_co_EC_cancel(): nbd_free_connect_thread() - connect_thread_func(): nbd_free_connect_thread() ? Max
Re: [PATCH for-6.0] block/nbd: fix possible use after free of s->connect_thread
12.04.2021 11:45, Roman Kagan wrote: On Tue, Apr 06, 2021 at 06:51:14PM +0300, Vladimir Sementsov-Ogievskiy wrote: If on nbd_close() we detach the thread (in nbd_co_establish_connection_cancel() thr->state becomes CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use s->connect_thread (which is set to NULL), as running thread may free it at any time. Still nbd_co_establish_connection() does exactly this: it saves s->connect_thread to local variable (just for better code style) and use it even after yield point, when thread may be already detached. Fix that. Also check thr to be non-NULL on nbd_co_establish_connection() start for safety. After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes impossible in the second switch in nbd_co_establish_connection(). Still, don't add extra abort() just before the release. If it somehow possible to reach this "case:" it won't hurt. Anyway, good refactoring of all this reconnect mess will come soon. Signed-off-by: Vladimir Sementsov-Ogievskiy --- Hi all! I faced a crash, just running 277 iotest in a loop. I can't reproduce it on master, it reproduces only on my branch with nbd reconnect refactorings. Still, it seems very possible that it may crash under some conditions. So I propose this patch for 6.0. It's written so that it's obvious that it will not hurt: pre-patch, on first hunk we'll just crash if thr is NULL, on second hunk it's safe to return -1, and using thr when s->connect_thread is already zeroed is obviously wrong. block/nbd.c | 11 +++ 1 file changed, 11 insertions(+) Can we please get it merged in 6.0 as it's a genuine crasher fix? Reviewed-by: Roman Kagan Thanks! I'm applying it to my nbd branch and will send PULL request for rc3 today -- Best regards, Vladimir
Re: [PATCH for-6.0] block/nbd: fix possible use after free of s->connect_thread
On Tue, Apr 06, 2021 at 06:51:14PM +0300, Vladimir Sementsov-Ogievskiy wrote: > If on nbd_close() we detach the thread (in > nbd_co_establish_connection_cancel() thr->state becomes > CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use > s->connect_thread (which is set to NULL), as running thread may free it > at any time. > > Still nbd_co_establish_connection() does exactly this: it saves > s->connect_thread to local variable (just for better code style) and > use it even after yield point, when thread may be already detached. > > Fix that. Also check thr to be non-NULL on > nbd_co_establish_connection() start for safety. > > After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes > impossible in the second switch in nbd_co_establish_connection(). > Still, don't add extra abort() just before the release. If it somehow > possible to reach this "case:" it won't hurt. Anyway, good refactoring > of all this reconnect mess will come soon. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > > Hi all! I faced a crash, just running 277 iotest in a loop. I can't > reproduce it on master, it reproduces only on my branch with nbd > reconnect refactorings. > > Still, it seems very possible that it may crash under some conditions. > So I propose this patch for 6.0. It's written so that it's obvious that > it will not hurt: > > pre-patch, on first hunk we'll just crash if thr is NULL, > on second hunk it's safe to return -1, and using thr when > s->connect_thread is already zeroed is obviously wrong. > > block/nbd.c | 11 +++ > 1 file changed, 11 insertions(+) Can we please get it merged in 6.0 as it's a genuine crasher fix? Reviewed-by: Roman Kagan
Re: [PATCH for-6.0] block/nbd: fix possible use after free of s->connect_thread
06.04.2021 19:20, Vladimir Sementsov-Ogievskiy wrote: 06.04.2021 18:51, Vladimir Sementsov-Ogievskiy wrote: If on nbd_close() we detach the thread (in nbd_co_establish_connection_cancel() thr->state becomes CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use s->connect_thread (which is set to NULL), as running thread may free it at any time. Still nbd_co_establish_connection() does exactly this: it saves s->connect_thread to local variable (just for better code style) and use it even after yield point, when thread may be already detached. Fix that. Also check thr to be non-NULL on nbd_co_establish_connection() start for safety. After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes impossible in the second switch in nbd_co_establish_connection(). Still, don't add extra abort() just before the release. If it somehow possible to reach this "case:" it won't hurt. Anyway, good refactoring of all this reconnect mess will come soon. Signed-off-by: Vladimir Sementsov-Ogievskiy --- Hi all! I faced a crash, just running 277 iotest in a loop. I can't reproduce it on master, it reproduces only on my branch with nbd reconnect refactorings. Still, it seems very possible that it may crash under some conditions. So I propose this patch for 6.0. It's written so that it's obvious that it will not hurt: pre-patch, on first hunk we'll just crash if thr is NULL, on second hunk it's safe to return -1, and using thr when s->connect_thread is already zeroed is obviously wrong. Ha, occasionally I reinvented what Roman already does in "[PATCH 1/7] block/nbd: avoid touching freed connect_thread". My additional first hunk actually is not needed, as nbd_co_establish_connection is called after if (!nbd_clisent_connecting(s)) { return; }, so we should not be here after nbd_co_establish_connection_cancel(bs, true); which is called with s->state set to NBD_CLIENT_QUIT. So, it would be more honest to take Roman's patch "[PATCH 1/7] block/nbd: avoid touching freed connect_thread" :) Still, I like my variant, because it make obvious that s->connect_thread may change only to NULL, not to some new pointer. Eric, could you take a look? If there no more pending block patches, I can try to send pull-request myself Kevin, I see you've staged several patches for rc3.. This one is quite simple, could you add it too? -- Best regards, Vladimir
Re: [PATCH for-6.0] block/nbd: fix possible use after free of s->connect_thread
06.04.2021 18:51, Vladimir Sementsov-Ogievskiy wrote: If on nbd_close() we detach the thread (in nbd_co_establish_connection_cancel() thr->state becomes CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use s->connect_thread (which is set to NULL), as running thread may free it at any time. Still nbd_co_establish_connection() does exactly this: it saves s->connect_thread to local variable (just for better code style) and use it even after yield point, when thread may be already detached. Fix that. Also check thr to be non-NULL on nbd_co_establish_connection() start for safety. After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes impossible in the second switch in nbd_co_establish_connection(). Still, don't add extra abort() just before the release. If it somehow possible to reach this "case:" it won't hurt. Anyway, good refactoring of all this reconnect mess will come soon. Signed-off-by: Vladimir Sementsov-Ogievskiy --- Hi all! I faced a crash, just running 277 iotest in a loop. I can't reproduce it on master, it reproduces only on my branch with nbd reconnect refactorings. Still, it seems very possible that it may crash under some conditions. So I propose this patch for 6.0. It's written so that it's obvious that it will not hurt: pre-patch, on first hunk we'll just crash if thr is NULL, on second hunk it's safe to return -1, and using thr when s->connect_thread is already zeroed is obviously wrong. Ha, occasionally I reinvented what Roman already does in "[PATCH 1/7] block/nbd: avoid touching freed connect_thread". My additional first hunk actually is not needed, as nbd_co_establish_connection is called after if (!nbd_clisent_connecting(s)) { return; }, so we should not be here after nbd_co_establish_connection_cancel(bs, true); which is called with s->state set to NBD_CLIENT_QUIT. So, it would be more honest to take Roman's patch "[PATCH 1/7] block/nbd: avoid touching freed connect_thread" :) Eric, could you take a look? If there no more pending block patches, I can try to send pull-request myself -- Best regards, Vladimir