On Thu, Oct 26, 2023 at 11:51:00AM +0200, Martijn van Duren wrote: > This case is covered by the new regress' backend_get_toofew and > backend_get_toomany tests. However, even with MALLOC_OPTIONS cranked > to the max it's really hard to trigger (I had to run > backend_get_wrongorder, backend_get_toofew, backend_get_toomany > sequentially in a tight loop killing snmpd between iterations for the > best chance). > > When we receive an invalid varbindlist in a response we set the invalid > variable. This in turn calls appl_varbind_error(), but the avi_state > of the varbinds remains in APPL_VBSTATE_PENDING. Directly following > we call appl_request_downstream_free(), which sees that the varbinds > haven't been resolved, triggering a call to > appl_request_upstream_resolve(). This call in turn sees that the > error has been set and just sends out the error-response to the client > and frees the appl_request_upstream. From here we return back to > appl_response(), which also calls appl_request_upstream_resolve(), > resulting in a use after free. > > The main tool for fixing this issue is making use of > appl_request_upstream's aru_locked member, which will cause > appl_request_upstream_resolve() to return instantly. The simplest fix is > to set aru_locked before calling appl_request_downstream_free() and > unsetting it directly afterwards inside appl_response(). > > The second one is the diff proposed below, which shrinks the code. > > appl_request_upstream_free() is only called once from > appl_request_upstream_reply(). appl_request_upstream_reply() in turn > is only called by appl_request_upstream_resolve(). > appl_request_upstream_resolve() is called in 3 places: > - appl_processpdu(): to kick things off > - appl_request_downstream_free(): For when a backend disappears with > outstanding requests > - appl_response(): To kickstart the next round of resolving. > > Since appl_request_downstream_free() is always called from > appl_response(), we can leverage that function and make it call > appl_request_upstream_resolve() unconditionally. > > appl_request_downstream_free() is called from the following locations: > - appl_close(): When a backend has disappeared. > - appl_request_upstream_free(): We send out a reply early, because an > error has been detected. > - appl_response(): We received a response > > appl_request_upstream_free() can't reenter into > appl_request_upstream_resolve(), or it would potentially trigger new > appl_request_downstreams. This can be prevented by setting aru_locked > before calling appl_request_downstream_free(). > For all other cases we should rely on appl_request_upstream_resolve()'s > logic to handle varbinds in any state, so there's no reason make calls > from other contexts conditional.
Your description of the bug makes sense and your choice of resolving it as well. Thanks for the in-depth explanation, that helped a lot. Still, I must say that I don't really feel at ease with the amount of complexity and entanglement here. I simply can't fit all of this into my head within a reasonable amount of time. ok tb (fwiw) since it resolves a real problem and simplifies things a bit.