On Wed, Jul 01, 2026 at 11:47:05AM +0200, David Marchand wrote:
> On Fri, 26 Jun 2026 at 12:34, Anatoly Burakov <[email protected]>
> wrote:
> >
> > When rte_mp_request_async() fails to send requests to all peers,
> > copy and param can lose ownership and leak.
> >
> > However, we cannot simply free them unconditionally, as "partial failure"
> > means some requests were already queued and thus still reference `copy` and
> > `param`, so freeing them directly on the error path can cause
> > use-after-free when those requests are later handled by the async timeout.
> >
> > Fix this by rolling back queued requests from the current batch, and reset
> > nb_sent to 0. Freeing the requests is now safe even if some requests were
> > sent, as any responses or timeouts will not find the request ID in the
> > queue and will safely exit without doing anything.
> >
> > Coverity issue: 501503
> >
> > Fixes: f05e26051c15 ("eal: add IPC asynchronous request")
> > Cc: [email protected]
> >
> > Signed-off-by: Anatoly Burakov <[email protected]>
> > ---
> > lib/eal/common/eal_common_proc.c | 31 +++++++++++++++++++++++++++++++
> > 1 file changed, 31 insertions(+)
> >
> > diff --git a/lib/eal/common/eal_common_proc.c
> > b/lib/eal/common/eal_common_proc.c
> > index 991bf215a3..0cffc7a127 100644
> > --- a/lib/eal/common/eal_common_proc.c
> > +++ b/lib/eal/common/eal_common_proc.c
> > @@ -1245,6 +1245,32 @@ rte_mp_request_async(struct rte_mp_msg *req, const
> > struct timespec *ts,
> > } else if (mp_request_async(path, copy, param, ts))
> > ret = -1;
> > }
> > +
> > + /*
> > + * On partial failure, roll back all queued requests. We hold the
> > lock
> > + * so no one else touches the queue. All requests in this batch
> > share
> > + * the same param pointer. Stale alarms will fire and harmlessly
> > find
> > + * nothing via ID-based lookup.
> > + */
> > + if (ret != 0 && reply->nb_sent > 0) {
> > + struct pending_request *r, *next;
> > +
> > + for (r = TAILQ_FIRST(&pending_requests.requests);
> > + r != NULL; r = next) {
> > + next = TAILQ_NEXT(r, next);
> > + if (r->type == REQUEST_TYPE_ASYNC &&
> > + r->async.param == param) {
> > + TAILQ_REMOVE(&pending_requests.requests,
> > + r, next);
> > + free(r->reply);
> > + /* r->request == copy, freed below after
> > the loop */
> > + free(r);
> > + }
> > + }
> > + /* requests on the queue were removed so keep things
> > consistent */
> > + reply->nb_sent = 0;
> > + }
> > +
>
> Please, don't reimplement the safe macro.
>
> I plan to update this with:
>
> @@ -1252,15 +1252,11 @@ rte_mp_request_async(struct rte_mp_msg *req,
> const struct timespec *ts,
> * nothing via ID-based lookup.
> */
> if (ret != 0 && reply->nb_sent > 0) {
> - struct pending_request *r, *next;
> -
> - for (r = TAILQ_FIRST(&pending_requests.requests);
> - r != NULL; r = next) {
> - next = TAILQ_NEXT(r, next);
> - if (r->type == REQUEST_TYPE_ASYNC &&
> - r->async.param == param) {
> - TAILQ_REMOVE(&pending_requests.requests,
> - r, next);
> + struct pending_request *r, *tmp;
> +
> + RTE_TAILQ_FOREACH_SAFE(r, &pending_requests.requests,
> next, tmp) {
> + if (r->type == REQUEST_TYPE_ASYNC &&
> r->async.param == param) {
> +
> TAILQ_REMOVE(&pending_requests.requests, r, next);
> free(r->reply);
> /* r->request == copy, freed below
> after the loop */
> free(r);
>
> Objection?
> If not, I'll update while applying.
>
+1 to this suggestion from me.