Re: segfault in haproxy=1.8.4

Christopher Faulet Thu, 15 Mar 2018 06:50:38 -0700

Le 15/03/2018 à 12:16, Willy Tarreau a écrit :

+static struct stream *pendconn_process_next_strm(struct server *srv, struct 
proxy *px)
   {
+       struct pendconn *p = NULL;
+       struct server   *rsrv;
        rsrv = srv->track;
        if (!rsrv)
                rsrv = srv;
+       if (srv->nbpend) {
+               list_for_each_entry(p, &srv->pendconns, list) {


-- mark here --

+                       if (!HA_SPIN_TRYLOCK(PENDCONN_LOCK, &p->lock))
+                               goto ps_found;
+               }
+               p = NULL;
+       }
+
+  ps_found:
+       if (srv_currently_usable(rsrv) && px->nbpend) {
+               struct pendconn *pp;
+
+               list_for_each_entry(pp, &px->pendconns, list) {
+                       /* If the server pendconn is older than the proxy one,
+                        * we process the server one. */
+                       if (p && !tv_islt(&pp->strm->logs.tv_request, 
&p->strm->logs.tv_request))
+                               goto pendconn_found;


Here there is still a race : pp->strm is dereferenced out of the lock,
so the owner could very well decide to disappear in the mean time. I
think this check needs to be moved in the trylock below. Or another
option (probably simpler and cleaner) would be to add a tv_request
field to the pendconn, and directly use it.


Well, not really. When we access to a pendconn in a queue (proxy or server)
we always get a lock on it. In other side, a stream must release its
pendconn, if any, before being released. To do so the right queue must be
locked by the stream. So when we manipulate a pendconn in a queue we are
sure that the stream cannot release it in same time. And if the stream
released it, it cannot be in the queue.

Everything should be thread-safe because I added an lock on the pendconn
_AND_ the pendconn is now only released by the stream itself. These 2
conditions are necessary and sufficient to guarantee the thread-safety.


OK that's fine. I think that you should indicate for certain functions
that they must only be called under this or that lock (I remember seeing
one this morning in this situation, I don't remember which one).

But then what prevents a stream from removing itself from the queue at
the mark above ? It takes the lock, removes itself, unlocks, but <p> is
still seen by the function, the trylock succeeds (on just freed memory),
the stream is dereferenced again after being freed, and we may even
unlink it twice, immediately corrupting memory. I may be wrong but I
don't see what prevents this case from being possible. If it is, you may
need to use another lock when manipulating this list (likely the server
lock as you initially did), though that could become expensive.

When we scan a queue, it is locked. So on your mark, the server queue isalready locked. To remove a pendconn from a queue, we also need to havea lock on this queue, if it is still linked. Else we can safely removeit, nobody can fall on the pendconn by scanning the queue.

In fact without locking the queue (server one and/or proxy one), it isunsafe to loop on it. In same way, we must lock the queue to remove apendconn from it. I will add comments to clarify everything. But, Ireally think it is thread-safe.


--
Christopher Faulet

Re: segfault in haproxy=1.8.4

Reply via email to