It looks like we stumbled upon an issue in
https://bz.apache.org/bugzilla/show_bug.cgi?id=65402 which concerns the
lifetimes of our backend connections.

When a frontend connection triggers a backend request and then drops, our
backend connection only notices the loss when it attempts to pass some data.
In normal HTTP response processing, this is not an issue since response
chunks usually arrive quite frequently. The proxied connection then fails to
pass a chunk to the aborted frontend connection and cleanup occurs.

However, with modern shenanigans such as Server-Sent Events (SSE), the
request is supposed to be long running and will produce body chunks quite
infrequently, like every 30 seconds or so. This leaves our proxy workers
hanging in recv for quite a while and may lead to worker exhaustion.

We can say SSE is a bad idea anyway, but that will probably not stop people 
from doing such crazy things.

What other mitigations do we have?
- pthread_kill() will interrupt the recv and probably make it fail
- we can use shorter socket timeouts on backend and check r->connection status 
in between
- ???
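
To make the second option a bit more concrete, here is a rough sketch in
plain POSIX C. It is not httpd or APR code. The names peer_gone() and
wait_backend(), the return codes, and the slice/total timeout parameters are
all made up for illustration. The idea is to wait on the backend socket in
short slices and peek at the frontend socket between slices. It assumes the
frontend is a stream socket and treats an EOF from the client as "gone",
which would also catch a client that only half-closed its side.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the peer of fd has closed the
 * connection or the socket is in an error state, 0 otherwise.
 * MSG_PEEK leaves any pending bytes in the socket buffer. */
static int peer_gone(int fd)
{
    char c;
    ssize_t n = recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT);

    if (n == 0)
        return 1;                       /* EOF: peer closed */
    if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
        return 1;                       /* hard error, e.g. ECONNRESET */
    return 0;                           /* still there (or data pending) */
}

/* Hypothetical helper: wait up to total_ms for the backend to become
 * readable, but wake up every slice_ms to check the frontend.
 * Returns  1 if the backend has an event (data, EOF or error),
 *          0 if total_ms elapsed with nothing from the backend,
 *         -1 if the frontend went away in the meantime,
 *         -2 if poll() itself failed. */
static int wait_backend(int backend_fd, int frontend_fd,
                        int slice_ms, int total_ms)
{
    int waited = 0;

    if (slice_ms <= 0)
        slice_ms = 1;                   /* avoid spinning forever */

    while (waited < total_ms) {
        struct pollfd pfd = { .fd = backend_fd, .events = POLLIN };
        int left = total_ms - waited;
        int step = left < slice_ms ? left : slice_ms;
        int rc = poll(&pfd, 1, step);

        if (rc > 0)
            return 1;                   /* let the caller's recv sort it out */
        if (rc < 0 && errno != EINTR)
            return -2;
        if (peer_gone(frontend_fd))
            return -1;                  /* abort the backend connection early */
        waited += step;                 /* approximate; overcounts on EINTR */
    }
    return 0;
}
```

In httpd terms the slice would correspond to a shorter backend socket
timeout and the peek to checking r->connection status, but how that maps onto
the actual proxy code is exactly the open question here.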

Whatever the means, I think it would be a Good Thing to abort backend 
connections earlier than we do now.

WDYT?

- Stefan
