Hi Willy,
Here is a second mail, as I just thought of one more thing you wrote that
I may have misunderstood, or perhaps I confused you with a small remark
about CPU usage in my earlier mail (that was a side effect of another,
totally wrong, code change I made earlier).
I'm suspecting we could have something wrong with the polled_mask, maybe
sometimes it's removed too early somewhere, preventing the delete(write)
from being performed, which would explain why it loops.
To clarify: the issue is not that haproxy uses CPU by looping; the issue
is that haproxy prevents the page from loading in the browser. The 'fix'
on the old version, just after the commit that introduced the issue, was
to call the EV_SET write delete *less* often. Or maybe my understanding
of what it does is just wrong :).
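To make the hypothesis quoted above concrete, here is a minimal sketch of a delete that is guarded by a per-thread polled_mask bit. It is a plain C simulation, not haproxy's actual code: struct fd_entry, fds[], add_write() and del_write() are hypothetical names, and the boolean kernel_write_filter only stands in for what a kqueue EV_SET with EVFILT_WRITE and EV_ADD or EV_DELETE would do. It only shows what would happen if the bit were cleared too early; it does not claim that this is what happens.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not haproxy's real structures. */
#define MAX_FDS 16

struct fd_entry {
    unsigned long polled_mask;  /* one bit per thread that registered the fd */
    bool kernel_write_filter;   /* simulated: is a write filter still installed? */
};

static struct fd_entry fds[MAX_FDS];

/* Register write interest for this thread and remember it in polled_mask. */
static void add_write(int fd, unsigned long tid_bit)
{
    assert(fd >= 0 && fd < MAX_FDS);
    fds[fd].kernel_write_filter = true;   /* stands in for an EV_ADD of EVFILT_WRITE */
    fds[fd].polled_mask |= tid_bit;
}

/* Delete write interest, but only if polled_mask says this thread registered it.
 * Returns true if the delete was performed, false if it was skipped. */
static bool del_write(int fd, unsigned long tid_bit)
{
    assert(fd >= 0 && fd < MAX_FDS);
    if (!(fds[fd].polled_mask & tid_bit))
        return false;                     /* bit already cleared: delete is skipped */
    fds[fd].kernel_write_filter = false;  /* stands in for an EV_DELETE of EVFILT_WRITE */
    fds[fd].polled_mask &= ~tid_bit;
    return true;
}
```

In this simulation, if something clears the thread's bit between add_write() and del_write(), the delete is skipped and the simulated write filter stays installed, which is the kind of state that would keep reporting the fd as writable.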
On 13-4-2018 at 0:57, PiBa-NL wrote:
Hi Willy,
On 13-4-2018 at 0:22, Willy Tarreau wrote:
I'm suspecting we could have something wrong with the polled_mask, maybe
sometimes it's removed too early somewhere, preventing the delete(write)
from being performed, which would explain why it loops.
By the way, you really should not try to debug an
old version; stick to the latest fixes instead.
Okay, testing from now on with current master. I just thought it would be
easier to backtrack if I knew which particular new or missing event could
be causing it. It also could have been simpler to find a fix just
after the problem was introduced, but it seems it isn't that simple :).
I'm seeing two things that could be of interest to test :
- remove the two "if (fdtab[fd].polled_mask & tid_bit)" conditions
to delete the events. It will slightly inflate the list of events
but not that much. If it fixes the problem it means that the
polled_mask is sometimes wrong. Please do that with the updated
master.
Removing the 'if polled_mask' checks does not fix the issue; in fact it
makes it worse. Without those checks, the
"srvrep[0007:0008]: HTTP/1.1 401 Unauthorized" line is also no longer shown.
- switch to poll() just to see if you get the same behaviour, so that we
can figure out whether only the kqueue code triggers the issue. poll()
doesn't rely on polled_mask at all.
Using poll (started with -dk), the request works properly.
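As a small illustration of why poll() is a useful comparison here: with poll(), the set of events of interest is passed in on every call, so there is no registration kept between calls that could get out of sync with a mask. The sketch below uses only standard POSIX poll(); write_ready() is a hypothetical helper name, and this is not how haproxy's poll() poller is written.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: asks poll() about one fd with a zero timeout.
 * Returns 1 if POLLOUT is reported, 0 if not, -1 if poll() fails.
 * POLLOUT is only requested when want_write is non-zero. */
static int write_ready(int fd, int want_write)
{
    struct pollfd pfd;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = want_write ? POLLOUT : 0;  /* interest is stated on each call */
    pfd.revents = 0;

    if (poll(&pfd, 1, 0) < 0)
        return -1;
    return (pfd.revents & POLLOUT) ? 1 : 0;
}
```

Calling write_ready() on the write end of a fresh pipe with want_write set reports it as writable; calling it again with want_write cleared does not, without any separate delete step.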
Many thanks for your tests.
Willy
Regards,
PiBa-NL (Pieter)
Regards,
PiBa-NL (Pieter)