On 2/28/20 10:01 PM, Marc Lehmann wrote:
> That is confusing - if you read from the socket _without_ getting a
> readiness notification from libev, then of course you might get EAGAIN,
> but that wouldn't have anything to do with libev, as it isn't involved,
> right?
This is by no means a libev error; it was me having a socket in a state I did not understand. I was just hoping others had experienced something like it.

> It would be more surprising if libev told you you can read and THEN you
> got EAGAIN - this is possible, but shouldn't happen when you only read
> after libev told you so.

That was not the problem. It seemed like I got a new socket from accept that never got any data to read, but no errors either (just EAGAIN). To check whether I had missed something, I ended up setting up a libev timer that did an extra read, just to see if I had missed an EV_READ, which was not the case. But the original problem was the socket with no data and the "ab" test tool complaining about missing data.

Again, in normal usage this was not a problem. It only began to fail when I put a lot of pressure on it, and I was not able to replicate it while debugging.

> Yeah, that would either be a network problem, or what you think is the
> socket actually isn't the socket (i.e. somehow the program confuses fds
> for example).

My focus ended up on the listen socket backlog, as it filled up fast due to starvation from the other connections handling headers and more. When I wrote code that balanced this better, the problem simply disappeared.

I was thinking about splitting this up: reading the header in the normal read path, and then parsing the header in some idle handler. But after looking at some profiling, that did not really seem to be the problem.

> If that is the case, then it is because your kernel never received any data
> for it - or your kernel is buggy, which is possible, but much less likely
> than a bug somewhere else.

I really don't understand why I ended up with these ghost sockets, but when I made sure to prioritize the listen socket events and only empty half the backlog on each listen socket read event (before this I looped over accept until EAGAIN), it acted flawlessly, and as I originally expected.

I had an old thread-based service that fetched connections from accept using its own thread, and it never got any of these ghost sockets; the client was just rejected if the backlog got full.

> Well, I don't know if I am one of the people you want to reach, but of
> course, I had buggy programs, too, and my first steps would usually be to
> identify, with certainty, an fd that has the problem and see its kernel state
> - if no data is there, then no data was received (or it was read earlier).
> With TCP, it is relatively easy to run a tcpdump and later identify the
> connection, and then you can see _exactly_ what packets were exchanged.

> That way you can find out with certainty whether the benchmark tool sent
> something, whether your receive socket received something, and thus rule
> out your kernel, the benchmark tool, or your program.

> If you don't know how to read tcp exchanges, learning it will go a long
> way towards enlightenment (i.e. I think it's worth learning in any case :)

Thanks for your advice. This could have been my next step; my problem was how to isolate this. I mean, tcpdump on 30K test connections screams for some special tooling for isolating the bug, and my brain just could not cope :-)

But yes, knowing the basics is always worth the trip !

But thanks for your (and Bernd's) input ...

/BL


