On Mon, Jun 4, 2012 at 8:40 AM, Marc Lehmann <schm...@schmorp.de> wrote:
> On Mon, Jun 04, 2012 at 02:44:19PM +0200, Joachim Nilsson
> <troglo...@gmail.com> wrote:
>> On Mon, Jun 4, 2012 at 2:29 PM, Marc Lehmann <schm...@schmorp.de> wrote:
>> > On Mon, Jun 04, 2012 at 08:41:38AM +0800, 钱晓明 <mailtoanta...@163.com>
>> > wrote:
>> >> If there are more than one event loop on same udp socket, and each of
>> >> them in different thread, datagrams will be processed by all threads
>> >> concurrently? For example, one thread is processing udp message while
> If the processing of the udp packet takes comparatively long, then you will
> have few threads selecting on the fd (in the best case only one on average),
> because the others are busy processing something else.
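
For concreteness, the setup being asked about (several threads all
receiving on one UDP socket) can be sketched roughly as below. This is
only an illustration in Python using plain blocking sockets, not libev
and not code from any real daemon; the names run_shared_socket_demo and
worker, the loopback address and the timeout are all made up for the
example.

```python
import socket
import threading


def run_shared_socket_demo(num_threads=4, num_datagrams=20):
    """Several threads block in recvfrom() on ONE shared UDP socket.

    Returns a list of (worker_id, payload) pairs, one per datagram received.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    sock.settimeout(1.0)          # lets idle workers exit once traffic stops
    addr = sock.getsockname()

    received = []
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            try:
                data, _peer = sock.recvfrom(2048)
            except socket.timeout:
                return            # nothing arrived for a while: stop
            with lock:
                received.append((worker_id, data))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()

    # Send a burst of numbered datagrams at the shared socket.
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for n in range(num_datagrams):
        sender.sendto(b"msg-%d" % n, addr)
    sender.close()

    for t in threads:
        t.join()
    sock.close()
    return received
```

Each datagram is dequeued by exactly one recvfrom() call, so it reaches
exactly one of the threads; which thread that is depends on scheduling.
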
In the specific case of connectionless UDP packets, if the per-packet
processing in your daemon is well optimized and fast, I've found that on
Linux hosts the kernel's socket code itself becomes the scaling limit.
One of my projects is an open source authoritative DNS server, and I did
a lot of perf testing on this scenario. Spawning multiple threads that
loop on a single socket doesn't help: they all end up waiting on some
lower-level socket serialization in the kernel, so you get basically the
same throughput with 8 threads as you do with 1, even if you have a
bunch of CPU cores to use.

But you can scale up to more packets/sec on a given machine by running
one thread per socket (and per core) and distributing the packets to
these multiple sockets by some other mechanism. E.g. in the DNS case
with an 8-core server, you might use a front-end hardware load balancer
(or some Linux iptables/ipvsadm type hacks) to redistribute the incoming
UDP packets on port 53 to local ports 1060-1067, run a thread+loop per
socket on each of your 8 cores listening on those ports, and of course
have the LB remap the responses back to source port 53 as well. Then you
can see real scaling beyond anything you might try to do against a
single UDP socket.

This all depends, as Marc was saying, on how much per-packet processing
overhead you have in your daemon. If the overhead were higher, then yes,
you might be better off with several threads/loops on a single socket.
As a very rough rule of thumb: if your processing code is light and
optimal enough that it could handle somewhere in the ballpark of 20-50K
pps (or higher) on a single CPU core on the target machine, you're
probably in the territory where the socket is the limit, and it's time
to do thread-per-socket as described above.

-- Brandon

_______________________________________________
libev mailing list
libev@lists.schmorp.de
http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
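
P.S. A rough sketch of the thread-per-socket layout described above, in
case it helps. It is written in Python with blocking sockets purely to
show the shape; a real daemon would run one libev loop per thread in C,
and Python threads won't show real multi-core scaling anyway. The names
start_workers and serve, the kernel-assigned ports (instead of
1060-1067) and the upper-casing "work" are all assumptions for
illustration. The front-end redistribution of port 53 traffic is not
shown.

```python
import socket
import threading


def start_workers(num_workers=4, host="127.0.0.1"):
    """Bind one UDP socket per worker thread.

    Returns (ports, stop_fn): the port each worker listens on, and a
    function that shuts all workers down.
    """
    stop = threading.Event()
    socks = []
    threads = []

    def serve(sock):
        # Each thread owns its socket exclusively: no sharing between threads.
        while not stop.is_set():
            try:
                data, peer = sock.recvfrom(2048)
            except socket.timeout:
                continue                      # wake up to re-check the stop flag
            sock.sendto(data.upper(), peer)   # hypothetical per-packet work

    for _ in range(num_workers):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind((host, 0))                     # port 0: kernel picks a free port
        s.settimeout(0.2)
        t = threading.Thread(target=serve, args=(s,), daemon=True)
        t.start()
        socks.append(s)
        threads.append(t)

    def stop_fn():
        stop.set()
        for t in threads:
            t.join()
        for s in socks:
            s.close()

    ports = [s.getsockname()[1] for s in socks]
    return ports, stop_fn
```

Because every reply is sent from the same socket that received the
request, the client sees it coming from the worker's own port; that is
the reason the load balancer has to rewrite the source port back to 53
in the DNS setup.
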