Jesús Cea Avión <j...@jcea.es> added the comment:

These operations are potentially slow, but there are no realtime requirements here.
My machine is quite capable (16 CPUs), but let's see what a bit of DTrace scripting tells us. First, let's time the open:

"""
syscall::open*:entry
/copyinstr(arg0) == "/dev/poll"/
{
        self->ts = timestamp;
}

syscall::open*:return
/self->ts/
{
        @stats = quantize(timestamp - self->ts);
        self->ts = 0;
}
"""

The result, times in nanoseconds:

"""
  value  ------------- Distribution ------------- count
   2048 |                                         0
   4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 4497743
   8192 |                                         17358
  16384 |                                         1592
  32768 |                                         2046
  65536 |                                         812
 131072 |                                         374
 262144 |                                         0
"""

Most "open"s finish in 4 microseconds, but we have a handful of "slow" opens of hundreds of microseconds. In any case, arguing about the GIL here is a non-issue, since the "open" is only done at object creation, and a sane program will only create a handful of devpoll objects.

Let's see now the write:

"""
syscall::write*:entry
/fds[arg0].fi_pathname == "/dev/poll"/
{
        self->ts = timestamp;
}

syscall::write*:return
/self->ts/
{
        @stats = quantize(timestamp - self->ts);
        self->ts = 0;
}
"""

The results for a single descriptor registered:

"""
  value  ------------- Distribution ------------- count
    256 |                                         0
    512 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 4102701
   1024 |                                         28514
   2048 |                                         11269
   4096 |                                         537
   8192 |                                         245
  16384 |                                         377
  32768 |                                         193
  65536 |                                         134
 131072 |                                         71
 262144 |                                         0
"""

Most writes are really fast, around half a microsecond, but there are sporadic latencies of hundreds of microseconds. Re-registering 200 sockets per loop, I get:

"""
  value  ------------- Distribution ------------- count
    512 |                                         0
   1024 |                                         50
   2048 |                                         94
   4096 |                                         157
   8192 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1112314
  16384 |@                                        22977
  32768 |                                         1584
  65536 |                                         842
 131072 |                                         168
 262144 |                                         0
"""

Most writes take around 8 microseconds. So now the problem is to estimate how much time it takes to release & reacquire the GIL.
For that, and while we don't have DTrace integration in Python (my next project, I hope), I changed the source code to add a manual probe using the Solaris high-resolution timers. I get a median time of around 550 nanoseconds (half a microsecond), with a few spikes, when the GIL is not contended. So releasing the GIL adds around half a microsecond to the syscalls. That is fast, but comparable to the actual syscall duration: releasing the GIL roughly doubles the time to register & unregister a socket (assuming we register a single socket per iteration, which is realistic in real code). Realistically, though, any work we do with the descriptors (like actually reading or writing them) is going to make this 0.5 microsecond cost irrelevant. By releasing the GIL we are protected from any contention inside the kernel, and we are playing by the rules :-). Anyway, I am open to feedback.

PS: I also checked with GIL contention, and the results are comparable. That is surprising; maybe it is related to the GIL improvements in 3.3. I haven't investigated the issue further.

----------
_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6397>
_______________________________________