This seems to be a semi-reproducible case (randomness due to scheduler and ordering of events):
https://gist.github.com/4496344

We start 3 threads here:

(1) Low priority: allocate messages in a loop, send messages in another loop
(2) Medium priority: dumb primality test (just burns CPU)
(3) High priority: receive messages in a loop, deallocate messages in another loop

On my system, allocations (zmq_msg_init_size), deallocations (zmq_msg_close), sends (zmq_send), and receives (zmq_recv) generally each take only microseconds. The medium priority thread, in contrast, takes tens of seconds to complete its task.

We let these threads run and compete for the scheduler until:

(1) The allocate and deallocate loops overlap with each other
(2) The high thread hits a lock somewhere in the memory allocation code that the low thread holds
(3) The medium thread preempts the high thread (because the high thread is stuck waiting for the low thread)

I think some executions hit priority inversion, as evidenced by this snippet from example.log (starting at https://gist.github.com/4496344#file-example-log-L1478):

> [LOW,1357608714109653] Allocate loop Start.
> [LOW,1357608714109657] Allocated and set in 4 microseconds
> [LOW,1357608714109661] Allocate loop Start.

Here, the low thread is preempted by the medium thread.

> [MEDIUM,1357608714109675] Performing PRIME test loop.

Okay, but clearly medium can get preempted by a high thread, which happens here.

> [HIGH,1357608714109708] Deallocate loop start.

Oddly, our high thread gets preempted by the medium thread! I assume the high thread is now stuck waiting on a lock that the low thread holds (buried in the memory allocation code).

> [MEDIUM,1357608755279843] Computed in 41.170 seconds.
> [MEDIUM] Found PRIME.

After medium finishes, I'm guessing the low priority thread is allowed to run and releases its lock (this happens at a lower level than our prints).

> ***[HIGH,1357608755279893] Deallocated in 41.170 seconds
> [HIGH,1357608755279909] Deallocate loop start.
> [HIGH,1357608755279914] Deallocated in 5 microseconds
> [HIGH,1357608755279925] Deallocate loop start.
> [HIGH,1357608755279929] Deallocated in 4 microseconds
> [HIGH,1357608755279940] Deallocate loop start.
> [HIGH,1357608755279944] Deallocated in 4 microseconds

After the high priority thread gets some work in, the low priority thread gets scheduled again (our high priority thread has a small usleep so that the low priority thread can sneak in sometimes and the two can interleave allocations and deallocations).

> [LOW,1357608755279922] Allocated and set in 41.170 seconds

Wow: our high priority thread just had to wait about 41 seconds for a medium priority thread (with no ZeroMQ calls at all) to finish.

-- Wolf

PS I'm not sure what the best way to handle this in ZeroMQ as a library is. Perhaps we should document this on a wiki page and warn people to be careful when mixing thread priorities with dynamically allocated messages and the inproc transport.

On Wed, Nov 14, 2012 at 9:09 AM, Pieter Hintjens <p...@imatix.com> wrote:
> Do you have a reproducible case for this?
>
> -Pieter
>
> On Wed, Nov 14, 2012 at 11:01 PM, Roger Dannenberg <r...@cs.cmu.edu> wrote:
>> Does ZeroMQ support communication among fixed priority threads using
>> inproc transport? It looks to me like ZeroMQ uses malloc to
>> allocate/free messages, which implies a shared lock on a shared heap. If
>> a low priority thread gets the lock and a medium priority thread
>> preempts it, can't that block a high priority thread indefinitely? I
>> believe OS X and Windows do not have locks with priority ceiling or
>> priority inheritance protocols, and it appears that Linux offers
>> priority inheritance but does not use it in malloc/free as implemented
>> in glibc, so it seems that priority inversion is (still) a potential
>> problem. Does ZeroMQ offer a solution?
>> -Roger Dannenberg
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev@lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
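
As a cross-check on the example.log excerpt quoted in this thread, the stall lengths can be recomputed from the raw timestamps. This is only a sketch: it assumes the bracketed numbers are microseconds since the epoch, and the names EVENTS, elapsed_seconds, and stalls are made up for illustration; they are not part of the gist.

```python
# Timestamps copied from the example.log excerpt, assumed to be
# microseconds since the epoch. Each pair is (start of the stalled
# operation, log line reporting its completion).
EVENTS = {
    "LOW": (1357608714109661, 1357608755279922),     # "Allocate loop Start." -> "Allocated and set"
    "MEDIUM": (1357608714109675, 1357608755279843),  # "Performing PRIME test loop." -> "Computed"
    "HIGH": (1357608714109708, 1357608755279893),    # "Deallocate loop start." -> "Deallocated"
}


def elapsed_seconds(start_us, end_us):
    """Return the time between two microsecond timestamps, in seconds."""
    return (end_us - start_us) / 1_000_000


def stalls(events):
    """Map each thread name to the elapsed time of its logged operation."""
    return {name: elapsed_seconds(start, end) for name, (start, end) in events.items()}


if __name__ == "__main__":
    for name, secs in stalls(EVENTS).items():
        print(f"{name:6s} {secs:.3f} s")
```

All three intervals come out at roughly 41.17 seconds, which agrees with the durations printed in the log: the single allocation and the single deallocation each span the entire primality test instead of the usual few microseconds.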