On Feb 15, 2011, at 6:00 PM, Chuck Remes wrote:

> Due to some ongoing issues with 0mq on OSX, I switched over to using my linux 
> box as the main dev and test server. 
> 
> I am running a very recent master from the last day or two, so it's all 2.1.0.
> 
> My systems do a lot of high-volume communication amongst 4 distributed 
> components. They connect strictly via the tcp transport (no inproc or ipc). 
> After switching to linux (archlinux running the 2.6.35 kernel) I started 
> getting the mailbox assertion after it ran for a few hours.
> 
>  Assertion failed:  new_sndbuf > old_sndbuf (mailbox.cpp:182)

So I added a little debug print before the assertion in mailbox.cpp. Here is 
what prints out:

Assertion failed: new_sndbuf > old_sndbuf (mailbox.cpp:183)
new_sndbuf = 2097152, old_sndbuf = 524288
new_sndbuf = 8388608, old_sndbuf = 2097152
new_sndbuf = 2097152, old_sndbuf = 524288
new_sndbuf = 10485760, old_sndbuf = 8388608
new_sndbuf = 10485760, old_sndbuf = 10485760


As you can see, the buffer grows several times in quick succession, and in the 
last line the two values are equal, which is what trips the assertion. My code 
isn't doing anything special, though it's hard to say for certain because the 
abort prevents me from seeing which application code leads to this condition.
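
For anyone who wants to look at this outside of 0mq, here is a minimal 
standalone sketch of the pattern I believe is involved. It assumes the mailbox 
reads SO_SNDBUF, asks for double that value, and reads it back; this is my 
assumption based on the variable names in the assertion, not a copy of 
mailbox.cpp. The names SndbufStep and grow_sndbuf are made up for this example. 
On Linux the kernel caps the request at net.core.wmem_max, so the loop should 
eventually report new_sndbuf equal to old_sndbuf, which is the same condition 
the assertion rejects.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <cassert>
#include <climits>
#include <cstdio>
#include <vector>

// One resize attempt: the size before the request and the size read back.
struct SndbufStep {
    int old_sndbuf;
    int new_sndbuf;
};

// Hypothetical helper (not part of 0mq): repeatedly ask the OS to double
// SO_SNDBUF on fd and record what it actually grants. Stops as soon as the
// buffer no longer grows, a socket call fails, or max_steps is reached.
std::vector<SndbufStep> grow_sndbuf(int fd, int max_steps = 32)
{
    assert(max_steps > 0);
    std::vector<SndbufStep> steps;

    for (int i = 0; i < max_steps; ++i) {
        // Retrieve the current send buffer size.
        int old_sndbuf = 0;
        socklen_t len = sizeof old_sndbuf;
        if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &old_sndbuf, &len) != 0)
            break;

        // Guard against a nonsensical value or integer overflow.
        if (old_sndbuf <= 0 || old_sndbuf > INT_MAX / 2)
            break;

        // Request double the current size.
        int requested = old_sndbuf * 2;
        if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &requested,
                       sizeof requested) != 0)
            break;

        // Read back what the OS actually granted.
        int new_sndbuf = 0;
        len = sizeof new_sndbuf;
        if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &new_sndbuf, &len) != 0)
            break;

        std::printf("new_sndbuf = %d, old_sndbuf = %d\n",
                    new_sndbuf, old_sndbuf);
        steps.push_back({old_sndbuf, new_sndbuf});

        // The OS refused to grow the buffer any further; this is the case
        // where an assertion of new_sndbuf > old_sndbuf would fail.
        if (new_sndbuf <= old_sndbuf)
            break;
    }
    return steps;
}
```

To try it, create a pair with socketpair(AF_UNIX, SOCK_STREAM, 0, fds) and call 
grow_sndbuf(fds[0]). The exact sizes printed depend on the kernel and its 
sysctl settings, so they will not necessarily match the numbers above.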

Also, no other messages are printed between these debug prints in the 0mq 
library. My code is doing a lot of work (in a reactor, so it's single-threaded) 
yet it doesn't get a chance to print anything out while this buffer is being 
expanded 5 times in a row.

Lastly, while watching this print to a console in real-time, I saw that there 
was a noticeable pause *before* the first print came up. I don't know what the 
pause signifies; perhaps the OS was blocked on something?

Unfortunately, the failing component is part of a distributed system so 
creating a small reproducible example is likely impossible.

Any suggestions?

cr

_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
