2011/1/31 Martin Sustrik <[email protected]>:
> On 01/31/2011 05:43 PM, Dhammika Pathirana wrote:
>
>> Try something like 512k, but I don't know about your app/traffic pattern.
>> These are system wide settings so you'd want to be stingy.
>
> There's one problem with 0mq architecture that may be the cause of the
> problem.
>
> Namely, the I/O thread (one that handles the network traffic in async way)
> and application thread (the one you use to access 0mq API from) communicate
> via a socketpair.
>
> Now imagine that I/O thread has something to say to the app thread every now
> and then (such as that new connection was established or that the connection
> was destroyed). If the user doesn't call any 0mq functions for an extended
> period of time the application thread has no chance to read its mailbox.
> Thus the mailbox (the socketpair) will eventually fill up and cause this assertion.
>
> Martin
>
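
To illustrate the failure mode Martin describes, here is a minimal sketch of a socketpair "mailbox" that fills up because nobody reads it. It uses only the Python standard library and is not 0mq's actual code: the names fill_mailbox, writer and reader are hypothetical, and the one-byte "command" is an assumption made for illustration.

```python
import socket


def fill_mailbox(command=b"x"):
    """Write small 'commands' into a socketpair that nobody reads.

    Returns the number of bytes the kernel accepted before the buffer
    was full. The value depends on the OS and its socket buffer settings.
    """
    # writer plays the role of the I/O thread,
    # reader plays the application thread that never checks its mailbox.
    writer, reader = socket.socketpair()
    writer.setblocking(False)  # raise instead of blocking when the buffer is full
    sent = 0
    try:
        while True:
            try:
                sent += writer.send(command)
            except BlockingIOError:
                # The unread mailbox is full: a blocking writer would
                # hang here, and a non-blocking one has to handle the error.
                return sent
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    print("bytes accepted before the socketpair filled up:", fill_mailbox())
```

Larger system-wide socket buffers, as suggested earlier in the thread, would raise the number this prints, but the writer still hits the limit eventually if the reader never drains the socket.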

The usage pattern of my crashing application is as follows. The
application runs 4 worker threads, each processing N requests in parallel.
When a thread has a free slot, it receives a message on an ST_PULL
socket from the previous application. For each message it issues many
additional subrequests to many other app types. When all subrequests
have been processed, it sends the final message to the next application
on an ST_PUSH socket and frees the slot. When subrequests are processed
more slowly than messages arrive, the application (thanks to the HWM on
ST_PULL and ST_PUSH) throttles the whole system without any problem.

The worker threads do not send subrequests to other apps directly. For
each app type I have additional threads that receive requests
from the worker threads, send them to the target app, receive the
response, and deliver it back to the worker thread. These threads are
also responsible for caching responses and for resending a request if no
response arrives within a certain time period.
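
The throttling behaviour described above can be sketched with bounded queues. This is only an approximation: queue.Queue stands in for the ST_PULL and ST_PUSH sockets, its maxsize stands in for the HWM, and each worker handles one message at a time rather than N slots. The names HWM, SLOTS, worker and run_pipeline, and the "multiply by two" processing step, are hypothetical.

```python
import queue
import threading

HWM = 4    # hypothetical high-water mark (capacity of each queue)
SLOTS = 2  # hypothetical number of messages processed at the same time


def worker(inbound, outbound):
    while True:
        msg = inbound.get()      # blocks while there is nothing to pull
        if msg is None:          # sentinel: no more work
            return
        outbound.put(msg * 2)    # blocks while the downstream queue is full


def run_pipeline(n_messages):
    inbound = queue.Queue(maxsize=HWM)   # stand-in for the pull side
    outbound = queue.Queue(maxsize=HWM)  # stand-in for the push side

    workers = [
        threading.Thread(target=worker, args=(inbound, outbound))
        for _ in range(SLOTS)
    ]
    for t in workers:
        t.start()

    def produce():
        for i in range(n_messages):
            inbound.put(i)       # blocks once HWM messages are waiting
        for _ in workers:
            inbound.put(None)    # one sentinel per worker

    producer = threading.Thread(target=produce)
    producer.start()

    # The consumer sets the pace: a slow reader here slows every stage upstream.
    results = [outbound.get() for _ in range(n_messages)]

    producer.join()
    for t in workers:
        t.join()
    return sorted(results)


if __name__ == "__main__":
    print(run_pipeline(10))
```

Because every put() blocks at the limit, no stage can run ahead of the slowest one, which is the same effect the HWM has on the real sockets.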

In summary: the application consists of many threads, and in almost
all situations every thread is suspended in a blocking zeromq call,
receive() or send(). But the application has only one I/O thread; could
this be the problem?

I think the assertion fires more often when the system is in
throttling mode, but I'm not sure. Since the last restart, my
application has been running for 40 hours without trouble...

I'll create an issue on GitHub.