The mailing list had a problem with spam; it's fixed now. Sorry for the
inconvenience.

Are you using a socket from multiple threads? That is usually the prime
cause of crashes.
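
As a rough illustration of the one-socket-per-thread rule (a sketch only,
assuming pyzmq; the endpoint name and the `worker` and `run` helpers are made
up for this example and are not taken from your code): the context can be
shared between threads, but each thread creates, uses and closes its own
socket.

```python
import threading

import zmq

ENDPOINT = "inproc://thread-demo"  # hypothetical endpoint name


def worker(ctx, worker_id, count):
    # The socket is created, used and closed inside this one thread only.
    sock = ctx.socket(zmq.PUSH)
    sock.connect(ENDPOINT)
    for i in range(count):
        sock.send_string(f"{worker_id}:{i}")
    sock.close()


def run(num_workers=4, count=10):
    ctx = zmq.Context()  # a Context may be shared across threads
    sink = ctx.socket(zmq.PULL)
    sink.setsockopt(zmq.RCVTIMEO, 5000)  # raise zmq.Again instead of hanging
    sink.bind(ENDPOINT)  # bind before the workers connect

    threads = [
        threading.Thread(target=worker, args=(ctx, w, count))
        for w in range(num_workers)
    ]
    for t in threads:
        t.start()

    # The sink socket is only ever touched by the main thread.
    received = [sink.recv_string() for _ in range(num_workers * count)]

    for t in threads:
        t.join()
    sink.close()
    ctx.term()
    return received


if __name__ == "__main__":
    print(len(run()), "messages received")
```

If a socket really has to be handed to another thread, the usual advice is to
do it with a full memory barrier (e.g. a queue or lock) and never touch it from
two threads at once.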

Also, usually the PUB binds and the SUB connects, although it should
work the other way around as well.
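
For reference, the conventional layout looks roughly like this (a minimal
sketch assuming pyzmq; the topic, payload and function name are placeholders,
and the retry loop is only there because a freshly connected SUB can miss the
first few messages):

```python
import time

import zmq


def pub_binds_sub_connects(topic=b"room.1", payload=b"hello"):
    ctx = zmq.Context()

    # The publisher is the stable endpoint, so it binds.
    pub = ctx.socket(zmq.PUB)
    port = pub.bind_to_random_port("tcp://127.0.0.1")

    # The subscriber connects to the publisher and sets its filter.
    sub = ctx.socket(zmq.SUB)
    sub.setsockopt(zmq.SUBSCRIBE, topic)
    sub.setsockopt(zmq.RCVTIMEO, 100)  # milliseconds
    sub.connect(f"tcp://127.0.0.1:{port}")

    # Keep publishing until the subscription has propagated (slow joiner).
    msg = None
    deadline = time.monotonic() + 5
    while msg is None and time.monotonic() < deadline:
        pub.send_multipart([topic, payload])
        try:
            msg = sub.recv_multipart()
        except zmq.Again:
            pass  # nothing arrived yet, publish again

    sub.close(linger=0)
    pub.close(linger=0)
    ctx.term()
    return msg


if __name__ == "__main__":
    print(pub_binds_sub_connects())
```

Swapping `bind` and `connect` between the two sockets gives the reversed
layout described in your mail.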

In these cases a minimal code snippet that reproduces the problem is the
best way to get to a resolution.

On Mon, 2016-12-12 at 19:03 +0000, Heungsub Lee wrote:
> Was my email delivered?  I sent it yesterday, but I can't find it in
> the archive.
> 
> On Mon, Dec 12, 2016 at 3:55 AM, Heungsub Lee <s...@subl.ee> wrote:
> 
> Hi folks, I'm Heungsub Lee.
> 
> I've been building a game server on ZeroMQ's pub/sub pattern.  I've hit a
> critical problem with PUB/SUB sockets: sometimes my processes abort with
> an assertion failure from ZeroMQ:
> 
> Assertion failed: erased == 1 (src/mtrie.cpp:297)
> 
> I tried with pyzmq-16.0.2 over libzmq-4.2.0.
> 
> In my case, a SUB socket binds to an address and a PUB socket then
> connects to that address.  All of the PUB and SUB sockets in a cluster
> connect to each other, making a fully connected network among 500+ server
> processes.
> 
> A SUB socket frequently subscribes to and unsubscribes from topics.  The
> number of topics in a cluster keeps growing after the cluster starts.
> When I last checked, one of the SUB sockets was subscribed to 3000+ topics.
> 
> I saw 3 aborting scenarios:
> 
>    1. When a SUB socket closes, some PUB sockets abort.  This may be a
>    concurrency bug in the pyzmq version I'm using.  I reproduced it with
>    a test case:
>    <https://github.com/what-studio/pyzmq/commit/5159ee563a571daccf1285aa74917bb875c774a7>
>    and I think I fixed it here:
>    <https://github.com/what-studio/pyzmq/commit/94ab0a88dbef7d0f33b34cdf18e55487735dde01>
>    2. When a PUB socket joins a mature cluster, it aborts almost
>    immediately.  By "mature" I mean a cluster that already has many
>    subscribed topics and a lot of subscribe/unsubscribe synchronization
>    messages.
>    3. A PUB socket on a weak host machine (e.g. AWS EC2 t2.medium)
>    sometimes aborts.  I'm not sure what triggers it.
> 
> Unfortunately, I couldn't reproduce the last two scenarios with a small
> test case, but my servers still keep aborting.
> 
> The assertion failure occurs when a PUB socket tries to remove a pipe to a
> SUB socket but no matching pipe exists.  I'm wondering whether ZeroMQ
> guarantees that subscribe/unsubscribe synchronization stays consistent
> between busy PUB and SUB sockets.
> 
> Regards,
> Heungsub
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev@lists.zeromq.org
> https://lists.zeromq.org/mailman/listinfo/zeromq-dev

