Hi Luca,

I have added some comments to the PR since I am unsure about the resource management of the heap-allocated term_endpoint::endpoint string.
It would be nice if you could have a look at them, thanks!

Regards,
Harald

2017-09-26 23:23 GMT+02:00 Bill Torpey <wallstp...@gmail.com>:

> Hi Luca:
>
> Sorry for not getting back sooner, but thanks again for listening, and
> the PR looks good to me!
>
> Best Regards,
>
> Bill Torpey
>
> On Sep 19, 2017, at 9:13 AM, Luca Boccassi <luca.bocca...@gmail.com>
> wrote:
>
> On Sun, 2017-09-17 at 12:29 -0400, Bill Torpey wrote:
>
> Luca:
>
> I hear what you’re saying but … I think I’m talking about a different
> situation.
>
> If I understand your explanation correctly, you’re saying that setting
> ZMQ_RECONNECT_IVL to -1 should prevent a disconnected endpoint from
> *ever* reconnecting, under any set of circumstances.
>
> I would read the doc (4.2.2) more like the following (with addition in
> *bold*):
>
> The ZMQ_RECONNECT_IVL option shall set the initial reconnection
> interval for the specified socket. The reconnection interval is the
> period ØMQ shall wait between attempts to *automatically* reconnect
> disconnected peers when using connection-oriented transports. The
> value -1 means no reconnection.
>
> What I’m questioning is the interaction between ZMQ_RECONNECT_IVL == -1
> and the behavior enforced by
> https://github.com/zeromq/libzmq/issues/788. (Also see here:
> https://www.mail-archive.com/zeromq-d...@lists.zeromq.org/msg21484.html).
> That commit is intended to prevent *duplicate* connections from the
> same endpoint, for certain socket types (e.g., pub/sub), where multiple
> connections (and their associated duplicate messages) don’t make sense.
>
> One scenario I’m concerned about is the one where:
>
> 1. Endpoint connects to us
> 2. Endpoint is disconnected for some reason
> 3. Setting ZMQ_RECONNECT_IVL=-1 disables *automatic* reconnect, so as
>    far as we’re concerned the endpoint is dead
> 4. Subsequently the endpoint connects to us again (e.g., following a
>    restart)
> 5. Because we still have a record of the endpoint, we will refuse the
>    connection, even though the endpoint is dead from our point of view.
>    In this scenario that endpoint can NEVER reconnect.
>
> So I get that setting ZMQ_RECONNECT_IVL should prevent us from
> reconnecting (automatically) to the disconnected endpoint, but I don’t
> see the benefit of preventing that endpoint from actively reconnecting
> at a later time. In this case, we’ve essentially blacklisted that
> endpoint (forever), and I’m having trouble coming up with a scenario
> where that would be intended behavior.
>
> Does this make sense? Am I missing something here?
>
> Also, to your point about adding a protocol layer on top of 0MQ: I
> would MUCH prefer to let 0MQ handle as much of the underlying
> connect/disconnect logic as possible. I’m concerned about the potential
> for the protocol’s view of the connection state getting out of sync
> with 0MQ’s view (not to mention a bunch of additional work on the
> protocol layer, but more about synchronization).
>
> Thanks for listening ...
>
> Bill
>
> I see. I guess there's a terminology confusion issue here - when I
> wrote about connections and disconnections, I meant the automated ones
> that happen in the background in the I/O thread. But I guess it makes
> sense that a manual call to zmq_connect should work as expected.
>
> A workaround for this behaviour would be for the application to
> manually call zmq_disconnect before doing a connect to the same
> endpoint.
>
> But it turns out fixing it to automatically do it is not too hard
> (unless I've made some silly mistake):
>
> https://github.com/zeromq/libzmq/pull/2756
>
> On Sep 17, 2017, at 6:39 AM, Luca Boccassi <luca.bocca...@gmail.com>
> wrote:
>
> On Sat, 2017-09-16 at 14:34 -0400, Bill Torpey wrote:
>
> Hi Luca:
>
> Just a gentle reminder to add an issue so this can be tracked (or let
> me know if you’d prefer that I do that).
>
> Thanks!
>
> Bill
>
> Thinking about this a bit more, I think it's expected behaviour after
> all. From the doc:
>
> "The 'ZMQ_RECONNECT_IVL' option shall set the initial reconnection
> interval for the specified 'socket'. The reconnection interval is the
> period 0MQ shall wait between attempts to reconnect disconnected peers
> when using connection-oriented transports. The value -1 means no
> reconnection."
>
> So it is working as intended - if a peer goes away, it will never be
> reconnected if that option is set.
>
> And it makes sense - in the context of a TCP connection, a dead peer is
> a dead peer. If for an application a dead peer might be resurrected
> after X amount of time, there's no way to know that. It needs to be
> handled by the application.
>
> There are various tools you can use:
>
> 1) ZMTP heartbeats - see ZMQ_HEARTBEAT* socket options
> 2) socket monitoring events (including connects and disconnects) - see
>    zmq_socket_monitor documentation
> 3) Enhance your protocol - call zmq_disconnect(endpoint) on your
>    sockets when a particular message is received, or heartbeats are
>    missed, or a disconnect event happens. This way when you later call
>    zmq_connect(endpoint) and it happens to match a previous, dead peer,
>    it will work as expected
>
> On Sep 2, 2017, at 1:21 PM, Luca Boccassi <luca.boccassi@gmail.com>
> wrote:
>
> On Sat, 2017-09-02 at 12:02 -0400, Bill Torpey wrote:
>
> Thanks again, Luca!
>
> For now, I’m going to go with disabling reconnect on the “data”
> sockets; that seems to be the best solution for my use case (connecting
> to endpoints that were returned by the peer binding to an unspecified
> (“wildcard”) port, e.g., "tcp://<interface>:*" in ZMQ).
>
> This assumes that ZMQ will completely forget about the endpoint if/when
> it is disconnected, if it is set not to reconnect.
> Otherwise I might run afoul of ZMQ’s silently ignoring connections to
> endpoints that it already knows about:
> https://github.com/zeromq/libzmq/issues/788 (e.g., in the case where
> another process later happens to be assigned the same ephemeral port).
>
> I’ve done a quick scan of the libzmq code (v4.2.2) and it doesn’t
> appear that the endpoint is removed in the case of a (terminal)
> disconnect. If you can confirm/deny this behavior, that would be
> helpful. Failing that, I guess I’ll need to test this in the debugger;
> any hints on how best to do this would also be much appreciated.
>
> Regards,
>
> Bill
>
> Yes it doesn't look like it removes the endpoint - I guess it's a
> corner case that's missed. I'll open an issue.
>
> BTW all these things are very quick and easy to try with Python on
> Linux. Just install pyzmq, open a python3 terminal and:
>
> import zmq
> ctx = zmq.Context.instance()
>
> # default settings: one request/reply round trip
> rep = ctx.socket(zmq.REP)
> rep.bind("tcp://127.0.0.1:12345")
> req = ctx.socket(zmq.REQ)
> req.connect("tcp://127.0.0.1:12345")
> req.send_string("hello")
> rep.recv()
> rep.send_string("hallo")
> req.recv()
>
> # restart the REP side on the same port and try again
> rep.unbind("tcp://127.0.0.1:12345")
> rep.close()
> rep = ctx.socket(zmq.REP)
> rep.bind("tcp://127.0.0.1:12345")
> req.send_string("hello")
> rep.recv()
> rep.send_string("hallo")
> req.recv()
> rep.unbind("tcp://127.0.0.1:12345")
> rep.close()
> req.close()
>
> # same sequence, but with reconnection disabled on the REQ socket
> rep = ctx.socket(zmq.REP)
> rep.bind("tcp://127.0.0.1:12345")
> req = ctx.socket(zmq.REQ)
> req.setsockopt(zmq.RECONNECT_IVL, -1)
> req.connect("tcp://127.0.0.1:12345")
> req.send_string("hello")
> rep.recv()
> rep.send_string("hallo")
> req.recv()
> rep.unbind("tcp://127.0.0.1:12345")
> rep.close()
> rep = ctx.socket(zmq.REP)
> rep.bind("tcp://127.0.0.1:12345")
> req.send_string("hello")
> rep.recv()
>
> This last one won't receive the message
>
> On Sep 1, 2017, at 6:19 PM, Luca Boccassi <luca.boccassi@gmail.com>
> wrote:
>
> On Fri, 2017-09-01 at 18:03 -0400, Bill Torpey wrote:
>
> Thanks Luca! That was very helpful.
>
> Although it leads to a couple of other questions:
>
> - Can I assume that a ZMQ disconnect of a tcp endpoint would only occur
> if the underlying TCP socket is closed by the OS? Or are there
> conditions in which ZMQ will proactively disconnect the TCP socket and
> try to reconnect?
>
> Normally that's the case - you can set up heartbeating with the
> appropriate options and that will kill a connection if there's no
> answer
>
> - I see that there is a sockopt (ZMQ_RECONNECT_IVL) that can be set to
> -1 to disable reconnection entirely. In my case, the “data” socket pair
> will *always* connect to an ephemeral port, so I *never* want to
> reconnect. Would this be a reasonable option in my case, do you think?
>
> If that makes sense for your application, go for it - in these cases
> the only way to be sure is to test it and see how it works
>
> - Would there be any interest in a patch that would disable reconnects
> (controlled by sockopt) for ephemeral ports only? I’m guessing that
> reconnecting mostly makes sense with well-known ports, so something
> like this may be of general interest?
>
> If by ephemeral port you mean anything over 1024, then actually in most
> applications I've seen it's always useful to reconnect, and the
> existing option should be enough for those cases where it's not desired
> - we don't want to duplicate functionality
>
> Thanks again!
>
> Bill
>
> On Sep 1, 2017, at 5:30 PM, Luca Boccassi <luca.boccassi@gmail.com>
> wrote:
>
> On Fri, 2017-09-01 at 16:59 -0400, Bill Torpey wrote:
>
> I'm curious about how ZMQ handles re-connection. I understand that
> re-connection is supposed to happen "automagically" under the covers,
> but that poses an interesting question.
>
> To make a long story short, the application I'm working on uses pub/sub
> sockets over TCP, and works as follows:
>
> At startup:
> 1. connects to a proxy/broker at a well-known address, using a pub/sub
>    socket pair ("discovery");
> 2. subscribes to a well-known topic using the "discovery" sub socket;
> 3. binds a different pub/sub socket pair ("data") and retrieves the
>    actual endpoints assigned;
> 4. publishes the "data" endpoints from step 3 on the "discovery" pub
>    socket;
>
> When the application receives a message on the "discovery" sub socket,
> it connects the "data" socket pair to the endpoints specified in the
> "discovery" message.
>
> So far, this seems to be working relatively well, and allows the
> high-volume, low-latency "data" messages to be sent/received directly
> between peers, avoiding the extra hop caused by a proxy/broker
> connection. The discovery messages use the proxy/broker, but since
> these are (very) low-volume the extra hop doesn't matter. The use of
> the proxy also eliminates the "slow joiner" problem that can happen
> with other configurations.
>
> My question is what happens when one of the "data" peer sockets
> disconnects. Since ZMQ (apparently) keeps trying to reconnect, what
> would prevent another process from binding to the same ephemeral port?
>
> - Can I assume that if the new application at that port is not a ZMQ
> application, that the reconnect will (silently) fail, and continue to
> be retried?
>
> The ZMTP handshake would fail, so yes.
>
> - What if the new application at that port *IS* a ZMQ application?
> Would the reconnect succeed? And if so, what would happen if it's a
> *DIFFERENT* ZMQ application, and the messages that it's
> sending/receiving don't match what the original application expects?
>
> Depends on how you handle it in your application.
> If you have security concerns, then use CURVE with authentication so
> that only authorised peers can connect.
>
> It's reasonable for the application to publish a disconnect message
> when it terminates normally, and the connected peers can disconnect
> that endpoint. But, applications don't always terminate normally ;-)
>
> That's a common pattern. But the application needs to handle unexpected
> data somewhat gracefully. What that means is entirely up to the
> application - as far as the library is concerned, if the handshake
> succeeds then it's all good (hence the use case for CURVE).
>
> Any guidance, hints or tips would be much appreciated -- thanks in
> advance!
>
> --
> Kind regards,
> Luca Boccassi
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev@lists.zeromq.org
> https://lists.zeromq.org/mailman/listinfo/zeromq-dev
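
For readers following the thread, the five-step scenario Bill describes, and the zmq_disconnect workaround Luca suggests, can be mimicked with a toy model in plain Python. This is only a sketch of the endpoint bookkeeping as it is discussed above for libzmq 4.2.2; it is not libzmq code. ToySocket, peer_lost, is_live and reconnect are made-up names for illustration, and a real socket tracks far more state.

```python
class ToySocket:
    """Toy model of one socket's endpoint records (hypothetical, not libzmq)."""

    def __init__(self, reconnect_ivl=100):
        # -1 models ZMQ_RECONNECT_IVL == -1 (no automatic reconnect).
        self.reconnect_ivl = reconnect_ivl
        self.endpoints = {}  # endpoint string -> "connected" or "dead"

    def connect(self, endpoint):
        # Models the duplicate-connection guard discussed in the thread:
        # a connect to an endpoint we still have a record of is ignored.
        if endpoint in self.endpoints:
            return False
        self.endpoints[endpoint] = "connected"
        return True

    def peer_lost(self, endpoint):
        # With automatic reconnect disabled the connection stays dead,
        # but the record of the endpoint is kept (the corner case at issue).
        if endpoint in self.endpoints and self.reconnect_ivl == -1:
            self.endpoints[endpoint] = "dead"

    def disconnect(self, endpoint):
        # Forget the endpoint entirely.
        self.endpoints.pop(endpoint, None)

    def is_live(self, endpoint):
        return self.endpoints.get(endpoint) == "connected"


def reconnect(sock, endpoint):
    """Workaround from the thread: disconnect first, then connect again."""
    sock.disconnect(endpoint)
    return sock.connect(endpoint)


if __name__ == "__main__":
    ep = "tcp://127.0.0.1:12345"
    sock = ToySocket(reconnect_ivl=-1)
    print(sock.connect(ep))                       # True: first connect
    sock.peer_lost(ep)                            # peer goes away
    print(sock.connect(ep), sock.is_live(ep))     # False False: ignored
    print(reconnect(sock, ep), sock.is_live(ep))  # True True: workaround
```

In this model the second connect is ignored because the stale record is still present, which is the "blacklisted forever" effect; dropping the record first makes the later connect succeed.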
_______________________________________________ zeromq-dev mailing list zeromq-dev@lists.zeromq.org https://lists.zeromq.org/mailman/listinfo/zeromq-dev