Hi all,

I'm not sure I understand all of your constraints, but there are two
projects you might be interested in:

Zyre: it supports discovery and reliable group messaging without a broker.
https://github.com/zeromq/zyre/blob/master/README.md#scope-and-goals
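
For a sense of what that looks like in practice, here is a minimal sketch
using Pyre, the Python port of Zyre (https://github.com/zeromq/pyre) --
untested and purely illustrative; the node and group names are made up:

    # Sketch only: assumes the Pyre port of Zyre.
    from pyre import Pyre

    node = Pyre("cache-node-1")      # name announced to peers
    node.start()                     # starts UDP beaconing / discovery
    node.join("CACHE-GROUP")         # join a named group
    node.shouts("CACHE-GROUP", "hello, group")  # message the whole group

    msg = node.recv()                # events (ENTER/JOIN/SHOUT/...) as frames
    print(msg)

    node.stop()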

Malamute: the ZeroMQ broker. It provides pub/sub, request/reply and service
communication patterns.
https://github.com/zeromq/malamute/

I know that you might want to avoid central servers; however, this is based
on the zproto architecture, with a simple server and complex clients.
According to Pieter, this is a great way towards reliability: usually it's
up to the client to handle advanced things like retransmission.

If nothing else, these are a great source of design information for you, as
you seem to be solving similar problems.

 Michal
On Mon, 25 Jun 2018 at 19:12, James Addison <add...@gmail.com> wrote:

> This is great information, James - thank you for taking the time to share.
>
> On Mon, Jun 25, 2018 at 10:02 AM James Harvey <jhar...@factset.com> wrote:
>
>> Yes, that's possible; you can also use/store the round-trip time from
>> subscribe to reply. That should give some idea of the network
>> distance/congestion/load on each publisher.
>>
>> Basically, your client can build up a map of all the possible options for
>> the best place to get X from.
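>>
>> A rough pyzmq sketch of that idea (the proxy endpoint, topic prefix and
>> JSON fields are invented for illustration, and I assume each blob arrives
>> as a [topic, json] multipart message):
>>
>>     import json
>>     import time
>>     import zmq
>>
>>     ctx = zmq.Context.instance()
>>     sub = ctx.socket(zmq.SUB)
>>     sub.setsockopt(zmq.SUBSCRIBE, b"cache.")      # hypothetical prefix
>>     t0 = time.monotonic()
>>     sub.connect("tcp://discovery-proxy:5556")     # hypothetical endpoint
>>
>>     options = {}   # publisher id -> (latency from subscribe, meta)
>>     deadline = time.monotonic() + 5.0
>>     while time.monotonic() < deadline:
>>         if sub.poll(timeout=100):
>>             topic, payload = sub.recv_multipart()
>>             meta = json.loads(payload)
>>             # remember how quickly each publisher's info came back
>>             options.setdefault(meta["id"], (time.monotonic() - t0, meta))
>>
>>     latency, best = min(options.values(), key=lambda o: o[0])
>>     print("best candidate:", best["endpoint"])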
>>
>> James Harvey
>>
>> On 25 Jun 2018, at 17:47, James Addison <add...@gmail.com> wrote:
>>
>> Ah! The simplicity of this approach makes sense.
>>
>> I suppose it also allows clients to disconnect/remove a cacher from its
>> sources if a SUB notification isn't received within 5 or 10 seconds, then?
>>
>> Thanks again,
>> James
>>
>> On Mon, Jun 25, 2018 at 9:38 AM James Harvey <jhar...@factset.com> wrote:
>>
>>>
>>> Hi James,
>>>
>>> The cachers in my setup publish their discovery information every
>>> second to the discovery proxies. I have maybe 100 cachers, and the network
>>> overhead is low compared with the ease of use and the fact that you can
>>> use that info to confirm the publisher is still running.
>>>
>>> I add meta info to the JSON blobs with details on cache sizes, how many
>>> current connections, etc. This allows the client to make an informed
>>> decision on which cache to connect to.
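>>>
>>> For instance, the publishing side might look roughly like this (a pyzmq
>>> sketch; the endpoints, topic and field names are invented):
>>>
>>>     import json
>>>     import time
>>>     import zmq
>>>
>>>     ctx = zmq.Context.instance()
>>>     pub = ctx.socket(zmq.PUB)
>>>     pub.connect("tcp://discovery-proxy:5555")    # proxy upstream port
>>>
>>>     while True:
>>>         meta = {
>>>             "endpoint": "tcp://10.0.0.12:6000",  # where to fetch from us
>>>             "cache_size": 123456,                # load info for clients
>>>             "connections": 42,
>>>         }
>>>         # topic prefix describes the cache contents; payload is the blob
>>>         pub.send_multipart([b"cache.prices.us",
>>>                             json.dumps(meta).encode()])
>>>         time.sleep(1.0)                          # re-announce every second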
>>>
>>> I also set up my publishers as verbose, so the publisher can catch the
>>> subscription message (of the new joiner) and send out its details again,
>>> avoiding the one-second delay.
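>>>
>>> I take this to mean the publishers are XPUB sockets with the verbose
>>> option set (and the proxy passing duplicate subscriptions upstream). A
>>> sketch of the idea, with invented endpoints and topics:
>>>
>>>     import zmq
>>>
>>>     ctx = zmq.Context.instance()
>>>     pub = ctx.socket(zmq.XPUB)              # XPUB surfaces subscriptions
>>>     pub.setsockopt(zmq.XPUB_VERBOSE, 1)     # including duplicates
>>>     pub.connect("tcp://discovery-proxy:5555")
>>>
>>>     while True:
>>>         event = pub.recv()                  # b"\x01<topic>" = subscribe
>>>         if event[:1] == b"\x01":
>>>             # a new joiner subscribed: re-send our details right away
>>>             pub.send_multipart([b"cache.prices.us", b"<json details>"])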
>>>
>>> You can use either method or both like me.
>>>
>>> I also think there is a beacon service built into ZeroMQ (or CZMQ?) that
>>> may suit, but I have never used it.
>>>
>>> Cheers
>>>
>>> James Harvey
>>>
>>> On 25 Jun 2018, at 17:16, James Addison <add...@gmail.com> wrote:
>>>
>>> James - thank you for building on what Bill mentioned; that's actually
>>> quite helpful. I think what you describe is very close to what I need to
>>> do. I wouldn't have thought to use XPUB/XSUB for this, but as always,
>>> things seem intuitive _after_ the fact.
>>>
>>> Perhaps a naive question, but how are you handling new nodes joining the
>>> network (i.e. scaling the network up due to load) after it's all up and
>>> running? I mean, they wouldn't receive the initial discovery pub/sub
>>> notifications from the earlier nodes, would they?
>>>
>>> On Mon, Jun 25, 2018 at 1:33 AM James Harvey <jhar...@factset.com>
>>> wrote:
>>>
>>>> Hi James,
>>>>
>>>> I am doing something almost identical to Bill with regards to
>>>> discovery.
>>>>
>>>> My system is a distributed cache, where I have X discovery proxies at
>>>> fixed locations with fixed ports for upstream/downstream. They are just
>>>> XPUB/XSUB zmq_proxy instances (with verbose/verboser on).
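>>>>
>>>> In pyzmq terms, each such proxy is only a few lines; a minimal sketch
>>>> (the ports are illustrative):
>>>>
>>>>     import zmq
>>>>
>>>>     ctx = zmq.Context.instance()
>>>>     upstream = ctx.socket(zmq.XSUB)              # cachers publish here
>>>>     upstream.bind("tcp://*:5555")
>>>>     downstream = ctx.socket(zmq.XPUB)            # consumers subscribe here
>>>>     downstream.setsockopt(zmq.XPUB_VERBOSE, 1)   # pass dup subscriptions
>>>>     downstream.bind("tcp://*:5556")
>>>>
>>>>     zmq.proxy(upstream, downstream)      # blocks; shuttles both ways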
>>>>
>>>> + Cacher publishes its location in a JSON message (ports, IP, other
>>>> details) to the upstream port of the discovery proxy, on a topic
>>>> describing what's in its cache.
>>>> + Consumers subscribe to the downstream port of the discovery proxy with
>>>> a prefix (partial subject) matching the caches they are interested in.
>>>> + Consumers parse the incoming JSON, decide on the best cache, and
>>>> connect to it directly, bypassing the proxy (sketched below).
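>>>>
>>>> A consumer-side sketch of those last two steps (pyzmq; the topic and
>>>> JSON fields are invented):
>>>>
>>>>     import json
>>>>     import zmq
>>>>
>>>>     ctx = zmq.Context.instance()
>>>>     disc = ctx.socket(zmq.SUB)
>>>>     disc.setsockopt(zmq.SUBSCRIBE, b"cache.prices.")  # partial subject
>>>>     disc.connect("tcp://discovery-proxy:5556")        # downstream port
>>>>
>>>>     # a real consumer would collect blobs for a while and pick the best
>>>>     topic, payload = disc.recv_multipart()
>>>>     meta = json.loads(payload)               # ports, IP, other details
>>>>
>>>>     data = ctx.socket(zmq.SUB)
>>>>     data.setsockopt(zmq.SUBSCRIBE, b"")
>>>>     data.connect(meta["endpoint"])           # direct; bypasses the proxy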
>>>>
>>>> This system works between the DC and the cloud (AWS). I also have a
>>>> system, internal to the DC, that uses ZeroMQ with multicast PGM to
>>>> broadcast the discovery info. This is nice as there is no single point of
>>>> failure, but you have more discovery traffic (as multicast PUB/SUB has to
>>>> filter on the SUB side) and you need a multicast-capable network.
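>>>>
>>>> With ZeroMQ, switching the discovery channel to multicast is mostly a
>>>> change of endpoint; a sketch (this assumes libzmq was built with PGM
>>>> support, and the interface/group address are illustrative):
>>>>
>>>>     import zmq
>>>>
>>>>     ctx = zmq.Context.instance()
>>>>     pub = ctx.socket(zmq.PUB)
>>>>     # encapsulated PGM: "epgm://<interface>;<multicast group>:<port>"
>>>>     pub.connect("epgm://eth0;239.192.1.1:5555")
>>>>
>>>>     sub = ctx.socket(zmq.SUB)
>>>>     sub.setsockopt(zmq.SUBSCRIBE, b"cache.")  # filtered on the SUB side
>>>>     sub.connect("epgm://eth0;239.192.1.1:5555")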
>>>>
>>>> James Harvey
>>>>
>>>>
>>>> From: zeromq-dev <zeromq-dev-boun...@lists.zeromq.org> On Behalf Of
>>>> Bill Torpey
>>>> Sent: 23 June 2018 21:29
>>>> To: ZeroMQ development list <zeromq-dev@lists.zeromq.org>
>>>> Subject: Re: [zeromq-dev] zmq architecture/protocol planning
>>>>
>>>> Hi James:
>>>>
>>>> I’m doing something similar on the service discovery end, but it’s a
>>>> work in progress, so take this with the appropriate amount of salt ;-)
>>>>
>>>> It seems a good idea to minimize state as much as possible, especially
>>>> distributed state, so I have so far avoided the central “registrar”,
>>>> preferring to distribute that functionality out to the nodes, and to
>>>> delegate as much functionality as possible to ZeroMQ itself.
>>>>
>>>> I’ve got a single well-known endpoint, which is a process running
>>>> zmq_proxy (actually multiple processes, but let’s keep it simple). Nodes
>>>> use PUB/SUB messaging to exchange discovery messages with the proxy, and
>>>> use the discovery messages to establish direct PUB/SUB connections to peer
>>>> nodes over a second socket pair. I let ZeroMQ deal with the filtering by
>>>> topic. I also let ZeroMQ deal with ignoring multiple connection attempts
>>>> to the same endpoint, which greatly simplifies the discovery protocol.
>>>> (If you decide to do something like that, you probably want to make sure
>>>> you are working with a relatively recent version of ZeroMQ; there have
>>>> been some recent changes in that functionality:
>>>> https://github.com/zeromq/libzmq/pull/2879).
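>>>>
>>>> A sketch of what one node in that scheme might look like (pyzmq; the
>>>> endpoints and message layout are my guesses, not Bill's actual design):
>>>>
>>>>     import zmq
>>>>
>>>>     ctx = zmq.Context.instance()
>>>>     MY_DATA_ENDPOINT = "tcp://10.0.0.7:7000"  # hypothetical
>>>>
>>>>     # discovery socket pair, both talking to the well-known proxy
>>>>     disc_pub = ctx.socket(zmq.PUB)
>>>>     disc_pub.connect("tcp://proxy:5555")
>>>>     disc_sub = ctx.socket(zmq.SUB)
>>>>     disc_sub.setsockopt(zmq.SUBSCRIBE, b"DISCOVERY")
>>>>     disc_sub.connect("tcp://proxy:5556")
>>>>
>>>>     # second socket pair for direct peer-to-peer data
>>>>     data_pub = ctx.socket(zmq.PUB)
>>>>     data_pub.bind(MY_DATA_ENDPOINT)
>>>>     data_sub = ctx.socket(zmq.SUB)
>>>>     data_sub.setsockopt(zmq.SUBSCRIBE, b"")
>>>>
>>>>     disc_pub.send_multipart([b"DISCOVERY", MY_DATA_ENDPOINT.encode()])
>>>>     while True:
>>>>         _, peer = disc_sub.recv_multipart()
>>>>         # duplicate connects to the same endpoint are ignored by
>>>>         # libzmq, which keeps the discovery protocol simple
>>>>         data_sub.connect(peer.decode())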
>>>>
>>>> The result of this is a fully-connected network, with each node having
>>>> direct PUB/SUB connections to every other node.  That may or may not work
>>>> for your application, but for mine it is fine (~100 nodes total).
>>>>
>>>> As mentioned, there’s a somewhat complicated protocol that ensures that
>>>> every node gets to see all the discovery messages, but without flooding the
>>>> network.  That part is still a work-in-progress, but it’s looking pretty
>>>> reliable so far.
>>>>
>>>> If you decide to do something similar, let me suggest you take a look
>>>> at the excellent ZMQ_XPUB_WELCOME_MSG socket option contributed by Doron
>>>> Somech (https://somdoron.com/2015/09/reliable-pubsub/).  I use this to
>>>> get a notification when the discovery SUB socket is connected to the
>>>> zmq_proxy, which triggers publication of discovery messages on the
>>>> discovery PUB socket.
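>>>>
>>>> Very roughly, that option works like this (a pyzmq sketch; it needs
>>>> libzmq 4.1 or newer, and the topics are invented):
>>>>
>>>>     import zmq
>>>>
>>>>     ctx = zmq.Context.instance()
>>>>
>>>>     # proxy side: XPUB greets every new subscriber
>>>>     xpub = ctx.socket(zmq.XPUB)
>>>>     xpub.setsockopt(zmq.XPUB_WELCOME_MSG, b"WELCOME")
>>>>     xpub.bind("tcp://*:5556")
>>>>
>>>>     # node side: subscribe to the welcome topic plus normal topics
>>>>     sub = ctx.socket(zmq.SUB)
>>>>     sub.setsockopt(zmq.SUBSCRIBE, b"WELCOME")
>>>>     sub.setsockopt(zmq.SUBSCRIBE, b"DISCOVERY")
>>>>     sub.connect("tcp://proxy:5556")
>>>>
>>>>     if sub.recv() == b"WELCOME":
>>>>         # connection confirmed: now publish our discovery messages
>>>>         pass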
>>>>
>>>> Hope this helps...
>>>>
>>>> Regards,
>>>>
>>>> Bill
>>>>
>>>> On Jun 23, 2018, at 12:13 AM, James Addison <add...@gmail.com> wrote:
>>>>
>>>> Looking for a little guidance/advice on ZMQ implementation.
>>>>
>>>> The following demonstrates the simplified architecture that I'm
>>>> considering. It doesn't (yet) take redundancy or load balancing at all
>>>> levels into consideration. The general flow of request/response traffic
>>>> would be:
>>>>
>>>> -> HTTP request from internet
>>>> -> nginx (1 node)
>>>> -> aiohttp + zmq-based frontend (1 or more nodes depending on system
>>>> demands)
>>>> -> zmq-based router (1 node)
>>>> -> zmq-based worker (n nodes; scalable depending on dynamic demand)
>>>>
>>>> I want my system to work in environments where multicast/broadcast is
>>>> not available (e.g. an AWS EC2 VPC), so I believe a well-known node for
>>>> service discovery is needed.
>>>>
>>>> With that in mind, all zmq-based nodes would:
>>>>
>>>> - register with the 'central' service discovery (SD) node on startup to
>>>> make other nodes aware of its presence
>>>> - separately SUBscribe to the service discovery node's PUB endpoint to
>>>> receive topics of pertinent peer nodes' connection details
>>>>
>>>> In the nginx config, I plan to have an 'upstream' defined in a separate
>>>> file that is updated by a zmq-based process that also SUBscribes to the
>>>> service discovery node.
>>>>
>>>> ZMQ-based processes, and their relation to other ZMQ-based processes:
>>>>
>>>> - service discovery (SD)
>>>> - zmq-based nginx upstream backend updater; registers with SD, SUBs to
>>>> frontend node topic (to automatically add frontend node connection details
>>>> to nginx config and reload nginx)
>>>> - frontend does some request validation and caching; registers with SD,
>>>> SUBS to router node topic (to auto connect to the router's endpoint)
>>>> - router is the standard zmq DEALER/ROUTER pattern (see the sketch
>>>> after this list); registers with SD
>>>> - worker is the bit that handles the heavy lifting; registers with SD,
>>>> SUBS to router node topic (to auto connect to the router's endpoint)
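>>>>
>>>> For the router piece, a minimal ROUTER/DEALER proxy between frontends
>>>> and workers might look like this (pyzmq; ports are illustrative):
>>>>
>>>>     import zmq
>>>>
>>>>     ctx = zmq.Context.instance()
>>>>     frontend = ctx.socket(zmq.ROUTER)    # frontends connect here
>>>>     frontend.bind("tcp://*:7000")
>>>>     backend = ctx.socket(zmq.DEALER)     # workers connect here
>>>>     backend.bind("tcp://*:7001")
>>>>
>>>>     zmq.proxy(frontend, backend)         # fair-queues work to workers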
>>>>
>>>> The whole point of this is that each node only ever needs to know the
>>>> well-known service discovery node endpoint - and each node can
>>>> auto-discover and hopefully recover in most downtime scenarios (excluding
>>>> mainly if the SD node goes down, but that's outside of scope at the
>>>> moment).
>>>>
>>>> Questions!
>>>>
>>>> 1. Does this architecture make sense? In particular, the single
>>>> well-known service discovery node, and every other node doing PUB/SUB
>>>> with it for relevant endpoint topics?
>>>> 2. Who should heartbeat to whom? PING/PONG? I.e. when a given node
>>>> registers with the SD node, should the registering node start
>>>> heartbeating on the same connection to the SD node, or should the SD node
>>>> open a separate new socket to the registering node? The SD node is the
>>>> one that will need to know if registered nodes drop off the earth, I
>>>> think?
>>>>
>>>> I'll likely have follow-up questions - hope that's ok!
>>>>
>>>> Thanks,
>>>> James
>>>>
>>>
>>>
>>
>>
>>
>
>
> --
> James Addison
> email: add...@gmail.com
> twitter: @jamesaddison
>
_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
https://lists.zeromq.org/mailman/listinfo/zeromq-dev
