Hi James,

The cachers in my setup publish their discovery information every second to 
the discovery proxies. I have maybe 100 cachers, and the network overhead is 
low compared with the ease of use and the fact that you can use that info to 
confirm the publisher is still running.

I add meta info to the JSON blobs with details on cache sizes, current 
connection counts, etc. This allows the client to make an informed decision 
about which cache to connect to.
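
Something like this, roughly, for the per-second announcement (the endpoint, 
topic name, and JSON fields here are made up for illustration, not my actual 
schema):

#include <zmq.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    void *ctx = zmq_ctx_new();
    void *pub = zmq_socket(ctx, ZMQ_PUB);
    zmq_connect(pub, "tcp://discovery-host:5556");  /* proxy upstream (XSUB) side */

    const char *topic = "cache.prices";             /* placeholder topic */
    const char *json  = "{\"ip\":\"10.0.0.5\",\"port\":6000,"
                        "\"cache_size\":123456,\"connections\":17}";
    for (;;) {
        /* topic frame first, then the JSON blob with the metadata
           consumers use to pick a cache */
        zmq_send(pub, topic, strlen(topic), ZMQ_SNDMORE);
        zmq_send(pub, json, strlen(json), 0);
        sleep(1);  /* re-announce every second; doubles as a liveness signal */
    }
}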

I also set up my publishers as verbose, so the publisher can catch the 
subscription message of a new joiner and send out its details again. So there 
is no one-second delay.
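
Roughly like this on the cacher side, assuming an XPUB socket there so the 
re-forwarded subscription messages are visible to it (again, names and 
endpoints are made up):

#include <zmq.h>
#include <string.h>

/* Cacher-side sketch: an XPUB socket receives subscription messages,
   so a new joiner's subscribe triggers an immediate re-announcement. */
void serve_discovery(void *ctx)
{
    void *xpub = zmq_socket(ctx, ZMQ_XPUB);
    int verbose = 1;
    /* pass duplicate subscriptions through instead of deduplicating,
       so every new joiner is visible even on an already-known topic */
    zmq_setsockopt(xpub, ZMQ_XPUB_VERBOSE, &verbose, sizeof verbose);
    zmq_connect(xpub, "tcp://discovery-host:5556");  /* proxy upstream side */

    char buf[256];
    for (;;) {
        int n = zmq_recv(xpub, buf, sizeof buf, 0);
        if (n > 0 && buf[0] == 1) {                  /* 0x01 = subscribe */
            const char *json = "{\"ip\":\"10.0.0.5\",\"port\":6000}";
            zmq_send(xpub, "cache.prices", 12, ZMQ_SNDMORE);
            zmq_send(xpub, json, strlen(json), 0);
        }
    }
}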

You can use either method, or both, as I do.

I also think there is a beacon service built into ZeroMQ (or czmq? zbeacon, I 
believe) that may suit, but I have never used it.

Cheers

James Harvey

On 25 Jun 2018, at 17:16, James Addison <add...@gmail.com> wrote:

James - thank you for building on what Bill mentioned; that's actually quite 
helpful. I think what you describe is very close to what I need to do. I 
wouldn't have thought to use XPUB/XSUB for this, but as always, things seem 
intuitive _after_ the fact.

Perhaps a naive question, but how are you handling new nodes joining the 
network (i.e. scaling the network up due to load) after it's all up and 
running? I mean, they wouldn't receive the initial discovery pub/sub 
notifications from the earlier nodes, would they?

On Mon, Jun 25, 2018 at 1:33 AM James Harvey <jhar...@factset.com> wrote:
Hi James,

I am doing something almost identical to Bill with regards to discovery.

My system is a distributed cache, where I have X discovery proxies at fixed 
locations with fixed ports for upstream/downstream. They are just XPUB/XSUB 
zmq_proxy instances (with verbose/verboser on).
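
Each proxy is roughly this small (ports made up for illustration):

#include <zmq.h>

int main(void)
{
    void *ctx  = zmq_ctx_new();
    void *xsub = zmq_socket(ctx, ZMQ_XSUB);
    void *xpub = zmq_socket(ctx, ZMQ_XPUB);
    int verbose = 1;
    /* forward duplicate subscriptions upstream; ZMQ_XPUB_VERBOSER also
       forwards duplicate unsubscriptions, if you need those too */
    zmq_setsockopt(xpub, ZMQ_XPUB_VERBOSE, &verbose, sizeof verbose);
    zmq_bind(xsub, "tcp://*:5556");   /* upstream: publishers connect here */
    zmq_bind(xpub, "tcp://*:5557");   /* downstream: subscribers connect here */
    zmq_proxy(xsub, xpub, NULL);      /* blocks, shuttling messages both ways */
    return 0;
}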

+ Cachers publish their location in a JSON message (ports, IP, other details), 
on a topic describing what's in their cache, to the upstream port of the 
discovery proxy.
+ Consumers subscribe to the downstream port of the discovery proxy with a 
prefix (partial) subject for the caches they are interested in.
+ Consumers parse the incoming JSON messages, decide on the best cache, and 
connect to it directly (bypassing the proxy).
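
A consumer then looks something like this sketch (topic, endpoint, and the 
JSON handling are illustrative only):

#include <zmq.h>
#include <stdio.h>

int main(void)
{
    void *ctx = zmq_ctx_new();
    void *sub = zmq_socket(ctx, ZMQ_SUB);
    zmq_connect(sub, "tcp://discovery-host:5557");   /* proxy downstream side */
    /* prefix subscription: matches "cache.prices", "cache.prices.eu", ... */
    zmq_setsockopt(sub, ZMQ_SUBSCRIBE, "cache.prices", 12);

    char topic[256], json[1024];
    zmq_recv(sub, topic, sizeof topic, 0);           /* topic frame */
    int n = zmq_recv(sub, json, sizeof json - 1, 0); /* JSON frame */
    if (n < 0) return 1;
    if (n > (int)sizeof json - 1) n = sizeof json - 1;  /* recv may truncate */
    json[n] = '\0';
    printf("announcement: %s\n", json);

    /* parse the JSON, pick the best cache, then connect to it directly,
       bypassing the proxy, e.g.:
       zmq_connect(cache_sock, "tcp://10.0.0.5:6000"); */
    return 0;
}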

This system works between the DC and the cloud (AWS). I also have a system, 
internal to the DC, that uses ZeroMQ with multicast PGM to broadcast the 
discovery info. This is nice as there is no single point of failure, but you 
have more discovery traffic (as multicast PUB/SUB has to filter on the SUB 
side) and you need a multicast-capable network.
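
The multicast variant just swaps the transport; a rough sketch, assuming 
libzmq was built with PGM support and using a made-up interface and group 
address:

#include <zmq.h>

void join_multicast_discovery(void *ctx, void **pub_out, void **sub_out)
{
    /* with PGM both ends connect to the same multicast group */
    *pub_out = zmq_socket(ctx, ZMQ_PUB);
    zmq_connect(*pub_out, "epgm://eth0;239.192.1.1:5555");

    *sub_out = zmq_socket(ctx, ZMQ_SUB);
    zmq_connect(*sub_out, "epgm://eth0;239.192.1.1:5555");
    /* note: with multicast the filtering really happens on the SUB
       side, hence the extra discovery traffic mentioned above */
    zmq_setsockopt(*sub_out, ZMQ_SUBSCRIBE, "cache.", 6);
}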

James Harvey


From: zeromq-dev <zeromq-dev-boun...@lists.zeromq.org> On Behalf Of Bill Torpey
Sent: 23 June 2018 21:29
To: ZeroMQ development list <zeromq-dev@lists.zeromq.org>
Subject: Re: [zeromq-dev] zmq architecture/protocol planning

Hi James:

I’m doing something similar on the service discovery end, but it’s a work in 
progress, so take this with the appropriate amount of salt ;-)

It seems a good idea to minimize state as much as possible, especially 
distributed state, so I have so far avoided the central “registrar”, preferring 
to distribute that functionality out to the nodes, and to delegate as much 
functionality as possible to ZeroMQ itself.

I’ve got a single well-known endpoint, which is a process running zmq_proxy 
(actually multiple processes, but let’s keep it simple).  Nodes use PUB/SUB 
messaging to exchange discovery messages with the proxy, and use the discovery 
messages to establish direct PUB/SUB connections to peer nodes over a second 
socket pair.  I let ZeroMQ deal with the filtering by topic.  I also let ZeroMQ 
deal with ignoring multiple connection attempts to the same endpoint, which 
greatly simplifies the discovery protocol.  (If you decide to do something like 
that, you probably want to make sure you are working with a relatively recent 
version of ZeroMQ; there have been some recent changes in that functionality: 
https://github.com/zeromq/libzmq/pull/2879.)
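
As a rough sketch of what each node does (endpoints and topic are 
placeholders, and this is simplified from what I actually run):

#include <zmq.h>

void setup_discovery(void *ctx, void **pub_out, void **sub_out)
{
    *pub_out = zmq_socket(ctx, ZMQ_PUB);
    zmq_connect(*pub_out, "tcp://proxy-host:5556");  /* proxy XSUB side */

    *sub_out = zmq_socket(ctx, ZMQ_SUB);
    zmq_connect(*sub_out, "tcp://proxy-host:5557");  /* proxy XPUB side */
    zmq_setsockopt(*sub_out, ZMQ_SUBSCRIBE, "discovery.", 10);

    /* Later, when a discovery message names a peer, just connect to it.
       Calling zmq_connect() for an endpoint the socket already knows is
       effectively a no-op (see the libzmq PR linked above), which is
       what keeps the discovery protocol simple. */
}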

The result of this is a fully-connected network, with each node having direct 
PUB/SUB connections to every other node.  That may or may not work for your 
application, but for mine it is fine (~100 nodes total).

As mentioned, there’s a somewhat complicated protocol that ensures that every 
node gets to see all the discovery messages, but without flooding the network.  
That part is still a work-in-progress, but it’s looking pretty reliable so far.

If you decide to do something similar, let me suggest you take a look at the 
excellent ZMQ_XPUB_WELCOME_MSG socket option contributed by Doron Somech 
(https://somdoron.com/2015/09/reliable-pubsub/).  I use this to get a 
notification when the discovery SUB socket is connected to the zmq_proxy, which 
triggers publication of discovery messages on the discovery PUB socket.
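
A sketch of how that can look, with a made-up welcome topic (the option goes 
on the XPUB side; the SUB side just subscribes to the welcome topic and waits):

#include <zmq.h>

/* Proxy side: every newly connected subscriber receives this message once. */
void enable_welcome(void *xpub)
{
    zmq_setsockopt(xpub, ZMQ_XPUB_WELCOME_MSG, "WELCOME", 7);
}

/* Node side: receiving the welcome proves the SUB socket is actually
   connected, so it is now safe to publish discovery messages. */
void wait_until_connected(void *sub)
{
    char buf[64];
    zmq_setsockopt(sub, ZMQ_SUBSCRIBE, "WELCOME", 7);
    zmq_recv(sub, buf, sizeof buf, 0);  /* blocks until the welcome arrives */
}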

Hope this helps...

Regards,

Bill

On Jun 23, 2018, at 12:13 AM, James Addison <add...@gmail.com> wrote:

Looking for a little guidance/advice on ZMQ implementation.

The following demonstrates the simplistic architecture that I'm considering. It 
doesn't take into consideration redundancy or load balancing at all levels (yet). 
The general flow of request/response traffic would be:

-> HTTP request from internet
-> nginx (1 node)
-> aiohttp + zmq-based frontend (1 or more nodes depending on system demands)
-> zmq-based router (1 node)
-> zmq-based worker (n nodes; scalable depending on dynamic demand)

I want my system to work in environments where multicast/broadcast is not 
available (i.e. AWS EC2 VPC), so I believe a well-known node for service 
discovery is needed.

With that in mind, all zmq-based nodes would:

- register with the 'central' service discovery (SD) node on startup to make 
other nodes aware of its presence
- separately SUBscribe to the service discovery node's PUB endpoint to receive 
topics of pertinent peer nodes' connection details

In the nginx config, I plan to have an 'upstream' defined in a separate file 
that is updated by a zmq-based process that also SUBscribes to the service 
discovery node.

ZMQ-based processes, and their relation to other ZMQ-based processes:

- service discovery (SD)
- zmq-based nginx upstream backend updater; registers with SD, SUBs to the 
frontend node topic (to automatically add frontend node connection details to 
the nginx config and reload nginx)
- frontend does some request validation and caching; registers with SD, SUBs 
to the router node topic (to auto-connect to the router's endpoint)
- router is the standard zmq DEALER/ROUTER pattern (sketched just after this 
list); registers with SD
- worker is the bit that handles the heavy lifting; registers with SD, SUBs to 
the router node topic (to auto-connect to the router's endpoint)
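
For the router piece, I'm imagining the textbook ROUTER/DEALER broker, 
something like this sketch (ports are placeholders):

#include <zmq.h>

int main(void)
{
    void *ctx      = zmq_ctx_new();
    void *frontend = zmq_socket(ctx, ZMQ_ROUTER);
    void *backend  = zmq_socket(ctx, ZMQ_DEALER);
    zmq_bind(frontend, "tcp://*:5570");  /* zmq-based frontends connect here */
    zmq_bind(backend,  "tcp://*:5571");  /* workers connect here */
    zmq_proxy(frontend, backend, NULL);  /* fair-queues requests to workers */
    return 0;
}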

The whole point of this is that each node only ever needs to know the 
well-known service discovery node endpoint, and each node can auto-discover 
and hopefully recover in most downtime scenarios (excluding, mainly, the SD 
node going down, but that's out of scope at the moment).

Questions!

1. Does this architecture make sense? In particular, the single well-known 
service discovery node and every other node doing PUB/SUB with it for relevant 
endpoint topics?
2. Who should heartbeat to whom? PING/PONG? I.e., when a given node registers 
with the SD node, should the registering node start heartbeating on the same 
connection to the SD node, or should the SD node open a separate new socket to 
the registering node? The SD node is the one that will need to know if 
registered nodes drop off the earth, I think?

I'll likely have followup questions - hope that's ok!

Thanks,
James



--
James Addison
email: add...@gmail.com
twitter: @jamesaddison
_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
https://lists.zeromq.org/mailman/listinfo/zeromq-dev
