Hi Bill,

I hadn't heard about OZ when I started my project that now relies on 0MQ. The tech stack was largely chosen by my supervisor and co-workers. Now the priority is to get the system verified and in production as soon as possible. Thus, at this point we must stick with the current dependencies. Anyway, this OZ does seem interesting...

About Spin, I mainly chose it because I had used it with success in my earlier work that relates to the thesis. I'll list some other points that come into my mind:

- I find the PROMELA language to be intuitive and elegant for formalising asynchronous systems. I like the way how PROMELA resembles a programming language. The TLA+ language seems to be lower level in the sense that I need to keep track of program counters etc. Of course there's the pluscal language for TLA, but I haven't tried it. - The TLC model checker seems to be quite a bit slower than Spin. This is more like the first impression than an informed opinion, though. It could be the case that TLC just has more overhead that slows down the verification of small models but which becomes unnoticeable on larger models. - The TLA+ IDE does feel much better than iSpin or jSpin. However, I personally prefer the simplicity of the Emacs PROMELA mode. - I find the theory behind Spin and enumerative model checking (guarded commands language, Büchi automata) easier to understand than that of symbolic model checking (CSPs, SAT, SMT, DPLL, CDCL, BMC, IC3, etc.). Direct construction with automata feels more comfortable for me than the more analytical encodings and constraint satisfaction problems. Of course, your mileage may vary. Some understanding on how the model checker works helps fighting the state space explosion. - I've read Mordechai Ben-Ari's book "Principles of the Spin Model Checker", and it was a great tutorial. It was much more understandable for me than Leslie Lamport's TLA Hyperbook and other resources.

Of the four model checkers I've tested (Spin, NuSMV, TLA+, and DIVINE) I find Spin the easiest to use. This is not to say that Spin itself is easy to use. Frankly, the output formatting of Spin verifier executables is horrible. Understanding counter examples can take much work.

Best regards,
John

On 15.10.2020 17.11, Bill Torpey wrote:
Hi John:

I’ll help where I can — pls see embedded comments.

Also, shameless plug: you may want to check out OZ (https://github.com/nyfix/OZ <https://github.com/nyfix/OZ>) — this is an “out of the box” implementation of ZeroMQ as the transport layer for OpenMAMA (https://www.openmama.org/ <https://www.openmama.org/>).  It may provide a simple way to run some test code (or at least provide some implementation hints).

I’m interested in your thesis in any case — I’ve been wanting to create a formal model of OZ for some time, and have been dipping my toes into TLA+ (https://en.wikipedia.org/wiki/TLA%2B <https://en.wikipedia.org/wiki/TLA%2B>) for that purpose.  I’m curious if you’ve considered TLA+ and if so what prompted you to choose PROMELA?

Good luck with your thesis, and I’d very much like to review it when you’ve got something suitable.

Regards,

Bill Torpey

On Oct 15, 2020, at 5:23 AM, John Lång <john.l...@mykolab.com <mailto:john.l...@mykolab.com>> wrote:

Hello,

(Hopefully this message didn't get duplicated...) As part of my master's thesis, I'm building a formal model for my distributed system using PROMELA. The system I'm modelling uses 0MQ publish-subscribe pattern. I have some questions about the pattern. I'd really appreciate all answers to these questions! I'm now looking at the informal specification at https://rfc.zeromq.org/spec/29/

The specification mostly looks clear. However, I'm a bit confused about these two bullet points:

  * SHALL create a queue when initiating an outgoing connection to a
    subscriber, and SHALL maintain the queue whether or not the
    connection is established.
  * SHALL create a queue when a subscriber connects to it. If this
    subscriber disconnects, the PUB socket SHALL destroy its queue
    and SHALL discard any messages it contains.

What is the difference between initiating an outgoing connection to a subscriber and a subscriber connecting to a publisher? In the C++ source code of my system, a publisher binds to a PUB socket and a subscriber connects to the address of a publisher. I guess this sounds more like the second point, but I'm not certain.

This is an implementation detail, but can be important in some use-cases.  Basically, when a SUB connects to a PUB, subscription information is exchanged between the SUB and PUB as part of the connect handshake — when a PUB connects to a SUB an additional set of messages is required to communicate subscription information from the SUB to the PUB.  This matters because, for most protocols, filtering is done by the PUB, so the PUB will not send messages until it knows that the SUB is interested in them.  In practice, this means that when a SUB connects to a PUB, the SUB will immediately start receiving messages — in the other direction, there is a “window” where some number of messages that are published immediately following the connect may not be sent to SUB.

There’s been a lot of discussion of this in the ZeroMQ community, for instance https://github.com/zeromq/libzmq/issues/2267 <https://github.com/zeromq/libzmq/issues/2267>.  Note that the canonical solution for this (calling zmq_poll after connect, as per https://gist.github.com/hintjens/7344533 <https://gist.github.com/hintjens/7344533>) is NOT reliable — there is still a window, although calling zmq_poll does often reduce the window size such that things appear to work.

Did I understand correctly that this message queue between a publisher and a subscriber works FIFO? It says that for outgoing messages, "SHALL silently drop the message if the queue for a subscriber is full." Does this imply that those messages that fit in the queue are delivered in the order they were sent in? I take it that the queue being full means that the high water mark has been exceeded.

Our testing was done not on queue full, but on socket disconnect/reconnect, but the principle is the same.  Any queued messages are delivered in order, and any messages dropped are dropped from the tail of the queue.  So a typical sequence at a subscriber would look something like this if the queue fills up after message #3 and then is drained before message #10 is sent:

1 .. 2 .. 3 .. <queue full> .. 10, 11, 12

What is a binary comparison? Is it the same as bitwise comparison?

Yes.

To me, "binary comparison" sounds like just comparing two things with each other.

Currently, I'm modelling my 0MQ publish-subscribe connections as arrays in PROMELA. After all, the specification for the publisher says that processing outgoing messages "SHALL NOT block on sending". Another reason for my decision is that I have multiple publishers and subscribers and a fixed message queue length, so a three-dimensional array is handy for accessing the messages. Is there a way for achieving sensible channel semantics for 0MQ publish-subscribe pattern?

I wonder if there's already a formal specification for 0MQ publish-subscribe pattern out there somewhere... I should probably do some more research to see if I can find related work on this matter.

If you do I’m sure that the community would love to know about it, as would I.

Best regards,
John Lång
_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org <mailto:zeromq-dev@lists.zeromq.org>
https://lists.zeromq.org/mailman/listinfo/zeromq-dev


_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
https://lists.zeromq.org/mailman/listinfo/zeromq-dev
_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
https://lists.zeromq.org/mailman/listinfo/zeromq-dev

Reply via email to