On 29/8/2023 8:35 am, Steve Thompson wrote:


What happens if a WINTEL server running MQ buys the farm? Those inflight transactions going through that server may time out and have to be re-driven. Is this considered an outage? Not if you have a second one handling the load and it takes over. But that one or 10(?) users may see an error message. Does that count as an outage if the user only loses a few seconds in getting an answer? Or a Pharmacy getting info? Or an OR getting info on drug interactions?

Distributed systems are very different beasts. Mitigating network partitions has led to the CAP theorem. Apache Kafka is a popular message broker on distributed systems and it's highly available if you run it on at least 3 nodes, which can tolerate the loss of 1 broker. With 5 nodes it can tolerate the loss of 2. Orchestration platforms such as Kubernetes and OpenShift make it quite easy to deploy clusters, even using availability zones to replicate to a remote data center. All brokers replicate with each other and are coordinated on the control plane using a consensus algorithm like Raft. As a mainframe guy I was blown away that anybody would find eventual consistency acceptable, but they do.

https://en.wikipedia.org/wiki/CAP_theorem
https://en.wikipedia.org/wiki/Raft_(algorithm)
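
The 3-node and 5-node figures fall out of majority-quorum arithmetic: a Raft-style cluster keeps working only while a strict majority of its voters is still up. Here is a minimal sketch of that rule in Python. The function names are mine, for illustration only, and it models just the consensus quorum. How many broker losses a given Kafka topic survives also depends on its replication settings, which this does not cover.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Most nodes that can be lost while a strict majority survives."""
    return nodes - quorum_size(nodes)


if __name__ == "__main__":
    # Illustrative cluster sizes only.
    for n in (1, 2, 3, 4, 5):
        print(f"{n} nodes: quorum {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Note that going from 3 to 4 nodes buys nothing: the quorum grows to 3, so the cluster still tolerates only 1 loss. That is why odd cluster sizes are the usual choice.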

z is the king of CA systems as it's not susceptible to network partitions. That's why I find it odd that people would want to run systems like Kafka on z/OS when Kafka's architecture is designed to run on unreliable commodity hardware.


Need some perspective.

Steve Thompson

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN

