Hi all,

I would like to point out that the Alert Module is not currently designed for
high availability. There are two problems. First, when multiple alert services
are started, the same alert is sent repeatedly. Second, when the alert service
goes down, DS alerting stops working altogether.
So far, I have come up with two architectures that avoid sending duplicate
alert messages while making the Alert Module highly available.

1. The first is a master-slave relationship between the alert services,
coordinated through ZK. Only the master node sends alerts. If the master node
goes down, a new master is elected and continues to provide the alert service.
2. The second is a decentralised design in which all alert services work
simultaneously and coordinate through exclusive locks, so that each alert
message is sent only once.
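
To make the two options easier to compare, here is a rough sketch of both. It
is only an illustration under stated assumptions: FakeCoordinator is a
hypothetical in-memory stand-in for ZK, and all function and node names are
made up for this example, not taken from the DS code base. In a real
deployment the same roles could probably be filled by ZK client recipes such
as Curator's LeaderLatch (election) and InterProcessMutex (locking).

```python
import threading


class FakeCoordinator:
    """Hypothetical in-memory stand-in for ZK; not a real ZK client."""

    def __init__(self):
        self._guard = threading.Lock()
        self._leader = None      # id of the current master, if any
        self._claimed = set()    # alert ids already claimed by some node

    # --- Option 1: master election ---
    def try_become_leader(self, node_id):
        """Return True if node_id is (or has just become) the master."""
        with self._guard:
            if self._leader is None:
                self._leader = node_id
            return self._leader == node_id

    def node_down(self, node_id):
        """Mimic an ephemeral node vanishing when its owner dies."""
        with self._guard:
            if self._leader == node_id:
                self._leader = None

    # --- Option 2: exclusive claim per alert ---
    def try_claim(self, alert_id):
        """Return True for exactly one caller per alert_id."""
        with self._guard:
            if alert_id in self._claimed:
                return False
            self._claimed.add(alert_id)
            return True


def master_slave_round(coord, live_nodes, alerts):
    """Option 1: only the node that holds the master role sends."""
    sent = []
    for node in live_nodes:
        if coord.try_become_leader(node):
            sent.extend((node, alert) for alert in alerts)
    return sent


def decentralised_round(coord, live_nodes, alerts):
    """Option 2: every node works, but each alert is claimed only once."""
    sent = []

    def worker(node):
        for alert in alerts:
            if coord.try_claim(alert):
                sent.append((node, alert))

    threads = [threading.Thread(target=worker, args=(n,)) for n in live_nodes]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sent


if __name__ == "__main__":
    coord = FakeCoordinator()
    print(master_slave_round(coord, ["alert-1", "alert-2"], ["a", "b"]))
    coord.node_down("alert-1")          # master dies, alert-2 takes over
    print(master_slave_round(coord, ["alert-2"], ["c"]))
    print(decentralised_round(FakeCoordinator(), ["alert-1", "alert-2"],
                              ["a", "b", "c"]))
```

One thing the sketch glosses over for option 2: a lock by itself is released
after use, so the alert would also need to be marked as sent (for example in
the database) while the lock is still held. The claimed set above plays that
role here.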

If anyone has a better plan, let's discuss it together.

Thx

 


