Hi,
I don't think HA for the alert service is necessary at present. Users can implement it themselves as an extension. We should focus on the current scheduling work.

Best,
Yichao Yang

------------------ Original ------------------
From: JUN GAO <[email protected]>
Date: Sat, Aug 22, 2020 9:41 PM
To: dev <[email protected]>
Subject: Re: About the high availability implementation of the Alert service

I think the first one is better.

[email protected] <[email protected]> wrote on Sat, Aug 22, 2020 at 19:30:

> Hi all,
>
> I would like to raise a suggestion. The Alert Module is not currently
> designed for high availability: when multiple alert services are started,
> the same alert is sent repeatedly, and when the alert service goes down,
> DS alerting stops working.
>
> So far, I've come up with two architectures that address the problem of
> repeated alert messages while also making the Alert Module highly
> available.
>
> 1. The first is a master-slave relationship between the alert services,
> implemented through ZK. Only the master node is responsible for sending
> messages. After the master node goes down, a new master is elected, and
> the new master node continues to provide the alert service.
> 2. The second is a decentralized design in which all alert services work
> simultaneously, coordinated through exclusive locks, so that alert
> messages are not sent more than once.
>
> If anyone has a better plan, we can discuss it together.
>
> Thx
>
> [email protected]

--
DolphinScheduler(Incubator) PPMC
Jun Gao 高俊
[email protected]
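
For reference, the second option in the quoted proposal (all alert services active, with exclusive locks preventing duplicate sends) could be sketched roughly as below. This is only an illustration under stated assumptions, not DolphinScheduler code: `ClaimStore`, `run_worker`, and `simulate` are hypothetical names, and the in-memory store stands in for whatever shared lock the real services would use (for example ZK or a database row lock).

```python
import threading


class ClaimStore:
    """In-memory stand-in for a shared exclusive-lock service (hypothetical).

    A real deployment would need a lock visible to every alert service,
    such as a ZooKeeper node or a database row lock.
    """

    def __init__(self):
        self._guard = threading.Lock()
        self._owners = {}  # alert_id -> name of the worker that claimed it

    def try_claim(self, alert_id, worker):
        """Return True only for the first worker that claims alert_id."""
        with self._guard:
            if alert_id in self._owners:
                return False
            self._owners[alert_id] = worker
            return True


def run_worker(name, alert_ids, store, sent):
    """Every worker scans all pending alerts but sends only those it claims."""
    for alert_id in alert_ids:
        if store.try_claim(alert_id, name):
            # Placeholder for the actual send (email, webhook, ...).
            sent.append((alert_id, name))


def simulate(num_workers=3, num_alerts=20):
    """Run several alert workers concurrently over the same pending alerts."""
    store = ClaimStore()
    sent = []
    alert_ids = list(range(num_alerts))
    threads = [
        threading.Thread(
            target=run_worker, args=(f"alert-{i}", alert_ids, store, sent)
        )
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sent


if __name__ == "__main__":
    results = simulate()
    print(len(results), "alerts sent by", len({w for _, w in results}), "worker(s)")
```

The sketch does not cover a worker that crashes after claiming an alert but before sending it; a real design would need a lease or timeout on the claim so another service can take over.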
