Hi,

I don't think HA for the Alert service is necessary at present. Users can
extend it themselves if they need it. We should focus on the current scheduling work.

Best,
Yichao Yang


------------------ Original ------------------
From: JUN GAO <[email protected]>
Date: Sat, Aug 22, 2020 9:41 PM
To: dev <[email protected]>
Subject: Re: About the high availability implementation of the Alert service



I think the first one is better.
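
To make the first option concrete, here is a rough sketch of the idea: every
alert service sees the same pending alert, but only the elected master sends
it, and another instance takes over once the master goes down. This is only an
illustration under stated assumptions. All class and method names are
hypothetical and are not DolphinScheduler code, and the in-memory election is
a stand-in for what ZK would provide in a real deployment (for example an
ephemeral node, or a recipe such as Apache Curator's LeaderLatch).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: every name here is hypothetical, not DolphinScheduler code.
class AlertFailoverSketch {

    // In-memory stand-in for a ZK-backed election.
    static class InMemoryElection {
        private String leader; // null means no master is currently elected

        // Claims mastership if it is free; returns true if nodeId is the master.
        synchronized boolean tryAcquire(String nodeId) {
            if (leader == null) {
                leader = nodeId;
            }
            return leader.equals(nodeId);
        }

        // Models the master's session ending when the node goes down.
        synchronized void release(String nodeId) {
            if (nodeId.equals(leader)) {
                leader = null;
            }
        }
    }

    static class AlertServer {
        private final String id;
        private final InMemoryElection election;
        private final List<String> sent; // shared record of delivered alerts

        AlertServer(String id, InMemoryElection election, List<String> sent) {
            this.id = id;
            this.election = election;
            this.sent = sent;
        }

        // Every instance is offered every alert; only the master sends it.
        boolean handle(String alert) {
            if (!election.tryAcquire(id)) {
                return false; // not the master: skip, so nothing is sent twice
            }
            sent.add(id + ":" + alert);
            return true;
        }

        void shutdown() {
            election.release(id);
        }
    }

    static List<String> demo() {
        InMemoryElection election = new InMemoryElection();
        List<String> sent = new ArrayList<>();
        AlertServer first = new AlertServer("alert-1", election, sent);
        AlertServer second = new AlertServer("alert-2", election, sent);

        first.handle("task failed");       // alert-1 becomes master and sends
        second.handle("task failed");      // alert-2 is not master, skips
        first.shutdown();                  // master goes down
        second.handle("workflow timeout"); // alert-2 takes over and sends
        return sent;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The sketch does not cover alerts that are in flight while the master fails. A
real implementation would also need to decide how the new master picks those
up without sending them again.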

[email protected] <[email protected]> wrote on Sat, Aug 22, 2020 at 19:30:

> Hi all,
>
> I would like to raise an issue: the Alert module is not currently designed
> for high availability. When multiple alert services are started, alerts are
> sent repeatedly, and when the alert service goes down, DS alerting stops
> working.
> So far, I've come up with two architectures that address the repeated
> alerts while also making the Alert module highly available.
>
> 1. The first is a master-slave relationship between the alert services,
> implemented through ZK. Only the master node is responsible for sending
> alerts. After the master node goes down, a new master is elected and
> continues to provide the alert service.
> 2. The second is a decentralized design in which all alert services work
> simultaneously and coordinate through exclusive locks, so that alert
> messages are not sent more than once.
>
> If anyone has a better plan, we can discuss it together.
>
> Thx
>
>
> [email protected]
>
-- 
DolphinScheduler(Incubator) PPMC
Jun Gao 高俊
[email protected]
