HA modifications for the Alert service are easy to implement, and I'll probably have them in my branch by next week.
[email protected]

From: JUN GAO
Date: 2020-08-23 15:04
To: dev
Subject: Re: About the high availability implementation of the Alert service

I agree with you. Alert server HA is not pressing.

Yichao Yang <[email protected]> wrote on Sun, Aug 23, 2020 at 10:44:

> Hi,
>
> I don't think HA for the Alert service is necessary at present. Users can
> add this extension themselves. We should focus on the current scheduling.
>
> Best,
> Yichao Yang
>
> ------------------ Original ------------------
> From: JUN GAO <[email protected]>
> Date: Sat, Aug 22, 2020 9:41 PM
> To: dev <[email protected]>
> Subject: Re: About the high availability implementation of the Alert service
>
> I think the first one is better.
>
> [email protected] <[email protected]> wrote on Sat, Aug 22, 2020 at 19:30:
>
> > Hi all,
> >
> > I would like to raise a suggestion. The Alert module is not currently
> > designed for high availability: when multiple alert services are
> > started, alerts are sent repeatedly, and when the alert service goes
> > down, DS alerting stops working.
> >
> > So far, I've come up with two architectures that avoid sending
> > duplicate alert messages while making the Alert module highly
> > available.
> >
> > 1. The first is a master-slave relationship between the alert services,
> >    coordinated through ZK. Only the master node is responsible for
> >    sending messages. After the master node goes down, a new master is
> >    elected and continues to provide the alert service.
> > 2. The second is a decentralised design in which all alert services
> >    work simultaneously and coordinate through exclusive locks, so that
> >    alert messages are not repeated.
> >
> > If we have a better plan, we can discuss it together.
> >
> > Thx
> >
> > [email protected]
>
> --
> DolphinScheduler(Incubator) PPMC
> Jun Gao 高俊
> [email protected]

--
DolphinScheduler(Incubator) PPMC
Jun Gao 高俊
[email protected]
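
For readers following the thread, the first option (a single master elected among the alert services) can be sketched roughly as below. This is only an illustration under stated assumptions, not the DolphinScheduler implementation: `InMemoryCoordinator` is a hypothetical in-process stand-in for ZK, and `AlertServer`, `handle`, and `crash` are made-up names. The rule that the longest-lived member becomes master is also an assumption chosen for simplicity.

```python
from itertools import count


class InMemoryCoordinator:
    """Hypothetical stand-in for ZK: tracks live members in join order."""

    def __init__(self):
        self._seq = count()
        self._members = {}  # server name -> join sequence number

    def join(self, name):
        self._members[name] = next(self._seq)

    def leave(self, name):
        # Models a member's registration vanishing when the server goes down.
        self._members.pop(name, None)

    def leader(self):
        # Assumed rule: the live member with the lowest sequence is master.
        if not self._members:
            return None
        return min(self._members, key=self._members.get)


class AlertServer:
    """Hypothetical alert service instance that only sends while master."""

    def __init__(self, name, coordinator, outbox):
        self.name = name
        self.coordinator = coordinator
        self.outbox = outbox  # shared list recording (sender, alert) pairs
        coordinator.join(name)

    def handle(self, alert):
        # Standby servers see the alert but stay silent.
        if self.coordinator.leader() == self.name:
            self.outbox.append((self.name, alert))

    def crash(self):
        self.coordinator.leave(self.name)


outbox = []
zk = InMemoryCoordinator()
servers = [AlertServer(f"alert-{i}", zk, outbox) for i in range(3)]

for s in servers:
    s.handle("task failed")       # every live server sees the alert

servers[0].crash()                # the master goes down
for s in servers[1:]:
    s.handle("workflow timeout")  # the next server takes over as master

print(outbox)
# [('alert-0', 'task failed'), ('alert-1', 'workflow timeout')]
```

Each alert is sent exactly once even though three servers are running, and sending continues after the first master is lost. A real deployment would replace the in-memory coordinator with an actual ZK-based election.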
