Thanks for the response, Zhanghao. PersistenceServices sounds good to me.

Best,
Yangze Guo

On Wed, Dec 27, 2023 at 11:30 AM Zhanghao Chen <zhanghao.c...@outlook.com> wrote:
>
> Thanks for driving this effort, Yangze! The proposal overall LGTM. Apart from
> the throughput enhancement in the OLAP scenario, the separation of leader
> election/discovery services and the metadata persistence services will also
> make the HA impl clearer and easier to maintain. Just a minor comment on
> naming: would it be better to rename PersistentServices to PersistenceServices,
> as usually we put a noun before Services?
>
> Best,
> Zhanghao Chen
> ________________________________
> From: Yangze Guo <karma...@gmail.com>
> Sent: Tuesday, December 19, 2023 17:33
> To: dev <dev@flink.apache.org>
> Subject: [DISCUSS] FLIP-403: High Availability Services for OLAP Scenarios
>
> Hi there,
>
> We would like to start a discussion thread on "FLIP-403: High
> Availability Services for OLAP Scenarios"[1].
>
> Currently, Flink's high availability service consists of two
> mechanisms: leader election/retrieval services for JobManager and
> persistent services for job metadata. However, these mechanisms are
> set up in an "all or nothing" manner. In OLAP scenarios, we typically
> only require leader election/retrieval services for JobManager
> components since jobs usually do not have a restart strategy.
> Additionally, the persistence of job states can negatively impact the
> cluster's throughput, especially for short query jobs.
>
> To address these issues, this FLIP proposes splitting the
> HighAvailabilityServices into LeaderServices and PersistentServices,
> and enabling users to independently configure the high availability
> strategies specifically related to jobs.
>
> Please find more details in the FLIP wiki document [1]. Looking
> forward to your feedback.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-403+High+Availability+Services+for+OLAP+Scenarios
>
> Best,
> Yangze Guo