GitHub user shrihari7396 created a discussion: Design Discussion: Embedding AlertServer into dolphinscheduler-api Module
Hi all,

I’ve been studying the architectural requirements for embedding the AlertServer into the API Server (related to #8975). After reviewing the initialization flows in `dolphinscheduler-alert-server` and `dolphinscheduler-api`, I’d like to discuss a potential design direction and gather feedback.

My goal is to transition the alerting mechanism from a standalone process to an embedded background service while maintaining DolphinScheduler's high-availability and reliability standards.

---

## Proposed Technical Direction

### 1. Logic Decoupling

Refactor core logic (e.g., `AlertBootstrapService`, `AlertSender`) into a reusable library module that can be natively consumed by `dolphinscheduler-api`.

### 2. Lifecycle Integration

Use Spring-managed components and lifecycle hooks (`@PostConstruct`) to initialize the alerting engine within the API Server process only after it successfully joins the Registry.

### 3. Leader Election & HA

To prevent duplicate alert processing in horizontally scaled API deployments, leverage the existing `RegistryClient` (ZooKeeper/Etcd abstraction) to implement a leader-follower model for the embedded alerting loop.

### 4. Fault Tolerance & Atomicity

- Implement an atomic claim mechanism using SQL-based optimistic locking (updating rows to a `SENDING` state with an `instance_id` before processing).
- Introduce a "Janitor" thread on the leader instance to identify and re-queue alerts stuck in a `SENDING` state due to unexpected API server crashes.

### 5. Performance Isolation

Isolate alerting execution within a dedicated thread pool to ensure that long-running notification tasks do not impact the responsiveness of the REST API or UI.

### 6. SPI & Cleanup

Ensure the API Server configuration can dynamically load Alert SPI plugins, while decommissioning standalone startup scripts, assembly descriptors, and Docker/K8s definitions for the separate Alert component.
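
To make the atomic claim and "Janitor" ideas from point 4 concrete, here is a minimal in-memory sketch. It is not DolphinScheduler code: the class `AlertClaimSketch`, its methods, the `PENDING`/`SENT` states, and the `claimed_at` timestamp are all hypothetical names chosen for illustration. The map's compare-and-set stands in for a conditional SQL `UPDATE` whose affected-row count would tell an instance whether it won the claim.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the alert table. In a real deployment the
// compare-and-set in tryClaim would be one SQL statement, for example:
//   UPDATE alert SET status = 'SENDING', instance_id = ?, claimed_at = ?
//   WHERE id = ? AND status = 'PENDING'
class AlertClaimSketch {

    enum Status { PENDING, SENDING, SENT }

    // Immutable row snapshot. It does not override equals, so
    // ConcurrentHashMap.replace(key, old, new) compares by identity and
    // behaves like a compare-and-set on the row.
    static final class Row {
        final Status status;
        final String instanceId;
        final long claimedAtMillis;

        Row(Status status, String instanceId, long claimedAtMillis) {
            this.status = status;
            this.instanceId = instanceId;
            this.claimedAtMillis = claimedAtMillis;
        }
    }

    private final Map<Integer, Row> table = new ConcurrentHashMap<>();

    void enqueue(int alertId) {
        table.put(alertId, new Row(Status.PENDING, null, 0L));
    }

    // Atomic claim: succeeds for exactly one caller per PENDING row.
    boolean tryClaim(int alertId, String instanceId, long nowMillis) {
        Row current = table.get(alertId);
        if (current == null || current.status != Status.PENDING) {
            return false;
        }
        return table.replace(alertId, current, new Row(Status.SENDING, instanceId, nowMillis));
    }

    // Only the instance that holds the claim may mark the alert as sent.
    boolean markSent(int alertId, String instanceId) {
        Row current = table.get(alertId);
        if (current == null || current.status != Status.SENDING
                || !instanceId.equals(current.instanceId)) {
            return false;
        }
        return table.replace(alertId, current,
                new Row(Status.SENT, instanceId, current.claimedAtMillis));
    }

    // Janitor pass: re-queue rows that have been SENDING for at least timeoutMillis.
    List<Integer> requeueStale(long nowMillis, long timeoutMillis) {
        List<Integer> requeued = new ArrayList<>();
        for (Map.Entry<Integer, Row> entry : table.entrySet()) {
            Row row = entry.getValue();
            boolean stale = row.status == Status.SENDING
                    && nowMillis - row.claimedAtMillis >= timeoutMillis;
            if (stale && table.replace(entry.getKey(), row, new Row(Status.PENDING, null, 0L))) {
                requeued.add(entry.getKey());
            }
        }
        return requeued;
    }

    Status statusOf(int alertId) {
        Row row = table.get(alertId);
        return row == null ? null : row.status;
    }

    public static void main(String[] args) {
        AlertClaimSketch store = new AlertClaimSketch();
        store.enqueue(1);
        System.out.println("api-1 claims: " + store.tryClaim(1, "api-1", 1_000L));
        System.out.println("api-2 claims: " + store.tryClaim(1, "api-2", 1_001L));
        // Simulate api-1 crashing: the janitor re-queues the alert after the timeout.
        System.out.println("re-queued: " + store.requeueStale(70_000L, 60_000L));
        System.out.println("api-2 claims: " + store.tryClaim(1, "api-2", 70_001L));
    }
}
```

One trade-off worth discussing: re-queuing after a timeout gives at-least-once delivery, so an alert whose sender was merely slow rather than dead could be delivered twice. The timeout would need to be comfortably longer than the slowest expected plugin call.
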
---

I would appreciate any feedback or concerns regarding this approach, particularly on the distributed coordination strategy, before proceeding further with implementation planning.

Best regards,
**Shrihari Rajendrakumar Kulkarni**

GitHub link: https://github.com/apache/dolphinscheduler/discussions/18005
