zqr10159 opened a new issue, #3892:
URL: https://github.com/apache/hertzbeat/issues/3892

   ### Feature Request
   
   Native Lightweight Analysis Engine
   
   ### Is your feature request related to a problem? Please describe
   
   1. **Static Thresholds:** hertzbeat-alerter relies heavily on static 
thresholds (e.g., CPU > 80%). This leads to alert fatigue in cyclic business 
scenarios (e.g., daily traffic peaks).  
   2. **Existing AI Module:** The current hertzbeat-analysis (or hertzbeat-ai) 
module focuses on **LLM integration** (ChatBot/Agent). It relies on external 
APIs and is not suitable for high-frequency, low-latency, and cost-effective 
numerical anomaly detection.  
   3. **Lack of Native Intelligence:** HertzBeat needs a built-in mathematical 
engine to understand "Trend" and "Seasonality" without requiring users to 
deploy heavy external Python/AI environments.
   
   ### Describe the solution you'd like
   
   I propose introducing a new module **hertzbeat-analysis** (or extending the 
existing AI module with a "Native" engine).
   
   This module serves as a **"System 1" (Fast & Cheap)** intelligence layer 
using pure Java mathematics (commons-math3), focusing on **Time-Series 
Analysis** and **Dynamic Baseline Prediction**.
   
   #### **1. Architecture Design**
   
   * **Location:** New Maven module hertzbeat-analysis.  
   * **Role:**  
     * **Consumer:** Subscribes to the metrics data stream (side-by-side with 
alerter or warehouse).  
     * **Trainer:** Periodically queries historical data from 
hertzbeat-warehouse to update model parameters (coefficients).  
     * **Provider:** Provides an API for hertzbeat-alerter to check if a value 
is "Anomalous" based on the model.
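   
   As a rough illustration of how the Trainer and Provider roles might be separated, here is a minimal sketch. Every type name below (AnomalyVerdict, ModelTrainer, AnomalyDetector, MeanBaselineEngine) is hypothetical and not an existing HertzBeat API, and a constant mean baseline with a 3-sigma band stands in for the real model.

```java
// Hypothetical sketch: none of these types exist in HertzBeat today.

/** Result handed back to the alerter. */
final class AnomalyVerdict {
    final boolean anomalous;
    final double expected;
    final double tolerance;

    AnomalyVerdict(boolean anomalous, double expected, double tolerance) {
        this.anomalous = anomalous;
        this.expected = expected;
        this.tolerance = tolerance;
    }
}

/** Trainer role: refreshes model parameters from historical samples. */
interface ModelTrainer {
    void retrain(String metricKey, double[] history);
}

/** Provider role: the call hertzbeat-alerter would make for each incoming value. */
interface AnomalyDetector {
    AnomalyVerdict check(String metricKey, double actual);
}

/** Stand-in engine: a constant mean baseline with a 3-sigma band replaces the real model. */
final class MeanBaselineEngine implements ModelTrainer, AnomalyDetector {
    // metricKey -> {mean, standard deviation}
    private final java.util.Map<String, double[]> params =
            new java.util.concurrent.ConcurrentHashMap<>();

    @Override
    public void retrain(String metricKey, double[] history) {
        double mean = 0;
        for (double v : history) mean += v;
        mean /= history.length;
        double variance = 0;
        for (double v : history) variance += (v - mean) * (v - mean);
        params.put(metricKey, new double[] {mean, Math.sqrt(variance / history.length)});
    }

    @Override
    public AnomalyVerdict check(String metricKey, double actual) {
        double[] p = params.get(metricKey);
        if (p == null) {
            // No trained model yet: never raise an anomaly.
            return new AnomalyVerdict(false, Double.NaN, Double.NaN);
        }
        double tolerance = 3.0 * p[1];
        return new AnomalyVerdict(Math.abs(actual - p[0]) > tolerance, p[0], tolerance);
    }
}
```

   Keeping training and checking behind separate interfaces would let the alerter depend only on the cheap check call while retraining runs asynchronously.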
   
   #### **2. Core Algorithms (Java Native)**
   
   We will implement "TinyProphet", a lightweight decomposition algorithm:
   
   * **Trend:** Linear regression fitted by Ordinary Least Squares (OLS), or 
Ridge Regression.  
   * **Seasonality:** Fourier series features, also fitted via OLS.  
   * **Stack:** org.apache.commons:commons-math3. **No Python/JNI required.**
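   
   To make the decomposition concrete, the sketch below fits a linear trend plus Fourier seasonality in a single OLS solve. It is an illustration under assumptions, not the proposed implementation: the class TinyProphetSketch, the period parameter, and the number of harmonics are all hypothetical, and the normal equations are solved with plain JDK code so the example runs standalone. In the actual module the solve would presumably be delegated to commons-math3 (for example its OLSMultipleLinearRegression class).

```java
/** Hypothetical sketch of a trend + Fourier seasonality fit; pure JDK, no commons-math3. */
final class TinyProphetSketch {

    /** Feature row: [1, t, sin(2*pi*k*t/P), cos(2*pi*k*t/P)] for k = 1..harmonics. */
    static double[] features(double t, double period, int harmonics) {
        double[] row = new double[2 + 2 * harmonics];
        row[0] = 1.0; // intercept
        row[1] = t;   // linear trend
        for (int k = 1; k <= harmonics; k++) {
            double angle = 2.0 * Math.PI * k * t / period;
            row[2 * k] = Math.sin(angle);
            row[2 * k + 1] = Math.cos(angle);
        }
        return row;
    }

    /** OLS via the normal equations (X^T X) beta = X^T y, solved by Gauss-Jordan elimination. */
    static double[] fit(double[] t, double[] y, double period, int harmonics) {
        int p = 2 + 2 * harmonics;
        double[][] a = new double[p][p + 1]; // augmented matrix [X^T X | X^T y]
        for (int i = 0; i < t.length; i++) {
            double[] x = features(t[i], period, harmonics);
            for (int r = 0; r < p; r++) {
                for (int c = 0; c < p; c++) a[r][c] += x[r] * x[c];
                a[r][p] += x[r] * y[i];
            }
        }
        for (int col = 0; col < p; col++) {
            // Partial pivoting: move the largest remaining entry onto the diagonal.
            int pivot = col;
            for (int r = col + 1; r < p; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            // Eliminate this column from every other row.
            for (int r = 0; r < p; r++) {
                if (r == col) continue;
                double factor = a[r][col] / a[col][col];
                for (int c = col; c <= p; c++) a[r][c] -= factor * a[col][c];
            }
        }
        double[] beta = new double[p];
        for (int r = 0; r < p; r++) beta[r] = a[r][p] / a[r][r];
        return beta;
    }

    /** Expected value at time t: the dot product of beta with the feature row. */
    static double predict(double[] beta, double t, double period, int harmonics) {
        double[] x = features(t, period, harmonics);
        double sum = 0;
        for (int i = 0; i < beta.length; i++) sum += beta[i] * x[i];
        return sum;
    }
}
```

   With 1-minute buckets and a daily cycle, period would be 1440; the coefficient array returned by fit is the kind of small payload that could be stored as JSON.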
   
   #### **3. Workflow**
   
   1. **Data Preprocessing (TimeSeriesPreprocessor):**  
      * Handle missing data (NaN) from MetricsData with linear interpolation.  
      * Align timestamps to standard windows (e.g., 1-minute buckets).  
   2. **Model Training (Async Job):**  
      * Fetch the last 1 to 7 days of data from hertzbeat-warehouse.  
      * Calculate regression coefficients ($\beta$) for Trend and Seasonality.  
      * Store the lightweight coefficients (JSON) in the database 
(hzb_analysis_model).  
   3. **Real-time Inference:**  
      * Calculate expected_value using the stored coefficients.  
      * Compare |actual - expected| against a dynamic tolerance (e.g., 3-Sigma).
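   
   The preprocessing and inference steps above could look roughly like the following. This is a sketch under assumptions: WorkflowSketch and its method names are hypothetical, gaps at the edges of a series are filled with the nearest valid value (the proposal does not say how to treat them), and sigma is taken as the root-mean-square of the training residuals.

```java
/** Hypothetical helpers for preprocessing and the 3-sigma check; pure JDK. */
final class WorkflowSketch {

    /** Fills NaN gaps by linear interpolation; leading/trailing gaps copy the nearest valid value. */
    static double[] interpolate(double[] values) {
        double[] out = values.clone();
        int prev = -1; // index of the last valid sample seen so far
        for (int i = 0; i < out.length; i++) {
            if (Double.isNaN(out[i])) continue;
            if (prev < 0) {
                for (int j = 0; j < i; j++) out[j] = out[i]; // leading gap
            } else {
                for (int j = prev + 1; j < i; j++) {
                    double w = (double) (j - prev) / (i - prev);
                    out[j] = out[prev] + w * (out[i] - out[prev]);
                }
            }
            prev = i;
        }
        if (prev >= 0) {
            for (int j = prev + 1; j < out.length; j++) out[j] = out[prev]; // trailing gap
        }
        return out;
    }

    /** Aligns a timestamp to the start of its bucket (e.g., 60_000 ms for 1-minute buckets). */
    static long alignToBucket(long timestampMillis, long bucketMillis) {
        return Math.floorDiv(timestampMillis, bucketMillis) * bucketMillis;
    }

    /** Root-mean-square residual over the training window, used here as sigma. */
    static double residualSigma(double[] actual, double[] expected) {
        double sumSq = 0;
        for (int i = 0; i < actual.length; i++) {
            double d = actual[i] - expected[i];
            sumSq += d * d;
        }
        return Math.sqrt(sumSq / actual.length);
    }

    /** Flags a value when |actual - expected| exceeds k * sigma (k = 3 for 3-sigma). */
    static boolean isAnomalous(double actual, double expected, double sigma, double k) {
        return Math.abs(actual - expected) > k * sigma;
    }
}
```

   Storing sigma alongside the coefficients would keep real-time inference to one dot product and one comparison per sample.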
   
   ### Describe alternatives you've considered
   
   * **Prometheus predict_linear:** Stateless and fits only a straight line, 
so it cannot handle complex seasonality.  
   * **External Python Agents:** Breaks the "Out-of-the-box" experience.  
   * **Using LLM for everything:** Too expensive and slow for real-time metric 
stream processing.
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: 
[email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

