featzhang opened a new pull request, #27772:
URL: https://github.com/apache/flink/pull/27772

   ## Purpose
   
   This PR adds a Diagnosis Advisor to the Flink Web UI that provides automated diagnostic suggestions based on job metrics analysis. Users often have difficulty diagnosing performance issues such as high CPU usage, memory leaks, and backpressure. The current Flink UI exposes raw metrics but offers no diagnostic suggestions for interpreting them.
   
   The Diagnosis Advisor analyzes multiple metric categories and provides actionable recommendations for common performance scenarios.
   
   ## Change log
   
   **Backend Changes:**
   - Added `DiagnosisHandler.java` - REST API handler for diagnostic analysis
   - Added `DiagnosisResponseBody.java` - Response body with diagnostic suggestions
   - Added `DiagnosisHeaders.java` - REST endpoint definition
   
   **Frontend Changes:**
   - Added `DiagnosisService.ts` - Service to fetch diagnostic suggestions
   - Added `diagnosis.ts` - TypeScript interface definitions
   - Added `DiagnosisComponent` - Angular component with HTML template and Less styles
   - Added `DiagnosisDemoComponent` - Demo page showcasing scenarios
   
   **New REST API:**
   - `GET /jobs/:jobid/diagnosis` - Returns automated diagnostic suggestions
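
To make the endpoint above concrete, here is a minimal TypeScript sketch of how a client might call it. The PR description does not list the fields of `DiagnosisResponseBody`, so the `DiagnosisSuggestion` and `DiagnosisResponse` shapes, the `diagnosisUrl` and `fetchDiagnosis` helpers, and the `FetchLike` type are all hypothetical names chosen for illustration; only the path `/jobs/:jobid/diagnosis` comes from this PR.

```typescript
// Hypothetical response shape: field names are assumptions, not taken from the PR.
interface DiagnosisSuggestion {
  severity: 'INFO' | 'WARNING' | 'CRITICAL';
  title: string;
  recommendation: string;
}

interface DiagnosisResponse {
  suggestions: DiagnosisSuggestion[];
}

// Minimal structural type for an HTTP client, so the sketch does not depend
// on any particular runtime's global `fetch` typings.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Builds the endpoint URL from the documented path `/jobs/:jobid/diagnosis`.
function diagnosisUrl(baseUrl: string, jobId: string): string {
  const trimmed = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return `${trimmed}/jobs/${encodeURIComponent(jobId)}/diagnosis`;
}

// Requests suggestions on demand; the HTTP client is injected so that the
// call can be stubbed in tests.
async function fetchDiagnosis(
  baseUrl: string,
  jobId: string,
  fetchFn: FetchLike
): Promise<DiagnosisResponse> {
  const response = await fetchFn(diagnosisUrl(baseUrl, jobId));
  if (!response.ok) {
    throw new Error(`Diagnosis request failed with HTTP ${response.status}`);
  }
  // The cast reflects the assumed shape above; it is not validated at runtime.
  return (await response.json()) as DiagnosisResponse;
}
```

In a browser or a recent Node.js runtime, the global `fetch` could be passed as `fetchFn`, for example `fetchDiagnosis('http://localhost:8081', jobId, fetch)`, assuming the REST endpoint is served on Flink's default port.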
   
   **Diagnostic Rules Implemented:**
   1. High CPU + High Heap Memory → Suggests GC-related issues
   2. High CPU + Normal Heap Memory → Indicates heavy computation
   3. Low CPU + High Backpressure → Points to I/O bottlenecks
   4. High GC Count → Flags performance concerns
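
The four rules above can be sketched as a small rule table. This is an illustrative TypeScript sketch, not the logic in `DiagnosisHandler.java`: the `MetricSnapshot` fields, the rule identifiers, and every threshold value are placeholders assumed for the example, since the PR description does not state what counts as "high" or "low".

```typescript
// Hypothetical metric snapshot; field names and units are assumptions.
interface MetricSnapshot {
  cpuLoad: number;           // fraction of CPU in use, 0..1
  heapUsedRatio: number;     // used heap / max heap, 0..1
  backPressureRatio: number; // fraction of time backpressured, 0..1
  gcCountPerMinute: number;  // garbage collections per minute
}

// Placeholder thresholds for illustration only; not values from the PR.
const THRESHOLDS = {
  highCpu: 0.8,
  lowCpu: 0.3,
  highHeap: 0.85,
  highBackPressure: 0.5,
  highGcPerMinute: 60,
};

interface Rule {
  id: string;
  suggestion: string;
  matches(m: MetricSnapshot): boolean;
}

// One entry per rule in the list above, in the same order.
const RULES: Rule[] = [
  {
    id: 'HIGH_CPU_HIGH_HEAP',
    suggestion: 'Possible GC-related issue',
    matches: (m) =>
      m.cpuLoad >= THRESHOLDS.highCpu && m.heapUsedRatio >= THRESHOLDS.highHeap,
  },
  {
    id: 'HIGH_CPU_NORMAL_HEAP',
    suggestion: 'Heavy computation',
    matches: (m) =>
      m.cpuLoad >= THRESHOLDS.highCpu && m.heapUsedRatio < THRESHOLDS.highHeap,
  },
  {
    id: 'LOW_CPU_HIGH_BACKPRESSURE',
    suggestion: 'Possible I/O bottleneck',
    matches: (m) =>
      m.cpuLoad <= THRESHOLDS.lowCpu &&
      m.backPressureRatio >= THRESHOLDS.highBackPressure,
  },
  {
    id: 'HIGH_GC_COUNT',
    suggestion: 'Frequent garbage collection may hurt performance',
    matches: (m) => m.gcCountPerMinute >= THRESHOLDS.highGcPerMinute,
  },
];

// Returns the ids of all rules that match, in declaration order.
function diagnose(m: MetricSnapshot): string[] {
  return RULES.filter((rule) => rule.matches(m)).map((rule) => rule.id);
}
```

Keeping each rule as data with its own predicate means a new rule is one more array entry, which is one way the extensibility mentioned under Compatibility could be realized. Note that rules are not mutually exclusive in this sketch: a snapshot with low CPU, high backpressure, and a high GC count would match both the third and fourth rules.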
   
   ## Verifying
   
   1. Build the Flink project: `mvn clean install -DskipTests`
   2. Build the web dashboard: `cd flink-runtime-web/web-dashboard && npm install && npm run build`
   3. Access the demo page at `/diagnosis-demo` to see the Diagnosis Advisor
   4. Test each diagnostic scenario to verify recommendations
   5. For integration testing, navigate to `/jobs/:jobid/diagnosis`
   
   ## Impact
   
   **Scope:**
   - New feature addition to the Flink Web UI
   - Does not modify existing API behavior
   
   **Performance:**
   - Expected to add minimal overhead, since the analysis reuses metrics that are already collected
   - Diagnostic analysis runs on-demand
   
   **Compatibility:**
   - Fully backward compatible
   - Diagnostic rules are extensible
   
   ## Documentation
   
   Documentation updates needed for:
   - Web UI documentation
   - REST API reference
   - Troubleshooting guides

