DanielCarter-stack commented on issue #10375:
URL: https://github.com/apache/seatunnel/issues/10375#issuecomment-3795839621

   Thanks for the detailed evidence. I've confirmed this is a **real bug** in 
the metrics aggregation logic.
   
   **Root Cause:**
   In `JobClient.getJobMetricsSummary()` 
(seatunnel-engine/seatunnel-engine-client/src/main/java/org/apache/seatunnel/engine/client/job/JobClient.java:165-170),
 the loop that sums the sink writer values is bounded by `sourceReaders.size()` 
instead of `sinkWriters.size()`:
   
   ```java
   for (int i = 0; i < sourceReaders.size(); i++) {
       JsonNode sinkWriter = sinkWriters.get(i);  // BUG: assumes same size
       sinkWriteCount += sinkWriter.get("value").asLong();
   }
   ```
   
   In multi-sink scenarios with parallelism > 1, `sinkWriters` typically has 
**more elements** than `sourceReaders`, because each sink can have multiple 
writer tasks. Your JSON shows 2 sourceReaders and 4 sinkWriters, so the loop 
runs only twice and the last two sink writers are never added to the total. If 
the skipped writers hold the actual data (e.g., your taskGroupId=2 writers with 
232003), the displayed count is incorrectly 0.
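   
   The undercount can be reproduced in isolation. The sketch below is a minimal 
stand-in, not the SeaTunnel code: it uses plain `long[]` arrays instead of 
`JsonNode`, and the class name, method name, and values are all hypothetical.
   
   ```java
   class BoundMismatchDemo {
       // Sums sink values but stops at the source count, mirroring the reported loop.
       static long sumWithSourceBound(long[] sourceValues, long[] sinkValues) {
           long total = 0;
           for (int i = 0; i < sourceValues.length; i++) {
               total += sinkValues[i];
           }
           return total;
       }

       public static void main(String[] args) {
           long[] sources = {5, 5};      // 2 readers (made-up values)
           long[] sinks = {0, 0, 7, 9};  // 4 writers; only the last two hold data
           System.out.println(sumWithSourceBound(sources, sinks)); // prints 0
       }
   }
   ```
   
   Only indices 0 and 1 of the sink array are visited, so the 7 and 9 never 
reach the total.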
   
   The existing test `testMultipleSinks` (JobClientTest.java:117-146) uses 
equal-length arrays (both size 3), so it never exercises the size mismatch and 
doesn't catch this.
   
   **Fix Approach:**
   - Change SinkWriteCount/SinkCommittedCount loops to iterate over 
`sinkWriters.size()` independently
   - Add a test case for asymmetric source/sink arrays (e.g., 2 sources, 4 sinks)
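   
   As a rough illustration of both points, the sketch below sums each list over 
its own length and checks an asymmetric 2-source / 4-sink case. It is an 
assumption-laden stand-in rather than the actual patch: `SinkCountSketch` and 
`sumValues` are hypothetical names, `List<Map<String, Long>>` replaces the 
Jackson `JsonNode` arrays, and the values are made up.
   
   ```java
   import java.util.List;
   import java.util.Map;

   class SinkCountSketch {
       // Sums the "value" field of every entry; the bound is the list's own size.
       static long sumValues(List<Map<String, Long>> entries) {
           long total = 0;
           for (Map<String, Long> entry : entries) {
               total += entry.get("value");
           }
           return total;
       }

       public static void main(String[] args) {
           // Asymmetric input: 2 source readers, 4 sink writers (hypothetical values).
           List<Map<String, Long>> sourceReaders =
                   List.of(Map.of("value", 100L), Map.of("value", 100L));
           List<Map<String, Long>> sinkWriters = List.of(
                   Map.of("value", 0L), Map.of("value", 0L),
                   Map.of("value", 100L), Map.of("value", 100L));

           long sourceReadCount = sumValues(sourceReaders);
           long sinkWriteCount = sumValues(sinkWriters);

           // All four writers must contribute, not just the first two.
           if (sinkWriteCount != 200L) {
               throw new AssertionError("expected 200 but got " + sinkWriteCount);
           }
           System.out.println(sourceReadCount + " read, " + sinkWriteCount + " written");
           // prints "200 read, 200 written"
       }
   }
   ```
   
   In a real test the same asymmetric shape would be fed to 
`getJobMetricsSummary()` through whatever fixtures `JobClientTest` already uses.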
   
   Would you like to proceed with a PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

