This is an automated email from the ASF dual-hosted git repository.
zuston pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-uniffle.git
The following commit(s) were added to refs/heads/master by this push:
new 66c752fe Fix incorrect metrics of event_queue_size and
total_write_handler (#411)
66c752fe is described below
commit 66c752fe3ecd6d4979d3539fe22265d4239156a9
Author: Junfan Zhang <[email protected]>
AuthorDate: Tue Dec 13 17:45:54 2022 +0800
Fix incorrect metrics of event_queue_size and total_write_handler (#411)
### What changes were proposed in this pull request?
Fix incorrect metrics of event_queue_size and total_write_handler
### Why are the changes needed?
In the current codebase, both metrics are reported incorrectly.
1. The total_write_handler metric is not decremented when an exception
is thrown while flushing to file.
2. The event_queue_size metric does not show the correct wait queue size.
In the original logic the gauge is set to flushQueue.size(), so if all events
have already been taken from flushQueue and are waiting in the flush thread
pool, flushQueue.size() is 0 and the metric is also 0, even though events are
still pending.
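The two fixes can be illustrated with a minimal, single-threaded sketch. It is not the Uniffle implementation: the class `FlushGaugeSketch`, the `AtomicInteger` stand-ins for the `ShuffleServerMetrics` gauges, the queue capacity of 2, and the `pending` list that simulates events parked in the flush thread pool are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and sizes are illustrative, not from the project.
class FlushGaugeSketch {
  // Stand-ins for gaugeEventQueueSize and gaugeWriteHandler.
  final AtomicInteger eventQueueSize = new AtomicInteger();
  final AtomicInteger writeHandler = new AtomicInteger();
  // Small bounded queue so that a rejected offer is easy to trigger.
  final BlockingQueue<String> flushQueue = new LinkedBlockingQueue<>(2);

  // Count an event only if the queue actually accepted it.
  boolean addToFlushQueue(String event) {
    if (!flushQueue.offer(event)) {
      return false; // queue full: event discarded, gauge untouched
    }
    eventQueueSize.incrementAndGet();
    return true;
  }

  // Decrement both gauges in finally so a failed flush cannot leak a count.
  void flush(String event) {
    try {
      writeHandler.incrementAndGet();
      if (event.startsWith("bad")) {
        throw new IllegalStateException("simulated flush failure");
      }
    } catch (Exception e) {
      // The real code logs the exception here.
    } finally {
      writeHandler.decrementAndGet();
      eventQueueSize.decrementAndGet();
    }
  }

  // Returns {gauge while pending, flushQueue.size() while pending,
  //          gauge after flushing, write handlers after flushing}.
  static int[] runDemo() {
    FlushGaugeSketch s = new FlushGaugeSketch();
    s.addToFlushQueue("ok-1");
    s.addToFlushQueue("bad-2");
    s.addToFlushQueue("ok-3"); // rejected: capacity is 2

    // Simulate events taken from the queue but not yet flushed.
    List<String> pending = new ArrayList<>();
    s.flushQueue.drainTo(pending);
    int gaugeWhilePending = s.eventQueueSize.get();
    int queueSizeWhilePending = s.flushQueue.size();

    for (String event : pending) {
      s.flush(event);
    }
    return new int[] {
      gaugeWhilePending, queueSizeWhilePending,
      s.eventQueueSize.get(), s.writeHandler.get()
    };
  }
}
```

While the two accepted events are pending, the counter-style gauge still reads 2 although `flushQueue.size()` is already 0, which is the discrepancy described in item 2. After both flushes, including the one that throws, both gauges return to 0, which is the behavior item 1 asks for.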
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
No tests are needed.
---
.../main/java/org/apache/uniffle/server/ShuffleFlushManager.java | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git
a/server/src/main/java/org/apache/uniffle/server/ShuffleFlushManager.java
b/server/src/main/java/org/apache/uniffle/server/ShuffleFlushManager.java
index edddf1a6..4f1cc426 100644
--- a/server/src/main/java/org/apache/uniffle/server/ShuffleFlushManager.java
+++ b/server/src/main/java/org/apache/uniffle/server/ShuffleFlushManager.java
@@ -100,12 +100,13 @@ public class ShuffleFlushManager {
ShuffleDataFlushEvent event = flushQueue.take();
threadPoolExecutor.execute(() -> {
try {
- ShuffleServerMetrics.gaugeEventQueueSize.set(flushQueue.size());
ShuffleServerMetrics.gaugeWriteHandler.inc();
flushToFile(event);
- ShuffleServerMetrics.gaugeWriteHandler.dec();
} catch (Exception e) {
LOG.error("Exception happened when flush data for " + event, e);
+ } finally {
+ ShuffleServerMetrics.gaugeWriteHandler.dec();
+ ShuffleServerMetrics.gaugeEventQueueSize.dec();
}
});
} catch (Exception e) {
@@ -142,6 +143,8 @@ public class ShuffleFlushManager {
public void addToFlushQueue(ShuffleDataFlushEvent event) {
if (!flushQueue.offer(event)) {
LOG.warn("Flush queue is full, discard event: " + event);
+ } else {
+ ShuffleServerMetrics.gaugeEventQueueSize.inc();
}
}