[
https://issues.apache.org/jira/browse/GOBBLIN-1598?focusedWorklogId=713997&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-713997
]
ASF GitHub Bot logged work on GOBBLIN-1598:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 24/Jan/22 20:12
Start Date: 24/Jan/22 20:12
Worklog Time Spent: 10m
Work Description: phet commented on a change in pull request #3454:
URL: https://github.com/apache/gobblin/pull/3454#discussion_r791097576
##########
File path:
gobblin-metrics-libs/gobblin-metrics-base/src/test/java/org/apache/gobblin/metrics/metric/filter/MetricNameRegexFilterTest.java
##########
@@ -42,5 +42,11 @@ public void matchesTest() {
Assert.assertTrue(metricNameRegexFilter2.matches("test1",
mock(Metric.class)));
Assert.assertFalse(metricNameRegexFilter2.matches("test2",
mock(Metric.class)));
Assert.assertFalse(metricNameRegexFilter2.matches("test3",
mock(Metric.class)));
+ MetricNameRegexFilter metricNameRegexFilter3 = new
MetricNameRegexFilter("GobblinService\\..*\\.RunningStatus");
Review comment:
I don't want to go for overkill, yet I'd hope to validate the connection
between this regex hard-coded here and what's constructed in the impl:
```
ServiceMetricNames.GOBBLIN_SERVICE_PREFIX + "\\..*\\." +
ServiceMetricNames.RUNNING_STATUS
```
maybe add a static method there to constructs the regex and then have the
test call it to set `metricNameRegexFilter3`?
##########
File path:
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java
##########
@@ -399,6 +400,7 @@ public synchronized void setActive(boolean active) {
} else { //Mark the DagManager inactive.
log.info("Inactivating the DagManager. Shutting down all DagManager
threads");
this.scheduledExecutorPool.shutdown();
+ RootMetricContext.get().removeMatching(new
MetricNameRegexFilter(ServiceMetricNames.GOBBLIN_SERVICE_PREFIX + "\\..*\\." +
ServiceMetricNames.RUNNING_STATUS));
Review comment:
let's add a comment here about background info (since a bit complex how
several pieces conspire to create the issue)... something like:
the DMThread's metrics mappings follow the lifecycle of the DMThread itself
and so are lost by DM deactivation-reactivation. the `RootMetricContext` OTOH
is a (persistent) singleton. to avoid XYZ exception by the RMC preventing
(re-)add of a metric already known, remove all metrics that a new DMThread
thread would attempt to add (in `DagManagerThread::initialize`) whenever
running post-re-enablement
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 713997)
Time Spent: 40m (was: 0.5h)
> Fix metrics already exist issue in dag manager
> ----------------------------------------------
>
> Key: GOBBLIN-1598
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1598
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Zihan Li
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>
> {code:java}
> 2021/06/12 07:11:32.067 ERROR [DagManager] [pool-19-thread-3]
> [gobblin-service-war] [] Exception encountered in
> org.apache.gobblin.service.modules.orchestration.DagManager$DagManagerThread
> java.lang.IllegalArgumentException: A metric named
> GobblinService.testGroup.testFlow.RunningStatus already exists
> at
> org.apache.gobblin.metrics.InnerMetricContext.register(InnerMetricContext.java:266)
>
> at org.apache.gobblin.metrics.MetricContext.register(MetricContext.java:409)
> at
> org.apache.gobblin.service.modules.orchestration.DagManager$DagManagerThread.initialize(DagManager.java:661)
>
> at
> org.apache.gobblin.service.modules.orchestration.DagManager$DagManagerThread.run(DagManager.java:493)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [?:1.8.0_172]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> [?:1.8.0_172]
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> [?:1.8.0_172]
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> [?:1.8.0_172]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [?:1.8.0_172]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [?:1.8.0_172]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]
> {code}
> Likely root cause:
> If we activate/deactivate DagManager several times, DagManagerThreads
> alongside with their flowGauges maps will be recreated, but RootMetricContext
> will still stay the same. So the newly created threads will incorrectly think
> that the flow was not registered previously, and try to register it again.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)