kgyrtkirk commented on code in PR #18510:
URL: https://github.com/apache/druid/pull/18510#discussion_r2377870181
##########
embedded-tests/src/test/java/org/apache/druid/testing/embedded/indexing/IndexTaskTest.java:
##########
@@ -99,6 +100,9 @@ public void test_runIndexTask_forInlineDatasource()
}
cluster.callApi().waitForAllSegmentsToBeAvailable(dataSource, coordinator, broker);
+ broker.latchableEmitter().waitForEvent(
+ event -> event.hasDimension(DruidMetrics.DATASOURCE, dataSource)
+ );
Review Comment:
Oh... commit 69be8315d675376d7408224ee6c477fa705dade1 removed this; I had a
conflict and this was left in by mistake.
##########
indexing-service/src/test/java/org/apache/druid/indexing/common/task/batch/parallel/TaskMonitorTest.java:
##########
@@ -296,10 +305,23 @@ public ListenableFuture<Void> runTask(String taskId, Object taskObject)
if (task.throwUnknownTypeIdError) {
throw new RuntimeException(new ISE("Could not resolve type id 'test_task_id'"));
}
- taskRunner.submit(() -> tasks.put(task.getId(), task.run(null).getStatusCode()));
+ TaskToolbox taskToolbox = makeToolbox();
+ taskRunner.submit(() -> tasks.put(task.getId(), task.run(taskToolbox).getStatusCode()));
Review Comment:
None of these tests set things up properly; they do the bare minimum. Emitting
a metric from the task then hit a wall because the toolbox is simply not set
(it was passed as `null`).
I also had to undo the `stopGracefully` enhancements in `NoopTask`, as they
seemed to be destabilizing some other tests.
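To illustrate the failure mode described above, here is a minimal, self-contained sketch. It does not use the real Druid classes: `SketchToolbox`, `SketchEmitter`, and the metric name `sketch/task/run/count` are hypothetical stand-ins, and the real `TaskToolbox` API may differ. It only shows why a test that passes `null` starts failing once the task body emits a metric.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, not the real Druid types.
class NullToolboxSketch {
    interface SketchEmitter {
        void emit(String metric, long value);
    }

    static class SketchToolbox {
        private final SketchEmitter emitter;

        SketchToolbox(SketchEmitter emitter) {
            this.emitter = emitter;
        }

        SketchEmitter getEmitter() {
            return emitter;
        }
    }

    // A task body that emits a metric; it dereferences the toolbox.
    static String runTask(SketchToolbox toolbox) {
        toolbox.getEmitter().emit("sketch/task/run/count", 1L);
        return "SUCCESS";
    }

    // Mimics a test harness that maps an exception to a failed status.
    static String runSafely(SketchToolbox toolbox) {
        try {
            return runTask(toolbox);
        } catch (NullPointerException e) {
            return "FAILED";
        }
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        SketchToolbox toolbox = new SketchToolbox((metric, value) -> emitted.add(metric));

        System.out.println(runSafely(null));    // toolbox not set
        System.out.println(runSafely(toolbox)); // toolbox provided
        System.out.println(emitted);
    }
}
```

Under these assumptions, building a toolbox up front (as `makeToolbox()` does in the diff) is what lets the task reach its emit call.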
##########
indexing-service/src/main/java/org/apache/druid/indexing/overlord/hrtr/HttpRemoteTaskRunner.java:
##########
@@ -1543,6 +1550,9 @@ public void taskAddedOrUpdated(final TaskAnnouncement announcement, final Worker
HttpRemoteTaskRunnerWorkItem.State.RUNNING
);
tasks.put(taskId, taskItem);
+ final ServiceMetricEvent.Builder metricBuilder = new ServiceMetricEvent.Builder();
+ metricBuilder.setDimension(DruidMetrics.TASK_ID, taskId);
+ emitter.emit(metricBuilder.setMetric(TASK_UNKNOWN_COUNT, (long) 1));
Review Comment:
Renamed it to `task/discovered/count`.
It seems that if a new overlord becomes the leader, it may launch that task
again.
However, I was not able to make that happen with debug breakpoints, because
there is a latch that prevents it:
[HttpRemoteTaskRunner](https://github.com/apache/druid/blob/9481a535be627d2e9cb1974e68eb12935e70e5b3/indexing-service/src/main/java/org/apache/druid/indexing/overlord/hrtr/HttpRemoteTaskRunner.java#L569)
blocks until the workers have synced their state, which in turn blocks
`DruidOverlord#becomeLeader` through `Lifecycle#start`.
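The blocking behaviour described above can be sketched with a plain `CountDownLatch`. This is only an illustration under stated assumptions: `SketchTaskRunner`, `workerSynced`, and `simulate` are hypothetical names, and the real `HttpRemoteTaskRunner` and `Lifecycle` code is more involved. The point is that the "leader" thread cannot proceed past `start` until every worker has reported its state.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the real Druid classes.
class LeaderStartupSketch {
    static class SketchTaskRunner {
        private final CountDownLatch workersSynced;

        SketchTaskRunner(int workerCount) {
            this.workersSynced = new CountDownLatch(workerCount);
        }

        // Called once per worker after its state has been synced.
        void workerSynced() {
            workersSynced.countDown();
        }

        // Blocks until all workers have synced, or the timeout elapses.
        boolean start(long timeoutMillis) throws InterruptedException {
            return workersSynced.await(timeoutMillis, TimeUnit.MILLISECONDS);
        }
    }

    static List<String> simulate(int workerCount) throws InterruptedException {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        SketchTaskRunner runner = new SketchTaskRunner(workerCount);

        // Stand-in for the thread that becomes leader and starts the runner.
        Thread leader = new Thread(() -> {
            try {
                if (runner.start(5000)) {
                    events.add("leader-ready");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        leader.start();

        for (int i = 0; i < workerCount; i++) {
            events.add("worker-" + i + "-synced"); // recorded before the count-down
            runner.workerSynced();
        }
        leader.join();
        return events;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(simulate(2));
    }
}
```

Because each worker event is recorded before its `countDown()`, and the leader only records its event after `await` returns, `leader-ready` always comes last in this sketch. That mirrors why the relaunch could not be reproduced with breakpoints.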
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]