Chi-Hsuan Huang created HDDS-15449:
--------------------------------------
Summary: Avoid leaked event-processing thread and async work
outliving tests in TestReconTaskControllerImpl
Key: HDDS-15449
URL: https://issues.apache.org/jira/browse/HDDS-15449
Project: Apache Ozone
Issue Type: Sub-task
Components: Ozone Recon, test
Reporter: Chi-Hsuan Huang
TestReconTaskControllerImpl starts the controller's background
event-processing thread in setUp() but never stops it in a teardown, so
every test leaks that thread.
More importantly, some tests queue async work that outlives the test method.
For example, testNewRetryLogicWithSuccessfulCheckpoint calls
queueReInitializationEvent(...), which offers a reinit event to the buffer;
the background loop then runs reInitializeTasks asynchronously after the test
method has already returned and asserted. That async work performs real DB
writes against the per-test Derby DB while it is being torn down, producing
"Failed to update table" errors.
This was exposed while working on HDDS-15269: simply adding an @AfterEach that
calls stop() is not sufficient, because stop() (correctly) waits for the
in-flight reinit to finish, and that in-flight work raced the Derby teardown
and stalled for ~30s. So the cleanup must also ensure that tests do not leave
async work running.
Proposed:
- Now that stop() returns promptly (HDDS-15269), add a proper teardown
that stops the controller.
- Make tests that queue async events quiesce that work deterministically
(e.g. wait for completion, or drive processing synchronously) so no async
work outlives the test method and races DB teardown.
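The two proposed steps could look roughly like the sketch below. It is plain Java with no JUnit or Ozone dependency, and SketchController, queueEvent, and the latch-based completion signal are hypothetical stand-ins for illustration, not the real ReconTaskControllerImpl API. In an actual JUnit 5 test, the stop call in the finally block would presumably live in an @AfterEach method.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the controller; not the real Recon API. */
final class SketchController {
  private final BlockingQueue<Runnable> buffer = new LinkedBlockingQueue<>();
  private volatile boolean running;
  private Thread worker;

  /** Starts the background event-processing thread. */
  synchronized void start() {
    running = true;
    worker = new Thread(this::loop, "sketch-event-processor");
    worker.setDaemon(true);
    worker.start();
  }

  private void loop() {
    while (running) {
      try {
        // Poll with a timeout so the loop re-checks the running flag.
        Runnable event = buffer.poll(50, TimeUnit.MILLISECONDS);
        if (event != null) {
          event.run();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
  }

  /** Queues work and returns a latch that opens once the work has run. */
  CountDownLatch queueEvent(Runnable work) {
    CountDownLatch done = new CountDownLatch(1);
    buffer.add(() -> {
      try {
        work.run();
      } finally {
        done.countDown();
      }
    });
    return done;
  }

  /** Stops the loop and waits for the worker; returns true if it exited. */
  synchronized boolean stop(long timeoutMillis) throws InterruptedException {
    running = false;
    if (worker == null) {
      return true;
    }
    worker.interrupt();
    worker.join(timeoutMillis);
    return !worker.isAlive();
  }
}

final class QuiesceSketch {
  /** Mimics one test method followed by its teardown. */
  static boolean demo() {
    SketchController controller = new SketchController();
    AtomicInteger writes = new AtomicInteger(); // stands in for DB writes
    controller.start();
    try {
      try {
        CountDownLatch done = controller.queueEvent(writes::incrementAndGet);
        // Quiesce: wait for the queued work before asserting or tearing down.
        boolean finished = done.await(5, TimeUnit.SECONDS);
        return finished && writes.get() == 1;
      } finally {
        // Teardown: always stop the background thread.
        controller.stop(5000);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("quiesced cleanly: " + demo());
  }
}
```

Waiting on the latch before the test body returns means the teardown never has in-flight work to wait for, so stopping the thread and dropping the per-test database cannot overlap with a pending write.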
This is a test-correctness/hygiene improvement; it also removes per-test
thread leakage during the suite.