[ 
https://issues.apache.org/jira/browse/HDDS-15449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jheng-Sing Chen reassigned HDDS-15449:
--------------------------------------

    Assignee: Jheng-Sing Chen

> Avoid leaked event-processing thread and async work outliving tests in 
> TestReconTaskControllerImpl
> --------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-15449
>                 URL: https://issues.apache.org/jira/browse/HDDS-15449
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: Ozone Recon, test
>            Reporter: Chi-Hsuan Huang
>            Assignee: Jheng-Sing Chen
>            Priority: Major
>
> TestReconTaskControllerImpl starts the controller's background
> event-processing thread in setUp() but never stops it in a teardown method,
> so every test leaks that thread.
> More importantly, some tests queue async work that outlives the test method.
> For example, testNewRetryLogicWithSuccessfulCheckpoint calls
> queueReInitializationEvent(...), which offers a reinit event to the buffer;
> the background loop then runs reInitializeTasks asynchronously, after the
> test method has already asserted and returned. That async work performs real
> DB writes against the per-test Derby DB while the DB is being torn down,
> producing "Failed to update table" errors.
> This was exposed while working on HDDS-15269: simply adding an @AfterEach
> that calls stop() is not sufficient, because stop() (correctly) waits for
> the in-flight reinit to finish. When that was tried, the in-flight work
> raced the Derby teardown and stalled for ~30s. So the cleanup must also
> ensure that tests do not leave async work running.
> Proposed:
> - Now that stop() returns promptly (HDDS-15269), add a proper teardown
> that stops the controller.
> - Make tests that queue async events quiesce that work deterministically
> (e.g. wait for completion, or drive processing synchronously) so no async
> work outlives the test method and races DB teardown.
> This is a test-correctness/hygiene improvement; it also removes per-test
> thread leakage during the suite.
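
The two proposed steps (wait for queued async work to complete, then stop the
controller in teardown) can be illustrated with a minimal, self-contained
sketch. Everything in it is an assumption made for illustration: FakeController,
queueEvent, and runScenario are hypothetical stand-ins and are not the
ReconTaskControllerImpl API, and a CountDownLatch is just one possible way to
wait for completion. In a real JUnit test the stop() call would live in an
@AfterEach method.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustrative sketch only; hypothetical names, not the Recon controller API. */
class QuiesceSketch {

  /** Hypothetical controller: one background thread drains a buffer of events. */
  static final class FakeController {
    private final BlockingQueue<Runnable> buffer = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(this::loop, "event-processing");
    private volatile boolean running;

    void start() {
      running = true;
      worker.setDaemon(true);
      worker.start();
    }

    void queueEvent(Runnable event) {
      buffer.offer(event);
    }

    private void loop() {
      while (running) {
        try {
          Runnable event = buffer.poll(50, TimeUnit.MILLISECONDS);
          if (event != null) {
            event.run();
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }

    /** Signals the loop to exit and waits (bounded) for the worker to finish. */
    void stop() throws InterruptedException {
      running = false;
      worker.interrupt();
      worker.join(5_000);
    }

    boolean isWorkerAlive() {
      return worker.isAlive();
    }
  }

  /** Returns {asyncWorkFinishedBeforeTeardown, workerAliveAfterStop}. */
  static boolean[] runScenario() {
    FakeController controller = new FakeController();
    controller.start(); // what setUp() would do
    AtomicBoolean writeDone = new AtomicBoolean(false);
    CountDownLatch done = new CountDownLatch(1);
    boolean finished;
    try {
      try {
        controller.queueEvent(() -> {
          writeDone.set(true); // stands in for the async DB write
          done.countDown();
        });
        // Quiesce: wait for the queued work before asserting or returning.
        finished = done.await(10, TimeUnit.SECONDS) && writeDone.get();
      } finally {
        controller.stop(); // what an @AfterEach teardown would do
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return new boolean[] {finished, controller.isWorkerAlive()};
  }

  public static void main(String[] args) {
    boolean[] result = runScenario();
    System.out.println("workFinished=" + result[0] + " workerAlive=" + result[1]);
  }
}
```

Because the test body blocks on the latch before returning, the teardown never
runs concurrently with the queued work, so stop() has nothing in flight to wait
for.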



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
