[jira] [Commented] (NIFI-10037) Improve System Tests Resilience when one test fails

ASF subversion and git services (Jira) Thu, 19 May 2022 11:15:07 -0700


    [ 
https://issues.apache.org/jira/browse/NIFI-10037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539725#comment-17539725
 ]


ASF subversion and git services commented on NIFI-10037:
--------------------------------------------------------

Commit 38b51b0dde24929563c4fc6c9a8c7a10e39ef713 in nifi's branch 
refs/heads/main from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=38b51b0dde ]

NIFI-10037: When system test fails to clean up flow, destroy the entire 
environment so that the next test starts in a healthy state. Name 
troubleshooting directories with the name of the test class to avoid ambiguity. 
Also added a log statement so that we know which test is running when looking 
at the log output from the tests themselves. Finally, found an issue in 
AbstractComponentNode in which we iterate over the elements in a Map and call 
setProperty, which can update the underlying Map - updated to first create a 
copy of the HashMap. Fixed that under this Jira because I suspect it is causing 
one of the test failures that I've been investigating.

This closes #6059

Signed-off-by: David Handermann <exceptionfact...@apache.org>
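
As an illustration of the Map issue mentioned in the commit message above, here 
is a minimal sketch. It is not the actual AbstractComponentNode code: the class 
PropertyCopySketch, its methods, and the ".enabled" side effect are all 
hypothetical and only stand in for "setProperty can update the underlying Map". 
Iterating over the live HashMap while adding entries typically fails with a 
ConcurrentModificationException; iterating over a copy avoids that.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a component that stores properties in a HashMap.
class PropertyCopySketch {
    private final Map<String, String> properties = new HashMap<>();

    PropertyCopySketch(Map<String, String> initial) {
        properties.putAll(initial);
    }

    // Assumed side effect: setting a property may also add a derived entry,
    // which structurally modifies the underlying map.
    void setProperty(String name, String value) {
        properties.put(name, value);
        properties.putIfAbsent(name + ".enabled", "true");
    }

    // Problematic: iterates over the live map while setProperty adds entries.
    void reapplyUnsafely() {
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            setProperty(entry.getKey(), entry.getValue());
        }
    }

    // Safer: iterate over a copy so that changes to the live map cannot
    // invalidate the iterator.
    void reapplySafely() {
        Map<String, String> copy = new HashMap<>(properties);
        for (Map.Entry<String, String> entry : copy.entrySet()) {
            setProperty(entry.getKey(), entry.getValue());
        }
    }

    // Returns a copy of the current properties for inspection.
    Map<String, String> snapshot() {
        return new HashMap<>(properties);
    }

    public static void main(String[] args) {
        Map<String, String> initial = new HashMap<>();
        initial.put("a", "1");
        initial.put("b", "2");

        try {
            new PropertyCopySketch(initial).reapplyUnsafely();
        } catch (ConcurrentModificationException e) {
            System.out.println("Live iteration failed: " + e);
        }

        PropertyCopySketch safe = new PropertyCopySketch(initial);
        safe.reapplySafely();
        System.out.println("Properties after safe reapply: " + safe.snapshot());
    }
}
```

The copy costs one extra allocation per call, which is usually negligible next 
to the cost of an intermittent failure that is hard to reproduce.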


> Improve System Tests Resilience when one test fails
> ---------------------------------------------------
>
>                 Key: NIFI-10037
>                 URL: https://issues.apache.org/jira/browse/NIFI-10037
>             Project: Apache NiFi
>          Issue Type: Sub-task
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> When one system test fails, it can sometimes leave the NiFi instance in a bad 
> state. The NiFiSystemIT then runs its tearDown() method, but with the 
> instance in an unexpected state, the destroyFlow() method may fail. An 
> example of this is if a node is offloaded in a system test but the offload 
> never completes. As a result, destroyFlow() will fail.
> In this case, the instance is left in a bad state. So the next test will 
> fail. This will cause cascading failures because the NiFi instance is not in 
> the expected state to begin with. This then makes it very difficult to 
> understand what is happening.
> We can improve this by updating the NiFiSystemIT class so that when 
> destroyFlow() fails, we call cleanup(), which will shut down the NiFi instance 
> and destroy the environment, creating a fresh environment for the next test 
> so that we know we're in a good state.
> Also noticed an issue where test names can conflict in different classes. For 
> example, OffloadIT and LoadBalanceIT both have a test named testOffload() and 
> as a result, the names of the 'troubleshooting' directory collide and 
> overwrite one another. We should include the Test Class Name in the name of 
> the directory. This will also help in troubleshooting problems, as it makes 
> it easier to identify which test failed.
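
The two changes proposed in the description above could be sketched roughly as 
follows. This is not the NiFiSystemIT implementation: the TestEnvironment 
interface, the TeardownSketch class, and the "ClassName_methodName" directory 
format are assumptions made for illustration. Only the method names 
destroyFlow() and cleanup() come from the issue text.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class TeardownSketch {

    // Hypothetical abstraction over the test environment; the method names
    // mirror those mentioned in the issue description.
    interface TestEnvironment {
        void destroyFlow() throws Exception;

        void cleanup();
    }

    // Tries the cheap teardown first. If it fails, falls back to a full
    // cleanup so that the next test starts from a fresh environment.
    // Returns true when the full cleanup was needed.
    static boolean tearDown(TestEnvironment environment) {
        try {
            environment.destroyFlow();
            return false;
        } catch (Exception e) {
            // Assumption: failures surface as exceptions. A real test might
            // also need to handle assertion errors here.
            environment.cleanup();
            return true;
        }
    }

    // Includes the test class name in the directory name so that methods
    // with the same name in different classes do not collide.
    // The "ClassName_methodName" format is an assumption.
    static Path troubleshootingDir(Path base, Class<?> testClass, String methodName) {
        return base.resolve(testClass.getSimpleName() + "_" + methodName);
    }

    public static void main(String[] args) {
        // String and Integer stand in for two test classes that both
        // declare a method named testOffload().
        Path base = Paths.get("target", "troubleshooting");
        System.out.println(troubleshootingDir(base, String.class, "testOffload"));
        System.out.println(troubleshootingDir(base, Integer.class, "testOffload"));
    }
}
```

Falling back to a full cleanup makes a failing test slower, but it keeps one 
failure from cascading into the tests that run after it.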



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
