Hi,

On 19-05-2024 2:55 p.m., 陈 晟祺 wrote:
My concern now is that the results do not seem to be stable or reproducible.

That's a recurring problem in lots of places, yes.

Is there any convention in handling such situation? E.g., should I mark all zfs-test-suite-x as flaky and treat them as reference only?

It depends ;)

The disadvantage of marking the whole test stanza as flaky is that it won't block regressions at all. Depending on how the test (I mean per stanza in d/t/control) is set up, it may make more sense to mark individual tests as flaky than the whole suite/stanza. However, if there's not enough granularity, that doesn't really help.
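
As a rough sketch of that granularity, assuming the suite can be split into separate test scripts under debian/tests/ (the test names below are hypothetical, not the actual package layout), the reliable tests keep counting as regressions while only the unreliable one carries the flaky restriction:

```
# debian/tests/control (sketch; test names are hypothetical)
# Reliable tests: a failure here still counts as a regression.
Tests: stable-checks
Depends: @
Restrictions: needs-root

# Known-unreliable test: with "flaky", a failure is treated like a skip.
Tests: unreliable-checks
Depends: @
Restrictions: needs-root, flaky
```

This only helps if the flaky cases can actually be separated out. If everything runs from one monolithic script, the restriction necessarily covers the whole stanza.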

Then there's the infrastructure argument. If your test is not a cheap one, running a long test only to have it fail because of flakiness is a rather high price for very little gain. In that case it might make more sense to not run the test by default (add an unknown restriction, for example) and only use the test for manual checking, where you can judge (or rerun) the result as you see fit.
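
A minimal sketch of the unknown-restriction approach, assuming the test runner skips any test that declares a restriction it does not recognise. Both the test name and the restriction name `manual-only-hypothetical` are made up for illustration. Any name the runner does not know should have the same effect:

```
# debian/tests/control (sketch; names are hypothetical)
# The restriction below is deliberately one the runner does not know,
# so the test is skipped by default on the CI infrastructure.
Tests: long-running-suite
Depends: @
Restrictions: needs-root, manual-only-hypothetical
```

Someone checking the package by hand would then presumably run the test script directly, or drop the made-up restriction locally, and judge the result themselves.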

In the end it's your decision. All I can say is that tests that are flaky enough (my threshold is a failure rate of roughly worse than 1 in 8) and not marked as such are considered RC buggy.

Paul

