The clock sync errors do seem to have increased over the past few
months. If we could just fix those, I think we'd be left with almost
entirely "known" flakies. Any ideas as to what's going on? Think it's
something that could be addressed with your NTP-client-in-Kudu patch
series?

On Fri, Mar 23, 2018 at 9:58 AM, Todd Lipcon <[email protected]> wrote:
> It seems that over recent weeks our precommits have gotten somewhat flaky.
> Some of this is due to actual flaky tests (most of which are tracked by
> JIRAs) but a lot has been due to issues like clock synchronization problems
> on the dist-test slaves.
>
> I'd like to consider changing precommit to retry _all_ tests up to 3 times,
> instead of just known-flakies. It's a bit of a heavy hammer -- the risk is
> that if you introduce flakiness in a test you aren't likely to see it
> precommit, but I think the upside of avoiding wasted effort triaging failed
> precommits is probably worth it.
>
> Longer term hopefully we can improve the dist-test software to support
> something like a "retry if results match a certain regex" to check for
> clock sync errors or somesuch, but I think it's non-trivial.
>
> Thoughts?
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
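
For what it's worth, the "retry if results match a certain regex" idea could look roughly like the sketch below. This is only an illustration in Python, not dist-test code: the pattern, the run_with_retry helper, and the run_test callable returning (passed, output) are all hypothetical, and the real clock sync error text would need to be taken from the failing logs.

```python
import re

# Hypothetical pattern: the actual clock sync error message is not quoted
# in this thread, so this regex is only a placeholder.
RETRYABLE = re.compile(r"clock.*(unsynchronized|not synchronized)", re.IGNORECASE)


def run_with_retry(run_test, max_attempts=3, retryable=RETRYABLE):
    """Run run_test() until it passes, retrying only environmental failures.

    run_test is a hypothetical callable returning (passed, output).
    Returns (passed, attempts_used).
    """
    attempts = 0
    while True:
        attempts += 1
        passed, output = run_test()
        if passed:
            return True, attempts
        # Give up when out of attempts, or when the failure does not look
        # like a clock sync problem (so real test failures surface at once).
        if attempts >= max_attempts or not retryable.search(output):
            return False, attempts
```

Compared with retrying all tests up to 3 times, this only retries failures whose output matches the pattern, so newly introduced flakiness would still show up in precommit.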
