My first suspicion on a test timeout is usually a deadlock.  That
being said, I haven't looked at this test / change in any real detail
so I don't know if that's the case here.  How long does the test take
to run locally?

Second, I would try to remove any sleeps, or at least replace them
with the utilities SleepABit and SleepABitAsync (which handle very
tiny sleeps much better on Windows), but it doesn't look like this
test sleeps at all.

If there is no deadlock, and there is no sleep, and your test is
simply burning CPU for 5 minutes, then yes, I think it is probably
time to reduce the configuration space.  Can you sample the
configuration space with a random seed?  Make sure to use SCOPED_TRACE
to record both the seed and the case under test so that any failure
can be reproduced.  CI runs quite often, so a failure in any
particular case should still pop up reasonably soon.
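
As a rough sketch of the seeded sampling idea (plain C++, no GTest
or Arrow dependency): the Config struct, its fields and ranges, and
the SampleConfigs / Describe helpers are all made-up names for
illustration, not anything that exists in the asof join tests.

```cpp
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Hypothetical test configuration; fields and ranges are placeholders.
struct Config {
  int num_tables;    // 1..8
  int batch_size;    // power of two, 1..512
  bool use_threads;
};

// Draw `count` configurations from the full space. The same seed always
// yields the same sequence, so a failing sample can be replayed later.
// Raw engine output is reduced with % because std::mt19937's output is
// fixed by the standard, while the standard distribution classes may
// produce different values on different standard libraries.
std::vector<Config> SampleConfigs(std::uint32_t seed, int count) {
  std::mt19937 rng(seed);
  std::vector<Config> out;
  for (int i = 0; i < count; ++i) {
    Config c;
    c.num_tables = 1 + static_cast<int>(rng() % 8);
    c.batch_size = 1 << static_cast<int>(rng() % 10);
    c.use_threads = (rng() % 2) == 1;
    out.push_back(c);
  }
  return out;
}

// Message meant to be passed to SCOPED_TRACE so that a failure report
// carries both the seed and the exact case under test.
std::string Describe(std::uint32_t seed, const Config& c) {
  return "seed=" + std::to_string(seed) +
         " num_tables=" + std::to_string(c.num_tables) +
         " batch_size=" + std::to_string(c.batch_size) +
         " use_threads=" + (c.use_threads ? "true" : "false");
}
```

In the test body I would pick the seed once per run (for example from
std::random_device), loop over SampleConfigs(seed, n), and put
SCOPED_TRACE(Describe(seed, config)) at the top of each iteration, so
any assertion failure inside the loop prints the seed and the case.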

Finally, if the configuration space can't be reduced for whatever
reason, then I think we could investigate some kind of nightly
(crossbow) test with a longer timeout, but I don't know that we've
had to resort to that yet.

On Wed, Aug 17, 2022 at 3:41 AM Yaron Gvili <rt...@hotmail.com> wrote:
>
> It looks like the test normally takes less than a second. The gap in 
> running-time is not surprising because the tests I locally added cover a much 
> larger configuration-space. Before I reduce the configuration-space being 
> tested, I'd like to figure out what the acceptable alternatives are.
>
>
> Yaron.
> ________________________________
> From: Li Jin <ice.xell...@gmail.com>
> Sent: Wednesday, August 17, 2022 9:04 AM
> To: dev@arrow.apache.org <dev@arrow.apache.org>
> Subject: Re: dealing with tester timeout in a CI job
>
> Yaron, how long do the asof join tests normally take?
>
> On Wed, Aug 17, 2022 at 6:13 AM Yaron Gvili <rt...@hotmail.com> wrote:
>
> > Sorry, yes, C++. The failed job is
> > https://github.com/apache/arrow/runs/7839062613?check_suite_focus=true
> > and it timed out on code I wrote (in a PR, not merged). I'd like to avoid a
> > timeout without reengineering or reducing the set of tests I wrote, hence
> > my questions.
> >
> >
> > Yaron.
> > ________________________________
> > From: Sutou Kouhei <k...@clear-code.com>
> > Sent: Tuesday, August 16, 2022 8:13 PM
> > To: dev@arrow.apache.org <dev@arrow.apache.org>
> > Subject: Re: dealing with tester timeout in a CI job
> >
> > Hi,
> >
> > What language are you talking about? C++?
> > For C++, we have two timeouts:
> > * GitHub Action's timeout
> > * GTest's timeout
> >
> > Could you show the URL of the failed macOS related CI job?
> >
> > Thanks,
> > --
> > kou
> >
> > In
> >  <
> > paxp190mb1565310e470e696da667f540bd...@paxp190mb1565.eurp190.prod.outlook.com
> > >
> >   "dealing with tester timeout in a CI job" on Tue, 16 Aug 2022 16:34:24
> > +0000,
> >   Yaron Gvili <rt...@hotmail.com> wrote:
> >
> > > Hi,
> > >
> > > What are some acceptable ways to handle a timeout failure in a CI job
> > for a tester I implemented? For reference, I got such a timeout for only
> > one MacOS related CI job, while the other CI jobs did not get such a
> > timeout.
> > >
> > > Let's assume that I cannot (easily) make the tests run any faster. Is it
> > possible/acceptable to change the timeout, and how? to turn off some of the
> > tests for one or all CI jobs, and how? to split the tester into several, so
> > that each meets the timeout allotment?
> > >
> > >
> > > Cheers,
> > > Yaron.
> >
