Hey Николай,

Apologies about this - I wasn't aware of this behavior. I have made all the
gists public.



On Wed, Dec 20, 2023 at 12:09 AM Greg Harris <greg.har...@aiven.io.invalid>
wrote:

> Hey Stan,
>
> Thanks for opening the discussion. I haven't been looking at overall
> build duration recently, so it's good that you are calling it out.
>
> I worry about us over-indexing on this one build, which itself appears
> to be an outlier. I only see one other build [1] above 6h overall in
> the last 90 days in this view: [2]
> And I don't see any overlap of failed tests in these two builds, which
> makes it less likely that these particular failed tests are the causes
> of long build times.
>
> Separately, I've been investigating build environment slowness, and
> trying to connect it with test failures [3]. I observed that the CI
> build environment is 2-20 times slower than my developer machine (an
> M1 Mac).
> When I simulate a similar slowdown locally, there are tests which
> become significantly more flaky, often due to hard-coded timeouts.
> I think that these particularly nasty builds could be explained by
> long-tail slowdowns causing arbitrary tests to take an excessive time
> to execute.
>
> Rather than trying to find signals in these rare test failures, I
> think we should find tests that have these sorts of failures more
> regularly.
> There are lots of builds in the 5-6h duration bracket, which is
> certainly unacceptably long. We should look into these builds to find
> improvements and optimizations.
>
> [1] https://ge.apache.org/s/ygh4gbz4uma6i/
> [2]
> https://ge.apache.org/scans?list.sortColumn=buildDuration&search.relativeStartTime=P90D&search.rootProjectNames=kafka&search.tags=trunk&search.timeZoneId=America%2FNew_York
> [3] https://github.com/apache/kafka/pull/15008
>
> Thanks for looking into this!
> Greg
>
> On Tue, Dec 19, 2023 at 3:45 PM Николай Ижиков <nizhi...@apache.org>
> wrote:
> >
> > Hello, Stanislav.
> >
> > Can you please make the gist public?
> > Private gists are not available to some GitHub users even if the link is known.
> >
> > > On Dec 19, 2023, at 17:33, Stanislav Kozlovski
> > > <stanis...@confluent.io.INVALID> wrote:
> > >
> > > Hey everybody,
> > > > I've heard various complaints that builds in trunk are taking too
> > > > long, some as much as 8 hours (the timeout), and this is slowing us
> > > > down from being able to meet the code freeze deadline for 3.7.
> > >
> > > > I took it upon myself to gather up some data in Gradle Enterprise to
> > > > see if there are any outlier tests that are causing this slowness.
> > > > Turns out there are a few, in this particular build -
> > > > https://ge.apache.org/s/un2hv7n6j374k/
> > > > - which took 10 hours and 29 minutes in total.
> > >
> > > > I have compiled the tests that took a disproportionately large amount
> > > > of time (20m+), alongside their time, error message and a link to
> > > > their full log output here -
> > > >
> > > > https://gist.github.com/stanislavkozlovski/8959f7ee59434f774841f4ae2f5228c2
> > >
> > > It includes failures from core, streams, storage and clients.
> > > > Interestingly, some other tests that don't fail also take a long time
> > > > in what is apparently the test harness framework. See the gist for
> > > > more information.
> > >
> > > > I am starting this thread with the intention of getting the discussion
> > > > started and brainstorming what we can do to get the build times back
> > > > under control.
> > >
> > >
> > > --
> > > Best,
> > > Stanislav
> >
>


-- 
Best,
Stanislav
