Hello,

On Fri, Dec 21, 2018 at 2:18, Mark H Weaver <m...@netris.org> wrote:

> Hi Julien,
>
> I've rearranged your reply from "top-posting" style to "bottom-posting"
> style.  Please consider using bottom-posting in the future.
>
> I wrote:
>
> > Julien Lepiller <jul...@lepiller.eu> writes:
> >
> >> I'd like to get staging merged soon, as it hasn't been for quite some
> >> time. Here are some stats about the current state of substitutes for
> >> staging:
> >>
> >> According to guix weather, we have:
> >>
> >> | architecture | berlin | hydra |
> >> +--------------+--------+-------+
> >> | x86_64       | 36.5%  | 81.7% |
> >> | i686         | 23.8%  | 71.0% |
> >> | aarch64      | 22.2%  | 00.0% |
> >> | armhf        | 17.0%  | 45.6% |
> >>
> >> What should the next step be?
> >
> > I think we should wait until the coverage on armhf and aarch64 has
> > become larger, for the sake of users on those systems.
> >
> > Also, I've seen some commits that make me wonder if hydra is still
> > being configured as an authorized substitute server on new Guix
> > installations.
> > Do you know?
> >
> > If 'berlin' is the only substitute server by default, then we certainly
> > need to wait for those numbers to get higher, no?
> >
> > What do you think?
>
> Julien Lepiller <jul...@lepiller.eu> responded:
>
> > I agree, but I wonder if there is a reason for these to be so low?
>
> It's a good question.  I have several hypotheses:
>
> * Unfortunately, it is fairly common for builds for important core
>   packages to spuriously fail, often due to unreliable test suites, and
>   to cause thousands of other important dependent packages to fail.
>   When this happens on Hydra, I can see what's going on, and restart the
>   build and all of its dependents.
>

This is currently a problem: we can't see which dependency caused a
dependent build to fail.


>   I wouldn't be surprised if some important core packages spuriously
>   failed to build on Berlin, but we have no effective way to see what
>   happened there.  If that's the case, the 'guix weather' numbers above
>   might never get much higher no matter how long we wait.
>
> * Berlin's build slots may have been occupied for long periods of time
>   by 'test.*' jobs stuck in an endless "waiting for udevd..." loop, as
>   described in <https://bugs.gnu.org/33362>.
>
>   Hydra's web interface allows me to monitor active jobs and manually
>   kill those stuck jobs when I find them.  I don't know how to do that
>   on Berlin.
>
> * Especially on armhf and aarch64, where Berlin has very little build
>   capacity, and new builds are being added to Berlin's build queue much
>   faster than they can be built, it is quite possible that Berlin is
>   spending most of its effort on long-outdated builds.
>
>   On Hydra, I can see when this is happening, and often intervene by
>   cancelling large numbers of outdated builds on armhf, so that it
>   remains focused on the most popular and up-to-date packages.
>
Berlin currently lacks an admin interface. We need one, because canceling
a job should be a privileged operation.


> * On WIP branches like 'core-updates' and 'staging', when a new
>   evaluation is done, I cancel all outdated Hydra jobs on those
>   branches.  I don't know if anything similar is done on Berlin.
>
> In summary, there are several things that I regularly do to make
> efficient use of Hydra's limited build capacity.  I periodically look at
> Berlin's web interface to see how it has progressed, but it is currently
> mostly a black box to me.  I see no effective way to focus its limited
> resources on the most important builds, or to see when build slots are
> stuck.
>
>      Regards,
>        Mark
>
I am currently looking into how to improve the situation. Suggestions are
welcome.

Gábor
