Hello,

On Fri, Dec 21, 2018 at 2:18, Mark H Weaver <m...@netris.org> wrote:
> Hi Julien,
>
> I've rearranged your reply from "top-posting" style to "bottom-posting"
> style. Please consider using bottom-posting in the future.
>
> I wrote:
>
> > Julien Lepiller <jul...@lepiller.eu> writes:
> >
> >> I'd like to get staging merged soon, as it wasn't for quite some
> >> time. Here are some stats about the current state of substitutes for
> >> staging:
> >>
> >> According to guix weather, we have:
> >>
> >> | architecture | berlin | hydra |
> >> +--------------+--------+-------+
> >> | x86_64       | 36.5%  | 81.7% |
> >> | i686         | 23.8%  | 71.0% |
> >> | aarch64      | 22.2%  | 00.0% |
> >> | armhf        | 17.0%  | 45.6% |
> >>
> >> What should the next step be?
> >
> > I think we should wait until the coverage on armhf and aarch64 has
> > become larger, for the sake of users on those systems.
> >
> > Also, I've seen some commits that make me wonder if hydra is still
> > being configured as an authorized substitute server on new Guix
> > installations.
> > Do you know?
> >
> > If 'berlin' is the only substitute server by default, then we certainly
> > need to wait for those numbers to get higher, no?
> >
> > What do you think?
>
> Julien Lepiller <jul...@lepiller.eu> responded:
>
> > I agree, but I wonder if there is a reason for these to be so low?
>
> It's a good question. I have several hypotheses:
>
> * Unfortunately, it is fairly common for builds for important core
>   packages to spuriously fail, often due to unreliable test suites, and
>   to cause thousands of other important dependent packages to fail.
>   When this happens on Hydra, I can see what's going on, and restart the
>   build and all of its dependents.
>

This is currently a problem: we can't see which dependency caused the
dependency failure.

>   I wouldn't be surprised if some important core packages spuriously
>   failed to build on Berlin, but we have no effective way to see what
>   happened there. If that's the case, the 'guix weather' numbers above
>   might never get much higher no matter how long we wait.
>
> * Berlin's build slots may have been occupied for long periods of time
>   by 'test.*' jobs stuck in an endless "waiting for udevd..." loop, as
>   described in <https://bugs.gnu.org/33362>.
>
>   Hydra's web interface allows me to monitor active jobs and manually
>   kill those stuck jobs when I find them. I don't know how to do that
>   on Berlin.
>
> * Especially on armhf and aarch64, where Berlin has very little build
>   capacity, and new builds are being added to Berlin's build queue much
>   faster than they can be built, it is quite possible that Berlin is
>   spending most of its effort on long-outdated builds.
>
>   On Hydra, I can see when this is happening, and often intervene by
>   cancelling large numbers of outdated builds on armhf, so that it
>   remains focused on the most popular and up-to-date packages.
>

We are currently missing an admin interface on berlin, and we would need
one, since canceling a job should be a privileged operation.

> * On WIP branches like 'core-updates' and 'staging', when a new
>   evaluation is done, I cancel all outdated Hydra jobs on those
>   branches. I don't know if anything similar is done on Berlin.
>
> In summary, there are several things that I regularly do to make
> efficient use of Hydra's limited build capacity. I periodically look at
> Berlin's web interface to see how it has progressed, but it is currently
> mostly a black box to me. I see no effective way to focus its limited
> resources on the most important builds, or to see when build slots are
> stuck.
>
> Regards,
> Mark
>

I am currently looking into how to improve the situation. Suggestions are
welcome.

Gábor