I've been seeing this regularly, and getting hundreds of 'dupload failed' emails as a result (they get sent every 5 mins now after it goes wrong). I've not been keeping records (because I just bin those hundreds of emails), but it happens most weeks, and I've had two this week (opencv and gnuradio).
I'll start collecting info here to see if we can narrow this down a bit.

So, the latest is opencv_4.6.0+dfsg-13.1~exp1_armel, built for experimental on arm-ubc-06. The .changes file is dated Feb 4 03:29.

Looking in the build logs I see it was built and uploaded successfully 3hrs later on arm-arm-03:
https://buildd.debian.org/status/architecture.php?a=armel&suite=experimental&buildd=buildd_armel-arm-ubc-06
https://buildd.debian.org/status/architecture.php?a=armel&suite=experimental&buildd=buildd_armel-arm-arm-03

The second build started 1hr after the .changes file for the first one was made, so I guess there is a timeout of 1hr after the log arrives, and if there is no upload by then the buildd assumes failure and schedules another build?

I have noticed before that usually by the time I look at the failed upload there is already a new build uploaded. It would be nice if the buildds tidied up after themselves once the build is in the archive and stopped sending tiresome email awaiting a manual clear-up.

Once a new build has been issued the old failed upload should be removed. I'm not quite sure exactly what that check should look like.

Alternatively we could stop sending very frequent mail to buildd admins, and let the 'are files a week old' script tidy them up in due course.

The actual error in the failed log is:

  Finished at 2024-02-04T03:29:15Z
  Signature with key '764BC9A1354021955868EF5CC98724D9AA73AAA3' requested:
   signfile buildinfo /home/buildd/build/opencv_4.6.0+dfsg-13.1~exp1_armel-buildd.buildinfo 764BC9A1354021955868EF5CC98724D9AA73AAA3
  gpg: error running '/usr/bin/gpg-agent': exit status 2
  gpg: failed to start agent '/usr/bin/gpg-agent': General error
  gpg: can't connect to the agent: General error
  gpg: keydb_search failed: No agent running
  gpg: skipped "764BC9A1354021955868EF5CC98724D9AA73AAA3": No agent running
  gpg: /tmp/debsign.e1vK8yhj/opencv_4.6.0+dfsg-13.1~exp1_armel-buildd.buildinfo: clear-sign failed: No agent running
  debsign: gpg error occurred! Aborting....
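Going back to the tidy-up check mentioned above: here is a minimal sketch of what it might look like, in Python. It is only an illustration, not how the buildd software actually works. The names Build, parse_build_id and stale_failed_uploads are made up for this example, and the set of builds already in the archive is assumed to come from somewhere else (how to query that is not covered here).

```python
from typing import NamedTuple


class Build(NamedTuple):
    """One binary build: source package, version and architecture."""
    package: str
    version: str
    arch: str


def parse_build_id(build_id: str) -> Build:
    """Split an identifier such as 'opencv_4.6.0+dfsg-13.1~exp1_armel'.

    Assumes the usual package_version_arch form, where none of the
    three fields contains an underscore.
    """
    package, version, arch = build_id.split("_")
    return Build(package, version, arch)


def stale_failed_uploads(failed: list[str], in_archive: set[Build]) -> list[str]:
    """Return the failed uploads whose exact build is already in the archive.

    These are the ones that could be removed without manual clear-up.
    """
    return [bid for bid in failed if parse_build_id(bid) in in_archive]


if __name__ == "__main__":
    failed = [
        "opencv_4.6.0+dfsg-13.1~exp1_armel",
        "gnuradio_3.10.9.2-1.1~exp1_armel",
    ]
    # Hypothetical archive state: only the opencv rebuild has arrived.
    in_archive = {Build("opencv", "4.6.0+dfsg-13.1~exp1", "armel")}
    print(stale_failed_uploads(failed, in_archive))
```

With the example data only the opencv entry is reported as removable; the gnuradio one stays until a matching build shows up in the archive.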
Looking in var/log/messages at that time (on arm-ubc-06), we see that some script that is set to log starts 1 second after sbuild returns, at 03:29:16, and sends no more messages after 03:30:38. So it takes 1m22s (82s) to run.

Does that indicate the suspected high load which might be making gpg fail? I don't think we log load per se, do we? It has a lot of debs to check so I think it just takes a while.

Times for running that script in this log for various packages:

   16s  singular_4.3.2-p10+ds-1.1~exp1_armel
   15s  redland_1.0.17-3.1~exp1_armel
    9s  shapetools_1.4pl6-16.1~exp1_armel
    9s  solvespace_3.1+ds1-3.1~exp1_armel
    8s  secsipidx_1.3.2-2.1~exp1_armel
    8s  scamper_20211212-1.2~exp1_armel
    6s  rttr_0.9.6+dfsg1-6.1~exp1_armel
    8s  mfem_4.5.2+ds-1.5~exp1_armel
   82s  opencv_4.6.0+dfsg-13.1~exp1_armel (failed to sign)
   15s  rhvoice_1.8.0+dfsg-3.1~exp1_armel
    5s  libposix-2008-perl_0.23-1_armel
    4s  rust-expectrl_0.7.1-2_armel
    5s  liblinux-fd-perl_0.016-1_armel
   10s  swami_2.2.2-2.1~exp1_armel
    6s  symmetrica_3.0.1+ds-2.1~exp1_armel
    9s  muffin_5.8.1-2.1~exp1_armel
    5s  t4kcommon_0.1.1-11.1~exp1_armel
    4s  netperfmeter_1.9.6-1_armel
    6s  pidgin-skype_20240122+gitab786a3+dfsg-2_armel
    6s  tinyframe_0.1.1-4.1~exp1_armel
    5s  toontag_0.0~git20220105193632.41237ef-2.1~exp1_armhf
    4s  tse3_0.3.1-6.1~exp1_armel
   86s  gcc-10_10.5.0-3_armel
    5s  lomiri-camera-app_4.0.5+dfsg-1_armel
    8s  opendmarc_1.4.2-4.1~exp1_armel
  154s  libreoffice_24.2.0-1~bpo12+1_armel
   59s  gnuradio_3.10.9.2-1.1~exp1_armel (failed to sign)

So opencv is not the longest package to process: libreoffice takes quite a lot longer, and gcc-10 slightly longer. But most are way quicker.

I have noticed that it's usually larger packages that go wrong (libreoffice, gcc, binutils), but not always.

Not sure if any of this info helps, but those are my investigations for today. Suggestions for other monitoring, or for the best way to work around it by not fixing it and just making it less annoying, are welcome.

Wookey
--
Principal hats: Debian, Wookware, ARM
http://wookware.org/