bug#36754: New linux-libre failed to build on armhf on Berlin
Mark H Weaver writes:

> Hi Ricardo,
>
> Interesting.  I distinctly remember that there was no log file when I
> looked last time.  Hmm.
>
> Anyway, it seems that now, all of the failed builds have either build
> logs available or else information about which dependency failed.  I
> don't remember seeing any of this last time, but I'm glad to see it now.
>
> A pattern has now emerged, but I don't know what it means.  All of the
> armhf kernel builds failed except for linux-libre-arm-veyron-5.2.2,
> which succeeded:
>
>   https://ci.guix.gnu.org/build/1488502/details  (arm-veyron-5.2.2)
>
> Apart from this anomalous success, all of the armhf 5.2.2 and 4.19.60
> have a truncated log file:
>
>   https://ci.guix.gnu.org/build/1488517/details  (5.2.2)
>   https://ci.guix.gnu.org/build/1488503/details  (4.19.60)
>   https://ci.guix.gnu.org/build/1488513/details  (arm-generic-5.2.2)
>   https://ci.guix.gnu.org/build/1488519/details  (arm-generic-4.19.60)
>   https://ci.guix.gnu.org/build/1488504/details  (arm-omap2plus-5.2.2)
>   https://ci.guix.gnu.org/build/1488501/details  (arm-omap2plus-4.19.60)
>
> This pattern seems too regular to be a coincidence.  Can we find out
> which build machines were used for these builds?

I tried building 5.2.2 'interactively' on Berlin, and got an SSH error:

    CC [M]  net/openvswitch/vport-geneve.o
    CC [M]  net/openvswitch/vport-gre.o
    LD [M]  net/openvswitch/openvswitch.o
  ;;; [2019/07/23 05:14:53.501502, 0] read_from_channel_port: [GSSH ERROR] Error reading from the channel: #
  Backtrace:
            16 (apply-smob/1 #)
  In ice-9/boot-9.scm:
      705:2 15 (call-with-prompt _ _ #)
  In ice-9/eval.scm:
      619:8 14 (_ #(#(#)))
  In guix/ui.scm:
    1747:12 13 (run-guix-command _ . _)
  In guix/scripts/offload.scm:
     781:22 12 (guix-offload . _)
  In ice-9/boot-9.scm:
      829:9 11 (catch _ _ # …)
      829:9 10 (catch _ _ # …)
  In guix/scripts/offload.scm:
     580:19  9 (process-request _ _ _ _ #:print-build-trace? _ # _ # _)
      531:6  8 (call-with-timeout _ _ _)
      361:2  7 (transfer-and-offload # …)
  In ice-9/boot-9.scm:
      829:9  6 (catch _ _ # …)
  In guix/scripts/offload.scm:
      385:6  5 (_)
  In guix/store.scm:
    1203:15  4 (_ # _ _)
     692:11  3 (process-stderr # _)
  In guix/serialization.scm:
      87:11  2 (read-int _)
      73:12  1 (get-bytevector-n* # …)
  In unknown file:
             0 (get-bytevector-n # …)

  ERROR: In procedure get-bytevector-n:
  Throw to key `guile-ssh-error' with args `("read_from_channel_port" "Error reading from the channel" # #f)'.
  guix build: error: build of `/gnu/store/yfns7ga468vmv9jn72snk79b16p8mhfa-linux-libre-5.2.2.drv' failed

  real    637m24.906s
  user    0m6.661s
  sys     0m0.897s

Unfortunately I failed to record which machine was used and don't know a
way to find out after the fact.
bug#36754: New linux-libre failed to build on armhf on Berlin
Hi Ricardo,

Interesting.  I distinctly remember that there was no log file when I
looked last time.  Hmm.

Anyway, it seems that now, all of the failed builds have either build
logs available or else information about which dependency failed.  I
don't remember seeing any of this last time, but I'm glad to see it now.

A pattern has now emerged, but I don't know what it means.  All of the
armhf kernel builds failed except for linux-libre-arm-veyron-5.2.2,
which succeeded:

  https://ci.guix.gnu.org/build/1488502/details  (arm-veyron-5.2.2)

Apart from this anomalous success, all of the armhf 5.2.2 and 4.19.60
have a truncated log file:

  https://ci.guix.gnu.org/build/1488517/details  (5.2.2)
  https://ci.guix.gnu.org/build/1488503/details  (4.19.60)
  https://ci.guix.gnu.org/build/1488513/details  (arm-generic-5.2.2)
  https://ci.guix.gnu.org/build/1488519/details  (arm-generic-4.19.60)
  https://ci.guix.gnu.org/build/1488504/details  (arm-omap2plus-5.2.2)
  https://ci.guix.gnu.org/build/1488501/details  (arm-omap2plus-4.19.60)

This pattern seems too regular to be a coincidence.  Can we find out
which build machines were used for these builds?

All of the 4.14.134 builds failed in the deblobbing step, due to timeout
(1 hour of silence) while packing the linux-libre tarball:

  https://ci.guix.gnu.org/build/1488514/details  (4.14.134)
  https://ci.guix.gnu.org/build/1488515/details  (arm-generic-4.14.134)
  https://ci.guix.gnu.org/build/1488512/details  (arm-omap2plus-4.14.134)

I'm not sure how to deal with this.  This is a computed origin, not a
normal package, and so I don't see a way to configure a longer timeout.

Perhaps I should make the tarball packing and unpacking operations
verbose, to work around the issue.  Of course that's our usual practice,
but I find it suboptimal because any warnings will be buried in a
mountain of uninteresting output.

Thoughts?

Anyway, thanks for looking into it.

      Mark
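One way to picture the trade-off described above (a silent packing step
trips the silence timeout, while verbose output buries warnings) is a
wrapper that leaves the command's own output alone and only prints an
occasional progress line.  What follows is a minimal sketch in Python,
not the Guile code used by the computed origin.  It assumes that any
output at all resets the silence timer, and the function name
run_with_heartbeat, its parameters, and the message text are all
hypothetical.

```python
import subprocess
import sys
import threading
import time


def run_with_heartbeat(cmd, interval=300.0, out=sys.stderr):
    """Run cmd, writing one short progress line every `interval` seconds.

    Hypothetical helper for illustration only.
    Returns (exit_code, number_of_heartbeat_lines_written).
    """
    done = threading.Event()
    beats = 0

    def heartbeat():
        nonlocal beats
        start = time.monotonic()
        # Event.wait returns False on timeout and True once `done` is set.
        while not done.wait(interval):
            beats += 1
            elapsed = time.monotonic() - start
            print(f"still running: {cmd[0]} ({elapsed:.0f}s elapsed)",
                  file=out, flush=True)

    worker = threading.Thread(target=heartbeat, daemon=True)
    worker.start()
    try:
        # The command's stdout/stderr are inherited unchanged, so any
        # warnings it prints are not mixed with per-file verbose output.
        code = subprocess.call(cmd)
    finally:
        done.set()
        worker.join()
    return code, beats
```

A packing step could then be invoked as, for example,
run_with_heartbeat(["tar", "cJf", "out.tar.xz", "src"]), where the tar
arguments are placeholders.  Whether something equivalent is practical
inside a computed origin is an open question this sketch does not answer.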
bug#36754: New linux-libre failed to build on armhf on Berlin
Mark H Weaver writes:

> Unfortunately, I'm unable to get *any* information about what went wrong
> from Cuirass.  None of the failed builds have associated log files, and
> the build details page has no useful information either.  For example:
>
>   https://ci.guix.gnu.org/build/1488517/details

On that page I see a link to the build log, but it appears to be
truncated:

  https://ci.guix.gnu.org/log/33hv7mij9bqqgf5hqwrw14106z9zgav9-linux-libre-5.2.2

Maybe the build node died before the build could be completed?

-- 
Ricardo
bug#36754: New linux-libre failed to build on armhf on Berlin
In commit 1ad9c105c208caa9059924cbfbe4759c8101f6c9, I changed our
linux-libre packages to deblob the linux-libre source tarballs
ourselves, i.e. to run the deblobbing scripts provided by the
linux-libre project to produce linux-libre source tarballs from the
upstream linux tarballs:

  https://git.savannah.gnu.org/cgit/guix.git/commit/?id=1ad9c105c208caa9059924cbfbe4759c8101f6c9

The following queries show that the updated packages built successfully
on x86_64, i686, and aarch64, but they all failed on armhf:

  https://ci.guix.gnu.org/search?query=linux-libre-5.2.2
  https://ci.guix.gnu.org/search?query=linux-libre-4.19.60
  https://ci.guix.gnu.org/search?query=linux-libre-4.14.134
  https://ci.guix.gnu.org/search?query=linux-libre-4.9.186
  https://ci.guix.gnu.org/search?query=linux-libre-4.4.186
  https://ci.guix.gnu.org/search?query=linux-libre-arm-veyron-5.2.2
  https://ci.guix.gnu.org/search?query=linux-libre-arm-generic-5.2.2
  https://ci.guix.gnu.org/search?query=linux-libre-arm-generic-4.19.60
  https://ci.guix.gnu.org/search?query=linux-libre-arm-generic-4.14.134
  https://ci.guix.gnu.org/search?query=linux-libre-arm-omap2plus-5.2.2
  https://ci.guix.gnu.org/search?query=linux-libre-arm-omap2plus-4.19.60
  https://ci.guix.gnu.org/search?query=linux-libre-arm-omap2plus-4.14.134

Unfortunately, I'm unable to get *any* information about what went wrong
from Cuirass.  None of the failed builds have associated log files, and
the build details page has no useful information either.  For example:

  https://ci.guix.gnu.org/build/1488517/details

My first guess was that something went wrong in the 'computed' origin
that runs the deblobbing script.  However, that's apparently not the
case, because all of the updated 'linux-libre-headers' packages built
successfully on armhf, and those use the same source tarballs as the
main 'linux-libre' packages.

  https://ci.guix.gnu.org/search?query=linux-libre-headers-5.2.2
  https://ci.guix.gnu.org/search?query=linux-libre-headers-4.19.60
  https://ci.guix.gnu.org/search?query=linux-libre-headers-4.14.134

Can someone help me find out what's going on here?  Until then, I'm
sorry to say that armhf-linux users will be unable to update their
systems.

      Mark
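For anyone wanting to re-check the same set of packages later, the
search links in this report follow one pattern (package name, a hyphen,
then the version), so they can be regenerated rather than copied by
hand.  The Python sketch below does only that.  The package names,
versions, and base URL are taken from this message, while the names
VERSIONS and search_urls are made up for illustration.

```python
from urllib.parse import urlencode

BASE = "https://ci.guix.gnu.org/search"

# Package variants and versions exactly as listed in this report.
VERSIONS = {
    "linux-libre": ["5.2.2", "4.19.60", "4.14.134", "4.9.186", "4.4.186"],
    "linux-libre-arm-veyron": ["5.2.2"],
    "linux-libre-arm-generic": ["5.2.2", "4.19.60", "4.14.134"],
    "linux-libre-arm-omap2plus": ["5.2.2", "4.19.60", "4.14.134"],
    "linux-libre-headers": ["5.2.2", "4.19.60", "4.14.134"],
}


def search_urls(versions=VERSIONS, base=BASE):
    """Return one search URL per (package, version) pair."""
    return [
        f"{base}?{urlencode({'query': f'{name}-{version}'})}"
        for name, version_list in versions.items()
        for version in version_list
    ]
```

Calling search_urls() returns the 15 URLs listed in this message, in
the same order.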