bug#36754: New linux-libre failed to build on armhf on Berlin

2019-07-23 Thread Marius Bakke
Mark H Weaver writes:

> Hi Ricardo,
>
> Interesting.  I distinctly remember that there was no log file when I
> looked last time.  Hmm.
>
> Anyway, it seems that now, all of the failed builds have either build
> logs available or else information about which dependency failed.  I
> don't remember seeing any of this last time, but I'm glad to see it now.
>
> A pattern has now emerged, but I don't know what it means.  All of the
> armhf kernel builds failed except for linux-libre-arm-veyron-5.2.2,
> which succeeded:
>
>   https://ci.guix.gnu.org/build/1488502/details  (arm-veyron-5.2.2)
>
> Apart from this anomalous success, all of the armhf 5.2.2 and 4.19.60
> builds have truncated log files:
>
>   https://ci.guix.gnu.org/build/1488517/details  (5.2.2)
>   https://ci.guix.gnu.org/build/1488503/details  (4.19.60)
>   https://ci.guix.gnu.org/build/1488513/details  (arm-generic-5.2.2)
>   https://ci.guix.gnu.org/build/1488519/details  (arm-generic-4.19.60)
>   https://ci.guix.gnu.org/build/1488504/details  (arm-omap2plus-5.2.2)
>   https://ci.guix.gnu.org/build/1488501/details  (arm-omap2plus-4.19.60)
>
> This pattern seems too regular to be a coincidence.  Can we find out
> which build machines were used for these builds?

I tried building 5.2.2 'interactively' on Berlin, and got an SSH error:

  CC [M]  net/openvswitch/vport-geneve.o
  CC [M]  net/openvswitch/vport-gre.o
  LD [M]  net/openvswitch/openvswitch.o
;;; [2019/07/23 05:14:53.501502, 0] read_from_channel_port: [GSSH ERROR] Error reading from the channel: #
Backtrace:
          16 (apply-smob/1 #)
In ice-9/boot-9.scm:
    705:2 15 (call-with-prompt _ _ #)
In ice-9/eval.scm:
    619:8 14 (_ #(#(#)))
In guix/ui.scm:
  1747:12 13 (run-guix-command _ . _)
In guix/scripts/offload.scm:
   781:22 12 (guix-offload . _)
In ice-9/boot-9.scm:
    829:9 11 (catch _ _ # …)
    829:9 10 (catch _ _ # …)
In guix/scripts/offload.scm:
   580:19  9 (process-request _ _ _ _ #:print-build-trace? _ # _ # _)
    531:6  8 (call-with-timeout _ _ _)
    361:2  7 (transfer-and-offload # …)
In ice-9/boot-9.scm:
    829:9  6 (catch _ _ # …)
In guix/scripts/offload.scm:
    385:6  5 (_)
In guix/store.scm:
  1203:15  4 (_ # _ _)
   692:11  3 (process-stderr # _)
In guix/serialization.scm:
    87:11  2 (read-int _)
    73:12  1 (get-bytevector-n* # …)
In unknown file:
           0 (get-bytevector-n # …)

ERROR: In procedure get-bytevector-n:
Throw to key `guile-ssh-error' with args `("read_from_channel_port" "Error reading from the channel" # #f)'.
guix build: error: build of `/gnu/store/yfns7ga468vmv9jn72snk79b16p8mhfa-linux-libre-5.2.2.drv' failed

real    637m24.906s
user    0m6.661s
sys     0m0.897s

Unfortunately I failed to record which machine was used and don't know a
way to find out after the fact.





bug#36754: New linux-libre failed to build on armhf on Berlin

2019-07-22 Thread Mark H Weaver
Hi Ricardo,

Interesting.  I distinctly remember that there was no log file when I
looked last time.  Hmm.

Anyway, it seems that now, all of the failed builds have either build
logs available or else information about which dependency failed.  I
don't remember seeing any of this last time, but I'm glad to see it now.

A pattern has now emerged, but I don't know what it means.  All of the
armhf kernel builds failed except for linux-libre-arm-veyron-5.2.2,
which succeeded:

  https://ci.guix.gnu.org/build/1488502/details  (arm-veyron-5.2.2)

Apart from this anomalous success, all of the armhf 5.2.2 and 4.19.60
builds have truncated log files:

  https://ci.guix.gnu.org/build/1488517/details  (5.2.2)
  https://ci.guix.gnu.org/build/1488503/details  (4.19.60)
  https://ci.guix.gnu.org/build/1488513/details  (arm-generic-5.2.2)
  https://ci.guix.gnu.org/build/1488519/details  (arm-generic-4.19.60)
  https://ci.guix.gnu.org/build/1488504/details  (arm-omap2plus-5.2.2)
  https://ci.guix.gnu.org/build/1488501/details  (arm-omap2plus-4.19.60)

This pattern seems too regular to be a coincidence.  Can we find out
which build machines were used for these builds?

All of the 4.14.134 builds failed in the deblobbing step, due to timeout
(1 hour of silence) while packing the linux-libre tarball:

  https://ci.guix.gnu.org/build/1488514/details  (4.14.134)
  https://ci.guix.gnu.org/build/1488515/details  (arm-generic-4.14.134)
  https://ci.guix.gnu.org/build/1488512/details  (arm-omap2plus-4.14.134)

I'm not sure how to deal with this.  This is a computed origin, not a
normal package, and so I don't see a way to configure a longer timeout.

Perhaps I should make the tarball packing and unpacking operations
verbose, to work around the issue.  Of course that's our usual practice,
but I find it suboptimal because any warnings will be buried in a
mountain of uninteresting output.
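
To illustrate why verbose output would sidestep the silence timeout, here
is a minimal Python sketch of a silence watchdog.  It is a simplified
model, not the daemon's actual implementation: the function name, the
90-minute packing duration, and the one-line-per-minute output rate are
all assumptions made up for the example.  Only the 1-hour limit comes
from the failure described above.

```python
from typing import Iterable, Optional


def first_silence_timeout(output_times: Iterable[float],
                          end_time: float,
                          max_silent_time: float) -> Optional[float]:
    """Return the time (in seconds) at which a silence watchdog would
    kill a build, or None if the build finishes first.

    output_times    -- times at which the build printed something
    end_time        -- time at which the build would finish on its own
    max_silent_time -- longest gap without output that is tolerated
    """
    last = 0.0  # the build starts at t = 0; treat that as the last output
    for t in sorted(output_times):
        if t > end_time:
            break  # output after the build has ended is irrelevant
        if t - last > max_silent_time:
            return last + max_silent_time
        last = t
    # Check the final stretch between the last output and the end.
    if end_time - last > max_silent_time:
        return last + max_silent_time
    return None


ONE_HOUR = 3600

# Hypothetical quiet packing step: 90 minutes with no output at all.
quiet = first_silence_timeout([], end_time=5400, max_silent_time=ONE_HOUR)

# Hypothetical verbose packing step: one line of output every minute.
verbose = first_silence_timeout(range(60, 5400, 60), end_time=5400,
                                max_silent_time=ONE_HOUR)

print(quiet)    # killed after one hour of silence
print(verbose)  # None: the build survives
```

Under these assumptions the quiet run is killed at the one-hour mark even
though it is still making progress, while the verbose run completes.  The
cost is exactly the one noted above: the extra output exists only to keep
the watchdog quiet.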

Thoughts?  Anyway, thanks for looking into it.

   Mark





bug#36754: New linux-libre failed to build on armhf on Berlin

2019-07-22 Thread Ricardo Wurmus


Mark H Weaver writes:

> Unfortunately, I'm unable to get *any* information about what went wrong
> from Cuirass.  None of the failed builds have associated log files, and
> the build details page has no useful information either.  For example:
>
>   https://ci.guix.gnu.org/build/1488517/details

On that page I see a link to the build log, but it appears to be
truncated:

  https://ci.guix.gnu.org/log/33hv7mij9bqqgf5hqwrw14106z9zgav9-linux-libre-5.2.2

Maybe the build node died before the build could be completed?

-- 
Ricardo






bug#36754: New linux-libre failed to build on armhf on Berlin

2019-07-21 Thread Mark H Weaver
In commit 1ad9c105c208caa9059924cbfbe4759c8101f6c9, I changed our
linux-libre packages to deblob the linux-libre source tarballs
ourselves, i.e. to run the deblobbing scripts provided by the
linux-libre project to produce linux-libre source tarballs from the
upstream linux tarballs:

  https://git.savannah.gnu.org/cgit/guix.git/commit/?id=1ad9c105c208caa9059924cbfbe4759c8101f6c9

The following queries show that the updated packages built successfully
on x86_64, i686, and aarch64, but they all failed on armhf:

  https://ci.guix.gnu.org/search?query=linux-libre-5.2.2
  https://ci.guix.gnu.org/search?query=linux-libre-4.19.60
  https://ci.guix.gnu.org/search?query=linux-libre-4.14.134
  https://ci.guix.gnu.org/search?query=linux-libre-4.9.186
  https://ci.guix.gnu.org/search?query=linux-libre-4.4.186
  https://ci.guix.gnu.org/search?query=linux-libre-arm-veyron-5.2.2
  https://ci.guix.gnu.org/search?query=linux-libre-arm-generic-5.2.2
  https://ci.guix.gnu.org/search?query=linux-libre-arm-generic-4.19.60
  https://ci.guix.gnu.org/search?query=linux-libre-arm-generic-4.14.134
  https://ci.guix.gnu.org/search?query=linux-libre-arm-omap2plus-5.2.2
  https://ci.guix.gnu.org/search?query=linux-libre-arm-omap2plus-4.19.60
  https://ci.guix.gnu.org/search?query=linux-libre-arm-omap2plus-4.14.134

Unfortunately, I'm unable to get *any* information about what went wrong
from Cuirass.  None of the failed builds have associated log files, and
the build details page has no useful information either.  For example:

  https://ci.guix.gnu.org/build/1488517/details

My first guess was that something went wrong in the 'computed' origin
that runs the deblobbing script.  However, that's apparently not the
case, because all of the updated 'linux-libre-headers' packages built
successfully on armhf, and those use the same source tarballs as the
main 'linux-libre' packages.

  https://ci.guix.gnu.org/search?query=linux-libre-headers-5.2.2
  https://ci.guix.gnu.org/search?query=linux-libre-headers-4.19.60
  https://ci.guix.gnu.org/search?query=linux-libre-headers-4.14.134

Can someone help me find out what's going on here?  Until then, I'm
sorry to say that armhf-linux users will be unable to update their
systems.

   Mark