Bug#845753: Help: r-cran-treescape does not build on i386, armel and armhf any more
Hi Andreas,

On 12/14/2016 11:47 AM, Christian Seiler wrote:
> On 12/14/2016 08:50 AM, Christian Seiler wrote:
>> I'm going to try an i386 build in a VM running a stable kernel
>> and see if that does indeed change things and if I can reproduce
>> the problem. Should that not be the issue though then I really
>> can't reproduce the problem - and hence won't be able to debug
>> it... Let's see...
>
> Indeed: in a VM with Jessie + sbuild from jessie-backports the
> build fails with a segfault:
>
> ** preparing package for lazy loading
> Creating a generic function for 'toJSON' from package 'jsonlite' in package
> 'googleVis'
> Error: segfault from C stack overflow
> * removing
> '/<>/debian/r-cran-treescape/usr/lib/R/site-library/treescape'
> dh_auto_install: R CMD INSTALL -l
> /<>/debian/r-cran-treescape/usr/lib/R/site-library --clean .
> --built-timestamp='Wed, 14 Dec 2016 06:45:37 +0100' returned exit code 1
>
> Now that I can reporduce this, I'll investigate more later.

Well, the stack overflow appears to be an endless loop. I've attached
a stack backtrace I obtained via gdb.

If I had to guess what was going on in the backtrace, I'd suspect an
infinite recursion in R code, which translates to infinite recursion
of the underlying C code. But I'm really not sure here.

Why that only appears to occur on 32bit LE architectures with stable
kernels (and works fine with unstable kernels on the same
architecture, and even with the stable kernel on 64bit both LE and
BE, as well as on 32bit BE) I also have no clue.
Fun fact: if you call R -d gdb, type in "run" at the gdb prompt and
then type in the following at the R prompt:

install.packages(repos=NULL,
  lib=".../r-cran-treescape-1.10.18/debian/r-cran-treescape/usr/lib/R/site-library",
  clean=TRUE,
  pkgs=".",
  configure.args=("--built-timestamp='Wed, 14 Dec 2016 06:45:37 +0100'")
)

instead of running the command directly as

R CMD INSTALL \
  -l .../r-cran-treescape-1.10.18/debian/r-cran-treescape/usr/lib/R/site-library \
  --clean \
  . \
  "--built-timestamp='Wed, 14 Dec 2016 06:45:37 +0100'"

this will cause the build to go through successfully. However, running
the R CMD INSTALL directly (in a fresh source package directory) will
still trigger the error - and you can attach with gdb from another
console.

Also, if the source directory is not completely clean, then sometimes
stuff is left lying around in there, after which all calls to
R CMD INSTALL will succeed.

Unfortunately I know next to nothing about R's internals, so I have no
idea what to do with it. If anyone has a pointer on how to read the
backtrace, or someone with more R experience can tell me what to look
out for and how to extract useful information from that, I'd be
willing to revisit this, but otherwise I'm forced to let this go,
sorry.
Regards,
Christian

#0  bcEval (body=body@entry=0xf88c4674, rho=rho@entry=0xfe83f6d4, useCache=useCache@entry=TRUE) at eval.c:5172
#1  0xf74353c6 in Rf_eval (e=0xf88c4674, rho=0xfe83f6d4) at eval.c:616
#2  0xf7435c6f in forcePromise (e=e@entry=0xfe83f6f0) at eval.c:515
#3  0xf7436177 in FORCE_PROMISE (keepmiss=FALSE, rho=0xfe83f7ec, symbol=0xf8822b38, value=0xfe83f6f0) at eval.c:4258
#4  getvar (symbol=0xf8822b38, rho=rho@entry=0xfe83f7ec, dd=dd@entry=FALSE, keepmiss=FALSE, vcache=0xf4def33c, sidx=2) at eval.c:4300
#5  0xf742e377 in bcEval (body=body@entry=0xf88b81ec, rho=rho@entry=0xfe83f7ec, useCache=useCache@entry=TRUE) at eval.c:5425
#6  0xf74353c6 in Rf_eval (e=0xf88b81ec, rho=0xfe83f7ec) at eval.c:616
#7  0xf7437201 in Rf_applyClosure (call=, op=, arglist=, rho=, suppliedvars=) at eval.c:1135
#8  0xf742fdfc in bcEval (body=body@entry=0xf88c30d0, rho=rho@entry=0xfe83f6d4, useCache=useCache@entry=TRUE) at eval.c:5630
#9  0xf74353c6 in Rf_eval (e=0xf88c30d0, rho=0xfe83f6d4) at eval.c:616
#10 0xf7437201 in Rf_applyClosure (call=, op=, arglist=, rho=, suppliedvars=) at eval.c:1135
#11 0xf742fdfc in bcEval (body=body@entry=0xfe8802a8, rho=rho@entry=0xfe83f5f4, useCache=useCache@entry=TRUE) at eval.c:5630
#12 0xf74353c6 in Rf_eval (e=0xfe8802a8, rho=0xfe83f5f4) at eval.c:616
#13 0xf7437201 in Rf_applyClosure (call=, op=, arglist=, rho=, suppliedvars=) at eval.c:1135
#14 0xf743989e in R_forceAndCall (e=, n=1, rho=) at eval.c:1302
#15 0xf73a1ebc in do_lapply (call=0xf889559c, op=0xf87e4500, args=0xf8895580, rho=0xfe8405cc) at apply.c:70
#16 0xf746883a in do_internal (call=, op=, args=0xf8895580, env=) at names.c:1353
#17 0xf7429a69 in bcEval (body=body@entry=0xf88931d4, rho=rho@entry=0xfe8405cc, useCache=useCache@entry=TRUE) at eval.c:5678
#18 0xf74353c6 in Rf_eval (e=0xf88931d4, rho=0xfe8405cc) at eval.c:616
#19 0xf7437201 in Rf_applyClosure (call=, op=, arglist=, rho=, suppliedvars=) at eval.c:1135
#20 0xf742fdfc in bcEval (body=body@entry=0xfe881280, rho=rho@entry=0xfe8402bc, useCache=useCache@entry=TRUE) at eval.c:5630
#21 0xf74353c6 in Rf_eval (e=0xfe881280, rho=0xfe8402bc) at eval.c:616
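One rough way to look at the suspected recursion in a backtrace like this is to count how often each function appears in it. The following is only a minimal shell sketch and not something from the thread: the helper name frame_histogram is made up for illustration, it assumes gdb's usual "#N [address in] function (...)" frame layout, and it relies on grep -o, which GNU and BSD grep provide.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): count how often each function
# name occurs in a gdb backtrace read from stdin.
frame_histogram() {
    # Extract "#N [0xADDR in] function"; this works whether the frames are
    # on separate lines or run together on a single line.
    grep -oE '#[0-9]+ +(0x[0-9a-f]+ in )?[A-Za-z_][A-Za-z0-9_]*' |
        awk '{ count[$NF]++ } END { for (f in count) print count[f], f }' |
        LC_ALL=C sort -k1,1nr -k2,2
}

# Example with three frames copied from the backtrace above.
frame_histogram <<'EOF'
#1 0xf74353c6 in Rf_eval (e=0xf88c4674, rho=0xfe83f6d4) at eval.c:616
#2 0xf7435c6f in forcePromise (e=e@entry=0xfe83f6f0) at eval.c:515
#6 0xf74353c6 in Rf_eval (e=0xf88b81ec, rho=0xfe83f7ec) at eval.c:616
EOF
```

The counts are a hint at best. R's evaluator is itself recursive, so bcEval, Rf_eval and Rf_applyClosure repeat in any deep R call chain, and a full backtrace would be needed to tell ordinary nesting from a runaway loop.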
Bug#845753: Help: r-cran-treescape does not build on i386, armel and armhf any more
Hi Andreas,

On 12/14/2016 08:50 AM, Christian Seiler wrote:
> I'm going to try an i386 build in a VM running a stable kernel
> and see if that does indeed change things and if I can reproduce
> the problem. Should that not be the issue though then I really
> can't reproduce the problem - and hence won't be able to debug
> it... Let's see...

Indeed: in a VM with Jessie + sbuild from jessie-backports the
build fails with a segfault:

** preparing package for lazy loading
Creating a generic function for 'toJSON' from package 'jsonlite' in package 'googleVis'
Error: segfault from C stack overflow
* removing '/<>/debian/r-cran-treescape/usr/lib/R/site-library/treescape'
dh_auto_install: R CMD INSTALL -l /<>/debian/r-cran-treescape/usr/lib/R/site-library --clean . --built-timestamp='Wed, 14 Dec 2016 06:45:37 +0100' returned exit code 1

Now that I can reproduce this, I'll investigate more later.

Regards,
Christian
Bug#845753: Help: r-cran-treescape does not build on i386, armel and armhf any more
Hi Andreas,

On 12/14/2016 08:10 AM, Andreas Tille wrote:
> On Wed, Dec 14, 2016 at 12:32:24AM +0100, Christian Seiler wrote:
>> On 11/02/2016 05:20 PM, Andreas Tille wrote:
>>
>> Hmm, was going to take a shot at debugging your segfault, but I
>> simply can't reproduce this:
>> ...
>> architectures.
>
> Unfortunately autobuilders keep on reproducing it. :-(

:-(

> I have uploaded a package where I fixed the xvfb issue and did a source
> only upload to make sure also amd64 will be autobuilt. While amd64 is
> fine (also regarding the xserver issue - thanks to Gregor for the hints)
> the i386 build log[1] shows the
>
> ** inst
> ** preparing package for lazy loading
> Creating a generic function for 'toJSON' from package 'jsonlite' in package
> 'googleVis'
> Error: segfault from C stack overflow
> * removing
> '/«PKGBUILDDIR»/debian/r-cran-treescape/usr/lib/R/site-library/treescape'
>
> again even if the log also has gcc-6-base i386 6.2.1-6 and binutils
> i386 2.27.51.20161212-1 - so the toolchain on autobuilder is the same as
> it worked for you.

Yeah. Hmmm. :(

> There might be a difference between a qemu emulation
> and real hardware, thought.

But emulation is only for armhf, i386 is native on my architecture
(amd64 can run i386 directly, and the autobuilders are also amd64
machines running i386 chroots, my setup should be identical).

Funnily enough mipsel now also failed at the same point, which it
previously didn't.

The only other key difference I can see is that the failed builds
all run a stable kernel - and the working builds (also the build
previously working on powerpc) run a backports kernel (and I'm
running testing here). OTOH, the amd64 and arm64 builds are also
running on the stable kernel - but those are 64bit platforms.
Then OTOH in the ports section of the buildd logs you have 32bit
powerpc - and that is also on stable, but powerpc is big endian,
in contrast to i386, armhf and mipsel.

I'm really not sure what's going on there, but maybe there's a
failure case for 32bit little endian architectures when running
a 3.16 kernel? But that may be a complete red herring and
coincidence...

I'm going to try an i386 build in a VM running a stable kernel
and see if that does indeed change things and if I can reproduce
the problem. Should that not be the issue though then I really
can't reproduce the problem - and hence won't be able to debug
it... Let's see...

Regards,
Christian
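The comparison above rests on a handful of facts about each build environment: kernel version, word size and byte order. As a minimal sketch (not something from the thread), the following shell function prints them so that the output from a failing and a working chroot can be compared side by side. The name build_env_summary is made up for illustration, and including the stack size limit is an assumption on my part that it could matter for a "C stack overflow"; nothing in the thread confirms that.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the environment facts
# the comparison is based on, one per line.
build_env_summary() {
    printf 'kernel:      %s\n' "$(uname -r)"
    printf 'machine:     %s\n' "$(uname -m)"
    # Word size of the userland; an i386 chroot on an amd64 kernel should
    # report 32 here even if "machine" says x86_64.
    printf 'word size:   %s\n' "$(getconf LONG_BIT)"
    # Two bytes 01 00 read as one 16-bit word: 0001 on little endian,
    # 0100 on big endian.
    case "$(printf '\001\000' | od -An -tx2 | tr -d ' \n')" in
        0001) order=little ;;
        0100) order=big ;;
        *)    order=unknown ;;
    esac
    printf 'byte order:  %s\n' "$order"
    # Soft limit for the C stack (KiB, or "unlimited").
    printf 'stack limit: %s\n' "$(ulimit -s)"
}

build_env_summary
```

Running it inside the build chroot rather than on the host matters, since the kernel is shared but the userland word size is not.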
Bug#845753: Help: r-cran-treescape does not build on i386, armel and armhf any more
Hi Christian,

On Wed, Dec 14, 2016 at 12:32:24AM +0100, Christian Seiler wrote:
> On 11/02/2016 05:20 PM, Andreas Tille wrote:
>
> Hmm, was going to take a shot at debugging your segfault, but I
> simply can't reproduce this:
> ...
> architectures.

Unfortunately autobuilders keep on reproducing it. :-(

> I can provide full build logs if you need them.
>
> Maybe ask for a give-back at debian-wb-t...@lists.debian.org to
> have the i386 and armhf buildds try the build again? As far as
> I can tell the build should succeed...
>
> Notable differences between buildd chroot and my freshly created
> one (in the i386 case):
>
> buildd:gcc 6.2.1-5, binutils 2.27.51.20161201-1
> my system: gcc 6.2.1-6, binutils 2.27.51.20161212-1
>
> Maybe this was a toolchain bug that was fixed recently? If so,
> maybe wait a couple of days (buildd chroots are updated twice
> a week IIRC) and then ask for a give-back.

I have uploaded a package where I fixed the xvfb issue and did a
source-only upload to make sure amd64 will also be autobuilt. While
amd64 is fine (also regarding the xserver issue - thanks to Gregor
for the hints) the i386 build log[1] shows the

   ** inst
   ** preparing package for lazy loading
   Creating a generic function for 'toJSON' from package 'jsonlite' in package 'googleVis'
   Error: segfault from C stack overflow
   * removing '/«PKGBUILDDIR»/debian/r-cran-treescape/usr/lib/R/site-library/treescape'

again, even though the log also has gcc-6-base i386 6.2.1-6 and
binutils i386 2.27.51.20161212-1 - so the toolchain on the autobuilder
is the same as the one that worked for you. There might be a
difference between a qemu emulation and real hardware, though.

Kind regards and thanks for checking anyway

      Andreas.

[1] https://buildd.debian.org/status/fetch.php?pkg=r-cran-treescape&arch=i386&ver=1.10.18-2&stamp=1481697760

-- 
http://fam-tille.de
Bug#845753: Help: r-cran-treescape does not build on i386, armel and armhf any more
On Tue, 13 Dec 2016 13:23:50 +0100, Andreas Tille wrote:

> > I guess pabs meant that you're not actually using it in debian/rules.
> Urgs, thanks for opening my eyes. ;-)

:)

> Index: rules
> ===
> --- rules (Revision 23284)
> +++ rules (Arbeitskopie)
> @@ -2,3 +2,7 @@
>
> %:
> dh $@ --buildsystem R
> +
> +override_dh_auto_install:
> + xvfb-run --auto-servernum --server-num=20 -s "-screen 0 1024x768x24
> -ac +extension GLX +render -noreset" \
> + dh_auto_install
>
>
> which results in the following diff in the build log:
>
>
> ** preparing package for lazy loading
> Creating a generic function for 'toJSON' from package 'jsonlite' in package
> 'googleVis'
> Warning in rgl.init(initValue, onlyNULL) :
> - RGL: unable to open X11 display
> + RGL: GLX extension missing on server
> Warning: 'rgl_init' failed, running with rgl.useNULL = TRUE
> ** help
> *** installing help indices

Adding libgl1-mesa-dri as a build dependency gets rid of the warning
for me (amd64 cowbuilder sid chroot).

** preparing package for lazy loading
Creating a generic function for 'toJSON' from package 'jsonlite' in package 'googleVis'
** help
*** installing help indices

Cheers,
gregor

-- 
 .''`.  https://info.comodo.priv.at/ - Debian Developer https://www.debian.org
 : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D 85FA BB3A 6801 8649 AA06
 `. `'  Member of VIBE!AT & SPI, fellow of the Free Software Foundation Europe
   `-   NP: David Bowie: China Girl (Single Version)
Bug#845753: Help: r-cran-treescape does not build on i386, armel and armhf any more
On Tue, Dec 13, 2016 at 11:47:43AM +0100, gregor herrmann wrote:
> [..]
> > So what exactly do you mean by "didn't add it yet" ?
>
> I guess pabs meant that you're not actually using it in debian/rules.

Urgs, thanks for opening my eyes. ;-)

So I tried:

Index: rules
===
--- rules	(Revision 23284)
+++ rules	(Arbeitskopie)
@@ -2,3 +2,7 @@
 
 %:
 	dh $@ --buildsystem R
+
+override_dh_auto_install:
+	xvfb-run --auto-servernum --server-num=20 -s "-screen 0 1024x768x24 -ac +extension GLX +render -noreset" \
+		dh_auto_install

which results in the following diff in the build log:

 ** preparing package for lazy loading
 Creating a generic function for 'toJSON' from package 'jsonlite' in package 'googleVis'
 Warning in rgl.init(initValue, onlyNULL) :
- RGL: unable to open X11 display
+ RGL: GLX extension missing on server
 Warning: 'rgl_init' failed, running with rgl.useNULL = TRUE
 ** help
 *** installing help indices

It seems that "+extension GLX" is not sufficient to get it fully
working. Do you have any better idea? Since I do not have one at the
moment, I'd like to fix this first and see what happens on the failing
architectures (although I agree with Paul that this is most probably
not the cause of the build failure, since it only results in a
warning).

Kind regards

      Andreas.

-- 
http://fam-tille.de
Bug#845753: Help: r-cran-treescape does not build on i386, armel and armhf any more
On Tue, 13 Dec 2016 11:30:41 +0100, Andreas Tille wrote:

> > > Well, adding xvfb was the usual trick to cope with "unable to open X11
> > > display" messages and thus I added it ...
> > To me it looks like you didn't add it yet, at least not to the version
> > in Debian.
> Hmmm,
> $ apt-get source r-cran-treescape
> $ grep xvfb r-cran-treescape-1.10.18/debian/control
>    xvfb
[..]
> So what exactly do you mean by "didn't add it yet" ?

I guess pabs meant that you're not actually using it in debian/rules.

Cheers,
gregor

-- 
 .''`.  https://info.comodo.priv.at/ - Debian Developer https://www.debian.org
 : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D 85FA BB3A 6801 8649 AA06
 `. `'  Member of VIBE!AT & SPI, fellow of the Free Software Foundation Europe
   `-   BOFH excuse #9: doppler effect
Bug#845753: Help: r-cran-treescape does not build on i386, armel and armhf any more
On Tue, Dec 13, 2016 at 03:50:32PM +0800, Paul Wise wrote:
> On Tue, Dec 13, 2016 at 3:47 PM, Andreas Tille wrote:
>
> > Well, adding xvfb was the usual trick to cope with "unable to open X11
> > display" messages and thus I added it ...
>
> To me it looks like you didn't add it yet, at least not to the version
> in Debian.

Hmmm,

   $ apt-get source r-cran-treescape
   $ grep xvfb r-cran-treescape-1.10.18/debian/control
      xvfb

Maybe I should do a source-only upload, but at least the arm64 log[1]
contains

   Get:298 http://ftp.us.debian.org/debian unstable/main arm64 xvfb arm64 2:1.19.0-2 [2656 kB]

So what exactly do you mean by "didn't add it yet"?

Kind regards

      Andreas.

[1] https://buildd.debian.org/status/fetch.php?pkg=r-cran-treescape&arch=arm64&ver=1.10.18-1&stamp=1481535210

-- 
http://fam-tille.de
Bug#845753: Help: r-cran-treescape does not build on i386, armel and armhf any more
Hi Paul,

On Tue, Dec 13, 2016 at 09:07:11AM +0800, Paul Wise wrote:
> > Any help would be really appreciated.
>
> Looking at the build logs, even the architectures that succeeded are
> getting the warning about X11. So that is completely unrelated and the
> issue is the segfault not the xvfb stuff. Someone need to run the
> crashing program under gdb and debug the crash.
>
> https://buildd.debian.org/status/fetch.php?pkg=r-cran-treescape&arch=arm64&ver=1.10.18-1&stamp=1481535210

Ahhh, you are right.

> If you can't find someone to help out, the only option to solve the RC
> bug is removal from those architectures:
>
> https://wiki.debian.org/ftpmaster_Removals

While amd64 is most probably the main target architecture, it would be
a shame to lose i386, and there might be some hidden problem which in
the end also affects amd64. So I have not yet considered this last
resort and hope for some help from people who are more confident with
gdb debugging than I am (I've frequently read phrases here like
"10 minutes of gdb debugging", but I'm afraid it would take me
10 hours - I'd happily join a "Gdb for Debian maintainers" workshop at
Debconf, though).

> Also, you build-depend on xvfb but I can't see it being used at all in
> the build log nor the packaging:
>
> https://codesearch.debian.net/search?q=package%3Ar-cran-treescape+xvfb

Well, adding xvfb was the usual trick to cope with "unable to open X11
display" messages and thus I added it ...

Kind regards

      Andreas.

-- 
http://fam-tille.de
Bug#845753: Help: r-cran-treescape does not build on i386, armel and armhf any more
On Tue, Dec 13, 2016 at 3:47 PM, Andreas Tille wrote:

> Well, adding xvfb was the usual trick to cope with "unable to open X11
> display" messages and thus I added it ...

To me it looks like you didn't add it yet, at least not to the version
in Debian.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise
Bug#845753: Help: r-cran-treescape does not build on i386, armel and armhf any more
Hi Paul,

On Thu, Nov 03, 2016 at 11:40:04AM +0800, Paul Wise wrote:
> On Thu, Nov 3, 2016 at 12:20 AM, Andreas Tille wrote:
>
> > I used xauth and xvfb as Build-Depends successfully which works on most
> > architectures - but failed on these ones. Any hint how to solve this?
>
> If you don't have hardware for these arches, login to one of the
> porterboxen and install the build-deps in a chroot and then run the
> relevant commands under a debugger like gdb.
>
> https://dsa.debian.org/doc/schroot/
> https://db.debian.org/machines.cgi

I admit that I not only lack the hardware, I also lack the experience
to track down this kind of problem. I discussed the issue with
upstream and they do not have any clue either. Any help would be
really appreciated.

Kind regards

      Andreas.

-- 
http://fam-tille.de