Bug#845753: Help: r-cran-treescape does not build on i386, armel and armhf any more

2016-12-14 Thread Christian Seiler
Hi Andreas,

On 12/14/2016 11:47 AM, Christian Seiler wrote:
> On 12/14/2016 08:50 AM, Christian Seiler wrote:
>> I'm going to try an i386 build in a VM running a stable kernel
>> and see if that does indeed change things and if I can reproduce
>> the problem. Should that not be the issue though then I really
>> can't reproduce the problem - and hence won't be able to debug
>> it... Let's see...
> 
> Indeed: in a VM with Jessie + sbuild from jessie-backports the
> build fails with a segfault:
> 
> ** preparing package for lazy loading
> Creating a generic function for 'toJSON' from package 'jsonlite' in package 'googleVis'
> Error: segfault from C stack overflow
> * removing '/«PKGBUILDDIR»/debian/r-cran-treescape/usr/lib/R/site-library/treescape'
> dh_auto_install: R CMD INSTALL -l /«PKGBUILDDIR»/debian/r-cran-treescape/usr/lib/R/site-library --clean . --built-timestamp='Wed, 14 Dec 2016 06:45:37 +0100' returned exit code 1
> 
> Now that I can reproduce this, I'll investigate more later.

Well, the stack overflow appears to be an endless loop.
I've attached a stack backtrace I obtained via gdb.

If I had to guess what was going on in the backtrace, I'd suspect
an infinite recursion in R code, which translates to infinite
recursion of the underlying C code. But I'm really not sure here.
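
As a rough aid for reading a backtrace like the attached one, the frames can be tallied by function name: a recursion in interpreted R code tends to show up as the same few evaluator functions (bcEval, Rf_eval, Rf_applyClosure) repeating over and over. The following is only a sketch. It assumes the backtrace was saved in gdb's usual "#N [address in] function (args) at file:line" layout, and the name tally_frames is made up for illustration.

```shell
#!/bin/sh
# tally_frames: hypothetical helper, not part of gdb or R.
# Reads a gdb backtrace on stdin and prints "count function",
# most frequent function first.
tally_frames() {
    # Frame lines start with "#N". The function name is either the
    # second field ("#0  bcEval (...") or the field after "in"
    # ("#1  0xf74353c6 in Rf_eval (..."). Wrapped continuation lines
    # do not start with "#" and are ignored.
    awk '/^#[0-9]+/ {
             name = ($3 == "in") ? $4 : $2
             count[name]++
         }
         END { for (n in count) print count[n], n }' |
        sort -k1,1nr -k2,2
}
```

It would be used as "tally_frames < backtrace.txt", where backtrace.txt is whatever file the gdb output was saved to. A very deep stack dominated by the same handful of functions is consistent with runaway recursion, but it does not by itself say which R function is recursing.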

Why that only appears to occur on 32bit LE architectures with
stable kernels (and works fine with unstable kernels on the same
architecture, and even with the stable kernel on 64bit both LE
and BE, as well as on 32bit BE) I also have no clue.

Fun fact: if you call R -d gdb, type in "run" at the gdb prompt and
then type in the following at the R prompt:

   install.packages(repos=NULL,
     lib=".../r-cran-treescape-1.10.18/debian/r-cran-treescape/usr/lib/R/site-library",
     clean=TRUE,
     pkgs=".",
     configure.args=("--built-timestamp='Wed, 14 Dec 2016 06:45:37 +0100'")
   )

instead of running the command directly as

   R CMD INSTALL \
     -l .../r-cran-treescape-1.10.18/debian/r-cran-treescape/usr/lib/R/site-library \
     --clean \
     . \
     "--built-timestamp='Wed, 14 Dec 2016 06:45:37 +0100'"

this will cause the build to go through successfully. However, running
the R CMD INSTALL directly (in a fresh source package directory)
will still trigger the error - and you can attach with gdb from
another console.

Also, if the source directory is not completely clean, then
sometimes stuff is left lying around in there, after which all
calls to the R CMD INSTALL will succeed.
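
One way to avoid being fooled by such leftovers is to run every attempt in a throwaway copy of a pristine source tree. The following is a minimal sketch of that idea only; the name run_in_fresh_copy is made up, and it assumes the command does not need absolute paths pointing into the original tree.

```shell
#!/bin/sh
# run_in_fresh_copy: hypothetical helper.
# Usage: run_in_fresh_copy SRCDIR COMMAND [ARGS...]
# Copies SRCDIR into a new temporary directory, runs COMMAND there,
# removes the copy again and returns COMMAND's exit status.
run_in_fresh_copy() {
    src=$1
    shift
    work=$(mktemp -d) || return 1
    # "SRCDIR/." copies the directory contents, including dotfiles.
    cp -a "$src/." "$work/" || { rm -rf "$work"; return 1; }
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$work" && "$@" ) && status=0 || status=$?
    rm -rf "$work"
    return "$status"
}
```

Because the copy is deleted afterwards, anything the command leaves behind in the tree cannot influence the next run.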

Unfortunately I know next to nothing about R's internals so I have
no idea what to do with it. If anyone has a pointer on how to read
the backtrace or someone with more R experience can tell me what to
look out for and how to extract useful information from that, I'd be
willing to revisit this, but otherwise I'm forced to let this go,
sorry.

Regards,
Christian
#0  bcEval (body=body@entry=0xf88c4674, rho=rho@entry=0xfe83f6d4, useCache=useCache@entry=TRUE) at eval.c:5172
#1  0xf74353c6 in Rf_eval (e=0xf88c4674, rho=0xfe83f6d4) at eval.c:616
#2  0xf7435c6f in forcePromise (e=e@entry=0xfe83f6f0) at eval.c:515
#3  0xf7436177 in FORCE_PROMISE (keepmiss=FALSE, rho=0xfe83f7ec, symbol=0xf8822b38, value=0xfe83f6f0) at eval.c:4258
#4  getvar (symbol=0xf8822b38, rho=rho@entry=0xfe83f7ec, dd=dd@entry=FALSE, keepmiss=FALSE, vcache=0xf4def33c, sidx=2) at eval.c:4300
#5  0xf742e377 in bcEval (body=body@entry=0xf88b81ec, rho=rho@entry=0xfe83f7ec, useCache=useCache@entry=TRUE) at eval.c:5425
#6  0xf74353c6 in Rf_eval (e=0xf88b81ec, rho=0xfe83f7ec) at eval.c:616
#7  0xf7437201 in Rf_applyClosure (call=<optimized out>, op=<optimized out>, arglist=<optimized out>, rho=<optimized out>, suppliedvars=<optimized out>) at eval.c:1135
#8  0xf742fdfc in bcEval (body=body@entry=0xf88c30d0, rho=rho@entry=0xfe83f6d4, useCache=useCache@entry=TRUE) at eval.c:5630
#9  0xf74353c6 in Rf_eval (e=0xf88c30d0, rho=0xfe83f6d4) at eval.c:616
#10 0xf7437201 in Rf_applyClosure (call=<optimized out>, op=<optimized out>, arglist=<optimized out>, rho=<optimized out>, suppliedvars=<optimized out>) at eval.c:1135
#11 0xf742fdfc in bcEval (body=body@entry=0xfe8802a8, rho=rho@entry=0xfe83f5f4, useCache=useCache@entry=TRUE) at eval.c:5630
#12 0xf74353c6 in Rf_eval (e=0xfe8802a8, rho=0xfe83f5f4) at eval.c:616
#13 0xf7437201 in Rf_applyClosure (call=<optimized out>, op=<optimized out>, arglist=<optimized out>, rho=<optimized out>, suppliedvars=<optimized out>) at eval.c:1135
#14 0xf743989e in R_forceAndCall (e=<optimized out>, n=1, rho=<optimized out>) at eval.c:1302
#15 0xf73a1ebc in do_lapply (call=0xf889559c, op=0xf87e4500, args=0xf8895580, rho=0xfe8405cc) at apply.c:70
#16 0xf746883a in do_internal (call=<optimized out>, op=<optimized out>, args=0xf8895580, env=<optimized out>) at names.c:1353
#17 0xf7429a69 in bcEval (body=body@entry=0xf88931d4, rho=rho@entry=0xfe8405cc, useCache=useCache@entry=TRUE) at eval.c:5678
#18 0xf74353c6 in Rf_eval (e=0xf88931d4, rho=0xfe8405cc) at eval.c:616
#19 0xf7437201 in Rf_applyClosure (call=<optimized out>, op=<optimized out>, arglist=<optimized out>, rho=<optimized out>, suppliedvars=<optimized out>) at eval.c:1135
#20 0xf742fdfc in bcEval (body=body@entry=0xfe881280, rho=rho@entry=0xfe8402bc, useCache=useCache@entry=TRUE) at eval.c:5630
#21 0xf74353c6 in Rf_eval (e=0xfe881280, rho=0xfe8402bc) at eval.c:616


2016-12-14 Thread Christian Seiler
Hi Andreas,

On 12/14/2016 08:50 AM, Christian Seiler wrote:
> I'm going to try an i386 build in a VM running a stable kernel
> and see if that does indeed change things and if I can reproduce
> the problem. Should that not be the issue though then I really
> can't reproduce the problem - and hence won't be able to debug
> it... Let's see...

Indeed: in a VM with Jessie + sbuild from jessie-backports the
build fails with a segfault:

** preparing package for lazy loading
Creating a generic function for 'toJSON' from package 'jsonlite' in package 'googleVis'
Error: segfault from C stack overflow
* removing '/«PKGBUILDDIR»/debian/r-cran-treescape/usr/lib/R/site-library/treescape'
dh_auto_install: R CMD INSTALL -l /«PKGBUILDDIR»/debian/r-cran-treescape/usr/lib/R/site-library --clean . --built-timestamp='Wed, 14 Dec 2016 06:45:37 +0100' returned exit code 1

Now that I can reproduce this, I'll investigate more later.

Regards,
Christian




2016-12-13 Thread Christian Seiler
Hi Andreas,

On 12/14/2016 08:10 AM, Andreas Tille wrote:
> On Wed, Dec 14, 2016 at 12:32:24AM +0100, Christian Seiler wrote:
>> On 11/02/2016 05:20 PM, Andreas Tille wrote:
>>
>> Hmm, was going to take a shot at debugging your segfault, but I
>> simply can't reproduce this:
>> ...
>> architectures.
> 
> Unfortunately autobuilders keep on reproducing it. :-(

:-(

> I have uploaded a package in which I fixed the xvfb issue, and did a
> source-only upload to make sure amd64 is autobuilt as well.  While amd64
> is fine (also regarding the xserver issue - thanks to Gregor for the
> hints), the i386 build log[1] again shows
> 
> ** inst
> ** preparing package for lazy loading
> Creating a generic function for 'toJSON' from package 'jsonlite' in package 'googleVis'
> Error: segfault from C stack overflow
> * removing '/«PKGBUILDDIR»/debian/r-cran-treescape/usr/lib/R/site-library/treescape'
> 
> even though the log also has gcc-6-base i386 6.2.1-6 and binutils
> i386 2.27.51.20161212-1 - so the toolchain on the autobuilder is the
> same as the one that worked for you.

Yeah. Hmmm. :(

> There might be a difference between a qemu emulation
> and real hardware, though.

But emulation is only used for armhf; i386 runs natively on my machine
(amd64 can run i386 directly, and the autobuilders are also amd64
machines running i386 chroots, so my setup should be identical).

Funnily enough mipsel now also failed at the same point, which it
previously didn't.

The only other key difference I can see is that the failed builds
all run a stable kernel - and the working builds (also the build
previously working on powerpc) run a backports kernel (and I'm
running testing here). OTOH, the amd64 and arm64 builds are also
running on the stable kernel - but those are 64bit platforms.
Then OTOH in the ports section of the buildd logs you have 32bit
powerpc - and that is also on stable, but powerpc is big endian,
in contrast to i386, armhf and mipsel.

I'm really not sure what's going on there, but maybe there's a
failure case for 32bit little endian architectures when running
a 3.16 kernel? But that may be a complete red herring and
coincidence...
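
To compare build environments along the axes mentioned above (kernel version, 32 vs. 64 bit, byte order), something like the following could be run inside each chroot. This is only a sketch: the name report_env is made up, and printing the stack size limit is an assumption that it might be relevant to a "C stack overflow", not something established in this thread.

```shell
#!/bin/sh
# report_env: hypothetical helper for comparing build hosts/chroots.
report_env() {
    echo "kernel:     $(uname -r)"
    echo "machine:    $(uname -m)"
    # Userland word size (32 in an i386 chroot, even on an amd64 kernel).
    echo "long bits:  $(getconf LONG_BIT)"
    # Read the bytes 01 00 00 00 as one 4-byte unsigned integer:
    # a little-endian host sees 1, a big-endian host sees 16777216.
    if [ "$(printf '\001\000\000\000' | od -An -tu4 | tr -d ' ')" = "1" ]; then
        echo "byte order: little"
    else
        echo "byte order: big"
    fi
    # Assumption: the soft stack limit may matter for the overflow.
    echo "stack (kB): $(ulimit -s)"
}
```

Calling report_env in a failing and in a working environment and diffing the two outputs would at least show whether the kernel is the only thing that differs.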

I'm going to try an i386 build in a VM running a stable kernel
and see if that does indeed change things and if I can reproduce
the problem. Should that not be the issue though then I really
can't reproduce the problem - and hence won't be able to debug
it... Let's see...

Regards,
Christian




2016-12-13 Thread Andreas Tille
Hi Christian,

On Wed, Dec 14, 2016 at 12:32:24AM +0100, Christian Seiler wrote:
> On 11/02/2016 05:20 PM, Andreas Tille wrote:
> 
> Hmm, was going to take a shot at debugging your segfault, but I
> simply can't reproduce this:
> ...
> architectures.

Unfortunately autobuilders keep on reproducing it. :-(

> I can provide full build logs if you need them.
> 
> Maybe ask for a give-back at debian-wb-t...@lists.debian.org to
> have the i386 and armhf buildds try the build again? As far as
> I can tell the build should succeed...
> 
> Notable differences between buildd chroot and my freshly created
> one (in the i386 case):
> 
>  buildd:    gcc 6.2.1-5, binutils 2.27.51.20161201-1
>  my system: gcc 6.2.1-6, binutils 2.27.51.20161212-1
> 
> Maybe this was a toolchain bug that was fixed recently? If so,
> maybe wait a couple of days (buildd chroots are updated twice
> a week IIRC) and then ask for a give-back.

I have uploaded a package in which I fixed the xvfb issue, and did a
source-only upload to make sure amd64 is autobuilt as well.  While amd64
is fine (also regarding the xserver issue - thanks to Gregor for the
hints), the i386 build log[1] again shows

** inst
** preparing package for lazy loading
Creating a generic function for 'toJSON' from package 'jsonlite' in package 'googleVis'
Error: segfault from C stack overflow
* removing '/«PKGBUILDDIR»/debian/r-cran-treescape/usr/lib/R/site-library/treescape'

even though the log also has gcc-6-base i386 6.2.1-6 and binutils
i386 2.27.51.20161212-1 - so the toolchain on the autobuilder is the
same as the one that worked for you.  There might be a difference
between a qemu emulation and real hardware, though.

Kind regards and thanks for checking anyway

 Andreas.

[1] https://buildd.debian.org/status/fetch.php?pkg=r-cran-treescape&arch=i386&ver=1.10.18-2&stamp=1481697760
 

-- 
http://fam-tille.de




2016-12-13 Thread gregor herrmann
On Tue, 13 Dec 2016 13:23:50 +0100, Andreas Tille wrote:

> > I guess pabs meant that you're not actually using it in debian/rules.
> Urgs, thanks for opening my eyes. ;-)

:)
 
> Index: rules
> ===
> --- rules   (Revision 23284)
> +++ rules   (Arbeitskopie)
> @@ -2,3 +2,7 @@
>  
>  %:
>     dh $@ --buildsystem R
> +
> +override_dh_auto_install:
> +   xvfb-run --auto-servernum --server-num=20 -s "-screen 0 1024x768x24 -ac +extension GLX +render -noreset" \
> +   dh_auto_install
> 
> 
> which results in the following diff in the build log:
> 
> 
>  ** preparing package for lazy loading
>  Creating a generic function for 'toJSON' from package 'jsonlite' in package 'googleVis'
>  Warning in rgl.init(initValue, onlyNULL) :
> -  RGL: unable to open X11 display
> +  RGL: GLX extension missing on server
>  Warning: 'rgl_init' failed, running with rgl.useNULL = TRUE
>  ** help
>  *** installing help indices

Adding libgl1-mesa-dri as a build dependency gets rid of the warning
for me (amd64 cowbuilder sid chroot).

** preparing package for lazy loading
Creating a generic function for 'toJSON' from package 'jsonlite' in package 'googleVis'
** help
*** installing help indices
 

Cheers,
gregor

-- 
 .''`.  https://info.comodo.priv.at/ - Debian Developer https://www.debian.org
 : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D  85FA BB3A 6801 8649 AA06
 `. `'  Member of VIBE!AT & SPI, fellow of the Free Software Foundation Europe
   `-   NP: David Bowie: China Girl (Single Version)





2016-12-13 Thread Andreas Tille
On Tue, Dec 13, 2016 at 11:47:43AM +0100, gregor herrmann wrote:
> [..]
> > So what exactly do you mean by "didn't add it yet" ?
> 
> I guess pabs meant that you're not actually using it in debian/rules.

Urgs, thanks for opening my eyes. ;-)

So I tried:

Index: rules
===
--- rules   (Revision 23284)
+++ rules   (Arbeitskopie)
@@ -2,3 +2,7 @@
 
 %:
    dh $@ --buildsystem R
+
+override_dh_auto_install:
+   xvfb-run --auto-servernum --server-num=20 -s "-screen 0 1024x768x24 -ac +extension GLX +render -noreset" \
+   dh_auto_install


which results in the following diff in the build log:


 ** preparing package for lazy loading
 Creating a generic function for 'toJSON' from package 'jsonlite' in package 'googleVis'
 Warning in rgl.init(initValue, onlyNULL) :
-  RGL: unable to open X11 display
+  RGL: GLX extension missing on server
 Warning: 'rgl_init' failed, running with rgl.useNULL = TRUE
 ** help
 *** installing help indices


It seems the "+extension GLX" is not sufficient to get it fully working.
Do you have a better idea?  Since I do not have one at the moment, I'd
like to fix this first and see what happens on the failing
architectures (although I agree with Paul that this is most probably not
the cause of the build failure, since it only results in a warning).

Kind regards

 Andreas.

-- 
http://fam-tille.de




2016-12-13 Thread gregor herrmann
On Tue, 13 Dec 2016 11:30:41 +0100, Andreas Tille wrote:

> > > Well, adding xvfb was the usual trick to cope with "unable to open X11
> > > display" messages and thus I added it ...
> > To me it looks like you didn't add it yet, at least not to the version
> > in Debian.
> Hmmm, 
> $ apt-get source r-cran-treescape
> $ grep xvfb r-cran-treescape-1.10.18/debian/control
>    xvfb
[..]
> So what exactly do you mean by "didn't add it yet" ?

I guess pabs meant that you're not actually using it in debian/rules.
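
The distinction between declaring a build dependency and actually invoking it can be checked mechanically. A minimal sketch, assuming an unpacked source tree with the usual debian/control and debian/rules files; the name check_tool_used is made up for illustration, and a plain word match is of course only a heuristic.

```shell
#!/bin/sh
# check_tool_used: hypothetical helper.
# Usage: check_tool_used SRCDIR TOOL
# Reports whether TOOL is mentioned in debian/control (declared)
# and in debian/rules (actually invoked during the build).
check_tool_used() {
    srcdir=$1
    tool=$2
    # -w matches whole words, so "xvfb" also matches "xvfb-run".
    if grep -qw -- "$tool" "$srcdir/debian/control"; then
        echo "$tool: declared in debian/control"
    else
        echo "$tool: not declared in debian/control"
    fi
    if grep -qw -- "$tool" "$srcdir/debian/rules"; then
        echo "$tool: used in debian/rules"
    else
        echo "$tool: NOT used in debian/rules"
    fi
}
```

For example, "check_tool_used r-cran-treescape-1.10.18 xvfb" run next to the unpacked source would show whether xvfb is only listed in Build-Depends or is also called from the rules file.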


Cheers,
gregor

-- 
 .''`.  https://info.comodo.priv.at/ - Debian Developer https://www.debian.org
 : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D  85FA BB3A 6801 8649 AA06
 `. `'  Member of VIBE!AT & SPI, fellow of the Free Software Foundation Europe
   `-   BOFH excuse #9:  doppler effect 




2016-12-13 Thread Andreas Tille
On Tue, Dec 13, 2016 at 03:50:32PM +0800, Paul Wise wrote:
> On Tue, Dec 13, 2016 at 3:47 PM, Andreas Tille wrote:
> 
> > Well, adding xvfb was the usual trick to cope with "unable to open X11
> > display" messages and thus I added it ...
> 
> To me it looks like you didn't add it yet, at least not to the version
> in Debian.

Hmmm, 


$ apt-get source r-cran-treescape
$ grep xvfb r-cran-treescape-1.10.18/debian/control
   xvfb

Maybe I should do a source-only upload, but at least the arm64 log[1] contains

  Get:298 http://ftp.us.debian.org/debian unstable/main arm64 xvfb arm64 2:1.19.0-2 [2656 kB]

So what exactly do you mean by "didn't add it yet" ?

Kind regards

Andreas.

[1] https://buildd.debian.org/status/fetch.php?pkg=r-cran-treescape&arch=arm64&ver=1.10.18-1&stamp=1481535210

-- 
http://fam-tille.de




2016-12-12 Thread Andreas Tille
Hi Paul,

On Tue, Dec 13, 2016 at 09:07:11AM +0800, Paul Wise wrote:
> > Any help would be really appreciated.
> 
> Looking at the build logs, even the architectures that succeeded are
> getting the warning about X11. So that is completely unrelated and the
> issue is the segfault, not the xvfb stuff. Someone needs to run the
> crashing program under gdb and debug the crash.
> 
> https://buildd.debian.org/status/fetch.php?pkg=r-cran-treescape&arch=arm64&ver=1.10.18-1&stamp=1481535210

Ahhh, you are right.
 
> If you can't find someone to help out, the only option to solve the RC
> bug is removal from those architectures:
> 
> https://wiki.debian.org/ftpmaster_Removals

While amd64 is most probably the main target architecture, it would be a
shame to lose i386, and there might be some hidden problem which in the
end also affects amd64.

So I have not yet considered this last resort and hope for some help from
people who are more confident with gdb debugging than I am. (I've
frequently read phrases here like "10 minutes of gdb debugging", but I'm
afraid it would take me 10 hours. I'd happily join a "gdb for Debian
maintainers" workshop at DebConf, though.)

> Also, you build-depend on xvfb but I can't see it being used at all in
> the build log nor the packaging:
> 
> https://codesearch.debian.net/search?q=package%3Ar-cran-treescape+xvfb

Well, adding xvfb was the usual trick to cope with "unable to open X11
display" messages and thus I added it ...

Kind regards

  Andreas.

-- 
http://fam-tille.de




2016-12-12 Thread Paul Wise
On Tue, Dec 13, 2016 at 3:47 PM, Andreas Tille wrote:

> Well, adding xvfb was the usual trick to cope with "unable to open X11
> display" messages and thus I added it ...

To me it looks like you didn't add it yet, at least not to the version
in Debian.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise




2016-12-12 Thread Andreas Tille
Hi Paul,

On Thu, Nov 03, 2016 at 11:40:04AM +0800, Paul Wise wrote:
> On Thu, Nov 3, 2016 at 12:20 AM, Andreas Tille wrote:
> 
> > I used xauth and xvfb as Build-Depends successfully which works on most
> > architectures - but failed on these ones.  Any hint how to solve this?
> 
> If you don't have hardware for these arches, login to one of the
> porterboxen and install the build-deps in a chroot and then run the
> relevant commands under a debugger like gdb.
> 
> https://dsa.debian.org/doc/schroot/
> https://db.debian.org/machines.cgi

I admit I lack not only the hardware but also the experience to
track down this kind of problem.  I discussed the issue with upstream
and they do not have any clue either.

Any help would be really appreciated.

Kind regards

  Andreas.

-- 
http://fam-tille.de