Hi Christian,

thanks a lot for your extensive analysis of the stack problem.  I
admit I have no idea why such a large stack is needed on those
architectures with the stable kernel.  I also have no idea why
everything went fine with treescape version 1.10.17.  Since I
personally feel totally clueless, I'm forwarding this upstream and also
CCing Dirk Eddelbuettel, who is known for his insight and good contacts
in the R community.  Maybe somebody has a better idea than drastically
increasing the stack size on the failing architectures.

Thanks again

     Andreas.

On Wed, Dec 14, 2016 at 03:37:27PM +0100, Christian Seiler wrote:
> Hi again,
> 
> On 12/14/2016 03:00 PM, Christian Seiler wrote:
> > If I had to guess what was going on in the backtrace, I'd suspect
> > an infinite recursion in R code, which translates to infinite
> > recursion of the underlying C code. But I'm really not sure here.
> 
> Interestingly enough, my initial guess was wrong.
> 
> It's not an infinite recursion, it's just a very, very deep
> recursion, using a LOT of stack. If I increase the stack size
> limit to 200 MB, then the package builds successfully; I
> tried that in a loop 25 times.
> 
> However, with an earlier attempt at a 160 MB stack size limit,
> it worked most of the time, but not always: I did get the
> same error once. So the amount of stack space required does
> not appear to be the same on every run of the program.
> (With 160 MB I tried around 15 times, and once the
> 160 MB limit was insufficient.)
> 
> It might even be in rare cases that the 200 MB limit is not
> enough and a build could fail spuriously even with that.
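
A minimal sketch of the kind of retry loop described above, not the exact commands that were run. The helper name count_failures is hypothetical; it runs a command a given number of times and prints how many of those runs failed.

```shell
#!/bin/sh
# count_failures N CMD [ARGS...]
# Run CMD N times, discard its output, and print the number of runs
# that exited with a non-zero status. (Hypothetical helper name.)
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Possible use with the stack limit set per run (not executed here):
#   count_failures 25 /bin/sh -c 'ulimit -S -s 200000; exec dh_auto_install'
```

A single failure in such a loop, as with the 160 MB attempt, is enough to show that the limit is too tight; zero failures only make a limit look sufficient, they do not prove it.
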
> 
> > Why that only appears to occur on 32bit LE architectures with
> > stable kernels (and works fine with unstable kernels on the same
> > architecture, and even with the stable kernel on 64bit both LE
> > and BE, as well as on 32bit BE) I also have no clue.
> 
> And this is still beyond me, because the default stack size
> limit of 8 MB is more than sufficient on e.g. amd64, where
> pointers are twice as large, so the number of stack frames
> that fit in that limit is actually smaller there.
> 
> So it appears you can work around this bug by manually
> setting an artificially high stack size limit during the
> build, but there is still an underlying problem there that
> causes the stack usage to be drastically higher on
> 32bit LE platforms with kernel 3.16, that doesn't appear
> on the same platforms with a newer kernel.
> 
> Anyway, to work around this for now, you can replace your
> dh_auto_install line (that is passed to the xvfb call)
> with the following command:
> 
>   /bin/sh -c "ulimit -S -s 200000 ; exec dh_auto_install"
> 
> Just tried it, sbuild built the package successfully on
> i386. I haven't tried armhf, but I suspect the result will
> be the same.
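
A small sketch, assuming a POSIX sh such as dash or bash, of why the limit is set inside `/bin/sh -c`: the changed soft limit applies only to that child shell and whatever it execs, while the calling shell keeps its own limit. The value 4096 is an arbitrary illustrative choice, kept low so that it fits under any usual hard limit. The workaround above uses 200000, which `ulimit -s` reads in units of 1024 bytes, so roughly 195 MiB.

```shell
#!/bin/sh
# Soft stack limit of the current shell, in KiB (or "unlimited").
before=$(ulimit -S -s)

# Change the soft limit inside a child shell only, then read it back there.
# 4096 is an illustrative value, not the one used for the real build.
inside=$(/bin/sh -c 'ulimit -S -s 4096; ulimit -S -s')

# The parent shell's limit is unaffected by what the child did.
after=$(ulimit -S -s)

echo "before=$before inside=$inside after=$after"
```

Using `exec` in the real command replaces the child shell with dh_auto_install, so the build inherits the raised limit without an extra process staying around.
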
> 
> But the underlying problem should also be fixed: a stack
> size that is 25 times higher than usual is worrisome,
> especially with the standard limit being plenty sufficient
> on platforms with larger pointer sizes. You might have to
> ask upstream and/or the R community for advice though. (Maybe
> see what R function specifically does this deep recursion,
> and fix that function to be a lot shallower. I don't know
> how to get that information from a gdb backtrace though, as
> I don't know the internals of R.)
> 
> Hope that helps.
> 
> Regards,
> Christian
> 

-- 
http://fam-tille.de
