On Tue, 11 Jul 2017 13:24:02 +0200 Frédéric Bonnard <fre...@linux.vnet.ibm.com> wrote:
> Tags: patch
> User: debian-powe...@lists.debian.org
> Usertags: ppc64el
>
> --
>
> Hi,
> it just seems that there's too many space taken by different libraries
> in the static TLS space. I contacted some people from the toolchain,
> especially Alan Modra which seems to confirm that :
> "If sagemath is dlopen'ing libraries, one of which is libgomp or has a
> dependency on libgomp, and the sagemath executable itself does not load
> libgomp at startup, then that would explain the error you're seeing."
> Python binary has no direct dependency on libgomp :
> $ lddtree /usr/bin/python2.7
> python2.7 => /usr/bin/python2.7 (interpreter => /lib64/ld64.so.2)
>     libpthread.so.0 => /lib/powerpc64le-linux-gnu/libpthread.so.0
>     ld64.so.2 => /lib64/ld64.so.2
>     libdl.so.2 => /lib/powerpc64le-linux-gnu/libdl.so.2
>     libutil.so.1 => /lib/powerpc64le-linux-gnu/libutil.so.1
>     libz.so.1 => /lib/powerpc64le-linux-gnu/libz.so.1
>     libm.so.6 => /lib/powerpc64le-linux-gnu/libm.so.6
>     libc.so.6 => /lib/powerpc64le-linux-gnu/libc.so.6
>
> And also :
> sagemath-7.6/sage# LD_DEBUG=files
> PYTHONPATH=/build/sagemath-wDWVd1/sagemath-7.6/debian/build/usr/lib/python2.7/dist-packages
> /build/sagemath-wDWVd1/sagemath-7.6/sage/src/bin/sage --docbuild
> --no-pdf-links all html
> ...
> [..]
> <error is just below>
> ...
> So the failure occurs while importing the python module
> matrix_modn_dense_float.so.
> So I propose to preload libgomp which looks good to Alan.
>
> As Ximin explained, this workaround should not be applied on
> documentation build only, as the import should trigger the error on the
> CLI as well, thus I inserted LD_PRELOAD export in sage-env, for ppc64el
> only. So here is a debdiff for you to review.
> I hope that will help,
>
> [..]

Hi, thanks very much for the investigation and explanation!

I am not sure this patch is the best approach however. Also I don't yet
completely understand what is wrong, I'm still guessing some things
based on your explanation:

Firstly your patch is for ppc64el but the same error occurs also on
arm64 and possibly other platforms - we'll only know for sure, after we
get the right Build-Dependencies into Debian on those other platforms.
I don't think it's a long-term sustainable approach to hardcode
architecture-specific exceptions. What *aspect* of ppc64el requires
this patch?

Am I understanding correctly that dlopen(), for some reason, loads
stuff into thread-local-storage (TLS) instead of a shared area between
all threads? And that this space is running out on ppc64el (and arm64)?
Why doesn't it happen on amd64 / x86_64? This sounds like a bug in
dlopen() or the threading library, or something else? Even if not,
shouldn't it be possible to predict that the space will run out on any
platform in a generic way, in order to raise the limit or to do this
LD_PRELOAD workaround, in a cross-platform way?

(For example, on rustc recently we had a nasty issue on ppc64el but the
underlying reason was due to interaction between PAGESIZE and newer
Linux kernel stack behaviour, and the workaround I wrote was conditioned
on PAGESIZE rather than ppc64el specifically.)

Finally, ideally we would push the patch upstream, though testing it
out in Debian first would be good - I think we have easier access to
some platforms than upstream does. However I'd expect that the chances
of Sage accepting a ppc64el-specific patch are very slim. And this one
has a DEB_* variable in.

X

--
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git
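
For illustration only, here is a minimal sketch of how the two ideas in
this thread could be expressed without naming an architecture: first
check the condition Alan describes (libgomp not loaded at startup), then
build an environment that preloads it. It assumes Linux's
/proc/self/maps and assumes the GCC OpenMP runtime is installed under
the soname libgomp.so.1. The helper names is_mapped and with_preload are
hypothetical, and this is not the debdiff under review.

```python
import os
import re

# Assumption: soname of GCC's OpenMP runtime on the target system.
GOMP_SONAME = "libgomp.so.1"


def is_mapped(maps_text, name):
    """Return True if a file whose basename starts with `name` appears
    in text formatted like /proc/<pid>/maps."""
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6 and os.path.basename(fields[5]).startswith(name):
            return True
    return False


def with_preload(env, library):
    """Return a copy of `env` with `library` placed first in LD_PRELOAD.

    ld.so accepts spaces or colons as separators; existing entries are
    kept and the library is not added twice."""
    new_env = dict(env)
    entries = [e for e in re.split(r"[ :]+", new_env.get("LD_PRELOAD", "")) if e]
    if library not in entries:
        entries.insert(0, library)
    new_env["LD_PRELOAD"] = ":".join(entries)
    return new_env
```

A launcher could read /proc/self/maps, pass its contents to is_mapped,
and re-execute itself with with_preload(os.environ, GOMP_SONAME) only
when the library is missing. Whether that is preferable to an
unconditional export in sage-env is a separate question; the sketch only
shows that the test does not need to mention ppc64el.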