On Mon, Feb 8, 2016 at 6:04 PM, Matthew Brett <matthew.br...@gmail.com> wrote:
> On Mon, Feb 8, 2016 at 5:26 PM, Nathaniel Smith <n...@pobox.com> wrote:
>> On Mon, Feb 8, 2016 at 4:37 PM, Matthew Brett <matthew.br...@gmail.com> wrote:
>> [...]
>>> I can't replicate the segfault with manylinux wheels and scipy. On
>>> the other hand, I get a new test error for numpy from manylinux, scipy
>>> from manylinux, like this:
>>>
>>> $ python -c 'import scipy.linalg; scipy.linalg.test()'
>>>
>>> ======================================================================
>>> FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4))
>>> ----------------------------------------------------------------------
>>> Traceback (most recent call last):
>>>   File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
>>>     self.test(*self.arg)
>>>   File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/tests/test_decomp.py", line 658, in eigenhproblem_general
>>>     assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype])
>>>   File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 892, in assert_array_almost_equal
>>>     precision=decimal)
>>>   File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 713, in assert_array_compare
>>>     raise AssertionError(msg)
>>> AssertionError:
>>> Arrays are not almost equal to 4 decimals
>>>
>>> (mismatch 100.0%)
>>>  x: array([ 0., 0., 0.], dtype=float32)
>>>  y: array([ 1., 1., 1.])
>>>
>>> ----------------------------------------------------------------------
>>> Ran 1507 tests in 14.928s
>>>
>>> FAILED (KNOWNFAIL=4, SKIP=1, failures=1)
>>>
>>> This is a very odd error, which we don't get when running over a numpy
>>> installed from source, linked to ATLAS, and doesn't happen when
>>> running the tests via:
>>>
>>> nosetests /usr/local/lib/python2.7/dist-packages/scipy/linalg
>>>
>>> So, something about the copy of numpy (linked to openblas) is
>>> affecting the results of scipy (also linked to openblas), and only
>>> with a particular environment / test order.
>>>
>>> If you'd like to try and see whether y'all can do a better job of
>>> debugging than me:
>>>
>>> # Run this script inside a docker container started with this incantation:
>>> # docker run -ti --rm ubuntu:12.04 /bin/bash
>>> apt-get update
>>> apt-get install -y python curl
>>> apt-get install libpython2.7  # this won't be necessary with next iteration of manylinux wheel builds
>>> curl -LO https://bootstrap.pypa.io/get-pip.py
>>> python get-pip.py
>>> pip install -f https://nipy.bic.berkeley.edu/manylinux numpy scipy nose
>>> python -c 'import scipy.linalg; scipy.linalg.test()'
>>
>> I just tried this and on my laptop it completed without error.
>>
>> Best guess is that we're dealing with some memory corruption bug
>> inside openblas, so it's getting perturbed by things like exactly what
>> other calls to openblas have happened (which is different depending on
>> whether numpy is linked to openblas), and which core type openblas has
>> detected.
>>
>> On my laptop, which *doesn't* show the problem, running with
>> OPENBLAS_VERBOSE=2 says "Core: Haswell".
>>
>> Guess the next step is checking what core type the failing machines
>> use, and running valgrind... anyone have a good valgrind suppressions
>> file?
>
> My machine (which does give the failure) gives
>
> Core: Core2
>
> with OPENBLAS_VERBOSE=2

Yep, that allows me to reproduce it:

root@f7153f0cc841:/# OPENBLAS_VERBOSE=2 OPENBLAS_CORETYPE=Core2 python -c 'import scipy.linalg; scipy.linalg.test()'
Core: Core2
[...]
======================================================================
FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4))
----------------------------------------------------------------------
[...]

So this is indeed sounding like an OpenBLAS issue... next stop valgrind, I guess :-/

--
Nathaniel J. Smith -- https://vorpus.org
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
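
For anyone who wants to compare core types side by side, the reproduction in this thread can be scripted. This is only a rough sketch, not something that was run as part of the thread. The helper names (openblas_env, run_linalg_tests, has_failures, compare_coretypes) are made up for illustration. It assumes numpy, scipy and nose are installed as in the docker recipe, that the OpenBLAS build honors OPENBLAS_CORETYPE (as the wheels here evidently do), and that a failing run prints a nose summary line starting with "FAILED (" like the one quoted in the thread.

```python
import os
import subprocess
import sys

# Test command taken from the thread; requires numpy, scipy and nose.
TEST_COMMAND = [sys.executable, "-c", "import scipy.linalg; scipy.linalg.test()"]


def openblas_env(coretype, verbose=2, base_env=None):
    """Return a copy of the environment that forces an OpenBLAS core type.

    Hypothetical helper: it only sets the two variables used in the thread.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OPENBLAS_CORETYPE"] = coretype
    env["OPENBLAS_VERBOSE"] = str(verbose)
    return env


def run_linalg_tests(coretype):
    """Run the scipy.linalg tests in a subprocess and return all its output."""
    proc = subprocess.Popen(
        TEST_COMMAND,
        env=openblas_env(coretype),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # nose writes its report to stderr
    )
    return proc.communicate()[0].decode("utf-8", "replace")


def has_failures(output):
    """Assumption: a failing run contains a nose summary like 'FAILED (...)'."""
    return "FAILED (" in output


def compare_coretypes(coretypes=("Haswell", "Core2")):
    """Report, per core type, whether the test run printed a failure summary."""
    # Only the two core types mentioned in the thread are listed by default.
    results = {}
    for coretype in coretypes:
        results[coretype] = has_failures(run_linalg_tests(coretype))
    return results
```

Calling compare_coretypes() inside the container returns a dict mapping each core type to True if that run reported failures. The exit status of the subprocess is deliberately ignored, since scipy.linalg.test() called via python -c does not turn test failures into a non-zero exit code.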