On Fri, Mar 13 2020, Paul Irofti <p...@irofti.net> wrote:
> On Wed, Mar 11, 2020 at 08:58:27PM +0100, Martin Reindl wrote:
>> On 11.03.20 at 18:53, Theo Buehler wrote:
>> > On Wed, Mar 11, 2020 at 04:12:56AM +0100, Jeremie Courreges-Anglas wrote:
>> >> On Tue, Mar 10 2020, Theo Buehler <t...@theobuehler.org> wrote:
>> >>> On Tue, Mar 10, 2020 at 06:35:04PM +0100, Jeremie Courreges-Anglas wrote:
>> >>>> On Mon, Mar 09 2020, Stuart Henderson <s...@spacehopper.org> wrote:
>> >>>>> On 2020/03/09 10:42, Theo Buehler wrote:
>> >>>>>> On Mon, Jan 13, 2020 at 12:50:32PM +0000, Stuart Henderson wrote:
>> >>>>>>> 2/3 through a bulk build and I see that this breaks scipy (missing symbols,
>> >>>>>>> blas/cblas-related) so needs a bit more work, but I think it's generally
>> >>>>>>> along the right lines.
>> >>>>>>
>> >>>>>> Not sure if this provides any useful clue, but py-numpy doesn't build at
>> >>>>>> all on sparc64 with this diff, also due to missing blas/cblas symbols:
>> >>>>>
>> >>>>> You'll probably see the same on amd64 with USE_LLD=no.
>> >>>>
>> >>>> I managed to build scipy with no changes on amd64, so I'm not sure what
>> >>>> the problem is on this arch (did not try with USE_LLD=No).
>> >>>>
>> >>>> However I took a look at the issue reported by tb on sparc64.
>> >>>>
>> >>>> --8<--
>> >>>> creating /tmp/tmpKcZ0cd/tmp
>> >>>> creating /tmp/tmpKcZ0cd/tmp/tmpKcZ0cd
>> >>>> compile options: '-I/usr/local/include -I/usr/include -c'
>> >>>> cc: /tmp/tmpKcZ0cd/source.c
>> >>>> cc /tmp/tmpKcZ0cd/tmp/tmpKcZ0cd/source.o -L/usr/local/lib -lcblas -o /tmp/tmpKcZ0cd/a.out
>> >>>> /usr/local/lib/libcblas.so.1.0: undefined reference to `ztbsv_'
>> >>>> /usr/local/lib/libcblas.so.1.0: undefined reference to `dasum_'
>> >>>>
>> >>>> [...]
>> >>>>
>> >>>> /usr/local/lib/libcblas.so.1.0: undefined reference to `zsymm_'
>> >>>> /usr/local/lib/libcblas.so.1.0: undefined reference to `ztrsm_'
>> >>>> /usr/local/lib/libcblas.so.1.0: undefined reference to `sswap_'
>> >>>> collect2: error: ld returned 1 exit status
>> >>>> cc /tmp/tmpKcZ0cd/tmp/tmpKcZ0cd/source.o -L/usr/local/lib -lblas -o /tmp/tmpKcZ0cd/a.out
>> >>>> /tmp/tmpKcZ0cd/tmp/tmpKcZ0cd/source.o: In function `main':
>> >>>> source.c:(.text.startup+0xdc): undefined reference to `cblas_ddot'
>> >>>> collect2: error: ld returned 1 exit status
>> >>>> -->8--
>> >>>>
>> >>>> libcblas.so doesn't depend on libblas.so so missing symbols are to be
>> >>>> expected if one links with -lcblas instead of -lcblas -lblas.  The
>> >>>> second linking test fails because libblas.so doesn't provide cblas
>> >>>> symbols.
>> >>>
>> >>> Thanks, this makes sense. But why does this work with ld.lld?
>> >>
>> >> ld.lld doesn't bother checking that all symbols in libcblas.so can be
>> >> resolved, ld.bfd does.  This means that if you link against a library
>> >> that references a bogus symbol or lacks some library interdependency
>> >> (DT_NEEDED) you only get a crash at run time.
>> >>
>> >> On amd64, using the testcase from numpy:
>> >>
>> >> --8<--
>> >> russell /tmp$ cat r.c
>> >> #include <cblas.h>
>> >> int main(int argc, const char *argv[])
>> >> {
>> >>     double a[4] = {1,2,3,4};
>> >>     double b[4] = {5,6,7,8};
>> >>     return cblas_ddot(4, a, 1, b, 1) > 10;
>> >> }
>> >> russell /tmp$ cc -I/usr/local/include r.c -L/usr/local/lib -lcblas
>> >> russell /tmp$ ./a.out
>> >> a.out:/usr/local/lib/libcblas.so.1.0: undefined symbol 'ddot_'
>> >> ld.so: a.out: lazy binding failed!
>> >> Killed
>> >> -->8--
>> >>
>> >> I suspect Stuart hit a similar problem with this numpy update and scipy.
>> >>
>> >> Using -fuse-ld=bfd in the testcase above would result in the same errors
>> >> as in your log.
>> > 
>> > I see. Thank you very much for your explanations.
>> > 
>> > FWIW your cblas diff is ok tb (also tested on macppc).
>> > 
>> 
>> Also tested on arm64, with numpy-1.16.6:
>> 42 failed, 7268 passed, 93 skipped, 168 deselected, 12 xfailed, 1
>> xpassed, 1 warnings
>> 
>> I think this is ready to go into the tree with the cblas diff on top?
>> The update to 1.16.6 is straightforward if you want to stick to 1.16.5 now.
>
> Can you please commit this? Thank you!

By popular demand I have committed the cblas diff.  I haven't looked
closely at the numpy update; you folks probably know better.

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
