Pjotr Prins <pjotr.publi...@thebird.nl> writes:

>> > If I compile for a target it
>> > makes a large difference.
>>
>> The FAQ document[1] says this:
>>
>> The environment variable which control the kernel selection is
>> OPENBLAS_CORETYPE (see driver/others/dynamic.c) e.g. export
>> OPENBLAS_CORETYPE=Haswell. And the function char*
>> openblas_get_corename() returns the used target.
>>
>> [1]: https://github.com/xianyi/OpenBLAS/wiki/Faq
>>
>> Have you tried this and compared the performance?
>
> About 10x difference on 24+ cores for matrix multiplication (my
> version vs what comes with Guix).
>
> I do think we need to default to a conservative openblas for general
> use. Question is how we make it fly on dedicated hardware.

Have you tried preloading the special library with LD_PRELOAD?

--
Ricardo

GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net
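
A rough sketch of what the preloading idea could look like, assuming a separately built, CPU-tuned libopenblas.so is available somewhere. The helper name, the library path, and the program name are all hypothetical. OPENBLAS_CORETYPE is the variable from the FAQ quoted above; it presumably only matters for builds that select kernels at run time, so treat it as optional.

```shell
#!/bin/sh
# with_tuned_blas LIB CORETYPE CMD [ARGS...]
# Hypothetical helper: run CMD with LIB preloaded ahead of any OpenBLAS
# the program would otherwise load, and with OPENBLAS_CORETYPE set.
# LIB may be empty to skip adding a library to LD_PRELOAD.
with_tuned_blas() {
    lib=$1
    coretype=$2
    shift 2
    # Prepend LIB to any existing LD_PRELOAD value instead of replacing it.
    env LD_PRELOAD="$lib${LD_PRELOAD:+:$LD_PRELOAD}" \
        OPENBLAS_CORETYPE="$coretype" \
        "$@"
}

# Example call (path and program are placeholders, not real files):
#   with_tuned_blas /path/to/tuned/libopenblas.so Haswell ./matmul-bench
```

Timing the same benchmark with and without the wrapper would show whether the tuned library is actually picked up on the dedicated hardware.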