Re: GMP 6.2.1 Aborting when running tuneup program in one.cold()
Simon Sobisch writes:

  On 20.10.2021 at 16:19, Torbjörn Granlund wrote:
  > When tuneup cannot measure things accurately, it bails out.

  That's interesting. Is there anything I can do to help tuneup measure
  things accurately?

Make sure your system is idle except for the tuneup process.

  Can you please add that important information (abort of the program is
  no bug, just use the non-optimized version) to the documentation
  https://gmplib.org/manual/Performance-optimization
  ideally together with the answers to the related questions "What should
  -f NNN relate to?" and "Should I manually build the average" (if this
  isn't an effect of "not accurately measured")?

If we find time, perhaps. Running the tuneup program and making use of its
results is mainly intended for GMP devs.

--
Torbjörn
Please encrypt, key id 0xC8601622
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs
macOS version detection broken in configure
The configure script mistakenly recognises macOS 11 (Big Sur) as an old
version of macOS, thereby choosing the linker flags `-flat_namespace` and
`-undefined suppress`. We want to avoid `-flat_namespace` as this can cause
name collisions for users of the library. [1]

This can be fixed by patching libtool.m4 [2] and regenerating configure.

The problem lies in this snippet from `configure`:

    case $host_os in
    rhapsody* | darwin1.[012])
      _lt_dar_allow_undefined='$wl-undefined ${wl}suppress' ;;
    darwin1.*)
      _lt_dar_allow_undefined='$wl-flat_namespace $wl-undefined ${wl}suppress' ;;
    darwin*) # darwin 5.x on
      # if running on 10.5 or later, the deployment target defaults
      # to the OS version, if on x86, and 10.4, the deployment
      # target defaults to 10.4. Don't you love it?
      case ${MACOSX_DEPLOYMENT_TARGET-10.0},$host in
        10.0,*86*-darwin8*|10.0,*-darwin[91]*)
          _lt_dar_allow_undefined='$wl-undefined ${wl}dynamic_lookup' ;;
        10.[012][,.]*)
          _lt_dar_allow_undefined='$wl-flat_namespace $wl-undefined ${wl}suppress' ;;
        10.*)
          _lt_dar_allow_undefined='$wl-undefined ${wl}dynamic_lookup' ;;
      esac

To check, we can use `otool -hV` or a newish version of `file` on
`libgmp.10.dylib`:

    ❯ otool -hV .libs/libgmp.10.dylib
    .libs/libgmp.10.dylib:
    Mach header
          magic  cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
    MH_MAGIC_64   X86_64        ALL  0x00       DYLIB    14       1616   DYLDLINK NO_REEXPORTED_DYLIBS

    ❯ file .libs/libgmp.10.dylib
    .libs/libgmp.10.dylib: Mach-O 64-bit x86_64 dynamically linked shared library, flags:<|DYLDLINK|NO_REEXPORTED_DYLIBS>

A library built with the correct linker options (`-undefined dynamic_lookup`)
should produce this output (built on macOS 10.15):

    ❯ otool -hV lib/libgmp.10.dylib
    lib/libgmp.10.dylib:
    Mach header
          magic  cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
    MH_MAGIC_64   X86_64        ALL  0x00       DYLIB    14       1632   NOUNDEFS DYLDLINK TWOLEVEL NO_REEXPORTED_DYLIBS

    ❯ file lib/libgmp.10.dylib
    lib/libgmp.10.dylib: Mach-O 64-bit x86_64 dynamically linked shared library, flags:

In particular, `TWOLEVEL` appears in the flags for the library built on
Catalina but not on Big Sur. Checking the output of `make` also shows that
`libtool` invokes `clang` with the `-flat_namespace` flag.

Here is the information requested in your bug reporting manual:

    configure options: none
    configure output:  see attached `configure_output.txt`
    compiler:          Apple clang version 13.0.0 (clang-1300.0.29.3)
    uname -a:          Darwin hermes.lan 20.6.0 Darwin Kernel Version 20.6.0: Mon Aug 30 06:12:21 PDT 2021; root:xnu-7195.141.6~3/RELEASE_X86_64 x86_64 i386 MacBookAir9,1 Darwin
    config.guess:      nehalem-apple-darwin20.6.0
    configfsf.guess:   x86_64-apple-darwin20.6.0
    config.log:        attached

[1] http://mirror.informatimago.com/next/developer.apple.com/releasenotes/DeveloperTools/TwoLevelNamespaces.html#intro
[2] https://lists.gnu.org/archive/html/libtool-patches/2020-06/msg1.html

config.log.tar.gz
Description: GNU Zip compressed data

    checking build system type... nehalem-apple-darwin20.6.0
    checking host system type... nehalem-apple-darwin20.6.0
    checking for a BSD-compatible install... /usr/local/opt/coreutils/libexec/gnubin/install -c
    checking whether build environment is sane... yes
    checking for a thread-safe mkdir -p... /usr/local/opt/coreutils/libexec/gnubin/mkdir -p
    checking for gawk... gawk
    checking whether make sets $(MAKE)... yes
    checking whether make supports nested variables... yes
    checking whether to enable maintainer-specific portions of Makefiles... no
    checking ABI=64
    checking whether clang is gcc... yes
    checking compiler clang -O2 -pedantic -fomit-frame-pointer -m64 ... yes
    checking compiler clang -O2 -pedantic -fomit-frame-pointer -m64 -mtune=nehalem... yes
    checking compiler clang -O2 -pedantic -fomit-frame-pointer -m64 -mtune=nehalem -march=nehalem... yes
    checking for gcc... clang
    checking whether the C compiler works... yes
    checking for C compiler default output file name... a.out
    checking for suffix of executables...
    checking whether we are cross compiling... no
    checking for suffix of object files... o
    checking whether we are using the GNU C compiler... yes
    checking whether clang accepts -g... yes
    checking for clang option to accept ISO C89... none needed
    checking whether clang understands -c and -o together... yes
    checking for clang option to accept ISO C99... none needed
    checking how to run the C preprocessor... clang -E
    checking build system compiler clang... yes
    checking for build system preprocessor... clang -E
    checking for build system executable suffix...
    checking whether build system compiler is ANSI... yes
    checking for build system compiler math library... -lm
    checking for grep that handles long lines and -e... /usr/local/opt/grep/libexec/gnubin/grep
    checking for egrep... /usr/local
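
To make the misdetection in the macOS report above easier to follow, here is a minimal sketch that replays only the inner `case` from the quoted configure snippet. The function name `classify_namespace` and the Catalina host triple `x86_64-apple-darwin19.6.0` are assumptions made for illustration; they are not part of configure or of the original report. The sketch assumes `MACOSX_DEPLOYMENT_TARGET` is unset during the build, so the `10.0` fallback applies.

```shell
#!/bin/sh
# Hypothetical helper: mirrors the inner `case` of the quoted configure
# snippet and prints which kind of linker flags would be selected.
classify_namespace() {
  # An empty first argument stands in for an unset MACOSX_DEPLOYMENT_TARGET,
  # for which configure substitutes 10.0.
  target=${1:-10.0}
  host=$2
  case $target,$host in
    10.0,*86*-darwin8*|10.0,*-darwin[91]*)
      echo dynamic_lookup ;;
    10.[012][,.]*)
      echo flat_namespace ;;
    10.*)
      echo dynamic_lookup ;;
    *)
      # The original snippet has no default branch; nothing would be set.
      echo unmatched ;;
  esac
}

# Big Sur host from the report: darwin20 is not matched by darwin[91]*,
# so the string "10.0,..." falls through to the 10.[012] branch.
classify_namespace "" nehalem-apple-darwin20.6.0     # prints flat_namespace

# Assumed Catalina host triple: darwin19 is matched by darwin[91]*.
classify_namespace "" x86_64-apple-darwin19.6.0      # prints dynamic_lookup

# Same Big Sur host with an explicit deployment target of 10.15.
classify_namespace 10.15 nehalem-apple-darwin20.6.0  # prints dynamic_lookup
```

The pattern `darwin[91]*` only covers Darwin 9 and Darwin 10 through 19, which is why a Darwin 20 host with the default `10.0` target ends up in the branch meant for macOS 10.0 to 10.2.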
Re: GMP 6.2.1 Aborting when running tuneup program in one.cold()
Thanks for the prompt answer!

On 20.10.2021 at 16:19, Torbjörn Granlund wrote:
> When tuneup cannot measure things accurately, it bails out.

That's interesting. Is there anything I can do to help tuneup measure
things accurately?

Can you please add that important information (abort of the program is no
bug, just use the non-optimized version) to the documentation
https://gmplib.org/manual/Performance-optimization
ideally together with the answers to the related questions "What should
-f NNN relate to?" and "Should I manually build the average" (if this isn't
an effect of "not accurately measured")?

> No bug.

Maybe the tuneup program could also hint at this on startup (in addition to
the doc change)? Something like

    ./tuneup
    Try finding optimal parameters for ./mpn/x86_64/skylake/gmp-mparam.h
    If this is not possible this program will abort, which is no bug,
    and you should use the untuned version.

    Using: CPU cycle counter, supplemented by microsecond getrusage()
    speed_precision 1, speed_unittime 3.34e-10 secs, CPU freq 2992.97 MHz
    DEFAULT_MAX_SIZE 1000, fft_max_size 5

Simon
Re: GMP 6.2.1 Aborting when running tuneup program in one.cold()
When tuneup cannot measure things accurately, it bails out.

No bug.

--
Torbjörn
Please encrypt, key id 0xC8601622
GMP 6.2.1 Aborting when running tuneup program in one.cold()
make check passed

    Testsuite summary for GNU MP 6.2.1
    # TOTAL: 0
    # PASS:  0
    # SKIP:  0
    # XFAIL: 0
    # FAIL:  0
    # XPASS: 0
    # ERROR: 0

and the installed library also seems to work fine from a quick test, but
the tuneup program crashes. System specs below.

Questions:

* Did I miss a known issue?
* Can I do anything about this?
* Should I care about running tuneup at all? The application does some
  heavy computations with it in the range +-9 (mostly multiply and divide
  [often by 10] and addition/subtraction and string representation, all via
  GnuCOBOL)
* As the output of the tuneup utility is different each time and the docs at
  https://gmplib.org/manual/Performance-optimization are sparser than for
  other parts: Should I run it multiple times and then use the average?
  It lists -NNN but doesn't give a clue how to set it - should I set that?
  Does a high value make an average on its own?

I'm available to provide more info as needed (please CC me in answers as
I'm not subscribed to this list).

Thanks,
Simon

* uname -a

    Linux cent01test 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

* ../config.guess && ./configfsf.guess

    sandybridge-pc-linux-gnu
    x86_64-pc-linux-gnu

* gcc -v (built directly before on that machine)

    Using built-in specs.
    COLLECT_GCC=gcc
    COLLECT_LTO_WRAPPER=/opt/gcc-11.2.mit.gcc.flags/bin/../libexec/gcc/x86_64-pc-linux-gnu/11.2.0/lto-wrapper
    Target: x86_64-pc-linux-gnu
    Configured with: ../configure --prefix=/opt/gcc-11.2 --disable-multilib --enable-languages=c,c++,lto --with-gmp=/opt/gmp-6.2.1/lib CFLAGS='-march=native -mtune=native' CXXFLAGS='-march=native -mtune=native'
    Thread model: posix
    Supported LTO compression algorithms: zlib
    gcc version 11.2.0 (GCC)

* m4 --version

    m4 (GNU M4) 1.4.16

* backtrace:

    $ gdb tuneup --quiet -ex run -ex bt -ex quit
    Reading symbols from /opt/install/gmp-6.2.1/tune/tuneup...(no debugging symbols found)...done.
    Starting program: /opt/install/gmp-6.2.1/tune/tuneup
    Parameters for ./mpn/x86_64/skylake/gmp-mparam.h
    [Detaching after fork from child process 31364]
    Using: CPU cycle counter, supplemented by microsecond getrusage()
    speed_precision 1, speed_unittime 3.34e-10 secs, CPU freq 2992.97 MHz
    DEFAULT_MAX_SIZE 1000, fft_max_size 5

    /* Generated by tuneup.c, 2021-10-20, gcc 11.2 */

    #define MOD_1_NORM_THRESHOLD                 0  /* always */
    #define MOD_1_UNNORM_THRESHOLD               0  /* always */
    #define MOD_1N_TO_MOD_1_1_THRESHOLD          4
    #define MOD_1U_TO_MOD_1_1_THRESHOLD          3
    #define MOD_1_1_TO_MOD_1_2_THRESHOLD        13
    #define MOD_1_2_TO_MOD_1_4_THRESHOLD        38
    #define PREINV_MOD_1_TO_MOD_1_THRESHOLD      9
    #define USE_PREINV_DIVREM_1                  1  /* native */
    #define DIV_QR_1_NORM_THRESHOLD              1
    #define DIV_QR_1_UNNORM_THRESHOLD        MP_SIZE_T_MAX  /* never */
    #define DIV_QR_2_PI2_THRESHOLD              33
    #define DIVEXACT_1_THRESHOLD                 0  /* always (native) */
    #define BMOD_1_TO_MOD_1_THRESHOLD           20

    #define DIV_1_VS_MUL_1_PERCENT             468

    #define MUL_TOOM22_THRESHOLD                26
    #define MUL_TOOM33_THRESHOLD                73
    #define MUL_TOOM44_THRESHOLD               208
    #define MUL_TOOM6H_THRESHOLD               300
    #define MUL_TOOM8H_THRESHOLD               406

    #define MUL_TOOM32_TO_TOOM43_THRESHOLD      73
    #define MUL_TOOM32_TO_TOOM53_THRESHOLD     159
    #define MUL_TOOM42_TO_TOOM53_THRESHOLD     137
    #define MUL_TOOM42_TO_TOOM63_THRESHOLD     151
    #define MUL_TOOM43_TO_TOOM54_THRESHOLD     106

    #define SQR_BASECASE_THRESHOLD               0  /* always (native) */
    #define SQR_TOOM2_THRESHOLD                 32
    #define SQR_TOOM3_THRESHOLD                114
    #define SQR_TOOM4_THRESHOLD                176
    #define SQR_TOOM6_THRESHOLD                446
    #define SQR_TOOM8_THRESHOLD                547

    #define MULMID_TOOM42_THRESHOLD             48

    #define MULMOD_BNM1_THRESHOLD               15
    #define SQRMOD_BNM1_THRESHOLD               18

    #define MUL_FFT_MODF_THRESHOLD             460  /* k = 5 */
    #define MUL_FFT_TABLE3 \
      { {460, 5}, { 23, 6}, { 27, 7}, { 15, 6}, \
        { 31, 7}, { 25, 8}, { 13, 7}, { 29, 8}, \
        { 15, 7}, { 33, 8}, { 17, 7}, { 35, 8}, \
        { 19, 7}, { 40, 8}, { 21, 9}, { 11, 8}, \
        { 35, 9}, { 19, 8}, { 41, 9}, { 23, 8}, \
        { 49, 9}, { 27,10}, { 15, 9}, { 39, 8}, \
        { 81, 9}, { 43,10}, { 23, 9}, { 55,11}, \
        { 15,10}, { 31, 9}, { 71,10}, { 39, 9}, \
        { 87,10}, { 47, 9}, {512,10}, { 1024,11}, \
        { 2048,12}, { 4096,13}, { 8192,14}, { 16384,15}, \
        { 32768,16}, { 65536,17}, { 131072,18}, { 262144,19}, \
        { 524288,20}, {1048576,21}, {2097152,22}, {4194304,23}, \
        {8388608,24}