Re: GMP 6.2.1 Aborting when running tuneup program in one.cold()

2021-10-21 Thread Marco Bodrato

Ciao Simon,

On 2021-10-20 16:09 Simon Sobisch wrote:

Questions:



* Should I care running tuneup at all?
  The application does some heavy computations with it in the range
  +-9 (mostly multiply and divide [often by 10] and


9 fits in 63 bits, correct?

For that range, on a 64-bit CPU, the native integer types should be 
enough.


By "extremely large numbers", the manual means much larger bit sizes.

You probably don't need tuneup at all.
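
As a rough illustration of the point about native types, the following C sketch counts the bits a magnitude needs and multiplies two 64-bit values with overflow detection. It is not taken from GMP or from the application under discussion: the helper names bits_needed and checked_mul are hypothetical, and __builtin_mul_overflow is a GCC/Clang builtin, so this assumes one of those compilers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of bits needed to represent the magnitude x (0 needs 0 bits).
   Hypothetical helper, not part of GMP. */
static unsigned bits_needed(uint64_t x)
{
    unsigned bits = 0;
    while (x != 0) {
        bits++;
        x >>= 1;
    }
    return bits;
}

/* Multiply a by b in signed 64-bit arithmetic.
   Returns true when the exact product fits in int64_t; on overflow it
   returns false and *out holds the wrapped value.
   Relies on the GCC/Clang builtin __builtin_mul_overflow. */
static bool checked_mul(int64_t a, int64_t b, int64_t *out)
{
    return !__builtin_mul_overflow(a, b, out);
}
```

Any magnitude for which bits_needed() returns at most 63 fits in an int64_t; intermediate products still need a check such as checked_mul(), since multiplying two in-range values can leave the range.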

Moreover, the GMP sources already contain pre-tuned parameters for 
many platforms. They are used automatically by the typical

./configure && make && make check

build process, so tuning is superfluous in most cases.


* As the output of the tuneup utility is different each time and the
  docs at https://gmplib.org/manual/Performance-optimization are
  sparser than for other parts: Should I run it multiple times and then
  use the average?


Some thresholds may have a large range of tolerance, others do not.
In any case, a collection of parameters needs to be coherent.
So my answer is: use the results from a single run.
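
To make "a collection of parameters" a little more concrete, here is a simplified C sketch of how a set of multiplication thresholds acts as one list of crossover points. The five values are copied from the single tuneup run quoted in the original report. The names pick_mul_algo and thresholds_increasing are hypothetical, the real selection logic in GMP is more involved (unbalanced operands, FFT), and treating "coherent" as "strictly increasing" is only an assumption made for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Values from one tuneup run quoted in this thread. */
#define MUL_TOOM22_THRESHOLD  26
#define MUL_TOOM33_THRESHOLD  73
#define MUL_TOOM44_THRESHOLD 208
#define MUL_TOOM6H_THRESHOLD 300
#define MUL_TOOM8H_THRESHOLD 406

enum mul_algo {
    MUL_BASECASE, MUL_TOOM22, MUL_TOOM33,
    MUL_TOOM44, MUL_TOOM6H, MUL_TOOM8H
};

/* Hypothetical, simplified dispatch for balanced operands of n limbs:
   each threshold is the size from which the next algorithm is used. */
static enum mul_algo pick_mul_algo(size_t n)
{
    if (n < MUL_TOOM22_THRESHOLD) return MUL_BASECASE;
    if (n < MUL_TOOM33_THRESHOLD) return MUL_TOOM22;
    if (n < MUL_TOOM44_THRESHOLD) return MUL_TOOM33;
    if (n < MUL_TOOM6H_THRESHOLD) return MUL_TOOM44;
    if (n < MUL_TOOM8H_THRESHOLD) return MUL_TOOM6H;
    return MUL_TOOM8H;
}

/* Sanity check under the assumption stated above: the crossover
   points of one set should be strictly increasing. */
static bool thresholds_increasing(const size_t *t, size_t count)
{
    for (size_t i = 1; i < count; i++) {
        if (t[i] <= t[i - 1])
            return false;
    }
    return true;
}
```

Because each threshold was measured relative to the others in the same run, the set is best kept together as it was produced.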

Ĝis,
m
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Re: GMP 6.2.1 Aborting when running tuneup program in one.cold()

2021-10-20 Thread Torbjörn Granlund
Simon Sobisch  writes:

  On 20.10.2021 at 16:19, Torbjörn Granlund wrote:
  > When tuneup cannot measure things accurately, it bails out.

  That's interesting. Is there anything I can do to help tuneup measure
  things accurately?

Make sure your system is idle except for the tuneup process.
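
One way to check for an idle machine before starting tuneup is to look at the load average. The C sketch below is only an illustration and not something tuneup itself does: system_looks_idle and load_is_low are hypothetical names, any limit passed in is an arbitrary choice, and getloadavg() is a glibc/BSD extension rather than ISO C.

```c
#define _DEFAULT_SOURCE   /* for getloadavg() on glibc */
#include <stdbool.h>
#include <stdlib.h>

/* True when a (non-negative) load figure is below the chosen limit. */
static bool load_is_low(double load, double limit)
{
    return load >= 0.0 && load < limit;
}

/* Returns 1 if the 1-minute load average is below limit,
   0 if it is not, and -1 if the load average is unavailable. */
static int system_looks_idle(double limit)
{
    double avg[1];

    if (getloadavg(avg, 1) < 1)
        return -1;
    return load_is_low(avg[0], limit) ? 1 : 0;
}
```

A wrapper could call system_looks_idle() with a small limit and refuse to start a timing run unless it returns 1.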

  Can you please add that important information (the program aborting is
  not a bug; just use the non-optimized version) to the documentation
  https://gmplib.org/manual/Performance-optimization
  ideally together with the answers to the related questions
  "What should -f NNN relate to?" and "Should I manually build the
  average?" (if this isn't an effect of "not accurately measured")?

If we find time, perhaps.

Running the tuneup program and making use of its results is mainly
intended for GMP devs.

-- 
Torbjörn
Please encrypt, key id 0xC8601622


Re: GMP 6.2.1 Aborting when running tuneup program in one.cold()

2021-10-20 Thread Simon Sobisch

Thanks for the prompt answer!

On 20.10.2021 at 16:19, Torbjörn Granlund wrote:

> When tuneup cannot measure things accurately, it bails out.


That's interesting. Is there anything I can do to help tuneup measure 
things accurately?


Can you please add that important information (the program aborting is 
not a bug; just use the non-optimized version) to the documentation

   https://gmplib.org/manual/Performance-optimization

ideally together with the answers to the related questions
"What should -f NNN relate to?" and "Should I manually build the 
average?" (if this isn't an effect of "not accurately measured")?


> No bug.

Maybe the tuneup program could also hint at this at startup (in addition 
to the doc change)?


Something like

./tuneup
Try finding optimal parameters for ./mpn/x86_64/skylake/gmp-mparam.h
If this is not possible this program will abort, which is no bug, and 
you should use the untuned version.

Using: CPU cycle counter, supplemented by microsecond getrusage()
speed_precision 1, speed_unittime 3.34e-10 secs, CPU freq 2992.97 MHz
DEFAULT_MAX_SIZE 1000, fft_max_size 5


Simon


GMP 6.2.1 Aborting when running tuneup program in one.cold()

2021-10-20 Thread Simon Sobisch

make check passed

Testsuite summary for GNU MP 6.2.1

# TOTAL: 0
# PASS:  0
# SKIP:  0
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0

and the installed library also seems to work fine from a quick test, but 
the tuneup program crashes.


System specs below.


Questions:
* Did I miss a known issue?
* Can I do anything about this?
* Should I care about running tuneup at all?
  The application does some heavy computations with it in the range
  +-9 (mostly multiply and divide [often by 10] and
  addition/subtraction and string representation all via GnuCOBOL)
* As the output of the tuneup utility is different each time and the
  docs at https://gmplib.org/manual/Performance-optimization are
  sparser than for other parts: Should I run it multiple times and then
  use the average? It lists -NNN but doesn't give a clue how to set
  it - should I set that? Does a high value make an average on its own?

I'm available to provide more info as needed (please CC me in answers as 
I'm not subscribed to this list).



Thanks,
Simon




* uname -a
Linux cent01test 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 
2018 x86_64 x86_64 x86_64 GNU/Linux


* ../config.guess && ./configfsf.guess
sandybridge-pc-linux-gnu
x86_64-pc-linux-gnu

* gcc -v (built directly before on that machine)
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/gcc-11.2.mit.gcc.flags/bin/../libexec/gcc/x86_64-pc-linux-gnu/11.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --prefix=/opt/gcc-11.2 --disable-multilib 
--enable-languages=c,c++,lto --with-gmp=/opt/gmp-6.2.1/lib 
CFLAGS='-march=native -mtune=native' CXXFLAGS='-march=native -mtune=native'

Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.2.0 (GCC)

* m4 --version
m4 (GNU M4) 1.4.16

* backtrace:
$ gdb tuneup --quiet -ex run -ex bt -ex quit
Reading symbols from /opt/install/gmp-6.2.1/tune/tuneup...(no debugging 
symbols found)...done.

Starting program: /opt/install/gmp-6.2.1/tune/tuneup
Parameters for ./mpn/x86_64/skylake/gmp-mparam.h
[Detaching after fork from child process 31364]
Using: CPU cycle counter, supplemented by microsecond getrusage()
speed_precision 1, speed_unittime 3.34e-10 secs, CPU freq 2992.97 MHz
DEFAULT_MAX_SIZE 1000, fft_max_size 5

/* Generated by tuneup.c, 2021-10-20, gcc 11.2 */

#define MOD_1_NORM_THRESHOLD                 0  /* always */
#define MOD_1_UNNORM_THRESHOLD               0  /* always */
#define MOD_1N_TO_MOD_1_1_THRESHOLD          4
#define MOD_1U_TO_MOD_1_1_THRESHOLD          3
#define MOD_1_1_TO_MOD_1_2_THRESHOLD        13
#define MOD_1_2_TO_MOD_1_4_THRESHOLD        38
#define PREINV_MOD_1_TO_MOD_1_THRESHOLD      9
#define USE_PREINV_DIVREM_1                  1  /* native */
#define DIV_QR_1_NORM_THRESHOLD              1
#define DIV_QR_1_UNNORM_THRESHOLD        MP_SIZE_T_MAX  /* never */
#define DIV_QR_2_PI2_THRESHOLD              33
#define DIVEXACT_1_THRESHOLD                 0  /* always (native) */
#define BMOD_1_TO_MOD_1_THRESHOLD           20

#define DIV_1_VS_MUL_1_PERCENT             468

#define MUL_TOOM22_THRESHOLD                26
#define MUL_TOOM33_THRESHOLD                73
#define MUL_TOOM44_THRESHOLD               208
#define MUL_TOOM6H_THRESHOLD               300
#define MUL_TOOM8H_THRESHOLD               406

#define MUL_TOOM32_TO_TOOM43_THRESHOLD      73
#define MUL_TOOM32_TO_TOOM53_THRESHOLD     159
#define MUL_TOOM42_TO_TOOM53_THRESHOLD     137
#define MUL_TOOM42_TO_TOOM63_THRESHOLD     151
#define MUL_TOOM43_TO_TOOM54_THRESHOLD     106

#define SQR_BASECASE_THRESHOLD               0  /* always (native) */
#define SQR_TOOM2_THRESHOLD                 32
#define SQR_TOOM3_THRESHOLD                114
#define SQR_TOOM4_THRESHOLD                176
#define SQR_TOOM6_THRESHOLD                446
#define SQR_TOOM8_THRESHOLD                547

#define MULMID_TOOM42_THRESHOLD             48

#define MULMOD_BNM1_THRESHOLD               15
#define SQRMOD_BNM1_THRESHOLD               18

#define MUL_FFT_MODF_THRESHOLD             460  /* k = 5 */
#define MUL_FFT_TABLE3  \
  { {460, 5}, { 23, 6}, { 27, 7}, { 15, 6}, \
{ 31, 7}, { 25, 8}, { 13, 7}, { 29, 8}, \
{ 15, 7}, { 33, 8}, { 17, 7}, { 35, 8}, \
{ 19, 7}, { 40, 8}, { 21, 9}, { 11, 8}, \
{ 35, 9}, { 19, 8}, { 41, 9}, { 23, 8}, \
{ 49, 9}, { 27,10}, { 15, 9}, { 39, 8}, \
{ 81, 9}, { 43,10}, { 23, 9}, { 55,11}, \
{ 15,10}, { 31, 9}, { 71,10}, { 39, 9}, \
{ 87,10}, { 47, 9}, {512,10}, {   1024,11}, \
{   2048,12}, {   4096,13}, {   8192,14}, {  16384,15}, \
{  32768,16}, {  65536,17}, { 131072,18}, { 262144,19}, \
{ 524288,20}, {1048576,21}, {2097152,22}, {4194304,23}, \
{8388608,24}