tuneup: speed_measure() could not get 4 results within 1.0%

2018-03-28 Thread Nikita Zlobin
There is an issue with the tune system. I tried to compile from source and
tune on two laptops, one with an Intel B950, the second with an i5 (I don't
remember the exact CPU number), both Sandy Bridge (the i5 has AVX).

./configure --enable-cxx
make
make -j1 -C tune tuneup
./tune/tuneup | tee gmp.mparam.h.new

In every run where I could see the log, it ends with this diagnostic
message:
speed_measure() could not get 4 results within 1.0%

$? is usually 0, but
$ echo ${PIPESTATUS[*]/#/+}
prints +134 +0, i.e. tuneup itself exits with status 134.
The Gentoo gmp ebuild runs this second check after tuneup. Is the failure
really critical, or is it still fine to rebuild with such a result?
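
For reference, a status above 128 is how the shell reports a process that
was killed by a signal: 134 is 128 + 6, and signal 6 is SIGABRT on Linux,
which would be consistent with tuneup calling abort() after printing the
diagnostic. A minimal sketch of decoding the two values printed above;
describe_status is a hypothetical helper, not part of GMP or the ebuild:

```shell
# Hypothetical helper: explain an exit status as reported by the shell.
describe_status() {
    if [ "$1" -gt 128 ]; then
        # The shell encodes "terminated by signal N" as 128 + N.
        echo "killed by signal $(($1 - 128))"
    else
        echo "exited with status $1"
    fi
}

# The two values from the pipeline above: tuneup first, then tee.
describe_status 134    # killed by signal 6
describe_status 0      # exited with status 0
```

In bash, ${PIPESTATUS[0]} read immediately after the pipeline gives tuneup's
own status, so a script can test that instead of $?, which only reflects tee.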

The run usually stops at a different point each time, sometimes at nearly
the same place, so one needs to keep all the logs together to see the
difference.

The last attempt was made with an hg clone of the 6.1.2 branch (on the main
branch tuneup doesn't even build). The end of the last log:
#define GET_STR_PRECOMPUTE_THRESHOLD    24
#define SET_STR_DC_THRESHOLD  1391
#define SET_STR_PRECOMPUTE_THRESHOLD  2404

#define FAC_DSC_THRESHOLD  557
#define FAC_ODD_THRESHOLD   24

#define MATRIX22_STRASSEN_THRESHOLD 23
#define HGCD_THRESHOLD   speed_measure() could not get 4 results within 1.0%
    unsorted      sorted
  0.00297644    0.00147974    is about 0.5%
  0.29764       0.14797
  0.25665       0.14848
(TL;DR)
The resulting file (gmp.mparam.h.new) doesn't have any diagnostic messages
at the end; its last line is just: #define HGCD_THRESHOLD

The previous test was done with the release. Before trying it, I rebooted
and stopped everything unnecessary (DM, cups, NetworkManager, gpm).

It got a few lines further before failing, and it ended with something
new:
#define MATRIX22_STRASSEN_THRESHOLD 17
#define HGCD_THRESHOLD 112
#define HGCD_APPR_THRESHOLD 104
#define HGCD_REDUCE_THRESHOLD 4633
#define GCD_DC_THRESHOLD   492
#define GCDEXT_DC_THRESHOLD 379
Oops, can't measure all mpn_jacobi_base methods at 48

System: gentoo (from calculate distro), linux 4.14.19, gcc-6.4.0,
binutils-2.19.1, glibc-2.25.

$ cpupower frequency-info
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 800 MHz - 2.10 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 800 MHz and 2.10 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 2.08 GHz (asserted by call to kernel)
  boost state support:
    Supported: no
    Active: no
    25500 MHz max turbo 4 active cores
    25500 MHz max turbo 3 active cores
    25500 MHz max turbo 2 active cores
    25500 MHz max turbo 1 active cores

Some fields from /proc/cpuinfo:
cpu family      : 6
model           : 42
model name      : Intel(R) Pentium(R) CPU B950 @ 2.10GHz
stepping        : 7
microcode       : 0x25
cpu MHz         : 2023.712
cache size      : 2048 KB
...
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
___
gmp-bugs mailing list
gmp-bugs@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-bugs


Configure unable to recognize mipsisa64r2el triplets

2018-03-28 Thread Jiaxun Yang
Hi GMP developers:

I'm trying to cross-compile GMP for a MIPS64r2 target. The triplet is
mipsisa64r2el-unknown-linux-gnu, which I got from config.guess, and the
expected ABI is ABI=64. However, ./configure tells me that the only
choice of ABI is o32.

After investigating, I found that this is caused by line 4464 of
./configure:
  mips64*-*-* | mips*-*-irix[6789]*)
It should probably be:
  mips*64*-*-* | mips*-*-irix[6789]*)
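
The difference between the two patterns can be checked with plain shell
case matching, which is the construct the quoted configure line belongs to.
A small sketch; the function names are made up for illustration, and the
first and third triplets are assumed examples, not taken from this report:

```shell
# Pattern currently at the quoted configure line.
matches_current() {
    case $1 in
        mips64*-*-* | mips*-*-irix[6789]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Pattern proposed in this report.
matches_proposed() {
    case $1 in
        mips*64*-*-* | mips*-*-irix[6789]*) return 0 ;;
        *) return 1 ;;
    esac
}

for triplet in mips64el-unknown-linux-gnu \
               mipsisa64r2el-unknown-linux-gnu \
               mipsel-unknown-linux-gnu; do
    if matches_current "$triplet"; then cur=yes; else cur=no; fi
    if matches_proposed "$triplet"; then new=yes; else new=no; fi
    echo "$triplet current=$cur proposed=$new"
done
```

Only the proposed pattern accepts the mipsisa64r2el triplet, and neither
accepts the plain mipsel one. Note that mips*64* would also match a "64"
that appears later in the string, for example in the vendor field.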

The config.log is attached for reference.

Thanks

 
-- 
Jiaxun Yang

This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.

It was created by GNU MP configure 6.1.2, which was
generated by GNU Autoconf 2.69.  Invocation command line was

  $ ./configure --prefix=/usr --build=x86_64-aosc-linux-gnu --host=mipsisa64r2el-aosc-linux-gnuabi64 --enable-cxx

## --------- ##
## Platform. ##
## --------- ##

hostname = Ry1800X
uname -m = x86_64
uname -r = 4.15.7-aosc-main
uname -s = Linux
uname -v = #1 SMP PREEMPT Mon Mar 5 16:23:37 UTC 2018

/usr/bin/uname -p = unknown
/bin/uname -X = unknown

/bin/arch  = x86_64
/usr/bin/arch -k   = unknown
/usr/convex/getsysinfo = unknown
/usr/bin/hostinfo  = unknown
/bin/machine   = unknown
/usr/bin/oslevel   = unknown
/bin/universe  = unknown

PATH: /usr/local/bin
PATH: /usr/local/sbin
PATH: /bin
PATH: /sbin
PATH: /usr/bin
PATH: /usr/sbin
PATH: /usr/bin/site_perl
PATH: /usr/bin/vendor_perl
PATH: /usr/bin/core_perl
PATH: /root/x-tools/mipsisa64r2el-aosc-linux-gnuabi64/bin


## ----------- ##
## Core tests. ##
## ----------- ##

configure:3055: checking build system type
configure:3069: result: x86_64-aosc-linux-gnu
configure:3089: checking host system type
configure:3102: result: mipsisa64r2el-aosc-linux-gnuabi64
configure:3139: checking for a BSD-compatible install
configure:3207: result: /bin/install -c
configure:3218: checking whether build environment is sane
configure:3273: result: yes
configure:3332: checking for mipsisa64r2el-aosc-linux-gnuabi64-strip
configure:3359: result: mipsisa64r2el-aosc-linux-gnuabi64-strip
configure:3424: checking for a thread-safe mkdir -p
configure:3463: result: /bin/mkdir -p
configure:3470: checking for gawk
configure:3486: found /bin/gawk
configure:3497: result: gawk
configure:3508: checking whether make sets $(MAKE)
configure:3530: result: yes
configure:3559: checking whether make supports nested variables
configure:3576: result: yes
configure:3705: checking whether to enable maintainer-specific portions of Makefiles
configure:3714: result: no
User:
ABI=64
CC=mipsisa64r2el-aosc-linux-gnuabi64-gcc
CFLAGS=(unset)
CPPFLAGS=-fexceptions
MPN_PATH=
GMP:
abilist=o32
cclist=gcc cc
configure:5689: error: ABI=64 is not among the following valid choices: o32

## ---------------- ##
## Cache variables. ##
## ---------------- ##

ac_cv_build=x86_64-aosc-linux-gnu
ac_cv_env_ABI_set=set
ac_cv_env_ABI_value=64
ac_cv_env_CCC_set=
ac_cv_env_CCC_value=
ac_cv_env_CC_FOR_BUILD_set=
ac_cv_env_CC_FOR_BUILD_value=
ac_cv_env_CC_set=set
ac_cv_env_CC_value=mipsisa64r2el-aosc-linux-gnuabi64-gcc
ac_cv_env_CFLAGS_set=
ac_cv_env_CFLAGS_value=
ac_cv_env_CPPFLAGS_set=set
ac_cv_env_CPPFLAGS_value=-fexceptions
ac_cv_env_CPP_FOR_BUILD_set=
ac_cv_env_CPP_FOR_BUILD_value=
ac_cv_env_CPP_set=
ac_cv_env_CPP_value=
ac_cv_env_CXXCPP_set=
ac_cv_env_CXXCPP_value=
ac_cv_env_CXXFLAGS_set=
ac_cv_env_CXXFLAGS_value=
ac_cv_env_CXX_set=set
ac_cv_env_CXX_value=mipsisa64r2el-aosc-linux-gnuabi64-g++
ac_cv_env_LDFLAGS_set=
ac_cv_env_LDFLAGS_value=
ac_cv_env_LIBS_set=
ac_cv_env_LIBS_value=
ac_cv_env_LT_SYS_LIBRARY_PATH_set=
ac_cv_env_LT_SYS_LIBRARY_PATH_value=
ac_cv_env_M4_set=
ac_cv_env_M4_value=
ac_cv_env_YACC_set=
ac_cv_env_YACC_value=
ac_cv_env_YFLAGS_set=
ac_cv_env_YFLAGS_value=
ac_cv_env_build_alias_set=set
ac_cv_env_build_alias_value=x86_64-aosc-linux-gnu
ac_cv_env_host_alias_set=set
ac_cv_env_host_alias_value=mipsisa64r2el-aosc-linux-gnuabi64
ac_cv_env_target_alias_set=
ac_cv_env_target_alias_value=
ac_cv_host=mipsisa64r2el-aosc-linux-gnuabi64
ac_cv_path_install='/bin/install -c'
ac_cv_path_mkdir=/bin/mkdir
ac_cv_prog_AWK=gawk
ac_cv_prog_STRIP=mipsisa64r2el-aosc-linux-gnuabi64-strip
ac_cv_prog_make_make_set=yes
am_cv_make_support_nested_variables=yes

## ----------------- ##
## Output variables. ##
## ----------------- ##

ABI='64'
ACLOCAL='${SHELL} /root/build/gmp-6.1.2/missing aclocal-1.15'
AMTAR='$${TAR-tar}'
AM_BACKSLASH='\'
AM_DEFAULT_V='$(AM_DEFAULT_VERBOSITY)'
AM_DEFAULT_VERBOSITY='1'
AM_V='$(V)'
AR='mipsisa64r2el-aosc-linux-gnuabi64-ar'
AS='mipsisa64r2el-aosc-linux-gnuabi64-as'
ASMFLAGS=''
AUTOCONF='${SHELL} /root/build/gmp-6.1.2/missing autoconf'
AUTOHEADER='${SHELL} /root/build/gmp-6.1.2/missing autoheader'
AUTOMAKE='${SHELL} /root/build/gmp-6.1.2/missing automake-1.15'
AWK='gawk'
CALLING_CONVENTIONS_OBJS=''
CC='mipsisa64r2el-aosc-linux-gnuabi64-gcc'
CCAS=''
CC_FOR_BUILD=''
CFLAGS=''
CPP=''
CPPFLAGS='-fexceptions'
CPP_FOR_BUILD=''
CXX='mipsisa6

Codelets for ToomN1 (for N=2, 3, 4, 6, 8) should be added and here's why. (Also: a significant non-triviality on where cut-off points should be).

2018-03-28 Thread Rock Brentwood
It may be possible to eliminate the need for a "basecase" (except for 1x1
limb multiplication, of course) by including codelets for Toom21, Toom31,
Toom41, Toom61, Toom81, etc. In many cases, the recursive decompositions
done through these codelets will end up falling *back* into the other
codelets! A method based on thresholding completely misses both of these
features.

I included an ASCII picture showing the cut-off points for multiplication
that uses Toom11 (O) Toom21 (.) and Toom22 (o). Hopefully the mailer will
not cut off the diagram.

In some places (:), a balance point between Toom21 and Toom22 is reached.
This optimum is taken over *all* splitting points for Toom21 and Toom22,
not just midpoints. This diagram is relatively invariant with respect to
changes in the cost ratio for addition versus multiplication. Generally it
only affects what the cutoff for Toom22 is, not where the threshold between
Toom21 and Toom22 lies.

You should run a dynamic programming algorithm to compile a table over all
of your codelets, just to see where the optimal splits occur.

For the cases examined here: all cases of Toom21 optimally cut off at the
largest power of 2 smaller than the digit size of the larger numeral. Many
cases of Toom22 optimally cut off at a power of 2 or places other than
expected; although I didn't indicate in the diagram below where it is.
There is some dependency on how the dataflow is carried out with the
recursion (particularly the moving back and forth between the numerals and
the temporary workspace) but it should not affect the results below
significantly. This is something I will try to check against your
implementations.

The pattern is fractal. The "seeds" of the islands in the diagram below all
grow to larger sizes, for each power of 2 size you scale the diagram up by.
This picture will get far more complex when the other Toom codelets are
added in.

This should come out as a 128 x 128 table. You may need to run a script to
lay it out as such, if the e-mailer messes it up.

-- Rock Brentwood

O...
.o..
..oo..:.
..oo:...
...::...
:...
..:.o...
o:..
.:ooo...:...
...:o...:...
o...
o:..
:...oo..
oo:.
ooo.
ooo:
oo..::...::.
...:oo..::..
.:..:...
...:oo:.
ooo.
ooo.:...
ooo..

Re: Configure unable to recognize mipsisa64r2el triplets

2018-03-28 Thread Torbjörn Granlund
Jiaxun Yang  writes:

  I'm trying to cross-compile GMP for a MIPS64r2 target. The triplet is
  mipsisa64r2el-unknown-linux-gnu, which I got from config.guess, and the
  expected ABI is ABI=64. However, ./configure tells me that the only
  choice of ABI is o32.

I've never seen "mipsisa64r2el" before.

Where do you run config.guess given that you are cross compiling?

-- 
Torbjörn
Please encrypt, key id 0xC8601622


Re: Configure unable to recognize mipsisa64r2el triplets

2018-03-28 Thread Jiaxun Yang
On Wed, 2018-03-28 at 13:04 +0200, Torbjörn Granlund wrote:
> 
> 
> I've never seen "mipsisa64r2el" before.

Yes, it's rare. However, according to config.sub from GCC [1], it's not
illegal.

> 
> Where do you run config.guess given that you are cross compiling?
> 

I ran it on an LFS rootfs provided by Loongson (a Chinese MIPS64r2
processor).


[1] https://github.com/gcc-mirror/gcc/blob/master/config.sub

Thanks.
-- 
Jiaxun Yang 


Re: Configure unable to recognize mipsisa64r2el triplets

2018-03-28 Thread Torbjörn Granlund
Jiaxun Yang  writes:

  > I've never seen "mipsisa64r2el" before.

  Yes, it's rare. However, according to config.sub from GCC [1], it's not
  illegal.

Good! We wouldn't want to be gaoled for a CPU moniker!

  I ran it on a LFS rootfs provided by Loongson (A Chinese MIPS64r2
  processor).

  [1]https://github.com/gcc-mirror/gcc/blob/master/config.sub

It seems that GMP's configfsf.sub also has that.  Now, I have no idea
what it means.  Is it just a novel name for mips64r2el or is it somehow
a different ABI?

If you make the obvious change to configure.ac, then build twice, once
with --disable-shared and once with --disable-static, does

  make && make check

run to completion without any errors?  (OK, you told me you were cross
compiling, so perhaps that will require some intermediary command.  And
"make check TESTS=" for just building the tests would perhaps be useful.)
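
One way to lay out the two builds is in separate build directories, one per
library variant. The sketch below only prints the commands rather than
running them. The helper name, the directory names, the ../configure path
(build directories inside an otherwise unconfigured source tree) and the
--host value (taken from the config.log earlier in the thread) are
assumptions to adapt:

```shell
# Print the commands for one variant; nothing is executed here.
# $1 = build directory name (hypothetical), $2 = configure flag to vary.
plan_variant() {
    echo "mkdir $1 && cd $1"
    echo "../configure --host=mipsisa64r2el-aosc-linux-gnuabi64 ABI=64 $2"
    # TESTS= builds the test programs without running them, which is
    # what a cross build can do on the build machine.
    echo "make && make check TESTS="
    echo "cd .."
}

plan_variant build-static --disable-shared
plan_variant build-shared --disable-static
```

On a native build the TESTS= argument can be dropped so that the tests
actually run.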

-- 
Torbjörn
Please encrypt, key id 0xC8601622


Re: Configure unable to recognize mipsisa64r2el triplets

2018-03-28 Thread Jiaxun Yang
On Wed, 2018-03-28 at 15:54 +0200, Torbjörn Granlund wrote:
> Jiaxun Yang writes:
> 
>   > I've never seen "mipsisa64r2el" before.
> 
>   Yes, it's rare. However, according to config.sub from GCC [1], it's
>   not illegal.
> 
> Good! We wouldn't want to be gaoled for a CPU moniker!
> 
>   I ran it on a LFS rootfs provided by Loongson (A Chinese MIPS64r2
>   processor).
> 
>   [1] https://github.com/gcc-mirror/gcc/blob/master/config.sub
> 
> It seems that GMP's configfsf.sub also has that.  Now, I have no idea
> what it means.  Is it just a novel name for mips64r2el or is it
> somehow a different ABI?

It should be a novel name for mips64r2el. I don't know whether mips64r2el
is illegal, but it is not recognized by config.sub.

> 
> If you make the obvious change to configure.ac, then build twice, once
> with --disable-shared and once with --disable-static, does
> 
>   make && make check
> 
> run to completion without any errors?  (OK, you told me you were cross
> compiling, so perhaps that will require some intermediary command.  And
> "make check TESTS=" for just building the tests would perhaps be useful.)

Yes, I've run the tests on a Loongson target system with a native
build. After using mips*64* instead of mips64* in configure and setting
ABI=64, no errors were reported. I'm not sure whether this change will
break other MIPS systems, but since all MIPS triplets containing *64*
should support the 64, o32 and N32 ABIs, it will probably work just
fine.

> 
-- 
Jiaxun Yang 



Re: tuneup: speed_measure() could not get 4 results within 1.0%

2018-03-28 Thread Torbjörn Granlund
Nikita Zlobin  writes:

  There is an issue with the tune system. I tried to compile from source and
  tune on two laptops, one with an Intel B950, the second with an i5 (I don't
  remember the exact CPU number), both Sandy Bridge (the i5 has AVX).

We are aware that tuning of the various GCD thresholds doesn't work
properly.  This has been broken for a very long time.  I have not spent
enough time on it to understand why it does not work.  If somebody else
could do that, it would be very good.

-- 
Torbjörn
Please encrypt, key id 0xC8601622