On Mar 13, 12:12 am, Cactus <rieman...@googlemail.com> wrote:
> On Mar 12, 5:31 pm, Cactus <rieman...@googlemail.com> wrote:
>
>
>
> > On Mar 12, 5:11 pm, Jeff Gilchrist <jeff.gilchr...@gmail.com> wrote:
>
> > > On Thu, Mar 12, 2009 at 1:02 PM, Cactus <rieman...@googlemail.com> wrote:
> > > > If the parameters change a lot I have found it necessary to repeat the
> > > > process to get the best parameters.
>
> > > The numbers keep changing so it is hard to tell which one is "best".
>
> > > > None of the Windows 32-bit parameter files have changed in several
> > > > years so it will be interesting to see how much difference there is.
>
> > > The current params file is very basic; it only contains a couple of
> > > options:
> > > #ifndef BITS_PER_MP_LIMB
> > > #define BITS_PER_MP_LIMB 32
> > > #elif   BITS_PER_MP_LIMB != 32
> > > #error  Bad configuration in gmp-mparam.h
> > > #endif
>
> > > #ifndef BYTES_PER_MP_LIMB
> > > #define BYTES_PER_MP_LIMB 4
> > > #elif   BYTES_PER_MP_LIMB != 4
> > > #error  Bad configuration in gmp-mparam.h
> > > #endif
>
> > > /* Generic x86 mpn_divexact_1 is faster than generic x86 mpn_divrem_1 on
> > >    all of p5, p6, k6 and k7, so use it always.  It's probably slower on
> > >    386 and 486, but that's too bad.  */
> > > #define DIVEXACT_1_THRESHOLD  0
>
> > > This is what tune generated on the 5th run for me (this is a 65nm
> > > Core2 using the 32bit p4 code):
>
> > > /* Generated by tuneup.c, 2009-03-12, system compiler */
>
> > > #define MUL_KARATSUBA_THRESHOLD          24
> > > #define MUL_TOOM3_THRESHOLD             161
>
> > > #define SQR_BASECASE_THRESHOLD            0  /* always (native) */
> > > #define SQR_KARATSUBA_THRESHOLD          42
> > > #define SQR_TOOM3_THRESHOLD             161
>
> > > #define MULLOW_BASECASE_THRESHOLD         8
> > > #define MULLOW_DC_THRESHOLD              92
> > > #define MULLOW_MUL_N_THRESHOLD          486
>
> > > #define DIV_SB_PREINV_THRESHOLD           0  /* always */
> > > #define DIV_DC_THRESHOLD                 64
> > > #define POWM_THRESHOLD                  146
>
> > > #define GCD_ACCEL_THRESHOLD              12
> > > #define GCDEXT_THRESHOLD                 21
> > > #define JACOBI_BASE_METHOD                1
>
> > > #define USE_PREINV_DIVREM_1               1
> > > #define USE_PREINV_MOD_1                  1
> > > #define DIVREM_2_THRESHOLD                0  /* always */
> > > #define DIVEXACT_1_THRESHOLD              0  /* always (native) */
> > > #define MODEXACT_1_ODD_THRESHOLD          0  /* always (native) */
>
> > > #define GET_STR_DC_THRESHOLD             20
> > > #define GET_STR_PRECOMPUTE_THRESHOLD     23
> > > #define SET_STR_THRESHOLD              6418
>
> > > #define MUL_FFT_TABLE  { 496, 1184, 1920, 5632, 14336, 40960, 0 }
> > > #define MUL_FFT_MODF_THRESHOLD          512
> > > #define MUL_FFT_THRESHOLD              3328
>
> > > #define SQR_FFT_TABLE  { 528, 1184, 2432, 5632, 14336, 40960, 0 }
> > > #define SQR_FFT_MODF_THRESHOLD          544
> > > #define SQR_FFT_THRESHOLD              3840
>
> > > /* Tuneup completed successfully, took 9 seconds */
>
> > My guess is that they won't make much difference, then.
>
> > I often find that tune results are not optimal and I can get more
> > performance by hand-tweaking the resulting values.
>
> > But it's time-consuming, so I don't recommend this unless you have a lot
> > of time to spare :-)
>
> I have updated run-speed.py because Bill found some bugs on Linux.  So
> download it again if you are going to try it.
>
I have updated it again today (14 March) to increase its coverage.

    Brian

>    Brian
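
On the point quoted above that the numbers keep changing from one tune run
to the next: one possible way to get a steadier picture is to save the
output of several runs and take the per-parameter median. The sketch below
is only an illustration and is not part of MPIR or run-speed.py. The
function names (parse_thresholds, median_thresholds) are hypothetical, and
the second and third sample runs are invented values, not measurements.
Only single-integer #define lines are handled; table-valued entries such as
MUL_FFT_TABLE are skipped.

```python
import re
from statistics import median_low

# Matches lines such as "#define MUL_KARATSUBA_THRESHOLD   24", with an
# optional trailing C comment. Leading "> " quote markers are tolerated
# because the pattern is searched for anywhere in the line.
DEFINE_RE = re.compile(r"#define\s+(\w+)\s+(\d+)\s*(?:/\*.*)?$")


def parse_thresholds(text):
    """Return {name: int} for every single-integer #define in one tune output."""
    values = {}
    for line in text.splitlines():
        match = DEFINE_RE.search(line.strip())
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def median_thresholds(runs):
    """Combine several parsed runs by taking the per-parameter low median.

    median_low always returns a value that was actually observed, so the
    result never contains a fractional threshold.
    """
    collected = {}
    for run in runs:
        for name, value in run.items():
            collected.setdefault(name, []).append(value)
    return {name: median_low(vals) for name, vals in collected.items()}


if __name__ == "__main__":
    # First run uses two values from the thread; the other two are made up
    # purely to show the mechanics.
    sample_runs = [
        "#define MUL_KARATSUBA_THRESHOLD          24\n"
        "#define MUL_TOOM3_THRESHOLD             161\n",
        "#define MUL_KARATSUBA_THRESHOLD          26\n"
        "#define MUL_TOOM3_THRESHOLD             155\n",
        "#define MUL_KARATSUBA_THRESHOLD          23\n"
        "#define MUL_TOOM3_THRESHOLD             170\n",
    ]
    combined = median_thresholds([parse_thresholds(r) for r in sample_runs])
    for name in sorted(combined):
        print(f"#define {name:<32}{combined[name]:>6}")
```

Treat the result as a starting point rather than a "best" set. The
thresholds interact with each other, so a median taken parameter by
parameter is only a heuristic and may still need tweaking by hand.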