I have just added the Windows versions of Jason's code to the K8-
experimental SVN branch. I have also added an interim version of
sqr_basecase.asm that Jason has provided. This does not improve on the
performance of the earlier code (in fact it is a bit slower for
medium-sized operands), but it removes the upper limit on the Karatsuba
threshold in the earlier code and hence makes tuning easier.

I have also added a Python program g2y.py that converts JM's code to
YASM format.  A few lines need manual intervention (PROLOGUE and
EPILOGUE in particular, as I am unclear how to translate these).  The
translation does not change the GCC calling conventions, so the
resulting code should not require much work to be used with YASM on
Linux.  g2y.py runs in the build.vc9 directory and translates each
assembler file in x86_64 into x86_64w if a translated version is not
already there.
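
For anyone curious what this kind of translation involves, here is a
minimal sketch. It is not the actual g2y.py; it assumes the input is
GAS (AT&T) syntax and the output is the Intel-style syntax that YASM
accepts by default. The function names are hypothetical. It handles only
registers, immediates and simple memory operands, and it does not strip
AT&T size suffixes or translate macros such as PROLOGUE and EPILOGUE.

```python
import re

# AT&T memory operand: disp(base,index,scale); every part except the
# parentheses is optional.  (Illustrative pattern, not taken from g2y.py.)
_MEM = re.compile(
    r"^(?P<disp>-?\w*)\((?P<base>%\w+)?"
    r"(?:,(?P<index>%\w+)(?:,(?P<scale>[1248]))?)?\)$"
)


def split_operands(text):
    """Split on commas that are not inside parentheses."""
    ops, depth, cur = [], 0, ""
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            ops.append(cur.strip())
            cur = ""
        else:
            cur += ch
    if cur.strip():
        ops.append(cur.strip())
    return ops


def convert_operand(op):
    """Convert one AT&T operand to Intel form."""
    op = op.strip()
    if op.startswith("$") or op.startswith("%"):
        return op[1:]               # immediate or register: drop the sigil
    m = _MEM.match(op)
    if not m:
        return op                   # label or anything else: leave unchanged
    parts = []
    if m.group("base"):
        parts.append(m.group("base")[1:])
    if m.group("index"):
        idx = m.group("index")[1:]
        scale = m.group("scale")
        parts.append(idx + "*" + scale if scale else idx)
    addr = "+".join(parts)
    disp = m.group("disp")
    if disp:
        if not addr:
            addr = disp
        elif disp.startswith("-"):
            addr += disp
        else:
            addr += "+" + disp
    return "[" + addr + "]"


def convert_line(line):
    """Convert one instruction line; labels, comments and directives pass through."""
    code = line.strip()
    if not code or code.endswith(":") or code[0] in "#.":
        return line
    fields = code.split(None, 1)
    ops = split_operands(fields[1]) if len(fields) > 1 else []
    # AT&T order is source, destination; Intel order is destination, source.
    ops = [convert_operand(o) for o in reversed(ops)]
    return ("\t" + fields[0] + "\t" + ", ".join(ops)).rstrip()
```

With this sketch, convert_line("mov 8(%rsi,%rcx,8), %rax") gives a line
reading "mov rax, [rsi+rcx*8+8]" (tab separated), which shows the two
main changes: the operand order is reversed and the addressing form is
rewritten.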

The benchmark on Windows on my AMD Athlon X2 (2.4GHz) is now 10,800
with this code (up from 8700 for the original code).

I have also tidied up the Windows AMD64 build in both the trunk and
the experimental branches.

     Happy New Year to all

         Brian



