Some comments about the patches:

(1) Why do you set up paths for zen (as a fallback)?

Doing that seems wrong unless all these 3 CPUs support every zen
instruction.  Do they?

Also, passing k8 to the compiler and choosing zen asm code makes very
little sense to me.  If zen makes sense for asm code, why does it not
make sense for compiler-generated code?

A more conservative strategy is to pick file-by-file from the desired
x86_64 subdir.  E.g., if mpn_mul_basecase from x86_64/foo runs great for
the kx6000 CPU, do

MULFUNC_PROLOGUE(mpn_mul_basecase)
include_mpn(`x86_64/foo/mul_basecase.asm')

in x86_64/kx6000/mul_basecase.asm.

(At some point, providing specific asm code for these CPUs might be
desirable, of course.)


(2) Testing for a specific stepping.

You check for stepping 0xE specifically.  I assume the "nano" CPU has
lower stepping values than 0xE, and Zhaoxin variants currently have
exactly 0xE.  It then feels more future-proof to check >= 0xE, as that
would allow Zhaoxin to do stepping iterations without GMP silently
falling back to "nano" for such steppings.
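
To make the difference concrete, here is a minimal sketch in C.  It is
not the code from the patch; all names are hypothetical, and the 0xE
threshold is just the value discussed above.  It assumes only that CPUID
leaf 1 reports the stepping in bits 3:0 of EAX.

```c
#include <assert.h>

/* Assumed threshold from the discussion: steppings below 0xE are taken
   to be "nano", 0xE and above Zhaoxin.  Hypothetical name. */
#define ZHAOXIN_MIN_STEPPING 0xEu

/* Extract the stepping field (bits 3:0) from the EAX value returned by
   CPUID leaf 1. */
static unsigned cpuid_stepping(unsigned eax)
{
    return eax & 0xFu;
}

/* Exact test: only stepping 0xE is treated as Zhaoxin. */
static int is_zhaoxin_exact(unsigned stepping)
{
    return stepping == ZHAOXIN_MIN_STEPPING;
}

/* Range test: 0xE and any later stepping is treated as Zhaoxin. */
static int is_zhaoxin_at_least(unsigned stepping)
{
    return stepping >= ZHAOXIN_MIN_STEPPING;
}

/* Small self-check showing where the two tests disagree. */
static void stepping_self_check(void)
{
    assert(is_zhaoxin_exact(0xE) && is_zhaoxin_at_least(0xE));
    assert(!is_zhaoxin_exact(0x3) && !is_zhaoxin_at_least(0x3));
    /* A later stepping: the exact test would fall back to "nano". */
    assert(!is_zhaoxin_exact(0xF) && is_zhaoxin_at_least(0xF));
}
```

The two tests agree on every stepping except 0xF, which is the
case where the exact comparison would silently select "nano".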

-- 
Torbjörn
Please encrypt, key id 0xC8601622
_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel
