Hi,

On Fri, 4 January 2013 1:49 am, David Miller wrote:
> Just FYI, I'm also working on an mpn_mul_basecase that makes use of
> the T4 'mpmul' instruction which can do NxN 64-bit limb multiplies
> for values of N from 1 to 32.

Great! Maybe it can also be useful for mul_2 or higher.

> It's an instruction that seems like it was designed specifically for
> libgmp :-)

If it supports only balanced multiplication (NxN and not NxM), its
target is probably 2048-bit public-key crypto.

> I guess the ideal implementation would be to have gmp-mparam.h setup
> so that basecase only gets invoked for N <= 32.

With the current code we cannot impose such a restriction.
mpn_sqr_basecase is allowed to support only sizes smaller than the TOOM2
threshold, but mpn_mul_basecase must be able to handle unbalanced
operands, and the longer operand (the first) can be large.

Should we add a balanced-only mul_basecase_n function, to be used by
mul_n, to fully exploit such an instruction? Modular arithmetic (crypto,
ECM, etc.) can benefit from such an approach. How much faster than a
fully flexible mul_basecase would it be?

Best regards,
Marco

--
http://bodrato.it/

_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
http://gmplib.org/mailman/listinfo/gmp-devel
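
To make the balanced-only proposal above concrete, here is a minimal sketch in portable C of how a mul_n entry point could dispatch to a balanced-only basecase for small n while the fully flexible basecase stays available for unbalanced operands. This is not GMP code: the names limb_t, addmul_1, mul_basecase, mul_basecase_n, mul_n and BALANCED_BASECASE_LIMIT are all hypothetical, the limbs are 32-bit so that products fit in uint64_t (the T4 discussion concerns 64-bit limbs), and the balanced routine simply reuses the schoolbook loop where a hardware NxN multiply would presumably go.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;             /* toy limb size, an assumption of this sketch */

#define BALANCED_BASECASE_LIMIT 32   /* hypothetical cap, mirroring "N <= 32" */

/* rp[0..n-1] += up[0..n-1] * v; returns the carry-out limb. */
static limb_t addmul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* Cannot overflow: (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1. */
        uint64_t t = (uint64_t)up[i] * v + rp[i] + carry;
        rp[i] = (limb_t)t;
        carry = t >> 32;
    }
    return (limb_t)carry;
}

/* Fully flexible schoolbook product: {rp, un+vn} = {up, un} * {vp, vn},
   with un >= vn >= 1. The operands may be unbalanced. */
static void mul_basecase(limb_t *rp, const limb_t *up, size_t un,
                         const limb_t *vp, size_t vn)
{
    for (size_t i = 0; i < un; i++)
        rp[i] = 0;
    for (size_t j = 0; j < vn; j++)
        rp[un + j] = addmul_1(rp + j, up, un, vp[j]);
}

/* Balanced-only basecase: {rp, 2n} = {up, n} * {vp, n}.
   Placeholder body; a real version could rely on n <= BALANCED_BASECASE_LIMIT
   and use a dedicated NxN multiply here. */
static void mul_basecase_n(limb_t *rp, const limb_t *up, const limb_t *vp,
                           size_t n)
{
    mul_basecase(rp, up, n, vp, n);
}

/* Balanced entry point: small sizes go to the balanced-only routine. */
static void mul_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    if (n <= BALANCED_BASECASE_LIMIT)
        mul_basecase_n(rp, up, vp, n);
    else
        mul_basecase(rp, up, n, vp, n);  /* a real library would switch to Toom etc. */
}
```

In this arrangement, callers with equal-size operands (for example modular multiplication in crypto or ECM) would go through mul_n, while unbalanced products would keep calling mul_basecase directly, so the flexible routine never needs a size restriction.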