Go to http://shoup.net/ntl for full details. Below is the latest entry in the change log.
I've also updated the NTL/FLINT comparison document: http://shoup.net/ntl/benchmarks.pdf The only significant change is to the ZZ_pX factoring benchmarks (the new code in NTL speeds up mat_ZZ_p multiplication and indirectly CompMod and CanZass). ======= Changes in 10.3.0 ======= * Implementation of a multi-modular strategy for matrix multiplication over ZZ_p. Here are some benchmarks that compare the new strategy to the old (naive) one. Here, we consider n-by-n matrices modulo an n-bit prime. o n=128, speedup=6.8x o n=256, speedup=8.6x o n=512, speedup=9.3x o n=1024, speedup=18x o n=2048, speedup=37x I also compared NTL's new mat_ZZ_p multiplication to FLINT's fmpz_mat multiplication. The latter also uses a multi-modular strategy. For n=128,256,512,1024 as above, NTL's code is between 2.7 and 3.1 times faster than FLINT's (and we did not count the time it would take to reduce the entries mod p for the FLINT code). Part of this may be due to the AVX-enhanced small-prime matrix multiplication code used by NTL, and part may be due to better CRT code. * I also instrumented both the plain and multi-modular matrix multiplication for ZZ_p so that they are both "thread boosted" (i.e., will automatcally exploit multiple cores when possible). * As an initial application of this faster matrix multiplication, I implemented a new version of Brent/Kung modular composition for ZZ_pX, which is now between 2 and 5 times faster than the old one (depending on parameters). This is done with a new class called ZZ_pXNewArgument, which supersedes ZZ_pXArgument (retained for compatibility). This also makes the CanZass factoring algorithm for ZZ_pX faster (sometimes by a factor of almost 2, depending on parameters). It also makes CanZass more memory intensive (but while the overall memory usage increases, the memory access pattern is fairly cache friendly). * I would like to see if this faster matrix multiplication can be used to get faster linear algebra (determinants, inverses, more general Gaussian elimination) via reduction to matrix multiplication. If anyone wants to volunteer to work on this, please let me know. Presumably, the FLINT nmod_mat code could be repurposed for this. I won't have time to work on this for a few months, but would be glad to discuss ideas with anyone who wanted to do the work. Note that the "plain" versions of these routines also need some work. * I also added move methods to the Vec and Mat classes, and made a slight tweak to the Lazy class. -- You received this message because you are subscribed to the Google Groups "sage-devel" group. To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+unsubscr...@googlegroups.com. To post to this group, send email to sage-devel@googlegroups.com. Visit this group at https://groups.google.com/group/sage-devel. For more options, visit https://groups.google.com/d/optout.