T> What is MDU?

T> It is faster on all tried MIPS32R1/R2/R5 CPUs (see the c/l table) and is
T> expected to be fast with any pipelined MDU. So-called Area-Efficient MDU
T> (optional on some MCUs) will run it *much* slower (~3x for addmul_1).

T> What is 3x slower than what?

multiply-divide unit. they (MTI, then IMG, now Wave Computing) have [at least] four MDU kinds in their portfolio:

1) the "area-efficient" option, the only non-pipelined one: about 30 cycles per multiply with no early exit (multiply or multiply-add, it makes no difference here and below);
2) 32x16: can issue a 32x16 multiply-add every cycle, or one 32x32 op every second cycle;
3) 32x32 high-performance, non-DSP;
4) 32x32 DSP.

3 and 4 perform the same, both per the specs and as evident from the 24Kc (no DSP ASE) and 24KEc (implements DSP ASE) results.

as for the ~3x: naturally, on cores with MDU 1, three non-pipelined multiply ops per limb will be slower than the single multiply per limb of the MIPS-II code.

T> I took a quick look at the code. Do you use madd/msub for accumulation
T> here, while actual multiplication is done by multu? As MIPS lacks

not quite so. as MDU 2+ is pipelined, the "multu $xx, 1" idiom is used to quickly reset the accumulator: this is clearly shorter and faster than an mthi/mtlo pair on in-order cores, and *absolutely* critical on P5600 (and likely on 74K*, which has a similar pipeline). but the choice between maddu and multu depends on the subroutine, because the order matters; addmul_1 just happens to be more flexible than submul_1 in this respect.

T> It's long since I did any substantial work with MIPS, but it would
T> appear that, at least for addmul_1, madd could be used also for
T> multiplication. One should of course avoid creating a slow recurrency
T> path.

it is all about order: wrong placement of the carry accumulation drops P5600 performance to 14 c/l, so it is always at the end of the sequence. doing "multu *up, vl; maddu *rp, 1" is slightly faster than "multu *rp, 1; maddu *up, vl" on P5600, especially for N=1,2.

_______________________________________________
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel
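
for readers who do not have the assembly at hand, here is a rough C model of the ordering described above ("multu *up, vl; maddu *rp, 1", with the carry accumulated last). it is only a sketch: the function name addmul_1_model is made up, a 64-bit integer stands in for the HI:LO pair, and the instruction named in each comment is an assumed mapping, not a copy of the actual code. it does show why three multiply-type ops per limb suffice: the largest possible sum, (2^32-1)^2 + 2*(2^32-1) = 2^64-1, still fits in HI:LO.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit addmul_1: {rp,n} += {up,n} * vl.
   Returns the carry-out limb.  "acc" stands in for the HI:LO pair;
   the instruction noted on each line is an assumed mapping. */
uint32_t addmul_1_model(uint32_t *rp, const uint32_t *up, size_t n, uint32_t vl)
{
    uint32_t cy = 0;

    for (size_t i = 0; i < n; i++) {
        /* multu up[i], vl : overwrites the accumulator with the product */
        uint64_t acc = (uint64_t)up[i] * vl;

        /* maddu rp[i], 1 : adds the destination limb */
        acc += (uint64_t)rp[i] * 1u;

        /* maddu cy, 1 : carry from the previous limb, placed last */
        acc += (uint64_t)cy * 1u;

        rp[i] = (uint32_t)acc;          /* low half, as mflo would read it  */
        cy = (uint32_t)(acc >> 32);     /* high half, as mfhi would read it */
    }
    return cy;
}
```

the model says nothing about timing. the point of keeping the carry term last is that it is the only operand that depends on the previous limb, which is the recurrence path mentioned above.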