Re: [Sdcc-user] Even faster multiplication on Z80?

bodrato Mon, 30 Mar 2009 07:06:09 -0700

Dear Philipp,

> thanks for your help with Z80 multiplication.


Thank you for your answer!

> study, but it has been shown, that small integer values are far more
> common than bigger ones. AFAIR values 0,1,2 combined are about as likely
> to occur as all other values combined in an 8 bit integer.

I'm sure you are right, because "char" is often used to store "boolean"
values or as a counter in short loops. But the real question is: are those
values used as operands in multiplications?
If you use the byteXbyte product as a building block for some more complex
mathematical function, you usually have "random" bytes to handle. If you
use it to access elements of a two-dimensional array A[x+ROWLEN*y], then
the result must be small... and so must be the two operands.

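Much faster code for small operands is:

        ld      hl,#0
        ld      d,h             ; DE = E, zero-extended
        jp      3$
1$:
        add     hl,de           ; multiplier bit was set: add multiplicand
2$:
        rl      e               ; DE = DE * 2 (carry is clear here)
        rl      d
3$:
        srl     a               ; C = next multiplier bit, Z = no bits left
        jp      C,1$
        jp      NZ,2$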

It assumes the two operands are in A and E and puts the result in HL... It
is very fast when A<32, but the code is too big to be inlined...
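
To make the "fast for small operands" claim concrete, here is a minimal C
sketch of the same shift-and-add idea. It is only a model, not SDCC code:
the names mul8_early_exit and mul8_loop_count are invented for this
illustration. The loop runs once per bit up to the highest set bit of the
multiplier, so any A below 32 needs at most 5 passes.

```c
#include <stdint.h>

/* Model of an 8x8->16 shift-and-add multiply that stops as soon as
   no multiplier bits remain. Hypothetical names, not part of SDCC. */
uint16_t mul8_early_exit(uint8_t a, uint8_t e)
{
    uint16_t hl = 0;   /* running product */
    uint16_t de = e;   /* multiplicand, zero-extended to 16 bits */

    while (a != 0) {
        if (a & 1)                  /* low multiplier bit set */
            hl = (uint16_t)(hl + de);
        de = (uint16_t)(de << 1);   /* multiplicand * 2 */
        a >>= 1;                    /* next multiplier bit */
    }
    return hl;
}

/* Number of loop passes the model above makes for a given multiplier. */
unsigned mul8_loop_count(uint8_t a)
{
    unsigned n = 0;

    while (a != 0) {
        a >>= 1;
        n++;
    }
    return n;
}
```

For example, mul8_loop_count(31) is 5 while mul8_loop_count(255) is 8, which
is why the operand you expect to be small should go in A.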

>> Drawbacks:
>>  - overwrites the accumulator A
> For builtin code generation this is bad, since A is used to backup B
> (when B is in use).

If both DE and B are used, wrap into push bc/pop bc, (saving ld a,b/ld b,a).
If only B is, save it to D or E (same as current).
If only DE is in use, do nothing (save push/pop).

> You might want to have a look at the 16 bit multiplication in mul.s or

There is a faster way for 16x16->16, but it would be slower for
16x8->16... I have to think about it a little bit longer... anyway I can
suggest some very small corrections:

---------8<---------8<---------8<---------8<---------8<---------8<-----
--- sdcc/device/lib/z80/mul.s   2009-01-05 11:20:47.000000000 +0100
+++ ./mul.s     2009-03-30 15:30:08.000000000 +0200
@@ -29,11 +29,11 @@
        ;;   DE = multiplier
        ;;
        ;; Exit conditions
-       ;;   DE = less significant word of product
+       ;;   HL = less significant word of product
        ;;
        ;; Register used: AF,BC,DE,HL
 __mul16::
-        ld      hl,#0
+        ld      l,#0
         ld      a,b
         ; ld c,c
         ld      b,#16
@@ -41,7 +41,7 @@
         ;; Optimise for the case when this side has 8 bits of data or
         ;; less.  This is often the case with support address calls.
         or      a
-        jr      NZ,1$
+        jr      NZ,3$

         ld      b,#8
         ld      a,c
@@ -49,6 +49,7 @@
         ;; Taken from z88dk, which originally borrowed from the
         ;; Spectrum rom.
         add     hl,hl
+3$:
         rl      c
         rla                     ;DLE 27/11/98
         jr      NC,2$
---------8<---------8<---------8<---------8<---------8<---------8<-----
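
For readers who do not have mul.s at hand, the general scheme the routine
follows (scan the multiplier from the top bit, doubling the accumulator each
step, with a shortcut when the high byte of the multiplier is zero) can be
sketched in C. This is a model under my own assumptions, not a translation
of the patched file: the name mul16_msb_first is invented, the register
roles are simplified, and the model clears the whole accumulator rather than
reproducing the register-level savings of the patch.

```c
#include <stdint.h>

/* Model of an MSB-first 16x16->16 multiply with an 8-bit shortcut.
   Hypothetical name; "bc" is the scanned operand, "de" the added one. */
uint16_t mul16_msb_first(uint16_t bc, uint16_t de)
{
    uint16_t hl = 0;
    unsigned bits = 16;

    if ((bc >> 8) == 0) {            /* high byte empty: scan 8 bits only */
        bits = 8;
        bc = (uint16_t)(bc << 8);    /* move the low byte to the top */
    }
    while (bits--) {
        hl = (uint16_t)(hl << 1);    /* like add hl,hl */
        if (bc & 0x8000u)
            hl = (uint16_t)(hl + de);/* like add hl,de */
        bc = (uint16_t)(bc << 1);    /* expose the next bit */
    }
    return hl;                       /* low 16 bits of the product */
}
```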

> I wonder how these would compare at device/lib/_mulllong.c

This is 32x32->32, isn't it? It would be nice to write an ad-hoc function
for this, if one can use BC,DE,HL,BC',DE',HL',IX,IY,A,A'... there is a lot
of register space to work on... Maybe also 32x32->64 is possible.
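
As a point of reference, one common way to build 32x32->32 from narrower
products is to split each operand into 16-bit halves; the high*high term
falls entirely outside the low 32 bits and can be dropped. The C sketch below
only illustrates that decomposition. It is not the contents of _mulllong.c
and not a proposal for the register allocation; the name mul32_from_16 is
invented here.

```c
#include <stdint.h>

/* 32x32->32 from 16-bit halves (hypothetical helper, for illustration):
   a*b mod 2^32 = a_lo*b_lo + ((a_lo*b_hi + a_hi*b_lo) << 16) mod 2^32 */
uint32_t mul32_from_16(uint32_t a, uint32_t b)
{
    uint16_t a_lo = (uint16_t)a, a_hi = (uint16_t)(a >> 16);
    uint16_t b_lo = (uint16_t)b, b_hi = (uint16_t)(b >> 16);

    /* Full 16x16->32 product of the low halves. */
    uint32_t low = (uint32_t)a_lo * b_lo;

    /* Cross terms: only their low 16 bits survive the shift,
       so wrap-around in this sum is harmless. */
    uint32_t cross = (uint32_t)a_lo * b_hi + (uint32_t)a_hi * b_lo;

    return low + (cross << 16);
}
```

So three multiplications are enough, and two of them only need a 16-bit
result.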

Regards,
Marco

-- 
http://bodrato.it/


