> > > ld hl,DATA_AREA
> > > ld de,DATA_AREA+1
> > > ld bc,4096-1
> > > ld (hl),0
> > > ldir
> >
> > Yes, it is the faster method to fill a memory area.
>
> It's surely the smallest way to fill a memory area,
> but I believe there is a faster method:
>
> ld hl,DATA_AREA
> ld de,DATA_AREA+1
> ld (hl),0
> rept 4095
> ldi
> endm
>
> Remember that LDI=16+2 clocks and LDIR=21+2 clocks
> except for the final iteration, where LDIR is just like LDI.

Why don't you do, for example,

ld hl,DATA_AREA
ld de,DATA_AREA+1
ld bc,4096        ; copy count, has to be a multiple of 1024
ld (hl),0
loop:
rept 1024
ldi
endm
jp pe,loop        ; ldi leaves P/V set (PE) as long as BC <> 0

Of course, in this case the copy count has to be a multiple of 1024. The real
count here is 4095, so it gets rounded up to 4096, and there has to be one
additional dummy byte trailing the to-be-cleared data area. In general, though,
this method is a much better tradeoff between size and speed, especially when
copying large blocks of data. The size of this routine drops from about
8 kbyte (!!!) to about 2 kbyte (don't forget, ldi is a 2-byte instruction),
while the execution time increases by only about 40 T-states (4 conditional
jumps), plus the one extra ldi...

You can of course decrease the ldi repeat count to, for example, 16, which then
uses only about 45 bytes. In that case you still gain 4.375 T-states per byte
copied over ldir (21 - (16 + 10/16) = 4.375), compared to the 5 T-states (out
of 21) that a fully unrolled ldi gives you, which makes it a 21% speedup
instead of 24%. This small tradeoff, however, results in a 99.5% reduction in
size.
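
For what it's worth, the 16x variant might look something like the sketch
below. It is untested. The label name loop16 is made up, the count in BC is my
assumption (4095 rounded up to 4096 so that it is a multiple of 16, which again
needs the dummy byte), and the T-state counts in the comments are the plain Z80
figures without the MSX M1 wait states:

ld hl,DATA_AREA
ld de,DATA_AREA+1
ld bc,4096        ; assumed count: 256 passes of 16 ldi's
ld (hl),0         ; seed byte that the ldi chain propagates
loop16:
rept 16
ldi               ; 16 T-states each, BC = BC - 1
endm
jp pe,loop16      ; 10 T-states, taken while BC <> 0
; per pass: 16*16 + 10 = 266 T-states, so 266/16 = 16.625 per byte
; size: 3+3+3+2 + 16*2 + 3 = 46 bytes

With the MSX wait states counted in (ldi 18, ldir 23, jp 11) this comes to
about 18.7 against 23 T-states per byte, so the picture stays roughly the same.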


Ha! I am now finished. :)


~Grauw



--
For info, see http://www.stack.nl/~wynke/MSX/listinfo.html
