Hi.

On 16.09.2010 at 14:39, pito wrote:
..
> PS: I am still wondering why the amforth overhead is so big. From
> my naive measurements a typical forth word takes ~7us, give or
> take.
> This is about 175 clock cycles @25MHz, or approx. 100 instructions -
> could it be that much? Just a stupid Q. Pito

I have a real time clock (RTC) implemented, so I can use time@ to put
the current time (seconds, minutes, hours) on the stack. With these
definitions:

new
: one ;
: tt0 1000 0 do     loop ;
: tt1 1000 0 do one loop ;
: tt20  0 do tt0 loop ;
: tt21  0 do tt1 loop ;
time@ 10000 tt20 time@ 10000 tt21 time@ .s

I get:
» time@ 10000 tt20 time@ 10000 tt21 time@ .s
0 1181 26
1 1183 6
2 1185 0
3 1187 26
4 1189 5
5 1191 0
6 1193 57
7 1195 4
8 1197 0

00:04:57
00:05:26  --> 29s for 10^7 passes of the empty loop
00:06:26  --> 60s for 10^7 passes of the loop calling one

» time@ 10000 tt20 time@ 10000 tt21 time@ .s
0 1181 18
1 1183 40
2 1185 16
3 1187 19
4 1189 39
5 1191 16
6 1193 49
7 1195 38
8 1197 16

16:38:49
16:39:19  --> 30s for 10^7 passes of the empty loop
16:40:18  --> 59s for 10^7 passes of the loop calling one
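
The conversion from the two .s dumps to elapsed seconds can be checked with a short script. This is only a sketch: the cell order (top of stack first: seconds, minutes, hours) is read off the listings above, the helper names are made up for illustration, and a run that crosses midnight is not handled.

```python
# Turn a `.s` dump of three time@ results into elapsed seconds.
# ASSUMPTION: each time@ leaves seconds on top, then minutes, then hours,
# as the dumps above suggest. Helper names are hypothetical.

def to_seconds(sec, minute, hour):
    return hour * 3600 + minute * 60 + sec

def elapsed(dump):
    """dump: the nine values printed by .s, top of stack (index 0) first."""
    # Cells 6..8 hold the oldest stamp, cells 0..2 the newest one.
    start, after_empty, after_one = (
        to_seconds(*dump[i:i + 3]) for i in (6, 3, 0)
    )
    return after_empty - start, after_one - after_empty

run1 = [26, 6, 0, 26, 5, 0, 57, 4, 0]
run2 = [18, 40, 16, 19, 39, 16, 49, 38, 16]

for name, dump in (("run 1", run1), ("run 2", run2)):
    empty_s, one_s = elapsed(dump)
    print(f"{name}: empty loop {empty_s} s, loop calling one {one_s} s")
```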

Subtracting the empty loop leaves about 30s for 10^7 calls of one,
or 3us for entering and leaving a single word.
On a 20MHz atmega168 one instruction cycle takes 0.05us.
So I get 3 / 0.05 = 60 cycles for 'one word'.
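
The same arithmetic, written out for both runs. A minimal sketch: the function name is made up, and the only inputs are the loop times and the 20MHz clock given above. Since the time stamps resolve whole seconds only, each result is uncertain by a few cycles.

```python
F_CPU_HZ = 20_000_000          # ATmega168 clock from the mail
ITERATIONS = 10_000 * 1_000    # 10000 outer passes x 1000 inner passes

def cost_per_call(empty_s, one_s, f_cpu=F_CPU_HZ, n=ITERATIONS):
    """Return (microseconds, cycles) spent on one call of the word."""
    extra_s = one_s - empty_s              # time added by calling the word
    return extra_s * 1e6 / n, extra_s * f_cpu / n

results = [cost_per_call(29, 60), cost_per_call(30, 59)]
for us, cycles in results:
    print(f"{us:.1f} us per call = {cycles:.0f} cycles")

mean_cycles = sum(c for _, c in results) / len(results)
print(f"mean: {mean_cycles:.0f} cycles")
```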


The inner interpreter is (last two columns: cycles per instruction
and running total):
...
           DO_COLON:
C:001c0a 93bf          push XH                          2 2
C:001c0b 93af          push XL          ; PUSH IP       2 4
C:001c0c 01db          movw XL, wl                      1 5
C:001c0d 9611          adiw xl, 1                       2 7
           DO_NEXT:
           .endif
C:001c0e 01fd          movw zl, XL        ; READ IP     1 8
C:001c0f   +      readflashcell wl, wh                  
C:001c0f 0fee      lsl zl                               1 9
C:001c10 1fff      rol zh                               1 10
C:001c11 9165      lpm wl, Z+                           3 13
C:001c12 9175      lpm wh, Z+                           3 16
C:001c13 9611          adiw XL, 1        ; INC IP       2 18

           DO_EXECUTE:
C:001c14 01fb          movw zl, wl                      1 19
C:001c15   +      readflashcell temp0,temp1
C:001c15 0fee      lsl zl                               1 20
C:001c16 1fff      rol zh                               1 21
C:001c17 9105      lpm temp0, Z+                        3 24
C:001c18 9115      lpm temp1, Z+                        3 27
C:001c19 01f8          movw zl, temp0                   1 28
C:001c1a 9409          ijmp                             2 30


; ( -- ) Compiler
; R( xt -- )
; end of current colon word
VE_EXIT:
     .dw $ff04
     .db "exit"
     .dw VE_HEAD
     .set VE_HEAD = VE_EXIT
XT_EXIT:
     .dw PFA_EXIT
PFA_EXIT:
     pop XL                                             2 32
     pop XH                                             2 34
     rjmp DO_NEXT                                       2 36

A total of 36 cycles, right?

So where does amforth spend the other 24 cycles? On average that is
the overhead caused by the ISR of my RTC time ticker, I guess.
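
The tally can be redone per label with a few lines of script, which also makes it easy to try other ways of counting. The cycle values are copied from the listing above. The second total is an assumption that is not stated in the mail: it counts the dispatch (DO_NEXT plus DO_EXECUTE) twice, once to reach DO_COLON of one and once more to reach the code of the exit compiled by ;.

```python
# Cycle counts copied from the listing above, grouped by label.
SECTIONS = {
    "DO_COLON":   [2, 2, 1, 2],           # push, push, movw, adiw
    "DO_NEXT":    [1, 1, 1, 3, 3, 2],     # movw, lsl, rol, lpm, lpm, adiw
    "DO_EXECUTE": [1, 1, 1, 3, 3, 1, 2],  # movw, lsl, rol, lpm, lpm, movw, ijmp
    "EXIT":       [2, 2, 2],              # pop, pop, rjmp
}
MEASURED_CYCLES = 60                      # from the timing runs

per_section = {name: sum(cycles) for name, cycles in SECTIONS.items()}
linear_total = sum(per_section.values())  # one pass down the listing
print(per_section)
print("linear total:", linear_total,
      "gap to measurement:", MEASURED_CYCLES - linear_total)

# Alternative accounting. ASSUMPTION, not from the mail: the dispatch runs
# once for `one` itself and once more for the `exit` inside it.
dispatch = per_section["DO_NEXT"] + per_section["DO_EXECUTE"]
two_dispatch_total = (dispatch + per_section["DO_COLON"]
                      + dispatch + per_section["EXIT"])
print("with two dispatches:", two_dispatch_total)
```

Whether the second count or the timer ISR explains the gap is best settled by measurement, for example with the timer interrupt switched off.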

To see a word "as is", connect an oscilloscope to a port pin. Let an
empty loop toggle your port pin. Then add your word to the loop and
run again.
The difference between the two resulting frequencies gives you the
execution time of your word.
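
Turning the two scope readings into a time takes one small step, since it is the loop periods that have to be subtracted. A sketch under the assumption that the loop toggles the pin once per pass, so one full square wave spans two passes. The function name and the example frequencies are hypothetical, not measurements.

```python
def word_time_us(f_empty_hz, f_with_word_hz, passes_per_period=2):
    """Execution time of the added word in microseconds.

    ASSUMPTION: the pin is toggled once per loop pass, so one period of
    the square wave on the scope covers two passes.
    """
    pass_empty_s = 1.0 / (f_empty_hz * passes_per_period)
    pass_with_word_s = 1.0 / (f_with_word_hz * passes_per_period)
    return (pass_with_word_s - pass_empty_s) * 1e6

# Hypothetical scope readings, chosen only to illustrate the formula.
f_empty = 100_000.0    # Hz, empty loop
f_word = 62_500.0      # Hz, loop with the word added
print(f"{word_time_us(f_empty, f_word):.2f} us")
```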

Michael

_______________________________________________
Amforth-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/amforth-devel