From ChangeLog:

  - "Kevin P. Lawton" <[EMAIL PROTECTED]>: Fri Jan 12 15:54:22 EST 2001
    More enhancements to dt-testbed/proto2, and more notes in the README.

From dt-testbed/proto2/README:

The workload before each guest branch instruction was previously only
a NOP.  I changed this to generate guest code with a small tight-loop
section before each out-of-page branch.  I chose a cascading add loop
to make sure the CPU was kept busy and couldn't parallelize the adds
(and thus compress the time spent in the loop).  You can set the number
of tight-loop iterations with 'DT_MicroLoops' in dt.h.  Some results:

                       Native       DT     factor (DT/Native)
  DT_MicroLoops==  5:   0.89      1.46      1.64
  DT_MicroLoops== 10:   2.02      2.50      1.24
  DT_MicroLoops==100:  17.62     18.08      1.03
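
For illustration, here is a rough C sketch of what a cascading add loop
looks like.  It is not the code proto2 generates (that is guest machine
code); the function name and the two-register scheme are assumptions made
up for this example.  The point is only that each add depends on the
result of the previous one, so the adds cannot be overlapped.

```c
#include <stdint.h>

/* Hypothetical sketch of a cascading (dependent-chain) add loop.
 * 'loops' plays the role of DT_MicroLoops.  Every add needs the result
 * of the add before it, so the chain must execute serially. */
static uint32_t cascading_add(unsigned loops)
{
    uint32_t a = 1;
    uint32_t b = 1;

    for (unsigned i = 0; i < loops; i++) {
        a += b;   /* depends on b from the previous iteration */
        b += a;   /* depends on the a just computed */
    }
    return b;     /* returning the result keeps the chain live */
}
```

Note that an optimizing C compiler is free to transform a loop like this,
which is one reason to emit the instruction sequence directly when the
goal is a fixed amount of work per branch.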

These numbers don't account for dynamic (computed) branches, which are
used by dense switch statements, C++ virtual functions (and C function
pointers), and, implicitly via the stack, by the return statement.
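
As a reminder of where such branches come from, here is a small C sketch.
All names in it are invented for the example.  Whether a given switch
becomes a jump table (an indirect jump) is up to the compiler, so treat
that part as typical behavior rather than a guarantee.

```c
/* Hypothetical examples of constructs that usually compile to
 * dynamic (computed) branches. */

typedef int (*op_fn)(int);

static int inc(int x) { return x + 1; }
static int dbl(int x) { return x * 2; }
static int neg(int x) { return -x; }

/* Call through a function pointer: the target is only known at
 * run time, so this is an indirect call. */
static int apply(op_fn fn, int x)
{
    return fn(x);
}

/* Dense switch: compilers often lower this to a jump table indexed
 * by 'op', which means an indirect jump. */
static int dispatch(int op, int x)
{
    switch (op) {
    case 0:  return inc(x);
    case 1:  return dbl(x);
    case 2:  return neg(x);
    case 3:  return x;
    default: return 0;
    }
    /* Each 'return' above also ends in a computed branch: the target
     * address is taken from the stack. */
}
```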

Anyway, you can see that as the real workload executed between
overhead instructions (like out-of-page branches) increases
(approaching more realistic code), the overhead factor improves
significantly, from 1.64 at 5 iterations to 1.03 at 100.  Even low
tight-loop iteration counts yield pretty good performance.
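
The arithmetic behind the factor column can be checked with a few lines
of C.  The helper names below are made up for this sketch, and the README
does not state the units of the timings, so the values are treated as
plain numbers.  The fixed-overhead model in predicted_factor() is an
assumption used to explain the trend, not something taken from the
proto2 code.

```c
/* Hypothetical helpers for reading the results table. */

/* Overhead factor as reported in the table: DT time / native time. */
static double overhead_factor(double native, double dt)
{
    return dt / native;
}

/* Absolute cost added by DT for one table row. */
static double absolute_overhead(double native, double dt)
{
    return dt - native;
}

/* Assumed model: DT adds a roughly fixed cost on top of the native
 * time, so factor = 1 + fixed_cost / native, which tends to 1 as the
 * workload between branches grows. */
static double predicted_factor(double native, double fixed_cost)
{
    return 1.0 + fixed_cost / native;
}
```

Plugging in the table rows gives absolute overheads of about 0.57, 0.48
and 0.46, which is fairly flat while the native time grows by a factor
of roughly 20.  That is consistent with a mostly fixed per-branch cost
being amortized over more work.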

Dynamic branches should probably be looked at next...

-Kevin
