Profiling PMCs

Melvin Smith Sat, 18 May 2002 16:10:23 -0700

I decided to do some profiling and tinkering, and I picked the PerlInt class
since it's one of the most common. There is a large gap between our
MOPS benchmarks when using the plain INT registers as opposed to
the PMC regs.


There seems to be much room for optimization in the PMC virtual
methods, even if it means inlining code and breaking OOP reuse
a bit.

For example, profiling mops_p.pbc gives the following...

Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total
  time   seconds   seconds    calls  ms/call  ms/call  name
  52.73      2.32     2.32        1  2320.00  2320.00  cg_core
  17.95      3.11     0.79                             Parrot_PerlInt_subtract_int
  13.41      3.70     0.59                             Parrot_PerlInt_get_bool
   8.86      4.09     0.39                             Parrot_PerlInt_subtract_bint
   7.05      4.40     0.31                             Parrot_PerlInt_set_integer_native
   0.00      4.40     0.00      642     0.00     0.00  add_to_free_pool
   0.00      4.40     0.00       60     0.00     0.00  mem_sys_allocate
   0.00      4.40     0.00       29     0.00     0.00  mem_allocate


So taking a look at PerlInt, I notice that subtract_int calls the
method set_integer_native. Now, the reasons for encapsulating the
"setting" of an object's core value are not lost on me; however, I
think the core PMCs are one place where we should inline as much as
possible. For now, all set_integer_native does is set the value, and
I can't think of many other things to encapsulate inside it.

There is the case where we might derive from PerlInt and the
child class reimplements its set_integer_native, but I'm not sure
that this OOP constraint should dictate losing about 10 million
ops/sec, as it does on my box.

Going from:

     void subtract_int (INTVAL value, PMC* dest) {
         dest->vtable->set_integer_native(INTERP, dest,
             SELF->cache.int_val - value);
     }

To:

     void subtract_int (INTVAL value, PMC* dest) {
         dest->cache.int_val = SELF->cache.int_val - value;
     }


Takes mops_p.pbc from:

     Stock:
     Elapsed time:  4.342769
     M op/s:        46.053566

To:

     Modified:
     Elapsed time:  3.619222
     M op/s:        55.260495
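
As a quick check on those two runs, the figures can be compared directly. This is only a small sketch; the helper names ratio and print_speedup are made up for illustration, and the numbers are the ones quoted above.

```c
#include <stdio.h>

/* Ratio of two measurements; a result above 1.0 means the first is larger. */
double ratio(double a, double b) {
    return a / b;
}

/* Prints the relative change between the stock and modified runs. */
void print_speedup(void) {
    const double stock_secs = 4.342769, stock_mops = 46.053566;
    const double mod_secs   = 3.619222, mod_mops   = 55.260495;

    printf("throughput ratio (modified/stock): %.3f\n",
           ratio(mod_mops, stock_mops));
    printf("time ratio (stock/modified):       %.3f\n",
           ratio(stock_secs, mod_secs));
    printf("gain in M op/s:                    %.3f\n",
           mod_mops - stock_mops);
}
```

Both ratios come out at about 1.2, so the change is worth roughly 20% on this benchmark, or a little over 9 M op/s.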


Granted, doing this means that if I make a new PMC that uses the PerlInt
vtable and override set_integer_native(), the rest of the inlined ops
would need overriding as well. Realistically, though, I'm not concerned
that the PMC internals be an elegant OOP framework when it means losing
this much performance.
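
To make that trade-off concrete, here is a toy sketch. It is not Parrot's real PMC layout: the one-slot VTABLE, the set_calls counter, counting_set, and the two subtract_int_* names are all assumptions for illustration, and the interpreter argument is left out. The "child" class overrides the setter to do some bookkeeping; the dispatching version honors the override, the inlined one silently skips it.

```c
typedef long INTVAL;
typedef struct PMC PMC;

/* Toy vtable with a single slot; a real vtable has many more. */
typedef struct VTABLE {
    void (*set_integer_native)(PMC *self, INTVAL value);
} VTABLE;

struct PMC {
    const VTABLE *vtable;
    INTVAL int_val;   /* stands in for cache.int_val */
    int set_calls;    /* bookkeeping only the overriding setter updates */
};

/* Base behavior: just store the value. */
static void base_set(PMC *self, INTVAL value) {
    self->int_val = value;
}

/* Hypothetical child override: store the value and count the call. */
static void counting_set(PMC *self, INTVAL value) {
    self->int_val = value;
    self->set_calls++;
}

const VTABLE base_vtable     = { base_set };
const VTABLE counting_vtable = { counting_set };

/* Goes through the vtable, so a child's setter is respected. */
void subtract_int_dispatch(PMC *self, INTVAL value, PMC *dest) {
    dest->vtable->set_integer_native(dest, self->int_val - value);
}

/* Writes the field directly, so any overriding setter is bypassed. */
void subtract_int_inline(PMC *self, INTVAL value, PMC *dest) {
    dest->int_val = self->int_val - value;
}
```

With a PMC built on counting_vtable, the dispatching version bumps set_calls on each subtract and the inlined one leaves it untouched, while both store the same result. That is the behavior a child class would have to restore by overriding subtract_int itself.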

What if we specify certain PMC classes as "final"? All that might mean is
that, if you derive from a final PMC, you must expect:

1) To reimplement most of the vtable
2) To not rely on that PMC's interface never changing
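
One way the "final" idea could be enforced is sketched below. Everything here is hypothetical: the is_final flag, derive_vtable, the two-slot VTABLE, and the perlint_* functions are made up for illustration, and nothing like this exists in the tree. The sketch is also stricter than point 1, in that it demands every slot rather than "most" of them, simply to keep the check short.

```c
#include <stddef.h>

typedef long INTVAL;
typedef struct PMC PMC;

typedef struct VTABLE {
    int is_final;  /* hypothetical: slots may inline each other */
    void (*set_integer_native)(PMC *self, INTVAL value);
    void (*subtract_int)(PMC *self, INTVAL value, PMC *dest);
} VTABLE;

struct PMC {
    const VTABLE *vtable;
    INTVAL int_val;
};

/* Sample slot implementations for a "final" PerlInt-like class. */
void perlint_set(PMC *self, INTVAL value) {
    self->int_val = value;
}

void perlint_subtract_int(PMC *self, INTVAL value, PMC *dest) {
    dest->int_val = self->int_val - value;  /* inlined, no dispatch */
}

/*
 * Fills `out` with a child vtable. Slots left NULL in `overrides` are
 * inherited from `parent`, unless the parent is final, in which case
 * every slot must be supplied. Returns 0 on success, -1 on refusal.
 */
int derive_vtable(const VTABLE *parent, const VTABLE *overrides,
                  VTABLE *out) {
    if (parent->is_final &&
        (overrides->set_integer_native == NULL ||
         overrides->subtract_int == NULL)) {
        return -1;
    }
    out->is_final = 0;
    out->set_integer_native = overrides->set_integer_native
        ? overrides->set_integer_native : parent->set_integer_native;
    out->subtract_int = overrides->subtract_int
        ? overrides->subtract_int : parent->subtract_int;
    return 0;
}
```

Deriving from a final parent with only some slots filled in is refused, which turns the "reimplement the vtable" expectation into a check at class-setup time instead of a surprise at run time.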


Yeah, I know that word is yucky and from Java land, but in this case I
think that "system" PMCs should take liberties for optimization.

-Melvin
