Summary: it's slower :-(

:(

   Calculating the flag's position in the pool in pobject_lives() and
   free_unused_pobjects() takes more time than the smaller cache footprint
   gains. Two reasons: positions have to be calculated twice and the cache
   is more stressed with other things, IMHO.

Hmm... the first reason, a second bit of pointer arithmetic, seems
surprising, cycles being sooo much cheaper than cache misses.  So I
modified the tpmc test with a second calc, plus two extra function
calls (to a separately compiled file and back) to make sure it wasn't
optimized away.  The two real test cases (linear flag-only walk, and
random PMC->flag) were fine (unchanged and perhaps 1/3 slower,
respectively), though the fast toy case of linear PMC->flag was 5x
slower (still faster than the equivalents).  So it's not the first
reason.
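
For reference, the kind of calc under discussion (PMC pointer -> pool
via masking, then pool -> flag slot) looks roughly like the sketch
below.  It is an illustration only: the pool size, the struct layout,
and every name in it (POOL_SIZE, Pool, pool_new, pool_of, flag_of,
FLAG_LIVE) are assumptions, not taken from the patch or from Parrot.

```c
#include <stdint.h>
#include <stdlib.h>

/* All sizes and names below are hypothetical, chosen for illustration. */
#define POOL_SIZE 4096u   /* power of two; every pool is aligned to it */
#define PMC_SIZE  32u     /* assumed size of one PMC body */
#define N_PMCS    100u    /* PMCs per pool, so flags + bodies fit */
#define FLAG_LIVE 0x01u

typedef struct PMC { unsigned char body[PMC_SIZE]; } PMC;

/* gc flags sit together at the front of the pool, apart from the
 * PMC bodies, so a flag-only walk touches few cache lines. */
typedef struct Pool {
    unsigned char flags[N_PMCS];
    PMC           pmcs[N_PMCS];
} Pool;

_Static_assert(sizeof(Pool) <= POOL_SIZE, "pool must fit its alignment");

/* Allocate one pool on a POOL_SIZE boundary and clear its flags. */
Pool *pool_new(void)
{
    Pool *pool = aligned_alloc(POOL_SIZE, POOL_SIZE);
    if (pool != NULL) {
        for (size_t i = 0; i < N_PMCS; i++)
            pool->flags[i] = 0;
    }
    return pool;
}

/* pmc->pool mapping: mask off the low bits of the PMC pointer. */
Pool *pool_of(const PMC *p)
{
    return (Pool *)((uintptr_t)p & ~(uintptr_t)(POOL_SIZE - 1u));
}

/* The position calc: index of the PMC within its pool selects the flag. */
unsigned char *flag_of(const PMC *p)
{
    Pool  *pool = pool_of(p);
    size_t idx  = (size_t)(p - pool->pmcs);
    return &pool->flags[idx];
}
```

Marking an object live is then `*flag_of(p) |= FLAG_LIVE;`, which costs
a mask, a subtract, and a divide by the object size (a shift when the
size is a power of two), and reads nothing from the PMC body itself.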

That leaves the cache being stressed by other things.
Do we have any candidates?

I'd expect some interference effects between flag arrays, given _lots_
of arrays and random access.  I'm not sure the stressX benchmarks are
"lots" enough.  But while this interference might be worse in reality
than in the test program, it should still be much less than for
touching PMCs (say by 10x).  So that doesn't seem a likely candidate.

Is the gc run doing anything memory intensive aside from the flag fiddling?
I don't suppose it is still touching the PMC bodies for any reason?

Puzzled,
Mitchell
("[..] in realiter"?)


   Message-ID: <[EMAIL PROTECTED]>
   Date: Wed, 08 Jan 2003 15:00:38 +0100
   From: Leopold Toetsch <[EMAIL PROTECTED]>
   To: [EMAIL PROTECTED]
   Cc: P6I <[EMAIL PROTECTED]>, Dan Sugalski <[EMAIL PROTECTED]>
   Subject: Re: More thougths on DOD
   References: <[EMAIL PROTECTED]>
       <[EMAIL PROTECTED]>

   Mitchell N Charity wrote:

   > The attached patch adds a scheme where:
   >  - gc flags are in the pool, and
   >  - pmc->pool mapping is done with aligned pools and pmc pointer masking.
   > 
   > Observations:
   >  - It's fast.  (The _test_ is anyway.)  


   I did try it and some more in realiter.

   Summary: it's slower :-(
   Calculating the flag's position in the pool in pobject_lives() and
   free_unused_pobjects() takes more time than the smaller cache footprint
   gains. Two reasons: positions have to be calculated twice and the cache
   is more stressed with other things, IMHO.

   There seems to be remaining only: smaller PMCs for scalars.

   leo
