At 11:46 AM -0700 7/21/02, Scott Walters wrote:
>In all other cases, as it currently stands, it is a loss. I hate to come out
>and be so blunt, but I think this is the culmination of what I've found, and a
>lot of people are saying. This approach is penny wise, pound foolish.

Being blunt since I'm pressed for time (couldn't you folks have held 
off until after TPC? Sheesh... :), you're insufficiently considering 
the common cases. And you're apparently misinformed about some of the 
current capabilities of the interpreter and the reasons behind the 
design.

>   1) KEY *'s and their atoms are allocated from memory. The memory allocation
>      hurts more than it saves, cycle wise.
>      The number of CPU cycles used doing a virtual machine instruction *pales*
>      in comparison to what is needed to allocate memory.

Wrong. Keys are fixed-size objects, and perfect candidates for the 
object allocator, which is screamingly fast.

Yes, malloc sucks, which is why we almost never use it. Go read 
resources.c before continuing. We call the system allocator only if 
we need another memory pool to satisfy variable-sized allocations 
from, or if we need a new arena for fixed-sized objects.
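
To make the distinction concrete, here's a minimal sketch of the
scheme (invented names and layout, *not* the actual resources.c
code): fixed-size things like keys come out of per-size pools, and
the system allocator only gets touched when a whole arena runs dry.

    #include <stdlib.h>

    #define ARENA_OBJECTS 4096      /* objects per arena, picked arbitrarily */

    typedef struct arena {
        struct arena *next;         /* chain of arenas in this pool          */
        size_t        used;         /* objects already handed out            */
        char          store[];      /* ARENA_OBJECTS * object_size bytes     */
    } arena;

    typedef struct pool {
        size_t  object_size;        /* every object in a pool is one size    */
        arena  *arenas;             /* newest arena first                    */
        void   *free_list;          /* recycled objects, linked through      */
    } pool;                         /* their first pointer-sized slot        */

    static void *pool_get(pool *p) {
        if (p->free_list) {                    /* fast path: reuse a dead one */
            void *obj    = p->free_list;
            p->free_list = *(void **)obj;
            return obj;
        }
        if (!p->arenas || p->arenas->used == ARENA_OBJECTS) {
            /* The only place the system allocator is called.
               (Out-of-memory handling elided for brevity.)   */
            arena *a  = malloc(sizeof(arena) + ARENA_OBJECTS * p->object_size);
            a->next   = p->arenas;
            a->used   = 0;
            p->arenas = a;
        }
        return p->arenas->store + p->object_size * p->arenas->used++;
    }

    static void pool_put(pool *p, void *obj) { /* recycling is two stores     */
        *(void **)obj = p->free_list;
        p->free_list  = obj;
    }

Handing out a key-sized object is a free-list pop or a pointer bump;
malloc() only shows up once per few thousand objects, and recycled
objects never hit it at all.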

>   2) Constants are very seldom used as array indexes, in real code, and almost
>      never used exclusively in multidim access. If you had an x*y*z sized
>      array, it would take x*y*z statements to access all of its members!
>      That's hardly an expressive language =)

You're making a fundamental mistake here in assuming that changing 
variable *contents* requires changing any key structure. This:

    for my $i (1..1000) {
      for my $j (1..1000) {
        for my $k (1..1000) {
          @foo[$i;$j;$k]++;
        }
      }
    }

requires the construction of exactly *one* key structure, and 
requires that it be updated exactly *once* for the entire loop. 
Even with a stupid optimizer, we only need to construct it once per 
full invocation of the inner loop, not once every time through the 
inner loop.

PMCs don't move. Keys can cache pointers to them. This isn't a problem.
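
To make that concrete, here's a rough sketch of the idea
(hypothetical structures, *not* the real key layout): a key is a
short chain of fixed-size components, and a component can refer to
an integer register instead of holding a constant, so the loop
counters can change every iteration without the key ever being
rebuilt.

    typedef enum { KEY_INT_CONST, KEY_INT_REG } key_kind;

    typedef struct key_component {
        key_kind kind;
        union {
            long int_const;           /* literal index                 */
            int  int_reg;             /* number of an integer register */
        } u;
        struct key_component *next;   /* next dimension, or NULL       */
    } key_component;

    /* Resolving one component is just a register lookup: no allocation,
       no key rebuilding, even though $i, $j and $k change every pass.  */
    static long key_index(const key_component *k, const long *int_regs) {
        return k->kind == KEY_INT_REG ? int_regs[k->u.int_reg]
                                      : k->u.int_const;
    }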

>   3) When several PMCs recurse to handle a KEY * request, the system stack
>      has everything pop'd off and back on once for each member of the KEY *
>      list! If nothing else, refactor this into an iterative thing rather than
>      a recursive thing! Create a register that holds KEY * and, for god's
>      sake, get everything off the system stack.

So what? Recursion's not bad. Besides, aggregates may well be 
multidimensional, in which case the recursion simply follows the 
nesting: one level per dimension, not one per element.
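
Sketched roughly, reusing the hypothetical key_component above
(again, not the real vtable code): a multidimensional aggregate
peels one component off the chain per level and hands the rest to
the next dimension.

    typedef struct md_array md_array;
    struct md_array {
        int leaf;                 /* 1 if this level holds the values       */
        union {
            long      *values;    /* leaf storage                           */
            md_array **subs;      /* one sub-array per index at this level  */
        } u;
    };

    static long md_get(const md_array *a, const key_component *k,
                       const long *int_regs) {
        long idx = key_index(k, int_regs);
        if (a->leaf)
            return a->u.values[idx];
        return md_get(a->u.subs[idx], k->next, int_regs); /* next dimension */
    }

Three dimensions means three shallow stack frames per access, which
is hardly the sort of system-stack churn worth worrying about.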

>Given your objectives of speed, generality and elegance,

I should point out that elegance appears third in your list. (It's 
fourth or fifth on mine.)

>I could propose
>several other solutions:
>
>1. PMCs spit out an "inner class" of themselves on demand,

This requires constructing PMCs on the fly to represent elements 
inside multidimensional aggregates (and, if we toss key access onto 
all the non-fetch/store vtable entries as proposed, all the elements 
of uni-dimensional packed aggregates).

Not doing this is the reason we have multidimensional aggregates and 
packed aggregates in the first place.
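
For illustration only (a made-up layout, nothing like the actual PMC
internals): the whole point of a packed aggregate is that its
elements are raw machine values, so keyed access never has to
conjure up a per-element PMC.

    /* Elements are plain longs; there's no per-element PMC to build. */
    typedef struct packed_int_array {
        size_t size;
        long  *data;
    } packed_int_array;

    static long packed_get(const packed_int_array *a,
                           const key_component *k, const long *int_regs) {
        return a->data[key_index(k, int_regs)];
    }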

>2. PMCs return iterators

I considered and discarded this idea, as it makes the construction 
of aggregates even more complex than it already is.

>3. Let PMCs create KEY * lists.

That's what the interpreter is for. While there may not be sufficient 
information available to the interpreter and compiler to generate 
efficient key lists (which is definitely possible), I can guarantee 
you that the interpreter has more information than any individual PMC 
vtable function will have.

Keys are the interpreter's way of indicating the element of an 
aggregate. Construction of the key belongs to the interpreter. 
Interpretation of the key belongs to the PMC vtable functions. Leave 
them on the appropriate sides of the wall, please.
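
A tiny sketch of that division of labor, still using the
hypothetical key_component from above: the interpreter builds the
chain once, pointing its components at whatever registers the
compiler picked, and the PMC's keyed vtable entries only ever read
it.

    /* Built once by the interpreter for the @foo[$i;$j;$k] access above.
       The register numbers here are made up.                            */
    static key_component k_k = { KEY_INT_REG, { .int_reg = 3 }, NULL };
    static key_component k_j = { KEY_INT_REG, { .int_reg = 2 }, &k_k };
    static key_component k_i = { KEY_INT_REG, { .int_reg = 1 }, &k_j };

    /* Each trip through the loop just updates the integer registers and
       hands &k_i to the aggregate's keyed fetch/store entry; the chain
       itself is never rebuilt.                                          */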

>General Philosophical Diatribe:
[Snip]
>   * follow malloc() through a debugger sometime - you'll never want to use
>     it again
>     case in point, p6 overallocates wherever it can to avoid it in the future

That's why we don't use it.

>   * function calls consume resources

Generally incorrect. Function calls are, while not free, damned cheap 
on most semi-modern chips.

>   * assuming a 2 meg cache on a machine with a 2 meg cache makes for a program
>     that works *great* on your machine, then proceeds to suck pond scum on
>     mine =)
>     assuming a 1 meg cache, the program will run marginally slower on yours,
>     but an order of magnitude faster on mine

That's why I've got a 300MHz original Celeron system here.

>   * caches, virtual memory, and all of their ilk work best when you pretend
>     they don't exist. think of them as little faerie helpers - don't demand
>     work, and when your shoes are fixed in the morning, leave them a treat ;)

You forgot a few.

* Think about the common case and plan for it
* Make sure your performance assumptions aren't out of date
* Reevaluate after bottlenecks are removed

>Case in point:
>
>Perl 5 runs *awesome* on a 486/25.

You're beneath our floor. Performance issues on that system aren't 
something I'm worried about, any more than I'm worried about not 
compiling on a K&R-only C compiler. They're only interesting and 
worth addressing if doing so is an easy way to improve performance 
on hardware we do care about. (Palms, for example.)

>In summary, I beg Dan to reconsider =)

This is always in order, but in this case I think you've not given 
sufficient cause. Part of that is because the design's not been 
sufficiently set down, which is my fault--I do realize that makes 
things difficult. (I'll point out that this is why I did ask people 
to hold off a bit...)
-- 
                                         Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                       teddy bears get drunk
