Re: PARROT QUESTIONS: Keyed access: PROPOSAL

Mike Lambert Wed, 24 Jul 2002 19:28:12 -0700

> This would only automate the generation of large amounts of code, not get rid
> of the large amount of code being generated. Once again, my complaint here is that
> the L2 cache would buckle under the weight of a dozen PMCs each defining a few dozen
> recursive accessors. The performance gain of making the code smaller is worth
> taking two iterations through the main interpreter.
> Stepping back a bit, my suggestion would be a more general: factor out the recursive
> fetch from the operation. Keeping that as a single PVM op, the PVM could use the
> vtable API to fetch the burried PMC/int/num/string, then fall through to the logic
> that performs operations on immediately available values. Having two seperate VM
> ops isn't strictly required to factor this out. Restating, anytime you massive
> cut and paste code, it is indication of trouble ;)


I think the main problem is that we aren't guaranteed of recursion. Take a
look at Damian's Regex::Common to see what he does. Since Perl5's TIEd
interface only supports single-level, you have to jump through hoops to
perform multi-level tieing.

Likewise, matrices and other explicitly multidimensional arrays don't want
to recurse either. By storing the 'get the value for this key' in the PMC,
we allow each PMC to DTRT.

And if I can clarify....you're concerned about the code bloat occuring
from PMC methods, or opcodes? It looks like you're referring to PMCs, but
I just wanted to be sure.

I agree there's a lot of c+p going on in the PMC keyed method code. I
personally agree there. The various get_XXX_keyed and get_XXX_keyed_int
methods have a lot of duplication that doesn't need to be there. But
that's not a strike against keys. We could change these to return a
KEY_ATOM, and let the caller deal with the results. Or we could make all
the keyed functions be stubs which merely called get_bignum on the PMC
they retrieve from the aggregate, somewhat like:

    STRING* get_string_keyed (KEY * key) {
        PMC *value = get_pmc_keyed(key);
        if(key->next != NULL)
            return value->vtable->get_string_keyed(INTERP, value,
key->next);
        else
            return value->vtable->get_string(INTERP, value);
    }

That should certainly cut down on the code bloat, at the expense of more
recursion.

> > variable *contents* requires changing any key structure. This:
> >
> >     for my $i (1..1000) {
> >       for my $j (1..1000) {
> >         for my $k (1..1000) {
> >           @foo[$i;$j;$k]++'
> >         }
> >       }
> >     }
> >
> > requires the construction of exactly *one* key structure, and
> > requires that it be updated exactly *once*, through the entire loop.
> ....
> > PMCs don't move. Keys can cache pointers to them. This isn't a problem.
>
> The definition I have for KEY * here is a linked list. However, if you speak
> truth saying nested access only generating then reusing one key, thats awesome,
> and the allocation overhead is a total win.
> I'm sorry I missed that gem looking at the source. The rough edges do
> obscure the gems. If I can't find this gem at all, I'll be looking for pointers
> on briniging it to light.

Yes, it is a linked list.  Currently, we don't have multidim key support,
so no one sets up a linked list, and our keyed ops merely create
single-element linked lists.

Each KEY_ATOM in the KEY structure can reference a PMC. In the above
triple-for loop, there are only three PMCs, $i, $j, and $k. These PMCs
have a constant address in that function. It means we can create some PASM
that looks like: (note, new_key and add_key_ref don't currently exist ;)
        new P0, .PerlInt #i
        new P1, .PerlInt #j
        new P2, .PerlInt #k
        new_key S1
        add_key_ref S1, P0
        add_key_ref S1, P1
        add_key_ref S1, P2

Now we can run the loop, using the key in S1 (or whever it's stored..might
not be Sn registers). Since we're merely incrementing the values in P0,
P1, and P2 each time through the loop, the key that references these PMCs
is valid throughout the life of that triple-nested loop.


> > So what? Recursion's not bad. Besides, aggregates may well be
> > multidimensional, in which case
>
> Right. I did address. That was part of my "this *is* a win when" diatribe.
> When indexing objects that aren't multidim, there is a performance enhancement
> that isn't insignficiant and requires no change to the design. This is your
> cue to jump with joy ;)

Aren't non-multi-dim, single-level access, the common case? Most perl
programmers deal with one level of access into an aggregate at a time, and
if there's a performance improvement with this case, it should put some
big weight on the side of keys.

> > >Given your objectives of speed, generality and elegance,
> >
> > I should point out here that elegance appears third in your list
> > here. (It's fourth or fifth on mine)
>
> Ooops.

Yes, Dan's coding objectives are somewhat of a mystery to me as
well. :)

> > >   * function calls consume resources
> > Generally incorrect. Function calls are, while not free, damned cheap
> > on most semi-modern chips.
>
> Your inner loop is a few lines of code. If every inner loop execution triggers
> a cascade of function calls, this is lost. It may be small, but certain cases
> do warrent changing extremely frequently used recursive structures to
> iterative structures. I'm not saying this happends - I'm just saying that there
> is a certain point when this value does become significant.

However, if that inner loop references a multi-dim array, a
standard implementation of a recursive keyed access would fail to
work, no?

And if you're that concerned about the recursive key lookup on a heavily
nested loop, I'm sure you could hoist some of the key lookups out of the
appropriate loops (or maybe not, depending upon the Perl code in
question). However, given that Perl is hard to optimize, there's not much
you can do to optimize [$a][$b][$c] access because any one of the PMCs
might be tied, changing the behavior of the system. As such, you might
very well need to perform the full keyed lookup each time.

> I agree with Dan now that I understand better. My complaints have been addressed,
> with the one exception of refactoring code bloat. I feel this is a small change
> in implementation, and shouldn't impact design. I hope Dan will (pending time)
> consider it, and I'll be happy to hash it out with him on IRC to make sure
> both parties understand exactly what is being said and that I don't continue
> to miss things ;)

Hopefully I've helped to explain some of the things you said you were
unsure about. Of course, since I'm not Dan, you might very well hear
something completely different when he gets back from TPC. :)

Mike Lambert

Re: PARROT QUESTIONS: Keyed access: PROPOSAL

Reply via email to