Leo --

>>>- an opcode gets more, fewer, or changed params => you are as out of luck
>>>  with the old PBC as my approach is: invalidate PBC file.

>> Nope. The op 'name' includes the number and types of the args, foo_i_ic.
>> A "change" involves a new op and marking the old one obsolete. As far as
>> existing bytecode is concerned an op signature change is equivalent to an
>> op deletion and an op addition, and therefore should not be treated as a
>> separate case.

> I see. So you would need to keep the implementation of 'foo_i_ic' for
> old PBCs and you would have 'foo_x_y_z', your new variant of this
> opcode. Seems not the best idea to me, in terms of maintainability and
> code size.

Yep. But I don't see how this is different from what you proposed for
new and obsolete ops. It just combines those two policies in the case
where the op is a replacement with a different signature. You are free
to truly delete the op if you want to, with the "missing op" implications
below.

>> Removing an op instead of marking it obsolete will cause the oplookup to
>> fail, and the interpreter would report an error, just as with
>> fingerprinting. However, you would actually have more information to
>> report, such as which op is missing from which oplib, which can be
>> helpful in tracking down the problem. So, IMO, the finer granularity is
>> useful both for evolution and for failure diagnosis.
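To make the diagnostic point concrete, here is a rough sketch in Python (not Parrot's C) of a lookup keyed on the full op name, signature included. The oplib contents, the `MissingOp` exception and the `oplookup` signature are all invented for illustration; they are not Parrot's actual API.

```python
# Hypothetical oplibs: each maps a full op name (base name plus argument
# signature) to an implementation. None of this is Parrot's real op set.
OPLIBS = {
    "core": {
        "add_i_i_i": lambda a, b: a + b,
        "foo_i_ic": lambda a, b: a * b,           # old signature, kept for old PBCs
        "foo_i_i_ic": lambda a, b, c: a * b + c,  # replacement: a distinct op
    },
}


class MissingOp(LookupError):
    """Raised when bytecode names an op the interpreter cannot supply."""


def oplookup(oplib, full_name):
    """Resolve (oplib, full op name) to an implementation, or say what is missing."""
    lib = OPLIBS.get(oplib)
    if lib is None:
        raise MissingOp(f"oplib '{oplib}' is not available")
    if full_name not in lib:
        raise MissingOp(f"op '{full_name}' is missing from oplib '{oplib}'")
    return lib[full_name]
```

With this shape, a deleted op fails with a message naming both the op and the oplib, where a fingerprint mismatch could only say that something differs.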

> So, what would the PBC with extended ops look like?

You can take a static or dynamic approach. The static approach has you
tacking the optable onto the PBC as a segment (yes, it's bigger than a
fingerprint). Small code segments will have small op gamuts (gama? :),
larger code segments will likely have larger op gamuts (and thus larger
op tables), but the growth would not be linear in code size.
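A minimal sketch of the static approach, in Python for brevity: collect the distinct ops a code segment uses (its gamut), emit them as an optable, and encode each instruction as an index into that table followed by its integer arguments. The `(oplib, op name)` tuples and the function names are assumptions made for this example, not the real packfile layout.

```python
def build_optable(ops_used):
    """Return (optable, index) covering the distinct ops, in first-use order."""
    optable = []
    index = {}
    for op in ops_used:
        if op not in index:
            index[op] = len(optable)
            optable.append(op)
    return optable, index


def encode(program):
    """Encode [((oplib, full_name), [int args]), ...] as (optable, words)."""
    optable, index = build_optable([op for op, _ in program])
    words = []
    for op, args in program:
        words.append(index[op])  # optable slot, not a global opcode number
        words.extend(args)       # register numbers and constant indexes
    return optable, words
```

Because a repeated op reuses its slot, the table size tracks the number of distinct ops rather than the length of the code.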

The extreme dynamic approach involves a new oplib (boot.ops) with a
new op called useop_ic_sc_sc. This op is prewired (not necessarily
*hard* wired) to optable slot zero at startup. C<useop> fills in optable
slots:

  useop  1, "foo", "bar_i_ic"

So, instead of a formal optable segment in the packfile, you modify
your optable any time you please. You might even overwrite slot zero,
disabling further useopping. I would expect that it would be common
to put all the optable mucking up front, but it doesn't have to be
that way.
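Here is a toy model of the dynamic approach, again in Python rather than C. Slot zero starts out bound to useop, and everything else is filled in by the bytecode itself. The `Interp` class, the `(nargs, implementation)` table entries, the constant-table indexing of the string arguments, and the behaviour of bar_i_ic are all assumptions for the sake of the sketch.

```python
def bar_i_ic(interp, reg, value):
    """Hypothetical op: store twice the integer constant in an int register."""
    interp.iregs[reg] = value * 2


# Hypothetical oplib "foo": full op name -> (argument count, implementation).
OPLIBS = {"foo": {"bar_i_ic": (2, bar_i_ic)}}


class Interp:
    def __init__(self, oplibs, constants):
        self.oplibs = oplibs
        self.constants = constants                # string constant table
        self.iregs = [0] * 32
        self.optable = {0: (3, Interp.useop)}     # slot zero prewired to useop

    def useop(self, slot, lib_const, name_const):
        """useop_ic_sc_sc: bind an optable slot to a named op from an oplib."""
        lib = self.constants[lib_const]
        name = self.constants[name_const]
        self.optable[slot] = self.oplibs[lib][name]

    def run(self, words):
        pc = 0
        while pc < len(words):
            nargs, impl = self.optable[words[pc]]
            impl(self, *words[pc + 1:pc + 1 + nargs])
            pc += 1 + nargs
```

With constants `["foo", "bar_i_ic"]`, the words `[0, 1, 0, 1,  1, 3, 21]` first bind slot 1 and then call it. Passing 0 as the first useop argument rebinds slot zero itself, after which no further useopping is possible.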


    Exercise for the reader: Write 6502.ops and a program to convert a
    chunk of 6502 machine code into a PBC file with a preamble that sets
    up op slots 0-255 with these ops, and a body that is a recasting of
    the machine code into 32-bit ops and arguments. Beware of data
    handling.


(Of course, you can imagine variants of this op where the arguments
can be registers instead of constants, too. If you need these, you could
pull them in from some oplib.)

There are variants of this idea where the first N op slots are reserved
for the core ops, in some canonical order, and with room for growth.
Dynamic ops would appear in slots N and up. This common-case optimization
trades in some of the flexibility of the approach for speed by not
having to do the dynamic building. It's not my favorite approach, but
as long as the set of "core" ops is kept tight, it wouldn't hurt. The
idea would be to make the core set just large enough that real-world
programs wouldn't need very many dynamic ops (beyond what might be
expected for language support, which is unavoidable).
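The reserved-slot variant could look roughly like this (Python sketch). The core op list, its order, and the value of N (`RESERVED` here) are placeholders; choosing the real ones is exactly the policy question.

```python
# Hypothetical canonical core ops; the real list and order would be fixed
# by the project, not by this sketch.
CORE_OPS = ["end", "noop", "set_i_ic", "add_i_i_i"]
RESERVED = 8  # N: slots 0..N-1 belong to core ops, with room for growth


def new_optable():
    """Start every optable with the core ops in canonical order, padded to N."""
    return list(CORE_OPS) + [None] * (RESERVED - len(CORE_OPS))


def add_dynamic_op(table, full_name):
    """Append a dynamic op; it always lands in slot N or above."""
    if len(table) < RESERVED:
        raise ValueError("optable is missing its reserved core slots")
    table.append(full_name)
    return len(table) - 1
```

Bytecode that sticks to core ops needs no table building at all, since those slot numbers are the same in every interpreter.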


One objection to this approach could be PBC bloat, but I'd challenge that
one on these grounds: We use 32-bit bytecode slots to hold 5-bit register
numbers and < 10 bit opcode numbers, "wasting" many of the bits (they
were traded for simplicity and speed). But, that's a lot of bits in a
large program. It certainly makes one ponder a .pbcz zipped format
(which screws mmappability, alas).
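For a feel of the numbers, a small Python sketch: the fraction of a 32-bit slot left unused by a 5-bit register number or a 10-bit opcode, plus a zlib pass over synthetic bytecode. The synthetic instruction stream is invented and far more regular than real code, so treat any ratio it produces as illustrative only.

```python
import struct
import zlib

SLOT_BITS = 32
REGISTER_BITS = 5
OPCODE_BITS = 10


def unused_fraction(used_bits, slot_bits=SLOT_BITS):
    """Fraction of a slot that carries no information."""
    return (slot_bits - used_bits) / slot_bits


def pack_words(words):
    """Pack integers as little-endian 32-bit bytecode slots."""
    return struct.pack(f"<{len(words)}I", *words)


def synthetic_bytecode(n_instructions):
    """Made-up stream: one opcode below 700, then three register numbers."""
    words = []
    for i in range(n_instructions):
        words.append(i % 700)
        words.extend([(i * 7) % 32, (i * 11) % 32, (i * 13) % 32])
    return words


if __name__ == "__main__":
    raw = pack_words(synthetic_bytecode(1000))
    packed = zlib.compress(raw)
    print(f"register slot unused: {unused_fraction(REGISTER_BITS):.1%}")
    print(f"opcode slot unused:   {unused_fraction(OPCODE_BITS):.1%}")
    print(f"raw {len(raw)} bytes, zlib {len(packed)} bytes")
```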

Another objection is that it makes disassembly harder. The answer: yup.


> Still mmap()able? Or name/signature of extension ops?

Yes, still mmap()able. Bytecode still consists of integer indexes into
an optable followed by integer references to registers and constants.
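A quick illustration of why nothing here hurts mmap()ability: the code stream is still a flat run of fixed-width integers, so it can be mapped and read in place. This Python sketch uses a bare file of native-endian 32-bit words, which is an assumption for the example and not the actual packfile format.

```python
import mmap
import struct


def write_words(path, words):
    """Write integers as native-endian 32-bit words (toy format, no header)."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"={len(words)}I", *words))


def read_words_mapped(path):
    """Map the file read-only and decode the words straight from the mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = len(mm) // 4
            return list(struct.unpack_from(f"={count}I", mm, 0))
```

Which op a given index means is decided by the optable, not by the mapped words, so the mapping never has to be rewritten at load time.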

> Your proposal looks like moving the assembler to runtime.

Not really. This is a separate but equally interesting issue.

Having IMCC available (loadable, anyway) is a good idea. Parrot
code should be able to generate a string or other representation of
an IMC code chunk and invoke IMCC to convert it to bytecode.
It should be possible to create and populate new in-memory code
chunks and then call them.


<blue-sky>
I like languages with good introspection capabilities. The ability to
do things like define new classes at run time is a bonus, too. I'm
interested in seeing similar capabilities in the underlying virtual
machine (I guess I want to turn it into a malleable machine). Not only
would I like to see dynamic optables, but I'd like a program to be
able to find out about its op table, too.

Oh, and I'd like to have indirect addressing modes where the register
numbers come from other registers.
</blue-sky>


Regards,

-- Gregor
