Just to add my 2 cents here: I've always felt that the VM should provide basic primitives and let the HLL take care of the rest. Technically, the low-level IO routines don't have to know about encoding or compression schemes at all. A VM provides the building blocks, and you can add whatever abstractions you want on top.

Obviously there is another camp that feels everything needs to be implemented in low-level C, with special PMCs for each HLL and funky APIs everywhere, and those are probably the same guys who think Parrot needs to worry about the issue you are discussing. I'm not in that camp, which is the reason I just watch nowadays. To me, the elegance of the pure VM has been lost on Parrot.

-Melvin

"It's 2006, do you know where your opcodes are?"




At 06:01 PM 2/21/2006, Martin D Kealey wrote:
I'm a bit slow coming back to this, sorry.

It seems that "seek" is used in two ways:

    * returning to some previously identified point (including the start or
      end of the file)
    * moving a given number of characters relative to a
      known location

Clearly you can always do the first, just by using the underlying byte
offset without regard for the encoding. If you have a fixed number of bytes
per character then you can trivially do the second as well.

But if you have a variable-length encoding then you have to read through the
byte stream to get to the position you want; this might or might not be
desirable depending on the characteristics of the underlying stream.
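
To make the variable-length case concrete, here is a minimal Python sketch
of moving forward by a number of characters in a UTF-8 byte stream. The
helper name skip_chars is hypothetical and not part of any Parrot or Perl 6
API; it only illustrates that the stream has to be read and decoded to find
the target position, which works even on a stream that cannot seek.

```python
import codecs
import io


def skip_chars(stream, count, encoding="utf-8"):
    """Advance a binary stream by `count` characters.

    Returns the number of bytes consumed. Hypothetical helper: it reads
    one byte at a time, so it only needs read(), never seek().
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    consumed = 0
    remaining = count
    while remaining > 0:
        byte = stream.read(1)
        if not byte:
            break  # end of stream reached before `count` characters
        consumed += 1
        # The decoder returns "" until a complete character has arrived.
        remaining -= len(decoder.decode(byte))
    return consumed


# "na\u00efve": the third character occupies two bytes in UTF-8.
stream = io.BytesIO("na\u00efve caf\u00e9".encode("utf-8"))
bytes_skipped = skip_chars(stream, 3)
rest = stream.read().decode("utf-8")
```

With a fixed-width encoding the same move would be a single multiplication
and a byte-level seek; here the cost grows with the distance moved.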

Furthermore it makes (some) sense always to be able to seek *forwards* --
even on a tty device -- but not backwards.

So my suggestion is that we change the interface to "seek", and have
separate parameters for the "previously known position" and the
"character offset". The latter is obviously just an integer, but the
first is a black-box token -- maybe a PMC, but more likely a mangled
integer -- to ensure that the two args are distinguishable.

(Please excuse me as I discuss this in terms of a HLL rather than
Parrot...)

In other words, change this:

 $fpos = $io.tell();
 $io.seek(SEEK_SET, $fpos)

to this:

 ...
 $io.seek($fpos, 0)

or for brevity, just this:

 ...
 $io.seek($fpos)

Now SEEK_SET, SEEK_CUR and SEEK_END just become special cases of
"previously known positions". And I'm tempted to say that they should be
spelt "0", "undef" and "-1" respectively.
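
As a rough illustration of the two-parameter interface, here is a Python
sketch. Everything in it is an assumption made for the example: the Pos and
CharStream classes, the START and END sentinels (standing in for SEEK_SET
and SEEK_END), and the use of None for "current position" (the "undef"
spelling above). It supports forward character offsets only.

```python
import codecs
import io


class Pos:
    """Opaque position token (hypothetical); callers never look inside."""

    def __init__(self, byte_offset):
        self._byte_offset = byte_offset


START = Pos(0)   # stands in for SEEK_SET
END = Pos(-1)    # stands in for SEEK_END; recognised by identity below


class CharStream:
    """Character-oriented wrapper over a seekable binary stream (sketch)."""

    def __init__(self, raw, encoding="utf-8"):
        self._raw = raw
        self._encoding = encoding

    def tell(self):
        # Hand back a token, not a bare integer.
        return Pos(self._raw.tell())

    def read(self, chars):
        # Decode byte by byte so we stop exactly after `chars` characters.
        decoder = codecs.getincrementaldecoder(self._encoding)()
        text = ""
        while len(text) < chars:
            byte = self._raw.read(1)
            if not byte:
                break
            text += decoder.decode(byte)
        return text

    def seek(self, pos=None, chars=0):
        # pos=None means "from the current position".
        if pos is END:
            self._raw.seek(0, io.SEEK_END)
        elif pos is not None:
            self._raw.seek(pos._byte_offset)
        if chars < 0:
            raise ValueError("backward character offsets not supported here")
        self.read(chars)  # move forward by reading and discarding


s = CharStream(io.BytesIO("h\u00e9llo w\u00f6rld".encode("utf-8")))
s.seek(START, 6)     # known position plus a character offset
mark = s.tell()
word = s.read(5)
s.seek(mark)         # return to a previously identified point
again = s.read(5)
```

Because the token and the offset have different types, the wrapper can tell
them apart without a separate "whence" argument, and a filter that cannot
count characters could still honour the token-only form.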

Thence it's fairly straightforward for the units of "seek" to be
whatever you find convenient: counting whole records, or lines of text,
or whatever.

Clearly this needs to be discussed in p6-lang, but having separated the
two parameter types, the filter can decide which it can implement, and
how.

-Martin
