Re: PDD 2, vtables

Dan Sugalski Tue, 06 Feb 2001 09:13:57 -0800
At 11:26 AM 2/6/2001 +0000, Tim Bunce wrote:
>[First off: I've not really been paying attention so forgive me if I'm
>being dumb here.  And many thanks for helping to drive this forwards.]
>
>On Mon, Feb 05, 2001 at 05:14:44PM -0500, Dan Sugalski wrote:
> >
> > =head2 Core datatypes
> >
> > For ease of use, we define the following semi-abstract data types
>
>Probably worth stating upfront that it'll be easy to add new types
>to avoid people arguing for their favorite type to be added here.

I'm not sure it should be--that'd mean extending the vtables in ways they 
have little room to grow. Adding new perl datatypes is easy; adding new 
low-level types is harder.

> > =item INT
> > =item NUM
> > =item STR
> > =item BOOL
>
>What about references?

Special type of scalar, not dealt with here.

>Arrays and hashes should probably be at least mentioned here.

And lists, yes. Or they need their own PDD with details.

> > =head3 String data types
> >
> > =item binary buffer
>
>'Binary string'

I avoided that on purpose. Label it a string and people think of its 
contents as characters, which a good chunk of the time they probably won't 
be. That might not outweigh the consistency issue, though.

> > =item UTF-32 string
> > =item Native string
> > =item Foreign string
>
>I'm a little surprised not to see UTF-8 there, but since I'm also
>confused about what Native string and Foreign string are I'll skip it.
>Except to say that some clarification here may help, and explicitly
>mentioning UTF-8 (even to say it won't be a core type and provide a
>reference to why) would be good.

I didn't put UTF-8 in on purpose, because I'd just as soon not deal with it 
internally. Variable length character data's a pain in the butt, and if we 
can avoid having the internals deal with it except as a source that gets 
converted to UTF-32, that's fine with me.

The native and foreign string data types were an attempt to accommodate 
UTF-8, as well as ASCII and EBCDIC character data. One of the three will 
likely be the native type, and the rest will be foreign strings. I'm not 
sure if perl should have only one foreign string type, or if we should have 
a type tag along with the other bits for strings.
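
To make the "UTF-8 is only a source that gets converted to UTF-32" idea concrete, here is a minimal C sketch of such a boundary conversion. The name utf8_to_utf32 and its signature are made up for illustration and are not part of the PDD. It checks lead and continuation bytes but does not reject overlong encodings or surrogates, which a real decoder would need to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode UTF-8 bytes into UTF-32 code points.
 * Returns the number of code points written, or -1 if the input is
 * truncated or malformed, or if the output buffer is too small.
 * Hypothetical helper: the name and signature are assumptions. */
long utf8_to_utf32(const unsigned char *in, size_t in_len,
                   uint32_t *out, size_t out_cap)
{
    size_t i = 0, n = 0;

    assert(in != NULL && out != NULL);
    while (i < in_len) {
        unsigned char b = in[i];
        uint32_t cp;
        size_t extra;   /* continuation bytes that must follow */

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return -1;              /* stray continuation or bad lead byte */

        if (extra > in_len - i - 1 || n >= out_cap)
            return -1;               /* truncated input or no room left */

        for (size_t k = 1; k <= extra; k++) {
            unsigned char c = in[i + k];
            if ((c & 0xC0) != 0x80)
                return -1;           /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }
        out[n++] = cp;
        i += extra + 1;
    }
    return (long)n;
}
```

Once the data is in this form every character is one fixed-width slot, which is the property the internals want; the variable-length handling stays confined to the conversion step.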

> > The functions are divided into two broad categories, those that perl
> > will use the value of internally (for example the type functions) and
> > those that produce or modify a PMC, such as the add function.
>
>So possibly a good idea to explicitly group them that way.

They were, but I see I lost that.

> > =head2 Functions in detail
> >
> > =item type
> >
> > =item name
> >
> >    STR                name(PMC[, key]);
> >
> > Returns the name of the class the PMC belongs to.
>
>So I'd call it type_name (or maybe class_name, as you seem to be using
>the words interchangeably. If type != class then clarify somewhere.)

The interchange is due to sloppy thinking. I'll redo it so that class == 
perl data type, while type == (NUM|STR|BOOL|INT).

> > =item move_to
> >
> >    BOOL               move_to(void *, PMC);
> >
> > Tells the PMC to move its contents to a block of memory starting at
> > the passed address. Used by the garbage collector to compact memory,
> > this call can return a false value if the move can't be done for some
> > reason. The pointer is guaranteed to point to a chunk of memory at
> > least as large as that returned by the C<real_size> vtable function.
>
>Shouldn't the PMC be the first arg for consistency?

The first PMC arg is the destination PMC, and we don't have one here.

> > =item real_size
> >
> >    IV         real_size(PMC[, key]);
> >
> > Returns an integer value that represents the real size of the data
> > portion, excluding the vtable, of the PMC.
>
>Contiguous? Sum of parts (allowing for alignment) if it contains
>multiple chunks of data?

Size we'd need to allocate if we were going to move the data. Though 
knowing how much space is currently taken would also be useful, assuming 
they're not the same. (They probably would be within a few bytes, though)
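
How real_size and move_to might cooperate during compaction can be sketched in C as below. Everything beyond the two function names is an assumption made for illustration: the struct layout, the C types standing in for IV and BOOL, and the buf_* and compact helpers. The point is only that the collector asks for the size first, so the destination is guaranteed to be big enough, and that move_to is allowed to refuse.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct pmc PMC;

/* Two of the vtable slots discussed above; the C types are assumptions. */
typedef struct {
    size_t (*real_size)(const PMC *);      /* bytes needed to relocate the data */
    int    (*move_to)(void *dest, PMC *);  /* nonzero on success, 0 if refused */
} vtable;

struct pmc {
    const vtable *vt;
    void         *data;   /* data portion, excluding the vtable */
    size_t        len;
};

/* A trivial "binary buffer" implementation of the two slots. */
size_t buf_real_size(const PMC *p) { return p->len; }

int buf_move_to(void *dest, PMC *p)
{
    if (dest == NULL)
        return 0;                      /* the move can't be done */
    memmove(dest, p->data, p->len);    /* source and destination may overlap */
    p->data = dest;
    return 1;
}

const vtable buf_vtable = { buf_real_size, buf_move_to };

/* Hypothetical compactor: packs each PMC's data end to end in the arena.
 * Returns the number of arena bytes in use afterwards. */
size_t compact(PMC **pmcs, size_t n, unsigned char *arena, size_t arena_size)
{
    size_t used = 0;

    assert(arena != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t need = pmcs[i]->vt->real_size(pmcs[i]);
        if (need > arena_size - used)
            break;                     /* arena is full */
        if (pmcs[i]->vt->move_to(arena + used, pmcs[i]))
            used += need;              /* a refused move leaves the PMC in place */
    }
    return used;
}
```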

> > =item destroy
> >
> >    void               destroy(PMC[, key]);
> >
> > Destroys the variable the PMC represents, leaving it undef.
>
>Using the word 'variable' here probably isn't a good idea.
>Maybe "Destroys the contents of the PMC leaving it undef."

Better. Thanks.

> > =item is_same
> >
> >    BOOL               is_same(PMC1, PMC2[, key]);
> >
> > Returns TRUE if C<PMC1> and C<PMC2> refer to the same value, and FALSE
> > otherwise.
>
>I think that needs more clarification, especially where they are of
>different types. Contrast with is_equal() below.

If they're different types they can't be the same. This would be used to 
check if two references have the same referent, or if two magic variables 
(database handles, say) pointed to the same thing.
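
The distinction can be sketched in C under some loud assumptions: the PMC struct, the class tags, and the string-based is_equal_str below are all invented for illustration, and the PDD excerpt does not spell out what is_equal compares. Identity looks at class and referent; equality looks at the values behind them.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical class tags and PMC layout, for illustration only. */
typedef enum { CLASS_REF, CLASS_DBH } pmc_class;

typedef struct {
    pmc_class   klass;
    const void *target;   /* the referent, or the underlying handle */
} PMC;

/* Identity: different classes can never be the same; otherwise the two
 * PMCs must point at the very same thing. */
bool is_same(const PMC *a, const PMC *b)
{
    assert(a != NULL && b != NULL);
    return a->klass == b->klass && a->target == b->target;
}

/* Equality, assuming both targets are NUL-terminated strings: two
 * distinct referents can still hold equal values. */
bool is_equal_str(const PMC *a, const PMC *b)
{
    assert(a != NULL && b != NULL);
    return strcmp((const char *)a->target, (const char *)b->target) == 0;
}
```

Two references to separate copies of the string "perl" would be equal but not the same; two references to one copy would be both.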

> > =item concatenate
> >
> >    void               concatenate(PMC1, PMC2, PMC3[, key]); ##
> >
> > Concatenates the strings in C<PMC2> and C<PMC3>, storing the result in
> > C<PMC1>.
>
>and insert (ala sv_insert)  etc?

Hadn't considered them. Care to elaborate on the etc?

> > =item is_equal
>
>Contrast with is_same() above.
>
> > =item logical_or
> > =item logical_and
> > =item logical_not
>
>Er, why not just use get_bool? The only reason I can think of is to
>support three-value-logic but that would probably be better handled
>via a higher-level overloading kind of mechanism. Either way, clarify.

Well, there's overloading. Plus the potential that a class will do 
something odd with it--if you || on two custom arrays in list context you 
might get an array with each pair (left[0] || right[0] and so on) 
logically or'd.
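
As a rough C sketch of what such a class-specific logical_or might do, assuming plain arrays of long in place of real PMCs, equal lengths on both sides, and Perl's rule that || yields the left operand when it is true and the right operand otherwise. The function name pairwise_or is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Pairwise ||: out[i] is left[i] if that is true (nonzero), otherwise
 * right[i]. All three arrays are assumed to hold n elements. */
void pairwise_or(const long *left, const long *right, long *out, size_t n)
{
    assert(n == 0 || (left != NULL && right != NULL && out != NULL));
    for (size_t i = 0; i < n; i++)
        out[i] = left[i] ? left[i] : right[i];
}
```

A plain get_bool on each operand could not produce this result, which is why the operation needs its own vtable slot if classes are to be allowed this kind of behaviour.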

> > =item match
> >
> >    void               match(PMC1, PMC2, REGEX[, key]);
> >
> > Performs a regular expression match on C<PMC2> against the expression
> > C<REGEX>, placing the results in C<PMC1>.
>
>Results, plural => container => array or hash. Needs clarifying.

Yep, especially since I'd considered tossing the match destination 
entirely. (Though that means special variables, and I'm not sure I want to 
go there) It'll likely just return true or false. I'll rethink it.

> > =item repeat (x)
> >
> >    void               repeat(PMC1, PMC2, PMC3[, key]); ##
> >
> > Performs the following sequence of operations: finds the string value
> > from C<PMC2>; finds an integer value I<n> from C<PMC3>; replicates the
> > string I<n> times; stores the resulting string in C<PMC1>.
>
>So call it replicate? Could also work for arrays.

Well, we call x the repeat operator, so...
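
The sequence of operations in the repeat entry can be sketched in C with ordinary C strings standing in for PMCs. The name repeat_str is made up for illustration; a count of zero or less gives the empty string, as it does for Perl's x operator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding s repeated n times, or NULL
 * if the result would be too large or allocation fails.
 * The caller frees the result. */
char *repeat_str(const char *s, long n)
{
    size_t len, count;
    char *out;

    assert(s != NULL);
    len = strlen(s);
    count = n > 0 ? (size_t)n : 0;

    /* Guard the multiplication below against overflow. */
    if (len != 0 && count > (SIZE_MAX - 1) / len)
        return NULL;

    out = malloc(len * count + 1);
    if (out == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++)
        memcpy(out + i * len, s, len);
    out[len * count] = '\0';
    return out;
}
```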

> > =item nextkey (x)
> >
> >    void               nextkey(PMC1, PMC2, start_key[, key]);
> >
> > Looks up the key C<start_key> in C<PMC2> and then stores the key after
> > it in C<PMC1>. If start_key is C<undef>, the first key is returned,
> > and C<PMC1> is set to undef if there is no next key.
>
>Containers again.  And I'd call it key_next()

Containers are fine--nextkey(12) on a 15-element list should return 13. 
You're right that it needs more details on how it should perform.
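
For the list case, one possible reading is sketched in C below. The KEY_UNDEF sentinel and the list_nextkey name are assumptions, as is the choice to model a list as nothing more than its element count; how negative indices or hashes should behave is exactly the detail still to be written down.

```c
#include <assert.h>
#include <stddef.h>

#define KEY_UNDEF (-1L)   /* hypothetical stand-in for undef */

/* Returns the key after start_key in a list of n_elems elements.
 * An undef start_key yields the first key; KEY_UNDEF is returned when
 * there is no next key. */
long list_nextkey(size_t n_elems, long start_key)
{
    assert(start_key >= KEY_UNDEF);   /* other negative keys are not modelled */

    if (start_key == KEY_UNDEF)
        return n_elems > 0 ? 0 : KEY_UNDEF;
    if ((size_t)start_key + 1 < n_elems)
        return start_key + 1;
    return KEY_UNDEF;
}
```

With this reading, list_nextkey(15, 12) gives 13, and asking for the key after 14 in the same list gives undef.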

> > =item exists (x)
>
>Likewise, key_exists()
>
> > =head1 TODO
> >
> > The effects of each function on scalar, array, hash, list, and IO
> > PMCs need to be hashed out.
>
>Before that I think a section on containers needs to be added.

Well, they should be done simultaneously.

> > =head1 REFERENCES
> >
> > PDD 3: Perl's Internal Data Types.
>
>Some references to any other vtable based languages would be good.
>(I presume people have looked at some and learnt lessons.)

Alas not. This is pretty much head of Zeus stuff, modulo some ego. (Mine's 
not *that* big...)

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk