Re: standard representations

Benjamin Stuhl Tue, 26 Dec 2000 07:50:22 -0800
Thus spake the illustrious Dan Sugalski <[EMAIL PROTECTED]>:
> Okay, here's what I'm currently thinking of for standard
> representations of integers, numbers, strings, and (possibly)
> complex data. These are not necessarily indicative of how the
> data's stored in scalars (or hashes or arrays), merely the
> types that will need to be dealt with in the vtables.
> 
> In addition to each of the types below, each vtable will have
> a 'same type' entry that'll be used if the optimizer can
> guarantee that the scalars involved in an operation are of the
> identical type. (Presumably things can be faster that way)
> 
> For integers, we have two types, platform native, and bigint.
> No guarantees are made as to the size of a native int. bigints
> can be of any size.

I'm not sure about the wisdom of not making any guarantees
about int size, since that means that extensions have to
jump through the same hoops perl5 does, dealing with
"unspecified" behaviors (cf. fun with ANSI stdio). To make
life easy, we might want to ordain sizeof(p6int) >=
sizeof(void *) && sizeof(p6int) >= 4. On the other hand,
this makes a port of the PVM to Palms and the like somewhat
harder (but would it be much easier to wedge them into the
standard PVM?). Also, can we please mandate two's-complement
integer math? Perl 5 has effectively always had it, so can
we please make it official?
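
To make the proposal concrete, here is a rough sketch of how
those guarantees could be enforced at build time. Everything
in it is an assumption for illustration: p6int is just the
name used above, intptr_t is only a stand-in for whatever type
would really be chosen, and P6_STATIC_ASSERT and the two
functions are made-up helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: intptr_t is only a stand-in for the real p6int. */
typedef intptr_t p6int;

/* Compile-time assertion that works on old compilers: a false
 * condition produces an array of size -1, which will not compile. */
#define P6_STATIC_ASSERT(cond, name) \
    typedef char p6_static_assert_##name[(cond) ? 1 : -1]

/* The two size guarantees proposed above. */
P6_STATIC_ASSERT(sizeof(p6int) >= sizeof(void *), int_holds_pointer);
P6_STATIC_ASSERT(sizeof(p6int) >= 4, int_at_least_4_bytes);

/* In two's complement -1 has every bit set, so (-1 & 3) is 3.
 * Ones' complement would give 2 and sign-magnitude would give 1. */
P6_STATIC_ASSERT((-1 & 3) == 3, twos_complement);

/* The same test at run time, for the p6int type itself. */
int p6int_is_twos_complement(void)
{
    p6int minus_one = -1;
    return (minus_one & 3) == 3;
}

/* Startup sanity check an extension could call. */
void p6_check_int_model(void)
{
    assert(p6int_is_twos_complement());
}
```

With something like this in a core header, a platform that
breaks the rules fails when the core is compiled, rather than
leaving every extension to probe for it.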

> For floats, we also have two types, C double and bigfloat. No
> guarantees to the size or accuracy of the double. bigfloats can
> be of any size.

Floating point is even harder, and will require a lot of
build-time checks anyway. 
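
As one example of the kind of check I mean, a configure-time
probe could look at <float.h> to see whether the C double has
the usual IEEE 754 binary64 shape. This is only a sketch: the
function names are made up, and it only reports what it finds,
since nothing above mandates any particular format.

```c
#include <assert.h>
#include <float.h>

/* Returns 1 if double has the IEEE 754 binary64 layout
 * (radix 2, 53-bit significand, usual exponent range). */
int p6_double_looks_ieee754(void)
{
    return FLT_RADIX == 2
        && DBL_MANT_DIG == 53
        && DBL_MAX_EXP == 1024
        && DBL_MIN_EXP == -1021;
}

/* Hypothetical hard requirement, for builds that choose to insist. */
void p6_require_ieee_double(void)
{
    assert(p6_double_looks_ieee754());
}
```

This says nothing about rounding modes or how accurate the
math library is, which is where most of the remaining checks
would have to go.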

> Strings can be of three types--binary data, platform native,
> and UTF-32.

"platform native"?

> No, we are not messing around with UTF-8 or 16, nor are we
> messing with EBCDIC, shift-JIS, or any of that stuff. Strings
> can be stored internally that way (and the native form might be
> one of them) but as far as the interface is concerned we have
> only three. Yes, this does mean if we mess with strings in
> UTF-8 format on a non-UTF-8 system they'll need to be fed out
> in UTF-32. It's bigger, but we can deal.

The issue with UTF-32 is that we'd need to write an entire
string-handling library ourselves, while quite a few modern
platforms already ship _wstr* or equivalent.
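
To give a feel for what "write our own" means, here is a
minimal sketch of two such routines over NUL-terminated UTF-32
buffers. The type and function names are hypothetical, and a
real library would also need searching, case mapping,
conversion and so on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical code-point type: one 32-bit unit per character. */
typedef uint32_t p6_utf32;

/* Number of code points before the terminating 0. */
size_t p6_utf32_len(const p6_utf32 *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != 0)
        n++;
    return n;
}

/* Compare by code point value; returns -1, 0 or 1. */
int p6_utf32_cmp(const p6_utf32 *a, const p6_utf32 *b)
{
    assert(a != NULL && b != NULL);
    while (*a != 0 && *a == *b) {
        a++;
        b++;
    }
    if (*a == *b)
        return 0;
    return (*a < *b) ? -1 : 1;
}
```

Each routine is trivial on its own; the cost is in having to
write, test and maintain all of them on every platform.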

> Finally, complex numbers, if we deal with them, will be either
> double or bigfloat complexes. (I don't see any reason to mess
> with integer versions, nor with mixed double/bigfloat types)
> 
> And, unless Larry objects, I feel that all vtable methods
> should have the option of going with a 'scalar native' form
> of the operation if it's determined at runtime that two
> scalars are the same type, though this is optional and may
> be skipped for cost reasons. (Doing it with, for example,
> complex numbers might be worth it, or when expensive
> conversions might be avoided)

This part sounds good.
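
For what it's worth, here is how I picture the runtime check,
as a sketch only. None of these names come from Dan's mail:
p6_scalar, p6_vtable, the slot names and p6_add are all made
up, and a long stands in for the native int.

```c
#include <assert.h>
#include <stddef.h>

typedef struct p6_scalar p6_scalar;

/* Hypothetical vtable: a generic add plus an optional
 * same-type fast path, which may be left NULL. */
typedef struct p6_vtable {
    long (*get_int)(const p6_scalar *self);
    void (*add_int)(p6_scalar *dst, const p6_scalar *a, const p6_scalar *b);
    void (*add_same)(p6_scalar *dst, const p6_scalar *a, const p6_scalar *b);
} p6_vtable;

struct p6_scalar {
    const p6_vtable *vt;
    long ival;
};

static long native_get_int(const p6_scalar *self)
{
    return self->ival;
}

/* Generic path: go through each operand's accessor. */
static void native_add_int(p6_scalar *dst, const p6_scalar *a, const p6_scalar *b)
{
    dst->ival = a->vt->get_int(a) + b->vt->get_int(b);
}

/* Fast path: both operands are native ints, so read the field directly. */
static void native_add_same(p6_scalar *dst, const p6_scalar *a, const p6_scalar *b)
{
    dst->ival = a->ival + b->ival;
}

static const p6_vtable native_int_vtable = {
    native_get_int, native_add_int, native_add_same
};

/* Runtime dispatch: use the same-type entry only when both
 * scalars share a vtable and that vtable provides one.
 * dst is assumed to already be a native-int scalar. */
void p6_add(p6_scalar *dst, const p6_scalar *a, const p6_scalar *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
    if (a->vt == b->vt && a->vt->add_same != NULL)
        a->vt->add_same(dst, a, b);
    else
        a->vt->add_int(dst, a, b);
}
```

The check is one pointer comparison, so it should pay for
itself whenever the generic path involves a conversion.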

> Comments? I'm trying to balance out accuracy and DWIMmery with
> cost here, and I'm not 100% sure things are quite right yet.
> 
>                                       Dan
> 

-- BKS
