Re: Internal integral types

Bryan C. Warnock Sat, 11 May 2002 14:14:15 -0700

On Thu, 2002-05-09 at 12:09, Andy Dougherty wrote:
> I was looking at some of the 750 warnings generated by the latest parrot
> build (with INTVAL = opcode_t = long long).  Lots of them stem from some
> minor confusion about which integral types to use when.  To remove the
> warnings, I'd prefer to use the "correct" types rather than clutter
> the source with mindless casts.  But I'm unsure when to use what.


{snip}

> 
> Some of these are size_t; others, superficially quite similar, are
> UINTVAL.  Recall that in my configuration, size_t is 'unsigned long',
> while 'UINTVAL' is 'unsigned long long'.
> 
> Is there a pattern or rule-of-thumb for when to use what?

Well, there is supposed to be, and it's been largely my fault that
Parrot is currently stuck halfway between the old and the new.

The "quick" rule of thumb is that only the values that directly
represent user-space types are supposed to be of the *VAL types.  
IOW, when someone asks, from their language, to add n and m, n and
m should be represented by *VAL types.  Everything else should use
some more sane native C type.

Of course, it's not really that simple.  Parrot provides a gateway
between things that aren't really these numerical types, and its
own internals.  So a number may be an INTVAL at one point, but then
later be used to set the length of a string, which should (probably) be
size_t.

That leads to no end of sizing issues when casting back and forth.
However, I think once we address a separate issue, it may actually
become a little clearer.

As an aside, Jarkko, Dan, and I had a brief discussion a while back
about what types to make some of these things, most likely in
conjunction with one of my earlier rants on the subject.  The "final"
decision was that C99 provides many of the types that we are looking
for, so we would define our metatypes in terms of C99 types, and
define those C99 types in terms of the standard C types in the event
the compiler doesn't support the subset of C99 types that we want.
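
A minimal sketch of that layering, assuming a 32-bit type is one of the
ones we want.  The demo_* names are hypothetical and are not what
Parrot actually uses; the pre-C99 branch only checks int and long.

```c
#include <limits.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
   /* C99 compiler: take the exact-width types straight from the standard. */
#  include <stdint.h>
#else
   /* Pre-C99 compiler: define the C99 names from standard C types. */
#  if UINT_MAX == 0xFFFFFFFFUL
typedef int           int32_t;
typedef unsigned int  uint32_t;
#  elif ULONG_MAX == 0xFFFFFFFFUL
typedef long          int32_t;
typedef unsigned long uint32_t;
#  else
#    error "no 32-bit integer type found among int and long"
#  endif
#endif

/* Hypothetical metatypes, written only in terms of the C99 names. */
typedef int32_t  demo_int32;
typedef uint32_t demo_uint32;
```

The rest of the source would then only ever mention the metatypes, so
the fallback logic lives in exactly one header.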

Most of the truly internal integers represent memory and sizes, so
size_t does seem to be the most accurate "standard" type for many of
the current UINTVAL uses.

There are currently a couple of problems with the scheme and how it
relates to Parrot.  Most numbers serve a dual purpose.  They are an
entity in and of themselves, which imposes its own constraints on what
the numbers can be.  size_t, for instance, is an unsigned entity, so
you're given an unsigned type, and you'll get (and possibly need) the
full 32 or 64 bits of it.  That number, however, also exists as
strictly a number - something to be printed or inputted or converted
to a string.  In addition to Perl
merging strings and numbers as convenient, it merges unsigned and signed
numbers as convenient.  Generic numbers, however, are generally signed,
and so you only get 31 or 63 of the available bits for the number
representation.  (Obviously, I'm just addressing integers here.)

That boils everything down to one simple statement - Parrot user-space
numbers and most Parrot internal entities are *never* going to be the
same size.  You're either going to be a word long or a bit short.
We can play casting tricks and stuff, a la Perl 5, and hope that
everything all works out, but a lot of that is going to be dependent on
expected behavior.  What happens when you try to set the length of an
array or a string to a negative number?  An INTVAL larger than the
internal size?  The most responsible answer is probably to raise a
runtime exception, but that will require doing explicit testing when
casting numbers back and forth between entities.  Another possible way to
address this is to limit even unsigned types to not use the high bit. 
That will make casting simpler, but that will still require explicit
checking.
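
To make the explicit test concrete, here is a rough sketch of a checked
conversion from a signed user-space integer to a size_t length.  It
assumes INTVAL is long long, as in Andy's configuration; DEMO_INTVAL
and demo_intval_to_size are made-up names, and the "exception" is
reduced to a return code that the caller would turn into one.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for INTVAL; assumed to be long long for this sketch. */
typedef long long DEMO_INTVAL;

/*
 * Convert a user-space integer to an internal length.
 * Returns 1 and stores the result in *out on success.
 * Returns 0 and leaves *out untouched if the value is negative or
 * larger than size_t can hold; the caller would raise the runtime
 * exception at that point.
 */
static int demo_intval_to_size(DEMO_INTVAL value, size_t *out)
{
    if (value < 0)
        return 0;                       /* negative length */
    if ((uintmax_t)value > (uintmax_t)SIZE_MAX)
        return 0;                       /* INTVAL wider than size_t */
    *out = (size_t)value;
    return 1;
}
```

Going the other way (size_t to INTVAL) needs the mirror-image check
against the largest positive INTVAL, which is where the "bit short"
case shows up.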

However, there's another issue I raised on IRC that may dictate one
behavior or another, and that is auto-promotion and -demotion.  We are
going to need to explicitly check standard INTVAL math operations to
see if we need to promote them to BigInts or not.  We (possibly)
should do the same for FLOATVALs, if the standard math lib will end up
downcasting the FLOATVAL type because it doesn't have long double
versions of its functions.
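
For the integer case, the check has to happen before the addition,
since signed overflow is undefined behavior in C.  A sketch, again
assuming INTVAL is long long; demo_add_checked is a hypothetical
helper, and the promotion to a BigInt is left to the caller.

```c
#include <limits.h>

/* Stand-in for INTVAL; assumed to be long long for this sketch. */
typedef long long DEMO_INTVAL;
#define DEMO_INTVAL_MAX LLONG_MAX
#define DEMO_INTVAL_MIN LLONG_MIN

/*
 * Add two integers without ever overflowing.
 * Returns 1 and stores a + b in *sum if the result fits.
 * Returns 0 and leaves *sum untouched if it would overflow, in which
 * case the caller would redo the operation with a BigInt.
 */
static int demo_add_checked(DEMO_INTVAL a, DEMO_INTVAL b, DEMO_INTVAL *sum)
{
    if ((b > 0 && a > DEMO_INTVAL_MAX - b) ||
        (b < 0 && a < DEMO_INTVAL_MIN - b))
        return 0;
    *sum = a + b;
    return 1;
}
```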

That raises yet another kink, in that the current opcodes don't have a
method to support this yet: you can add two integers together, and stuff
the result in an integer register.  But what happens if you've
overflowed?  You're either going to have to branch all your code, so
that you can support II -> I or II -> PMC based on internal exceptions,
or you're going to need to resort to all opcodes dealing solely with
PMCs.  (That's actually how Parrot was originally designed - the INS
registers were *only* for the internals' use, and not for general opcode
use.)


-- 
Bryan C. Warnock
bwarnock@(gtemail.net|capita.com)
