Well, I'm beginning to feel it's habitual for me to periodically pop my head in and waffle on Parrot's core sizes. But waffle I shall.

- opcode_t

This has already been discussed, so I'll sum up. To remain compatible (and efficient) across the spectrum of 32-bit and 64-bit platforms, the value of opcode_t is limited to 32 bits. (Or, more accurately, 31 bits.) Although you could go larger on a 64-bit platform, the use of opcode_t as an array index and memory offset limits it to the size of the addressable memory anyway. (So the value would be downcast by the end, if not before. I can't find a reference to what integer type an array index is.) Not to mention all the *other* problems we'll have if we've got more than 2^31 different opcodes. (Although that's why there are UUIDs now, isn't it?)

Although Parrot needs to be able to convert between 32-bit and 64-bit wide opcodes, there's no reason to process them at anything other than native (size_t-ish) size, since a good 90%+ of the uses will be cast to that size anyway.

- INTVAL

Early on, I was a big fan of making INTVALs as big as you could. Having been bitten by integer rollover, and having watched Perl 5 struggle toward complete 64-bit integer support, I considered huge INTVALs important. As Parrot has evolved, I've come to realize that what I *really* want is to be able to program with huge INTVALs. Which isn't the same thing.

[Diagram: Program -> Opcodes -> Interpreter (Regs, Guts, PMCs) <-> System]

So when I write a program, there are going to be two types of numbers, user and system. (For lack of imagination.) User numbers, of course, are the numbers that exist for their own purpose, and for the user's benefit.

    $a = 5;
    $b = $a * 2 + 6;

System numbers are those marked "internal use only": file numbers, array indices, counters, the language infrastructure. These bubble down to the guts of the interpreter, and eventually to the system. If INTVAL is wider than the natural system width, conversion is in order. (For the sake of using real numbers, I'll assume 32/64.)
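
To make that conversion concrete, here is a minimal C sketch of narrowing a wide INTVAL to a native-width value with a range check. It assumes the 64/32 split used above. The typedefs and the helper name `intval_to_native` are hypothetical stand-ins for illustration, not Parrot's actual definitions.

```c
#include <stdint.h>

/* Assumption: a 64-bit INTVAL on a platform whose native width is 32 bits.
   These typedefs are illustrative only, not Parrot's real definitions. */
typedef int64_t INTVAL;
typedef int32_t native_int;

/* Hypothetical helper: narrow an INTVAL to native width.
   Returns 1 and stores the result in *out if the value fits,
   returns 0 and leaves *out untouched if it does not. */
int intval_to_native(INTVAL in, native_int *out)
{
    if (in < INT32_MIN || in > INT32_MAX)
        return 0;               /* would not survive the trip to the system */
    *out = (native_int)in;      /* safe: value is known to be in range */
    return 1;
}
```

A caller would check the return value before handing the number to the system. A bare `(native_int)in` cast on an out-of-range value gives an implementation-defined result instead, which is the unchecked truncation at issue.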

Currently, the flow is, in variable sizes:

    Opcodes: 32 (constants are limited by the spec)
    PMCs   : 64
    Regs   : 64
    Guts   : 64/32 mix
    System : 32

What's troublesome is the rash of conversions between the system and some guts, those guts and other guts, or those guts and registers. (Besides the extra cost of schlepping around the extra data, size differentials between INTVALs and pointers (which are problematic to begin with), unchecked truncation, and the added burden on the JIT, it's not really a problem.)

And for what? To be able to add large numbers? Numbers, as a type in a language that rides upon Parrot, never really reach beyond the boundaries of the PMCs themselves. The majority of numerics passed down through the registers are destined for conversion anyway. The flow *really* is, in value sizes:

    Opcodes: 32 (constants are limited by the spec)
    PMCs   : 64
    Regs   : 32
    Guts   : 32
    System : 32

Certainly, much like the physical machine the virtual machine runs on, it needs to support, or at least not preclude, wider numeric types for access by languages. But given that the bulk of the virtual machine maps directly onto the physical one, that should probably be relegated to just support.

For example, take Perl 5's struggle for maximal bitness. Given that Perl 6 will continue in that direction - and further, if you consider auto-promotion to arbitrarily sized numbers - and that the language will provide all of the functionality within its PMCs, why does it need the interpreter to do any more than not get in its way? (Consider, for a moment, that bytecode strives to be portable across all Parrot virtual machines, which implies that nothing in the bytecode, nor in the supporting languages, should be dependent on Parrot being configured with extended integers in the first place.)

On the off chance that a language with extended numerics wants to use registers, how feasible would it be (for the JIT, compiler, etc.) to borrow a page from the physical hardware and simply join two smaller registers together? (The advantage of contiguous memory regions.)

- FLOATVAL

The same principle, with a twist. Like most operating systems, the interpreter doesn't really have a need - in and of itself - for floating point. Floats pretty much exist entirely for end calculations. So there's much less internal data flow of floats, and far fewer needless conversions. But there's also much less need for the interpreter itself to have configurably sized floats.

But then there's little reason not to have configurably sized floats. The JIT, I guess.

- Problems

Well, Parrot's had problems from the beginning with non-"long, double, long" configurations.

By keeping INTVAL and FLOATVAL as the maximum size supported (basically either "long" or "long long", or "double" or "long double"), languages can feel free to take advantage of what facilities are available to them, if they so choose.

But what of inter-language operability? Will the registers become the crossroads for data conversions between PMCs from different languages? It doesn't look that way, from the direction that PMCs have gone.

Can we simplify interpreter types this much, while still providing extended numerics to hosted languages?

-- 
Bryan C. Warnock
bwarnock@(gtemail.net|raba.com)