Re: PDD 4 internal data types, version 1.1

2001-03-30 Thread Bart Lateur
On Thu, 29 Mar 2001 19:24:21 +0200 (CEST), Tels wrote: And then, if we have BigFloat, we need a way to specify rounding and precision. Otherwise 1/3 eats up all memory or provides limits ;o) Er... may I suggest ratios as a data format? It won't work for sqrt(2) or PI, but it can easily store

BigFract was Re: PDD 4 internal data types, version 1.1

2001-03-30 Thread Tels
Moin, And then, if we have BigFloat, we need a way to specify rounding and precision. Otherwise 1/3 eats up all memory or provides limits ;o) Er... may I suggest ratios as a data format? It won't work for sqrt(2) or PI, but it can easily store 1/3 as two

Re: PDD 4 internal data types, version 1.1

2001-03-30 Thread David L. Nicol
Bart Lateur wrote: Er... may I suggest ratios as a data format? It won't work for sqrt(2) or PI, but it can easily store 1/3 as two (long) integers. You can postpone doing integer divisions until you need a result, at which time ^to
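The ratio idea in this message can be sketched in C. This is only an illustration under stated assumptions: the `ratio` type and every function name below are hypothetical, and fixed-width `int64_t` stands in for the "(long) integers" of the thread, so products can overflow where a bigint-backed version would not.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ratio type: a value is kept as numerator/denominator,
   and no division into a float happens until a result is requested. */
typedef struct {
    int64_t num;
    int64_t den;   /* kept strictly positive by ratio_make() */
} ratio;

/* Euclid's algorithm on absolute values. */
int64_t gcd_i64(int64_t a, int64_t b) {
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Build a ratio in lowest terms with a positive denominator. */
ratio ratio_make(int64_t num, int64_t den) {
    assert(den != 0);
    if (den < 0) { num = -num; den = -den; }
    int64_t g = gcd_i64(num, den);
    if (g > 1) { num /= g; den /= g; }
    ratio r = { num, den };
    return r;
}

/* a/b + c/d = (a*d + c*b) / (b*d); exact as long as nothing overflows. */
ratio ratio_add(ratio a, ratio b) {
    return ratio_make(a.num * b.den + b.num * a.den, a.den * b.den);
}

ratio ratio_mul(ratio a, ratio b) {
    return ratio_make(a.num * b.num, a.den * b.den);
}

/* The postponed division: only performed when a float is needed. */
double ratio_to_double(ratio r) {
    return (double)r.num / (double)r.den;
}
```

With this sketch, adding `ratio_make(1, 3)` to itself three times yields exactly 1/1, which is the property a rounded BigFloat cannot offer for 1/3.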

Re: PDD 4 internal data types, version 1.1

2001-03-30 Thread Dan Sugalski
At 08:58 AM 3/30/2001 -0500, Andy Dougherty wrote: On Fri, 30 Mar 2001, Bart Lateur wrote: On Thu, 29 Mar 2001 19:24:21 +0200 (CEST), Tels wrote: And then, if we have BigFloat, we need a way to specify rounding and precision. Otherwise 1/3 eats up all memory or provides limits ;o)

PDD 4 internal data types, version 1.1

2001-03-29 Thread Tels
Moin all, Dan wrote a lot of sensible things about Transparent BigInt/BigFloat support. I am all for it. ;-P Reasons: It's the Perl way. (Int = Float already works that way) Speed (For small numbers, use fast INT, for larger ones use BigInt; currently you must

Re: PDD 4: Internal data types

2001-03-22 Thread Buddha Buck
At 11:14 AM 03-22-2001 -0800, Hong Zhang wrote: Please not fight on wording. For most encodings I know of, the concept of normalization does not even exist. What is your definition of normalization? To me, the usual definition of "normalization" is conversion of something into a standard form,

Re: PDD 4: Internal data types

2001-03-22 Thread Simon Cozens
On Thu, Mar 22, 2001 at 11:14:53AM -0800, Hong Zhang wrote: Please not fight on wording. For most encodings I know of, the concept of normalization does not even exist. *boggle*. I don't think we're talking about the same Unicode. What is your definition of normalization? Well, either

Re: PDD 4: Internal data types

2001-03-22 Thread Simon Cozens
On Tue, Mar 06, 2001 at 01:21:20PM -0800, Hong Zhang wrote: The normalization has something to do with encoding. If you compare two strings with the same encoding, of course you don't have to care about it. Of course you do. Think about it. If I'm comparing "(Greek letter lower case alpha

Re: PDD 4: Internal data types

2001-03-09 Thread David Mitchell
As I see it, there will be 3 types of access to bigint/nums. 1) there is the internal implementation of the PMC types associated with them. This is where all the messy code gets hidden (assembler optimizations, register length-specific code etc). 2) PDD 2 requires that all PMC types return their

Re: PDD 4: Internal data types

2001-03-09 Thread Paolo Molaro
Most won't, honestly. At a guess, 90% of perl's current userbase doesn't care about Unicode for any reason other than XML, The next version of Gtk+ will use UTF-8. Qt uses Unicode already. Tk will probably move in the same direction if it doesn't do it already. So most user interface

Re: PDD 4 internal data types, version 1.1

2001-03-09 Thread Paolo Molaro
On 03/05/01 Dan Sugalski wrote: =item Arbitrary precision integers Big integers, or bigints, are arbitrary-length integer numbers. The only limit to the number of digits in a bigint is the lesser of the amount of memory available or the maximum value that can be represented by a CUV. This

Re: PDD 4 internal data types, version 1.1

2001-03-09 Thread Dan Sugalski
At 01:01 AM 3/10/2001 +0100, Paolo Molaro wrote: On 03/05/01 Dan Sugalski wrote: =item Arbitrary precision integers Big integers, or bigints, are arbitrary-length integer numbers. The only limit to the number of digits in a bigint is the lesser of the amount of memory available or the

Re: PDD 4: Internal data types

2001-03-08 Thread David Mitchell
The C structure that represents a bigint is: struct bigint { void *num_buffer; UV length; IV exponent; UV flags; } [snip] The C<num_buffer> pointer points to the buffer holding the actual number, C<length> is the length of the buffer, C<exponent> is the base 10

Re: PDD 4: Internal data types

2001-03-08 Thread Dan Sugalski
At 03:00 PM 3/8/2001 +, David Mitchell wrote: The C structure that represents a bigint is: struct bigint { void *num_buffer; UV length; IV exponent; UV flags; } [snip] The C<num_buffer> pointer points to the buffer holding the actual number,

Re: PDD 4: Internal data types

2001-03-08 Thread Hong Zhang
Looks like they do operations with 16-bit integers. I'd as soon go with 32-bit ones--wastes a little space, but should be faster. (Except where we should shift to 64-bit words) Using 32/31-bit requires general support of 64-bit arithmetic, for shift and multiply. Without it, we have to use

Re: PDD 4: Internal data types

2001-03-08 Thread Nicholas Clark
On Thu, Mar 08, 2001 at 11:43:31AM -0500, Dan Sugalski wrote: I was thinking maybe (length/4)*31-bit 2s complement to make portable overflow detection easier, but that would be only if there wasn't a good C library for this available to snag. The only portable integer overflow in ANSI C is

Re: PDD 4: Internal data types

2001-03-08 Thread Hong Zhang
I was thinking maybe (length/4)*31-bit 2s complement to make portable overflow detection easier, but that would be only if there wasn't a good C library for this available to snag. I believe Python uses (length/2)*15-bit 2's complement representation. Because bigint and bignum are
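One way to read the 31-bit (or 15-bit) digit proposal: if each machine word keeps one bit spare, the sum of two digits plus a carry always fits in the word, so the carry can be read off portably without relying on overflow behaviour. The sketch below illustrates that for unsigned magnitudes only; it is not the 2's complement layout discussed in the thread, and `mag_add`, `DIGIT_BITS`, and `DIGIT_MASK` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIGIT_BITS 31u
#define DIGIT_MASK 0x7FFFFFFFu   /* low 31 bits hold one digit */

/* Add two n-digit magnitudes stored least significant digit first.
   out must have room for n + 1 digits. Returns the digits used. */
size_t mag_add(const uint32_t *a, const uint32_t *b, size_t n, uint32_t *out) {
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        assert(a[i] <= DIGIT_MASK && b[i] <= DIGIT_MASK);
        /* Largest possible value is 2*(2^31 - 1) + 1 = 2^32 - 1,
           so this 32-bit sum never wraps. */
        uint32_t sum = a[i] + b[i] + carry;
        out[i] = sum & DIGIT_MASK;     /* keep the 31-bit digit */
        carry = sum >> DIGIT_BITS;     /* the spare bit is the carry */
    }
    out[n] = carry;
    return carry ? n + 1 : n;
}
```

The price of the spare bit is about 3% more storage per number; multiplication would still need a wider intermediate, which is the concern raised elsewhere in the thread about 64-bit support.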

Re: PDD 4: Internal data types

2001-03-08 Thread Dan Sugalski
At 06:49 PM 3/8/2001 +, Nicholas Clark wrote: On Thu, Mar 08, 2001 at 11:43:31AM -0500, Dan Sugalski wrote: I was thinking maybe (length/4)*31-bit 2s complement to make portable overflow detection easier, but that would be only if there wasn't a good C library for this available to

Re: PDD 4: Internal data types

2001-03-08 Thread Dan Sugalski
At 10:49 AM 3/8/2001 -0800, Hong Zhang wrote: I was thinking maybe (length/4)*31-bit 2s complement to make portable overflow detection easier, but that would be only if there wasn't a good C library for this available to snag. I believe Python uses (length/2)*15-bit 2's complement

Re: PDD 4: Internal data types

2001-03-08 Thread Dan Sugalski
At 03:35 PM 3/8/2001 -0500, Bryan C. Warnock wrote: On Thursday 08 March 2001 11:43, Dan Sugalski wrote: It probably ought to be left undefined, in case we switch implementations later. Er, except, aren't you (we) supposed to be defining the implementation? I thought the hand-waving period

Re: PDD 4: Internal data types

2001-03-08 Thread Bryan C. Warnock
On Thursday 08 March 2001 11:43, Dan Sugalski wrote: It probably ought to be left undefined, in case we switch implementations later. Er, except, aren't you (we) supposed to be defining the implementation? I thought the hand-waving period was over, and we're doing specifications. If

Re: PDD 4: Internal data types

2001-03-08 Thread Nicholas Clark
On Thu, Mar 08, 2001 at 11:31:06AM -0800, Hong Zhang wrote: Looks like they do operations with 16-bit integers. I'd as soon go with 32-bit ones--wastes a little space, but should be faster. (Except where we should shift to 64-bit words) Using 32/31-bit requires general support of 64-bit

Re: PDD 4: Internal data types

2001-03-08 Thread Dan Sugalski
At 08:34 PM 3/8/2001 +, Nicholas Clark wrote: On Thu, Mar 08, 2001 at 01:55:57PM -0500, Dan Sugalski wrote: At 06:49 PM 3/8/2001 +, Nicholas Clark wrote: On Thu, Mar 08, 2001 at 11:43:31AM -0500, Dan Sugalski wrote: I was thinking maybe (length/4)*31-bit 2s complement to make

Re: PDD 4: Internal data types

2001-03-08 Thread Nicholas Clark
On Thu, Mar 08, 2001 at 04:28:48PM -0500, Dan Sugalski wrote: At 08:43 PM 3/8/2001 +, Nicholas Clark wrote: I think most processors that do 32x32 multiply provide a way to get the 64-bit result. Whether *we* can is another matter, of course, but if platform folks want to drop to

Re: PDD 4: Internal data types

2001-03-08 Thread Dan Sugalski
At 10:28 PM 3/8/2001 +, Nicholas Clark wrote: On Thu, Mar 08, 2001 at 04:28:48PM -0500, Dan Sugalski wrote: At 08:43 PM 3/8/2001 +, Nicholas Clark wrote: I think most processors that do 32x32 multiply provide a way to get the 64-bit result. Whether *we* can is another matter, of

Re: PDD 4: Internal data types

2001-03-08 Thread Hong Zhang
For bigint, we definitely need a highly portable implementation. People can do platform specific optimization on their own later. We should settle the generic implementation first, with proper encapsulation. Hong Do we need to settle on anything - can it vary by platform so that 64 bit

Re: PDD 4: Internal data types

2001-03-06 Thread Dan Sugalski
At 04:14 PM 3/5/2001 -0800, Hong Zhang wrote: Here is an example, "re`sume`" takes 6 characters in Latin-1, but could take 8 characters in Unicode. All Perl functions that directly deal with character position and length will be sensitive to encoding. I wonder how we should handle this

Re: PDD 4: Internal data types

2001-03-06 Thread Hong Zhang
Unless I really, *really* misread the unicode standard (which is distinctly possible) normalization has nothing to do with encoding, I understand what you are trying to say. But it is not very easy in practice. The normalization has something to do with encoding. If you compare two strings

Re: PDD 4: Internal data types

2001-03-06 Thread Dan Sugalski
At 01:21 PM 3/6/2001 -0800, Hong Zhang wrote: Unless I really, *really* misread the unicode standard (which is distinctly possible) normalization has nothing to do with encoding, I understand what you are trying to say. But it is not very easy in practice. The normalization has something to

Re: PDD 4: Internal data types

2001-03-05 Thread Dan Sugalski
At 12:01 PM 3/5/2001 -0800, Hong Zhang wrote: struct perl_string { void *string_buffer; UV length; UV allocated; UV flags; } The low three bits of the flags field are reserved for the type of the string. The various types are: =over 4 =item
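A minimal sketch of how the low three bits of `flags` could carry the string type. The struct is the one quoted above; the type values (BINARY 0 through NATIVE_3 7) are the ones quoted elsewhere in this thread. Everything else is an assumption for illustration: the `UV` typedef, the `STR_` prefixed enum names, the mask macro, and the two accessor functions.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long UV;   /* assumption: stand-in for Perl's UV */

struct perl_string {
    void *string_buffer;
    UV length;
    UV allocated;
    UV flags;
};

/* Type codes as listed in the thread; enum names are hypothetical. */
enum string_type {
    STR_BINARY   = 0,
    STR_ASCII    = 1,
    STR_EBCDIC   = 2,
    STR_UTF_8    = 3,
    STR_UTF_32   = 4,
    STR_NATIVE_1 = 5,
    STR_NATIVE_2 = 6,
    STR_NATIVE_3 = 7
};

#define STRING_TYPE_MASK ((UV)0x07)   /* low three bits of flags */

/* Read the type without disturbing the remaining flag bits. */
enum string_type string_type_of(const struct perl_string *s) {
    assert(s != NULL);
    return (enum string_type)(s->flags & STRING_TYPE_MASK);
}

/* Replace the type and leave every other flag bit as it was. */
void string_set_type(struct perl_string *s, enum string_type t) {
    assert(s != NULL);
    s->flags = (s->flags & ~STRING_TYPE_MASK) | ((UV)t & STRING_TYPE_MASK);
}
```

Because the type occupies a dense 0..7 range, `string_type_of` can feed a `switch` directly, which is the argument made later in the thread for tagging in the low bits.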

PDD 4 internal data types, version 1.1

2001-03-05 Thread Dan Sugalski
Here's a mildly fixed up version of PDD 4. References to the NUM and INT types have been reduced to generic concepts rather than data types of some sort. Cut Here-- =head1 TITLE Perl's internal data types =head1 VERSION 1.1 =head2 CURRENT Maintainer: Dan Sugalski

Re: PDD 4: Internal data types

2001-03-05 Thread Hong Zhang
Here is an example, "re`sume`" takes 6 characters in Latin-1, but could take 8 characters in Unicode. All Perl functions that directly deal with character position and length will be sensitive to encoding. I wonder how we should handle this case. My first inclination is to force
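The 6-versus-8 count can be reproduced with a small sketch. It assumes the word meant is "résumé", that the 8-character form is the decomposed spelling (plain `e` followed by U+0301 COMBINING ACUTE ACCENT), and that both are held as UTF-8; the function and constant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Precomposed form: U+00E9 is encoded as C3 A9 in UTF-8. */
const char RESUME_PRECOMPOSED[] = "r\xC3\xA9sum\xC3\xA9";

/* Decomposed form: 'e' followed by U+0301, encoded as CC 81. */
const char RESUME_DECOMPOSED[] = "re\xCC\x81sume\xCC\x81";

/* Count code points in a NUL-terminated, well-formed UTF-8 string
   by skipping continuation bytes (those of the form 10xxxxxx). */
size_t utf8_codepoints(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0u) != 0x80u) {
            n++;
        }
    }
    return n;
}
```

Under these assumptions the first string counts as 6 code points and the second as 8, even though a reader sees the same six letters, so any length or position function has to state which form it measures.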

PDD 4: Internal data types

2001-03-02 Thread Dan Sugalski
Yes, I know I promised the GC PDD, but this was simpler and half finished. Now it's all finished, and can be used some in both the vtable PDD and the utility functions PDD. -Cut here with a sharp knife =head1 TITLE Perl's internal data types =head1 VERSION 1 =head2 CURRENT

Re: PDD 4: Internal data types

2001-03-02 Thread Dan Sugalski
At 01:36 PM 3/2/2001 -0500, Andy Dougherty wrote: On Fri, 2 Mar 2001, Dan Sugalski wrote: =head2 Integer data types Integer data types are generically referred to as C<INT>s. There is an C<INT> typedef that is guaranteed to hold any integer type. [gazing into crystal ball . . . ] I predict

RE: Questions about PDD 4: Internal data types

2001-03-02 Thread Dan Sugalski
At 02:01 PM 3/2/2001 -0500, wiz wrote: =item BINARY (0) =item ASCII (1) =item EBCDIC (2) =item UTF_8 (3) =item UTF_32 (4) =item NATIVE_1 (5) through NATIVE_3 (7) A little more complex, but why not use bits 3-7 as actual flags: 7|6|5|4|3|2|1|0 0 0 0 0 1 x x x = UTF UTF_8 0 0 0 1 1 x

Re: Questions about PDD 4: Internal data types

2001-03-02 Thread Nicholas Clark
On Fri, Mar 02, 2001 at 02:01:35PM -0500, Dan Sugalski wrote: At 02:01 PM 3/2/2001 -0500, wiz wrote: =item BINARY (0) =item ASCII (1) =item EBCDIC (2) =item UTF_8 (3) =item UTF_32 (4) =item NATIVE_1 (5) through NATIVE_3 (7) A little more complex, but why not use bits 3-7 as

Questions about PDD 4: Internal data types

2001-03-02 Thread Hong Zhang
Integer data types are generically referred to as C<INT>s. There is an C<INT> typedef that is guaranteed to hold any integer type. Does such a thing exist? Unless it is BIGINT. Should we scrap the buffer pointer and just tack the buffer on the end of the structure? Saves a level of indirection,

Re: Questions about PDD 4: Internal data types

2001-03-02 Thread Dan Sugalski
At 10:31 AM 3/2/2001 -0800, Hong Zhang wrote: Integer data types are generically referred to as C<INT>s. There is an C<INT> typedef that is guaranteed to hold any integer type. Does such thing exist? Unless it is BIGINT. I'm confused here, looks like you're missing some words from those

Re: PDD 4: Internal data types

2001-03-02 Thread Andy Dougherty
On Fri, 2 Mar 2001, Dan Sugalski wrote: =head2 Integer data types Integer data types are generically referred to as C<INT>s. There is an C<INT> typedef that is guaranteed to hold any integer type. [gazing into crystal ball . . . ] I predict some header somewhere is going to already #define

Re: Questions about PDD 4: Internal data types

2001-03-02 Thread Andy Dougherty
On Fri, 2 Mar 2001, Dan Sugalski wrote: At 10:31 AM 3/2/2001 -0800, Hong Zhang wrote: Integer data types are generically referred to as C<INT>s. There is an C<INT> typedef that is guaranteed to hold any integer type. The intention is that if you need to deal with integers in an abstract

Re: PDD 4: Internal data types

2001-03-02 Thread Jarkko Hietaniemi
On Fri, Mar 02, 2001 at 12:05:59PM -0800, Hong Zhang wrote: at some points it becomes necessary to have an unsigned type for "the largest integer" which in this case would be 72 bits. [and on a machine with nothing larger than 32 will be 32] Sure. The size of an INT will probably be

Re: Questions about PDD 4: Internal data types

2001-03-02 Thread Hong Zhang
I believe we should use low bits for tagging. It will make switch case much faster. If you still emphasize speed, we can make 0x05 = UTF-8 0x06 = UTF-16 0x07 = UTF-32 #define IS_UTF_ANY(a) \ (((a)->flags & 0x07) >= UTF-8) However, I don't believe it is needed. Hong If your interest is

RE: Questions about PDD 4: Internal data types

2001-03-02 Thread NeonEdge
If your interest is in speed alone, then adding UTF_16 might offer options as well: FORMAT (enc_flags): 7|6|5|4|3|2|1|0 x x 0 0 1 x x x = UTF_8 x x 0 1 0 x x x = UTF_16 x x 1 0 0 x x x = UTF_32 then: #define UTF 56 utf_encoding = UTF & enc_flags; if( utf_encoding ) { cout << "String is UTF_"

Re: PDD 4: Internal data types

2001-03-02 Thread Hong Zhang
at some points it becomes necessary to have an unsigned type for "the largest integer" which in this case would be 72 bits. [and on a machine with nothing larger than 32 will be 32] Sure. The size of an INT will probably be either 32 or 64 bits, depending both on the size of an IV and the

Re: PDD 4: Internal data types

2001-03-02 Thread Uri Guttman
"DS" == Dan Sugalski writes: DS> One needs to specify *which* PDP is the target. The PDP-11 was the DS> most popular of the bunch, and it's a 16-bit machine. One of the DS> TOPS machines was 36 bit, IIRC, with either 7 or 9 bit DS> chars. (Can't remember which) i don't

Re: PDD 4: Internal data types

2001-03-02 Thread Dan Sugalski
At 01:38 PM 3/2/2001 -0800, Hong Zhang wrote: I was hoping to get us something that was guaranteed to hold an integer, no matter what it was, so you could do something like: struct thingie { UV type; INT my_int; } What is the purpose of doing this? At this point

Re: PDD 4: Internal data types

2001-03-02 Thread Dan Sugalski
At 12:05 PM 3/2/2001 -0800, Hong Zhang wrote: at some points it becomes necessary to have an unsigned type for "the largest integer" which in this case would be 72 bits. [and on a machine with nothing larger than 32 will be 32] Sure. The size of an INT will probably be either 32 or 64

Re: PDD 4: Internal data types

2001-03-02 Thread Hong Zhang
I was hoping to get us something that was guaranteed to hold an integer, no matter what it was, so you could do something like: struct thingie { UV type; INT my_int; } What is the purpose of doing this? The SV is guaranteed to hold anything. Why we need a type that can