On Thu, 29 Mar 2001 19:24:21 +0200 (CEST), Tels wrote:
And then, if we have BigFloat, we need a way to specify rounding and
precision. Otherwise 1/3 eats up all memory or provides limits ;o)
Er... may I suggest ratios as a data format? It won't work for sqrt(2)
or PI, but it can easily store
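The ratio suggestion can be sketched as a small C type; this is a hypothetical illustration (the names `ratio`, `ratio_make`, `ratio_add` are mine, not anything proposed on the list), exact for values like 1/3 that a BigFloat can only approximate, and indeed useless for sqrt(2) or pi:

```c
/* A ratio (rational) value: exact for 1/3, hopeless for sqrt(2). */
typedef struct { long num, den; } ratio;

/* Euclid's algorithm; result is always non-negative. */
static long gcd_l(long a, long b)
{
    while (b != 0) { long t = a % b; a = b; b = t; }
    return a < 0 ? -a : a;
}

/* Normalize on construction so comparisons stay cheap. */
ratio ratio_make(long num, long den)
{
    long g = gcd_l(num, den);
    ratio r = { num / g, den / g };
    if (r.den < 0) { r.num = -r.num; r.den = -r.den; }
    return r;
}

ratio ratio_add(ratio a, ratio b)
{
    return ratio_make(a.num * b.den + b.num * a.den, a.den * b.den);
}
```

Note the fixed-width `long` fields: a real implementation would need bigint numerators and denominators, which is exactly why this thread ties ratios back to the bigint design.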
Hello all,
Dan wrote a lot of sensible things about Transparent BigInt/BigFloat support.
I am all for it. ;-P
Reasons:
It's the Perl way. (Int => Float already works that way)
Speed. (For small numbers, use a fast INT; for larger ones, BigInt; currently
you must
On Tue, Mar 06, 2001 at 01:21:20PM -0800, Hong Zhang wrote:
The normalization has something to do with encoding. If you compare two
strings with the same encoding, of course you don't have to care about it.
Of course you do. Think about it.
If I'm comparing "(Greek letter lower case alpha
The normalization has something to do with encoding. If you compare two
strings with the same encoding, of course you don't have to care about
it.
Of course you do. Think about it.
I said "you don't have to". You can use "==" for codepoint comparison, and
something like
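The codepoint-comparison half of that suggestion can be sketched as follows (a minimal illustration of mine, assuming UTF-32 buffers for simplicity; the truncated "something like" presumably named a normalization-aware alternative):

```c
#include <stddef.h>
#include <stdint.h>

/* Raw codepoint equality on UTF-32 buffers -- the "==" case above.
   It treats U+00E9 ("é") and U+0065 U+0301 ("e" + combining acute)
   as different strings; a normalization-aware comparison would first
   bring both sides to a canonical form (e.g. NFC) and then find
   them equal. */
int cp_equal(const uint32_t *a, size_t alen,
             const uint32_t *b, size_t blen)
{
    if (alen != blen)
        return 0;
    for (size_t i = 0; i < alen; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```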
As I see it, there will be 3 types of access to bigint/nums.
1) there is the internal implementation of the PMC types associated
with them. This is where all the messy code gets hidden (assembler
optimizations, register length-specific code etc).
2) PDD 2 requires that all PMC types return their
The C structure that represents a bigint is:
struct bigint {
    void *num_buffer;
    UV    length;
    IV    exponent;
    UV    flags;
}
[snip]
The C<num_buffer> pointer points to the buffer holding the actual
number, C<length> is the length of the buffer, C<exponent> is the
base 10
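One plausible reading of the draft's fields, sketched in compilable form (the `UV`/`IV` typedefs and the digit-buffer interpretation are my assumptions, not anything the PDD has fixed; negative exponents are omitted for brevity):

```c
#include <stdint.h>

typedef uint32_t UV;   /* assumed stand-ins for Perl's UV/IV */
typedef int32_t  IV;

struct bigint {
    void *num_buffer;
    UV    length;
    IV    exponent;
    UV    flags;
};

/* Reading: num_buffer holds the decimal digits and exponent scales
   them by a power of ten, so digits "123" with exponent 2 mean
   12300. */
double bigint_to_double(const struct bigint *b)
{
    const char *d = (const char *)b->num_buffer;
    double v = 0.0;
    for (UV i = 0; i < b->length; i++)
        v = v * 10.0 + (d[i] - '0');
    for (IV e = b->exponent; e > 0; e--)
        v *= 10.0;
    return v;
}
```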
At 03:00 PM 3/8/2001, David Mitchell wrote:
The C structure that represents a bigint is:
struct bigint {
    void *num_buffer;
    UV    length;
    IV    exponent;
    UV    flags;
}
[snip]
The C<num_buffer> pointer points to the buffer holding the actual
number,
On Thu, Mar 08, 2001 at 11:43:31AM -0500, Dan Sugalski wrote:
I was thinking maybe (length/4)*31-bit 2s complement to make portable
overflow detection easier, but that would be only if there wasn't a good C
library for this available to snag.
The only portable integer overflow in ANSI C is
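The truncated point above is presumably the unsigned-wraparound guarantee: ANSI C defines unsigned arithmetic modulo 2^N, which makes overflow detectable without any platform tricks. A minimal sketch built on that guarantee (my illustration, not code from the thread):

```c
#include <stdint.h>

/* Unsigned addition wraps modulo 2^32, which is well-defined in
   ANSI C -- so the add overflowed iff the sum came out smaller
   than one of the operands. */
int add_overflows(uint32_t a, uint32_t b, uint32_t *sum)
{
    *sum = a + b;       /* well-defined wraparound */
    return *sum < a;    /* 1 iff the addition overflowed */
}
```

Signed overflow, by contrast, is undefined behavior in ANSI C, which is why the 31-bits-in-a-32-bit-word scheme keeps a spare bit instead of relying on it.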
I was thinking maybe (length/4)*31-bit 2s complement to make portable
overflow detection easier, but that would be only if there wasn't a good C
library for this available to snag.
I believe Python uses (length/2)*15-bit 2's complement representation.
Because bigint and bignum are
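The (length/4)*31-bit idea can be sketched concretely; this is my illustration of the technique (little-endian limb order and equal lengths assumed for brevity), not an agreed design:

```c
#include <stddef.h>
#include <stdint.h>

#define LIMB_BITS 31
#define LIMB_MASK 0x7FFFFFFFu

/* Each limb keeps 31 value bits in a uint32_t, so limb + limb +
   carry never exceeds 2^32 - 1 and the spare top bit exposes the
   carry portably -- no compiler- or CPU-specific overflow tricks. */
uint32_t limb_add(const uint32_t *a, const uint32_t *b,
                  uint32_t *out, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t s = a[i] + b[i] + carry;  /* fits in 32 bits */
        out[i] = s & LIMB_MASK;
        carry  = s >> LIMB_BITS;
    }
    return carry;   /* overall carry out */
}
```

Python's 15-bits-in-a-16-bit-short scheme mentioned below is the same trick one size down.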
At 06:49 PM 3/8/2001, Nicholas Clark wrote:
On Thu, Mar 08, 2001 at 11:43:31AM -0500, Dan Sugalski wrote:
I was thinking maybe (length/4)*31-bit 2s complement to make portable
overflow detection easier, but that would be only if there wasn't a good C
library for this available to
At 10:49 AM 3/8/2001 -0800, Hong Zhang wrote:
I was thinking maybe (length/4)*31-bit 2s complement to make portable
overflow detection easier, but that would be only if there wasn't a good C
library for this available to snag.
I believe Python uses (length/2)*15-bit 2's complement
On Thu, Mar 08, 2001 at 01:55:57PM -0500, Dan Sugalski wrote:
At 06:49 PM 3/8/2001, Nicholas Clark wrote:
On Thu, Mar 08, 2001 at 11:43:31AM -0500, Dan Sugalski wrote:
I was thinking maybe (length/4)*31-bit 2s complement to make portable
overflow detection easier, but that would be
On Thursday 08 March 2001 11:43, Dan Sugalski wrote:
It probably ought to be left undefined, in case we switch implementations
later.
Er, except, aren't you (we) supposed to be defining the implementation?
I thought the hand-waving period was over, and we're doing specifications.
If
On Thu, Mar 08, 2001 at 11:31:06AM -0800, Hong Zhang wrote:
Looks like they do operations with 16-bit integers. I'd as soon go with
32-bit ones--wastes a little space, but should be faster. (Except where we
should shift to 64-bit words)
Using 32/31-bit requires general support of 64-bit
At 03:35 PM 3/8/2001 -0500, Bryan C. Warnock wrote:
On Thursday 08 March 2001 11:43, Dan Sugalski wrote:
It probably ought to be left undefined, in case we switch implementations
later.
Er, except, aren't you (we) supposed to be defining the implementation?
I thought the hand-waving period
At 08:43 PM 3/8/2001, Nicholas Clark wrote:
On Thu, Mar 08, 2001 at 11:31:06AM -0800, Hong Zhang wrote:
Looks like they do operations with 16-bit integers. I'd as soon go with
32-bit ones--wastes a little space, but should be faster. (Except where
we should shift to 64-bit words)
On Thu, Mar 08, 2001 at 04:28:48PM -0500, Dan Sugalski wrote:
At 08:43 PM 3/8/2001, Nicholas Clark wrote:
I think most processors that do 32x32 multiply provide a way to get the
64-bit result. Whether *we* can is another matter, of course, but if
platform folks want to drop to
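In C terms the widening multiply under discussion is just a cast once a 64-bit type exists; whether such a type exists everywhere is exactly the portability question being raised. A minimal sketch (mine, for illustration):

```c
#include <stdint.h>

/* 32x32 -> 64 multiply: with a 64-bit type available this is one
   cast, and compilers typically lower it to the CPU's widening
   multiply (e.g. the EDX:EAX result pair of x86 MUL). */
uint64_t mul32x32(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}
```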
At 10:28 PM 3/8/2001, Nicholas Clark wrote:
On Thu, Mar 08, 2001 at 04:28:48PM -0500, Dan Sugalski wrote:
At 08:43 PM 3/8/2001, Nicholas Clark wrote:
I think most processors that do 32x32 multiply provide a way to get the
64-bit result. Whether *we* can is another matter, of
For bigint, we definitely need a highly portable implementation.
People can do platform specific optimization on their own later.
We should settle the generic implementation first, with proper
encapsulation.
Hong
Do we need to settle on anything - can it vary by platform so that 64 bit
At 03:03 PM 3/8/2001 -0800, Hong Zhang wrote:
For bigint, we definitely need a highly portable implementation.
People can do platform specific optimization on their own later.
We should settle the generic implementation first, with proper
encapsulation.
Care to start one? I'm about to start the
At 04:14 PM 3/5/2001 -0800, Hong Zhang wrote:
Here is an example: "résumé" takes 6 characters in Latin-1, but
could take 8 characters in Unicode. All Perl functions that directly
deal with character position and length will be sensitive to encoding.
I wonder how we should handle this
Unless I really, *really* misread the unicode standard (which is
distinctly possible) normalization has nothing to do with encoding,
I understand what you are trying to say. But it is not very easy in
practice.
The normalization has something to do with encoding. If you compare two
strings
At 01:21 PM 3/6/2001 -0800, Hong Zhang wrote:
Unless I really, *really* misread the unicode standard (which is
distinctly possible) normalization has nothing to do with encoding,
I understand what you are trying to say. But it is not very easy in
practice. The normalization has something to
struct perl_string {
    void *string_buffer;
    UV    length;
    UV    allocated;
    UV    flags;
}
The low three bits of the flags field are reserved for the type of the
string. The various types are:
=over 4
=item BINARY (0)
=item ASCII (1)
=item EBCDIC (2)
=item
At 12:01 PM 3/5/2001 -0800, Hong Zhang wrote:
struct perl_string {
    void *string_buffer;
    UV    length;
    UV    allocated;
    UV    flags;
}
The low three bits of the flags field are reserved for the type of the
string. The various types are:
=over 4
=item
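The low-three-bits scheme from the draft can be sketched directly; the enum values below are the ones the PDD lists (BINARY=0 through NATIVE_3=7), while the `STR_` prefix and function name are mine:

```c
/* String type lives in bits 0-2 of flags; the remaining bits stay
   free for other flags. Values as listed in the PDD draft. */
enum str_type { STR_BINARY = 0, STR_ASCII  = 1, STR_EBCDIC = 2,
                STR_UTF_8  = 3, STR_UTF_32 = 4,
                STR_NATIVE_1 = 5, STR_NATIVE_2 = 6, STR_NATIVE_3 = 7 };

#define STR_TYPE_MASK 0x7u

unsigned string_type(unsigned flags)
{
    return flags & STR_TYPE_MASK;
}
```

This is the "low bits for tagging" layout Hong argues for later in the thread: a dense 0..7 value that a C `switch` can dispatch on cheaply.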
Here's a mildly fixed up version of PDD 4. References to the NUM and INT
types have been reduced to generic concepts rather than data types of some
sort.
Cut Here--
=head1 TITLE
Perl's internal data types
=head1 VERSION
1.1
=head2 CURRENT
Maintainer: Dan Sugalski
Here is an example: "résumé" takes 6 characters in Latin-1, but
could take 8 characters in Unicode. All Perl functions that directly
deal with character position and length will be sensitive to encoding.
I wonder how we should handle this case.
My first inclination is to force
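The example above, spelled out in code points (my illustration; the array names are hypothetical): the precomposed form matches Latin-1 at 6 characters, while the decomposed form (using U+0301 COMBINING ACUTE ACCENT) takes 8, so positions and lengths differ even though the two strings are canonically equal.

```c
#include <stdint.h>

/* "résumé" in UTF-32: precomposed (NFC) vs decomposed (NFD). */
static const uint32_t resume_nfc[] =
    { 'r', 0x00E9, 's', 'u', 'm', 0x00E9 };             /* 6 code points */
static const uint32_t resume_nfd[] =
    { 'r', 'e', 0x0301, 's', 'u', 'm', 'e', 0x0301 };   /* 8 code points */
```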
Yes, I know I promised the GC PDD, but this was simpler and half finished.
Now it's all finished, and can be used some in both the vtable PDD and the
utility functions PDD.
-Cut here with a sharp knife
=head1 TITLE
Perl's internal data types
=head1 VERSION
1
=head2 CURRENT
Integer data types are generically referred to as C<INT>s. There is an
C<INT> typedef that is guaranteed to hold any integer type.
Does such a thing exist? Unless it is BIGINT.
Should we scrap the buffer pointer and just tack the buffer on the end
of the structure? Saves a level of indirection,
On Fri, 2 Mar 2001, Dan Sugalski wrote:
=head2 Integer data types
Integer data types are generically referred to as C<INT>s. There is an
C<INT> typedef that is guaranteed to hold any integer type.
[gazing into crystal ball . . . ] I predict some header somewhere is going
to already #define
At 01:36 PM 3/2/2001 -0500, Andy Dougherty wrote:
On Fri, 2 Mar 2001, Dan Sugalski wrote:
=head2 Integer data types
Integer data types are generically referred to as C<INT>s. There is an
C<INT> typedef that is guaranteed to hold any integer type.
[gazing into crystal ball . . . ] I predict
=item BINARY (0)
=item ASCII (1)
=item EBCDIC (2)
=item UTF_8 (3)
=item UTF_32 (4)
=item NATIVE_1 (5) through NATIVE_3 (7)
A little more complex, but why not use bits 3-7 as actual flags:
7|6|5|4|3|2|1|0
0 0 0 0 1 x x x = UTF UTF_8
0 0 0 1 1 x x x = UTF UTF_32
x x 1 0 1 x x x = UTF
On Fri, Mar 02, 2001 at 02:01:35PM -0500, Dan Sugalski wrote:
At 02:01 PM 3/2/2001 -0500, wiz wrote:
=item BINARY (0)
=item ASCII (1)
=item EBCDIC (2)
=item UTF_8 (3)
=item UTF_32 (4)
=item NATIVE_1 (5) through NATIVE_3 (7)
A little more complex, but why not use bits 3-7 as
On Fri, Mar 02, 2001 at 01:40:40PM -0500, Dan Sugalski wrote:
At 01:36 PM 3/2/2001 -0500, Andy Dougherty wrote:
Do you also want an unsigned variant? (trying to spare Nick some of
the sign preservation madness he's currently battling in perl5.)
Well, we've got an unsigned version of the
At 07:12 PM 3/2/2001, Nicholas Clark wrote:
On Fri, Mar 02, 2001 at 02:01:35PM -0500, Dan Sugalski wrote:
At 02:01 PM 3/2/2001 -0500, wiz wrote:
=item BINARY (0)
=item ASCII (1)
=item EBCDIC (2)
=item UTF_8 (3)
=item UTF_32 (4)
=item NATIVE_1 (5) through NATIVE_3
At 07:17 PM 3/2/2001, Nicholas Clark wrote:
On Fri, Mar 02, 2001 at 01:40:40PM -0500, Dan Sugalski wrote:
At 01:36 PM 3/2/2001 -0500, Andy Dougherty wrote:
Do you also want an unsigned variant? (trying to spare Nick some of
the sign preservation madness he's currently battling in
If your interest is in speed alone, then adding UTF_16 might offer options as
well:
FORMAT (enc_flags):
7|6|5|4|3|2|1|0
x x 0 0 1 x x x = UTF_8
x x 0 1 0 x x x = UTF_16
x x 1 0 0 x x x = UTF_32
then:
#define UTF 56              /* 0x38: mask for bits 3-5 */
utf_encoding = UTF & enc_flags;
if ( utf_encoding ) {
    cout << "String is UTF_"
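A compilable version of that sketch, with the bit values taken from the FORMAT table above (UTF_8=001, UTF_16=010, UTF_32=100 in bits 3-5); the `ENC_` names and `enc_name` helper are mine:

```c
#define ENC_MASK   0x38u            /* 56: bits 3-5 of flags */
#define ENC_UTF_8  (0x01u << 3)
#define ENC_UTF_16 (0x02u << 3)
#define ENC_UTF_32 (0x04u << 3)

/* Any nonzero value under ENC_MASK means "some UTF encoding", which
   is the single-test speed win the proposal is after. */
const char *enc_name(unsigned flags)
{
    switch (flags & ENC_MASK) {
    case ENC_UTF_8:  return "UTF_8";
    case ENC_UTF_16: return "UTF_16";
    case ENC_UTF_32: return "UTF_32";
    default:         return "not UTF";
    }
}
```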
On Fri, Mar 02, 2001 at 12:05:59PM -0800, Hong Zhang wrote:
at some
points it becomes necessary to have an unsigned type for "the largest
integer" which in this case would be 72 bits.
[and on a machine with nothing larger than 32 will be 32]
Sure. The size of an INT will probably be
At 12:21 PM 3/2/2001 -0800, Hong Zhang wrote:
I believe we should use low bits for tagging. It will make switch
case much faster.
That's pretty much what I intended. The only reason not to have them as the
low bits is if there's some other field that uses multiple bits, and
optimizing for
At 12:05 PM 3/2/2001 -0800, Hong Zhang wrote:
at some
points it becomes necessary to have an unsigned type for "the largest
integer" which in this case would be 72 bits.
[and on a machine with nothing larger than 32 will be 32]
Sure. The size of an INT will probably be either 32 or 64
I was hoping to get us something that was guaranteed to hold an integer,
no matter what it was, so you could do something like:
struct thingie {
    UV type;
    INT my_int;
}
What is the purpose of doing this? The SV is guaranteed to hold anything.
Why do we need a type that can