RE: Transcoding patch

2001-10-10 Thread Dan Sugalski
At 01:36 PM 10/10/2001 +0200, Henrik Tougaard wrote: >From: Dan Sugalski [mailto:[EMAIL PROTECTED]] > >... > > strnative's the native encoding, right? It shouldn't be US-ASCII by > > default, particularly, at least not for everyone. (Does > > anyone handy have > > an 8-bit set that's not US ASCII

Re: Transcoding patch

2001-10-10 Thread Dan Sugalski
At 01:36 PM 10/10/2001 +0200, Bart Lateur wrote: >On Tue, 09 Oct 2001 21:12:00 -0400, Dan Sugalski wrote: > > >Does anyone handy have > >an 8-bit set that's not US ASCII as their default character set? > >EBCDIC? Or any ASCII variant with a different set of high-bit characters. If we could get,

RE: Transcoding patch

2001-10-10 Thread Henrik Tougaard
From: Dan Sugalski [mailto:[EMAIL PROTECTED]] >... > strnative's the native encoding, right? It shouldn't be US-ASCII by > default, particularly, at least not for everyone. (Does > anyone handy have > an 8-bit set that's not US ASCII as their default character > set? I use ISO-8859-1 - it's no

Re: Transcoding patch

2001-10-10 Thread Bart Lateur
On Tue, 09 Oct 2001 21:12:00 -0400, Dan Sugalski wrote: >Does anyone handy have >an 8-bit set that's not US ASCII as their default character set? EBCDIC? Not me. -- Bart.

Re: Transcoding patch

2001-10-09 Thread Dan Sugalski
At 12:58 AM 10/10/2001 +0100, Simon Cozens wrote: >On Tue, Oct 09, 2001 at 10:37:22AM -0400, Dan Sugalski wrote: > > On the other hand, I'd really, *really* rather not have Unicode > > constants in anything other than UTF-32 > >That's a bizarre decision; I'm sure you mean UCS-4 by that. Nope, I m

Re: Transcoding patch

2001-10-09 Thread Simon Cozens
On Tue, Oct 09, 2001 at 10:37:22AM -0400, Dan Sugalski wrote: > On the other hand, I'd really, *really* rather not have Unicode > constants in anything other than UTF-32 That's a bizarre decision; I'm sure you mean UCS-4 by that. I don't think UTF-32 can address outside of the BMP, but I can't q
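On the UTF-32 and BMP question raised in this message: as the Unicode standard defines the encoding forms today, one UTF-32 code unit holds any code point up to U+10FFFF directly, while UTF-16 reaches beyond the BMP through surrogate pairs. The following is a minimal C sketch of that relationship, not code from the patch; the function name is hypothetical.

```c
#include <stdint.h>

/* Combine a UTF-16 surrogate pair into the single 32-bit value that
 * UTF-32 would store for the same character.
 * Returns 0xFFFFFFFF if the inputs are not a valid high/low pair.
 * (Hypothetical helper, for illustration only.) */
uint32_t utf32_from_surrogates(uint16_t hi, uint16_t lo)
{
    if (hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF)
        return 0xFFFFFFFFu;
    return 0x10000u
         + (((uint32_t)hi - 0xD800u) << 10)
         + ((uint32_t)lo - 0xDC00u);
}
```

The result for any valid pair lies between U+10000 and U+10FFFF, that is, outside the BMP, and fits in one 32-bit unit.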

RE: Transcoding patch

2001-10-09 Thread Dan Sugalski
At 10:29 PM 10/9/2001 +0100, Tom Hughes wrote: >I haven't added the A prefix because I'm still not clear what >encoding those are supposed to map to. I can understand the >following mappings: > > N => enc_native > U => enc_utf32 > >but what is A supposed to map to exactly? or is the assembler >

RE: Transcoding patch

2001-10-09 Thread Tom Hughes
In message <[EMAIL PROTECTED]> Dan Sugalski <[EMAIL PROTECTED]> wrote: > utf8 and utf16 are both variable length encodings for space reasons. > There's not much reason to space-compact something then expand the heck out > of it. On the other hand, I'd really, *really* rather not have Un
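To illustrate the point that utf8 and utf16 are variable-length encodings, here is a minimal C sketch that computes how many code units a single code point occupies in each. It assumes the current definitions (UTF-8 limited to four bytes, code points up to U+10FFFF), which are stricter than what was in force in 2001, and the function names are hypothetical rather than anything from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode cp in UTF-8; 0 for surrogates or
 * values above U+10FFFF. (Hypothetical helper.) */
size_t utf8_encoded_len(uint32_t cp)
{
    if (cp < 0x80)                    return 1;
    if (cp < 0x800)                   return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* surrogate range */
    if (cp < 0x10000)                 return 3;
    if (cp <= 0x10FFFF)               return 4;
    return 0;
}

/* 16-bit code units needed to encode cp in UTF-16; 0 if invalid.
 * (Hypothetical helper.) */
size_t utf16_encoded_units(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* surrogate range */
    if (cp < 0x10000)                 return 1;
    if (cp <= 0x10FFFF)               return 2; /* surrogate pair */
    return 0;
}
```

By contrast, a fixed-width 32-bit encoding always uses one unit per code point, which is the trade-off between space and simplicity being discussed.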

RE: Transcoding patch

2001-10-09 Thread Dan Sugalski
At 03:03 PM 10/9/2001 -0500, Gibbs Tanton - tgibbs wrote: > > At 07:03 PM 10/8/2001 -0500, Gibbs Tanton - tgibbs wrote: > > >This looks good. > > > > > >Also, WRT the utf8_t, utf16_t, and utf32_t can we not just use >utf32_t and > > >then mask off the lower 8 or 16 bits? We can still have utf8_t

RE: Transcoding patch

2001-10-09 Thread Gibbs Tanton - tgibbs
> At 07:03 PM 10/8/2001 -0500, Gibbs Tanton - tgibbs wrote: > >This looks good. > > > >Also, WRT the utf8_t, utf16_t, and utf32_t can we not just use utf32_t and > >then mask off the lower 8 or 16 bits? We can still have utf8_t be defined > >as char to allow sizeof to work right and we can do siz

RE: Transcoding patch

2001-10-09 Thread Tom Hughes
In message <[EMAIL PROTECTED]> Dan Sugalski <[EMAIL PROTECTED]> wrote: > At 07:03 PM 10/8/2001 -0500, Gibbs Tanton - tgibbs wrote: > >This looks good. > > > >Also, WRT the utf8_t, utf16_t, and utf32_t can we not just use utf32_t and > >then mask off the lower 8 or 16 bits? We can still

RE: Transcoding patch

2001-10-09 Thread Dan Sugalski
At 07:03 PM 10/8/2001 -0500, Gibbs Tanton - tgibbs wrote: >This looks good. > >Also, WRT the utf8_t, utf16_t, and utf32_t can we not just use utf32_t and >then mask off the lower 8 or 16 bits? We can still have utf8_t be defined >as char to allow sizeof to work right and we can do sizeof(utf8_t)*

RE: Transcoding patch

2001-10-08 Thread Gibbs Tanton - tgibbs
Thanks! Applied. -----Original Message----- From: Tom Hughes To: [EMAIL PROTECTED] Sent: 10/8/2001 6:51 PM Subject: RE: Transcoding patch In message <[EMAIL PROTECTED]> Gibbs Tanton - tgibbs <[EMAIL PROTECTED]> wrote: > This is good, unless someone has objections I

RE: Transcoding patch

2001-10-08 Thread Gibbs Tanton - tgibbs
From: Tom Hughes To: [EMAIL PROTECTED] Sent: 10/8/2001 6:51 PM Subject: RE: Transcoding patch In message <[EMAIL PROTECTED]> Gibbs Tanton - tgibbs <[EMAIL PROTECTED]> wrote: > This is good, unless someone has objections I'll commit this. However, we > also need

RE: Transcoding patch

2001-10-08 Thread Tom Hughes
In message <[EMAIL PROTECTED]> Gibbs Tanton - tgibbs <[EMAIL PROTECTED]> wrote: > This is good, unless someone has objections I'll commit this. However, we > also need the ability to do unicode in the assembler (I'll do this later > today if no one beats me to it), and we need some way

Re: Transcoding patch

2001-10-08 Thread Tom Hughes
In message <[EMAIL PROTECTED]> Gibbs Tanton <[EMAIL PROTECTED]> wrote: > > - The utf8_t, utf16_t and utf32_t types will need to be determined > >by configure as they will currently break on some machines. Plus > >machines without native 8, 16 and 32 bit types will be a problem. >

RE: Transcoding patch

2001-10-08 Thread Gibbs Tanton - tgibbs
> Absolutely. A few other issues that I remembered last night are: > > - The current code assumes that the string data will be two >byte aligned for UTF-16 and four byte aligned for UTF-32 which >is probably reasonable but maybe not. Yeah, I think we can handle that in the constant secti
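One way to sidestep the alignment assumption mentioned above is to copy code units out of the buffer instead of dereferencing a cast pointer. This is a minimal C sketch of that general technique, not what the patch does; the function names are hypothetical, and the values are read in the machine's native byte order.

```c
#include <stdint.h>
#include <string.h>

/* Read one 16-bit code unit from p, which may sit at any byte offset.
 * memcpy avoids the undefined behaviour of a misaligned pointer cast.
 * (Hypothetical helper, native byte order.) */
uint16_t read_u16_unaligned(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Same idea for one 32-bit code unit. (Hypothetical helper.) */
uint32_t read_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Compilers typically turn these fixed-size copies into a single load on platforms that tolerate unaligned access, so the cost of not relying on alignment is usually small.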

Re: Transcoding patch

2001-10-08 Thread Tom Hughes
In message <[EMAIL PROTECTED]> Gibbs Tanton <[EMAIL PROTECTED]> wrote: > I've applied this patch. I just did an update and noticed the new files had appeared about two seconds before your mail arrived ;-) > I realize that we have a ways to go before we can fully support unicode, but > I

RE: Transcoding patch

2001-10-08 Thread Gibbs Tanton - tgibbs
I've applied this patch. I realize that we have a ways to go before we can fully support unicode, but I felt that this patch was a big step in the right direction; with it committed we can now start incrementally cleaning it up and making it work correctly. Since it doesn't affect anything we ar

Re: Transcoding patch

2001-10-07 Thread Simon Cozens
On Sun, Oct 07, 2001 at 11:08:56AM -0500, Gibbs Tanton - tgibbs wrote: > I guess the question with native strings is will it always be ASCII or will > it be Shift-JIS etc...? Can I just say: locales. -- Ah the joys of festival + Gutenberg project. I can now have Moby Dick read to me by Steph

RE: Transcoding patch

2001-10-07 Thread Tom Hughes
In message <[EMAIL PROTECTED]> Gibbs Tanton - tgibbs <[EMAIL PROTECTED]> wrote: > This is good, unless someone has objections I'll commit this. However, we > also need the ability to do unicode in the assembler (I'll do this later > today if no one beats me to it), and we need some way

RE: Transcoding patch

2001-10-07 Thread Gibbs Tanton - tgibbs
This is good, unless someone has objections I'll commit this. However, we also need the ability to do unicode in the assembler (I'll do this later today if no one beats me to it), and we need some way to communicate the encoding number between the C and the Perl code. I guess the question with n