[Haskell] Availability of Hugs with Unicode support?

2004-07-14 Thread Graham Klyne
When can I expect a version of Hugs with Unicode support to be generally released? I've been using an unofficial build of Hugs with Unicode character support to develop enhancements to the HaXml parser which are needed for my RDF/XML parser. I'm starting to think about packaging

Re: [Haskell] Bug in experimental Unicode support for Hugs?

2004-05-28 Thread Ross Paterson
On Fri, May 28, 2004 at 01:20:32PM +0100, Graham Klyne wrote: > I've noticed a discrepancy in my version of Hugs with experimental Unicode > support enabled, based on the 20040109 codebase. It's exemplified by this: > > [[ > Main> '\x10FFFF' > '\1

[Haskell] Bug in experimental Unicode support for Hugs?

2004-05-28 Thread Graham Klyne
I've noticed a discrepancy in my version of Hugs with experimental Unicode support enabled, based on the 20040109 codebase. It's exemplified by this: [[ Main> '\x10FFFF' '\1114111' Main> maxBound::Char '\255' Main> ]] It appears that this v
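
The discrepancy reported above can be checked on any implementation by comparing `maxBound :: Char` with the largest Unicode code point, U+10FFFF (decimal 1114111). The following is a minimal sketch; the names `maxCharCode` and `coversFullUnicode` are illustrative and do not come from the thread.

```haskell
import Data.Char (ord)

-- Largest code point that Char can hold on this implementation.
maxCharCode :: Int
maxCharCode = ord (maxBound :: Char)

-- True when Char covers the full Unicode range up to U+10FFFF.
-- (Illustrative name, not part of any library.)
coversFullUnicode :: Bool
coversFullUnicode = maxCharCode == 0x10FFFF
```

On a current GHC, `coversFullUnicode` evaluates to `True`. The Hugs build described in the message accepted the literal '\x10FFFF' but reported a `maxBound` of '\255', so the same check would have evaluated to `False` there.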

MS-Windows build of Hugs with experimental Unicode support

2004-01-12 Thread Graham Klyne
Further to my last message [1], if anyone else wants to play with the experimental Unicode support for Hugs under MS-Windows, I've placed a Windows executable file on my Web site, linked from [2]. #g -- [1] http://haskell.org/pipermail/haskell/2004-January/013377.html [2]

RE: Unicode stupidity (Was: Unicode support)

2001-10-24 Thread Karlsson Kent - keka
> None of that "But 21 bits *is* enough". > Yeah, like 640K was enough. And countless other examples. That is not comparable. Never was. > I thought we had learned, but I was wrong... I'm especially > disheartened to hear that ISO bought into the same crap. Who's going to invent all these gaz

Unicode stupidity (Was: Unicode support)

2001-10-24 Thread "Jürgen A. Erhard"
> "Marcin" == Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> writes: Marcin> 30 Sep 2001 22:28:52 +0900, Jens Petersen <[EMAIL PROTECTED]> writes: >> 16 bits is enough to describe the Basic Multilingual Plane >> and I think 24 bits all the currently defined extended >> planes. [

Re: Unicode support

2001-10-10 Thread Marcin 'Qrczak' Kowalczyk
Tue, 9 Oct 2001 14:59:09 -0700, John Meacham <[EMAIL PROTECTED]> writes: > I think a canonical way to get at iconv's ('man 3 iconv' for info.) > functionality in one of the standard libraries would be great. perhaps > I will have a go at it. even if the underlying platform does not have > iconv the

Re: Unicode support

2001-10-09 Thread John Meacham
On Tue, Oct 09, 2001 at 12:37:27PM +0200, Kent Karlsson wrote: > > At 2001-10-09 02:58, Kent Karlsson wrote: > > >In summary: > > >code position (=code point): a value between 0 and 10FFFF. > > Would this be a reasonable basis for Haskell's 'Char' type? > > Yes. It's essentially UTF-32, b

Re: Unicode support

2001-10-09 Thread Marcin 'Qrczak' Kowalczyk
On Tue, 9 Oct 2001, Ashley Yakeley wrote: > Would it be worthwhile restricting Char to the 0-10FFFF range, just as a > Word8 is restricted to 0-FF even though in GHC at least it's stored > 32-bit? It is thus restricted in GHC. I think it's a good compromise between 32-bit-Unicode and 16-bit-U
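
The restriction discussed above can be made explicit in user code. Here is a small sketch, assuming GHC's `Data.Char.chr`, which raises an error for arguments outside 0..0x10FFFF; the helper name `toCodePoint` is hypothetical.

```haskell
import Data.Char (chr)

-- Hypothetical helper: convert an Int to a Char only when it lies in the
-- Unicode code space 0..0x10FFFF, returning Nothing otherwise instead of
-- letting chr raise an error.
toCodePoint :: Int -> Maybe Char
toCodePoint n
  | n >= 0 && n <= 0x10FFFF = Just (chr n)
  | otherwise               = Nothing
```

For instance, `toCodePoint 0x110000` gives `Nothing`, while `toCodePoint 0x10FFFF` gives `Just '\1114111'`.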

Re: Unicode support

2001-10-09 Thread Ashley Yakeley
At 2001-10-09 03:37, Kent Karlsson wrote: >> >code position (=code point): a value between 0 and 10FFFF. >> >> Would this be a reasonable basis for Haskell's 'Char' type? > >Yes. It's essentially UTF-32, but without the fixation to 32-bit >(21 bits suffice). UTF-32 (a.k.a. UCS-4 in 10646,

Re: Unicode support

2001-10-09 Thread Kent Karlsson
- Original Message - From: "Ashley Yakeley" <[EMAIL PROTECTED]> To: "Kent Karlsson" <[EMAIL PROTECTED]>; "Haskell List" <[EMAIL PROTECTED]>; "Libraries for Haskell List" <[EMAIL PROTECTED]> Sent: Tuesday, October 09, 20

Re: Unicode support

2001-10-09 Thread Ashley Yakeley
At 2001-10-09 02:58, Kent Karlsson wrote: >In summary: > >code position (=code point): a value between 0 and 10FFFF. Would this be a reasonable basis for Haskell's 'Char' type? At some point perhaps there should be a 'Unicode' standard library for Haskell. For instance: encodeUTF8 :: S
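
As an illustration of what such a library function might do, here is a rough sketch of a UTF-8 encoder. The signature in the message is cut off, so the name `encodeUtf8Sketch` and the type `String -> [Word8]` are assumptions made for this example, not the author's proposal.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical name and type: encode a String as a list of UTF-8 bytes.
encodeUtf8Sketch :: String -> [Word8]
encodeUtf8Sketch = concatMap encodeChar

-- Encode one code point as 1 to 4 bytes.
-- Note: surrogate code points (U+D800..U+DFFF) are not rejected here,
-- which a production encoder would need to do.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [lead 0xC0 6, cont 0]
  | n < 0x10000 = [lead 0xE0 12, cont 6, cont 0]
  | otherwise   = [lead 0xF0 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading byte: length tag combined with the top bits of the code point.
    lead tag s = fromIntegral (tag .|. (n `shiftR` s))
    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont s = fromIntegral (0x80 .|. ((n `shiftR` s) .&. 0x3F))
```

For example, `encodeUtf8Sketch "\x20AC"` (the euro sign) yields the three bytes 0xE2, 0x82, 0xAC.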

Re: Unicode support

2001-10-09 Thread Kent Karlsson
Just to clear up any misunderstanding: - Original Message - From: "Ashley Yakeley" <[EMAIL PROTECTED]> To: "Haskell List" <[EMAIL PROTECTED]> Sent: Monday, October 01, 2001 12:36 AM Subject: Re: Unicode support > At 2001-09-30 07:29, Marcin 'Qrc

Re: Unicode support

2001-10-08 Thread Kent Karlsson
- Original Message - From: "Dylan Thurston" <[EMAIL PROTECTED]> To: "John Meacham" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Friday, October 05, 2001 5:47 PM Subject: Re: Unicode support > On Sun, Sep 30, 2001 at 11:01:38AM -0700, John Mea

Re: Unicode support

2001-10-08 Thread Kent Karlsson
- Original Message - From: "Wolfgang Jeltsch" <[EMAIL PROTECTED]> To: "The Haskell Mailing List" <[EMAIL PROTECTED]> Sent: Thursday, October 04, 2001 8:47 PM Subject: Re: Unicode support > On Sunday, 30 September 2001 20:01, John Meacham wrote: >

Re: Unicode support

2001-10-05 Thread Dylan Thurston
On Sun, Sep 30, 2001 at 11:01:38AM -0700, John Meacham wrote: > seeing as how the haskell standard is horribly vague when it comes to > character set encodings anyway, I would recommend that we just omit any > reference to the bit size of Char, and just say abstractly that each > Char represents o

Re: Unicode support

2001-10-04 Thread Wolfgang Jeltsch
On Sunday, 30 September 2001 20:01, John Meacham wrote: > sorry for the me too post, but this has been a major pet peeve of mine > for a long time. 16 bit unicode should be gotten rid of, being the worst > of both worlds, non backwards compatible with ascii, endianness issues > and no constant len

Re: Unicode support

2001-09-30 Thread Jens Petersen
effect? (There was some unicode discussion earlier, about upper and lower case if I remember correctly, but I am surprised no one has raised this point again.) Jens ps We need better unicode support in the implementations too. At least ghc-5 has 31 bit

Re: Unicode support

2001-09-30 Thread Ashley Yakeley
At 2001-09-30 07:29, Marcin 'Qrczak' Kowalczyk wrote: >Some time ago the Unicode Consortium slowly began switching to the >point of view that abstract characters are denoted by numbers in the >range U+0000..10FFFF. It's worth mentioning that these are 'codepoints', not 'characters'. Sometimes a
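
The message is cut off before it gives its own example, so the following sketch assumes the usual illustration of the code point versus character distinction: a combining sequence. The names `precomposed` and `decomposed` are illustrative.

```haskell
-- "e with acute accent" as a single precomposed code point, U+00E9.
precomposed :: String
precomposed = "\x00E9"

-- The same visible character as 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed :: String
decomposed = "e\x0301"

-- A Haskell String is a list of code points, so length counts code points,
-- not user-perceived characters.
codePoints :: String -> Int
codePoints = length
```

Both strings render as one character, yet `codePoints precomposed` is 1 and `codePoints decomposed` is 2, and the two strings are not equal under `==`.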

Re: Unicode support

2001-09-30 Thread John Meacham
sorry for the me too post, but this has been a major pet peeve of mine for a long time. 16 bit unicode should be gotten rid of, being the worst of both worlds, non backwards compatible with ascii, endianness issues and no constant length encoding utf8 externally and utf32 when working with ind
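
The "no constant length" point about 16-bit Unicode can be illustrated with a short sketch of UTF-16 encoding, in which code points above U+FFFF need two 16-bit units (a surrogate pair). The helper name `utf16Units` is hypothetical.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.Word (Word16)

-- Hypothetical helper: the UTF-16 code units for a single code point.
-- Code points below 0x10000 take one unit; the rest take a surrogate pair.
utf16Units :: Char -> [Word16]
utf16Units c
  | n < 0x10000 = [fromIntegral n]
  | otherwise   = [ fromIntegral (0xD800 + (m `shiftR` 10))  -- high surrogate
                  , fromIntegral (0xDC00 + (m .&. 0x3FF))    -- low surrogate
                  ]
  where
    n = ord c
    m = n - 0x10000  -- 20-bit offset into the supplementary planes
```

For example, `utf16Units 'A'` has one unit, while `utf16Units '\x1D11E'` (MUSICAL SYMBOL G CLEF) has two, 0xD834 and 0xDD1E.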

Re: Unicode support

2001-09-30 Thread Marcin 'Qrczak' Kowalczyk
30 Sep 2001 14:43:21 +0100, Colin Paul Adams <[EMAIL PROTECTED]> writes: > I think it should either be amended to mention the BMP subset of > Unicode, or, better, change the reference from 16-bit to 24-bit. 24-bit is not accurate. The range from 0 to 0x10FFFF has 20.087462841250343 bits. There is
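
The figure quoted above is the base-2 logarithm of the number of code points in the range 0 to 0x10FFFF, which is 0x110000 = 1114112. A minimal sketch of the arithmetic, with illustrative names:

```haskell
-- Number of code points in the range 0..0x10FFFF.
codeSpaceSize :: Int
codeSpaceSize = 0x10FFFF + 1

-- Information content of one code point, in bits (about 20.09).
bitsNeeded :: Double
bitsNeeded = logBase 2 (fromIntegral codeSpaceSize)

-- Whole bits required for a fixed-width representation.
wholeBits :: Int
wholeBits = ceiling bitsNeeded
```

Rounding up gives 21 whole bits, which matches the "21 bits suffice" remark elsewhere in the thread.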

Re: Unicode support

2001-09-30 Thread Marcin 'Qrczak' Kowalczyk
30 Sep 2001 22:28:52 +0900, Jens Petersen <[EMAIL PROTECTED]> writes: > 16 bits is enough to describe the Basic Multilingual Plane > and I think 24 bits all the currently defined extended > planes. So I guess the report just refers to the BMP. In early days the Unicode Consortium was doing every

Re: Unicode support

2001-09-30 Thread Colin Paul Adams
> "Jens" == Jens Petersen <[EMAIL PROTECTED]> writes: Jens> 16 bits is enough to describe the Basic Multilingual Plane Jens> and I think 24 bits all the currently defined extended Jens> planes. So I guess the report just refers to the BMP. I guess it does, and I think back in 19

Re: Unicode support

2001-09-30 Thread Jens Petersen
Hamilton Richards <[EMAIL PROTECTED]> writes: > At 12:20 PM -0500 9/29/01, Colin Paul Adams wrote: > >I have just been reading through the Haskell report to refresh my > >memory of the language. I was surprised to see this: > > > >The character type Char is an enumeration and consists of 16 bit v

Re: Unicode support

2001-09-29 Thread Hamilton Richards
At 12:20 PM -0500 9/29/01, Colin Paul Adams wrote: >I have just been reading through the Haskell report to refresh my >memory of the language. I was surprised to see this: > >The character type Char is an enumeration and consists of 16 bit values, >conforming to >the Unicode standard [10]. > >Unic

Unicode support

2001-09-29 Thread Colin Paul Adams
I have just been reading through the Haskell report to refresh my memory of the language. I was surprised to see this: The character type Char is an enumeration and consists of 16 bit values, conforming to the Unicode standard [10]. Unicode uses 24-bit values to identify characters. -- Colin Pa