Re: Unicode windows console output.

2010-11-04 Thread David Sankel
On Thu, Nov 4, 2010 at 6:09 AM, Simon Marlow wrote: > On 04/11/2010 02:35, David Sankel wrote: > >> On Wed, Nov 3, 2010 at 9:00 AM, Simon Marlow > > wrote: >> >>On 03/11/2010 10:36, Bulat Ziganshin wrote: >> >>Hello Max, >> >>Wednesday, November 3, 2

Re: Unicode windows console output.

2010-11-04 Thread Simon Marlow
On 04/11/2010 02:35, David Sankel wrote: On Wed, Nov 3, 2010 at 9:00 AM, Simon Marlow <marlo...@gmail.com> wrote: On 03/11/2010 10:36, Bulat Ziganshin wrote: Hello Max, Wednesday, November 3, 2010, 1:26:50 PM, you wrote: 1. You need to use "chcp 65001" t

Re: Unicode windows console output.

2010-11-03 Thread David Sankel
On Wed, Nov 3, 2010 at 9:00 AM, Simon Marlow wrote: > On 03/11/2010 10:36, Bulat Ziganshin wrote: > >> Hello Max, >> >> Wednesday, November 3, 2010, 1:26:50 PM, you wrote: >> >> 1. You need to use "chcp 65001" to set the console code page to UTF8 >>> 2. It is very likely that your Windows consol

Re: Unicode windows console output.

2010-11-03 Thread Simon Marlow
On 03/11/2010 10:36, Bulat Ziganshin wrote: Hello Max, Wednesday, November 3, 2010, 1:26:50 PM, you wrote: 1. You need to use "chcp 65001" to set the console code page to UTF8 2. It is very likely that your Windows console won't have the fonts required to actually make sense of the output. Pip

Re: Unicode windows console output.

2010-11-03 Thread Max Bolingbroke
On 2 November 2010 21:05, David Sankel wrote: > Is there a ghc "wontfix" bug ticket for this? Perhaps we can make a small C > test case and send it to the Microsoft people. Some[1] are reporting success > with Unicode console output. I confirmed that I can output Chinese unicode from Haskell. You

Re: Unicode windows console output.

2010-11-03 Thread Krasimir Angelov
It is possible to output some non-Latin1 symbols if you use the wide string API, but not all of them. Basically the console supports all European languages but nothing else - Latin, Cyrillic and Greek. 2010/11/2 David Sankel : > Is there a ghc "wontfix" bug ticket for this? Perhaps we can make a sm

Re: Unicode windows console output.

2010-11-02 Thread David Sankel
Is there a ghc "wontfix" bug ticket for this? Perhaps we can make a small C test case and send it to the Microsoft people. Some[1] are reporting success with Unicode console output. David [1] http://www.codeproject.com/KB/cpp/unicode_console_output.aspx On Tue, Nov 2, 2010 at 3:49 AM, Krasimir A

Re: Unicode windows console output.

2010-11-02 Thread Krasimir Angelov
This is evidence for the broken Unicode support in the Windows terminal and not a problem with GHC. I experienced the same many times. 2010/11/2 David Sankel : > > On Mon, Nov 1, 2010 at 10:20 PM, David Sankel wrote: >> >> Hello all, >> I'm attempting to output some Unicode on the windows console

Re: Unicode windows console output.

2010-11-01 Thread David Sankel
On Mon, Nov 1, 2010 at 10:20 PM, David Sankel wrote: > Hello all, > > I'm attempting to output some Unicode on the windows console. I set my > windows console code page to utf-8 using "chcp 65001". > > The program: > > -- Test.hs > main = putStr "λ.x→x" > > > The output of `runghc Test.hs`: > > λ
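The program from the thread depends on the console code page matching GHC's output encoding. A minimal sketch (not from the thread) that selects UTF-8 on the Haskell side explicitly, so at least the encoding variable is removed; it does not fix the console-font issues discussed above:

```haskell
import System.IO (hSetEncoding, stdout, utf8)

-- The lambda-calculus string from the thread.
lambdaTerm :: String
lambdaTerm = "λ.x→x"

main :: IO ()
main = do
  -- Force UTF-8 output instead of relying on the locale/code-page default.
  hSetEncoding stdout utf8
  putStrLn lambdaTerm
```

With "chcp 65001" active and a suitable font, this stands a better chance of printing the term intact.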

Re: unicode characters in operator name

2010-09-10 Thread Greg
Oh cripe...  Yet another reason not to use funny symbols -- even the developer can't tell them apart! Yeah, I wanted a degree sign, but if it's all that subtle then I should probably reconsider the whole idea. On the positive side, I know what ª is for now, so today wasn't a complete waste.  =) Thanks--

Re: unicode characters in operator name

2010-09-10 Thread Brandon S Allbery KF8NH
On 9/10/10 21:12 , Greg wrote: > unicode symbol (defined as any Unicode symbol or punctuation). I'm pretty > sure º is a unicode symbol or punctuation. No, it's a raised lowercase "o" used by convention to indicate gender of abbreviated ordinals. Yo

Re: unicode characters in operator name

2010-09-10 Thread Brandon S Allbery KF8NH
On 9/10/10 21:39 , Daniel Fischer wrote: > On Saturday 11 September 2010 03:12:11, Greg wrote: >> a unicode symbol (defined as any Unicode symbol or punctuation). I'm >> pretty sure º is a unicode symbol or punctuation. > > Prelude Data.Char> general

Re: unicode characters in operator name

2010-09-10 Thread Daniel Fischer
On Saturday 11 September 2010 03:12:11, Greg wrote: > > If I read the Haskell Report correctly, operators are named by (symbol > {symbol | : }), where symbol is either an ascii symbol (including *) or > a unicode symbol (defined as any Unicode symbol or punctuation).  I'm > pretty sure º is a unico

Re: Unicode alternative for '..' (ticket #3894)

2010-04-21 Thread Roel van Dijk
On Wed, Apr 21, 2010 at 12:51 AM, Yitzchak Gale wrote: > Yes, sorry. Either use TWO DOT LEADER, or remove > this Unicode alternative altogether > (i.e. leave it the way it is *without* the UnicodeSyntax extension). > > I'm happy with either of those. I just don't like moving the dots > up to the m

Re: Unicode alternative for '..' (ticket #3894)

2010-04-20 Thread Yitzchak Gale
I wrote: >> My opinion is that we should either use TWO DOT LEADER, >> or just leave it as it is now, two FULL STOP characters. Simon Marlow wrote: > Just to be clear, you're suggesting *removing* the Unicode alternative for > '..' from GHC's UnicodeSyntax extension? Yes, sorry. Either use TWO DO

Re: Unicode alternative for '..' (ticket #3894)

2010-04-19 Thread Simon Marlow
On 15/04/2010 18:12, Yitzchak Gale wrote: My opinion is that we should either use TWO DOT LEADER, or just leave it as it is now, two FULL STOP characters. Just to be clear, you're suggesting *removing* the Unicode alternative for '..' from GHC's UnicodeSyntax extension? I have no strong opin
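For context on what UnicodeSyntax already covers, a minimal sketch (independent of whatever is decided for '..'), using only the well-established alternatives for `::` and `->`:

```haskell
{-# LANGUAGE UnicodeSyntax #-}

-- With UnicodeSyntax, ∷ and → stand in for :: and ->.
-- The '..' alternative under discussion would be the range-syntax
-- counterpart of these substitutions.
ident ∷ a → a
ident x = x

main ∷ IO ()
main = print (ident (42 ∷ Int))
```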

Re: Unicode alternative for '..' (ticket #3894)

2010-04-15 Thread Roel van Dijk
That is very interesting. I didn't know the history of those characters. > If we can't find a Unicode character that everyone agrees upon, > I also don't see any problem with leaving it as two FULL STOP > characters. I agree. I don't like the current Unicode variant for "..", therefore I suggeste

Re: Unicode alternative for '..' (ticket #3894)

2010-04-15 Thread Yitzchak Gale
My opinion is that we should either use TWO DOT LEADER, or just leave it as it is now, two FULL STOP characters. Two dots indicating a range is not the same symbol as a three dot ellipsis. Traditional non-Unicode Haskell will continue to be around for a long time to come. It would be very confusi

Re: Unicode alternative for '..' (ticket #3894)

2010-04-15 Thread Jason Dusek
I think the baseline ellipsis makes much more sense; it's hard to see how the midline ellipsis was chosen. -- Jason Dusek

RE: Unicode in GHC 6.2.2 and 6.4.x (was: Re: [Haskell-cafe] Unicode.hs)

2005-07-18 Thread Simon Marlow
On 17 July 2005 04:42, Dimitry Golubovsky wrote: > Dear List Subscribers, > > Simon Marlow wrote: >> On 30 June 2005 14:36, Dimitry Golubovsky wrote: >> >> >>> It is in CVS now, and I believe will be in 6.4.1 >> >> >> Not planned for 6.4.1, but definitely in 6.6. >> > > I have put those fil

RE: Unicode source files

2005-05-05 Thread Simon Marlow
On 04 May 2005 15:57, Bulat Ziganshin wrote: > it is true what to support unicode source files only StringBuffer > implementation must be changed? It depends whether you want to support several different encodings, or just UTF-8. If we only want to support UTF-8, then we can keep the StringBuffe
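The point about keeping a byte-oriented StringBuffer while supporting only UTF-8 can be illustrated with a decoding sketch. This is a hedged illustration, not GHC's actual StringBuffer code; `decodeOne` is a hypothetical helper, and it skips validation of continuation bytes for brevity:

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Decode one UTF-8 code point from a byte stream, returning the
-- character and the remaining bytes.
decodeOne :: [Word8] -> Maybe (Char, [Word8])
decodeOne [] = Nothing
decodeOne (b:bs)
  | b < 0x80  = Just (chr (fromIntegral b), bs)  -- plain ASCII byte
  | b < 0xE0  = multi 1 (b .&. 0x1F) bs          -- 2-byte sequence
  | b < 0xF0  = multi 2 (b .&. 0x0F) bs          -- 3-byte sequence
  | otherwise = multi 3 (b .&. 0x07) bs          -- 4-byte sequence
  where
    -- Fold n continuation bytes (6 payload bits each) onto the lead bits.
    multi n acc rest
      | length (take n rest) == n =
          let v = foldl (\a c -> (a `shiftL` 6) .|. fromIntegral (c .&. 0x3F))
                        (fromIntegral acc :: Int) (take n rest)
          in Just (chr v, drop n rest)
      | otherwise = Nothing
```

Because ASCII bytes decode as themselves, a lexer over such a buffer stays byte-based on the fast path, which is presumably why UTF-8-only support is the cheaper option discussed here.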

RE: Unicode in GHC: need more advice

2005-01-17 Thread Simon Marlow
On 14 January 2005 12:58, Dimitry Golubovsky wrote: > Now I need more advice on which "flavor" of Unicode support to > implement. In Haskell-cafe, there were 3 flavors summarized: I am > reposting the table here (its latest version). > > |Sebastien's| Marcin's | Hugs > ---+--

Re: Unicode in GHC: need more advice

2005-01-14 Thread Dimitry Golubovsky
Hi, Simon Marlow wrote: You're doing fine - but a better place for the tables is as part of the base package, rather than the RTS. We already have some C files in the base package: see libraries/base/cbits, for example. I suggest just putting your code in there. I have done that - now GHCi recog

Re: Unicode in GHC: need some advice on building

2005-01-11 Thread Shawn Garbett
--- Dimitry Golubovsky <[EMAIL PROTECTED]> wrote: > Hi, > > Following up the discussion in Haskell-Cafe about > ways to bring better > Unicode support in GHC. A radical suggestion from an earlier discussion was to make String a typeclass. Have unicode, ascii, etc. all be representations. The qu

RE: Unicode in GHC: need some advice on building

2005-01-11 Thread Simon Marlow
On 11 January 2005 02:29, Dimitry Golubovsky wrote: > Bad thing is, LD_PRELOAD does not work on all systems. So I tried to > put the code directly into the runtime (where I believe it should be; > the Unicode properties table is packed, and won't eat much space). I > renamed foreign function names

Re: Unicode

2001-10-08 Thread Kent Karlsson
- Original Message - From: "Dylan Thurston" <[EMAIL PROTECTED]> To: "Andrew J Bromage" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Friday, October 05, 2001 6:00 PM Subject: Re: UniCode > On Fri, Oct 05, 2

Re: Unicode

2001-10-08 Thread Kent Karlsson
- Original Message - From: "Ketil Malde" <[EMAIL PROTECTED]> To: "Dylan Thurston" <[EMAIL PROTECTED]> Cc: "Andrew J Bromage" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Monday, October 08, 2001 9:02 A

Re: UniCode

2001-10-07 Thread Ketil Malde
Dylan Thurston <[EMAIL PROTECTED]> writes: > Right. In Unicode, the concept of a "character" is not really so > useful; After reading a bit about it, I'm certainly confused. Unicode/ISO-10646 contains a lot of things that aren't really one character, e.g. ligatures. > most functions that tradi

Re: UniCode

2001-10-06 Thread Andrew J Bromage
G'day all. On Fri, Oct 05, 2001 at 06:17:26PM +, Marcin 'Qrczak' Kowalczyk wrote: > This information is out of date. AFAIR about 4 of them is assigned. > Most for Chinese (current, not historic). I wasn't aware of this. Last time I looked was Unicode 3.0. Thanks for the update. > In

Re: UniCode

2001-10-05 Thread Marcin 'Qrczak' Kowalczyk
05 Oct 2001 14:35:17 +0200, Ketil Malde <[EMAIL PROTECTED]> writes: > Does Haskell's support of "Unicode" mean UTF-32, or full UCS-4? It's not decided officially. GHC uses UTF-32. It's expected that UCS-4 will vanish and ISO-10646 will be reduced to the same range U+0000..10FFFF as Unicode. --

Re: UniCode

2001-10-05 Thread Marcin 'Qrczak' Kowalczyk
Fri, 5 Oct 2001 23:23:50 +1000, Andrew J Bromage <[EMAIL PROTECTED]> writes: > There is a set of one million (more correctly, 1M) Unicode characters > which are only accessible using surrogate pairs (i.e. two UTF-16 > codes). There are currently none of these codes assigned, This information is

Re: UniCode

2001-10-05 Thread Dylan Thurston
On Fri, Oct 05, 2001 at 11:23:50PM +1000, Andrew J Bromage wrote: > G'day all. > > On Fri, Oct 05, 2001 at 02:29:51AM -0700, Krasimir Angelov wrote: > > > Why Char is 32 bit. UniCode characters is 16 bit. > > It's not quite as simple as that. There is a set of one million > (more correctly, 1M

Re: UniCode

2001-10-05 Thread Andrew J Bromage
G'day all. On Fri, Oct 05, 2001 at 02:29:51AM -0700, Krasimir Angelov wrote: > Why Char is 32 bit. UniCode characters is 16 bit. It's not quite as simple as that. There is a set of one million (more correctly, 1M) Unicode characters which are only accessible using surrogate pairs (i.e. two UTF
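The surrogate-pair mechanism Andrew describes is simple arithmetic; a sketch (a hypothetical helper, not from the thread) that splits a code point above U+FFFF into the two UTF-16 codes:

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)

-- Split a code point above U+FFFF into its UTF-16 surrogate pair:
-- subtract 0x10000, then the high 10 bits go into the high surrogate
-- (D800-DBFF) and the low 10 bits into the low surrogate (DC00-DFFF).
surrogatePair :: Char -> (Int, Int)
surrogatePair c =
  let v = ord c - 0x10000
  in (0xD800 + (v `shiftR` 10), 0xDC00 + (v .&. 0x3FF))

main :: IO ()
main = print (surrogatePair '\x1D11E')  -- U+1D11E encodes as D834 DD1E
```

Two 10-bit halves give exactly the 1M (2^20) extra characters mentioned above.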

Re: UniCode

2001-10-05 Thread Ketil Malde
"Marcin 'Qrczak' Kowalczyk" <[EMAIL PROTECTED]> writes: > Fri, 5 Oct 2001 02:29:51 -0700 (PDT), Krasimir Angelov <[EMAIL PROTECTED]> writes: > > > Why Char is 32 bit. UniCode characters is 16 bit. > No, Unicode characters have 21 bits (range U+0000..10FFFF). We've been through all this, of cour

Re: UniCode

2001-10-05 Thread Marcin 'Qrczak' Kowalczyk
Fri, 5 Oct 2001 02:29:51 -0700 (PDT), Krasimir Angelov <[EMAIL PROTECTED]> writes: > Why Char is 32 bit. UniCode characters is 16 bit. No, Unicode characters have 21 bits (range U+0000..10FFFF). They used to fit in 16 bits a long time ago, and they are sometimes encoded as UTF-16 (each character
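The 21-bit range Marcin cites is visible directly in GHC today; a small sketch:

```haskell
import Data.Char (chr, ord)

-- GHC's Char spans the full Unicode range U+0000..U+10FFFF,
-- which needs 21 bits, not 16.
maxCodePoint :: Int
maxCodePoint = ord (maxBound :: Char)

main :: IO ()
main = do
  print maxCodePoint                  -- 1114111, i.e. 0x10FFFF
  print (maxCodePoint > 0xFFFF)       -- beyond 16 bits
  print (chr maxCodePoint == maxBound)
```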

Re: Unicode

2000-05-17 Thread Frank Atanassow
Manuel M. T. Chakravarty writes: > The problem with restricting youself to the Jouyou-Kanji is > that you have a hard time with names (of persons and > places). Many exotic and otherwise unused Kanji are used in > names (for historical reasons) and as the Kanji > representation of a name is

Re: Unicode

2000-05-16 Thread Manuel M. T. Chakravarty
Frank Atanassow <[EMAIL PROTECTED]> wrote, > George Russell writes: > > Marcin 'Qrczak' Kowalczyk wrote: > > > As for the language standard: I hope that Char will be allowed or > > > required to have >=30 bits instead of current 16; but never more than > > > Int, to be able to use ord and chr

Re: Unicode

2000-05-16 Thread Marcin 'Qrczak' Kowalczyk
Tue, 16 May 2000 12:26:12 +0200 (MET DST), Frank Atanassow <[EMAIL PROTECTED]> writes: > Of course, you can always come up with specialized schemes involving stateful > encodings and/or "block-swapping" (using the Unicode private-use areas, for > example), but then, that subverts the purpose of Un

Re: Unicode

2000-05-16 Thread Marcin 'Qrczak' Kowalczyk
Tue, 16 May 2000 10:44:28 +0200, George Russell <[EMAIL PROTECTED]> writes: > > As for the language standard: I hope that Char will be allowed or > > required to have >=30 bits instead of current 16; but never more than > > Int, to be able to use ord and chr safely. > > Er does it have to? The J

Re: Unicode

2000-05-16 Thread Frank Atanassow
George Russell writes: > Marcin 'Qrczak' Kowalczyk wrote: > > As for the language standard: I hope that Char will be allowed or > > required to have >=30 bits instead of current 16; but never more than > > Int, to be able to use ord and chr safely. > Er does it have to? The Java Virtual Mach

RE: Unicode

2000-05-16 Thread Simon Marlow
> > OTOH, it wouldn't be hard to change GHC's Char datatype to be a > > full 32-bit integral data type. > > Could we do it please? > > It will not break anything if done slowly. I imagine that > {read,write}CharOffAddr and _ccall_ will still use only 8 bits of > Char. But after Char is wide, lib

Re: Unicode

2000-05-16 Thread George Russell
Marcin 'Qrczak' Kowalczyk wrote: > As for the language standard: I hope that Char will be allowed or > required to have >=30 bits instead of current 16; but never more than > Int, to be able to use ord and chr safely. Er does it have to? The Java Virtual Machine implements Unicode with 16 bits.

Re: Unicode

2000-05-15 Thread Marcin 'Qrczak' Kowalczyk
Mon, 15 May 2000 02:45:17 -0700, Simon Marlow <[EMAIL PROTECTED]> writes: > OTOH, it wouldn't be hard to change GHC's Char datatype to be a > full 32-bit integral data type. Could we do it please? It will not break anything if done slowly. I imagine that {read,write}CharOffAddr and _ccall_ will

RE: Unicode

2000-05-15 Thread Simon Marlow
> How safe is representinging Unicode characters as Chars unsafeCoerce#d > from large Ints? Seems to work in simple cases :-) er, "downright dangerous". There are lots of places where we assume that Chars have only 8 bits of data, even though the representation has room for 32. eg. the Char pr
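Since the thread flags the unsafeCoerce# route as "downright dangerous", a sketch of the safe alternative: Data.Char.chr, which range-checks its argument before building the Char rather than blindly reinterpreting an Int:

```haskell
import Data.Char (chr, ord)

-- chr validates the code point; an out-of-range Int raises an error
-- instead of producing a corrupt Char as an unchecked coercion would.
lambda :: Char
lambda = chr 0x3BB  -- GREEK SMALL LETTER LAMDA

main :: IO ()
main = print lambda
```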

Re: Unicode support

1998-04-24 Thread Frank A. Christoph
>> What is the status of the lastest release (3.01) with respect to Unicode >> support? Is it possible to write source in Unicode? How wide are >> characters? Do the I/O library functions support it? etc. > > I don't believe that we've done anything much about Unicode > support. If it's import