[Haskell] [ANN] unicode-transforms-0.2.0 pure Haskell unicode normalization

2016-10-25 Thread Harendra Kumar
Hi, I released unicode-transforms sometime back as bindings to a C library (utf8proc). Since then I have rewritten it completely in Haskell. The Haskell data structures are automatically generated from the Unicode database, so it can be kept up-to-date with the standard unlike the C implementation which

[Haskell] ANN: unicode-transforms-0.1.0.1 (unicode normalization)

2016-08-03 Thread Harendra Kumar
This is for those who want unicode normalization but do not want a dependency on the heavyweight icu libraries. It has the same API as the text-icu package. It is based on the utf8proc C implementation. https://hackage.haskell.org/package/unicode-transforms https://github.com/harendra-kumar
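
A minimal usage sketch (not from either announcement): it normalizes a decomposed "e" plus combining accent to the precomposed form. The module name Data.Text.Normalize and the NFC/NFD constructors follow later unicode-transforms releases and text-icu's style; treat them as assumptions rather than the exact 0.1/0.2 API.

    import Data.Text (pack, unpack)
    import Data.Text.Normalize (NormalizationMode (NFC, NFD), normalize)

    main :: IO ()
    main = do
      let decomposed = pack "e\x0301"                       -- 'e' followed by COMBINING ACUTE ACCENT
      putStrLn (unpack (normalize NFC decomposed))          -- composes to the single character U+00E9
      print (normalize NFD (pack "\x00E9") == decomposed)   -- True: NFD decomposes it again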

[Haskell] ANN: unicode-0.0

2014-01-04 Thread Henning Thielemann
The first announcement this year: The 'unicode' package contains functions for construction of various characters like: * block graphic elements * frame elements * fractions * subscript and superscript characters http://hackage.haskell.org/package/unicode The package is simp

[Haskell] HWN Issue 148 -- unicode error

2010-01-31 Thread Joe Fredette
Indeed, the Unicode only made it through my system, and not the mailer itself. In fact, the name should look something like "Ćwikłowski", and not the garbled mess that made it through. Apologies again, Bartek! /Joe

[Haskell] Re: ANNOUNCE: Haskeline 0.6.2 - unicode width calculation not working

2009-09-15 Thread Ahn, Ki Yung
http://trac.haskell.org/haskeline/wiki/KeyBindings * A new preference 'historyDuplicates' to remove repeated history entries * Recognize PageUp and PageDown keys * Compatibility with ghc-6.12 * Correct width calculations for Unicode combining characters Oh, this Unicode width calculati

Re: [Haskell] Unicode advice request

2008-03-14 Thread László Kálmán
Dear Don, thanks very much! You may need to write the strings to the database using the utf8-string package. My program may not be the only one writing into the database. But I now *read* from the database using utf8-string, which solves the problem: import Codec.Binary.UTF8.String ( decode
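
A small sketch of this approach (not the poster's actual code), assuming the encode/decode pair from the utf8-string package; the byte list merely stands in for whatever HDBC/Sqlite3 hands back.

    import Codec.Binary.UTF8.String (decode, encode)
    import Data.Word (Word8)

    -- Stand-in for UTF-8 bytes read back from the database.
    utf8Bytes :: [Word8]
    utf8Bytes = encode "Kálmán"

    main :: IO ()
    main = putStrLn (decode utf8Bytes)  -- decode :: [Word8] -> String recovers the Unicode text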

Re: [Haskell] Unicode advice request

2008-03-14 Thread Don Stewart
kalman: > Excuse me if this question is misplaced or too trivial. > > I'm writing a CGI program in Haskell (CGI/HDBC/Sqlite3), the database > contains UTF-8 strings. If I use HDBC.fetchRow() to retrieve the data, > then HDBC.fromSql() to convert the data to Haskell, then Text.XHtml > construc

[Haskell] Unicode advice request

2008-03-14 Thread László Kálmán
Excuse me if this question is misplaced or too trivial. I'm writing a CGI program in Haskell (CGI/HDBC/Sqlite3), the database contains UTF-8 strings. If I use HDBC.fetchRow() to retrieve the data, then HDBC.fromSql() to convert the data to Haskell, then Text.XHtml constructs to display it, I

[Haskell] Problems with Unicode Symbols as Infix Function Names in Propositional Calculus Haskell DSL

2008-01-09 Thread Cetin Sert
I want to design a DSL in Haskell for propositional calculus. But instead of using natural-language names for functions like or, and, implies etc., I want to use Unicode symbols as infix functions: ¬, ˅, ˄, →, ↔. But I keep getting error messages from the GHC parser. Is there a way to make GHC
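
For reference, GHC does accept Unicode symbol characters in operator names, so a DSL along these lines parses without any extension. A minimal sketch (not the poster's code) using the mathematical operators ∧, ∨, →, ↔; note that enabling UnicodeSyntax may reserve → and clash with this use.

    infixr 3 ∧
    infixr 2 ∨
    infixr 1 →
    infix  0 ↔

    (¬) :: Bool -> Bool
    (¬) = not

    (∧), (∨), (→), (↔) :: Bool -> Bool -> Bool
    (∧) = (&&)
    (∨) = (||)
    p → q = (¬) p ∨ q
    (↔) = (==)

    main :: IO ()
    main = do
      print (True ∧ False → True)   -- True: parses as (True ∧ False) → True
      print ((¬) True ↔ False)      -- True: ¬True is False, and False ↔ False

With the fixities above, conjunction binds tighter than implication, mirroring the usual convention in propositional calculus.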

[Haskell] ANNOUNCE: Data.CompactString 0.3 - Unicode ByteString with different encodings

2007-03-11 Thread Twan van Laarhoven
Hello all, I would like to announce version 0.3 of my Data.CompactString library. Data.CompactString is a wrapper around Data.ByteString that represents a Unicode string. This new version supports different encodings, as can be seen from the data type: > data Encoding a => CompactSt

Re: [Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

2007-02-09 Thread Deborah Goldsmith
Unicode in use are UTF-16, UTF-8, and (less frequently) UTF-32. On Feb 9, 2007, at 6:02 AM, Duncan Coutts wrote: Apparently UTF-16 (which is like UCS-2 but covers all code points) is a good internal format. It is more compact than UTF-32 in almost all cases and a less complex encoding than UTF-8

Re: [Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

2007-02-09 Thread Duncan Coutts
find it helpful to maintain the fiction that there's no such thing (any more) as UCS-N, there's only UTF-8, 16 and 32. This is also what the Unicode consortium tries to encourage. My view is that we should just provide all three: Data.PackedString.UTF8 Data.PackedString.UTF16 Data.PackedS

Re: [Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

2007-02-08 Thread John Meacham
On Mon, Feb 05, 2007 at 01:14:26PM +0100, Twan van Laarhoven wrote: > The reason for inventing my own encoding is that it is easier to use and > takes less space than UTF-8. The only advantage UTF-8 has is that it can > be read and written directly. I guess this is a trade off, faster > manipula

Re: [Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

2007-02-08 Thread John Meacham
On Tue, Feb 06, 2007 at 03:16:17PM +0900, shelarcy wrote: > I'm afraid that fantasy is broken again, as UCS-2 without surrogate > pairs does not cover all languages, contrary to what people in Europe > and America had trusted. UCS-2 is a disaster in every way. someone had to say it. :) everything should be ascii, utf8

Re: [Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

2007-02-05 Thread shelarcy
es, it seems you are incorrect. > > Haskell Char goes up to Unicode 1114111 (decimal) or 0x10FFFF (hexadecimal). > These are encoded by UTF-8 in 1, 2, 3, or 4 bytes. I see. I confused Unicode support with charset support. I'm sorry about it. UCS-4 can support greater than 1114111 code

Re: [Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

2007-02-05 Thread David Menendez
allowing up to 6 bytes to encode a single char. > There's nothing stopping the Unicode consortium from expanding the > range of codepoints, is there? Or have they said that'll never happen? I believe they have. In particular, UTF-16 only supports code points up to 10FFFF. From

Re: [Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

2007-02-05 Thread Alistair Bayley
as the most accurate UTF8 en + de-coders: http://abridgegame.org/cgi-bin/darcs.cgi/darcs/UTF8.lhs?c=annotate There's nothing stopping the Unicode consortium from expanding the range of codepoints, is there? Or have they said that'll never happen? Alistair

Re: [Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

2007-02-05 Thread Chris Kuklewicz
shelarcy wrote: > Hello Twan, > > On Mon, 05 Feb 2007 08:46:35 +0900, Twan van Laarhoven <[EMAIL PROTECTED]> > wrote: >> I would like to announce my attempt at making a Unicode version of >> Data.ByteString. The library is named Data.CompactString to avoi

Re: [Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

2007-02-05 Thread shelarcy
Hello Twan, On Mon, 05 Feb 2007 08:46:35 +0900, Twan van Laarhoven <[EMAIL PROTECTED]> wrote: > I would like to announce my attempt at making a Unicode version of > Data.ByteString. The library is named Data.CompactString to avoid > conflict with other (Fast)PackedString librar

Re: [Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

2007-02-05 Thread Twan van Laarhoven
Chris Kuklewicz wrote: Can I be among the first to ask that any Unicode variant of ByteString use a recognized encoding? In reading all the poke/peek function I did not see anything that your tag bits accomplish that the tag bits in utf-8 do not, except that you want to write only a single

Re: [Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

2007-02-05 Thread Chris Kuklewicz
Twan van Laarhoven wrote: > Hello all, > > I would like to announce my attempt at making a Unicode version of > Data.ByteString. The library is named Data.CompactString to avoid > conflict with other (Fast)PackedString libraries. > > The library uses a variable length encod

[Haskell] ANNOUNCE: Data.CompactString 0.1 - my attempt at a Unicode ByteString

2007-02-04 Thread Twan van Laarhoven
Hello all, I would like to announce my attempt at making a Unicode version of Data.ByteString. The library is named Data.CompactString to avoid conflict with other (Fast)PackedString libraries. The library uses a variable length encoding (1 to 3 bytes) of Chars into Word8s, which are then

[Haskell] Re: unicode/internalization issues

2006-03-28 Thread John Meacham
On Sun, Mar 26, 2006 at 03:22:38PM +0400, Bulat Ziganshin wrote: > 3. Unicode support in I/O routines, i.e. ability to read/write UTF-8 > encoded files and files what use other Unicode byte encodings: not > implemented in any compiler, afaik, but there are 3rd-party libs: > Streams li

[Haskell] unicode/internalization issues

2006-03-26 Thread Bulat Ziganshin
Hello haskell-prime, i've planned some time ago to open unicode/internalization wiki page, what reflects current state of the art in this area. here is the information i have, please add/correct me if i don't know something or wrong. 1. Char supports full Unicode range (about millio

Re: [Haskell] Unicode Source / Keyboard Layout

2005-03-22 Thread Sven Moritz Hallberg
On 23 March 2005 at 5:23, Antonio Regidor García wrote: 8. Accept "element of" U+2208 instead of <- in list comprehensions? That would take it away as a regular operator (to use as the elem function). Also, the right hand side of those arrows being lists rather than proper sets, with an impo

Re: [Haskell] Unicode Source / Keyboard Layout

2005-03-22 Thread Antonio Regidor García
> > From: Sven Moritz Hallberg <[EMAIL PROTECTED]> > > To make the fun complete, I think a few adjustments to the Haskell > grammar would be in order, so, guessing the relevant people read these > lists, may I suggest the following? > > 1. In addition to the backslash, accept "mathematical

Re: [Haskell] Unicode Source / Keyboard Layout

2005-03-21 Thread Jules Bean
On 21 Mar 2005, at 12:12, Marcin 'Qrczak' Kowalczyk wrote: Sven Moritz Hallberg <[EMAIL PROTECTED]> writes: 1. In addition to the backslash, accept "mathematical * small lamda" (U+1D6CC, U+1D706, U+1D740, U+1D77A, and U+1D7B4) for lambda abstractions. Leave "greek small letter lamda" as a r

Re: [Haskell] Unicode Source / Keyboard Layout

2005-03-21 Thread Marcin 'Qrczak' Kowalczyk
Sven Moritz Hallberg <[EMAIL PROTECTED]> writes: > 1. In addition to the backslash, accept "mathematical * small > lamda" (U+1D6CC, U+1D706, U+1D740, U+1D77A, and U+1D7B4) for lambda > abstractions. Leave "greek small letter lamda" as a regular letter, > so the Greeks can write their native

[Haskell] Unicode Source / Keyboard Layout

2005-03-21 Thread Sven Moritz Hallberg
Greetings GHC and Haskell folk, please excuse the cross-post. This is a coordinational message. :) I've been longing for Unicode (UTF-8) input support in GHC for a long time. I am currently customizing a keyboard layout to include many mathematical operators and special characters which wou
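
As a later footnote to this thread: GHC eventually gained a UnicodeSyntax extension accepting several such code points as alternative syntax (→ for ->, ∷ for ::, ⇒ for =>, ← for <-). A small sketch, assuming a GHC new enough to ship the extension:

    {-# LANGUAGE UnicodeSyntax #-}

    -- → and ∷ are accepted as alternatives to -> and :: under UnicodeSyntax.
    applyTwice ∷ (a → a) → a → a
    applyTwice f x = f (f x)

    main ∷ IO ()
    main = print (applyTwice (+ 1) (0 ∷ Int))  -- prints 2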

[Haskell] [ANNOUNCE] New version of unicode CWString library with extras

2005-01-18 Thread John Meacham
A while ago I wrote a glibc specific implementation of the CWString library. I have since made several improvements: * No longer glibc specific, should compile and work on any system with iconv (which is unix standard) (but there are still glibc specific optimizations) * general iconv library

[Haskell] Availability of Hugs with Unicode support?

2004-07-14 Thread Graham Klyne
When can I expect a version of Hugs with Unicode support to be generally released? I've been using an unofficial build of Hugs with Unicode character support to develop enhancements to the HaXml parser which are needed for my RDF/XML parser. I'm starting to think about packaging

Re: [Haskell] Bug in experimental Unicode support for Hugs?

2004-05-28 Thread Ross Paterson
On Fri, May 28, 2004 at 01:20:32PM +0100, Graham Klyne wrote: > I've noticed a discrepancy in my version of Hugs with experimental Unicode > support enabled, based on the 20040109 codebase. It's exemplified by this: > > [[ > Main> '\x10FFFF' > '\1

[Haskell] Bug in experimental Unicode support for Hugs?

2004-05-28 Thread Graham Klyne
I've noticed a discrepancy in my version of Hugs with experimental Unicode support enabled, based on the 20040109 codebase. It's exemplified by this: [[ Main> '\x10FFFF' '\1114111' Main> maxBound::Char '\255' Main> ]] It appears that this v
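
For comparison (not part of the bug report), a full-Unicode implementation such as current GHC gives the expected answers; a tiny check:

    import Data.Char (ord)

    main :: IO ()
    main = do
      print (maxBound :: Char)  -- '\1114111', i.e. U+10FFFF
      print (ord '\x10FFFF')    -- 1114111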

MS-Windows build of Hugs with experimental Unicode support

2004-01-12 Thread Graham Klyne
Further to my last message [1], if anyone else wants to play with the experimental Unicode support for Hugs under MS-Windows, I've placed a Windows executable file on my Web site, linked from [2]. #g -- [1] http://haskell.org/pipermail/haskell/2004-January/013377.html [2]

Building experimental Unicode version of Hugs on Windows

2004-01-12 Thread Graham Klyne
e to run a complex test suite program. My next step is to see if the Unicode support is sufficient to run the HXML toolbox software. ... Attempting to load module HUnitExample.hs from the 3.01 HXML toolbox distribution (), I'm getting: [[ ERROR "..\hparser\Unicode.hs":116 - Hex

Problems building experimental Unicode version of Hugs on Windows

2004-01-09 Thread Graham Klyne
This message notes some edits I'm making to try and build the Dec2003 Hugs source kit with Unicode under MS-Windows (because I want to try and use HXML toolbox under Hugs). It also describes a problem I'm having with the experimental Unicode support under Windows. (I'm also

Re: Unicode + Re: Reading/Writing Binary Data in Haskell

2003-07-14 Thread George Russell
hoice as to the encoding, UTF-8 is > definitely the way to go. No I don't think so. UTF8 is a good choice if you want a way of storing Unicode files on an 8-bit file-system, but it is not as efficient an encoding for characters in general. Thus with UTF8 you can represent character codes less

Re: Unicode + Re: Reading/Writing Binary Data in Haskell

2003-07-14 Thread Glynn Clements
George Russell wrote: > > OTOH, existing implementations (at least GHC and Hugs) currently read > > and write "8-bit binary", i.e. characters 0-255 get read and written > > "as-is" and anything else breaks, and changing that would probably > > break a fair amount of existing code. > > The bi

Unicode + Re: Reading/Writing Binary Data in Haskell

2003-07-14 Thread George Russell
d the top bit to signal that the encoding is not yet complete. Characters 0-127 (which include the standard ASCII ones) get encoded as themselves. This is probably not nearly as efficient as encoding characters as themselves, but it's nice to be Unicode-proof ...

RE: H98 Report: Unicode (was: Re: H98 Report: input functions)

2002-09-11 Thread Simon Peyton-Jones
Ketil says: | While we're at it, are there any plans to remove this paragraph from | section 2.1: | | | Haskell uses a pre-processor to convert non-Unicode character sets | | into Unicode. This pre-processor converts all characters to Unicode | | and uses the escape sequence \uhhhh, wher

H98 Report: Unicode (was: Re: H98 Report: input functions)

2002-09-10 Thread Ketil Z. Malde
uses a pre-processor to convert non-Unicode character sets | into Unicode. This pre-processor converts all characters to Unicode | and uses the escape sequence \uhhhh, where the "h" are hex digits, | to denote escaped Unicode characters. Since this translation occurs | before the program

RE: Unicode again

2002-01-17 Thread Simon Peyton-Jones
| there was some discussion about Unicode and the Char type | some time ago. At the moment I'm writing some Haskell code | dealing with XML. The problem is that there seems to be no | consensus concerning Char so that it is difficult for me to | deal with the XML unicode issues appropri

RE: Unicode again

2002-01-16 Thread Kent Karlsson
This is getting a bit off-topic for Haskell... > Isn't it fairly common to use 32bit Unicode character types in C? Yes, in some implementations, but nobody but a few Linux and SunOS programmers use that... (Those systems are far from committed to Unicode.) In some other systems wc

Re: Unicode again

2002-01-16 Thread Ketil's local user
"Kent Karlsson" <[EMAIL PROTECTED]> writes: > Everyone that is serious about Unicode and where efficiency > is also of concern(!) target UTF-16 (MacOS, Windows, Epoc, Java, > Oracle, ...). Isn't it fairly common to use 32bit Unicode character types in C? I'm

RE: Unicode again

2002-01-15 Thread Kent Karlsson
> -----Original Message----- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On > Behalf Of Wolfgang Jeltsch > Sent: 5 January 2002 13:04 > To: The Haskell Mailing List > Subject: Unicode again > > > Hello, > there was some discussion about Unicode and

Unicode again

2002-01-08 Thread Wolfgang Jeltsch
Hello, there was some discussion about Unicode and the Char type some time ago. At the moment I'm writing some Haskell code dealing with XML. The problem is that there seems to be no consensus concerning Char so that it is difficult for me to deal with the XML unicode issues appropriatel

RE: Unicode stupidity (Was: Unicode support)

2001-10-24 Thread Karlsson Kent - keka
> None of that "But 21 bits *is* enough". > Yeah, like 640K was enough. And countless other examples. That is not comparable. Never was. > I thought we had learned, but I was wrong... I'm especially > disheartened to hear that ISO bought into the same crap. Who's going to invent all these gaz

Unicode stupidity (Was: Unicode support)

2001-10-24 Thread "Jürgen A. Erhard"
think 24 bits all the currently defined extended >> planes. [...] Just for reference... "currently" is the important word here. [Snipped interesting and (hopefully) enlightening stuff about Unicode and marketing] Marcin> IMHO it would have been better to not invent UTF

Re: More Unicode nit-picking

2001-10-19 Thread Kent Karlsson
- Original Message - From: "Colin Paul Adams" ... > But this seems to assume there is a one-to-one mapping of upper-case > to lower-case equivalent, and vice-versa. Apparently this is not > so. True. It's quite tricky. See below. > It seems that wh

Re: More Unicode nit-picking

2001-10-19 Thread Marcin 'Qrczak' Kowalczyk
ng. Language-specific mappings and irregular cases are described as deviations from it. > It seems that whilst the Unicode database's definitions of whether or > not a character is upper/lower/title case are normative, the mappings > from upper to lower case are only suggestive

More Unicode nit-picking

2001-10-18 Thread Colin Paul Adams
even harder to say what the cure should be, nevertheless: Firstly: "This module offers only a limited view of the full Unicode character set; the full set of Unicode character attributes is not accessible in this library." I take it this implies that operations in this module apply to

Re: Unicode support

2001-10-10 Thread Marcin 'Qrczak' Kowalczyk
Tue, 9 Oct 2001 14:59:09 -0700, John Meacham <[EMAIL PROTECTED]> pisze: > I think a canonical way to get at iconv's ('man 3 iconv' for info.) > functionality in one of the standard libraries would be great. perhaps > I will have a go at it. even if the underlying platform does not have > iconv the

Re: Unicode support

2001-10-09 Thread John Meacham
ing smaller than a pointer in general so for haskell the simplification of UTF-32 is most likely worth it. If space efficiency is a concern then I imagine people would want to use mutable arrays of bytes or words anyway (perhaps mmap'ed from a file) and not haskell lists of Chars. >

Re: Unicode support

2001-10-09 Thread Marcin 'Qrczak' Kowalczyk
On Tue, 9 Oct 2001, Ashley Yakeley wrote: > Would it be worthwhile restricting Char to the 0-10FFFF range, just as a > Word8 is restricted to 0-FF even though in GHC at least it's stored > 32-bit? It is thus restricted in GHC. I think it's a good compromise between 32-bit

Re: Unicode support

2001-10-09 Thread Ashley Yakeley
At 2001-10-09 03:37, Kent Karlsson wrote: >> >code position (=code point): a value between 0000 and 10FFFF. >> >> Would this be a reasonable basis for Haskell's 'Char' type? > >Yes. It's essentially UTF-32, but without the fixation to 32-bit >(21 bits suffice). UTF-32 (a.k.a. UCS-4 in 10646,

Re: Unicode support

2001-10-09 Thread Kent Karlsson
- Original Message - From: "Ashley Yakeley" <[EMAIL PROTECTED]> To: "Kent Karlsson" <[EMAIL PROTECTED]>; "Haskell List" <[EMAIL PROTECTED]>; "Libraries for Haskell List" <[EMAIL PROTECTED]> Sent: Tuesday, October 09, 20

Re: Unicode support

2001-10-09 Thread Ashley Yakeley
At 2001-10-09 02:58, Kent Karlsson wrote: >In summary: > >code position (=code point): a value between 0000 and 10FFFF. Would this be a reasonable basis for Haskell's 'Char' type? At some point perhaps there should be a 'Unicode' standard library for H

Re: Unicode support

2001-10-09 Thread Kent Karlsson
Just to clear up any misunderstanding: - Original Message - From: "Ashley Yakeley" <[EMAIL PROTECTED]> To: "Haskell List" <[EMAIL PROTECTED]> Sent: Monday, October 01, 2001 12:36 AM Subject: Re: Unicode support > At 2001-09-30 07:29, Marcin 'Qrc

Re: Unicode support

2001-10-08 Thread Kent Karlsson
- Original Message - From: "Dylan Thurston" <[EMAIL PROTECTED]> To: "John Meacham" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Friday, October 05, 2001 5:47 PM Subject: Re: Unicode support > On Sun, Sep 30, 2001 at 11:01:38AM -0700, John Mea

Re: Unicode support

2001-10-08 Thread Kent Karlsson
- Original Message - From: "Wolfgang Jeltsch" <[EMAIL PROTECTED]> To: "The Haskell Mailing List" <[EMAIL PROTECTED]> Sent: Thursday, October 04, 2001 8:47 PM Subject: Re: Unicode support > On Sunday, 30 September 2001 20:01, John Meacham wrote: >

Re: Unicode support

2001-10-05 Thread Dylan Thurston
r represents one unicode character, but the entire range of unicode > is not guaranteed to be expressible, which must be true, since haskell > 98 implementations can be written now, but unicode can change in the > future. The only range guaranteed to be expressible in any > representation

Re: Unicode support

2001-10-04 Thread Wolfgang Jeltsch
On Sunday, 30 September 2001 20:01, John Meacham wrote: > sorry for the me too post, but this has been a major pet peeve of mine > for a long time. 16 bit unicode should be gotten rid of, being the worst > of both worlds, non backwards compatible with ascii, endianness issues > and

Re: Unicode support

2001-09-30 Thread Jens Petersen
xtended > Jens> planes. So I guess the report just refers to the BMP. > > I guess it does, and I think back in 1998 that may still have been > identical to Unicode. > But the revision of the report that SPJ is preparing is unchanged in > this respect, and so is factually in

Re: Unicode support

2001-09-30 Thread Ashley Yakeley
At 2001-09-30 07:29, Marcin 'Qrczak' Kowalczyk wrote: >Some time ago the Unicode Consortium slowly began switching to the >point of view that abstract characters are denoted by numbers in the >range U+0000..10FFFF. It's worth mentioning that these are 'codepoints'

Re: Unicode support

2001-09-30 Thread John Meacham
sorry for the me too post, but this has been a major pet peeve of mine for a long time. 16 bit unicode should be gotten rid of, being the worst of both worlds, non backwards compatible with ascii, endianness issues and no constant length encoding utf8 externally and utf32 when working with

Re: Unicode support

2001-09-30 Thread Marcin 'Qrczak' Kowalczyk
30 Sep 2001 14:43:21 +0100, Colin Paul Adams <[EMAIL PROTECTED]> pisze: > I think it should either be amended to mention the BMP subset of > Unicode, or, better, change the reference from 16-bit to 24-bit. 24-bit is not accurate. The range from 0 to 0x10FFFF has 20.087462841250343

Re: Unicode support

2001-09-30 Thread Marcin 'Qrczak' Kowalczyk
30 Sep 2001 22:28:52 +0900, Jens Petersen <[EMAIL PROTECTED]> pisze: > 16 bits is enough to describe the Basic Multilingual Plane > and I think 24 bits all the currently defined extended > planes. So I guess the report just refers to the BMP. In early days the Unicode Conso

Re: Unicode support

2001-09-30 Thread Colin Paul Adams
P. I guess it does, and I think back in 1998 that may still have been identical to Unicode. But the revision of the report that SPJ is preparing is unchanged in this respect, and so is factually inaccurate. I think it should either be amended to mention the BMP subset of Unicode, or, better, ch

Re: Unicode support

2001-09-30 Thread Jens Petersen
numeration and consists of 16 bit values, > >conforming to > >the Unicode standard [10]. > > > >Unicode uses 24-bit values to identify characters. > > According to the official Unicode web site [0], > > The Unicode Standard defines three encodi

Re: Unicode support

2001-09-29 Thread Hamilton Richards
At 12:20 PM -0500 9/29/01, Colin Paul Adams wrote: >I have just been reading through the Haskell report to refresh my >memory of the language. I was surprised to see this: > >The character type Char is an enumeration and consists of 16 bit values, >conforming to >the Un

Unicode support

2001-09-29 Thread Colin Paul Adams
I have just been reading through the Haskell report to refresh my memory of the language. I was surprised to see this: The character type Char is an enumeration and consists of 16 bit values, conforming to the Unicode standard [10]. Unicode uses 24-bit values to identify characters. -- Colin

Re: Picky details about Unicode (was RE: Haskell 98 Report possible errors, part one)

2001-07-24 Thread Marcin 'Qrczak' Kowalczyk
Mon, 23 Jul 2001 11:23:30 -0700, Mark P Jones <[EMAIL PROTECTED]> pisze: > I guess the intention here is that: > > symbol -> ascSymbol | uniSymbol_ Right. > In fact, since all the characters in ascSymbol are either > punctuation or symbols in Unicode, the inc

Picky details about Unicode (was RE: Haskell 98 Report possible errors, part one)

2001-07-23 Thread Mark P Jones
| 2.2. Identifiers can use small and large Unicode letters ... If we're picking on the report's handling of Unicode, here's another minor quibble to add to the list. In describing the lexical syntax of operator symbols, the report uses: varsym-> (symbol {symbo

Re: Unicode

2001-05-25 Thread Marcin 'Qrczak' Kowalczyk
Sat, 26 May 2001 03:17:40 +1000, Fergus Henderson <[EMAIL PROTECTED]> pisze: > Is there a way to convert a Haskell String into a UTF-16 > encoded byte stream without writing to a file and then > reading the file back in? Sure: all conversions are available as memory to memory conversions for dir

Re: Unicode

2001-05-25 Thread John Meacham
The algorithms for encoding unicode characters into the various transport formats, UTF16, UTF8, UTF32, are well defined; they can trivially be implemented in Haskell. For instance, encodeUTF8 :: String -> [Byte] and decodeUTF8 :: [Byte] -> Maybe String would be easily definable. BTW, since a char
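
A self-contained sketch of the encoder alluded to here; the names encodeCharUTF8/encodeUTF8 and the use of [Word8] in place of [Byte] are illustrative choices, not from the message:

    import Data.Bits (shiftR, (.&.), (.|.))
    import Data.Char (ord)
    import Data.Word (Word8)

    -- Encode one Char (any code point up to U+10FFFF) as its UTF-8 byte sequence.
    encodeCharUTF8 :: Char -> [Word8]
    encodeCharUTF8 c
      | n < 0x80    = [fromIntegral n]
      | n < 0x800   = [0xC0 .|. fromIntegral (n `shiftR` 6), cont n]
      | n < 0x10000 = [0xE0 .|. fromIntegral (n `shiftR` 12), cont (n `shiftR` 6), cont n]
      | otherwise   = [0xF0 .|. fromIntegral (n `shiftR` 18), cont (n `shiftR` 12), cont (n `shiftR` 6), cont n]
      where
        n      = ord c
        cont m = 0x80 .|. fromIntegral (m .&. 0x3F)  -- continuation byte: 10xxxxxx

    encodeUTF8 :: String -> [Word8]
    encodeUTF8 = concatMap encodeCharUTF8

    main :: IO ()
    main = print (encodeUTF8 "a\x00E9\x20AC")  -- [97, 195,169, 226,130,172]

A decodeUTF8 going the other way is similar, but has to validate continuation bytes, hence the Maybe in the quoted type.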

Re: Unicode

2001-05-25 Thread Fergus Henderson
On 24-May-2001, Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> wrote: > Thu, 24 May 2001 14:41:21 -0700, Ashley Yakeley <[EMAIL PROTECTED]> pisze: > > >> - Initial Unicode support - the Char type is now 31 bits. > > > > It might be appropriate t

Re: Unicode

2001-05-24 Thread Marcin 'Qrczak' Kowalczyk
Thu, 24 May 2001 14:41:21 -0700, Ashley Yakeley <[EMAIL PROTECTED]> pisze: >> - Initial Unicode support - the Char type is now 31 bits. > > It might be appropriate to have two types for Unicode, a UCS2 type > (16 bits) and a UCS4 type (31 bits). Actually it's 20.087

Unicode

2001-05-24 Thread Ashley Yakeley
At 2001-05-24 05:57, Julian Seward (Intl Vendor) wrote: > - Initial Unicode support - the Char type is now 31 bits. It might be appropriate to have two types for Unicode, a UCS2 type (16 bits) and a UCS4 type (31 bits). For instance, something like: -- newtype UCS2CodePo

Re: Unicode and is*

2001-05-04 Thread Marcin 'Qrczak' Kowalczyk
Fri, 4 May 2001 15:20:02 +0100, Ian Lynagh <[EMAIL PROTECTED]> pisze: > Is there a reason why isUpper and isLower include all unicode > characters of the appropriate class but isDigit is only 0..9? There are also other weirdnesses, e.g. isSpace is specified to work only on ISO-88

Unicode and is*

2001-05-04 Thread Ian Lynagh
Hi all Is there a reason why isUpper and isLower include all unicode characters of the appropriate class but isDigit is only 0..9? Are there any Haskell unicode libraries around? And is the implementation of unicode support for GHC being discussed/developed anywhere? BTW, The comments in the
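
A quick check of the asymmetry being asked about, against today's Data.Char in GHC's base (a sketch of current behavior; 2001-era implementations may have differed):

    import Data.Char (generalCategory, isDigit, isLower, isUpper)

    main :: IO ()
    main = do
      print (isUpper 'Ä')          -- True: isUpper covers Unicode upper- and title-case letters
      print (isLower 'λ')          -- True
      print (isDigit '3')          -- True
      print (isDigit '³')          -- False: isDigit selects ASCII '0'..'9' only
      print (generalCategory '³')  -- OtherNumber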

RE: Unicode, emacs, xterm, (Linux)

1999-10-12 Thread Karlsson Kent - keka
t it quite right yet.) I haven't followed all on what is done regarding Unicode in emacs, but apparently Unicode gets into emacs too (see below). I never use emacs myself, so I haven't tried the package ref. below in any way whatsoever. You probably need to install some more or les

Re: Unicode (Re: Reverse composition)

1999-10-11 Thread Ralf Muschall
Lennart Augustsson wrote: > It's not hard to find a text editor, use e.g. wily. It's widely available. But it is hard to use some nonstandard (i.e. neither vi nor emacs) editor just for one special kind of source code - it means losing all the keybindings, highlight settings, 100-lines-of-defi

Unicode (was RE: Reverse composition)

1999-10-11 Thread Tom Pledger
Brian Boutel writes: > [...] > > If the supply of suitable Ascii symbols seems inadequate, remember > that Haskell uses Unicode. There is no reason to limit symbols to > those in the Ascii set. While we're on the subject, I suggest Unicode as a Hugs/GHC wish list item.

Re: Unicode (Re: Reverse composition)

1999-10-11 Thread George Russell
Marcin 'Qrczak' Kowalczyk wrote: [snip] > But when Unicode finally comes... How should Haskell's textfile IO > work? I don't think the current standard functions for textfile IO would have too many problems. You can do hSeek in Haskell, but "The offset i

Re: Unicode (Re: Reverse composition)

1999-10-09 Thread Lennart Augustsson
two fragments: a = x + y where x = 1 y = 1 vs. a = x ++ y where x = 1 y = 1 They have very different syntactical meaning. > > If the supply of suitable Ascii symbols seems inadequate, remember > > that Haskell uses Unicode. There is no reason to limit

Layout (Re: Unicode)

1999-10-09 Thread Marcin 'Qrczak' Kowalczyk
Sat, 09 Oct 1999 13:08:39 +0200, Lennart Augustsson <[EMAIL PROTECTED]> pisze: > > No, because only the indent before the first non-whitespace character > > in a line matters. Haskell programs can be typeset even in proportional > > font as long as indents have correct relationships between their

Unicode (Re: Reverse composition)

1999-10-09 Thread Marcin 'Qrczak' Kowalczyk
n a line matters. Haskell programs can be typeset even in proportional font as long as indents have correct relationships between their lengths. > If the supply of suitable Ascii symbols seems inadequate, remember > that Haskell uses Unicode. There is no reason to limit symbols to > those in

Re: Haskell 1.4 and Unicode

1997-11-10 Thread Hans Aberg
I think that we are perhaps getting a little off-topic now, but Unicode will clearly help forward computing, so perhaps it can continue a few more postings. :-) At 17:45 +0100 97/11/10, Kent Karlsson [EMAIL PROTECTED] wrote: > Let me reiterate: > Unicode is ***NOT*** a

SV: Haskell 1.4 and Unicode

1997-11-10 Thread Kent Karlsson [EMAIL PROTECTED]
Let me reiterate: Unicode is ***NOT*** a glyph encoding! Unicode is ***NOT*** a glyph encoding! and never will be. The same character can be displayed as a variety of glyphs, depending not only on the font/style, but also, and this is the important

Re: SV: Haskell 1.4 and Unicode

1997-11-10 Thread Hans Aberg
At 12:45 +0100 97/11/10, Kent Karlsson [EMAIL PROTECTED] wrote: > As everyone (getting) familiar with Unicode should > know, Unicode is **NOT** a font encoding. > It is a CHARACTER encoding. The difference > shows up mostly for 'complex scripts', such as Arabic > and D

SV: Haskell 1.4 and Unicode

1997-11-10 Thread Kent Karlsson [EMAIL PROTECTED]
ll programs are not written that way.) 3. (In reply to Hans Aberg (Åberg?)) > The easiest way of thinking of Unicode is perhaps as a font encoding; a > font using this encoding would add such things as typeface family, style, > size, kerning (but Unicod

Re: Haskell 1.4 and Unicode

1997-11-10 Thread Ron Wichers Schreur
Carl R. Witty wrote (to the Haskell mailing list): > [..] > The Report could give up and say that column numbers in the > presence of \u escapes are explicitly implementation-defined. > [..] > [This] sounds pretty bad (effectively prohibiting layout in portable > prog

Re: Haskell 1.4 and Unicode

1997-11-07 Thread John C. Peterson
I had option 1 in mind when that part of the report was written. We should clarify this in the next revision. And thanks for your analysis of the problem! John

Re: Haskell 1.4 and Unicode

1997-11-07 Thread Hans Aberg
>Carl R. Witty wrote: > >> 1) I assume that layout processing occurs after Unicode preprocessing; >> otherwise, you can't even find the lexemes. If so, are all Unicode >> characters assumed to be the same width? The easiest way of thinking of Unicode is perha

Re: Haskell 1.4 and Unicode

1997-11-07 Thread Kent Karlsson
Carl R. Witty wrote: > 1) I assume that layout processing occurs after Unicode preprocessing; > otherwise, you can't even find the lexemes. If so, are all Unicode > characters assumed to be the same width? Unicode characters ***cannot in any way*** be considered as being of th

Re: Haskell 1.4 and Unicode

1997-11-07 Thread Carl R. Witty
Kent Karlsson <[EMAIL PROTECTED]> writes: > Carl R. Witty wrote: > > > 1) I assume that layout processing occurs after Unicode preprocessing; > > otherwise, you can't even find the lexemes. If so, are all Unicode > > characters assumed to be the same wid

Re: Haskell 1.4 and Unicode

1997-11-07 Thread Lennart Augustsson
Unicode was added at the last moment, so there are likely to be some discrepancies. > 1) I assume that layout processing occurs after Unicode preprocessing; > otherwise, you can't even find the lexemes. If so, are all Unicode > characters assumed to be the same width? I think

Haskell 1.4 and Unicode

1997-11-07 Thread Carl R. Witty
I have some questions regarding Haskell 1.4 and Unicode. My source materials for these questions are "The Haskell 1.4 Report" and the files ftp://ftp.unicode.org/Public/2.0-Update/ReadMe-2.0.14.txt and ftp://ftp.unicode.org/Public/2.0-Update/UnicodeData-2.0.14.txt It'