Hi,
I released unicode-transforms some time back as bindings to a C library
(utf8proc). Since then I have rewritten it completely in Haskell. The Haskell
data structures are automatically generated from the Unicode database, so it
can be kept up-to-date with the standard, unlike the C implementation.
This is for those who want Unicode normalization but do not want a
dependency on the heavyweight ICU libraries. It has the same API as the
text-icu package. It is based on the utf8proc C implementation.
https://hackage.haskell.org/package/unicode-transforms
https://github.com/harendra-kumar
The first announcement this year:
The 'unicode' package contains functions for construction of various
characters like:
* block graphic elements
* frame elements
* fractions
* subscript and superscript characters
http://hackage.haskell.org/package/unicode
The package is simp
Indeed, the Unicode only made it through my system, and not the mailer
itself. In fact, the name should look something like "Ćwikłowski",
and not the garbled mess that made it through. Apologies again, Bartek!
/Joe
Haskell ma
http://trac.haskell.org/haskeline/wiki/KeyBindings
* A new preference 'historyDuplicates' to remove repeated history
entries
* Recognize PageUp and PageDown keys
* Compatibility with ghc-6.12
* Correct width calculations for Unicode combining characters
Oh, this Unicode width calculati
Dear Don,
thanks very much!
You may need to write the strings to the database using the utf8-string
package.
My program may not be the only one writing into the database. But I now
*read* from the database using utf8-string, which solves the problem:
import Codec.Binary.UTF8.String ( decode )
kalman:
> Excuse me if this question is misplaced or too trivial.
>
> I'm writing a CGI program in Haskell (CGI/HDBC/Sqlite3), the database
> contains UTF-8 strings. If I use HDBC.fetchRow() to retrieve the data,
> then HDBC.fromSql() to convert the data to Haskell, then Text.XHtml
> construc
Excuse me if this question is misplaced or too trivial.
I'm writing a CGI program in Haskell (CGI/HDBC/Sqlite3), the database
contains UTF-8 strings. If I use HDBC.fetchRow() to retrieve the data,
then HDBC.fromSql() to convert the data to Haskell, then Text.XHtml
constructs to display it, I
I want to design a DSL in Haskell for propositional calculus. But instead of
using natural language names for functions like or, and, implies etc. I want to
use Unicode symbols as infix functions ¬, ˅, ˄, →, ↔. But I keep getting error
messages from the GHC parser. Is there a way to make GHC
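GHC does accept operators built from Unicode symbol characters, so a DSL along these lines is possible; one common pitfall is picking look-alike codepoints that are not in a symbol category. A minimal sketch (all names here are my own invention, not from the original post):

```haskell
-- A propositional DSL over Bool using Unicode operators.  The
-- mathematical symbols ∧ (U+2227) and ∨ (U+2228) are accepted as
-- operator characters; note they are not the look-alike modifier
-- letters ˄/˅.  Avoid → when the UnicodeSyntax extension is on,
-- since it then becomes reserved syntax for ->.

infixr 3 ∧
infixr 2 ∨

(¬) :: Bool -> Bool        -- used as (¬) p, since operators are infix
(¬) = not

(∧), (∨) :: Bool -> Bool -> Bool
(∧) = (&&)
(∨) = (||)

-- Example: material implication defined from the primitives.
implies :: Bool -> Bool -> Bool
implies p q = (¬) p ∨ q
```

Loading this in GHCi, expressions such as `True ∧ (¬) False` evaluate as expected.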
Hello all,
I would like to announce version 0.3 of my Data.CompactString library.
Data.CompactString is a wrapper around Data.ByteString that represents a
Unicode string. This new version supports different encodings, as can be
seen from the data type:
> data Encoding a => CompactSt
Unicode in use are UTF-16, UTF-8, and (less frequently) UTF-32.
On Feb 9, 2007, at 6:02 AM, Duncan Coutts wrote:
Apparently UTF-16 (which is like UCS-2 but covers all code points)
is a good internal format. It is more compact than UTF-32 in almost all
cases and a less complex encoding than UTF-8
find it helpful to
maintain the fiction that there's no such thing (any more) as UCS-N,
there's only UTF-8, 16 and 32. This is also what the Unicode consortium
tries to encourage.
My view is that we should just provide all three:
Data.PackedString.UTF8
Data.PackedString.UTF16
Data.PackedString.UTF32
On Mon, Feb 05, 2007 at 01:14:26PM +0100, Twan van Laarhoven wrote:
> The reason for inventing my own encoding is that it is easier to use and
> takes less space than UTF-8. The only advantage UTF-8 has is that it can
> be read and written directly. I guess this is a trade off, faster
> manipula
On Tue, Feb 06, 2007 at 03:16:17PM +0900, shelarcy wrote:
> I'm afraid that fantasy is broken again: UCS-2 without surrogate
> pairs does not cover all languages, as people in Europe and America
> once trusted.
UCS-2 is a disaster in every way. someone had to say it. :)
everything should be ascii, utf8
Yes, it seems you are incorrect.
>
> Haskell Chars go up to Unicode 1114111 (decimal) or 0x10FFFF (hexadecimal).
> These are encoded by UTF-8 in 1, 2, 3, or 4 bytes.
I see. I confused Unicode support with charset support.
I'm sorry about it.
UCS-4 can support code points greater than 1114111,
allowing up to 6 bytes to encode a single char.
> There's nothing stopping the Unicode consortium from expanding the
> range of codepoints, is there? Or have they said that'll never happen?
I believe they have. In particular, UTF-16 only supports code points up
to 0x10FFFF.
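The 0x10FFFF ceiling falls directly out of UTF-16's surrogate mechanism; a sketch (the function name is my own) of how a supplementary code point splits into a surrogate pair:

```haskell
import Data.Bits (shiftR, (.&.))

-- Why UTF-16 caps Unicode at 0x10FFFF: a code point above 0xFFFF is
-- encoded as two 16-bit units, and the 10+10 payload bits of a
-- surrogate pair can address exactly 0x100000 supplementary points.
utf16SurrogatePair :: Int -> Maybe (Int, Int)
utf16SurrogatePair n
  | n < 0x10000 || n > 0x10FFFF = Nothing    -- BMP chars use one unit
  | otherwise = Just ( 0xD800 + (m `shiftR` 10)   -- high surrogate
                     , 0xDC00 + (m .&. 0x3FF) )   -- low surrogate
  where m = n - 0x10000
```

For example U+1D11E (MUSICAL SYMBOL G CLEF) comes out as the pair (0xD834, 0xDD1E).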
as the most accurate UTF8 en + de-coders:
http://abridgegame.org/cgi-bin/darcs.cgi/darcs/UTF8.lhs?c=annotate
There's nothing stopping the Unicode consortium from expanding the
range of codepoints, is there? Or have they said that'll never happen?
Alistair
shelarcy wrote:
> Hello Twan,
>
> On Mon, 05 Feb 2007 08:46:35 +0900, Twan van Laarhoven <[EMAIL PROTECTED]>
> wrote:
>> I would like to announce my attempt at making a Unicode version of
>> Data.ByteString. The library is named Data.CompactString to avoid
Hello Twan,
On Mon, 05 Feb 2007 08:46:35 +0900, Twan van Laarhoven <[EMAIL PROTECTED]>
wrote:
> I would like to announce my attempt at making a Unicode version of
> Data.ByteString. The library is named Data.CompactString to avoid
> conflict with other (Fast)PackedString libraries.
Chris Kuklewicz wrote:
Can I be among the first to ask that any Unicode variant of ByteString use a
recognized encoding?
In reading all the poke/peek functions I did not see anything that your tag bits
accomplish that the tag bits in utf-8 do not, except that you want to write only
a single
Twan van Laarhoven wrote:
> Hello all,
>
> I would like to announce my attempt at making a Unicode version of
> Data.ByteString. The library is named Data.CompactString to avoid
> conflict with other (Fast)PackedString libraries.
>
> The library uses a variable length encoding
Hello all,
I would like to announce my attempt at making a Unicode version of
Data.ByteString. The library is named Data.CompactString to avoid
conflict with other (Fast)PackedString libraries.
The library uses a variable length encoding (1 to 3 bytes) of Chars into
Word8s, which are then
On Sun, Mar 26, 2006 at 03:22:38PM +0400, Bulat Ziganshin wrote:
> 3. Unicode support in I/O routines, i.e. ability to read/write UTF-8
> encoded files and files what use other Unicode byte encodings: not
> implemented in any compiler, afaik, but there are 3rd-party libs:
> Streams li
Hello haskell-prime,
i've planned some time ago to open a unicode/internationalization wiki
page that reflects the current state of the art in this area. here is the
information i have; please add to or correct me if i don't know something
or am wrong.
1. Char supports full Unicode range (about a million code points)
On 23 March 2005 at 5:23, Antonio Regidor García wrote:
8. Accept "element of" U+2208 instead of <- in list comprehensions?
That would take it away as a regular operator (to use as the elem
function).
Also, the right hand side of those arrows being lists rather than
proper sets, with an impo
> > De: Sven Moritz Hallberg <[EMAIL PROTECTED]>
>
> To make the fun complete, I think a few adjustments to the Haskell
> grammar would be in order, so, guessing the relevant people read these
> lists, may I suggest the following?
>
> 1. In addition to the backslash, accept "mathematical
On 21 Mar 2005, at 12:12, Marcin 'Qrczak' Kowalczyk wrote:
Sven Moritz Hallberg <[EMAIL PROTECTED]> writes:
1. In addition to the backslash, accept "mathematical * small
lamda" (U+1D6CC, U+1D706, U+1D740, U+1D77A, and U+1D7B4) for lambda
abstractions. Leave "greek small letter lamda" as a regular letter,
Sven Moritz Hallberg <[EMAIL PROTECTED]> writes:
> 1. In addition to the backslash, accept "mathematical * small
> lamda" (U+1D6CC, U+1D706, U+1D740, U+1D77A, and U+1D7B4) for lambda
> abstractions. Leave "greek small letter lamda" as a regular letter,
> so the Greeks can write their native
Greetings GHC and Haskell folk, please excuse the cross-post.
This is a coordinational message. :)
I've been longing for Unicode (UTF-8) input support in GHC for a long
time. I am currently customizing a keyboard layout to include many
mathematical operators and special characters which wou
A while ago I wrote a glibc specific implementation of the CWString
library. I have since made several improvements:
* No longer glibc specific, should compile and work on any system with
iconv (which is unix standard) (but there are still glibc specific
optimizations)
* general iconv library
When can I expect a version of Hugs with Unicode support to be generally
released?
I've been using an unofficial build of Hugs with Unicode character support
to develop enhancements to the HaXml parser which are needed for my RDF/XML
parser. I'm starting to think about packaging
On Fri, May 28, 2004 at 01:20:32PM +0100, Graham Klyne wrote:
> I've noticed a discrepancy in my version of Hugs with experimental Unicode
> support enabled, based on the 20040109 codebase. It's exemplified by this:
>
> [[
> Main> '\x10FFFF'
> '\1114111'
I've noticed a discrepancy in my version of Hugs with experimental Unicode
support enabled, based on the 20040109 codebase. It's exemplified by this:
[[
Main> '\x10FFFF'
'\1114111'
Main> maxBound::Char
'\255'
Main>
]]
It appears that this v
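For comparison, an implementation with full Unicode support reports the bound the thread settles on; in GHC today (a small illustration, not part of the original report):

```haskell
import Data.Char (chr, ord)

-- With full Unicode support, Char's bounds line up with the
-- code-point range discussed in this thread, rather than the
-- Latin-1 '\255' that the Hugs build above shows for maxBound.
charBounds :: (Int, Int)
charBounds = (ord (minBound :: Char), ord (maxBound :: Char))
-- charBounds == (0, 1114111), i.e. U+0000 .. U+10FFFF
```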
Further to my last message [1], if anyone else wants to play with the
experimental Unicode support for Hugs under MS-Windows, I've placed a
Windows executable file on my Web site, linked from [2].
#g
--
[1] http://haskell.org/pipermail/haskell/2004-January/013377.html
[2]
e to run a complex
test suite program. My next step is to see if the Unicode support is
sufficient to run the HXML toolbox software.
...
Attempting to load module HUnitExample.hs from the 3.01 HXML toolbox
distribution (), I'm getting:
[[
ERROR "..\hparser\Unicode.hs":116 - Hex
This message notes some edits I'm making to try and build the Dec2003 Hugs
source kit with Unicode under MS-Windows (because I want to try and use
HXML toolbox under Hugs). It also describes a problem I'm having with the
experimental Unicode support under Windows.
(I'm also
choice as to the encoding, UTF-8 is
> definitely the way to go.
No I don't think so. UTF8 is a good choice if you want
a way of storing Unicode files on an 8-bit file-system, but it
is not as efficient an encoding for characters in general.
Thus with UTF8 you can represent character codes less
George Russell wrote:
> > OTOH, existing implementations (at least GHC and Hugs) currently read
> > and write "8-bit binary", i.e. characters 0-255 get read and written
> > "as-is" and anything else breaks, and changing that would probably
> > break a fair amount of existing code.
>
> The bi
d the top bit to signal
that the encoding is not yet complete. Characters 0-127 (which
include the standard ASCII ones) get encoded as themselves.
This is probably not nearly as efficient as encoding characters
as themselves, but it's nice to be Unicode-proof ...
Ketil says:
| While we're at it, are there any plans to remove this paragraph from
| section 2.1:
|
| | Haskell uses a pre-processor to convert non-Unicode character sets
| | into Unicode. This pre-processor converts all characters to Unicode
| and uses the escape sequence \uhhhh, where
uses a pre-processor to convert non-Unicode character sets
| into Unicode. This pre-processor converts all characters to Unicode
| and uses the escape sequence \uhhhh, where the "h" are hex digits,
| to denote escaped Unicode characters. Since this translation occurs
| before the program
| there was some discussion about Unicode and the Char type
| some time ago. At the moment I'm writing some Haskell code
| dealing with XML. The problem is that there seems to be no
| consensus concerning Char so that it is difficult for me to
| deal with the XML unicode issues appropriately
This is getting a bit off-topic for Haskell...
> Isn't it fairly common to use 32bit Unicode character types in C?
Yes, in some implementations, but nobody but a few Linux and SunOS
programmers use that... (Those systems are far from committed to
Unicode.)
In some other systems wc
"Kent Karlsson" <[EMAIL PROTECTED]> writes:
> Everyone that is serious about Unicode and where efficiency
> is also of concern(!) target UTF-16 (MacOS, Windows, Epoc, Java,
> Oracle, ...).
Isn't it fairly common to use 32bit Unicode character types in C?
I'm
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Wolfgang Jeltsch
> Sent: den 5 januari 2002 13:04
> To: The Haskell Mailing List
> Subject: Unicode again
>
>
> Hello,
> there was some discussion about Unicode and
Hello,
there was some discussion about Unicode and the Char type some time ago. At
the moment I'm writing some Haskell code dealing with XML. The problem is
that there seems to be no consensus concerning Char so that it is difficult
for me to deal with the XML unicode issues appropriately
> None of that "But 21 bits *is* enough".
> Yeah, like 640K was enough. And countless other examples.
That is not comparable. Never was.
> I thought we had learned, but I was wrong... I'm especially
> disheartened to hear that ISO bought into the same crap.
Who's going to invent all these gaz
think 24 bits covers all the currently defined extended
>> planes. [...]
Just for reference... "currently" is the important word here.
[Snipped interesting and (hopefully) enlightening stuff about Unicode
and marketing]
Marcin> IMHO it would have been better to not invent UTF
- Original Message -
From: "Colin Paul Adams"
...
> But this seems to assume there is a one-to-one mapping of upper-case
> to lower-case equivalent, and vice-versa. Apparently this is not
> so.
True. It's quite tricky. See below.
> It seems that wh
ng.
Language-specific mappings and irregular cases are described as
deviations from it.
> It seems that whilst the Unicode database's definitions of whether or
> not a character is upper/lower/title case are normative, the mappings
> from upper to lower case are only suggestive
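The "no one-to-one mapping" point can be seen directly in Data.Char, whose Char-to-Char functions are necessarily the simple case mappings (a small illustration, not from the original thread):

```haskell
import Data.Char (toLower, toUpper)

-- Data.Char's Char -> Char mappings are the "simple" case mappings:
-- they cannot express the full, sometimes one-to-many, mappings.
caseExamples :: [(Char, Char)]
caseExamples =
  [ ('a', toUpper 'a')   -- ('a','A'): the regular case
  , ('ß', toUpper 'ß')   -- ('ß','ß'): the full uppercase is "SS",
                         -- which no Char -> Char function can return
  ]
```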
even harder to
say what the cure should be, nevertheless:
Firstly:
"This module offers only a limited view of the
full Unicode character set; the full set of Unicode character
attributes is not accessible in this library."
I take it this implies that operations in this module apply to
Tue, 9 Oct 2001 14:59:09 -0700, John Meacham <[EMAIL PROTECTED]> pisze:
> I think a cannonical way to get at iconvs ('man 3 iconv' for info.)
> functionality in one of the standard librarys would be great. perhaps
> I will have a go at it. even if the underlying platform does not have
> iconv the
ing smaller than a pointer in general so for
haskell the simplification of UTF-32 is most likely worth it.
If space efficiency is a concern than I imagine people would want to use
mutable arrays of bytes or words anyway (perhaps mmap'ed from a file)
and not haskell lists of Chars.
>
On Tue, 9 Oct 2001, Ashley Yakeley wrote:
> Would it be worthwhile restricting Char to the 0-10FFFF range, just as a
> Word8 is restricted to 0-FF even though in GHC at least it's stored
> 32-bit?
It is thus restricted in GHC. I think it's a good compromise between
32-bit
At 2001-10-09 03:37, Kent Karlsson wrote:
>> >code position (=code point): a value between 0000 and 10FFFF.
>>
>> Would this be a reasonable basis for Haskell's 'Char' type?
>
>Yes. It's essentially UTF-32, but without the fixation to 32-bit
>(21 bits suffice). UTF-32 (a.k.a. UCS-4 in 10646,
- Original Message -
From: "Ashley Yakeley" <[EMAIL PROTECTED]>
To: "Kent Karlsson" <[EMAIL PROTECTED]>; "Haskell List" <[EMAIL PROTECTED]>;
"Libraries for Haskell List"
<[EMAIL PROTECTED]>
Sent: Tuesday, October 09, 20
At 2001-10-09 02:58, Kent Karlsson wrote:
>In summary:
>
>code position (=code point): a value between 0000 and 10FFFF.
Would this be a reasonable basis for Haskell's 'Char' type? At some point
perhaps there should be a 'Unicode' standard library for Haskell.
Just to clear up any misunderstanding:
- Original Message -
From: "Ashley Yakeley" <[EMAIL PROTECTED]>
To: "Haskell List" <[EMAIL PROTECTED]>
Sent: Monday, October 01, 2001 12:36 AM
Subject: Re: Unicode support
> At 2001-09-30 07:29, Marcin 'Qrczak' Kowalczyk wrote:
- Original Message -
From: "Dylan Thurston" <[EMAIL PROTECTED]>
To: "John Meacham" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Friday, October 05, 2001 5:47 PM
Subject: Re: Unicode support
> On Sun, Sep 30, 2001 at 11:01:38AM -0700, John Meacham wrote:
- Original Message -
From: "Wolfgang Jeltsch" <[EMAIL PROTECTED]>
To: "The Haskell Mailing List" <[EMAIL PROTECTED]>
Sent: Thursday, October 04, 2001 8:47 PM
Subject: Re: Unicode support
> On Sunday, 30 September 2001 20:01, John Meacham wrote:
>
r represents one unicode character, but the entire range of unicode
> is not guaranteed to be expressible, which must be true, since haskell
> 98 implementations can be written now, but unicode can change in the
> future. The only range guaranteed to be expressible in any
> representation
On Sunday, 30 September 2001 20:01, John Meacham wrote:
> sorry for the me too post, but this has been a major pet peeve of mine
> for a long time. 16 bit unicode should be gotten rid of, being the worst
> of both worlds, non backwards compatible with ascii, endianness issues
> and
xtended
> Jens> planes. So I guess the report just refers to the BMP.
>
> I guess it does, and I think back in 1998 that may still have been
> identical to Unicode.
> But the revision of the report that SPJ is preparing is unchanged in
this respect, and so is factually inaccurate.
At 2001-09-30 07:29, Marcin 'Qrczak' Kowalczyk wrote:
>Some time ago the Unicode Consortium slowly began switching to the
>point of view that abstract characters are denoted by numbers in the
>range U+0000..10FFFF.
It's worth mentioning that these are 'codepoints
sorry for the me too post, but this has been a major pet peeve of mine
for a long time. 16 bit unicode should be gotten rid of, being the worst
of both worlds, non backwards compatible with ascii, endianness issues
and no constant length encoding utf8 externally and utf32 when
working with
30 Sep 2001 14:43:21 +0100, Colin Paul Adams <[EMAIL PROTECTED]> pisze:
> I think it should either be amended to mention the BMP subset of
> Unicode, or, better, change the reference from 16-bit to 24-bit.
24-bit is not accurate. The range from 0 to 0x10FFFF has
20.087462841250343 bits.
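That odd-looking figure is just the bit-width of the code-point range, as a one-liner shows:

```haskell
-- The figure quoted above is log2 of the number of code points:
-- 0x10FFFF + 1 = 1114112 values need just over 20 bits.
bitsForUnicode :: Double
bitsForUnicode = logBase 2 (0x10FFFF + 1)   -- ≈ 20.087462841250343
```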
30 Sep 2001 22:28:52 +0900, Jens Petersen <[EMAIL PROTECTED]> pisze:
> 16 bits is enough to describe the Basic Multilingual Plane
> and I think 24 bits covers all the currently defined extended
> planes. So I guess the report just refers to the BMP.
In early days the Unicode Conso
I guess it does, and I think back in 1998 that may still have been
identical to Unicode.
But the revision of the report that SPJ is preparing is unchanged in
this respect, and so is factually inaccurate.
I think it should either be amended to mention the BMP subset of
Unicode, or, better, change the reference from 16-bit to 24-bit.
numeration and consists of 16 bit values,
> >conforming to
> >the Unicode standard [10].
> >
> >Unicode uses 24-bit values to identify characters.
>
> According to the official Unicode web site [0],
>
> The Unicode Standard defines three encoding forms
At 12:20 PM -0500 9/29/01, Colin Paul Adams wrote:
>I have just been reading through the Haskell report to refresh my
>memory of the language. I was surprised to see this:
>
>The character type Char is an enumeration and consists of 16 bit values,
>conforming to
>the Unicode standard [10].
I have just been reading through the Haskell report to refresh my
memory of the language. I was surprised to see this:
The character type Char is an enumeration and consists of 16 bit values, conforming to
the Unicode standard [10].
Unicode uses 24-bit values to identify characters.
--
Colin
Mon, 23 Jul 2001 11:23:30 -0700, Mark P Jones <[EMAIL PROTECTED]> pisze:
> I guess the intention here is that:
>
> symbol -> ascSymbol | uniSymbol<special | _ | : | " | '>
Right.
> In fact, since all the characters in ascSymbol are either
> punctuation or symbols in Unicode, the inc
| 2.2. Identifiers can use small and large Unicode letters ...
If we're picking on the report's handling of Unicode, here's
another minor quibble to add to the list. In describing the
lexical syntax of operator symbols, the report uses:
varsym-> (symbol {symbo
Sat, 26 May 2001 03:17:40 +1000, Fergus Henderson <[EMAIL PROTECTED]> pisze:
> Is there a way to convert a Haskell String into a UTF-16
> encoded byte stream without writing to a file and then
> reading the file back in?
Sure: all conversions are available as memory to memory conversions
for dir
The algorithms for encoding unicode characters into the various
transport formats, UTF16,UTF8,UTF32 are well defined, they can trivially
be implemented in Haskell, for instance
encodeUTF8 :: String -> [Byte]
decodeUTF8 :: [Byte] -> Maybe String
would be easily definable.
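Picking up the two signatures just sketched, a minimal implementation might look like the following (a sketch, not a validating codec: it covers U+0000..U+10FFFF but, for brevity, does not reject overlong forms or surrogate code points, which a production decoder should):

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

type Byte = Word8

-- Each Char becomes 1-4 bytes, per the standard UTF-8 bit layout.
encodeUTF8 :: String -> [Byte]
encodeUTF8 = concatMap enc
  where
    enc c
      | n < 0x80    = [byte n]
      | n < 0x800   = [0xC0 .|. byte (n `shiftR` 6), cont n]
      | n < 0x10000 = [0xE0 .|. byte (n `shiftR` 12), cont (n `shiftR` 6), cont n]
      | otherwise   = [ 0xF0 .|. byte (n `shiftR` 18), cont (n `shiftR` 12)
                      , cont (n `shiftR` 6), cont n ]
      where n = ord c
    byte   = fromIntegral
    cont n = 0x80 .|. byte (n .&. 0x3F)

-- The lead byte tells us how many continuation bytes to expect;
-- Nothing signals a malformed sequence.
decodeUTF8 :: [Byte] -> Maybe String
decodeUTF8 [] = Just []
decodeUTF8 (x : xs)
  | x < 0x80  = (chr (fromIntegral x) :) <$> decodeUTF8 xs
  | x < 0xC0  = Nothing                            -- stray continuation byte
  | x < 0xE0  = multi 1 (fromIntegral x .&. 0x1F) xs
  | x < 0xF0  = multi 2 (fromIntegral x .&. 0x0F) xs
  | x < 0xF8  = multi 3 (fromIntegral x .&. 0x07) xs
  | otherwise = Nothing
  where
    multi :: Int -> Int -> [Byte] -> Maybe String
    multi 0 acc rest
      | acc <= 0x10FFFF = (chr acc :) <$> decodeUTF8 rest
      | otherwise       = Nothing
    multi k acc (y : ys)
      | y .&. 0xC0 == 0x80 =
          multi (k - 1) (acc `shiftL` 6 .|. fromIntegral (y .&. 0x3F)) ys
    multi _ _ _ = Nothing
```

Round-tripping any String through encodeUTF8 then decodeUTF8 yields Just the original.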
BTW, since a char
On 24-May-2001, Marcin 'Qrczak' Kowalczyk <[EMAIL PROTECTED]> wrote:
> Thu, 24 May 2001 14:41:21 -0700, Ashley Yakeley <[EMAIL PROTECTED]> pisze:
>
> >> - Initial Unicode support - the Char type is now 31 bits.
> >
> > It might be appropriate t
Thu, 24 May 2001 14:41:21 -0700, Ashley Yakeley <[EMAIL PROTECTED]> pisze:
>> - Initial Unicode support - the Char type is now 31 bits.
>
> It might be appropriate to have two types for Unicode, a UCS2 type
> (16 bits) and a UCS4 type (31 bits).
>Actually it's 20.087 bits.
At 2001-05-24 05:57, Julian Seward (Intl Vendor) wrote:
> - Initial Unicode support - the Char type is now 31 bits.
It might be appropriate to have two types for Unicode, a UCS2 type (16
bits) and a UCS4 type (31 bits). For instance, something like:
--
newtype UCS2CodePo
Fri, 4 May 2001 15:20:02 +0100, Ian Lynagh <[EMAIL PROTECTED]> pisze:
> Is there a reason why isUpper and isLower include all unicode
> characters of the appropriate class but isDigit is only 0..9?
There are also other weirdnesses, e.g. isSpace is specified to work
only on ISO-8859-1 characters.
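The asymmetry described here is easy to observe from Data.Char itself (a small illustration of the behaviour, not from the original posts):

```haskell
import Data.Char (isDigit, isNumber, isUpper)

-- isUpper covers the full Unicode range, while isDigit selects only
-- the ASCII digits '0'..'9'; digits from other scripts answer to the
-- broader isNumber instead.
asciiOnly :: (Bool, Bool, Bool)
asciiOnly = (isDigit '7', isDigit '٣', isNumber '٣')
            -- '٣' is ARABIC-INDIC DIGIT THREE, U+0663

coversUnicode :: Bool
coversUnicode = isUpper 'Ć'   -- non-ASCII uppercase letters count too
```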
Hi all
Is there a reason why isUpper and isLower include all unicode characters
of the appropriate class but isDigit is only 0..9? Are there any
Haskell unicode libraries around? And is the implementation of unicode
support for GHC being discussed/developed anywhere?
BTW, The comments in the
t it quite right yet.)
I haven't followed all of what is done regarding Unicode in emacs,
but apparently Unicode gets into emacs too (see below). I never
use emacs myself, so I haven't tried the package ref. below in any
way whatsoever. You probably need to install some more or les
Lennart Augustsson wrote:
> It's not hard to find a text editor, use e.g. wily. It's widely available.
But it is hard to use some nonstandard (i.e. neither vi nor emacs)
editor just for one special kind of source code - it means to lose
all the keybindings, highlight settings, 100-lines-of-defi
Brian Boutel writes:
> [...]
>
> If the supply of suitable Ascii symbols seems inadequate, remember
> that Haskell uses Unicode. There is no reason to limit symbols to
> those in the Ascii set.
While we're on the subject, I suggest Unicode as a Hugs/GHC wish list
item.
Marcin 'Qrczak' Kowalczyk wrote:
[snip]
> But when Unicode finally comes... How should Haskell's textfile IO
> work?
I don't think the current standard functions for textfile IO would
have too many problems. You can do hSeek in Haskell, but
"The offset i
two fragments:

a = x + y where x = 1
                y = 1
vs.
a = x ++ y where x = 1
                y = 1

They have very different syntactical meaning.
> > If the supply of suitable Ascii symbols seems inadequate, remember
> > that Haskell uses Unicode. There is no reason to limit
Sat, 09 Oct 1999 13:08:39 +0200, Lennart Augustsson <[EMAIL PROTECTED]> pisze:
> > No, because only the indent before the first non-whitespace character
> > in a line matters. Haskell programs can be typeset even in proportional
> > font as long as indents have correct relationships between their
n a line matters. Haskell programs can be typeset even in proportional
font as long as indents have correct relationships between their
lengths.
> If the supply of suitable Ascii symbols seems inadequate, remember
> that Haskell uses Unicode. There is no reason to limit symbols to
> those in
I think that we are perhaps getting a little off-topic now, but Unicode
will clearly help forward computing, so perhaps it can continue a few more
postings. :-)
At 17:45 +0100 97/11/10, Kent Karlsson [EMAIL PROTECTED] wrote:
> Let me reiterate:
> Unicode is ***NOT*** a glyph encoding!
Let me reiterate:
Unicode is ***NOT*** a glyph encoding!
Unicode is ***NOT*** a glyph encoding!
and never will be. The same character can be displayed as
a variety of glyphs, depending not only of the font/style,
but also, and this is the important
At 12:45 +0100 97/11/10, Kent Karlsson [EMAIL PROTECTED] wrote:
> As everyone (getting) familiar with Unicode should
> know, Unicode is **NOT** a font encoding.
> It is a CHARACTER encoding. The difference
> shows up mostly for 'complex scripts', such as Arabic
> and D
ll programs are not written that way.)
3. (In reply to Hans Aberg (Aberg?))
> The easiest way of thinking of Unicode is perhaps as a font
encoding; a
> font using this encoding would add such things as typeface
family, style,
> size, kerning (but Unicod
Carl R. Witty wrote (to the Haskell mailing list):
> [..]
> The Report could give up and say that column numbers in the
> presence of \uhhhh escapes are explicitly implementation-defined.
> [..]
> [This] sounds pretty bad (effectively prohibiting layout in portable
> prog
I had option 1 in mind when that part of the report was written. We
should clarify this in the next revision.
And thanks for your analysis of the problem!
John
>Carl R. Witty wrote:
>
>> 1) I assume that layout processing occurs after Unicode preprocessing;
>> otherwise, you can't even find the lexemes. If so, are all Unicode
>> characters assumed to be the same width?
The easiest way of thinking of Unicode is perhaps as a font encoding
Carl R. Witty wrote:
> 1) I assume that layout processing occurs after Unicode preprocessing;
> otherwise, you can't even find the lexemes. If so, are all Unicode
> characters assumed to be the same width?
Unicode characters ***cannot in any way*** be considered as being of
the same width.
Kent Karlsson <[EMAIL PROTECTED]> writes:
> Carl R. Witty wrote:
>
> > 1) I assume that layout processing occurs after Unicode preprocessing;
> > otherwise, you can't even find the lexemes. If so, are all Unicode
> > characters assumed to be the same width?
Unicode was added at the last moment, so there are likely to
be some discrepancies.
> 1) I assume that layout processing occurs after Unicode preprocessing;
> otherwise, you can't even find the lexemes. If so, are all Unicode
> characters assumed to be the same width?
I think
I have some questions regarding Haskell 1.4 and Unicode. My source
materials for these questions are "The Haskell 1.4 Report" and the
files
ftp://ftp.unicode.org/Public/2.0-Update/ReadMe-2.0.14.txt
and
ftp://ftp.unicode.org/Public/2.0-Update/UnicodeData-2.0.14.txt
It'
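The UnicodeData file cited above is a plain semicolon-separated table, so pulling out the fields relevant to these questions is a few lines of Haskell (the helper names are my own):

```haskell
-- Lines of UnicodeData-2.0.14.txt are semicolon-separated records;
-- field 0 is the code value in hex, field 1 the character name,
-- field 2 the general category.

-- Split on ';', keeping empty fields (many fields are blank).
splitFields :: String -> [String]
splitFields s = case break (== ';') s of
  (f, [])       -> [f]
  (f, _ : rest) -> f : splitFields rest

-- Extract the code point and general category from one record.
codeAndCategory :: String -> (Int, String)
codeAndCategory line = (read ("0x" ++ fs !! 0), fs !! 2)
  where fs = splitFields line
```

For example, the record for U+0041 yields (65, "Lu").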