Re: UTF-8 encode/decode libraries.

2004-05-05 Thread Antti-Juhani Kaijanaho
On 20040426T104946-0700, David Brown wrote:
> Is anyone aware of any Haskell libraries for doing UTF-8 decoding and
> encoding? If not, I'll write something simple.

I wrote a simple Unicode library for my MSc project a couple of years ago. It might not compile with recent GHC, but you can have a

Re: UTF-8 encode/decode

2004-04-27 Thread George Russell
David Brown wrote (snipped):
> What license is your code covered under? As it stands now, it is an
> informative example, but cannot be used by anybody.

As author, I am quite happy for it to be used and modified by other people for non-commercial purposes. As far as I know my employers wouldn't any p

Re: UTF-8 encode/decode

2004-04-27 Thread David Brown
On Tue, Apr 27, 2004 at 10:55:57AM +0200, George Russell wrote:
> I have implemented UTF8-encode/decode. Unlike the code someone has already
> posted it handles all UTF8 sequences, including those longer than 3 bytes.
> It also catches all illegal UTF8 sequences (such as characters encoded
> with

UTF-8 encode/decode

2004-04-27 Thread George Russell
I have implemented UTF8-encode/decode. Unlike the code someone has already posted it handles all UTF8 sequences, including those longer than 3 bytes. It also catches all illegal UTF8 sequences (such as characters encoded with a longer sequence than necessary). Here is the code. --
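For illustration only, and not the code attached to the message above: a minimal sketch of a decoder with the two properties described, namely accepting sequences of up to 4 bytes and rejecting encodings that are longer than necessary. The names `decodeUtf8` and `multi` are hypothetical, and the extra rejection of surrogates and of values above U+10FFFF is an assumption about what "illegal" should cover.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- | Decode UTF-8 bytes, returning Nothing on any malformed input.
-- Hypothetical sketch; not taken from the thread.
decodeUtf8 :: [Word8] -> Maybe String
decodeUtf8 [] = Just []
decodeUtf8 (b : bs)
  | b < 0x80           = (chr (fromIntegral b) :) <$> decodeUtf8 bs
  | b .&. 0xE0 == 0xC0 = multi 1 (b .&. 0x1F) 0x80     -- 2-byte sequence
  | b .&. 0xF0 == 0xE0 = multi 2 (b .&. 0x0F) 0x800    -- 3-byte sequence
  | b .&. 0xF8 == 0xF0 = multi 3 (b .&. 0x07) 0x10000  -- 4-byte sequence
  | otherwise          = Nothing  -- stray continuation or invalid lead byte
  where
    -- n continuation bytes, payload bits of the lead byte, and the smallest
    -- code point that legitimately needs a sequence of this length.
    multi :: Int -> Word8 -> Int -> Maybe String
    multi n lead minValue
      | length conts == n && all isCont conts && inRange =
          (chr code :) <$> decodeUtf8 rest
      | otherwise = Nothing
      where
        (conts, rest) = splitAt n bs
        code = foldl step (fromIntegral lead) conts
        -- code < minValue means the sequence is longer than necessary.
        inRange = code >= minValue && code <= 0x10FFFF && not (isSurrogate code)

    isCont w = w .&. 0xC0 == 0x80
    step acc w = (acc `shiftL` 6) .|. fromIntegral (w .&. 0x3F)
    isSurrogate x = x >= 0xD800 && x <= 0xDFFF
```

With this sketch, `decodeUtf8 [0xC0, 0x80]` yields `Nothing`, because those two bytes are an over-long encoding of U+0000, while `decodeUtf8 [0xF0, 0x9D, 0x84, 0x9E]` yields a one-character string holding U+1D11E.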

Re: UTF-8 encode/decode libraries.

2004-04-26 Thread David Brown
On Mon, Apr 26, 2004 at 08:33:38PM +0200, Sven Panne wrote:
> Duncan Coutts wrote:
> >On Mon, 2004-04-26 at 18:49, David Brown wrote: [...]
> >toUTF :: String -> String
>
> Hmmm, "String -> [Word8]" would be nicer...
>
> >fromUTF :: String -> String
>
> ... and here: "[Word8] -> String" or "[Wor

Re: UTF-8 encode/decode libraries.

2004-04-26 Thread Sven Panne
Duncan Coutts wrote:
> On Mon, 2004-04-26 at 18:49, David Brown wrote: [...]
> toUTF :: String -> String

Hmmm, "String -> [Word8]" would be nicer...

> fromUTF :: String -> String

... and here: "[Word8] -> String" or "[Word8] -> Maybe String". Furthermore, UTF-8 is not restricted to a maximum of 3 bytes
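As a rough illustration of the `String -> [Word8]` signature suggested above, here is a minimal sketch of the encoding direction that also covers the 4-byte case. The names `encodeUtf8` and `encodeChar` are hypothetical; this is not the gtk2hs `toUTF` function, and it makes no attempt to reject surrogate code points.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- | Encode a String as a list of UTF-8 bytes.
-- Hypothetical sketch; not the function quoted in the thread.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeChar

-- | Encode one character using the shortest possible sequence (1 to 4 bytes).
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. lead 6, cont 0]
  | n < 0x10000 = [0xE0 .|. lead 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. lead 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Payload bits carried by the lead byte.
    lead :: Int -> Word8
    lead s = fromIntegral (n `shiftR` s)
    -- A continuation byte: 10xxxxxx carrying six payload bits.
    cont :: Int -> Word8
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)
```

Returning `[Word8]` rather than `String` keeps encoded bytes and decoded characters apart in the types, so an already encoded value cannot be passed to the encoder a second time by accident.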

Re: UTF-8 encode/decode libraries.

2004-04-26 Thread Duncan Coutts
On Mon, 2004-04-26 at 18:49, David Brown wrote:
> Is anyone aware of any Haskell libraries for doing UTF-8 decoding and
> encoding? If not, I'll write something simple.

The gtk2hs library uses the following functions internally. Credit to Axel Simon I believe unless he swiped them from somewhere

UTF-8 encode/decode libraries.

2004-04-26 Thread David Brown
I am writing some utilities to deal with UTF-8 encoded text files (not source). Currently, I'm just reading in the UTF-8 directly, and things work reasonably well: since my parse tokens are ASCII, they are easy to parse. However, the character type seems perfectly happy with larger values for eac