I'm using the lazy ByteString representation to match against, so it's no surprise that it fails.
On Fri, Mar 20, 2009 at 11:41 AM, Chris Kuklewicz <[email protected]> wrote: > With [Char] and (Seq Char) the text is full unicode. > > With ByteString and ByteString.Lazy you are really using > ByteString.Char8 and ByteString.Lazy.Char8 > > Here is a test (I saved the source file in utf8): > > import Text.Regex.TDFA > text = "☮☯♲☢☣☠☃" > regex = "(☢|☣)" > search :: [[String]] > search = text =~ regex > main = do > print text > print regex > print search > > in ghci this prints: > > *Main> main > main > "\9774\9775\9842\9762\9763\9760\9731" > "(\9762|\9763)" > [["\9762","\9762"],["\9763","\9763"]] > > So this works. Are you using bytestrings to hold unicode as utf-8 or > utf-16 ? yup, utf-8. Thanks for the quick reply! -- JP --~--~---------~--~----~------------~-------~--~----~ Yi development mailing list [email protected] http://groups.google.com/group/yi-devel -~----------~----~----~----~------~----~------~--~---
