At 2002-02-28 07:18, Simon Peyton-Jones wrote:

>       whitechar  -> newline | vertab | space | tab | uniWhite
>       newline    -> return linefeed | return | linefeed | formfeed
>       return     -> a carriage return
>       linefeed   -> a line feed
>
>This means that CR, LF, or CRLF, are all valid 'newline' separators,
>and the same sequence of characters should therefore work on any
>Haskell implementation.

Good. While you're fiddling with it, I recommend this:

      newline    -> return linefeed | return | linefeed | formfeed
                  | uniLineSep | uniParaSep
      uniLineSep -> any char of General Category Zl
      uniParaSep -> any char of General Category Zp

Unicode defines two codepoints that unambiguously mean 'line separator'
(\u2028) and 'paragraph separator' (\u2029). As it happens, they are the
only codepoints in General Categories Zl and Zp. There are other
paragraph separators (e.g. Georgian and Urdu), but they are actual marks
rather than being whitespace and are not in GC Zp -- much like the pilcrow.

      uniWhite   -> any UNIcode character defined as whitespace

This is fine. But note that whitespace is an 'extended property', it
can't be derived from General Category:

<http://www.unicode.org/Public/3.1-Update1/PropList-3.1.1.html>

-- 
Ashley Yakeley, Seattle WA

_______________________________________________
Haskell mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell
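
For illustration only, here is a minimal sketch of how the recommended
'newline' production might be recognised in Haskell. It assumes the
standard Data.Char module, whose LineSeparator and ParagraphSeparator
constructors correspond to General Categories Zl and Zp. The names
isNewlineChar, stripNewline and splitLines are hypothetical, made up for
this example; they are not taken from any Haskell implementation.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- | True for any character that can start a 'newline' under the
-- recommended rule: return, linefeed, formfeed, or any character of
-- General Category Zl or Zp. (Hypothetical helper.)
isNewlineChar :: Char -> Bool
isNewlineChar c =
  c `elem` "\r\n\f"
    || generalCategory c `elem` [LineSeparator, ParagraphSeparator]

-- | Consume one 'newline' from the front of the input, returning the
-- rest. The CR LF case comes first so the pair counts as one newline.
stripNewline :: String -> Maybe String
stripNewline ('\r' : '\n' : rest) = Just rest
stripNewline (c : rest)
  | isNewlineChar c = Just rest
stripNewline _ = Nothing

-- | Split a string into lines at every 'newline'.
splitLines :: String -> [String]
splitLines s = case break isNewlineChar s of
  (line, "") -> [line]
  (line, rest) -> case stripNewline rest of
    Just rest' -> line : splitLines rest'
    Nothing -> [line] -- not reached: rest begins with a newline character
```

Loaded into an interpreter, splitLines "a\r\nb\rc\nd" gives
["a","b","c","d"], so the same text splits the same way whichever line
ending it uses. The sketch deliberately leaves uniWhite alone: as noted
above, whitespace is not derivable from General Category. For example,
generalCategory '\t' is Control, yet tab is whitespace.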