Marcin 'Qrczak' Kowalczyk wrote:
> What should the following functions return, assuming they are designed
> now (for Haskell)? What else should be provided? I am putting here
> current definitions:
>
> isControl c = c < ' ' || c >= '\x7F' && c <= '\x9F'
Here I would add: category is one of [Zl,Zp]
because the Line/Paragraph Separators behave like LineFeed.
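A minimal sketch of that addition, assuming GHC's Data.Char (generalCategory and the GeneralCategory constructors) is available. The name isControlProposed is made up for illustration and is not part of any library:

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- Hypothetical name: the quoted isControl, extended with Zl and Zp.
isControlProposed :: Char -> Bool
isControlProposed c =
  c < ' '
    || (c >= '\x7F' && c <= '\x9F')
    -- Zl = LineSeparator (U+2028), Zp = ParagraphSeparator (U+2029)
    || generalCategory c `elem` [LineSeparator, ParagraphSeparator]
```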
> isPrint c = category is other than [Zl,Zp,Cc,Cf,Cs,Co]
I think Cf (Format Control) and Co (Private Use) should be counted as
printable.
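As a sketch, assuming GHC's Data.Char, the quoted definition with Cf and Co removed from the exclusion list would look like this (isPrintProposed is a made-up name):

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- Hypothetical name: everything is printable except Zl, Zp, Cc and Cs.
-- Cf (Format) and Co (PrivateUse) are deliberately NOT excluded here.
isPrintProposed :: Char -> Bool
isPrintProposed c =
  generalCategory c
    `notElem` [LineSeparator, ParagraphSeparator, Control, Surrogate]
```

With this, U+200E LEFT-TO-RIGHT MARK (Cf) and U+E000 (Co) count as printable, while '\n' and U+2028 do not.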
> isSpace c = one of "\t\n\r\f\v" || category is one of [Zs,Zl,Zp]
From that, please exclude those characters of category Zs (Space)
which have "noBreak" mentioned in their UnicodeData line.
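A rough sketch of that exclusion, assuming GHC's Data.Char. Data.Char does not expose the decomposition field of UnicodeData, so the list of no-break spaces is hard-coded here as an assumption (U+00A0, U+2007, U+202F); a real implementation should generate it from the data file. Both names are made up:

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- Assumed list of Zs characters tagged <noBreak> in UnicodeData:
-- NO-BREAK SPACE, FIGURE SPACE, NARROW NO-BREAK SPACE.
noBreakSpaces :: [Char]
noBreakSpaces = ['\x00A0', '\x2007', '\x202F']

-- Hypothetical name: the quoted isSpace minus the no-break spaces.
isSpaceProposed :: Char -> Bool
isSpaceProposed c =
  c `elem` "\t\n\r\f\v"
    || ( generalCategory c `elem` [Space, LineSeparator, ParagraphSeparator]
           && c `notElem` noBreakSpaces
       )
```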
> isPunct c = isGraph c && not (isAlphaNum c)
This is traditional Unix semantics of "punctuation". Unicode has a
more restricted notion of "punctuation" (category P).
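To make the difference concrete, here is a small sketch assuming GHC's Data.Char, which exports isPunctuation (category P) but no isGraph. The helper isGraphLike and the names isPunctUnix and isPunctUnicode are made up for illustration:

```haskell
import Data.Char (isAlphaNum, isPrint, isPunctuation, isSpace)

-- Stand-in for isGraph (an assumption): printable and not whitespace.
isGraphLike :: Char -> Bool
isGraphLike c = isPrint c && not (isSpace c)

-- Traditional Unix notion: any graphic character that is not alphanumeric.
isPunctUnix :: Char -> Bool
isPunctUnix c = isGraphLike c && not (isAlphaNum c)

-- Unicode notion: general category P only.
isPunctUnicode :: Char -> Bool
isPunctUnicode = isPunctuation
```

For example, '+' has category Sm, so it is punctuation in the Unix sense but not in the Unicode sense, whereas '!' (Po) is punctuation in both.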
> isDigit c = c >= '0' && c <= '9'
I'd prefer: category is Nd
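In Haskell terms this could be sketched as follows, assuming GHC's Data.Char (isDigitProposed is a made-up name):

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- Hypothetical name: a digit is any character of category Nd.
isDigitProposed :: Char -> Bool
isDigitProposed c = generalCategory c == DecimalNumber
```

This accepts, for instance, U+0663 ARABIC-INDIC DIGIT THREE, but not U+00B2 SUPERSCRIPT TWO, which has category No.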
> isUpper c = category is one of [Lu,Lt]
> isLower c = category is Ll
The isUpper/isLower categorization should take the toUpper/toLower
mappings into account.
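One possible reading of this, sketched under the assumption that GHC's Data.Char toUpper/toLower follow the simple case mappings of UnicodeData: treat a character as upper case if its category says so or if it has a distinct lower-case mapping, and symmetrically for lower case. The names are made up for illustration:

```haskell
import Data.Char (GeneralCategory (..), generalCategory, toLower, toUpper)

-- Hypothetical: upper case by category, or because toLower changes it.
isUpperByMapping :: Char -> Bool
isUpperByMapping c =
  generalCategory c `elem` [UppercaseLetter, TitlecaseLetter]
    || toLower c /= c

-- Hypothetical: lower case by category, or because toUpper changes it.
isLowerByMapping :: Char -> Bool
isLowerByMapping c =
  generalCategory c == LowercaseLetter
    || toUpper c /= c
```

Under this reading, a character such as U+2160 ROMAN NUMERAL ONE (category Nl, with a lower-case mapping in UnicodeData) should count as upper case even though its category is not Lu or Lt.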
> But perhaps it's enough to have toTitle in addition to toUpper and
> toLower, because what could isTitle be used for?
IMO an 'isTitle' function doesn't make sense. But toTitle is important
(as a function String -> String, not char -> char). For example, in
German, toTitle("ß") = "ss"; you can't do that with a char -> char
mapping.
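A minimal sketch of a String -> String toTitle, assuming GHC's Data.Char (which provides a char -> char toTitle as the fallback). The table type, the function toTitleWith, and germanTable are all made up for illustration; the table holds only the single entry from the example above, and a real one would come from the Unicode data files:

```haskell
import Data.Char (toLower, toTitle)
import Data.Maybe (fromMaybe)

-- Hypothetical table: characters whose title-case form is a whole string.
type SpecialTitle = [(Char, String)]

-- Title-case a single word: the first character goes through the table
-- (falling back to the char -> char toTitle), the rest is lower-cased.
toTitleWith :: SpecialTitle -> String -> String
toTitleWith _ [] = []
toTitleWith special (c : cs) =
  fromMaybe [toTitle c] (lookup c special) ++ map toLower cs

-- Example table with just the entry used in this message (U+00DF is ß).
germanTable :: SpecialTitle
germanTable = [('\xDF', "ss")]
```

Because the result for one input character can be two characters long, the function has to return a String; toTitleWith germanTable "hELLO" still gives "Hello" through the ordinary per-character path.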
Bruno
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/