At 06:52 AM 6/15/2001 -0400, Bryan C. Warnock wrote:
>On Thursday 14 June 2001 12:01 pm, Dan Sugalski wrote:
> > As I see it, locales specify:
> >
> > * Collating order
> > * Comparison/equality specification
> > * Unicode codepoint interpretation
>
>What do you mean by that?
Unless I'm missing something (Simon? Hong?), Japanese (and potentially every
language that uses Han characters) can interpret a particular character as
either a number or not a number, depending on context. The character for
"one" can also mean "first", at least according to the Kanji dictionary I
have handy. It looks like this can happen in other, non-numeric cases too,
but I'm much less sure of that.

I may be all wet here, though. I'd be OK with that, since not having to deal
with it would make things simpler.
> > * Regex character classes
> > * Regex character identification
> > * Regex zero-width assertion rules
> > * 'casing' rules
> >
> > It'd be nice to specify them all separately and inherit the ones you don't
> > need to change from some parent locale.
>
>Or have these individual bits and pieces be addressable through the regexen,
>and have locales *defined* via that.
>
>module Locale::Hawaiian;
>use re 'class (\w => [aeiouāēīōūhklmnpw`])';
>...
Sure. I expect Damian will write us something that lets you specify them
upside-down in Klingon or something by the time this is done. :)
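
For what it's worth, the "inherit what you don't change from a parent locale"
idea quoted above could be sketched roughly like this in Perl 5. Everything
here is hypothetical and only for illustration: the %locales table, the facet
names (collate, word_re, upcase) and lookup_facet() are not a proposed API,
and the Hawaiian class is kept to ASCII for simplicity.

```perl
use strict;
use warnings;

# Hypothetical locale table: each locale lists only the facets it overrides
# and names a parent to inherit the rest from.
my %locales = (
    root => {
        parent  => undef,
        collate => sub { $_[0] cmp $_[1] },
        word_re => qr/\w/,
        upcase  => sub { uc $_[0] },
    },
    hawaiian => {
        parent  => 'root',
        # Only the word-character class changes; the rest is inherited.
        word_re => qr/[aeiouhklmnpw`]/i,
    },
);

# Walk up the parent chain until some locale defines the requested facet.
sub lookup_facet {
    my ($name, $facet) = @_;
    while (defined $name) {
        my $locale = $locales{$name} or die "unknown locale '$name'";
        return $locale->{$facet} if exists $locale->{$facet};
        $name = $locale->{parent};
    }
    die "no locale in the chain defines '$facet'";
}

my $word_re = lookup_facet('hawaiian', 'word_re');   # overridden
my $upcase  = lookup_facet('hawaiian', 'upcase');    # inherited from root

print "k is a word character\n"     if 'k' =~ $word_re;
print "b is not a word character\n" if 'b' !~ $word_re;
print $upcase->('aloha'), "\n";
```

Each facet is looked up on its own, so a child locale can replace one piece
(here, the word-character class) without restating the others.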
>On a side note (and this *will* sound stupid, but there is a reason I'm
>asking): why is there no logical opposite to '.', that is, a character
>which never matches any character? (Besides, of course, that it's
>utterly useless from a classic regex perspective.)
Ask Larry. When would you want to match nothing? (And how do you upcase a
period anyway? I suppose we could go for one of the Unicode open circle
marks...)
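
For reference, Perl 5 has no single metacharacter for this, but a pattern
that can never match is easy enough to spell. The sketch below is only an
illustration; the variable names and the "empty word list" use case are
assumptions, not anything from the proposal above.

```perl
use strict;
use warnings;

# (?!) is an empty negative lookahead: it asserts that the empty pattern does
# NOT match here, which is never true, so it fails at every position.
my $never = qr/(?!)/;

# A character-class spelling of the same idea: no character is both
# whitespace and non-whitespace, so this class is empty.
my $no_char = qr/[^\s\S]/;

for my $text ('', 'a', "any old string\n") {
    print "never matched\n"   if $text =~ $never;     # not reached
    print "no_char matched\n" if $text =~ $no_char;   # not reached
}

# One conceivable use: a safe placeholder when a generated alternation
# has no alternatives yet.
my @words = ();
my $alt   = @words ? join('|', map { quotemeta } @words) : '(?!)';
print "empty list matches nothing\n" if 'foo' !~ /^(?:$alt)$/;
```
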
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk