At 01:30 PM 3/22/2001 -0800, Hong Zhang wrote:
> > 6) There will be a glyph boundary/non-glyph boundary pair of regex
> > characters to match the word/non-word boundary ones we already have.
> > (While I'd personally like \g and \G, that won't work as \G is already
> > taken)
> >
> > I also realize that the decomposition flag on regexes would mean that
> > s/A/B/D would turn A ACUTE to B ACUTE, which is meaningless. See the
> > previous paragraph.
>
>I recommend using a 'u' flag, which indicates that all operations are
>performed against Unicode graphemes/glyphs. By default, regex matching is
>performed on codepoints.

U doesn't really signal "glyph" to me, but we are sort of limited in what
we have left. We still need a zero-width assertion for glyph boundary
within regexes themselves.

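
To make the codepoint-versus-glyph distinction concrete, here is a minimal sketch in Python (not Perl, and not a proposed implementation). It assumes a simplified notion of "glyph": one non-combining codepoint plus any combining marks that follow it. Real Unicode grapheme segmentation has more rules. The function names `graphemes`, `replace_codepoint`, and `replace_glyph` are hypothetical, chosen only for illustration.

```python
import unicodedata


def graphemes(text):
    """Split text into clusters: a base codepoint plus trailing combining marks.

    Assumption of this sketch: a "glyph" is one non-combining codepoint
    followed by zero or more combining marks. The boundaries between the
    returned clusters are where a glyph-boundary assertion would match.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch   # attach the mark to the preceding base
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


def replace_codepoint(text, old, new):
    # Codepoint-level substitution on decomposed text: this is the
    # "A ACUTE becomes B ACUTE" behaviour discussed in the thread.
    return unicodedata.normalize("NFD", text).replace(old, new)


def replace_glyph(text, old, new):
    # Glyph-level substitution: only clusters that are exactly `old` change.
    nfd = unicodedata.normalize("NFD", text)
    return "".join(new if g == old else g for g in graphemes(nfd))


s = "A\u00c1"  # plain "A" followed by A WITH ACUTE
by_codepoint = replace_codepoint(s, "A", "B")  # both base letters are replaced
by_glyph = replace_glyph(s, "A", "B")          # only the unaccented "A" is replaced
```

With codepoint semantics the accent ends up on a "B"; with glyph semantics the accented character is left alone, which is the case a glyph-boundary assertion is meant to handle.
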
>We need the character equivalence construct, such as [[=a=]], which
>matches "a", "A ACUTE".

Yeah, we really need a big list of these. PDD, anyone?

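
As a rough illustration of what an equivalence class like [[=a=]] could mean, here is a sketch in Python. It assumes, purely for the example, that two characters are equivalent when they share the same base letter after canonical decomposition, with combining marks and case ignored. That definition, and the names `base_key` and `in_equivalence_class`, are hypothetical; a real list of equivalences would have to be specified separately.

```python
import unicodedata


def base_key(ch):
    """Reduce a character to a comparison key: its base letter, case-folded.

    Assumption of this sketch: equivalence means "same base letter once
    combining marks and case are ignored".
    """
    nfd = unicodedata.normalize("NFD", ch)
    stripped = "".join(c for c in nfd if not unicodedata.combining(c))
    return stripped.casefold()


def in_equivalence_class(ch, representative):
    # True if `ch` belongs to the class named by `representative`,
    # in the way [[=a=]] is meant to match "a" and "A ACUTE".
    return base_key(ch) == base_key(representative)


matches_plain = in_equivalence_class("a", "a")
matches_acute = in_equivalence_class("\u00c1", "a")  # A WITH ACUTE
matches_other = in_equivalence_class("b", "a")
```
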
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk