At 01:30 PM 3/22/2001 -0800, Hong Zhang wrote:
> > 6) There will be a glyph boundary/non-glyph boundary pair of regex
> > characters to match the word/non-word boundary ones we already have. (While
> > I'd personally like \g and \G, that won't work as \G is already taken)
> >
> > I also realize that the decomposition flag on regexes would mean that
> > s/A/B/D would turn A ACUTE to B ACUTE, which is meaningless. See the
> > previous paragraph.
>
>I recommend using a 'u' flag, which indicates that all operations are
>performed against Unicode graphemes/glyphs. By default, regex matching is
>performed on codepoints.

U doesn't really signal "glyph" to me, but we are sort of limited in what 
we have left. We still need a zero-width assertion for glyph boundary 
within regexes themselves.
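
To make the codepoint-versus-glyph distinction concrete, here is a rough sketch in Python. It is an illustration only, not a proposed implementation. The names glyph_boundaries, glyphs and sub_glyph are made up for this example, and "glyph" is simplified to mean a base codepoint plus any combining marks that follow it, which is much cruder than real grapheme segmentation.

```python
import unicodedata

def glyph_boundaries(text):
    """Return the offsets where a glyph starts, plus len(text).

    Assumption for illustration: a glyph is one base codepoint followed
    by zero or more combining marks (canonical combining class != 0).
    """
    bounds = [i for i, ch in enumerate(text)
              if i == 0 or unicodedata.combining(ch) == 0]
    bounds.append(len(text))
    return bounds

def glyphs(text):
    """Split text into glyphs at the boundaries computed above."""
    b = glyph_boundaries(text)
    return [text[start:end] for start, end in zip(b, b[1:])]

def sub_glyph(text, old, new):
    """Replace whole glyphs equal to `old` (hypothetical glyph-level s///)."""
    return "".join(new if g == old else g for g in glyphs(text))

decomposed = "A\u0301B"  # A, COMBINING ACUTE ACCENT, B

# Codepoint-level substitution rewrites the base letter and leaves the
# accent behind, giving "B ACUTE" followed by B.
codepoint_result = decomposed.replace("A", "B")

# Glyph-level substitution leaves A ACUTE alone, because the glyph
# "A + acute" is not the glyph "A".
glyph_result = sub_glyph(decomposed, "A", "B")
```

Under these assumptions the boundaries of the decomposed string fall at offsets 0, 2 and 3, so a zero-width glyph-boundary assertion would never match between the A and its accent.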

>We need the character equivalence construct, such as [[=a=]], which
>matches "a", "A ACUTE".

Yeah, we really need a big list of these. PDD anyone?
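
As a starting point for what such a list might encode, here is a small Python sketch of one possible equivalence rule: two characters are equivalent if they share a base letter once canonically decomposed and stripped of combining marks. This is an assumption for illustration, not the definition of [[=a=]]. The names base_letter and in_equivalence_class are hypothetical, and whether case should be folded (so that "A ACUTE" matches [[=a=]]) is left as a switch because the thread does not settle it.

```python
import unicodedata

def base_letter(ch):
    """Strip combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if unicodedata.combining(c) == 0)

def in_equivalence_class(ch, representative, ignore_case=False):
    """Hypothetical test for membership in [[=representative=]].

    Assumption: equivalence means "same base letter"; a real list would
    likely need language-specific entries as well.
    """
    a, b = base_letter(ch), base_letter(representative)
    if ignore_case:
        a, b = a.casefold(), b.casefold()
    return a == b

a_acute_lower = "\u00e1"  # LATIN SMALL LETTER A WITH ACUTE
a_acute_upper = "\u00c1"  # LATIN CAPITAL LETTER A WITH ACUTE
```

With this rule, a_acute_lower falls in the class of "a" directly, while a_acute_upper only does when ignore_case is set.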

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
