There is one open issue I'd like to draw people's attention to: whether to take a narrower or a broader approach to whitespace in a pattern environment. The narrower definition would be:

0009..000D ; Pattern_White_Space # <CHARACTER TABULATION>..<CARRIAGE RETURN (CR)>
0020       ; Pattern_White_Space # SPACE
0085       ; Pattern_White_Space # <NEXT LINE (NEL)>
200E..200F ; Pattern_White_Space # LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK
2028       ; Pattern_White_Space # LINE SEPARATOR
2029       ; Pattern_White_Space # PARAGRAPH SEPARATOR

while the broader one would add:

00A0       ; Pattern_White_Space # NO-BREAK SPACE
2000..200A ; Pattern_White_Space # EN QUAD..HAIR SPACE
202F       ; Pattern_White_Space # NARROW NO-BREAK SPACE
205F       ; Pattern_White_Space # MEDIUM MATHEMATICAL SPACE
3000       ; Pattern_White_Space # IDEOGRAPHIC SPACE

My judgement is that in a pattern environment the narrower definition would be better. One might go so far as recommending that the others be quoted, to reduce possible confusion when reading regular expressions, queries, or other patterns.

Mark
__________________________________
http://www.macchiato.com
► “Eppur si muove” ◄

----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, August 21, 2003 02:44
Subject: RE: Proposed Draft UTR #31 - Syntax Characters

> > This notice is relevant to anyone dealing with programming languages, query
> > specifications, regular expressions, scripting languages, and similar
> > domains.
>
> That's me.
>
> I read the draft, and actually I was very happy with it. No complaints at
> all. I am particularly happy that the mathematical letters and numbers
> (1D400-1D7FF) will be permitted in identifiers. This is important because it
> allows mathematical expressions and programming-language expressions to use
> the same symbols (for the first time!). I also noted the comment about how
> specific programming languages could, if they wished, ignore <font>
> equivalences (and hence ignore the mathematical letters and numbers) - so I
> guess that keeps everyone happy.
>
> I would have used the feedback form, but I didn't see much point as I had no
> complaints.
> Jill
>
>
>
> -----Original Message-----
> From: Rick McGowan [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, August 20, 2003 7:23 PM
> To: [EMAIL PROTECTED]
> Subject: Proposed Draft UTR #31 - Syntax Characters
>
>
> This notice is relevant to anyone dealing with programming languages, query
> specifications, regular expressions, scripting languages, and similar
> domains.
>
> The Proposed Draft UTR #31: Identifier and Pattern Syntax will be discussed
> at the UTC meeting next week. Part of that document (Section 4) is a proposal
> for two new immutable properties, Pattern_White_Space and Pattern_Syntax. As
> immutable properties, these would not ever change once they are introduced
> into the standard, so it is important to get feedback on their contents
> beforehand.
>
> The UTC will not be making a final determination on these properties at this
> meeting, but it is important that any feedback on them is supplied as early
> in the process as possible so that it can be considered thoroughly. The draft
> is found at http://www.unicode.org/reports/tr31/ and feedback can be
> submitted as described there.
>
> Regards,
> Rick McGowan
> Unicode, Inc.
>
>
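
For anyone who wants to experiment with the two candidate definitions listed at the top of this message, here is a minimal Python sketch. It only transcribes the code point ranges given above; the names NARROW, BROAD, is_pattern_white_space and strip_pattern_white_space are hypothetical and not part of the draft or of any library.

```python
# Sketch only: the two candidate Pattern_White_Space sets from this message.
# All names below are illustrative assumptions, not part of UTR #31.

def expand(ranges):
    """Turn inclusive (first, last) code point pairs into a frozenset."""
    return frozenset(cp for first, last in ranges for cp in range(first, last + 1))

# Narrower definition.
NARROW = expand([
    (0x0009, 0x000D),  # <CHARACTER TABULATION>..<CARRIAGE RETURN (CR)>
    (0x0020, 0x0020),  # SPACE
    (0x0085, 0x0085),  # <NEXT LINE (NEL)>
    (0x200E, 0x200F),  # LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK
    (0x2028, 0x2028),  # LINE SEPARATOR
    (0x2029, 0x2029),  # PARAGRAPH SEPARATOR
])

# Broader definition: the narrow set plus these additions.
BROAD = NARROW | expand([
    (0x00A0, 0x00A0),  # NO-BREAK SPACE
    (0x2000, 0x200A),  # EN QUAD..HAIR SPACE
    (0x202F, 0x202F),  # NARROW NO-BREAK SPACE
    (0x205F, 0x205F),  # MEDIUM MATHEMATICAL SPACE
    (0x3000, 0x3000),  # IDEOGRAPHIC SPACE
])

def is_pattern_white_space(ch, broad=False):
    """True if the single character ch is in the chosen candidate set."""
    return ord(ch) in (BROAD if broad else NARROW)

def strip_pattern_white_space(pattern, broad=False):
    """Drop every character of pattern that is in the chosen candidate set."""
    return "".join(ch for ch in pattern if not is_pattern_white_space(ch, broad))

if __name__ == "__main__":
    sample = "a \u00a0b"  # 'a', SPACE, NO-BREAK SPACE, 'b'
    print(repr(strip_pattern_white_space(sample)))
    print(repr(strip_pattern_white_space(sample, broad=True)))
```

Under the narrower set the NO-BREAK SPACE in the sample survives as a literal pattern character, whereas the broader set would silently discard it; that difference is the source of the possible confusion mentioned above.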