There is one open issue I'd like to draw people's attention to: whether to take
a narrow or a broad approach to white space in a pattern environment. The
narrower definition would be:

0009..000D ; Pattern_White_Space # <CHARACTER TABULATION>..<CARRIAGE RETURN (CR)>
0020       ; Pattern_White_Space # SPACE
0085       ; Pattern_White_Space # <NEXT LINE (NEL)>
200E..200F ; Pattern_White_Space # LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK
2028       ; Pattern_White_Space # LINE SEPARATOR
2029       ; Pattern_White_Space # PARAGRAPH SEPARATOR

while the broader one would add:

00A0       ; Pattern_White_Space # NO-BREAK SPACE
2000..200A ; Pattern_White_Space # EN QUAD..HAIR SPACE
202F       ; Pattern_White_Space # NARROW NO-BREAK SPACE
205F       ; Pattern_White_Space # MEDIUM MATHEMATICAL SPACE
3000       ; Pattern_White_Space # IDEOGRAPHIC SPACE

My judgement is that in a pattern environment the narrower definition would be
better. One might even recommend that the other characters be quoted, to reduce
possible confusion when reading regular expressions, queries, or other patterns.
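
To make the difference concrete, here is a minimal sketch in Python. The code
point ranges are copied from the two lists above; the names NARROW, BROAD,
expand and strip_pattern_white_space are made up for illustration and are not
part of the draft. It only shows which characters a pattern parser would skip
under each definition, and it does not attempt to handle quoting.

```python
# Code point ranges taken from the narrower list above.
NARROW_RANGES = [
    (0x0009, 0x000D),  # CHARACTER TABULATION..CARRIAGE RETURN (CR)
    (0x0020, 0x0020),  # SPACE
    (0x0085, 0x0085),  # NEXT LINE (NEL)
    (0x200E, 0x200F),  # LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK
    (0x2028, 0x2028),  # LINE SEPARATOR
    (0x2029, 0x2029),  # PARAGRAPH SEPARATOR
]

# Additional ranges that the broader definition would add.
BROAD_EXTRA_RANGES = [
    (0x00A0, 0x00A0),  # NO-BREAK SPACE
    (0x2000, 0x200A),  # EN QUAD..HAIR SPACE
    (0x202F, 0x202F),  # NARROW NO-BREAK SPACE
    (0x205F, 0x205F),  # MEDIUM MATHEMATICAL SPACE
    (0x3000, 0x3000),  # IDEOGRAPHIC SPACE
]


def expand(ranges):
    """Turn inclusive (first, last) code point pairs into a set of characters."""
    return frozenset(chr(cp) for first, last in ranges for cp in range(first, last + 1))


NARROW = expand(NARROW_RANGES)
BROAD = NARROW | expand(BROAD_EXTRA_RANGES)


def strip_pattern_white_space(pattern, white_space=NARROW):
    """Hypothetical helper: drop every character of `pattern` found in `white_space`.

    Quoting is deliberately not handled here.
    """
    return "".join(ch for ch in pattern if ch not in white_space)
```

With the narrower set, strip_pattern_white_space("a \u00A0b") keeps the
NO-BREAK SPACE as a literal character; passing white_space=BROAD drops it as
well.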

Mark
__________________________________
http://www.macchiato.com
►  “Eppur si muove” ◄

----- Original Message ----- 
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, August 21, 2003 02:44
Subject: RE: Proposed Draft UTR #31 - Syntax Characters


>
> > This notice is relevant to anyone dealing with programming languages,
> > query specifications, regular expressions, scripting languages, and
> > similar domains.
>
> That's me.
>
> I read the draft, and actually I was very happy with it. No complaints at
> all. I am particularly happy that the mathematical letters and numbers
> (1D400-1D7FF) will be permitted in identifiers. This is important because it
> allows mathematical expressions and programming-language expressions to use
> the same symbols (for the first time!). I also noted the comment about how
> specific programming languages could, if they wished, ignore <font>
> equivalences (and hence ignore the mathematical letters and numbers) - so I
> guess that keeps everyone happy.
>
> I would have used the feedback form, but I didn't see much point as I had no
> complaints.
> Jill
>
>
>
> -----Original Message-----
> From: Rick McGowan [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, August 20, 2003 7:23 PM
> To: [EMAIL PROTECTED]
> Subject: Proposed Draft UTR #31 - Syntax Characters
>
>
> This notice is relevant to anyone dealing with programming languages, query
> specifications, regular expressions, scripting languages, and similar
> domains.
>
> The Proposed Draft UTR #31: Identifier and Pattern Syntax will be discussed
> at the UTC meeting next week. Part of that document (Section 4) is a
> proposal for two new immutable properties, Pattern_White_Space and
> Pattern_Syntax. As immutable properties, these would not ever change once
> they are introduced into the standard, so it is important to get feedback
> on their contents beforehand.
>
> The UTC will not be making a final determination on these properties at this
> meeting, but it is important that any feedback on them is supplied as early
> in the process as possible so that it can be considered thoroughly. The
> draft is found at http://www.unicode.org/reports/tr31/ and feedback can be
> submitted as described there.
>
> Regards,
> Rick McGowan
> Unicode, Inc.
>
>

