On 22/08/2003 08:31, Mark Davis wrote:

The purpose of the Pattern Syntax characters is *not* to list everything that is
a symbol or punctuation mark. That exists independently. Think of them as
operators in the engine syntax, as "?" or "*" are used today in Perl, or as
+, -, /, * could be used in math expressions.

The goal is to have a relatively small, unchangeable list of ranges, which
contain a reasonable restriction on characters for future syntax characters in a
general pattern environment. General regular expression engines, for example,
would *not* add 05C3 HEBREW PUNCTUATION SOF PASUQ as an operator, to indicate
(say) a non-greedy match variant of *.

Mark
__________________________________
http://www.macchiato.com
►  “Eppur si muove” ◄



Maybe I misunderstood what Marco was talking about. No, I would not expect a separate SOF PASUQ operator. My point was more that a Hebrew user might accidentally type, or prefer to type, SOF PASUQ instead of a colon, and similarly for other punctuation.

I don't think we should be defining an "unchangeable list" of syntax characters that contains only Latin characters, thus tying computer languages inseparably to the Latin alphabet. That would give the Africans some good reasons to complain that Unicode is too American and/or European. It's the "unchangeable" which makes me very nervous here. For now all computer languages are based on the Latin alphabet, as far as I know, but who knows what will happen in 50-100 years? Computer languages based on Devanagari, Arabic or Japanese scripts would be a real possibility (or Cyrillic, but the punctuation is the same as Latin).

Now it would be a different matter if we could somehow reserve other punctuation characters for further extension. Then we could allow Latin punctuation to be used as operators but require that all other punctuation be quoted.
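
As a rough illustration of that rule, here is a minimal Python sketch. It is only an assumption of how such a check might look: Python's unicodedata module does not expose a Pattern Syntax property, so the general category (P* or S*) stands in for "punctuation", and the function name and the backslash quoting convention are hypothetical.

```python
import unicodedata


def is_punct_or_symbol(ch):
    # Stand-in test: general category starting with P (punctuation)
    # or S (symbol). This is NOT the Pattern Syntax property itself.
    return unicodedata.category(ch)[0] in ("P", "S")


def unquoted_reserved(pattern):
    """Return (index, char) pairs for non-ASCII punctuation that is not
    quoted with a preceding backslash (hypothetical quoting convention)."""
    problems = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            i += 2  # a backslash quotes the next character; skip both
            continue
        if ord(ch) > 0x7F and is_punct_or_symbol(ch):
            problems.append((i, ch))
        i += 1
    return problems
```

Under these assumptions, ASCII punctuation such as ":" passes freely as a possible operator, an unquoted U+05C3 SOF PASUQ is reported as reserved, and the same character preceded by a backslash is accepted as a literal.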

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




