> On Jan 28, 2015, at 8:30, Allen Wirfs-Brock <al...@wirfs-brock.com> wrote:
>
> On Jan 28, 2015, at 4:54 AM, Wes Garland <w...@page.ca> wrote:
>
>> Do we extend the regexp syntax to have a symbol which matches an unmatched
>> surrogate?
> we already have it: \u{D83D}

Or to match any unpaired surrogate:

/[\u{D800}-\u{DFFF}]/u

>> How about reserved code points? What happens when they become assigned?
> Other than the initial decoding of valid surrogate pairs into 32-bit code
> points, the ES6 //u RegExp spec. applies no semantics to any code points in
> the string that is being matched.

There are a few places where RegExp applies Unicode semantics:

- //ui uses Unicode case folding to compare case-insensitively. If the
  comparison involves code points that are unassigned in the Unicode version
  assumed by an ECMAScript implementation and in a later version get assigned
  to characters that are case-variants of each other, then the RegExp
  behavior can change. See section 21.2.2.8.2.

- RegExp knows a few character classes: \d, \D, \s, \S, \w, \W. \d, \D, \w,
  \W are defined by character lists that cannot change, but \s and therefore
  \S could change if Unicode assigns new characters with the category
  “Separator, space”. See section 21.2.2.12.

But in general //u is defined based on code points and doesn’t care whether
code points are assigned or reserved.

Norbert
_______________________________________________
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss
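
For reference, a minimal JavaScript sketch of two points from the message above: the unpaired-surrogate pattern and the case folding applied by //ui. The helper name hasLoneSurrogate and the sample strings are illustrative assumptions, not part of the thread or the spec. The KELVIN SIGN (U+212A) is a commonly cited case-folding example, chosen here only to show the kind of comparison section 21.2.2.8.2 describes.

```javascript
// Pattern from the thread: with the u flag the input is matched by code
// point, so a well-formed surrogate pair is never split into its halves.
const LONE_SURROGATE = /[\u{D800}-\u{DFFF}]/u;

// Hypothetical helper: true if str contains an unpaired surrogate.
function hasLoneSurrogate(str) {
  return LONE_SURROGATE.test(str);
}

const paired = '\uD83D\uDE00'; // U+1F600 as a well-formed surrogate pair
const loneHigh = 'a\uD83Db';   // high surrogate not followed by a low one
const loneLow = 'a\uDE00b';    // low surrogate not preceded by a high one

console.log(hasLoneSurrogate(paired));   // false
console.log(hasLoneSurrogate(loneHigh)); // true
console.log(hasLoneSurrogate(loneLow));  // true

// Without the u flag the same range is matched against code units,
// so each half of a valid pair matches on its own.
console.log(/[\uD800-\uDFFF]/.test(paired)); // true

// //ui compares using Unicode case folding: U+212A folds to "k".
// Without u, the comparison does not treat them as equal.
console.log(/\u{212A}/ui.test('k')); // true
console.log(/\u212A/i.test('k'));    // false
```

The results for the character classes and for case folding depend on the Unicode version the engine implements, which is the caveat raised in the two list items above.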