On Mon, May 24, 2010 at 2:40 AM, Aaron Sherman <a...@ajs.com> wrote:
> Which I tried to translate as:
>
>        token ucschar {
>            <+[\xA0 .. \xD7FF] + [\xF900 .. \xFDCF] + [\xFDF0 .. \xFFEF] +
>            [\x10000 .. \x1FFFD] + [\x20000 .. \x2FFFD] +
>            [\x30000 .. \x3FFFD] + [\x40000 .. \x4FFFD] +
>            [\x50000 .. \x5FFFD] + [\x60000 .. \x6FFFD] +
>            [\x70000 .. \x7FFFD] + [\x80000 .. \x8FFFD] +
>            [\x90000 .. \x9FFFD] + [\xA0000 .. \xAFFFD] +
>            [\xB0000 .. \xBFFFD] + [\xC0000 .. \xCFFFD] +
>            [\xD0000 .. \xDFFFD] + [\xE1000 .. \xEFFFD]>
>        }
>
> But this refuses to match my test IRI's one-character path:
>
>  http://www.example.com/π


For testing purposes, I rewrote this as:

  token ucschar { <-[\  .. ~]> }

This isn't great, and it is certainly not equivalent to the above (it
accepts anything outside printable ASCII, control characters included),
but it should suffice for my testing.
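
To make the difference concrete, here is a minimal sketch that runs a few
sample characters through both versions. The names ucschar-loose and
ucschar-bmp are hypothetical, and only the three BMP ranges of the original
token are reproduced to keep it short. It assumes an implementation where
ranged character classes behave as specified, which is exactly what seemed
not to hold in my case:

```raku
# Hypothetical names; ucschar-bmp carries only the BMP ranges of the
# original token, not the supplementary-plane ones.
my token ucschar-loose { <-[\x20 .. \x7E]> }
my token ucschar-bmp   { <+[\xA0 .. \xD7FF] + [\xF900 .. \xFDCF] + [\xFDF0 .. \xFFEF]> }

# Sample characters: U+03C0 (pi), U+007F (DEL), U+E000 (private use).
for "\x[3C0]", "\x[7F]", "\x[E000]" -> $c {
    say $c.ord.fmt('U+%04X'),
        ' loose=', so($c ~~ /^ <ucschar-loose> $/),
        ' bmp=',   so($c ~~ /^ <ucschar-bmp> $/);
}
```

I'd expect both tokens to accept U+03C0, and only the loose one to accept
U+007F and U+E000, since neither of those falls inside the listed ranges.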

Also, I need to do some testing to narrow down how it happened, but I
found a case where "$x eq $y" failed but "~$x eq $y" succeeded,
indicating that, contrary to spec, eq isn't coercing to Str.

-- 
Aaron Sherman
Email or GTalk: a...@ajs.com
http://www.ajs.com/~ajs
