On Mon, May 24, 2010 at 2:40 AM, Aaron Sherman <a...@ajs.com> wrote:
> Which I tried to translate as:
>
>     token ucschar {
>         <+[\xA0 .. \xD7FF] + [\xF900 .. \xFDCF] + [\xFDF0 .. \xFFEF] +
>          [\x10000 .. \x1FFFD] + [\x20000 .. \x2FFFD] +
>          [\x30000 .. \x3FFFD] + [\x40000 .. \x4FFFD] +
>          [\x50000 .. \x5FFFD] + [\x60000 .. \x6FFFD] +
>          [\x70000 .. \x7FFFD] + [\x80000 .. \x8FFFD] +
>          [\x90000 .. \x9FFFD] + [\xA0000 .. \xAFFFD] +
>          [\xB0000 .. \xBFFFD] + [\xC0000 .. \xCFFFD] +
>          [\xD0000 .. \xDFFFD] + [\xE1000 .. \xEFFFD]>
>     }
>
> But this refuses to match my test IRI's one-character path:
>
>     http://www.example.com/π

For testing purposes, I rewrote this as:

    token ucschar { <-[\ .. ~]> }

This isn't great, and it is certainly not the same as the above: it
matches any character outside the printable ASCII range from space to
tilde, control characters included. But for my testing it should
suffice.

Also, I need to do some testing to narrow down how it happened, but I
found a case where "$x eq $y" failed but "~$x eq $y" succeeded,
indicating that, contrary to spec, eq isn't coercing its operands to
Str.

-- 
Aaron Sherman
Email or GTalk: a...@ajs.com
http://www.ajs.com/~ajs