Re: [Chicken-users] Codepoint indices for matched regexps (UTF-8)?

2018-06-15 Thread John Cowan

On Fri, Jun 15, 2018 at 9:44 AM, Henry Hu wrote:

> I tried (use utf8), but it is documented that it doesn't affect irregex and
> it sure enough doesn't. I tried using the 'utf8 option while compiling my
> regex, but it doesn't change the index returned by
> irregex-match-start-index.

Do "(use u

[Chicken-users] Codepoint indices for matched regexps (UTF-8)?

2018-06-15 Thread Henry Hu

Hello world!

I am trying to use unit irregex to match regular expressions in UTF-8 text. Is anyone familiar with a way to ask for the codepoint indices rather than byte indices for the match? For example:

    (irregex-match-start-index (irregex-search (irregex "Č" 'utf8) "čččČččč"))

returns 6 when
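
For readers following the arithmetic: each "č" occupies two bytes in UTF-8, so the three characters before "Č" account for the byte offset of 6, while the codepoint offset is 3. The sketch below is only an illustration of that conversion, written in Python rather than Scheme so it is easy to run on its own. It does not use or model the irregex API, and the function name byte_index_to_codepoint_index is hypothetical. It assumes the input is valid UTF-8 and counts every byte before the offset that is not a continuation byte (bit pattern 10xxxxxx).

```python
def byte_index_to_codepoint_index(text: str, byte_index: int) -> int:
    """Convert a byte offset into UTF-8 encoded text to a codepoint offset.

    Hypothetical helper for illustration; not part of irregex or the utf8 egg.
    """
    data = text.encode("utf-8")
    if not 0 <= byte_index <= len(data):
        raise ValueError("byte index out of range")
    # An offset that points at a continuation byte falls inside a character.
    if byte_index < len(data) and (data[byte_index] & 0xC0) == 0x80:
        raise ValueError("byte index is not on a character boundary")
    # Every codepoint starts with exactly one non-continuation byte,
    # so counting those in the prefix gives the codepoint offset.
    return sum(1 for b in data[:byte_index] if (b & 0xC0) != 0x80)


# Byte offset 6 from the example corresponds to codepoint offset 3.
print(byte_index_to_codepoint_index("čččČččč", 6))
```

The same counting loop carries over to any language that exposes the string as raw bytes; the cost is linear in the byte offset, which may matter when converting many match positions in a long string.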