You can adjust a port to count characters instead of bytes by using `port-count-lines!`.
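
Here is a minimal sketch of the difference, using only a string port rather than your lexer. The helper name `position-after-first-char` and its `count-chars?` flag are made up for this illustration. It reads the λ from "λx" and asks the port for the position of the next item, which is the x your lexer chokes on. I believe the lexer takes its positions from the port in the same way, so calling `port-count-lines!` on the port before handing it to the lexer should have the same effect, but please treat that as an assumption to check.

```racket
#lang racket/base

;; Hypothetical helper for this sketch: consume the first character of
;; `str` and return the position the port reports for the next item.
;; Positions are numbered from 1.
(define (position-after-first-char str count-chars?)
  (define in (open-input-string str))
  ;; Must be called before reading: switches the port from byte-based
  ;; to character-based position tracking (and enables line/column).
  (when count-chars?
    (port-count-lines! in))
  (read-char in)
  (define-values (line col pos) (port-next-location in))
  pos)

;; Byte-based (default): λ occupies two bytes in UTF-8.
(position-after-first-char "λx" #f) ; 3

;; Character-based: λ counts as one position.
(position-after-first-char "λx" #t) ; 2
```

With character counting on, the position of the x is 2, which lines up with `(string-length "λx")` and with string indices after subtracting 1.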

At Wed, 13 Jan 2016 12:23:24 -0800 (PST), Ben Draut wrote:
> I've been tinkering with a lexer/parser for Lambda calculus expressions. I'm
> trying to add a feature to highlight unexpected characters, rather than making
> the user count columns. For example, the lexer will choke on this string,
> because it doesn't have a match for % anywhere:
>
>   (λx.x %)
>
> I'd like to get output like this:
>
>   <Some kind of explanation message>:
>
>   (λx.x %)
>         ^
>
> I've hit a problem though where the srclocs returned from the lexer don't
> correspond exactly to string indices when unicode characters (such as λ) are
> included in the expression.
>
> Take this program for example:
>
>   (with-handlers ([exn:fail:read? (λ (e) (printf "ERROR: ~a~n" e))])
>     ((lexer-src-pos [(eof) 'EOF]) (open-input-string "λ")))
>
> This outputs:
>
>   ERROR: #(struct:exn:fail:read lexer: No match found in input starting with: λ
>   #<continuation-mark-set> (#(struct:srcloc #f #f #f 1 2)))
>
> You can see in the srcloc that the position is 1, (as expected) but the span
> is 2.
>
> So let's imagine I have a lexer that understands λ, but nothing else. When I
> try to lex the expression "λx", the lexer chokes on the x, and the srcloc
> information has position 3 and span 1. However, (string-length "λx") evaluates
> to 2.
>
> I'm struggling to figure out how to solve the problem. I'm guessing that the λ
> is two bytes, and the lexer counts 1 position per byte. So what's the 'right'
> way to handle this? I could just sub1 from a position for each λ that appeared
> in the expression beforehand, but that feels sick and wrong.
>
> Is there a way to tell the lexer to count positions on a
> character-by-character basis, rather than bytes? (Or perhaps I'm
> misunderstanding how that works too.)
>
> Any pointers would be appreciated. Thanks!
>
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.

