On Monday, 25 August 2014 at 20:35:32 UTC, Sönke Ludwig wrote:
BTW, JSON is *required* to be UTF encoded anyway as per RFC-7159, which is another argument for just letting the lexer assume valid UTF.

The lexer cannot assume valid UTF since the client might be a rogue, but it can just bail out if the lookahead isn't jSON? So UTF-validation is limited to strings.

You have to parse the strings because of the \uXXXX escapes of course, so some basic validation is unavoidable? But I guess full validation of string content could be another useful option along with "ignore escapes" for the case where you want to avoid decode-encode scenarios. (like for a proxy, or if you store pre-escaped unicode in a database)

Reply via email to