On Mon, May 6, 2024 at 8:43 PM Michael Paquier <mich...@paquier.xyz> wrote:
> On Fri, May 03, 2024 at 07:05:38AM -0700, Jacob Champion wrote:
> > We could port something like that to src/common. IMO that'd be more
> > suited for an actual conversion routine, though, as opposed to a
> > parser that for the most part assumes you didn't lie about the input
> > encoding and is just trying not to crash if you're wrong. Most of the
> > time, the parser just copies bytes between delimiters around and it's
> > up to the caller to handle encodings... the exceptions to that are the
> > \uXXXX escapes and the error handling.
>
> Hmm. That would still leave the backpatch issue at hand, which is
> kind of confusing to leave as it is. Would it be complicated to
> truncate the entire byte sequence in the error message and just give
> up because we cannot do better if the input byte sequence is
> incomplete?

Maybe I've misunderstood, but isn't that what's being done in v2?

> > Maybe I'm missing
> > code somewhere, but I don't see a conversion routine from
> > json_errdetail() to the actual client/locale encoding. (And the parser
> > does not support multibyte input_encodings that contain ASCII in trail
> > bytes.)
>
> Referring to json_lex_string() that does UTF-8 -> ASCII -> give-up in
> its conversion for FRONTEND, I guess? Yep. This limitation looks
> like a problem, especially if plugging that to libpq.

Okay. How we deal with that will likely guide the "optimal" fix to
error reporting, I think...

Thanks,
--Jacob