Hi hackers,

COPY TO with FORMAT json silently accepts the ENCODING option but does not appear to perform any encoding conversion. CopyToJsonOneRow() sends the output of composite_to_json() via CopySendData() without calling pg_server_to_any(), unlike the text and CSV paths.
For example:

    COPY t TO '/tmp/out.json' WITH (FORMAT json, ENCODING 'LATIN1');

On a UTF-8 server this produces UTF-8 output, not LATIN1.

RFC 8259 says JSON text must be UTF-8, so arguably JSON output should never be converted. But even under that interpretation, silently accepting the option and then ignoring it looks wrong: the user explicitly asked for LATIN1 and got something else.

The same issue also affects COPY TO STDOUT when client_encoding differs from the server encoding, since the default file_encoding is the client encoding and CopyToJsonOneRow() never checks need_transcoding.

The attached patch rejects an explicit ENCODING option in JSON mode, consistent with how DELIMITER, NULL, DEFAULT, and HEADER are already rejected. The implicit client_encoding case is a separate design question (should COPY TO JSON always emit UTF-8 regardless of client_encoding?) that we should probably address separately, and not as part of v19.

The bug was introduced by 7dadd38cda9 (json format for COPY TO).

Thoughts?
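
P.S. For anyone who wants to see the practical effect of the mismatch, here is a small standalone sketch in Python. It is not PostgreSQL code and does not touch the server; the sample row and variable names are made up for illustration. It only shows what a consumer that trusts the requested LATIN1 encoding would read when the file actually contains UTF-8 bytes.

```python
# Hypothetical one-row JSON output containing a non-ASCII character.
row = '{"name":"café"}'

# What the server actually writes today (UTF-8), versus what the
# user asked for with ENCODING 'LATIN1'.
utf8_bytes = row.encode("utf-8")      # "é" becomes b"\xc3\xa9"
latin1_bytes = row.encode("latin-1")  # "é" becomes b"\xe9"

# A consumer that believes the file is LATIN1 decodes the UTF-8 bytes
# with the wrong codec and gets mojibake instead of an error.
misread = utf8_bytes.decode("latin-1")

print(utf8_bytes)
print(latin1_bytes)
print(misread)
```

The wrong decode does not fail, because every byte is a valid LATIN1 character, so the corruption is silent on the consumer side as well.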
0001-Reject-ENCODING-option-for-COPY-TO-FORMAT-JSON.patch
