From ongdisheng:

Attention is currently required from: Ian Maxon.
ongdisheng has posted comments on this change by Ian Maxon. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21007?usp=email )

Change subject: [ASTERIXDB-2877][EXT] Fix multi-byte/emoji character corruption in CSV output
......................................................................

Patch Set 2: Code-Review+1

(1 comment)

File asterixdb/asterix-om/src/main/java/org/apache/asterix/dataflow/data/nontagged/printers/PrintTools.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21007/comment/5f5bd699_4bc48b43?usp=email :
PS2, Line 322: char quote

> right now the quote, escape and delimiters are weird. […]

+1 on compile-time checking.

I traced through the code and found that compile-time validation already seems to exist in `WriterValidationUtil.validateCSV()`, in `asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/util/WriterValidationUtil.java`.

Currently, two code paths eventually call `writeUTF8StringAsCSV()`:

1. HTTP SELECT queries with CSV output: these use `CSVPrinterFactoryProvider.INSTANCE`, which is initialized with an empty configuration, so the delimiter, quote and escape always default to ASCII values (`,`, `"`, `"`).

2. COPY TO statements: these allow users to specify custom delimiter, quote and escape values. The compile-time validation happens in `WriterValidationUtil.validateCSV()`, which calls `validateDelimiter()`, `validateQuote()` and `validateEscape()`.

Since only the COPY TO path accepts user-supplied values, perhaps we can move the current runtime ASCII validation logic in `writeUTF8StringAsCSV()` into the compile-time validators `validateDelimiter()`, `validateQuote()` and `validateEscape()`.
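To make the suggestion concrete, here is a rough sketch of the kind of check the compile-time validators could share. It assumes the runtime check amounts to "exactly one 7-bit ASCII character"; I have not confirmed that this matches what `writeUTF8StringAsCSV()` enforces today. The class name `CsvCharValidationSketch`, the method `validateSingleAsciiChar` and the use of `IllegalArgumentException` are all hypothetical and not part of AsterixDB. The real validators would presumably report the failure through the project's own compilation error type.

```java
/**
 * Hypothetical helper illustrating a compile-time ASCII check for CSV
 * delimiter, quote and escape values. Not part of AsterixDB.
 */
final class CsvCharValidationSketch {

    private CsvCharValidationSketch() {
    }

    /**
     * Returns the single ASCII character held in value. Throws if value is
     * null, is not exactly one UTF-16 code unit long, or lies outside the
     * 7-bit ASCII range. Assumption: this mirrors the intended runtime rule.
     */
    static char validateSingleAsciiChar(String paramName, String value) {
        if (value == null || value.length() != 1) {
            throw new IllegalArgumentException(paramName + " must be exactly one character");
        }
        char c = value.charAt(0);
        if (c > 0x7F) {
            throw new IllegalArgumentException(paramName + " must be an ASCII character");
        }
        return c;
    }

    public static void main(String[] args) {
        // An ASCII delimiter passes and is returned unchanged.
        System.out.println(validateSingleAsciiChar("delimiter", ","));
        // A non-ASCII quote is rejected before any data is written.
        try {
            validateSingleAsciiChar("quote", "\u00e9");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this in the validators, a COPY TO statement with a multi-byte quote or delimiter would fail at compile time, and the printer could assume its inputs are already ASCII.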
-- 
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21007?usp=email
To unsubscribe, or for help writing mail filters, visit https://asterix-gerrit.ics.uci.edu/settings?usp=email

Gerrit-MessageType: comment
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: I434142a9b9cd2d1fc941b1e1f350e97403a8a3e1
Gerrit-Change-Number: 21007
Gerrit-PatchSet: 2
Gerrit-Owner: Ian Maxon <[email protected]>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: ongdisheng
Gerrit-CC: Hussain Towaileb <[email protected]>
Gerrit-CC: Murtadha Hubail <[email protected]>
Gerrit-Attention: Ian Maxon <[email protected]>
Gerrit-Comment-Date: Fri, 20 Mar 2026 06:52:51 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: Yes
Comment-In-Reply-To: Ian Maxon <[email protected]>
