From ongdisheng: Attention is currently required from: Ian Maxon.
ongdisheng has posted comments on this change by Ian Maxon. ( https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21007?usp=email )

Change subject: [ASTERIXDB-2877][EXT] Fix multi-byte/emoji character corruption in CSV output
......................................................................

Patch Set 3: Code-Review+1

(1 comment)

File asterixdb/asterix-om/src/main/java/org/apache/asterix/dataflow/data/nontagged/printers/PrintTools.java:

https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21007/comment/01f4e89d_e553f647?usp=email :
PS2, Line 322: char quote
> Oh, very nice detective work. […]
LGTM, seems like I don't have +2 permissions.

I just noticed something probably worth mentioning. The current compile-time validation in `WriterValidationUtil.unitByteCondition()` uses AND:

```
if (param != null && param.length() > 1 && param.getBytes().length != 1)
```

However, characters with `length()=1` but multiple bytes, like `¢` or `中`, will pass validation.

Example:
- User query: COPY (...) WITH {"delimiter":"¢", "quote":"中"}
- Character `¢`: length()=1, getBytes().length=2
- The condition above evaluates to false, because `param.length() > 1` is false
- No compilation error is thrown, and the query executes with a non-ASCII delimiter

I think changing AND to OR would probably help to enforce ASCII-only:

```
if (param != null && (param.length() > 1 || param.getBytes().length != 1))
```

Feel free to let me know what you think on this :D

--
To view, visit https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/21007?usp=email
To unsubscribe, or for help writing mail filters, visit https://asterix-gerrit.ics.uci.edu/settings?usp=email

Gerrit-MessageType: comment
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Change-Id: I434142a9b9cd2d1fc941b1e1f350e97403a8a3e1
Gerrit-Change-Number: 21007
Gerrit-PatchSet: 3
Gerrit-Owner: Ian Maxon <[email protected]>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: ongdisheng
Gerrit-CC: Hussain Towaileb <[email protected]>
Gerrit-CC: Murtadha Hubail <[email protected]>
Gerrit-Attention: Ian Maxon <[email protected]>
Gerrit-Comment-Date: Mon, 23 Mar 2026 13:58:01 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: Yes
Comment-In-Reply-To: ongdisheng
Comment-In-Reply-To: Ian Maxon <[email protected]>
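For reference, the difference between the AND and OR conditions discussed in the comment can be checked with a small standalone sketch. This is not the code in `WriterValidationUtil`: the class name `UnitByteConditionSketch` and the methods `rejectsWithAnd` and `rejectsWithOr` are hypothetical, and the sketch pins the encoding to UTF-8, whereas the quoted condition calls `getBytes()` with no argument and so depends on the default charset.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical standalone sketch; it only mirrors the two conditions quoted in the review comment.
public class UnitByteConditionSketch {

    // AND form as quoted: returns true when the parameter would be rejected.
    static boolean rejectsWithAnd(String param) {
        return param != null
                && param.length() > 1
                && param.getBytes(StandardCharsets.UTF_8).length != 1; // assumption: UTF-8
    }

    // OR form as proposed: returns true when the parameter would be rejected.
    static boolean rejectsWithOr(String param) {
        return param != null
                && (param.length() > 1
                    || param.getBytes(StandardCharsets.UTF_8).length != 1); // assumption: UTF-8
    }

    public static void main(String[] args) {
        // "\u00A2" is the cent sign, "\u4E2D" is the CJK character from the comment.
        String[] samples = {",", "\u00A2", "\u4E2D", "ab", ""};
        for (String s : samples) {
            System.out.printf("length=%d utf8Bytes=%d rejectedByAnd=%b rejectedByOr=%b%n",
                    s.length(),
                    s.getBytes(StandardCharsets.UTF_8).length,
                    rejectsWithAnd(s),
                    rejectsWithOr(s));
        }
    }
}
```

Under these assumptions the AND form lets the single-character, multi-byte inputs through, while the OR form rejects them. The OR form also rejects an empty string (zero bytes), which may or may not be the intended behavior and is probably worth confirming before changing the condition.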
