digi-scrypt opened a new pull request, #605: URL: https://github.com/apache/commons-csv/pull/605
1. With byte tracking on (setTrackBytes plus a charset), read(char[]) advances lastChar to the last char in the buffer before counting bytes. The per-char helper then reads that field instead of the char that actually precedes the current one, so a surrogate pair is matched against the wrong neighbor. The loop also ran to length instead of offset+length.
2. As a result, a 4-byte character that goes through the char[] path, e.g. a multi-character delimiter containing a supplementary character, throws CharacterCodingException out of nextRecord, and getBytePosition() reports the wrong position.

The fix passes the previous char to the helper explicitly and moves the counting ahead of the lastChar update. A pair split across two buffer reads is covered too, since the first char of the second read still pairs against the saved lastChar.

Added a regression test next to the existing multi-character-delimiter byte-position test.
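For illustration only, here is a minimal sketch of the counting order described in this PR. It is not the commons-csv code: the class `ByteCountSketch`, its method names, and the hard-coded UTF-8 encoder are assumptions made for the example. It shows the previous char being passed to the per-char helper explicitly, the loop bounded by offset + length, and lastChar being updated only after counting.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a reader that tracks encoded byte positions.
class ByteCountSketch {
    private final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
    private int lastChar = -1; // last char seen by the previous read, -1 if none
    private long bytesRead;

    // Encoded length of `current`, given the char that actually precedes it.
    private int encodedLength(int previous, char current) throws CharacterCodingException {
        if (!Character.isSurrogate(current)) {
            return encoder.encode(CharBuffer.wrap(new char[] {current})).limit();
        }
        if (Character.isHighSurrogate(current)) {
            return 0; // counted together with the low surrogate that follows
        }
        if (previous >= 0 && Character.isHighSurrogate((char) previous)) {
            return encoder.encode(CharBuffer.wrap(new char[] {(char) previous, current})).limit();
        }
        throw new CharacterCodingException(); // low surrogate without a preceding high one
    }

    // Counts the bytes for buf[offset, offset + length) and only then updates lastChar.
    void count(char[] buf, int offset, int length) throws CharacterCodingException {
        int previous = lastChar; // pairs a leading low surrogate with the prior read
        for (int i = offset; i < offset + length; i++) {
            bytesRead += encodedLength(previous, buf[i]);
            previous = buf[i];
        }
        if (length > 0) {
            lastChar = buf[offset + length - 1];
        }
    }

    long bytesRead() {
        return bytesRead;
    }

    public static void main(String[] args) throws CharacterCodingException {
        char[] chars = "a\uD83D\uDE00b".toCharArray(); // 'a', U+1F600 as a surrogate pair, 'b'
        ByteCountSketch sketch = new ByteCountSketch();
        sketch.count(chars, 0, 2); // first read ends on the high surrogate
        sketch.count(chars, 2, 2); // second read starts on the low surrogate
        System.out.println(sketch.bytesRead()); // prints 6
    }
}
```

Splitting the input between the two surrogates exercises the case where the pair straddles two buffer reads: the high surrogate contributes 0 bytes in the first call, and the low surrogate is encoded together with the saved lastChar in the second.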
