[
https://issues.apache.org/jira/browse/SANDBOX-308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Colin Goodheart-Smithe updated SANDBOX-308:
-------------------------------------------
Attachment: CSVPrintTest.java
This JUnit test illustrates the bug in this issue.
> The CSVPrinter escaping inconsistent with CSVParser
> ---------------------------------------------------
>
> Key: SANDBOX-308
> URL: https://issues.apache.org/jira/browse/SANDBOX-308
> Project: Commons Sandbox
> Issue Type: Bug
> Components: CSV
> Reporter: Colin Goodheart-Smithe
> Priority: Minor
> Attachments: CSVParser.java, CSVPrintTest.java
>
>
> The CSVPrinter escapes new line and return characters to "\n" and "\r" if
> these occur within the encapsulators (this is done in the
> CSVPrinter.escapeAndQuote(String) method). However, the CSVParser does not
> convert these back to new line and return characters in the same fashion. So
> if you use the CSVPrinter to create a delimited file containing new line or
> return characters within an entry and then read this file using the CSVParser,
> the text read in by the CSVParser will not match the text written by the
> CSVPrinter (the difference being that every new line and return character
> will be replaced by "\n" and "\r" respectively).
> A possible fix for this would be to add two extra 'else if' statements to
> CSVParser.encapsulatedTokenLexer(Token, int) starting at line 49, as detailed
> below (the _emphasised_ text indicates the changes):
> else if (c == '\\' && in.lookAhead() == '\\')
> {
> // doubled escape char, it does not escape itself, only encapsulator
> // -> add both escape chars to stream
> tkn.content.append((char) c);
> c = in.read();
> tkn.content.append((char) c);
> }
> _else if (c == '\\' && in.lookAhead() == 'n')_
> _{_
> _// escaped java new line character, append a new line character_
> _tkn.content.append('\n');_
> _c = in.read();_
> _}_
> _else if (c == '\\' && in.lookAhead() == 'r')_
> _{_
> _// escaped java return character, append a return character_
> _tkn.content.append('\r');_
> _c = in.read();_
> _}_
> else if (strategy.getUnicodeEscapeInterpretation() && c == '\\' && in.lookAhead() == 'u')
> {
> // interpret unicode escaped chars (like \u0070 -> p)
> tkn.content.append((char) unicodeEscapeLexer(c));
> }
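
The round trip described in the issue can be sketched in isolation. The class below is only an illustration and does not use the real CSVPrinter or CSVParser: EscapeRoundTrip, escape and unescape are hypothetical names. escape is assumed to mirror what the issue says CSVPrinter.escapeAndQuote(String) does to new line and return characters, and unescape applies the same three branches as the proposed parser change (a doubled backslash is kept as is, "\n" and "\r" are converted back).

```java
public class EscapeRoundTrip {

    // Hypothetical stand-in for the printer side: a new line or return
    // character becomes the two-character sequence \n or \r.
    public static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\n') {
                out.append("\\n");
            } else if (c == '\r') {
                out.append("\\r");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical stand-in for the parser side, following the branch
    // order of the proposed fix.
    public static String unescape(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            // One character of look-ahead; '\0' marks end of input.
            char next = (i + 1 < value.length()) ? value.charAt(i + 1) : '\0';
            if (c == '\\' && next == '\\') {
                // doubled escape char: keep both characters
                out.append(c).append(next);
                i++;
            } else if (c == '\\' && next == 'n') {
                // escaped new line: append a real new line
                out.append('\n');
                i++;
            } else if (c == '\\' && next == 'r') {
                // escaped return: append a real return character
                out.append('\r');
                i++;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String original = "first line\nsecond line\rend";
        String escaped = escape(original);
        // Without the reverse conversion the text no longer matches.
        System.out.println(escaped.equals(original));           // false
        // With the reverse conversion the round trip is restored.
        System.out.println(unescape(escaped).equals(original)); // true
    }
}
```

One caveat of this sketch: because escape leaves existing backslashes untouched, an entry that already contains a literal backslash followed by 'n' or 'r' would be read back as a new line or return character.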