[ https://issues.apache.org/jira/browse/CSV-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034470#comment-14034470 ]
Sebb commented on CSV-58:
-------------------------
I now think that only meta-characters (and the record separator) should be
unescaped, because only meta-characters need to be escaped on output. All
other escapes should be left as-is and handled separately (probably
by the application).
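
To illustrate the symmetry I have in mind, here is a rough sketch (not the
Commons CSV code). The class name, the META set and the single-character
record separator are assumptions made purely for the example: the writer
escapes only meta-characters, so the parser only needs to reverse those.

```java
final class MetaEscapeRoundTrip {
    // Assumed meta-characters: delimiter, quote, escape, one-char record separator.
    static final String META = ",\"\\\n";
    static final char ESC = '\\';

    /** Writer side: prefix only meta-characters with the escape character. */
    static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (META.indexOf(c) >= 0) {
                out.append(ESC);
            }
            out.append(c);
        }
        return out.toString();
    }

    /** Parser side: drop the escape only when it precedes a meta-character. */
    static String unescape(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean escapesMeta = c == ESC
                    && i + 1 < input.length()
                    && META.indexOf(input.charAt(i + 1)) >= 0;
            if (escapesMeta) {
                out.append(input.charAt(++i)); // emit the meta-character only
            } else {
                out.append(c); // any other escape sequence is left untouched
            }
        }
        return out.toString();
    }
}
```

With this approach an escaped delimiter is unescaped, while a backslash
followed by 'u' and four hex digits reaches the application unchanged.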
However, this may cause problems with multi-character record separators;
this needs further investigation.
Further complications may arise if the record separator (RS) can be
specified as a list of strings.
It may be necessary to restrict the RS to a single string.
> Unescape handling needs rethinking
> ----------------------------------
>
> Key: CSV-58
> URL: https://issues.apache.org/jira/browse/CSV-58
> Project: Commons CSV
> Issue Type: Bug
> Components: Parser
> Reporter: Sebb
> Fix For: Patch Needed, 1.0
>
> Attachments: commons-csv.diff
>
>
> The current escape parsing converts <esc><char> to plain <char> even when
> <char> is not one of the special characters to be escaped.
> This can break Unicode escapes (for example, \u0041 becomes u0041) when the
> <esc> character is a backslash.
> One way around this is to check specifically for <char> == 'u', but it seems
> wrong to do this only for 'u'.
> Another solution would be to leave <esc><char> as is unless <char> is one
> of the special characters.
> There are several possible ways to treat unrecognised escapes:
> - treat it as if the escape char had not been present (current behaviour)
> - leave the escape char as is
> - throw an exception
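
The three options listed in the description could be compared with a small
sketch like the one below. It is only an illustration: the UnknownEscape enum,
the method signature and the "special" parameter are hypothetical names, not
part of the Commons CSV API.

```java
final class EscapePolicyDemo {
    /** Hypothetical policies for an escape followed by a non-special char. */
    enum UnknownEscape { DROP_ESCAPE, KEEP_ESCAPE, THROW }

    static String unescape(String in, char esc, String special, UnknownEscape policy) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c != esc || i + 1 == in.length()) {
                out.append(c); // ordinary char, or a trailing escape kept as is
                continue;
            }
            char next = in.charAt(++i);
            if (special.indexOf(next) >= 0) {
                out.append(next); // recognised escape: emit the special char
                continue;
            }
            switch (policy) {
                case DROP_ESCAPE: // current behaviour: escape silently removed
                    out.append(next);
                    break;
                case KEEP_ESCAPE: // leave the sequence for the application
                    out.append(esc).append(next);
                    break;
                default: // THROW
                    throw new IllegalArgumentException(
                            "Unrecognised escape at index " + (i - 1));
            }
        }
        return out.toString();
    }
}
```

For a backslash followed by u0041, DROP_ESCAPE loses the backslash,
KEEP_ESCAPE preserves the whole sequence, and THROW rejects the input.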