UnicodeUnescapeReader should not be applied before parsing
----------------------------------------------------------

                 Key: CSV-67
                 URL: https://issues.apache.org/jira/browse/CSV-67
             Project: Commons CSV
          Issue Type: Bug
            Reporter: Sebb


The UnicodeEscapeReader is currently applied before the input file is parsed.

This means that unicode escapes are treated differently from other escapes.

For example, the sequence <esc>r<esc>n is not treated as a new-line for the 
purpose of recognising the end of a record, yet \o000D\u000A is converted to 
CRLF and would terminate the record (unless embedded in a quoted string).

The unicode escape processing (if selected) should occur as part of the 
parsing, just as for ordinary escape processing.

The class can be made public so the user can wrap the input if required; this 
preserves the existing functionality should it be required, so there is no need 
to introduce another setting.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to