[ 
https://issues.apache.org/jira/browse/CSV-290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17492340#comment-17492340
 ] 

Angus C edited comment on CSV-290 at 2/15/22, 3:52 AM:
-------------------------------------------------------

Basically, the "EOF reached" error always happens when quote-char = escape-char. 
Given the input string ("a"), Lexer.java treats the second (") as an 
escape char, reads the unescaped \r, and then complains about the missing 
ending quote (").
{code:java}
CSVFormat.Builder.create().setEscape('"').build()
        .parse(new StringReader("\"a\"")).getRecords();
{code}
I think setEscape() is meant for escaping special characters such as \r and 
\t, as in Lexer.readEscape(), not the quote-char. The quote-char should always 
be escaped by the quote-char itself, not by the escape-char.

Your fix disables the escape-char inside a quoted string when it is equal to 
the quote-char. That can serve as a fail-safe, but I think we should remove 
.setEscape(DOUBLE_QUOTE_CHAR) from POSTGRESQL_CSV. The javadoc says "special 
characters are escaped with quote", but I doubt that is correct.
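
To make the failure mode and the proposed fail-safe concrete, here is a minimal, self-contained sketch of a quoted-field reader. It is not the Commons CSV Lexer: the class name QuoteEscapeSketch, the method readQuoted, and the flag ignoreEscapeIfSameAsQuote are all hypothetical, and it throws an unchecked exception instead of IOException to keep the example short. It only assumes, as described above, that the escape check runs before the quote check.

```java
class QuoteEscapeSketch {

    /**
     * Reads a single quoted field from the start of the input.
     * Hypothetical, simplified stand-in for a CSV lexer; not the Commons CSV code.
     */
    static String readQuoted(String input, char quote, char escape,
                             boolean ignoreEscapeIfSameAsQuote) {
        if (input.isEmpty() || input.charAt(0) != quote) {
            throw new IllegalArgumentException("field does not start with a quote");
        }
        // The fail-safe: switch the escape char off when it equals the quote char.
        boolean escapeActive = !(ignoreEscapeIfSameAsQuote && escape == quote);
        StringBuilder out = new StringBuilder();
        int i = 1;
        while (i < input.length()) {
            char c = input.charAt(i++);
            if (escapeActive && c == escape) {
                // Escape is checked first, so when escape == quote the closing
                // quote lands here and the next char is taken as escaped data.
                if (i >= input.length()) {
                    throw new IllegalStateException(
                            "EOF reached before encapsulated token finished");
                }
                out.append(input.charAt(i++));
            } else if (c == quote) {
                if (i < input.length() && input.charAt(i) == quote) {
                    out.append(quote); // doubled quote: a literal quote char
                    i++;
                } else {
                    return out.toString(); // closing quote
                }
            } else {
                out.append(c);
            }
        }
        throw new IllegalStateException("EOF reached before encapsulated token finished");
    }

    public static void main(String[] args) {
        // With the fail-safe, "a" parses to a.
        System.out.println(readQuoted("\"a\"", '"', '"', true));
        try {
            // Without it, the closing quote is consumed as an escape char.
            readQuoted("\"a\"", '"', '"', false);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch a doubled quote still yields a literal quote when the fail-safe is on, which matches the point that the quote-char should be escaped by the quote-char itself.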


> Produced CSV using PostgreSQL format cannot be read
> ---------------------------------------------------
>
>                 Key: CSV-290
>                 URL: https://issues.apache.org/jira/browse/CSV-290
>             Project: Commons CSV
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 1.6, 1.9.0
>            Reporter: Anatoliy Artemenko
>            Priority: Major
>
> CSV, produced using printer:
>  
> CSVPrinter printer = new CSVPrinter(sw, 
> CSVFormat.POSTGRESQL_CSV.withFirstRecordAsHeader());
>  
> cannot be read with the same format parser:
>  
> CSVParser parser = new CSVParser(new StringReader(sw.toString()), 
> CSVFormat.POSTGRESQL_CSV.withFirstRecordAsHeader());
>  
> To reproduce: 
>  
> {code:java}
> StringWriter sw = new StringWriter();
> CSVPrinter printer = new CSVPrinter(sw,
>         CSVFormat.POSTGRESQL_CSV.withFirstRecordAsHeader());
> printer.printRecord("column1", "column2");
> printer.printRecord("v11", "v12");
> printer.printRecord("v21", "v22");
> printer.close();
> CSVParser parser = new CSVParser(new StringReader(sw.toString()),
>         CSVFormat.POSTGRESQL_CSV.withFirstRecordAsHeader());
> System.out.println("headers: " + Arrays.equals(
>         parser.getHeaderNames().toArray(), new String[] {"column1", "column2"}));
> Iterator<CSVRecord> i = parser.iterator();
> System.out.println("row: " + Arrays.equals(
>         i.next().toList().toArray(), new String[] {"v11", "v12"}));
> System.out.println("row: " + Arrays.equals(
>         i.next().toList().toArray(), new String[] {"v21", "v22"}));
> {code}
> I'd expect the above code to work, but it fails:
> {code:java}
> java.io.IOException: (startline 1) EOF reached before encapsulated token finished
> at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:371) 
> at org.apache.commons.csv.Lexer.nextToken(Lexer.java:285) 
> at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:701) 
> at org.apache.commons.csv.CSVParser.createHeaders(CSVParser.java:480) 
> at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:432) 
> at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:398) 
> at Test.main(Test.java:25)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)