[ 
https://issues.apache.org/jira/browse/CSV-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17492428#comment-17492428
 ] 

Angus C edited comment on CSV-288 at 2/15/22, 8:15 AM:
-------------------------------------------------------

In the line below from Lexer.java, isDelimiter() unintentionally advances the 
buffer pointer and "eats" the first "|" when the lexer reaches the "b" in 
"|b|" (lastChar is "|" and the next char is also "|", which together form 
"||"). That is why "a||bc||d" doesn't fail (lastChar is "|", but the next char 
is "c"), and neither does a delimiter made of two different chars (e.g. |!).

The check is meant to detect a trailing empty column (e.g. in "a,b,"):
{code:java}
// did we reach eof during the last iteration already ? EOF
if (isEndOfFile(lastChar) || !isDelimiter(lastChar) && isEndOfFile(c)) {
{code}
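
To illustrate the mechanism, here is a minimal, self-contained sketch. It is a toy tokenizer, not the actual Lexer code: the class DelimiterLookaheadDemo, its split() helper and the recheckLastChar flag are made up for this example. It assumes that isDelimiter() consumes the rest of a multi-character delimiter whenever it matches, which is the behaviour described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy tokenizer; NOT the Commons CSV Lexer. All names here are hypothetical. */
public class DelimiterLookaheadDemo {
    private static final int EOF = -1;
    private static final int UNDEFINED = -2; // nothing read yet

    private final String input;
    private final String delimiter;
    private int pos; // index of the next unread character

    DelimiterLookaheadDemo(String input, String delimiter) {
        this.input = input;
        this.delimiter = delimiter;
    }

    private int read() {
        return pos < input.length() ? input.charAt(pos++) : EOF;
    }

    /** Side effect: on a match, the remaining delimiter characters are consumed. */
    private boolean isDelimiter(int ch) {
        if (ch != delimiter.charAt(0)) {
            return false;
        }
        String rest = delimiter.substring(1);
        if (!input.startsWith(rest, pos)) {
            return false;
        }
        pos += rest.length();
        return true;
    }

    List<String> tokens(boolean recheckLastChar) {
        List<String> out = new ArrayList<>();
        int lastChar = UNDEFINED;
        boolean endedOnDelimiter = false;
        while (true) {
            int c = read();
            // Buggy variant: calling isDelimiter(lastChar) again may consume input.
            // Other variant: reuse the result remembered from the previous token.
            boolean afterDelimiter = recheckLastChar ? isDelimiter(lastChar) : endedOnDelimiter;
            if (c == EOF) {
                if (afterDelimiter) {
                    out.add(""); // trailing empty column, as in "a,b,"
                }
                return out;
            }
            StringBuilder field = new StringBuilder();
            while (c != EOF && !isDelimiter(c)) {
                field.append((char) c);
                c = read();
            }
            out.add(field.toString());
            endedOnDelimiter = c != EOF;
            // The last character read is the final character of the delimiter.
            lastChar = endedOnDelimiter ? delimiter.charAt(delimiter.length() - 1) : EOF;
        }
    }

    static List<String> split(String input, String delimiter, boolean recheckLastChar) {
        return new DelimiterLookaheadDemo(input, delimiter).tokens(recheckLastChar);
    }

    public static void main(String[] args) {
        String row = "a||b||c||d||||f||g";
        System.out.println(String.join(",", split(row, "||", true)));  // a,b|c,d,|f,g
        System.out.println(String.join(",", split(row, "||", false))); // a,b,c,d,,f,g
        System.out.println(String.join(",", split("a||bc||d", "||", true)));  // a,bc,d
        System.out.println(String.join(",", split("a|!b|!c", "|!", true)));   // a,b,c
    }
}
```

Remembering whether the previous token ended on a delimiter, instead of calling the check a second time, is only one way to avoid the side effect in this sketch; the real fix in Lexer.java may look different.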


> String delimiter (||) is not working as expected.
> -------------------------------------------------
>
>                 Key: CSV-288
>                 URL: https://issues.apache.org/jira/browse/CSV-288
>             Project: Commons CSV
>          Issue Type: Bug
>            Reporter: Santhsoh
>            Priority: Major
>
> Steps to reproduce:
> 1. Parse a CSV file that uses || as the delimiter and contains empty columns.
> 2. Print the CSVRecord produced by CSVParser.
>  
> //Expected : a,b,c,d,,f,g 
> // Actual : a,b|c,d,|f,g
> public static void main(String[] args) throws Exception {
>     String row = "a||b||c||d||||f||g";
>     StringBuilder stringBuilder = new StringBuilder();
>     try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL);
>          CSVParser csvParser = CSVParser.parse(new StringInputStream(row),
>                  StandardCharsets.UTF_8,
>                  CSVFormat.Builder.create().setDelimiter("||").build())) {
>         for (CSVRecord csvRecord : csvParser) {
>             for (int i = 0; i < csvRecord.size(); i++) {
>                 csvPrinter.print(csvRecord.get(i));
>             }
>             System.out.println(stringBuilder.toString());
>             // Expected: a,b,c,d,,f,g
>             // Actual:   a,b|c,d,|f,g
>         }
>     }
> }
> With the snippet above, the actual output does not match the expected output.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
