Hello Commons-Dev team,

I was experimenting with the 'setQuoteMode()' method in the 'CSVFormat'
class.
Suppose I have a list of string arrays to be formatted, as follows:

@Test
public void test_empty_columns() throws IOException {
    CSVFormat csvFormat = CSVFormat.DEFAULT.builder()
            .setHeader()
            .setQuoteMode(QuoteMode.MINIMAL)
            .build();

    CSVPrinter printer = new CSVPrinter(System.out, csvFormat);

    List<String[]> tempStrList = new ArrayList<>();
    tempStrList.add(new String[] { "", "col2", "", "col4" });
    tempStrList.add(new String[] { "col1", "col2", "", "" });

    for (String[] temp1 : tempStrList) {
        printer.printRecord(temp1);
    }
    printer.close();
}

The above prints the following:

"",col2,,col4
col1,col2,,

Note that in line 1, column 1 is printed as empty quotes (""), whereas the
equally empty column 3 on the same line is printed as nothing at all.
The two empty values are treated inconsistently, which makes the output
look odd.

While looking for the reason behind this behavior, I came across the
following comment in the method 'printWithQuotes()' of class
'CSVFormat':

case MINIMAL:
    if (len <= 0) {
        // Always quote an empty token that is the first
        // on the line, as it may be the only thing on the
        // line. If it were not quoted in that case,
        // an empty line has no tokens.
        if (newRecord) {
            quote = true;
        }
    }
    // other operations...

However, I think we can improve this behavior by making the total length of
the values passed to the print method (that is, the number of values in the
record) available to the quoting logic.
Please check the draft PR I have created (it is only meant to convey the
idea before any actual refactoring):
https://github.com/apache/commons-csv/pull/351
If we apply the changes suggested in the above PR, we should be able to
produce more consistent output, which does not print empty quotes for an
empty first column.
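
To make the idea concrete, here is a minimal standalone sketch of the rule I
have in mind. It is not the code from the PR and it does not use any Commons
CSV classes; the class 'EmptyFirstColumnSketch' and the method
'formatRecord()' are hypothetical names used only for illustration. It also
assumes that quoting an empty first value is needed only when that value is
the sole value in the record, and it ignores every other quoting rule
(delimiters, quotes or line breaks inside values).

```java
import java.util.StringJoiner;

public class EmptyFirstColumnSketch {

    // Hypothetical helper, not part of Commons CSV: formats one record and
    // quotes an empty first value only when it is the only value in the
    // record. No other quoting or escaping is performed in this sketch.
    static String formatRecord(String... values) {
        StringJoiner line = new StringJoiner(",");
        // Assumption: the total number of values is known up front.
        boolean onlyValue = (values.length == 1);
        for (int i = 0; i < values.length; i++) {
            String value = values[i];
            boolean newRecord = (i == 0);
            if (value.isEmpty() && newRecord && onlyValue) {
                // Without quotes, the line would be empty and carry no token.
                line.add("\"\"");
            } else {
                line.add(value);
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        System.out.println(formatRecord("", "col2", "", "col4")); // ,col2,,col4
        System.out.println(formatRecord("col1", "col2", "", "")); // col1,col2,,
        System.out.println(formatRecord(""));                     // ""
    }
}
```

With this rule, the first record from my test would be printed as
,col2,,col4 while a record consisting of a single empty value would still be
printed as "" so that it is not mistaken for an empty line.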

I would appreciate your feedback on this.

Thank You.
Buddhi De Silva.
