Hi All, Alex, more below:
On 2022/10/22 21:23:13 Alex Herbert wrote:
> On Sat, 22 Oct 2022 at 20:05, Gary D. Gregory <[email protected]> wrote:
> >
> > Thank you for the new tests Alex!
> >
> > Here is one area that is easy to overlook: As Commons CSV has evolved, _not
> > all settings in CSVFormat_ apply to both writing and parsing. To wit, the
> > existing Javadoc for CSVFormat.getAllowMissingColumnNames(): "Gets whether
> > missing column names are allowed when parsing the header line."
> >
> > I've updated the Javadoc for
> > CSVFormat.Builder.setAllowMissingColumnNames(boolean) to state this is a
> > parser setting.
> >
> > I've also updated the test since the commented-out test data is valid for
> > parsing.
> >
> > In git master now.
>
> OK, thanks for the explanation. If the CSVFormat ignores the
> allowEmptyColumnsNames in the validate() method it should act as if
> the setting is true, i.e. only the DuplicateHeaderMode should be used
> to validate the header.
That's how it is now, allowEmptyColumnsNames cannot be considered in the
validate method because a CSVFormat does not know if it is going to be used for
parsing or writing.
So the test can be updated to stream the full
> set of cases (reinstating those you commented out) and filter to those
> where the flag is set to true, e.g. using:
I already had added a new test data provider method to do that so, all I had to
do now is make sure that when I deleted the commented out test elements, these
were covered in the 2nd data provider called duplicateHeaderParseOnlyData().
>
> ---
>
> static Stream<Arguments> duplicateHeaderAllowsMissingColumnsNamesData() {
> return duplicateHeaderData().filter(arg ->
> Boolean.TRUE.equals(arg.get()[1]));
> }
I did not do this because it seemed (to me) clearer (less magical) to create a
duplicateHeaderParseOnlyData() and list explicitly test cases there. WDYT?
>
> @ParameterizedTest
> @MethodSource(value = {"duplicateHeaderAllowsMissingColumnsNamesData"})
> public void testCSVFormat(final DuplicateHeaderMode duplicateHeaderMode, ...
>
> ---
>
> This finds the following failure:
>
> Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, new String[] {"
> ", " "}, true),
>
> This is due to the difference in the CSVParser and CSVFormat in the
> use of the trim() method to detect an empty header:
>
> CSVParser:
> final boolean emptyHeader = header == null || header.trim().isEmpty();
>
> CSVFormat:
> final boolean empty = header == null || header.isEmpty();
>
> This means that the definition of 'empty' is different for the
> CSVFormat and the CSVParser. Should this be corrected in CSVFormat to
> use trim()?
I refactored the logic in CSVFormat#isBlank(String) where the "blank" concept
vs. "empty" is the same as in Apache Commons Lang where "empty" means size 0
and "blank" means "trims to size 0"; see
org.apache.commons.lang3.StringUtils.isBlank(CharSequence).
>
> If so then I can push the updated test and fix to master. That would
> remove the requirement to comment out any of the test cases and moving
> those to a different test stream method.
Ah, well, let's have you review git master now and feel free to refactor. I
think we are close if not done for another RC. WDYT?
TY!
Gary
>
> Alex
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]