On Tue, 25 Oct 2022 at 17:29, Alex Herbert <alex.d.herb...@gmail.com> wrote:
>
> On Sun, 23 Oct 2022 at 19:44, Alex Herbert <alex.d.herb...@gmail.com> wrote:
> >
> > Summary:
> >
> > 1. Should CSVParser treat null and blank headers as the same when
> > checking for duplicates, i.e. all are considered an 'empty' name? This
> > is current CSVFormat behaviour.
> > 2. Should CSVFormat respect ignoreHeaderCase when checking for
> > duplicates? This is current CSVParser behaviour.
> > 3. Should blank column names be sanitised to the empty string ""? This
> > is not current behaviour but is the logical behaviour for checking
> > duplicates in CSVFormat.
>
> I have proposed a fix for this in PR #279 [1]. It maintains a flag
> that notes when any type of missing header name has occurred, Thus it
> now throws when a duplicate null is found when using
> DuplicateHeaderMode.DISALLOW.
>
> I marked the PR as a WIP. It should probably have an associated Jira
> ticket to track this change if merged. Or it could be added to CSV-264
> as further details of that fix [2].
>
> I have not updated the documentation for ignoreHeaderCase to address item 2.
>
> The functionality with regard to the header map is unchanged since the
> header map does not store null headers, and any missing headers are
> not modified (i.e. they are not all sanitised to the empty string "").
>
> Alex
>
> [1] https://github.com/apache/commons-csv/pull/279
> [2] https://issues.apache.org/jira/browse/CSV-264
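
As a rough illustration of the behaviour described above, here is a minimal
sketch of a duplicate check that tracks missing names with a single flag and
optionally ignores case. This is not the code in the PR: the class
DuplicateHeaderSketch, its local Mode enum (a stand-in that only mirrors the
DuplicateHeaderMode names) and the validate method are all hypothetical, and
it assumes that null and blank names are both treated as 'missing'.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch; not the Commons CSV implementation. */
final class DuplicateHeaderSketch {

    /** Local stand-in that mirrors the DuplicateHeaderMode names. */
    enum Mode { ALLOW_ALL, ALLOW_EMPTY, DISALLOW }

    /**
     * Throws IllegalArgumentException if the headers contain a duplicate
     * that the given mode does not permit.
     */
    static void validate(String[] headers, Mode mode, boolean ignoreCase) {
        Set<String> seen = new HashSet<>();
        // Set once any missing (null or blank) header name has occurred.
        boolean seenMissing = false;
        for (String header : headers) {
            boolean missing = header == null || header.trim().isEmpty();
            boolean duplicate;
            if (missing) {
                // Assumption: all missing names count as the same 'empty' name.
                duplicate = seenMissing;
                seenMissing = true;
            } else {
                String key = ignoreCase ? header.toLowerCase(Locale.ROOT) : header;
                duplicate = !seen.add(key);
            }
            boolean allowed = mode == Mode.ALLOW_ALL
                    || (missing && mode == Mode.ALLOW_EMPTY);
            if (duplicate && !allowed) {
                throw new IllegalArgumentException(
                        "Duplicate header name: " + (missing ? "<missing>" : header));
            }
        }
    }
}
```

In this sketch, DISALLOW rejects {"A", null, " "} because the null and the
blank name both count as missing, while ALLOW_EMPTY accepts it. The headers
{"A", "a"} are rejected only when ignoreCase is true.
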
PR now updated with:

- Documentation of the parser-specific flag 'ignore header case'
- CSVDuplicateHeaderTest updated with test cases that use
  case-insensitive duplicates

I believe this is all that is required to fix the issues with handling
duplicate header names.

Alex