[jira] [Commented] (CSV-164) Support duplicate header names

2022-03-10 Thread Gary D. Gregory (Jira)


[ 
https://issues.apache.org/jira/browse/CSV-164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17504609#comment-17504609
 ] 

Gary D. Gregory commented on CSV-164:
-

We now have a duplicate header mode from [CSV-264] in git master for the 
forthcoming 1.10.0. See also our Maven snapshot repository 
https://repository.apache.org/content/repositories/snapshots/.

> Support duplicate header names
> --
>
> Key: CSV-164
> URL: https://issues.apache.org/jira/browse/CSV-164
> Project: Commons CSV
> Issue Type: Bug
> Affects Versions: 1.2
> Reporter: Romain Manni-Bucau
> Priority: Major
>
> Nothing prevents a CSV file from using the same header name more than once, 
> so the validation at the end of org.apache.commons.csv.CSVFormat#validate 
> should likely disappear or should support a flag to disable it.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (CSV-164) Support duplicate header names

2016-06-14 Thread Steffen Zschaler (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15329555#comment-15329555
 ] 

Steffen Zschaler commented on CSV-164:
--

Code is available here: 
https://github.com/szschaler/commons-csv/tree/list_empty_headers 

Let me know if this is of interest for pushing back into Commons CSV and I will 
write some tests and prepare a proper pull request.



[jira] [Commented] (CSV-164) Support duplicate header names

2016-06-14 Thread Steffen Zschaler (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15329481#comment-15329481
 ] 

Steffen Zschaler commented on CSV-164:
--

I have a similar issue, specifically where the duplicate headers are empty 
headers. So, for example:

{code}
A,B,,C,,D
1,2,,3,,4
{code}

The empty columns here have been inserted for readability.

I need to process the file, removing some columns and making some updates 
elsewhere, and then write the modified CSV file out again. Ideally, I would 
keep the empty columns so that readability is maintained. I also need to keep 
the header names from the original file. Finally, I have no _a priori_ 
information about how many columns there are in total (beyond a number of 
standard columns at the left of the file), so I cannot easily predefine an 
artificial header either.

Currently, Commons CSV cannot handle this because it only keeps track of the 
last empty column. For this specific use case, I think there is a solution that 
does not break the API: provide additional functionality to get a list of all 
columns with empty headers if empty headers are allowed (which can already be 
flagged). Optionally, we could also stop putting empty headers into the header 
map, but this may break some users.
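
A minimal sketch of the problem and of the proposed extra list, written in 
plain Java rather than against the Commons CSV sources. The class and method 
names (EmptyHeaderColumns, headerMap, emptyHeaderIndexes) are hypothetical and 
exist only for illustration. A name-to-index map can hold one index per name, 
so for the header row above the empty name ends up pointing at column 4 only, 
while a separate list can report both columns 2 and 4.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Commons CSV. */
final class EmptyHeaderColumns {

    /** Maps each header name to its column index; a repeated name overwrites the earlier index. */
    static Map<String, Integer> headerMap(String[] header) {
        Map<String, Integer> map = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            map.put(header[i], i);
        }
        return map;
    }

    /** Collects the index of every column whose header name is empty. */
    static List<Integer> emptyHeaderIndexes(String[] header) {
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < header.length; i++) {
            if (header[i].isEmpty()) {
                indexes.add(i);
            }
        }
        return indexes;
    }

    public static void main(String[] args) {
        // Header row from the example above; the -1 limit keeps empty fields.
        String[] header = "A,B,,C,,D".split(",", -1);
        System.out.println(headerMap(header));          // only one entry for the empty name
        System.out.println(emptyHeaderIndexes(header)); // every empty column
    }
}
```

Keeping the list separate from the map leaves existing name lookups untouched, 
which is why this route would not need to break the current API.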

I'm going to have a go at implementing this in a commons-csv fork anyway, as I 
need it for my current project. Is there interest in having this contributed 
back to the main code and, if so, should I open a separate issue for it or 
attach it to this issue?



[jira] [Commented] (CSV-164) Support duplicate header names

2015-11-23 Thread Michael Osipov (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021800#comment-15021800
 ] 

Michael Osipov commented on CSV-164:


How is get supposed to work on a map when the column name is not unique?



[jira] [Commented] (CSV-164) Support duplicate header names

2015-11-23 Thread Romain Manni-Bucau (JIRA)

[ 
https://issues.apache.org/jira/browse/CSV-164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021924#comment-15021924
 ] 

Romain Manni-Bucau commented on CSV-164:


This is what I proposed earlier: get("duplicatedColumnName") should fail, but 
get("uniqueColumnName") should still work. This doesn't remove access to the 
parsed values, since you can still access them by index.
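
A minimal sketch of that lookup rule in plain Java, under the assumption that 
each record knows every position of each header name. AmbiguityAwareRecord is 
a hypothetical class for illustration, not the Commons CSV CSVRecord, and the 
choice of exception types is also an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical record wrapper for illustration; not the Commons CSV CSVRecord class. */
final class AmbiguityAwareRecord {
    // Every position at which each header name occurs.
    private final Map<String, List<Integer>> positions = new HashMap<>();
    private final String[] values;

    AmbiguityAwareRecord(String[] header, String[] values) {
        this.values = values;
        for (int i = 0; i < header.length; i++) {
            positions.computeIfAbsent(header[i], k -> new ArrayList<>()).add(i);
        }
    }

    /** Index access always works, whether or not header names repeat. */
    String get(int index) {
        return values[index];
    }

    /** Name access fails for unknown names and for names that occur more than once. */
    String get(String name) {
        List<Integer> indexes = positions.get(name);
        if (indexes == null) {
            throw new IllegalArgumentException("Unknown column: " + name);
        }
        if (indexes.size() > 1) {
            throw new IllegalStateException("Ambiguous column '" + name + "' at indexes " + indexes);
        }
        return values[indexes.get(0)];
    }

    public static void main(String[] args) {
        AmbiguityAwareRecord record = new AmbiguityAwareRecord(
                new String[] {"id", "name", "name"},
                new String[] {"1", "first", "second"});
        System.out.println(record.get("id")); // unique name: resolves normally
        System.out.println(record.get(2));    // index access to a duplicated column
        try {
            record.get("name");               // duplicated name: rejected
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```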

Typically, in the BatchEE mapping we have a thin layer on top of [csv] where 
you can map a CSV record onto an object either by index or by name. If you use 
names then you define header names, but index access is privileged over name 
access, which makes this case pretty smooth: in 
https://github.com/apache/incubator-batchee/blob/master/extensions/commons-csv/src/main/java/org/apache/batchee/csv/mapper/DefaultMapper.java#L89
 the values of fieldByPosition and fieldByName are unique (i.e. fieldByName 
doesn't have any duplicates with fieldByPosition) to guarantee that all of 
this works.

Access by header name is a nice API most of the time, but it is not the way 
CSV really works (columns are identified by index by definition). I like this 
API, but it has some limits, and we hit them with this issue.
