Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-25 Thread Gary Gregory
On Mon, Oct 24, 2022 at 4:19 PM Alex Herbert  wrote:
>
> On Sun, 23 Oct 2022 at 14:09, Gary D. Gregory  wrote:
> >
> > Ah, well, let's have you review git master now and feel free to refactor. I 
> > think we are close if not done for another RC. WDYT?
>
> Since this thread was for the RC1 vote I started a new thread titled:
>
> [csv] validation of duplicate headers (was [VOTE] Release Apache
> Commons CSV 1.10.0 based on RC1)
>
> I found one bug in the implementation is master which I fixed for
> CSVFormat. There is still some inconsistent behaviour by CSVParser
> that I discuss in the new thread that should be fixed, or current
> behaviour documented, before an RC2.

Thanks Alex for putting in the time and effort, needs careful consideration...

Gary

>
> Alex
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-24 Thread Alex Herbert
On Sun, 23 Oct 2022 at 14:09, Gary D. Gregory  wrote:
>
> Ah, well, let's have you review git master now and feel free to refactor. I 
> think we are close if not done for another RC. WDYT?

Since this thread was for the RC1 vote I started a new thread titled:

[csv] validation of duplicate headers (was [VOTE] Release Apache
Commons CSV 1.10.0 based on RC1)

I found one bug in the implementation is master which I fixed for
CSVFormat. There is still some inconsistent behaviour by CSVParser
that I discuss in the new thread that should be fixed, or current
behaviour documented, before an RC2.

Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-23 Thread Gary D. Gregory
Hi All, Alex, more below:

On 2022/10/22 21:23:13 Alex Herbert wrote:
> On Sat, 22 Oct 2022 at 20:05, Gary D. Gregory  wrote:
> >
> > Thank you for the new tests Alex!
> >
> > Here is one area that is easy to overlook: As Commons CSV has evolved, _not 
> > all settings in CSVFormat_ apply to both writing and parsing. To wit, the 
> > existing Javadoc for CSVFormat.getAllowMissingColumnNames(): "Gets whether 
> > missing column names are allowed when parsing the header line."
> >
> > I've updated the Javadoc for 
> > CSVFormat.Builder.setAllowMissingColumnNames(boolean) to state this is a 
> > parser setting.
> >
> > I've also updated the test since the commented-out test data is valid for 
> > parsing.
> >
> > In git master now.
> 
> OK, thanks for the explanation. If the CSVFormat ignores the
> allowEmptyColumnsNames in the validate() method it should act as if
> the setting is true, i.e. only the DuplicateHeaderMode should be used
> to validate the header. 

That's how it is now, allowEmptyColumnsNames cannot be considered in the 
validate method because a CSVFormat does not know if it is going to be used for 
parsing or writing.

So the test can be updated to stream the full
> set of cases (reinstating those you commented out) and filter to those
> where the flag is set to true, e.g. using:

I already had added a new test data provider method to do that so, all I had to 
do now is make sure that when I deleted the commented out test elements, these 
were covered in the 2nd data provider called duplicateHeaderParseOnlyData().

> 
> ---
> 
> static Stream duplicateHeaderAllowsMissingColumnsNamesData() {
> return duplicateHeaderData().filter(arg ->
> Boolean.TRUE.equals(arg.get()[1]));
> }

I did not do this because it seemed (to me) clearer (less magical) to create a 
duplicateHeaderParseOnlyData() and list explicitly test cases there. WDYT?

> 
> @ParameterizedTest
> @MethodSource(value = {"duplicateHeaderAllowsMissingColumnsNamesData"})
> public void testCSVFormat(final DuplicateHeaderMode duplicateHeaderMode, ...
> 
> ---
> 
> This finds the following failure:
> 
> Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true,  new String[] {"
> ", " "}, true),
> 
> This is due to the difference in the CSVParser and CSVFormat in the
> use of the trim() method to detect an empty header:
> 
> CSVParser:
> final boolean emptyHeader = header == null || header.trim().isEmpty();
> 
> CSVFormat:
> final boolean empty = header == null || header.isEmpty();
> 
> This means that the definition of 'empty' is different for the
> CSVFormat and the CSVParser. Should this be corrected in CSVFormat to
> use trim()?

I refactored the logic in CSVFormat#isBlank(String) where the "blank" concept 
vs. "empty" is the same as in Apache Commons Lang where "empty" means size 0 
and "blank" means "trims to size 0"; see 
org.apache.commons.lang3.StringUtils.isBlank(CharSequence).

> 
> If so then I can push the updated test and fix to master. That would
> remove the requirement to comment out any of the test cases and moving
> those to a different test stream method.

Ah, well, let's have you review git master now and feel free to refactor. I 
think we are close if not done for another RC. WDYT?

TY!
Gary

> 
> Alex
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 
> 

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-22 Thread Alex Herbert
On Sat, 22 Oct 2022 at 20:05, Gary D. Gregory  wrote:
>
> Thank you for the new tests Alex!
>
> Here is one area that is easy to overlook: As Commons CSV has evolved, _not 
> all settings in CSVFormat_ apply to both writing and parsing. To wit, the 
> existing Javadoc for CSVFormat.getAllowMissingColumnNames(): "Gets whether 
> missing column names are allowed when parsing the header line."
>
> I've updated the Javadoc for 
> CSVFormat.Builder.setAllowMissingColumnNames(boolean) to state this is a 
> parser setting.
>
> I've also updated the test since the commented-out test data is valid for 
> parsing.
>
> In git master now.

OK, thanks for the explanation. If the CSVFormat ignores the
allowEmptyColumnsNames in the validate() method it should act as if
the setting is true, i.e. only the DuplicateHeaderMode should be used
to validate the header. So the test can be updated to stream the full
set of cases (reinstating those you commented out) and filter to those
where the flag is set to true, e.g. using:

---

static Stream duplicateHeaderAllowsMissingColumnsNamesData() {
return duplicateHeaderData().filter(arg ->
Boolean.TRUE.equals(arg.get()[1]));
}

@ParameterizedTest
@MethodSource(value = {"duplicateHeaderAllowsMissingColumnsNamesData"})
public void testCSVFormat(final DuplicateHeaderMode duplicateHeaderMode, ...

---

This finds the following failure:

Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true,  new String[] {"
", " "}, true),

This is due to the difference in the CSVParser and CSVFormat in the
use of the trim() method to detect an empty header:

CSVParser:
final boolean emptyHeader = header == null || header.trim().isEmpty();

CSVFormat:
final boolean empty = header == null || header.isEmpty();

This means that the definition of 'empty' is different for the
CSVFormat and the CSVParser. Should this be corrected in CSVFormat to
use trim()?

If so then I can push the updated test and fix to master. That would
remove the requirement to comment out any of the test cases and moving
those to a different test stream method.

Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-22 Thread Gary D. Gregory
Thank you for the new tests Alex!

Here is one area that is easy to overlook: As Commons CSV has evolved, _not all 
settings in CSVFormat_ apply to both writing and parsing. To wit, the existing 
Javadoc for CSVFormat.getAllowMissingColumnNames(): "Gets whether missing 
column names are allowed when parsing the header line."

I've updated the Javadoc for 
CSVFormat.Builder.setAllowMissingColumnNames(boolean) to state this is a parser 
setting.

I've also updated the test since the commented-out test data is valid for 
parsing.

In git master now.

Gary


On 2022/10/22 17:37:15 Alex Herbert wrote:
> On Sat, 22 Oct 2022 at 13:56, Gary D. Gregory  wrote:
> >
> > Thank you for your excellent investigation, Alex, and for finding and 
> > fixing the missing Test annotations. Overall, we have 98% coverage.
> 
> :)
> 
> > Taking stock, I think that this is where we are, we have main 3 issues:
> >
> > 1) Compatibility of duplicate header behavior from version to version
> > 2) Validation bug with empty headers
> > 3) Writing and parsing behavior mismatch
> >
> > One at a time:
> >
> > 1) Compatibility of duplicate header behavior from version to version
> >
> > - We have behavior [A] from 1.0 to 1.6.
> > - We have behavior [B] from 1.7 to 1.9.
> > - We are discussing if 1.10.0 should have behavior [A] or [B]
> >
> > I would vote for keeping behavior B for simplicity and keeping with the 
> > principle of least surprise: Maintaining the current behavior is better 
> > than flip-flopping.
> 
> +1
> 
> > 2) Validation bug with empty headers
> >
> > Fixed in git master commit 289ffa16275c518722c6cda913bfca8a6e1a1411; please 
> > do verify the tests. I've added 3 test methods per Alex's failing examples 
> > below.
> 
> The fix does not address the allowMissingColumnNames flag. This flag
> is not mentioned in the Jira tickets for duplicate column names but it
> is used by the CSVParser and changes behaviour.
> 
> I added a new test class that has all the combinations of
> DuplicateHeaderMode and allowMissingColumnNames with various headers.
> The test has 36 cases and all are passed by the CSVParser. 6 are
> currently not passed by CSVFormat and so have been disabled. Please
> review git master commit 24ffa7b4cf9cc392c4fd6a70363f103d40d1d1a7. Add
> more cases for other headers if required.
> 
> > 3) Writing and parsing behavior mismatch
> >
> > I've updated the documentation in git master as Alex suggested.
> 
> Thanks.
> 
> Alex
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 
> 

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-22 Thread Alex Herbert
On Sat, 22 Oct 2022 at 13:56, Gary D. Gregory  wrote:
>
> Thank you for your excellent investigation, Alex, and for finding and fixing 
> the missing Test annotations. Overall, we have 98% coverage.

:)

> Taking stock, I think that this is where we are, we have main 3 issues:
>
> 1) Compatibility of duplicate header behavior from version to version
> 2) Validation bug with empty headers
> 3) Writing and parsing behavior mismatch
>
> One at a time:
>
> 1) Compatibility of duplicate header behavior from version to version
>
> - We have behavior [A] from 1.0 to 1.6.
> - We have behavior [B] from 1.7 to 1.9.
> - We are discussing if 1.10.0 should have behavior [A] or [B]
>
> I would vote for keeping behavior B for simplicity and keeping with the 
> principle of least surprise: Maintaining the current behavior is better than 
> flip-flopping.

+1

> 2) Validation bug with empty headers
>
> Fixed in git master commit 289ffa16275c518722c6cda913bfca8a6e1a1411; please 
> do verify the tests. I've added 3 test methods per Alex's failing examples 
> below.

The fix does not address the allowMissingColumnNames flag. This flag
is not mentioned in the Jira tickets for duplicate column names but it
is used by the CSVParser and changes behaviour.

I added a new test class that has all the combinations of
DuplicateHeaderMode and allowMissingColumnNames with various headers.
The test has 36 cases and all are passed by the CSVParser. 6 are
currently not passed by CSVFormat and so have been disabled. Please
review git master commit 24ffa7b4cf9cc392c4fd6a70363f103d40d1d1a7. Add
more cases for other headers if required.

> 3) Writing and parsing behavior mismatch
>
> I've updated the documentation in git master as Alex suggested.

Thanks.

Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-22 Thread Gary D. Gregory
n {
> assertThrows(IllegalArgumentException.class,
> () -> CSVParser.parse("a,,c,\n1,2,3,4\nw,x,y,z",
> CSVFormat.DEFAULT.withHeader().withAllowDuplicateHeaderNames(false)));
> }
> 
> @Test
> public void 
> testEmptyColumnsAndDuplicateHeadersNotAllowedWithMissingColumnsAllowed()
> throws Exception {
> try (CSVParser parser = CSVParser.parse("a,,c,\n1,2,3,4\nw,x,y,z",
> 
> CSVFormat.DEFAULT.withHeader().withAllowDuplicateHeaderNames(false).withAllowMissingColumnNames(true)))
> {
> // noop
> }
> }
> 
> So the duplicate headers cannot be considered on its own, it requires
> use of the allow missing column names flag.
> 
> Also, if a CSVFormat has an explicit header then the CSVParser will
> not construct a header Map to allow column names to
> look up column indices.
> 
> I am not sure of the best path to fix this. I would document the
> CSVParser header map as not being created if the header is not read
> from the CSV input (current behaviour). Then maybe change the logic in
> CSVFormat.validate() to respect the ALLOW_ALL, ALLOW_EMPTY, ALLOW_NONE
> duplicates choice, possibly also considering allowEmptyColumnNames
> which defaults to false. Ideally we should get the same exceptions
> when building a CSVFormat with explicitly set headers, and parsing a
> CSV file and getting the header from the file. Currently I do not
> think that is true.
> 
> Alex
> 
> On Fri, 21 Oct 2022 at 15:54, Gary Gregory  wrote:
> >
> > Ok, I don't see any conflict anymore but the PR title does not match the
> > code: "CSVFormat.duplicateHeaderMode requires default DISALLOW"
> >
> > I am ok with the changes as they stand now but we have to resolve if the
> > default should be changed.
> >
> > Gary
> >
> > On Fri, Oct 21, 2022, 09:44  wrote:
> >
> > > Hi Gary,
> > >
> > > the PR did not have conflicts 30 minutes ago. So we must have encountered
> > > a race condition between you updating master and me rebasing and
> > > force-pushing my pr.
> > > I've done that again just now. There are no conflicts.
> > > At the same time I see parts of my PR have dribbled into master already,
> > > so if you rather implement the changes yourself, please go ahead. I am 
> > > fine
> > > with closing my pr.
> > >
> > > regards,
> > > Markus
> > >
> > >
> > > From: Gary Gregory 
> > > Sent: Friday, October 21, 2022 15:17
> > > To: Commons Developers List 
> > > Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
> > >
> > > Hi Markus,
> > >
> > > The PR has conflicts and does not test its changes.
> > >
> > > I'd like feedback from the community on whether or not git master as it is
> > > now is OK for the duplicate header behavior or if changing the default is
> > > needed and if doing so is just ping-pongonging from one incompatibility to
> > > another, based on previous versions.
> > >
> > > TY!
> > >
> > > Gary
> > >
> > > On Fri, Oct 21, 2022, 08:52  wrote:
> > >
> > > > I've removed serialization stuff from the PR and rebased it to master
> > > > https://github.com/apache/commons-csv/pull/276
> > > >
> > > > kind regards,
> > > > Markus
> > > >
> > > > From: Gary D. Gregory 
> > > > Sent: Friday, October 21, 2022 14:18
> > > > To: dev@commons.apache.org 
> > > > Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
> > > >
> > > > On 2022/10/20 22:56:05 Alex Herbert wrote:
> > > > > On Thu, 20 Oct 2022 at 23:43, Alex Herbert 
> > > > wrote:
> > > > > >
> > > > > > I did not have time to track through whether this behaviour changed
> > > > > > after the initial implementation of the flag. I would think not as
> > > the
> > > > > > original behaviour is from 1.0. This would map to:
> > > > > >
> > > > > > true -> ALLOW_ALL
> > > > > > false -> ALLOW_EMPTY
> > > > > > new -> DISALLOW
> > > > > >
> > > > > > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7]
> > > to
> > > > > > change the use of the flag to 'false -> DISALLOW' is not maintaining
> > > > > > behavioural compatibility (to 1.7, or back to 1.0).
> > > 

Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread Alex Herbert
>From my investigation the behaviour from 1.0 to 1.6 was to throw on
duplicates, but ignore empty duplicates (so allowing pad columns in a
CSV).

The allowDuplicates flag was added in 1.7 to allow duplicates. DEFAULT
behaviour was true! So the behavioural compatibility was broken in
release 1.7. Note there is also an allowMissingColumnNames flag. Logic
was:

- Store column headers as we go
- For each new column header, check in the existing headers
- If present:
-- If not empty then throw if allowDuplicateHeaderNames=true
-- If empty then throw if allowMissingColumnNames=false

There is some logic failure here. It would allow the first missing
column name for example; only the second empty missing column name
would trigger an exception.

The current master has updated this logic somewhat but I think the
concept is still the same. There are tests to show that the default
enum value is ALLOW_ALL. This is the same behaviour as 1.7, but not
1.0 - 1.6.

Setting the boolean flag to true -> ALLOW_ALL.
Setting the boolean flag to false -> ALLOW_EMPTY.

So I think the default settings are correct, and the use of the old
flag flips the behaviour correctly. But when investigating I found
another issue.

I looked at the CSVFormat tests and found two fixtures missing the
@Test annotation:

testDuplicateHeaderElementsTrue()
testDuplicateHeaderElementsTrue_Deprecated()

I added these to master.

The tests for duplicate headers use the builder and setHeader("A",
"A"). So there are no tests for empty headers. Trying this throws an
exception in CSVFormat.validate for all of these:

CSVFormat.DEFAULT.builder().setAllowDuplicateHeaderNames(false).setHeader("A",
"", "B", "").build();
CSVFormat.DEFAULT.builder().setDuplicateHeaderMode(DuplicateHeaderMode.ALLOW_EMPTY).setHeader("A",
"", "B", "").build();
CSVFormat.DEFAULT.builder().setAllowDuplicateHeaderNames(false)
.setAllowMissingColumnNames(true).setHeader("A", "", "B", "").build();

So it seems the logic when building a CSVFormat with explicit headers
is different from when parsing a format in CSVParser.createHeaders().
That method has a lot of logic to set up the header map. If a
CSVFormat is created with an explicit header the behaviour is
different with respect to duplicate headers, and
allowMissingColumnNames which is not even considered in
CSVFormat.validate().

This is the current behaviour in master for the parser:

@Test
public void testEmptyColumnsAndDuplicateHeadersNotAllowed() throws Exception {
assertThrows(IllegalArgumentException.class,
() -> CSVParser.parse("a,,c,\n1,2,3,4\nw,x,y,z",
CSVFormat.DEFAULT.withHeader().withAllowDuplicateHeaderNames(false)));
}

@Test
public void 
testEmptyColumnsAndDuplicateHeadersNotAllowedWithMissingColumnsAllowed()
throws Exception {
try (CSVParser parser = CSVParser.parse("a,,c,\n1,2,3,4\nw,x,y,z",

CSVFormat.DEFAULT.withHeader().withAllowDuplicateHeaderNames(false).withAllowMissingColumnNames(true)))
{
// noop
}
}

So the duplicate headers cannot be considered on its own, it requires
use of the allow missing column names flag.

Also, if a CSVFormat has an explicit header then the CSVParser will
not construct a header Map to allow column names to
look up column indices.

I am not sure of the best path to fix this. I would document the
CSVParser header map as not being created if the header is not read
from the CSV input (current behaviour). Then maybe change the logic in
CSVFormat.validate() to respect the ALLOW_ALL, ALLOW_EMPTY, ALLOW_NONE
duplicates choice, possibly also considering allowEmptyColumnNames
which defaults to false. Ideally we should get the same exceptions
when building a CSVFormat with explicitly set headers, and parsing a
CSV file and getting the header from the file. Currently I do not
think that is true.

Alex

On Fri, 21 Oct 2022 at 15:54, Gary Gregory  wrote:
>
> Ok, I don't see any conflict anymore but the PR title does not match the
> code: "CSVFormat.duplicateHeaderMode requires default DISALLOW"
>
> I am ok with the changes as they stand now but we have to resolve if the
> default should be changed.
>
> Gary
>
> On Fri, Oct 21, 2022, 09:44  wrote:
>
> > Hi Gary,
> >
> > the PR did not have conflicts 30 minutes ago. So we must have encountered
> > a race condition between you updating master and me rebasing and
> > force-pushing my pr.
> > I've done that again just now. There are no conflicts.
> > At the same time I see parts of my PR have dribbled into master already,
> > so if you rather implement the changes yourself, please go ahead. I am fine
> > with closing my pr.
> >
> > regards,
> > Markus
> >
> >
> > From: Gary Gregory 
> > Sent:

Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread Gary Gregory
Ok, I don't see any conflict anymore but the PR title does not match the
code: "CSVFormat.duplicateHeaderMode requires default DISALLOW"

I am ok with the changes as they stand now but we have to resolve if the
default should be changed.

Gary

On Fri, Oct 21, 2022, 09:44  wrote:

> Hi Gary,
>
> the PR did not have conflicts 30 minutes ago. So we must have encountered
> a race condition between you updating master and me rebasing and
> force-pushing my pr.
> I've done that again just now. There are no conflicts.
> At the same time I see parts of my PR have dribbled into master already,
> so if you rather implement the changes yourself, please go ahead. I am fine
> with closing my pr.
>
> regards,
> Markus
>
>
> From: Gary Gregory 
> Sent: Friday, October 21, 2022 15:17
> To: Commons Developers List 
> Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
>
> Hi Markus,
>
> The PR has conflicts and does not test its changes.
>
> I'd like feedback from the community on whether or not git master as it is
> now is OK for the duplicate header behavior or if changing the default is
> needed and if doing so is just ping-pongonging from one incompatibility to
> another, based on previous versions.
>
> TY!
>
> Gary
>
> On Fri, Oct 21, 2022, 08:52  wrote:
>
> > I've removed serialization stuff from the PR and rebased it to master
> > https://github.com/apache/commons-csv/pull/276
> >
> > kind regards,
> > Markus
> >
> > From: Gary D. Gregory 
> > Sent: Friday, October 21, 2022 14:18
> > To: dev@commons.apache.org 
> > Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
> >
> > On 2022/10/20 22:56:05 Alex Herbert wrote:
> > > On Thu, 20 Oct 2022 at 23:43, Alex Herbert 
> > wrote:
> > > >
> > > > I did not have time to track through whether this behaviour changed
> > > > after the initial implementation of the flag. I would think not as
> the
> > > > original behaviour is from 1.0. This would map to:
> > > >
> > > > true -> ALLOW_ALL
> > > > false -> ALLOW_EMPTY
> > > > new -> DISALLOW
> > > >
> > > > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7]
> to
> > > > change the use of the flag to 'false -> DISALLOW' is not maintaining
> > > > behavioural compatibility (to 1.7, or back to 1.0).
> > >
> > > PS. I just verified that PR 276 changes the DuplicateHeaderMode value
> > > for allowDuplicates=false and does not change any tests.
> > >
> > > So the test suite is currently not enforcing behavioural
> > > compatibility. This seems like a glaring hole in the tests and should
> > > be addressed to prevent regressions.
> >
> > In git master, there are many tests for senders of
> > withAllowDuplicateHeaderNames(boolean) and
> > setAllowDuplicateHeaderNames(boolean).
> >
> > I filled in the coverage in testGetAllowDuplicateHeaderNames() for
> > getAllowDuplicateHeaderNames() to reflect the _current_ behavior, right
> or
> > wrong.
> >
> > Let's reassess now git master now please. If the PR matters still, it
> > should be rebased on git master.
> >
> > Gary
> >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > For additional commands, e-mail: dev-h...@commons.apache.org
> > >
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>


Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread sman81
Hi Gary,

the PR did not have conflicts 30 minutes ago. So we must have encountered a 
race condition between you updating master and me rebasing and force-pushing my 
pr.
I've done that again just now. There are no conflicts.
At the same time I see parts of my PR have dribbled into master already, so if 
you rather implement the changes yourself, please go ahead. I am fine with 
closing my pr.

regards,
Markus


From: Gary Gregory 
Sent: Friday, October 21, 2022 15:17
To: Commons Developers List 
Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1 
 
Hi Markus,

The PR has conflicts and does not test its changes.

I'd like feedback from the community on whether or not git master as it is
now is OK for the duplicate header behavior or if changing the default is
needed and if doing so is just ping-pongonging from one incompatibility to
another, based on previous versions.

TY!

Gary

On Fri, Oct 21, 2022, 08:52  wrote:

> I've removed serialization stuff from the PR and rebased it to master
> https://github.com/apache/commons-csv/pull/276
>
> kind regards,
> Markus
>
> From: Gary D. Gregory 
> Sent: Friday, October 21, 2022 14:18
> To: dev@commons.apache.org 
> Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
>
> On 2022/10/20 22:56:05 Alex Herbert wrote:
> > On Thu, 20 Oct 2022 at 23:43, Alex Herbert 
> wrote:
> > >
> > > I did not have time to track through whether this behaviour changed
> > > after the initial implementation of the flag. I would think not as the
> > > original behaviour is from 1.0. This would map to:
> > >
> > > true -> ALLOW_ALL
> > > false -> ALLOW_EMPTY
> > > new -> DISALLOW
> > >
> > > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to
> > > change the use of the flag to 'false -> DISALLOW' is not maintaining
> > > behavioural compatibility (to 1.7, or back to 1.0).
> >
> > PS. I just verified that PR 276 changes the DuplicateHeaderMode value
> > for allowDuplicates=false and does not change any tests.
> >
> > So the test suite is currently not enforcing behavioural
> > compatibility. This seems like a glaring hole in the tests and should
> > be addressed to prevent regressions.
>
> In git master, there are many tests for senders of
> withAllowDuplicateHeaderNames(boolean) and
> setAllowDuplicateHeaderNames(boolean).
>
> I filled in the coverage in testGetAllowDuplicateHeaderNames() for
> getAllowDuplicateHeaderNames() to reflect the _current_ behavior, right or
> wrong.
>
> Let's reassess now git master now please. If the PR matters still, it
> should be rebased on git master.
>
> Gary
>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>
-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread Gary Gregory
Hi Markus,

The PR has conflicts and does not test its changes.

I'd like feedback from the community on whether or not git master as it is
now is OK for the duplicate header behavior or if changing the default is
needed and if doing so is just ping-pongonging from one incompatibility to
another, based on previous versions.

TY!

Gary

On Fri, Oct 21, 2022, 08:52  wrote:

> I've removed serialization stuff from the PR and rebased it to master
> https://github.com/apache/commons-csv/pull/276
>
> kind regards,
> Markus
>
> From: Gary D. Gregory 
> Sent: Friday, October 21, 2022 14:18
> To: dev@commons.apache.org 
> Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
>
> On 2022/10/20 22:56:05 Alex Herbert wrote:
> > On Thu, 20 Oct 2022 at 23:43, Alex Herbert 
> wrote:
> > >
> > > I did not have time to track through whether this behaviour changed
> > > after the initial implementation of the flag. I would think not as the
> > > original behaviour is from 1.0. This would map to:
> > >
> > > true -> ALLOW_ALL
> > > false -> ALLOW_EMPTY
> > > new -> DISALLOW
> > >
> > > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to
> > > change the use of the flag to 'false -> DISALLOW' is not maintaining
> > > behavioural compatibility (to 1.7, or back to 1.0).
> >
> > PS. I just verified that PR 276 changes the DuplicateHeaderMode value
> > for allowDuplicates=false and does not change any tests.
> >
> > So the test suite is currently not enforcing behavioural
> > compatibility. This seems like a glaring hole in the tests and should
> > be addressed to prevent regressions.
>
> In git master, there are many tests for senders of
> withAllowDuplicateHeaderNames(boolean) and
> setAllowDuplicateHeaderNames(boolean).
>
> I filled in the coverage in testGetAllowDuplicateHeaderNames() for
> getAllowDuplicateHeaderNames() to reflect the _current_ behavior, right or
> wrong.
>
> Let's reassess now git master now please. If the PR matters still, it
> should be rebased on git master.
>
> Gary
>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>


Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread sman81
I've removed serialization stuff from the PR and rebased it to master
https://github.com/apache/commons-csv/pull/276

kind regards,
Markus

From: Gary D. Gregory 
Sent: Friday, October 21, 2022 14:18
To: dev@commons.apache.org 
Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1 
 
On 2022/10/20 22:56:05 Alex Herbert wrote:
> On Thu, 20 Oct 2022 at 23:43, Alex Herbert  wrote:
> >
> > I did not have time to track through whether this behaviour changed
> > after the initial implementation of the flag. I would think not as the
> > original behaviour is from 1.0. This would map to:
> >
> > true -> ALLOW_ALL
> > false -> ALLOW_EMPTY
> > new -> DISALLOW
> >
> > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to
> > change the use of the flag to 'false -> DISALLOW' is not maintaining
> > behavioural compatibility (to 1.7, or back to 1.0).
> 
> PS. I just verified that PR 276 changes the DuplicateHeaderMode value
> for allowDuplicates=false and does not change any tests.
> 
> So the test suite is currently not enforcing behavioural
> compatibility. This seems like a glaring hole in the tests and should
> be addressed to prevent regressions.

In git master, there are many tests for senders of 
withAllowDuplicateHeaderNames(boolean) and 
setAllowDuplicateHeaderNames(boolean). 

I filled in the coverage in testGetAllowDuplicateHeaderNames() for 
getAllowDuplicateHeaderNames() to reflect the _current_ behavior, right or 
wrong. 

Let's reassess now git master now please. If the PR matters still, it should be 
rebased on git master.

Gary

> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 
> 

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread Gary D. Gregory
On 2022/10/20 22:56:05 Alex Herbert wrote:
> On Thu, 20 Oct 2022 at 23:43, Alex Herbert  wrote:
> >
> > I did not have time to track through whether this behaviour changed
> > after the initial implementation of the flag. I would think not as the
> > original behaviour is from 1.0. This would map to:
> >
> > true -> ALLOW_ALL
> > false -> ALLOW_EMPTY
> > new -> DISALLOW
> >
> > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to
> > change the use of the flag to 'false -> DISALLOW' is not maintaining
> > behavioural compatibility (to 1.7, or back to 1.0).
> 
> PS. I just verified that PR 276 changes the DuplicateHeaderMode value
> for allowDuplicates=false and does not change any tests.
> 
> So the test suite is currently not enforcing behavioural
> compatibility. This seems like a glaring hole in the tests and should
> be addressed to prevent regressions.

In git master, there are many tests for senders of 
withAllowDuplicateHeaderNames(boolean) and 
setAllowDuplicateHeaderNames(boolean). 

I filled in the coverage in testGetAllowDuplicateHeaderNames() for 
getAllowDuplicateHeaderNames() to reflect the _current_ behavior, right or 
wrong. 

Let's reassess now git master now please. If the PR matters still, it should be 
rebased on git master.

Gary

> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 
> 

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread Gary D. Gregory
I updated git master with Javadoc to signal that Serialization in CSVFormat is 
not supported from one version to the next. I also bumped the serial version ID 
from 1 to 2. All of this is noted in changes.xml.

Gary

On 2022/10/21 11:27:43 sebb wrote:
> On Fri, 21 Oct 2022 at 11:57,  wrote:
> >
> > > Would't it be simpler to deal with the serialization issue by bumping the 
> > > serialVersionID?
> > simpler yes, but it's a different thing
> > The PR makes the serialized forms for commons-csv versions 1.9.0 and 1.10.0 
> > compatible.
> >
> > Given that serialization has been broken for several versions in 
> > commons-csv and given that "fixing" it now is a labor of love which does 
> > not add much value,
> > I am in favor of throwing it out altogether now rather than in some future 
> > (major) version. Clearly inform about it in the release notes and be done 
> > with it.
> 
> +1
> 
> > > Also note the PR will throw an NPE in the builder
> > > when instead of using the validate() method.
> >
> > setDuplicateHeaderMode(null) is illegal and should be fail-fast IMO
> >
> > kind regards,
> > Markus
> >
> > From: Gary Gregory 
> > Sent: Thursday, October 20, 2022 16:43
> > To: Commons Developers List 
> > Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
> >
> > Would't it be simpler to deal with the serialization issue by bumping the
> > serialVersionID? We can just say that you only serialized and deserialize
> > for the same version. Also note the PR will throw an NPE in the builder
> > when instead of using the validate() method.
> >
> > Gary
> >
> > On Wed, Oct 19, 2022, 18:27 Gary D. Gregory  wrote:
> >
> > > I've commented on the PR.
> > > TY.
> > > Gary
> > >
> > > On 2022/10/19 16:51:57 Gary Gregory wrote:
> > > > On Wed, Oct 19, 2022 at 10:01 AM Alex Herbert 
> > > wrote:
> > > > >
> > > > > On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory 
> > > wrote:
> > > > > >
> > > > > > My +1
> > > > > >
> > > > > > Gary
> > > > >
> > > > > Gary,
> > > > >
> > > > > PR #276 highlights a behavioural compatibility error in the 1.10.0 
> > > > > RC1.
> > > > >
> > > > > AllowDuplicates enum may be set to the incorrect value when setting
> > > > > the allow duplicates boolean. Have you reviewed this? I believe it is
> > > > > valid.
> > > >
> > > > I will re-read later tonight...
> > > >
> > > > Gary
> > > >
> > > > >
> > > > > Alex
> > > > >
> > > > > -
> > > > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > > > For additional commands, e-mail: dev-h...@commons.apache.org
> > > > >
> > > >
> > > > -
> > > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > > For additional commands, e-mail: dev-h...@commons.apache.org
> > > >
> > > >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > For additional commands, e-mail: dev-h...@commons.apache.org
> > >
> > >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 
> 

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread sebb
On Fri, 21 Oct 2022 at 11:57,  wrote:
>
> > Would't it be simpler to deal with the serialization issue by bumping the 
> > serialVersionID?
> simpler yes, but it's a different thing
> The PR makes the serialized forms for commons-csv versions 1.9.0 and 1.10.0 
> compatible.
>
> Given that serialization has been broken for several versions in commons-csv 
> and given that "fixing" it now is a labor of love which does not add much 
> value,
> I am in favor of throwing it out altogether now rather than in some future 
> (major) version. Clearly inform about it in the release notes and be done 
> with it.

+1

> > Also note the PR will throw an NPE in the builder
> > when instead of using the validate() method.
>
> setDuplicateHeaderMode(null) is illegal and should be fail-fast IMO
>
> kind regards,
> Markus
>
> From: Gary Gregory 
> Sent: Thursday, October 20, 2022 16:43
> To: Commons Developers List 
> Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
>
> Would't it be simpler to deal with the serialization issue by bumping the
> serialVersionID? We can just say that you only serialized and deserialize
> for the same version. Also note the PR will throw an NPE in the builder
> when instead of using the validate() method.
>
> Gary
>
> On Wed, Oct 19, 2022, 18:27 Gary D. Gregory  wrote:
>
> > I've commented on the PR.
> > TY.
> > Gary
> >
> > On 2022/10/19 16:51:57 Gary Gregory wrote:
> > > On Wed, Oct 19, 2022 at 10:01 AM Alex Herbert 
> > wrote:
> > > >
> > > > On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory 
> > wrote:
> > > > >
> > > > > My +1
> > > > >
> > > > > Gary
> > > >
> > > > Gary,
> > > >
> > > > PR #276 highlights a behavioural compatibility error in the 1.10.0 RC1.
> > > >
> > > > AllowDuplicates enum may be set to the incorrect value when setting
> > > > the allow duplicates boolean. Have you reviewed this? I believe it is
> > > > valid.
> > >
> > > I will re-read later tonight...
> > >
> > > Gary
> > >
> > > >
> > > > Alex
> > > >
> > > > -
> > > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > > For additional commands, e-mail: dev-h...@commons.apache.org
> > > >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > For additional commands, e-mail: dev-h...@commons.apache.org
> > >
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread sman81
> Would't it be simpler to deal with the serialization issue by bumping the 
> serialVersionID?
simpler yes, but it's a different thing
The PR makes the serialized forms for commons-csv versions 1.9.0 and 1.10.0 
compatible.

Given that serialization has been broken for several versions in commons-csv 
and given that "fixing" it now is a labor of love which does not add much value,
I am in favor of throwing it out altogether now rather than in some future 
(major) version. Clearly inform about it in the release notes and be done with 
it.

> Also note the PR will throw an NPE in the builder
> when instead of using the validate() method.

setDuplicateHeaderMode(null) is illegal and should be fail-fast IMO

kind regards,
Markus

From: Gary Gregory 
Sent: Thursday, October 20, 2022 16:43
To: Commons Developers List 
Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1 
 
Would't it be simpler to deal with the serialization issue by bumping the
serialVersionID? We can just say that you only serialized and deserialize
for the same version. Also note the PR will throw an NPE in the builder
when instead of using the validate() method.

Gary

On Wed, Oct 19, 2022, 18:27 Gary D. Gregory  wrote:

> I've commented on the PR.
> TY.
> Gary
>
> On 2022/10/19 16:51:57 Gary Gregory wrote:
> > On Wed, Oct 19, 2022 at 10:01 AM Alex Herbert 
> wrote:
> > >
> > > On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory 
> wrote:
> > > >
> > > > My +1
> > > >
> > > > Gary
> > >
> > > Gary,
> > >
> > > PR #276 highlights a behavioural compatibility error in the 1.10.0 RC1.
> > >
> > > AllowDuplicates enum may be set to the incorrect value when setting
> > > the allow duplicates boolean. Have you reviewed this? I believe it is
> > > valid.
> >
> > I will re-read later tonight...
> >
> > Gary
> >
> > >
> > > Alex
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > For additional commands, e-mail: dev-h...@commons.apache.org
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>
-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-20 Thread David Dellsperger
I had just started to look into this and was going to call out the same
thing.  I'm concerned with those changes, especially the ones regarding the
allowDuplicates change, I made a note in my ticket for work to make sure we
have appropriate test cases on our end, with the RC, we didn't see any
issues with the compatibility between 1.9.0 and 1.10.0.RC.

David

On Thu, Oct 20, 2022 at 5:56 PM Alex Herbert 
wrote:

> On Thu, 20 Oct 2022 at 23:43, Alex Herbert 
> wrote:
> >
> > I did not have time to track through whether this behaviour changed
> > after the initial implementation of the flag. I would think not as the
> > original behaviour is from 1.0. This would map to:
> >
> > true -> ALLOW_ALL
> > false -> ALLOW_EMPTY
> > new -> DISALLOW
> >
> > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to
> > change the use of the flag to 'false -> DISALLOW' is not maintaining
> > behavioural compatibility (to 1.7, or back to 1.0).
>
> PS. I just verified that PR 276 changes the DuplicateHeaderMode value
> for allowDuplicates=false and does not change any tests.
>
> So the test suite is currently not enforcing behavioural
> compatibility. This seems like a glaring hole in the tests and should
> be addressed to prevent regressions.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>


Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-20 Thread Alex Herbert
On Thu, 20 Oct 2022 at 23:43, Alex Herbert  wrote:
>
> I did not have time to track through whether this behaviour changed
> after the initial implementation of the flag. I would think not as the
> original behaviour is from 1.0. This would map to:
>
> true -> ALLOW_ALL
> false -> ALLOW_EMPTY
> new -> DISALLOW
>
> Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to
> change the use of the flag to 'false -> DISALLOW' is not maintaining
> behavioural compatibility (to 1.7, or back to 1.0).

PS. I just verified that PR 276 changes the DuplicateHeaderMode value
for allowDuplicates=false and does not change any tests.

So the test suite is currently not enforcing behavioural
compatibility. This seems like a glaring hole in the tests and should
be addressed to prevent regressions.

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-20 Thread Alex Herbert
On Thu, 20 Oct 2022 at 22:45, Gary D. Gregory  wrote:
>
> Hi All (below)
>
> On 2022/10/20 18:08:31 Alex Herbert wrote:
> > On Thu, 20 Oct 2022 at 17:05, sebb  wrote:
> > >
> > > On Thu, 20 Oct 2022 at 15:43, Gary Gregory  wrote:
> > > >
> > > > Would't it be simpler to deal with the serialization issue by bumping 
> > > > the
> > > > serialVersionID? We can just say that you only serialized and 
> > > > deserialize
> > > > for the same version.
> > >
> > > Are we willing to continue supporting serialisation going forward?
> > > It is not easy to ensure compatibility and avoid security issues.
> > >
> > > Are there any use-cases for allowing serialisation?
> >
> > Serialisation was broken for CSVRecord before and partially fixed in
> > 1.8 to support 1.0 to 1.6. Fields from 1.7 are not supported. The
> > release notes for 1.8 state that serialisation will not be supported
> > going forward. So this breakage of serialisation for CSVFormat has
> > precedent.
> >
> > I think Gary's suggestion to change the serial version ID commits to
> > the same path as not supporting serialisation from 2.0.
>
> I think this is the simplest solution. We can Javadoc the class and say that 
> we do not support serialization from one version to the next and that it will 
> be removed in 2.0. Check? If this OK, then I'll update git master and we can 
> move on to the duplicate headers enum.

+1

>
> >
> > Regarding the release, I am not concerned about serialisation as it
> > seems to be a lost cause. The issue is the behavioural compatibility
> > of switching from a boolean flag for duplicate headers to an enum with
> > 3 options. We should get this correct to avoid a future release having
> > to explain a behavioural compatibility change.
>
> Duplicate headers enum: We are leaning towards canceling RC1, updating the 
> PR, or creating a new PR. I'll wait to cancel RC1 until it is clear what is 
> being proposed, with a PR.

I think we need to check what happened before we had the duplicate
headers flag. This is what I have found:

CSV-239 added the flag (1.7.0) [1] in commit [2]. This was added to
allow the CSVRecord getHeaderNames to return all headers including
repeats. Before that duplicates threw an exception (see CSV-236 [3],
which predates CSV-239). Throwing an exception for duplicate headers
is mentioned in the changes log for release 1.0 [5]. Note the original
behavior was to throw for non-empty duplicates due to a fix
implemented in CSV-121 for release 1.0 [6]. So this is the original
behaviour.

CSV-264 added the enum (1.10.0) [4].

So if the duplicate headers flag has been in since 1.7 then we should
just map the behaviour to the new enum. The commit when the boolean
flag was added has this text in CSVParser:

"This will always allow a duplicate header if the header is empty"

So behaviour when the flag was added:

true - allow duplicates
false - only allow empty duplicates, throw for non-empy duplicates

I did not have time to track through whether this behaviour changed
after the initial implementation of the flag. I would think not as the
original behaviour is from 1.0. This would map to:

true -> ALLOW_ALL
false -> ALLOW_EMPTY
new -> DISALLOW

Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to
change the use of the flag to 'false -> DISALLOW' is not maintaining
behavioural compatibility (to 1.7, or back to 1.0).

The original review by Markus Span also found inconsistent settings of
the quotedNullString. But this was removed from PR #276 and I lost
track of whether that change was required.

Alex

[1] https://issues.apache.org/jira/browse/CSV-236
[2] 
https://github.com/apache/commons-csv/commit/030fb8e37c4024b24fac2b5404300449a6741699
[3] https://issues.apache.org/jira/browse/CSV-236
[4] https://issues.apache.org/jira/browse/CSV-264
[5] https://commons.apache.org/proper/commons-csv/changes-report.html
[6] https://issues.apache.org/jira/browse/CSV-121
[7] https://github.com/apache/commons-csv/pull/276

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-20 Thread Gary D. Gregory
Hi All (below)

On 2022/10/20 18:08:31 Alex Herbert wrote:
> On Thu, 20 Oct 2022 at 17:05, sebb  wrote:
> >
> > On Thu, 20 Oct 2022 at 15:43, Gary Gregory  wrote:
> > >
> > > Would't it be simpler to deal with the serialization issue by bumping the
> > > serialVersionID? We can just say that you only serialized and deserialize
> > > for the same version.
> >
> > Are we willing to continue supporting serialisation going forward?
> > It is not easy to ensure compatibility and avoid security issues.
> >
> > Are there any use-cases for allowing serialisation?
> 
> Serialisation was broken for CSVRecord before and partially fixed in
> 1.8 to support 1.0 to 1.6. Fields from 1.7 are not supported. The
> release notes for 1.8 state that serialisation will not be supported
> going forward. So this breakage of serialisation for CSVFormat has
> precedent.
> 
> I think Gary's suggestion to change the serial version ID commits to
> the same path as not supporting serialisation from 2.0.

I think this is the simplest solution. We can Javadoc the class and say that we 
do not support serialization from one version to the next and that it will be 
removed in 2.0. Check? If this OK, then I'll update git master and we can move 
on to the duplicate headers enum.

> 
> Regarding the release, I am not concerned about serialisation as it
> seems to be a lost cause. The issue is the behavioural compatibility
> of switching from a boolean flag for duplicate headers to an enum with
> 3 options. We should get this correct to avoid a future release having
> to explain a behavioural compatibility change.

Duplicate headers enum: We are leaning towards canceling RC1, updating the PR, 
or creating a new PR. I'll wait to cancel RC1 until it is clear what is being 
proposed, with a PR.

Gary

> 
> Alex
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 
> 

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-20 Thread Alex Herbert
On Thu, 20 Oct 2022 at 17:05, sebb  wrote:
>
> On Thu, 20 Oct 2022 at 15:43, Gary Gregory  wrote:
> >
> > Would't it be simpler to deal with the serialization issue by bumping the
> > serialVersionID? We can just say that you only serialized and deserialize
> > for the same version.
>
> Are we willing to continue supporting serialisation going forward?
> It is not easy to ensure compatibility and avoid security issues.
>
> Are there any use-cases for allowing serialisation?

Serialisation was broken for CSVRecord before and partially fixed in
1.8 to support 1.0 to 1.6. Fields from 1.7 are not supported. The
release notes for 1.8 state that serialisation will not be supported
going forward. So this breakage of serialisation for CSVFormat has
precedent.

I think Gary's suggestion to change the serial version ID commits to
the same path as not supporting serialisation from 2.0.

Regarding the release, I am not concerned about serialisation as it
seems to be a lost cause. The issue is the behavioural compatibility
of switching from a boolean flag for duplicate headers to an enum with
3 options. We should get this correct to avoid a future release having
to explain a behavioural compatibility change.

Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-20 Thread sebb
On Thu, 20 Oct 2022 at 15:43, Gary Gregory  wrote:
>
> Would't it be simpler to deal with the serialization issue by bumping the
> serialVersionID? We can just say that you only serialized and deserialize
> for the same version.

Are we willing to continue supporting serialisation going forward?
It is not easy to ensure compatibility and avoid security issues.

Are there any use-cases for allowing serialisation?

> Also note the PR will throw an NPE in the builder
> when instead of using the validate() method.
>
> Gary
>
> On Wed, Oct 19, 2022, 18:27 Gary D. Gregory  wrote:
>
> > I've commented on the PR.
> > TY.
> > Gary
> >
> > On 2022/10/19 16:51:57 Gary Gregory wrote:
> > > On Wed, Oct 19, 2022 at 10:01 AM Alex Herbert 
> > wrote:
> > > >
> > > > On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory 
> > wrote:
> > > > >
> > > > > My +1
> > > > >
> > > > > Gary
> > > >
> > > > Gary,
> > > >
> > > > PR #276 highlights a behavioural compatibility error in the 1.10.0 RC1.
> > > >
> > > > AllowDuplicates enum may be set to the incorrect value when setting
> > > > the allow duplicates boolean. Have you reviewed this? I believe it is
> > > > valid.
> > >
> > > I will re-read later tonight...
> > >
> > > Gary
> > >
> > > >
> > > > Alex
> > > >
> > > > -
> > > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > > For additional commands, e-mail: dev-h...@commons.apache.org
> > > >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > For additional commands, e-mail: dev-h...@commons.apache.org
> > >
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-20 Thread Gary Gregory
Would't it be simpler to deal with the serialization issue by bumping the
serialVersionID? We can just say that you only serialized and deserialize
for the same version. Also note the PR will throw an NPE in the builder
when instead of using the validate() method.

Gary

On Wed, Oct 19, 2022, 18:27 Gary D. Gregory  wrote:

> I've commented on the PR.
> TY.
> Gary
>
> On 2022/10/19 16:51:57 Gary Gregory wrote:
> > On Wed, Oct 19, 2022 at 10:01 AM Alex Herbert 
> wrote:
> > >
> > > On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory 
> wrote:
> > > >
> > > > My +1
> > > >
> > > > Gary
> > >
> > > Gary,
> > >
> > > PR #276 highlights a behavioural compatibility error in the 1.10.0 RC1.
> > >
> > > AllowDuplicates enum may be set to the incorrect value when setting
> > > the allow duplicates boolean. Have you reviewed this? I believe it is
> > > valid.
> >
> > I will re-read later tonight...
> >
> > Gary
> >
> > >
> > > Alex
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > For additional commands, e-mail: dev-h...@commons.apache.org
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>


Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-19 Thread Gary D. Gregory
I've commented on the PR.
TY.
Gary

On 2022/10/19 16:51:57 Gary Gregory wrote:
> On Wed, Oct 19, 2022 at 10:01 AM Alex Herbert  
> wrote:
> >
> > On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory  wrote:
> > >
> > > My +1
> > >
> > > Gary
> >
> > Gary,
> >
> > PR #276 highlights a behavioural compatibility error in the 1.10.0 RC1.
> >
> > AllowDuplicates enum may be set to the incorrect value when setting
> > the allow duplicates boolean. Have you reviewed this? I believe it is
> > valid.
> 
> I will re-read later tonight...
> 
> Gary
> 
> >
> > Alex
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 
> 

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-19 Thread Gary Gregory
On Wed, Oct 19, 2022 at 10:01 AM Alex Herbert  wrote:
>
> On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory  wrote:
> >
> > My +1
> >
> > Gary
>
> Gary,
>
> PR #276 highlights a behavioural compatibility error in the 1.10.0 RC1.
>
> AllowDuplicates enum may be set to the incorrect value when setting
> the allow duplicates boolean. Have you reviewed this? I believe it is
> valid.

I will re-read later tonight...

Gary

>
> Alex
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-19 Thread Gary Gregory
On Wed, Oct 19, 2022 at 9:52 AM  wrote:
>
> > You have added test data for CSVFormat for 1.7 and 1.8 and these do
> not work (commented out). I take it this means serialization has been
> broken since the CSVFormat.delimiter was changed from char to String
> in 1.9.0.
>
> That's correct, Alex. I added the comments for documentation. Should we 
> decide to fix serialization from version 1.7 and version 1.8 upwards, the 
> test and test data can be enabled.
>
> > So given serialization was already broken in the last
> release does it make any sense to fix it for 1.9.0 to 1.10.0 for the new 
> field?
>
> According to the number of Maven central downloads versions 1.8 and 1.9.0 are 
> the most popular of commons-csv.
> So I think there is merit in fixing serialization issues from 1.9.0 to 1.10.0 
> Besides the PR already offers the fix, why abandon it?

It's not abandoned, it will be for the next go around. This is a not a blocker.

>
> Will PR #276 be part of version 1.10.0?

No.

> Do we have a volunteer to review and merge it?

Me, later.

TY,
Gary

>
> regards,
> Markus
>
>
> 
> From: Alex Herbert 
> Sent: Monday, October 17, 2022 19:53
> To: Commons Developers List 
> Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
>
> On Mon, 17 Oct 2022 at 17:11,  wrote:
> >
> > Hello
> >
> > > This is the logic from the current builder:
> > > DuplicateHeaderMode mode = allowDuplicateHeaderNames ?
> > > DuplicateHeaderMode.ALLOW_ALL : DuplicateHeaderMode.ALLOW_EMPTY
> >
> > This is true only if Builder.setAllowDuplicateHeaderNames is actually 
> > called, which is at the library user's discretion.
>
> I was only discussing the deserialization path and the logic that
> would be required in custom deserialization code to support backwards
> compatibility. However I do not think this library is supporting
> Serializable going forward. It is already known to be broken for
> CSVRecord and now also CSVFormat.
>
> >
> > Otherwise CSVFormat.Builder.duplicateHeaderMode remains null.
> > Also, the method Builder.setDuplicateHeaderMode(DuplicateHeaderMode) 
> > accepts null. A non-null parameter should be enforced.
> >
> > Besides, shouldn't the above statement be:
> > allowDuplicateHeaderNames ? ALLOW_ALL : DISALLOW
> > ?
> >
> > duplicateHeaderMode should default to DISALLOW for backward compatibility.
>
> Duplicate header support was added in 1.7. Before that I do not know
> what happened without checking the old code. It may have thrown an
> exception or silently ignored it. I would have to track down the Jira
> ticket for when the feature was added.
>
> >
> > Found another small bug in setNullString where member quotedNullString was 
> > inconsistently written (missing write in setQuote):
> >
> > this.quotedNullString = quoteCharacter + nullString + 
> > quoteCharacter;
> >
> > All of the above in pull request 
> > https://github.com/apache/commons-csv/pull/276
> >
> > @Alex would you review my PR please?
>
> Sure.
>
> You have added test data for CSVFormat for 1.7 and 1.8 and these do
> not work (commented out). I take it this means serialization has been
> broken since the CSVFormat.delimiter was changed from char to String
> in 1.9.0. So given serialization was already broken in the last
> release does it make any sense to fix it for 1.9.0 to 1.10.0 for the
> new field?
>
> I agree that the quotedNullString is inconsistently handled.
>
> Regards,
>
> Alex
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-19 Thread Gary D. Gregory
Hi Markus,

Anyone can vote, please see https://www.apache.org/foundation/voting.html

Note that PMC member votes are binding, while others are advisory.

Gary

On 2022/10/17 10:00:13 sma...@outlook.de wrote:
> Hello
> 
> CSV-264 (Add DuplicateHeaderMode) introduces bugs that should be fixed before 
> shipping 1.10.0 IMO
> - missing default
> - broken serialization of class CSVFormat
> 
> I raised these issues in CSV-302.
> 
> The serialization issue is caught by Revapi. I had suggested to include 
> Revapi in the project (CSV-303) or alternatively add revapi and drop clirr in 
> commons-parent but that more or less fell on deaf ears.
> 
> mvn org.revapi:revapi-maven-plugin:check
> 
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  3.560 s
> [INFO] Finished at: 2022-10-17T11:30:21+02:00
> [INFO] 
> 
> [ERROR] Failed to execute goal org.revapi:revapi-maven-plugin:0.14.7:check 
> (default-cli) on project commons-csv: The following API problems caused the 
> build to fail:
> [ERROR] java.field.serialVersionUIDUnchanged: field 
> org.apache.commons.csv.CSVFormat.serialVersionUID: The class changed in an 
> incompatible way with regards to serialization but the serialVersionUID field 
> stayed unchanged. This might be ok and/or desired but is suspicious. 
> https://revapi.org/revapi-java/differences.html#java.field.serialVersionUIDUnchanged
> [ERROR] java.method.exception.checkedRemoved: method 
> java.util.List 
> org.apache.commons.csv.CSVParser::getRecords(): Method no longer throws 
> checked exceptions: java.io.IOException. 
> https://revapi.org/revapi-java/differences.html#java.method.exception.checkedRemoved
> 
> Also noticed that class DuplicateHeaderMode and method 
> CSVFormat.getDuplicateHeaderMode() are incorrectly annotated with "@since 
> 1.9.0". Both are new in version 1.10.0.
> 
> My vote is -1 assuming I am allowed to vote.
> 
> Kind regards,
> Markus
> 
> From: Gary Gregory 
> Sent: Sunday, October 16, 2022 14:48
> To: Commons Developers List 
> Subject: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1 
>  
> To: dev@commons.apache.org
> We have fixed a few bugs and added some enhancements since Apache
> Commons CSV 1.9.0 was released, so I would like to release Apache
> Commons CSV 1.10.0.
> 
> Apache Commons CSV 1.10.0 RC1 is available for review here:
>     https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1 (svn
> revision 57404)
> 
> The Git tag commons-csv-1.10.0-RC1 commit for this RC is
> 1bd1fd8e6065da9d07b5a3a1723b059246b14001 which you can browse here:
>     
> https://gitbox.apache.org/repos/asf?p=commons-csv.git;a=commit;h=1bd1fd8e6065da9d07b5a3a1723b059246b14001
> You may checkout this tag using:
>     git clone https://gitbox.apache.org/repos/asf/commons-csv.git
> --branch commons-csv-1.10.0-RC1 commons-csv-1.10.0-RC1
> 
> Maven artifacts are here:
>     
> https://repository.apache.org/content/repositories/orgapachecommons-1600/org/apache/commons/commons-csv/1.10.0/
> 
> These are the artifacts and their hashes:
> 
> #Release SHA-512s
> #Sun Oct 16 08:32:39 EDT 2022
> Apache\ Commons\
> CSV-1.10.0.spdx.rdf.xml=c8cf7495637ce5282c72aa8e8ef4b93f206fbbdf21c7b337a34daa3b89af9816073d0185118cab4cdbed5104cc7f090e81efb614594079b02829c96829fe8aeb
> commons-csv-1.10.0-bin.tar.gz=5b0ff3f5cc4aeff6c2aa2c11e92ea8b1b1fbbcd098ba79d97bcfd9af3ea08a9ebc922e5237bda8abcb238eb0d2fe7a5ef27534d0d796a1fc77691684a698f3d1
> commons-csv-1.10.0-bin.zip=a52caa6ccda5b830c133c965fd6843d8f960e1ce3ef83decf5435022481746093cf8b4308c1fded21a168b751724b4a9f5e72f3eca34381828968bff89e7239b
> commons-csv-1.10.0-bom.json=0e3e6652e520afb697461a3f6b33458dc8ef1dd157c2b1442664b8f12f7504226916e64960cea5cc92dd2b923d18047ed921be60eb7febfbeffc8ac3e3254fc0
> commons-csv-1.10.0-bom.xml=91e494b94eea35ee7d85c8d8dbdc5029f981e0c5a8b1aae2b0f3cd77d1cea0fb0fc8c857d18d41ca055d12c4c4eb1adf70582f178a88751b836861c3286283db
> commons-csv-1.10.0-javadoc.jar=719656f5399e0889af2e8b40f49458d0dec6e8ae85413f36caa2187235c68d9e42502bf07c66c0bcc7c5c3427bc0aac51f9897581842d303501bbff5ec73bdc0
> commons-csv-1.10.0-sources.jar=c78c8fa112ef35d6d6c061b05e08816ce516a3acd56d720c59dbdaeb9d1202b16294233bd2aa2fac91cfa759456f8d7e08951876d85decdf0b8fb215a8e31754
> commons-csv-1.10.0-src.tar.gz=a0d3a7aedab567aaa32a3dd1ceddeff412e686f3fd2b0805993db7e3dd4804b4d5d59e70f1500f23e8d0ba6531832e9f20d0c298bd443d9825724d5d1b1c87b3
> commons-csv-1.10.0-src.zip=5cbdbcf367c9dd78e76228817592d70378b3cd8d919fdcd7df5c3eb2d7ebd8bd47fbb17376a631efb640be4115d42d3cc7aafa866bfefdfe073d5a4fcf625e0b
> commons-csv-1.10.0-test-sources.jar=0d77c791934609b176457df5c60fdd5bfa3c6f392375adef81e5e7cd31de7fb25bd91b0bbe60a9ff46acb62efc2a33660adff24850370b7546a71374aa267f94
> 

Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-19 Thread Alex Herbert
On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory  wrote:
>
> My +1
>
> Gary

Gary,

PR #276 highlights a behavioural compatibility error in the 1.10.0 RC1.

AllowDuplicates enum may be set to the incorrect value when setting
the allow duplicates boolean. Have you reviewed this? I believe it is
valid.

Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-19 Thread Alex Herbert
On Wed, 19 Oct 2022 at 14:52,  wrote:
>
> > You have added test data for CSVFormat for 1.7 and 1.8 and these do
> not work (commented out). I take it this means serialization has been
> broken since the CSVFormat.delimiter was changed from char to String
> in 1.9.0.
>
> That's correct, Alex. I added the comments for documentation. Should we 
> decide to fix serialization from version 1.7 and version 1.8 upwards, the 
> test and test data can be enabled.
>
> > So given serialization was already broken in the last
> release does it make any sense to fix it for 1.9.0 to 1.10.0 for the new 
> field?
>
> According to the number of Maven central downloads versions 1.8 and 1.9.0 are 
> the most popular of commons-csv.
> So I think there is merit in fixing serialization issues from 1.9.0 to 1.10.0 
> Besides the PR already offers the fix, why abandon it?

Serialization is already broken for CSVRecord from 1.7+. It is broken
for CSVFormat 1.7+. It will be removed in 2.0. So I think that
serialization should be ignored going forward.

>
> Will PR #276 be part of version 1.10.0?

Perhaps it should be.

Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-19 Thread Gary D. Gregory
My +1

Gary

On 2022/10/16 12:48:50 Gary Gregory wrote:
> To: dev@commons.apache.org
> We have fixed a few bugs and added some enhancements since Apache
> Commons CSV 1.9.0 was released, so I would like to release Apache
> Commons CSV 1.10.0.
> 
> Apache Commons CSV 1.10.0 RC1 is available for review here:
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1 (svn
> revision 57404)
> 
> The Git tag commons-csv-1.10.0-RC1 commit for this RC is
> 1bd1fd8e6065da9d07b5a3a1723b059246b14001 which you can browse here:
> 
> https://gitbox.apache.org/repos/asf?p=commons-csv.git;a=commit;h=1bd1fd8e6065da9d07b5a3a1723b059246b14001
> You may checkout this tag using:
> git clone https://gitbox.apache.org/repos/asf/commons-csv.git
> --branch commons-csv-1.10.0-RC1 commons-csv-1.10.0-RC1
> 
> Maven artifacts are here:
> 
> https://repository.apache.org/content/repositories/orgapachecommons-1600/org/apache/commons/commons-csv/1.10.0/
> 
> These are the artifacts and their hashes:
> 
> #Release SHA-512s
> #Sun Oct 16 08:32:39 EDT 2022
> Apache\ Commons\
> CSV-1.10.0.spdx.rdf.xml=c8cf7495637ce5282c72aa8e8ef4b93f206fbbdf21c7b337a34daa3b89af9816073d0185118cab4cdbed5104cc7f090e81efb614594079b02829c96829fe8aeb
> commons-csv-1.10.0-bin.tar.gz=5b0ff3f5cc4aeff6c2aa2c11e92ea8b1b1fbbcd098ba79d97bcfd9af3ea08a9ebc922e5237bda8abcb238eb0d2fe7a5ef27534d0d796a1fc77691684a698f3d1
> commons-csv-1.10.0-bin.zip=a52caa6ccda5b830c133c965fd6843d8f960e1ce3ef83decf5435022481746093cf8b4308c1fded21a168b751724b4a9f5e72f3eca34381828968bff89e7239b
> commons-csv-1.10.0-bom.json=0e3e6652e520afb697461a3f6b33458dc8ef1dd157c2b1442664b8f12f7504226916e64960cea5cc92dd2b923d18047ed921be60eb7febfbeffc8ac3e3254fc0
> commons-csv-1.10.0-bom.xml=91e494b94eea35ee7d85c8d8dbdc5029f981e0c5a8b1aae2b0f3cd77d1cea0fb0fc8c857d18d41ca055d12c4c4eb1adf70582f178a88751b836861c3286283db
> commons-csv-1.10.0-javadoc.jar=719656f5399e0889af2e8b40f49458d0dec6e8ae85413f36caa2187235c68d9e42502bf07c66c0bcc7c5c3427bc0aac51f9897581842d303501bbff5ec73bdc0
> commons-csv-1.10.0-sources.jar=c78c8fa112ef35d6d6c061b05e08816ce516a3acd56d720c59dbdaeb9d1202b16294233bd2aa2fac91cfa759456f8d7e08951876d85decdf0b8fb215a8e31754
> commons-csv-1.10.0-src.tar.gz=a0d3a7aedab567aaa32a3dd1ceddeff412e686f3fd2b0805993db7e3dd4804b4d5d59e70f1500f23e8d0ba6531832e9f20d0c298bd443d9825724d5d1b1c87b3
> commons-csv-1.10.0-src.zip=5cbdbcf367c9dd78e76228817592d70378b3cd8d919fdcd7df5c3eb2d7ebd8bd47fbb17376a631efb640be4115d42d3cc7aafa866bfefdfe073d5a4fcf625e0b
> commons-csv-1.10.0-test-sources.jar=0d77c791934609b176457df5c60fdd5bfa3c6f392375adef81e5e7cd31de7fb25bd91b0bbe60a9ff46acb62efc2a33660adff24850370b7546a71374aa267f94
> commons-csv-1.10.0-tests.jar=9a08e4d2ec865d9e1dafc01e4ee58804490b2d50f886bac47a5235581077b31a15dda5b3f8f2adb5a70be8794cca886126d980d6d3722eec2e10471fc24e1940
> 
> I have tested this with 'mvn' and 'mvn -V -Duser.name=$my_apache_id
> -Prelease -Ptest-deploy -P jacoco -P japicmp clean package site
> deploy' using:
> 
> Darwin *** 21.6.0 Darwin Kernel Version 21.6.0: Mon Aug 22 20:17:10
> PDT 2022; root:xnu-8020.140.49~2/RELEASE_X86_64 x86_64
> 
> Apache Maven 3.8.6 (84538c9988a25aec085021c365c560670ad80f63)
> Maven home: /usr/local/Cellar/maven/3.8.6/libexec
> Java version: 1.8.0_345, vendor: Homebrew, runtime:
> /usr/local/Cellar/openjdk@8/1.8.0+345/libexec/openjdk.jdk/Contents/Home/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "mac os x", version: "12.6", arch: "x86_64", family: "mac"
> 
> Details of changes since 1.9.0 are in the release notes:
> 
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1/RELEASE-NOTES.txt
> 
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1/site/changes-report.html
> 
> Site:
> 
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1/site/index.html
> (note some *relative* links are broken and the 1.10.0 directories
> are not yet created - these will be OK once the site is deployed.)
> 
> JApiCmp Report (compared to 1.9.0):
> 
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1/site/japicmp.html
> 
> RAT Report:
> 
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1/site/rat-report.html
> 
> KEYS:
>   https://downloads.apache.org/commons/KEYS
> 
> Please review the release candidate and vote.
> This vote will close no sooner than 72 hours from now.
> 
>   [ ] +1 Release these artifacts
>   [ ] +0 OK, but...
>   [ ] -0 OK, but really should fix...
>   [ ] -1 I oppose this release because...
> 
> Thank you,
> 
> Gary Gregory,
> Release Manager (using key 86fdc7e2a11262cb)
> 
> For following is intended as a helper and refresher for reviewers.
> 
> Validating a release candidate
> ==
> 
> These guidelines are NOT complete.
> 
> Requirements: Git, Java, Maven.
> 
> You can validate a release from a release candidate (RC) tag as follows.
> 
> 1) Clone and checkout the RC tag
> 
> git clone 

Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-19 Thread sman81
> You have added test data for CSVFormat for 1.7 and 1.8 and these do
not work (commented out). I take it this means serialization has been
broken since the CSVFormat.delimiter was changed from char to String
in 1.9.0.

That's correct, Alex. I added the comments for documentation. Should we decide 
to fix serialization from version 1.7 and version 1.8 upwards, the test and 
test data can be enabled.

> So given serialization was already broken in the last
release does it make any sense to fix it for 1.9.0 to 1.10.0 for the new field?

According to the number of Maven central downloads versions 1.8 and 1.9.0 are 
the most popular of commons-csv.
So I think there is merit in fixing serialization issues from 1.9.0 to 1.10.0 
Besides the PR already offers the fix, why abandon it?

Will PR #276 be part of version 1.10.0?
Do we have a volunteer to review and merge it?

regards,
Markus



From: Alex Herbert 
Sent: Monday, October 17, 2022 19:53
To: Commons Developers List 
Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

On Mon, 17 Oct 2022 at 17:11,  wrote:
>
> Hello
>
> > This is the logic from the current builder:
> > DuplicateHeaderMode mode = allowDuplicateHeaderNames ?
> > DuplicateHeaderMode.ALLOW_ALL : DuplicateHeaderMode.ALLOW_EMPTY
>
> This is true only if Builder.setAllowDuplicateHeaderNames is actually called, 
> which is at the library user's discretion.

I was only discussing the deserialization path and the logic that
would be required in custom deserialization code to support backwards
compatibility. However I do not think this library is supporting
Serializable going forward. It is already known to be broken for
CSVRecord and now also CSVFormat.

>
> Otherwise CSVFormat.Builder.duplicateHeaderMode remains null.
> Also, the method Builder.setDuplicateHeaderMode(DuplicateHeaderMode) accepts 
> null. A non-null parameter should be enforced.
>
> Besides, shouldn't the above statement be:
> allowDuplicateHeaderNames ? ALLOW_ALL : DISALLOW
> ?
>
> duplicateHeaderMode should default to DISALLOW for backward compatibility.

Duplicate header support was added in 1.7. Before that I do not know
what happened without checking the old code. It may have thrown an
exception or silently ignored it. I would have to track down the Jira
ticket for when the feature was added.

>
> Found another small bug in setNullString where member quotedNullString was 
> inconsistently written (missing write in setQuote):
>
> this.quotedNullString = quoteCharacter + nullString + 
> quoteCharacter;
>
> All of the above in pull request 
> https://github.com/apache/commons-csv/pull/276
>
> @Alex would you review my PR please?

Sure.

You have added test data for CSVFormat for 1.7 and 1.8 and these do
not work (commented out). I take it this means serialization has been
broken since the CSVFormat.delimiter was changed from char to String
in 1.9.0. So given serialization was already broken in the last
release does it make any sense to fix it for 1.9.0 to 1.10.0 for the
new field?

I agree that the quotedNullString is inconsistently handled.

Regards,

Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-17 Thread Alex Herbert
On Mon, 17 Oct 2022 at 17:11,  wrote:
>
> Hello
>
> > This is the logic from the current builder:
> > DuplicateHeaderMode mode = allowDuplicateHeaderNames ?
> > DuplicateHeaderMode.ALLOW_ALL : DuplicateHeaderMode.ALLOW_EMPTY
>
> This is true only if Builder.setAllowDuplicateHeaderNames is actually called, 
> which is at the library user's discretion.

I was only discussing the deserialization path and the logic that
would be required in custom deserialization code to support backwards
compatibility. However I do not think this library is supporting
Serializable going forward. It is already known to be broken for
CSVRecord and now also CSVFormat.

>
> Otherwise CSVFormat.Builder.duplicateHeaderMode remains null.
> Also, the method Builder.setDuplicateHeaderMode(DuplicateHeaderMode) accepts 
> null. A non-null parameter should be enforced.
>
> Besides, shouldn't the above statement be:
> allowDuplicateHeaderNames ? ALLOW_ALL : DISALLOW
> ?
>
> duplicateHeaderMode should default to DISALLOW for backward compatibility.

Duplicate header support was added in 1.7. Before that I do not know
what happened without checking the old code. It may have thrown an
exception or silently ignored it. I would have to track down the Jira
ticket for when the feature was added.

>
> Found another small bug in setNullString where member quotedNullString was 
> inconsistently written (missing write in setQuote):
>
> this.quotedNullString = quoteCharacter + nullString + 
> quoteCharacter;
>
> All of the above in pull request 
> https://github.com/apache/commons-csv/pull/276
>
> @Alex would you review my PR please?

Sure.

You have added test data for CSVFormat for 1.7 and 1.8 and these do
not work (commented out). I take it this means serialization has been
broken since the CSVFormat.delimiter was changed from char to String
in 1.9.0. So given serialization was already broken in the last
release does it make any sense to fix it for 1.9.0 to 1.10.0 for the
new field?

I agree that the quotedNullString is inconsistently handled.

Regards,

Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-17 Thread sman81
Hello

> This is the logic from the current builder:
> DuplicateHeaderMode mode = allowDuplicateHeaderNames ?
> DuplicateHeaderMode.ALLOW_ALL : DuplicateHeaderMode.ALLOW_EMPTY

This is true only if Builder.setAllowDuplicateHeaderNames is actually called, 
which is at the library user's discretion.

Otherwise CSVFormat.Builder.duplicateHeaderMode remains null.
Also, the method Builder.setDuplicateHeaderMode(DuplicateHeaderMode) accepts 
null. A non-null parameter should be enforced.

Besides, shouldn't the above statement be:
allowDuplicateHeaderNames ? ALLOW_ALL : DISALLOW
?

duplicateHeaderMode should default to DISALLOW for backward compatibility.

Found another small bug in setNullString where member quotedNullString was 
inconsistently written (missing write in setQuote):

this.quotedNullString = quoteCharacter + nullString + 
quoteCharacter;

All of the above in pull request https://github.com/apache/commons-csv/pull/276

@Alex would you review my PR please?

Kind regards,
Markus


From: Alex Herbert 
Sent: Monday, October 17, 2022 13:46
To: Commons Developers List 
Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

On Mon, 17 Oct 2022 at 11:00,  wrote:
>
> Hello
>
> CSV-264 (Add DuplicateHeaderMode) introduces bugs that should be fixed before 
> shipping 1.10.0 IMO
> - missing default
> - broken serialization of class CSVFormat
>
> I raised these issues in CSV-302.
>
> The serialization issue is caught by Revapi. I had suggested to include 
> Revapi in the project (CSV-303) or alternatively add revapi and drop clirr in 
> commons-parent but that more or less fell on deaf ears.
>
> mvn org.revapi:revapi-maven-plugin:check
>
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  3.560 s
> [INFO] Finished at: 2022-10-17T11:30:21+02:00
> [INFO] 
> 
> [ERROR] Failed to execute goal org.revapi:revapi-maven-plugin:0.14.7:check 
> (default-cli) on project commons-csv: The following API problems caused the 
> build to fail:
> [ERROR] java.field.serialVersionUIDUnchanged: field 
> org.apache.commons.csv.CSVFormat.serialVersionUID: The class changed in an 
> incompatible way with regards to serialization but the serialVersionUID field 
> stayed unchanged. This might be ok and/or desired but is suspicious. 
> https://revapi.org/revapi-java/differences.html#java.field.serialVersionUIDUnchanged

This is an allowed source and binary compatible error; it has
potentially breaking semantic severity which do apply here.

The CSVFormat object is deserialized using the defaultReadObject
method. This will be able to deserialize older versions of CSVFormat.
The new fields added in this version will be absent from the
ObjectInputStream and set to their default initialised value.

IIUC this is the change:

remove: private boolean allowDuplicateHeaderNames
add: private DuplicateHeaderMode duplicateHeaderMode

Deserialization would mean that the new field 'duplicateHeaderMode' is
set to null and the previous field 'allowDuplicateHeaderNames' is
ignored. To fix this would require setting duplicateHeaderMode
appropriately using the field allowDuplicateHeaderName. This is the
logic from the current builder:

DuplicateHeaderMode mode = allowDuplicateHeaderNames ?
DuplicateHeaderMode.ALLOW_ALL : DuplicateHeaderMode.ALLOW_EMPTY

However, serialization was broken previously in CSV. This was for the
CSVRecord. The release notes contain this in the description for 1.8:

"This release fixes serialization compatibility of CSVRecord with
versions 1.0 to 1.6. New fields added since 1.7 are not serialized.
Support for Serializable is scheduled to be removed in version 2.0."

The Jira ticket is 248:
https://issues.apache.org/jira/browse/CSV-248

In that case we made sure we could (de)serialize without throwing an
exception to minimise errors for downstream users. But we did not fix
behavioural compatibility across versions. In this instance I do not
think exceptions need to be addressed as missing fields will be
ignored and the new field will be initialized as null. However I have
not verified this with a serialization test. If this new field is null
then the format will behave as if duplicates is set to DISALLOW.

I do not know the user case for serializing a CSVFormat. It would not
be within the library. A reference is kept in CSVParser and CSVPrinter
but these are not Serializable.

The serialization support dates back to the legacy codebase used to
create the project. Serialization can be replaced by other more
configurable libraries to read/write binary objects. Perhaps we just
need to add to the release notes that CSVF

Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-17 Thread Alex Herbert
On Mon, 17 Oct 2022 at 11:00,  wrote:
>
> Hello
>
> CSV-264 (Add DuplicateHeaderMode) introduces bugs that should be fixed before 
> shipping 1.10.0 IMO
> - missing default
> - broken serialization of class CSVFormat
>
> I raised these issues in CSV-302.
>
> The serialization issue is caught by Revapi. I had suggested to include 
> Revapi in the project (CSV-303) or alternatively add revapi and drop clirr in 
> commons-parent but that more or less fell on deaf ears.
>
> mvn org.revapi:revapi-maven-plugin:check
>
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time:  3.560 s
> [INFO] Finished at: 2022-10-17T11:30:21+02:00
> [INFO] 
> 
> [ERROR] Failed to execute goal org.revapi:revapi-maven-plugin:0.14.7:check 
> (default-cli) on project commons-csv: The following API problems caused the 
> build to fail:
> [ERROR] java.field.serialVersionUIDUnchanged: field 
> org.apache.commons.csv.CSVFormat.serialVersionUID: The class changed in an 
> incompatible way with regards to serialization but the serialVersionUID field 
> stayed unchanged. This might be ok and/or desired but is suspicious. 
> https://revapi.org/revapi-java/differences.html#java.field.serialVersionUIDUnchanged

This is an allowed source and binary compatible error; it has
potentially breaking semantic severity which do apply here.

The CSVFormat object is deserialized using the defaultReadObject
method. This will be able to deserialize older versions of CSVFormat.
The new fields added in this version will be absent from the
ObjectInputStream and set to their default initialised value.

IIUC this is the change:

remove: private boolean allowDuplicateHeaderNames
add: private DuplicateHeaderMode duplicateHeaderMode

Deserialization would mean that the new field 'duplicateHeaderMode' is
set to null and the previous field 'allowDuplicateHeaderNames' is
ignored. To fix this would require setting duplicateHeaderMode
appropriately using the field allowDuplicateHeaderName. This is the
logic from the current builder:

DuplicateHeaderMode mode = allowDuplicateHeaderNames ?
DuplicateHeaderMode.ALLOW_ALL : DuplicateHeaderMode.ALLOW_EMPTY

However, serialization was broken previously in CSV. This was for the
CSVRecord. The release notes contain this in the description for 1.8:

"This release fixes serialization compatibility of CSVRecord with
versions 1.0 to 1.6. New fields added since 1.7 are not serialized.
Support for Serializable is scheduled to be removed in version 2.0."

The Jira ticket is 248:
https://issues.apache.org/jira/browse/CSV-248

In that case we made sure we could (de)serialize without throwing an
exception to minimise errors for downstream users. But we did not fix
behavioural compatibility across versions. In this instance I do not
think exceptions need to be addressed as missing fields will be
ignored and the new field will be initialized as null. However I have
not verified this with a serialization test. If this new field is null
then the format will behave as if duplicates is set to DISALLOW.

I do not know the user case for serializing a CSVFormat. It would not
be within the library. A reference is kept in CSVParser and CSVPrinter
but these are not Serializable.

The serialization support dates back to the legacy codebase used to
create the project. Serialization can be replaced by other more
configurable libraries to read/write binary objects. Perhaps we just
need to add to the release notes that CSVFormat serialization is no
longer supported.


> [ERROR] java.method.exception.checkedRemoved: method 
> java.util.List 
> org.apache.commons.csv.CSVParser::getRecords(): Method no longer throws 
> checked exceptions: java.io.IOException. 
> https://revapi.org/revapi-java/differences.html#java.method.exception.checkedRemoved

This is a source incompatible error. It is binary compatible.

Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-17 Thread sman81
Hello

CSV-264 (Add DuplicateHeaderMode) introduces bugs that should be fixed before 
shipping 1.10.0 IMO
- missing default
- broken serialization of class CSVFormat

I raised these issues in CSV-302.

The serialization issue is caught by Revapi. I had suggested to include Revapi 
in the project (CSV-303) or alternatively add revapi and drop clirr in 
commons-parent but that more or less fell on deaf ears.

mvn org.revapi:revapi-maven-plugin:check

[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time:  3.560 s
[INFO] Finished at: 2022-10-17T11:30:21+02:00
[INFO] 
[ERROR] Failed to execute goal org.revapi:revapi-maven-plugin:0.14.7:check 
(default-cli) on project commons-csv: The following API problems caused the 
build to fail:
[ERROR] java.field.serialVersionUIDUnchanged: field 
org.apache.commons.csv.CSVFormat.serialVersionUID: The class changed in an 
incompatible way with regards to serialization but the serialVersionUID field 
stayed unchanged. This might be ok and/or desired but is suspicious. 
https://revapi.org/revapi-java/differences.html#java.field.serialVersionUIDUnchanged
[ERROR] java.method.exception.checkedRemoved: method 
java.util.List 
org.apache.commons.csv.CSVParser::getRecords(): Method no longer throws checked 
exceptions: java.io.IOException. 
https://revapi.org/revapi-java/differences.html#java.method.exception.checkedRemoved

Also noticed that class DuplicateHeaderMode and method 
CSVFormat.getDuplicateHeaderMode() are incorrectly annotated with "@since 
1.9.0". Both are new in version 1.10.0.

My vote is -1 assuming I am allowed to vote.

Kind regards,
Markus

From: Gary Gregory 
Sent: Sunday, October 16, 2022 14:48
To: Commons Developers List 
Subject: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1 
 
To: dev@commons.apache.org
We have fixed a few bugs and added some enhancements since Apache
Commons CSV 1.9.0 was released, so I would like to release Apache
Commons CSV 1.10.0.

Apache Commons CSV 1.10.0 RC1 is available for review here:
    https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1 (svn
revision 57404)

The Git tag commons-csv-1.10.0-RC1 commit for this RC is
1bd1fd8e6065da9d07b5a3a1723b059246b14001 which you can browse here:
    
https://gitbox.apache.org/repos/asf?p=commons-csv.git;a=commit;h=1bd1fd8e6065da9d07b5a3a1723b059246b14001
You may checkout this tag using:
    git clone https://gitbox.apache.org/repos/asf/commons-csv.git
--branch commons-csv-1.10.0-RC1 commons-csv-1.10.0-RC1

Maven artifacts are here:
    
https://repository.apache.org/content/repositories/orgapachecommons-1600/org/apache/commons/commons-csv/1.10.0/

These are the artifacts and their hashes:

#Release SHA-512s
#Sun Oct 16 08:32:39 EDT 2022
Apache\ Commons\
CSV-1.10.0.spdx.rdf.xml=c8cf7495637ce5282c72aa8e8ef4b93f206fbbdf21c7b337a34daa3b89af9816073d0185118cab4cdbed5104cc7f090e81efb614594079b02829c96829fe8aeb
commons-csv-1.10.0-bin.tar.gz=5b0ff3f5cc4aeff6c2aa2c11e92ea8b1b1fbbcd098ba79d97bcfd9af3ea08a9ebc922e5237bda8abcb238eb0d2fe7a5ef27534d0d796a1fc77691684a698f3d1
commons-csv-1.10.0-bin.zip=a52caa6ccda5b830c133c965fd6843d8f960e1ce3ef83decf5435022481746093cf8b4308c1fded21a168b751724b4a9f5e72f3eca34381828968bff89e7239b
commons-csv-1.10.0-bom.json=0e3e6652e520afb697461a3f6b33458dc8ef1dd157c2b1442664b8f12f7504226916e64960cea5cc92dd2b923d18047ed921be60eb7febfbeffc8ac3e3254fc0
commons-csv-1.10.0-bom.xml=91e494b94eea35ee7d85c8d8dbdc5029f981e0c5a8b1aae2b0f3cd77d1cea0fb0fc8c857d18d41ca055d12c4c4eb1adf70582f178a88751b836861c3286283db
commons-csv-1.10.0-javadoc.jar=719656f5399e0889af2e8b40f49458d0dec6e8ae85413f36caa2187235c68d9e42502bf07c66c0bcc7c5c3427bc0aac51f9897581842d303501bbff5ec73bdc0
commons-csv-1.10.0-sources.jar=c78c8fa112ef35d6d6c061b05e08816ce516a3acd56d720c59dbdaeb9d1202b16294233bd2aa2fac91cfa759456f8d7e08951876d85decdf0b8fb215a8e31754
commons-csv-1.10.0-src.tar.gz=a0d3a7aedab567aaa32a3dd1ceddeff412e686f3fd2b0805993db7e3dd4804b4d5d59e70f1500f23e8d0ba6531832e9f20d0c298bd443d9825724d5d1b1c87b3
commons-csv-1.10.0-src.zip=5cbdbcf367c9dd78e76228817592d70378b3cd8d919fdcd7df5c3eb2d7ebd8bd47fbb17376a631efb640be4115d42d3cc7aafa866bfefdfe073d5a4fcf625e0b
commons-csv-1.10.0-test-sources.jar=0d77c791934609b176457df5c60fdd5bfa3c6f392375adef81e5e7cd31de7fb25bd91b0bbe60a9ff46acb62efc2a33660adff24850370b7546a71374aa267f94
commons-csv-1.10.0-tests.jar=9a08e4d2ec865d9e1dafc01e4ee58804490b2d50f886bac47a5235581077b31a15dda5b3f8f2adb5a70be8794cca886126d980d6d3722eec2e10471fc24e1940

I have tested this with 'mvn' and 'mvn -V -Duser.name=$my_apache_id
-Prelease -Ptest-deploy -P jacoco -P japicmp clean package site
deploy' using:

Darwin *** 21.6.0 Darwin Kernel Version 21.6.0: Mon Aug 22 20:17:10
PDT 2022; root:xnu-8020.140.49~2/RELEASE_X86_64 x86_64

Apache Maven 3.8.6 

Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-16 Thread Bruno Kinoshita
+1

Built successfully from tag, using:

Apache Maven 3.8.5 (3599d3414f046de2324203b78ddcf9b5e4388aa0)
Maven home: /opt/apache-maven-3.8.5
Java version: 17.0.4, vendor: Private Build, runtime:
/usr/lib/jvm/java-17-openjdk-amd64
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "5.15.0-50-generic", arch: "amd64", family:
"unix"

Cheers
Bruno

On Mon, 17 Oct 2022 at 01:49, Gary Gregory  wrote:

> To: dev@commons.apache.org
> We have fixed a few bugs and added some enhancements since Apache
> Commons CSV 1.9.0 was released, so I would like to release Apache
> Commons CSV 1.10.0.
>
> Apache Commons CSV 1.10.0 RC1 is available for review here:
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1 (svn
> revision 57404)
>
> The Git tag commons-csv-1.10.0-RC1 commit for this RC is
> 1bd1fd8e6065da9d07b5a3a1723b059246b14001 which you can browse here:
>
> https://gitbox.apache.org/repos/asf?p=commons-csv.git;a=commit;h=1bd1fd8e6065da9d07b5a3a1723b059246b14001
> You may checkout this tag using:
> git clone https://gitbox.apache.org/repos/asf/commons-csv.git
> --branch 
> commons-csv-1.10.0-RC1 commons-csv-1.10.0-RC1
>
> Maven artifacts are here:
>
> https://repository.apache.org/content/repositories/orgapachecommons-1600/org/apache/commons/commons-csv/1.10.0/
>
> These are the artifacts and their hashes:
>
> #Release SHA-512s
> #Sun Oct 16 08:32:39 EDT 2022
> Apache\ Commons\
>
> CSV-1.10.0.spdx.rdf.xml=c8cf7495637ce5282c72aa8e8ef4b93f206fbbdf21c7b337a34daa3b89af9816073d0185118cab4cdbed5104cc7f090e81efb614594079b02829c96829fe8aeb
>
> commons-csv-1.10.0-bin.tar.gz=5b0ff3f5cc4aeff6c2aa2c11e92ea8b1b1fbbcd098ba79d97bcfd9af3ea08a9ebc922e5237bda8abcb238eb0d2fe7a5ef27534d0d796a1fc77691684a698f3d1
>
> commons-csv-1.10.0-bin.zip=a52caa6ccda5b830c133c965fd6843d8f960e1ce3ef83decf5435022481746093cf8b4308c1fded21a168b751724b4a9f5e72f3eca34381828968bff89e7239b
>
> commons-csv-1.10.0-bom.json=0e3e6652e520afb697461a3f6b33458dc8ef1dd157c2b1442664b8f12f7504226916e64960cea5cc92dd2b923d18047ed921be60eb7febfbeffc8ac3e3254fc0
>
> commons-csv-1.10.0-bom.xml=91e494b94eea35ee7d85c8d8dbdc5029f981e0c5a8b1aae2b0f3cd77d1cea0fb0fc8c857d18d41ca055d12c4c4eb1adf70582f178a88751b836861c3286283db
>
> commons-csv-1.10.0-javadoc.jar=719656f5399e0889af2e8b40f49458d0dec6e8ae85413f36caa2187235c68d9e42502bf07c66c0bcc7c5c3427bc0aac51f9897581842d303501bbff5ec73bdc0
>
> commons-csv-1.10.0-sources.jar=c78c8fa112ef35d6d6c061b05e08816ce516a3acd56d720c59dbdaeb9d1202b16294233bd2aa2fac91cfa759456f8d7e08951876d85decdf0b8fb215a8e31754
>
> commons-csv-1.10.0-src.tar.gz=a0d3a7aedab567aaa32a3dd1ceddeff412e686f3fd2b0805993db7e3dd4804b4d5d59e70f1500f23e8d0ba6531832e9f20d0c298bd443d9825724d5d1b1c87b3
>
> commons-csv-1.10.0-src.zip=5cbdbcf367c9dd78e76228817592d70378b3cd8d919fdcd7df5c3eb2d7ebd8bd47fbb17376a631efb640be4115d42d3cc7aafa866bfefdfe073d5a4fcf625e0b
>
> commons-csv-1.10.0-test-sources.jar=0d77c791934609b176457df5c60fdd5bfa3c6f392375adef81e5e7cd31de7fb25bd91b0bbe60a9ff46acb62efc2a33660adff24850370b7546a71374aa267f94
>
> commons-csv-1.10.0-tests.jar=9a08e4d2ec865d9e1dafc01e4ee58804490b2d50f886bac47a5235581077b31a15dda5b3f8f2adb5a70be8794cca886126d980d6d3722eec2e10471fc24e1940
>
> I have tested this with 'mvn' and 'mvn -V -Duser.name=$my_apache_id
> -Prelease -Ptest-deploy -P jacoco -P japicmp clean package site
> deploy' using:
>
> Darwin *** 21.6.0 Darwin Kernel Version 21.6.0: Mon Aug 22 20:17:10
> PDT 2022; root:xnu-8020.140.49~2/RELEASE_X86_64 x86_64
>
> Apache Maven 3.8.6 (84538c9988a25aec085021c365c560670ad80f63)
> Maven home: /usr/local/Cellar/maven/3.8.6/libexec
> Java version: 1.8.0_345, vendor: Homebrew, runtime:
> /usr/local/Cellar/openjdk@8
> /1.8.0+345/libexec/openjdk.jdk/Contents/Home/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "mac os x", version: "12.6", arch: "x86_64", family: "mac"
>
> Details of changes since 1.9.0 are in the release notes:
>
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1/RELEASE-NOTES.txt
>
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1/site/changes-report.html
>
> Site:
>
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1/site/index.html
> (note some *relative* links are broken and the 1.10.0 directories
> are not yet created - these will be OK once the site is deployed.)
>
> JApiCmp Report (compared to 1.9.0):
>
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1/site/japicmp.html
>
> RAT Report:
>
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1/site/rat-report.html
>
> KEYS:
>   https://downloads.apache.org/commons/KEYS
>
> Please review the release candidate and vote.
> This vote will close no sooner than 72 hours from now.
>
>   [ ] +1 Release these artifacts
>   [ ] +0 OK, but...
>   [ ] -0 OK, but really should fix...
>   [ ] -1 I oppose this release because...
>
> Thank you,
>
> Gary Gregory,
> Release Manager (using key 

Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-16 Thread Gary Gregory
I will update all of this now but not cut another RC, as you said, none of
these are blockers. I'm doing this in the release branch for convenience.

Tracking below..

On Sun, Oct 16, 2022 at 10:57 AM Alex Herbert 
wrote:
>
> Verified sha512 and asc signatures on the binary and source distributions.
>
> Built the src.tar.gz using:
>
> mvn clean install site -Pjacoco,japicmp
>
> with:
>
> Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
> Maven home: /usr/local/apache-maven-3.6.3
> Java version: 17, vendor: Oracle Corporation, runtime:
> /Library/Java/JavaVirtualMachines/jdk-17.jdk/Contents/Home
> Default locale: en_GB, platform encoding: UTF-8
> OS name: "mac os x", version: "12.6", arch: "x86_64", family: "mac"
>
> Built from the git tag using:
>
> mvn install site
>
> with:
>
> Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
> Maven home: /usr/local/apache-maven-3.6.3
> Java version: 1.8.0_301, vendor: Oracle Corporation, runtime:
> /Library/Java/JavaVirtualMachines/jdk1.8.0_301.jdk/Contents/Home/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "mac os x", version: "10.16", arch: "x86_64", family: "mac"
>
> Note: PMD: The build has warnings about deprecated rules to be removed in
> 7.0. These should be cleaned up.

I removed the deprecated rules.

>
> Site:
>
> The main index.html page has the dependency example using version 1.9.0.
> This should be 1.10.0. Given this is in the dependency information section
> under project information then perhaps it could be dropped to avoid
> maintenance.

Agreed & done.

>
> The CI management section is wrong (refers to commons-parent). This is
> picked up from the  element in the POM of commons-parent.
The
> element should be overridden in the commons-csv POM.

Done

>
> Surefire report:
>
> CSV-213 test is disabled. This is marked as fixed in CSV 1.4 in the Jira
> ticket. Perhaps it can be reenabled?

Yep & done.

There are a lot of other disabled
> tests too (29 occurrences found in the report). I did not check them all.
> This should be looked at in the future as they may actually pass and were
> not reenabled after the fix.

I was enabled to enable some CSVPrinterTest tests

* testJira135_part1
* testJira135_part3
* testRandomPostgreSqlText

The disabled PostgreSQL tests just need to be reworked to account for the
bug fix in this release.

The other disabled test either need research or underlying issues fixed.

>
> CSVFormat.printRecord(final Appendable appendable, final Object... values)
> has no coverage.

Fixed.

Perhaps this is related to the new
> CSVPrinter.printRecord(Stream) method (e.g. execution paths have switched
> to use the new method).
>
> JApiCmp - All changes OK. I checked the @since tags for the new API:
>
> DuplicateHeaderMode is a new class. The since tag is for 1.9.0. This
should
> be 1.10.0.

Fixed

>
> CSVFormat.getDuplicateHeaderMode @since tag is also 1.9.0.

Fixed

>
> new method: org.apache.commons.csv.CSVFormat$Builder
setDuplicateHeaderMode
> has no @since tag
>
> All other new API @since tags are OK.
>
> changes, rat, jira, javadoc, Checkstyle, spotbugs, PMD all OK.
>
>
> Overall no major issues, just some minor issues to clean up but I do not
> think these are blockers.
>
> +1: Release these artifacts

Thank you for the thorough review!

>
> Alex
>
>
>
>
> On Sun, 16 Oct 2022 at 13:49, Gary Gregory  wrote:
>
> > To: dev@commons.apache.org
> > We have fixed a few bugs and added some enhancements since Apache
> > Commons CSV 1.9.0 was released, so I would like to release Apache
> > Commons CSV 1.10.0.
> >
> > Apache Commons CSV 1.10.0 RC1 is available for review here:
> > https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1 (svn
> > revision 57404)
> >
> > The Git tag commons-csv-1.10.0-RC1 commit for this RC is
> > 1bd1fd8e6065da9d07b5a3a1723b059246b14001 which you can browse here:
> >
> >
https://gitbox.apache.org/repos/asf?p=commons-csv.git;a=commit;h=1bd1fd8e6065da9d07b5a3a1723b059246b14001
> > You may checkout this tag using:
> > git clone https://gitbox.apache.org/repos/asf/commons-csv.git
> > --branch 
> > commons-csv-1.10.0-RC1 commons-csv-1.10.0-RC1
> >
> > Maven artifacts are here:
> >
> >
https://repository.apache.org/content/repositories/orgapachecommons-1600/org/apache/commons/commons-csv/1.10.0/
> >
> > These are the artifacts and their hashes:
> >
> > #Release SHA-512s
> > #Sun Oct 16 08:32:39 EDT 2022
> > Apache\ Commons\
> >
> >
CSV-1.10.0.spdx.rdf.xml=c8cf7495637ce5282c72aa8e8ef4b93f206fbbdf21c7b337a34daa3b89af9816073d0185118cab4cdbed5104cc7f090e81efb614594079b02829c96829fe8aeb
> >
> >
commons-csv-1.10.0-bin.tar.gz=5b0ff3f5cc4aeff6c2aa2c11e92ea8b1b1fbbcd098ba79d97bcfd9af3ea08a9ebc922e5237bda8abcb238eb0d2fe7a5ef27534d0d796a1fc77691684a698f3d1
> >
> >

Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-16 Thread Alex Herbert
Verified sha512 and asc signatures on the binary and source distributions.

Built the src.tar.gz using:

mvn clean install site -Pjacoco,japicmp

with:

Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: /usr/local/apache-maven-3.6.3
Java version: 17, vendor: Oracle Corporation, runtime:
/Library/Java/JavaVirtualMachines/jdk-17.jdk/Contents/Home
Default locale: en_GB, platform encoding: UTF-8
OS name: "mac os x", version: "12.6", arch: "x86_64", family: "mac"

Built from the git tag using:

mvn install site

with:

Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: /usr/local/apache-maven-3.6.3
Java version: 1.8.0_301, vendor: Oracle Corporation, runtime:
/Library/Java/JavaVirtualMachines/jdk1.8.0_301.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.16", arch: "x86_64", family: "mac"

Note: PMD: The build has warnings about deprecated rules to be removed in
7.0. These should be cleaned up.

Site:

The main index.html page has the dependency example using version 1.9.0.
This should be 1.10.0. Given this is in the dependency information section
under project information then perhaps it could be dropped to avoid
maintenance.

The CI management section is wrong (refers to commons-parent). This is
picked up from the  element in the POM of commons-parent. The
element should be overridden in the commons-csv POM.

Surefire report:

CSV-213 test is disabled. This is marked as fixed in CSV 1.4 in the Jira
ticket. Perhaps it can be reenabled? There are a lot of other disabled
tests too (29 occurrences found in the report). I did not check them all.
This should be looked at in the future as they may actually pass and were
not reenabled after the fix.

CSVFormat.printRecord(final Appendable appendable, final Object... values)
has no coverage. Perhaps this is related to the new
CSVPrinter.printRecord(Stream) method (e.g. execution paths have switched
to use the new method).

JApiCmp - All changes OK. I checked the @since tags for the new API:

DuplicateHeaderMode is a new class. The since tag is for 1.9.0. This should
be 1.10.0.

CSVFormat.getDuplicateHeaderMode @since tag is also 1.9.0.

new method: org.apache.commons.csv.CSVFormat$Builder setDuplicateHeaderMode
has no @since tag

All other new API @since tags are OK.

changes, rat, jira, javadoc, Checkstyle, spotbugs, PMD all OK.


Overall no major issues, just some minor issues to clean up but I do not
think these are blockers.

+1: Release these artifacts

Alex




On Sun, 16 Oct 2022 at 13:49, Gary Gregory  wrote:

> To: dev@commons.apache.org
> We have fixed a few bugs and added some enhancements since Apache
> Commons CSV 1.9.0 was released, so I would like to release Apache
> Commons CSV 1.10.0.
>
> Apache Commons CSV 1.10.0 RC1 is available for review here:
> https://dist.apache.org/repos/dist/dev/commons/csv/1.10.0-RC1 (svn
> revision 57404)
>
> The Git tag commons-csv-1.10.0-RC1 commit for this RC is
> 1bd1fd8e6065da9d07b5a3a1723b059246b14001 which you can browse here:
>
> https://gitbox.apache.org/repos/asf?p=commons-csv.git;a=commit;h=1bd1fd8e6065da9d07b5a3a1723b059246b14001
> You may checkout this tag using:
> git clone https://gitbox.apache.org/repos/asf/commons-csv.git
> --branch 
> commons-csv-1.10.0-RC1 commons-csv-1.10.0-RC1
>
> Maven artifacts are here:
>
> https://repository.apache.org/content/repositories/orgapachecommons-1600/org/apache/commons/commons-csv/1.10.0/
>
> These are the artifacts and their hashes:
>
> #Release SHA-512s
> #Sun Oct 16 08:32:39 EDT 2022
> Apache\ Commons\
>
> CSV-1.10.0.spdx.rdf.xml=c8cf7495637ce5282c72aa8e8ef4b93f206fbbdf21c7b337a34daa3b89af9816073d0185118cab4cdbed5104cc7f090e81efb614594079b02829c96829fe8aeb
>
> commons-csv-1.10.0-bin.tar.gz=5b0ff3f5cc4aeff6c2aa2c11e92ea8b1b1fbbcd098ba79d97bcfd9af3ea08a9ebc922e5237bda8abcb238eb0d2fe7a5ef27534d0d796a1fc77691684a698f3d1
>
> commons-csv-1.10.0-bin.zip=a52caa6ccda5b830c133c965fd6843d8f960e1ce3ef83decf5435022481746093cf8b4308c1fded21a168b751724b4a9f5e72f3eca34381828968bff89e7239b
>
> commons-csv-1.10.0-bom.json=0e3e6652e520afb697461a3f6b33458dc8ef1dd157c2b1442664b8f12f7504226916e64960cea5cc92dd2b923d18047ed921be60eb7febfbeffc8ac3e3254fc0
>
> commons-csv-1.10.0-bom.xml=91e494b94eea35ee7d85c8d8dbdc5029f981e0c5a8b1aae2b0f3cd77d1cea0fb0fc8c857d18d41ca055d12c4c4eb1adf70582f178a88751b836861c3286283db
>
> commons-csv-1.10.0-javadoc.jar=719656f5399e0889af2e8b40f49458d0dec6e8ae85413f36caa2187235c68d9e42502bf07c66c0bcc7c5c3427bc0aac51f9897581842d303501bbff5ec73bdc0
>
> commons-csv-1.10.0-sources.jar=c78c8fa112ef35d6d6c061b05e08816ce516a3acd56d720c59dbdaeb9d1202b16294233bd2aa2fac91cfa759456f8d7e08951876d85decdf0b8fb215a8e31754
>
> commons-csv-1.10.0-src.tar.gz=a0d3a7aedab567aaa32a3dd1ceddeff412e686f3fd2b0805993db7e3dd4804b4d5d59e70f1500f23e8d0ba6531832e9f20d0c298bd443d9825724d5d1b1c87b3
>
>