Hi Gary,

I raised a few niggles a while back with CSV and the discussion did not receive 
a response on how to proceed.

There is the major bug CSV-248 where the CSVRecord is not Serializable [1]. 
This requires a decision on what to do to fix it. This bug is still present in 
1.8 RC1 as found by FindBugs [2].

From what I can see the CSVRecord maintains a reference to the CSVParser. This 
chain of objects maintained in memory is not serializable and leads back to the 
original input Reader.

I can see from the JApiCmp report that the serial version id was changed for 
CSVRecord this release so there is still an intention to support serialization. 
So this should be a blocker.

I could not find a serialisation test in the unit tests for CSVRecord. This 
quick test added to CSVRecordTest fails:


@Test
public void testSerialization() throws IOException {
    CSVRecord shortRec;
    try (final CSVParser parser = CSVParser.parse("a,b", 
CSVFormat.newFormat(','))) {
        shortRec = parser.iterator().next();
    }
    final ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
        oos.writeObject(shortRec);
    }
}

mvn test -Dtest=CSVRecordTest

[ERROR] testSerialization  Time elapsed: 0.032 s  <<< ERROR!
java.io.NotSerializableException: org.apache.commons.csv.CSVParser
        at 
org.apache.commons.csv.CSVRecordTest.testSerialization(CSVRecordTest.java:235)

If I mark the field csvParser as transient it passes. So this is a problem as 
raised by FindBugs.



I also raised [3] the strange implementation of the CSVParser getHeaderNames() 
which ignores null headers as they cannot be used as a key into the map. 
However the list of column names could contain the null values. This test 
currently fails:

@Test
public void testHeaderNamesWithNull() throws IOException {
    final Reader in = new StringReader("header1,null,header3\n1,2,3\n4,5,6");
    final Iterator<CSVRecord> records = CSVFormat.DEFAULT.withHeader()
                                                         .withNullString("null")
                                                         
.withAllowMissingColumnNames()
                                                         .parse(in).iterator();
    final CSVRecord record = records.next();
    assertEquals(Arrays.asList("header1", null, "header3"), 
record.getParser().getHeaderNames());
}

I am not saying it should pass but at least the documentation should state the 
behaviour in this edge case. That is the list of header names may be shorter 
than the number of columns when the parser is configured to allow null headers. 
I’ve not raised a bug ticket for this as it is open to opinion if this is by 
design or actually a bug. This issue is still present in 1.8 RC1.

Previously I suggested documentation changes for this and another edge case 
using the header map to be added to the javadoc for getHeaderNames() and 
getHeaderMap():

- Documentation:

The mapping is only guaranteed to be a one-to-one mapping if the record was 
created with a format that does not allow duplicate or null header names. Null 
headers are excluded from the map and duplicates can only map to 1 column.


- Bug / Documentation

The CSVParser only stores headers names in a list of header names if they are 
not null. So the list can be shorter than the number of columns if you use a 
format that allows empty headers and contains null column names.


The ultimate result is that we should document that the purpose of the header 
names is to provide a list of non-null header names in the order they occur in 
the header and thus represent keys that can be used in the header map. In 
certain circumstances there may be more columns in the data than there are 
header names.


Alex


[1] https://issues.apache.org/jira/browse/CSV-248 
<https://issues.apache.org/jira/browse/CSV-248>

[2] 
https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/site/findbugs.html 
<https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/site/findbugs.html>

[3] https://markmail.org/message/woti2iymecosihx6 
<https://markmail.org/message/woti2iymecosihx6>



> On 18 Jan 2020, at 17:52, Gary Gregory <ggreg...@apache.org> wrote:
> 
> We have fixed quite a few bugs and added some significant enhancements
> since Apache Commons CSV 1.7 was released, so I would like to release
> Apache Commons CSV 1.8.
> 
> Apache Commons CSV 1.8 RC1 is available for review here:
>    https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1 (svn
> revision 37670)
> 
> The Git tag commons-csv-1.8-RC1 commit for this RC is
> c1c8b32809df295423fc897eae0e8b22bfadfe27 which you can browse here:
> 
> https://gitbox.apache.org/repos/asf?p=commons-csv.git;a=commit;h=c1c8b32809df295423fc897eae0e8b22bfadfe27
> You may checkout this tag using:
>    git clone https://gitbox.apache.org/repos/asf/commons-csv.git --branch
> commons-csv-1.8-RC1 commons-csv-1.8-RC1
> 
> Maven artifacts are here:
> 
> https://repository.apache.org/content/repositories/orgapachecommons-1489/org/apache/commons/commons-csv/1.8/
> 
> These are the artifacts and their hashes:
> 
> #Release SHA-512s
> #Sat Jan 18 12:01:01 EST 2020
> commons-csv-1.8-bin.tar.gz=85a876b41aa9ce61f7f533c46df48754e05bddbdef892aed2bac7674b5ea13855de25576364649048dbb55e7fb18a354305b56cb697e85df68a87113954128ed
> commons-csv-1.8-bin.zip=9b86a22367c84a0c96a457e8495f81113b64ae5501eabbe2ea4137654b6baa05bcc24a19626452b80e30ff2dd39214840c6ec534be1db9eec2d12c93eeab2de1
> commons-csv-1.8-javadoc.jar=a481149dfeffe4e915d5d2e846831994223fc7d09ed2b61398c68eed5a672654a141fa6de705aa743d0b5af6fd24a3f4b0d5e7cee238a1f7642673288d4a985d
> commons-csv-1.8-sources.jar=f68e50f8a025a8b2a570b46905b22b5753a83c19bee5c38103d92ec1e47b4e0d27353e7931961e74fe8e67c4909b0ece6ede49a585d2f9180a7a15458b020bc0
> commons-csv-1.8-src.tar.gz=c3268f456978e75c19134e35d05bff77002b2fa7439be2623d58a102cab4f93b0913a1a789f962aafcd677938be1547f47c5dd86e3ea08b7bf8f0420e81beb7a
> commons-csv-1.8-src.zip=ebb32f2406b6afa48472283612c7d0a94f932d7ae7a72ad1d239e2249de12f1e0da7f61d34d95d66b1d1fe95b66b6316af9d1fc93734f610cce4a7163b0900d0
> commons-csv-1.8-test-sources.jar=13508d417e23a5d3f575a39b3aedb20d0d834335d7994f3045fff316e6b12e50cbf9afe908271357b4091d981c178a28dc61bcdb8db60bd0ada07d3de59eacbf
> commons-csv-1.8-tests.jar=901889d4be203c2044df89b7e051d21e7b806e5e56438bf9a7483b334331da94b71de1a129c8bf7967e02479a0922bb834ce37eaabf6662702e147813ecb2b7f
> 
> I have tested this with ' mvn -V -Prelease -Ptest-deploy -P jacoco -P
> japicmp clean package site deploy' using:
> 
> Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
> Maven home: C:\Java\apache-maven-3.6.3\bin\..
> Java version: 1.8.0_241, vendor: Oracle Corporation, runtime: C:\Program
> Files\Java\jdk1.8.0_241\jre
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
> 
> Additional tests with 'mvn -V clean test' using:
> 
> Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
> Maven home: C:\Java\apache-maven-3.6.3\bin\..
> Java version: 1.8.0_241, vendor: Oracle Corporation, runtime: C:\Program
> Files\Java\jdk1.8.0_241\jre
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
> 
> Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
> Maven home: C:\Java\apache-maven-3.6.3\bin\..
> Java version: 11.0.6, vendor: Oracle Corporation, runtime: C:\Program
> Files\Java\jdk-11.0.6
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
> 
> Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
> Maven home: C:\Java\apache-maven-3.6.3\bin\..
> Java version: 12.0.2, vendor: Oracle Corporation, runtime: C:\Program
> Files\Java\jdk-12.0.2
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
> 
> Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
> Maven home: C:\Java\apache-maven-3.6.3\bin\..
> Java version: 13.0.2, vendor: Oracle Corporation, runtime: C:\Program
> Files\Java\jdk-13.0.2
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
> 
> Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
> Maven home: C:\Java\apache-maven-3.6.3\bin\..
> Java version: 14-ea, vendor: Oracle Corporation, runtime: C:\Program
> Files\Java\openjdk\jdk-14
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
> 
> JaCoCo fails on Java 15-EA because it does not know about class file major
> version 59:
> 
> Caused by: java.lang.IllegalArgumentException: Unsupported class file major
> version 59
>        at
> org.jacoco.agent.rt.internal_43f5073.asm.ClassReader.<init>(ClassReader.java:195)
> 
> Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
> Maven home: C:\Java\apache-maven-3.6.3\bin\..
> Java version: 15-ea, vendor: Oracle Corporation, runtime: C:\Program
> Files\Java\openjdk\jdk-15
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
> 
> Details of changes since 1.7 are in the release notes:
> 
> https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/RELEASE-NOTES.txt
> 
> https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/site/changes-report.html
> 
> Site:
> 
> https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/site/index.html
>    (note some *relative* links are broken and the 1.8 directories are not
> yet created - these will be OK once the site is deployed.)
> 
> JApiCmp Report (compared to 1.7):
> 
> https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/site/japicmp.html
> 
> RAT Report:
> 
> https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/site/rat-report.html
> 
> KEYS:
>  https://www.apache.org/dist/commons/KEYS
> 
> Please review the release candidate and vote.
> This vote will close no sooner that 72 hours from now.
> 
>  [ ] +1 Release these artifacts
>  [ ] +0 OK, but...
>  [ ] -0 OK, but really should fix...
>  [ ] -1 I oppose this release because...
> 
> Thank you,
> 
> Gary Gregory,
> Release Manager (using key 86fdc7e2a11262cb)
> 
> For following is intended as a helper and refresher for reviewers.
> 
> Validating a release candidate
> ==============================
> 
> These guidelines are NOT complete.
> 
> Requirements: Git, Java, Maven.
> 
> You can validate a release from a release candidate (RC) tag as follows.
> 
> 1) Clone and checkout the RC tag
> 
> git clone https://gitbox.apache.org/repos/asf/commons-csv.git --branch
> commons-csv-1.8-RC1 commons-csv-1.8-RC1
> cd commons-csv-1.8-RC1
> 
> 2) Check Apache licenses
> 
> This step is not required if the site includes a RAT report page which you
> then must check.
> 
> mvn apache-rat:check
> 
> 3) Check binary compatibility
> 
> Older components still use Apache Clirr:
> 
> This step is not required if the site includes a Clirr report page which
> you then must check.
> 
> mvn clirr:check
> 
> Newer components use JApiCmp with the japicmp Maven Profile:
> 
> This step is not required if the site includes a JApiCmp report page which
> you then must check.
> 
> mvn install -DskipTests -P japicmp japicmp:cmp
> 
> 4) Build the package
> 
> mvn -V clean package
> 
> You can record the Maven and Java version produced by -V in your VOTE reply.
> To gather OS information from a command line:
> Windows: ver
> Linux: uname -a
> 
> 5) Build the site for a single module project
> 
> Note: Some plugins require the components to be installed instead of
> packaged.
> 
> mvn site
> Check the site reports in:
> - Windows: target\site\index.html
> - Linux: target/site/index.html
> 
> 6) Build the site for a multi-module project
> 
> mvn site
> mvn site:stage
> Check the site reports in:
> - Windows: target\site\index.html
> - Linux: target/site/index.html
> 
> -the end-

Reply via email to