On Mon, Jan 20, 2020 at 6:28 PM Gary Gregory <garydgreg...@gmail.com> wrote:
> On Sun, Jan 19, 2020 at 7:39 AM Alex Herbert <alex.d.herb...@gmail.com> > wrote: > >> Hi Gary, >> >> I raised a few niggles a while back with CSV and the discussion did not >> receive a response on how to proceed. >> >> There is the major bug CSV-248 where the CSVRecord is not Serializable >> [1]. This requires a decision on what to do to fix it. This bug is still >> present in 1.8 RC1 as found by FindBugs [2]. >> >> From what I can see the CSVRecord maintains a reference to the CSVParser. >> This chain of objects maintained in memory is not serializable and leads >> back to the original input Reader. >> >> I can see from the JApiCmp report that the serial version id was changed >> for CSVRecord this release so there is still an intention to support >> serialization. So this should be a blocker. >> >> I could not find a serialisation test in the unit tests for CSVRecord. >> This quick test added to CSVRecordTest fails: >> >> >> @Test >> public void testSerialization() throws IOException { >> CSVRecord shortRec; >> try (final CSVParser parser = CSVParser.parse("a,b", >> CSVFormat.newFormat(','))) { >> shortRec = parser.iterator().next(); >> } >> final ByteArrayOutputStream out = new ByteArrayOutputStream(); >> try (ObjectOutputStream oos = new ObjectOutputStream(out)) { >> oos.writeObject(shortRec); >> } >> } >> >> mvn test -Dtest=CSVRecordTest >> >> [ERROR] testSerialization Time elapsed: 0.032 s <<< ERROR! >> java.io.NotSerializableException: org.apache.commons.csv.CSVParser >> at >> org.apache.commons.csv.CSVRecordTest.testSerialization(CSVRecordTest.java:235) >> >> If I mark the field csvParser as transient it passes. So this is a >> problem as raised by FindBugs. >> > > Making the field transient would indeed fix this test but... some of the > record APIs would then fail with NPEs... so we would be kicking the can > down the road basically. Making the parser serializable is going in the > wrong direction feature-wise IMO, so let's not go there. A serialization > proxy would be less worse but should we provide such a thing? I would say > no. I am OK with making the field transient despite the can kicking, so > let's do that. > I suppose we should bump the serialVersionUID... Gary > > Gary > > >> >> >> I also raised [3] the strange implementation of the CSVParser >> getHeaderNames() which ignores null headers as they cannot be used as a key >> into the map. However the list of column names could contain the null >> values. This test currently fails: >> >> @Test >> public void testHeaderNamesWithNull() throws IOException { >> final Reader in = new >> StringReader("header1,null,header3\n1,2,3\n4,5,6"); >> final Iterator<CSVRecord> records = CSVFormat.DEFAULT.withHeader() >> >> .withNullString("null") >> >> .withAllowMissingColumnNames() >> >> .parse(in).iterator(); >> final CSVRecord record = records.next(); >> assertEquals(Arrays.asList("header1", null, "header3"), >> record.getParser().getHeaderNames()); >> } >> >> I am not saying it should pass but at least the documentation should >> state the behaviour in this edge case. That is the list of header names may >> be shorter than the number of columns when the parser is configured to >> allow null headers. I’ve not raised a bug ticket for this as it is open to >> opinion if this is by design or actually a bug. This issue is still present >> in 1.8 RC1. >> >> Previously I suggested documentation changes for this and another edge >> case using the header map to be added to the javadoc for getHeaderNames() >> and getHeaderMap(): >> >> - Documentation: >> >> The mapping is only guaranteed to be a one-to-one mapping if the record >> was created with a format that does not allow duplicate or null header >> names. Null headers are excluded from the map and duplicates can only map >> to 1 column. >> >> >> - Bug / Documentation >> >> The CSVParser only stores headers names in a list of header names if they >> are not null. So the list can be shorter than the number of columns if you >> use a format that allows empty headers and contains null column names. >> >> >> The ultimate result is that we should document that the purpose of the >> header names is to provide a list of non-null header names in the order >> they occur in the header and thus represent keys that can be used in the >> header map. In certain circumstances there may be more columns in the data >> than there are header names. >> >> >> Alex >> >> >> [1] https://issues.apache.org/jira/browse/CSV-248 < >> https://issues.apache.org/jira/browse/CSV-248> >> >> [2] >> https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/site/findbugs.html >> < >> https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/site/findbugs.html >> > >> >> [3] https://markmail.org/message/woti2iymecosihx6 < >> https://markmail.org/message/woti2iymecosihx6> >> >> >> >> > On 18 Jan 2020, at 17:52, Gary Gregory <ggreg...@apache.org> wrote: >> > >> > We have fixed quite a few bugs and added some significant enhancements >> > since Apache Commons CSV 1.7 was released, so I would like to release >> > Apache Commons CSV 1.8. >> > >> > Apache Commons CSV 1.8 RC1 is available for review here: >> > https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1 (svn >> > revision 37670) >> > >> > The Git tag commons-csv-1.8-RC1 commit for this RC is >> > c1c8b32809df295423fc897eae0e8b22bfadfe27 which you can browse here: >> > >> > >> https://gitbox.apache.org/repos/asf?p=commons-csv.git;a=commit;h=c1c8b32809df295423fc897eae0e8b22bfadfe27 >> > You may checkout this tag using: >> > git clone https://gitbox.apache.org/repos/asf/commons-csv.git >> --branch >> > commons-csv-1.8-RC1 commons-csv-1.8-RC1 >> > >> > Maven artifacts are here: >> > >> > >> https://repository.apache.org/content/repositories/orgapachecommons-1489/org/apache/commons/commons-csv/1.8/ >> > >> > These are the artifacts and their hashes: >> > >> > #Release SHA-512s >> > #Sat Jan 18 12:01:01 EST 2020 >> > >> commons-csv-1.8-bin.tar.gz=85a876b41aa9ce61f7f533c46df48754e05bddbdef892aed2bac7674b5ea13855de25576364649048dbb55e7fb18a354305b56cb697e85df68a87113954128ed >> > >> commons-csv-1.8-bin.zip=9b86a22367c84a0c96a457e8495f81113b64ae5501eabbe2ea4137654b6baa05bcc24a19626452b80e30ff2dd39214840c6ec534be1db9eec2d12c93eeab2de1 >> > >> commons-csv-1.8-javadoc.jar=a481149dfeffe4e915d5d2e846831994223fc7d09ed2b61398c68eed5a672654a141fa6de705aa743d0b5af6fd24a3f4b0d5e7cee238a1f7642673288d4a985d >> > >> commons-csv-1.8-sources.jar=f68e50f8a025a8b2a570b46905b22b5753a83c19bee5c38103d92ec1e47b4e0d27353e7931961e74fe8e67c4909b0ece6ede49a585d2f9180a7a15458b020bc0 >> > >> commons-csv-1.8-src.tar.gz=c3268f456978e75c19134e35d05bff77002b2fa7439be2623d58a102cab4f93b0913a1a789f962aafcd677938be1547f47c5dd86e3ea08b7bf8f0420e81beb7a >> > >> commons-csv-1.8-src.zip=ebb32f2406b6afa48472283612c7d0a94f932d7ae7a72ad1d239e2249de12f1e0da7f61d34d95d66b1d1fe95b66b6316af9d1fc93734f610cce4a7163b0900d0 >> > >> commons-csv-1.8-test-sources.jar=13508d417e23a5d3f575a39b3aedb20d0d834335d7994f3045fff316e6b12e50cbf9afe908271357b4091d981c178a28dc61bcdb8db60bd0ada07d3de59eacbf >> > >> commons-csv-1.8-tests.jar=901889d4be203c2044df89b7e051d21e7b806e5e56438bf9a7483b334331da94b71de1a129c8bf7967e02479a0922bb834ce37eaabf6662702e147813ecb2b7f >> > >> > I have tested this with ' mvn -V -Prelease -Ptest-deploy -P jacoco -P >> > japicmp clean package site deploy' using: >> > >> > Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f) >> > Maven home: C:\Java\apache-maven-3.6.3\bin\.. >> > Java version: 1.8.0_241, vendor: Oracle Corporation, runtime: C:\Program >> > Files\Java\jdk1.8.0_241\jre >> > Default locale: en_US, platform encoding: Cp1252 >> > OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows" >> > >> > Additional tests with 'mvn -V clean test' using: >> > >> > Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f) >> > Maven home: C:\Java\apache-maven-3.6.3\bin\.. >> > Java version: 1.8.0_241, vendor: Oracle Corporation, runtime: C:\Program >> > Files\Java\jdk1.8.0_241\jre >> > Default locale: en_US, platform encoding: Cp1252 >> > OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows" >> > >> > Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f) >> > Maven home: C:\Java\apache-maven-3.6.3\bin\.. >> > Java version: 11.0.6, vendor: Oracle Corporation, runtime: C:\Program >> > Files\Java\jdk-11.0.6 >> > Default locale: en_US, platform encoding: Cp1252 >> > OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows" >> > >> > Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f) >> > Maven home: C:\Java\apache-maven-3.6.3\bin\.. >> > Java version: 12.0.2, vendor: Oracle Corporation, runtime: C:\Program >> > Files\Java\jdk-12.0.2 >> > Default locale: en_US, platform encoding: Cp1252 >> > OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows" >> > >> > Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f) >> > Maven home: C:\Java\apache-maven-3.6.3\bin\.. >> > Java version: 13.0.2, vendor: Oracle Corporation, runtime: C:\Program >> > Files\Java\jdk-13.0.2 >> > Default locale: en_US, platform encoding: Cp1252 >> > OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows" >> > >> > Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f) >> > Maven home: C:\Java\apache-maven-3.6.3\bin\.. >> > Java version: 14-ea, vendor: Oracle Corporation, runtime: C:\Program >> > Files\Java\openjdk\jdk-14 >> > Default locale: en_US, platform encoding: Cp1252 >> > OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows" >> > >> > JaCoCo fails on Java 15-EA because it does not know about class file >> major >> > version 59: >> > >> > Caused by: java.lang.IllegalArgumentException: Unsupported class file >> major >> > version 59 >> > at >> > >> org.jacoco.agent.rt.internal_43f5073.asm.ClassReader.<init>(ClassReader.java:195) >> > >> > Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f) >> > Maven home: C:\Java\apache-maven-3.6.3\bin\.. >> > Java version: 15-ea, vendor: Oracle Corporation, runtime: C:\Program >> > Files\Java\openjdk\jdk-15 >> > Default locale: en_US, platform encoding: Cp1252 >> > OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows" >> > >> > Details of changes since 1.7 are in the release notes: >> > >> > >> https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/RELEASE-NOTES.txt >> > >> > >> https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/site/changes-report.html >> > >> > Site: >> > >> > >> https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/site/index.html >> > (note some *relative* links are broken and the 1.8 directories are >> not >> > yet created - these will be OK once the site is deployed.) >> > >> > JApiCmp Report (compared to 1.7): >> > >> > >> https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/site/japicmp.html >> > >> > RAT Report: >> > >> > >> https://dist.apache.org/repos/dist/dev/commons/csv/1.8-RC1/site/rat-report.html >> > >> > KEYS: >> > https://www.apache.org/dist/commons/KEYS >> > >> > Please review the release candidate and vote. >> > This vote will close no sooner that 72 hours from now. >> > >> > [ ] +1 Release these artifacts >> > [ ] +0 OK, but... >> > [ ] -0 OK, but really should fix... >> > [ ] -1 I oppose this release because... >> > >> > Thank you, >> > >> > Gary Gregory, >> > Release Manager (using key 86fdc7e2a11262cb) >> > >> > For following is intended as a helper and refresher for reviewers. >> > >> > Validating a release candidate >> > ============================== >> > >> > These guidelines are NOT complete. >> > >> > Requirements: Git, Java, Maven. >> > >> > You can validate a release from a release candidate (RC) tag as follows. >> > >> > 1) Clone and checkout the RC tag >> > >> > git clone https://gitbox.apache.org/repos/asf/commons-csv.git --branch >> > commons-csv-1.8-RC1 commons-csv-1.8-RC1 >> > cd commons-csv-1.8-RC1 >> > >> > 2) Check Apache licenses >> > >> > This step is not required if the site includes a RAT report page which >> you >> > then must check. >> > >> > mvn apache-rat:check >> > >> > 3) Check binary compatibility >> > >> > Older components still use Apache Clirr: >> > >> > This step is not required if the site includes a Clirr report page which >> > you then must check. >> > >> > mvn clirr:check >> > >> > Newer components use JApiCmp with the japicmp Maven Profile: >> > >> > This step is not required if the site includes a JApiCmp report page >> which >> > you then must check. >> > >> > mvn install -DskipTests -P japicmp japicmp:cmp >> > >> > 4) Build the package >> > >> > mvn -V clean package >> > >> > You can record the Maven and Java version produced by -V in your VOTE >> reply. >> > To gather OS information from a command line: >> > Windows: ver >> > Linux: uname -a >> > >> > 5) Build the site for a single module project >> > >> > Note: Some plugins require the components to be installed instead of >> > packaged. >> > >> > mvn site >> > Check the site reports in: >> > - Windows: target\site\index.html >> > - Linux: target/site/index.html >> > >> > 6) Build the site for a multi-module project >> > >> > mvn site >> > mvn site:stage >> > Check the site reports in: >> > - Windows: target\site\index.html >> > - Linux: target/site/index.html >> > >> > -the end- >> >>