Agreed, we should try to avoid including test data that requires attribution - users of Commons CSV would then seemingly have to propagate the NOTICE even though the data is not used at runtime.
Is there a reason why we can't create equivalent test CSV files (using the same syntax and escapes), or is the whole point that the tests must be able to use CSV files in the wild "as-is"? As the original URLs have gone 404 we can't just rely on them either (in addition to the fact that the build would then require Internet access).

Let's see if we get a reply this time :) If not, then I think we will simply need to remove the ferc.gov test files and the derived test class.

I can help re-create an equivalent file - copyright only exists if the content can be considered a Work as an intellectual creation - and a few "," without the original text will usually not meet that criterion.

(You could argue that even this test data is not considered a Work - but it could be considered a Database Design under EU Intellectual Property rights - so just the particular column name combination must be changed)

On 25 May 2016 at 11:15, sebb <seb...@gmail.com> wrote:
> On 25 May 2016 at 00:13, Gary Gregory <garydgreg...@gmail.com> wrote:
>> On Tue, May 24, 2016 at 3:29 PM, sebb <seb...@gmail.com> wrote:
>>
>>> On 24 May 2016 at 23:14, Gary Gregory <garydgreg...@gmail.com> wrote:
>>> > Ok, so maybe I just exclude these files from the release and add some
>>> > Assume calls in the tests. I can do that...
>>>
>>> As I wrote, the Unit test clearly contains data extracted from the
>>> FERC CSV files.
>>> So the entire class needs to be dropped.
>>>
>>> Note: there is a lot of potential CSV test data on the gov.uk website.
>>> I raised LEGAL-255 about the licence.
>>>
>>
>> This is promising:
>>
>> https://www.gov.uk/government/statistical-data-sets/house-price-index-background-tables
>
> This page requests attribution, unlike the page I noted in the JIRA initially.
> So I think it should be avoided.
>
>> {quote}
>> Publishing or making use of our data
>> You will need to add the following attribution statement if you use or
>> publish our House Price index background tables:
>>
>> Data produced by Land Registry © Crown copyright 2016.
>> If you publish the data you should include information regarding the nature
>> of the data and any relevant dates for the period of time covered.
>> An example is set out below:
>>
>> This data covers the transactions received at Land Registry in the period
>> [first working day of the month] to [last working day of the month]. ©
>> Crown copyright 2016.
>> Open Government Licence
>> The data published on this page is available for use and reuse under the
>> Open Government Licence (OGL). This licence is for public bodies to make
>> their data available for reuse.
>> {quote}
>
>> Gary
>>
>>
>>>
>>> There are probably CSV files on other government websites around the world.
>>>
>>> > On May 24, 2016 3:04 PM, "sebb" <seb...@gmail.com> wrote:
>>> >
>>> >> On 24 May 2016 at 22:53, Gary Gregory <garydgreg...@gmail.com> wrote:
>>> >> > On Tue, May 24, 2016 at 2:46 PM, sebb <seb...@gmail.com> wrote:
>>> >> >
>>> >> >> On 24 May 2016 at 22:09, Gary Gregory <garydgreg...@gmail.com> wrote:
>>> >> >> > On Mon, May 23, 2016 at 1:47 PM, Gary Gregory <garydgreg...@gmail.com> wrote:
>>> >> >> >
>>> >> >> >> I just added a little more "docs" in revision 1745267. Awaiting feedback.
>>> >> >> >>
>>> >> >>
>>> >> >> The commit is a comment in the POM in the RAT excludes section.
>>> >> >>
>>> >> >> I don't see how that relates to the FERC license discussion.
>>> >> >> Since they are test data files one would expect them to be exempt from
>>> >> >> RAT anyway.
>>> >> >>
>>> >> >> > Any thoughts? No need to start another RC until this is squared away.
>>> >> >>
>>> >> >> If we cannot establish what the license is then we don't know whether
>>> >> >> they are even AL compatible, let alone what attribution if any is
>>> >> >> required.
>>> >> >>
>>> >> >> Do we really need the files?
>>> >> >>
>>> >> >> What benefit do they provide?
>>> >> >>
>>> >> >
>>> >> > These files show that we are testing against real world data, as opposed to
>>> >> > using our own CSVPrinter to create test data or cobbling up tests by hand.
>>> >> > Using our own CSVPrinter would provide that we can parse what we can print,
>>> >> > a useful test but this does not show "real" data. Hand-created test files
>>> >> > may have a subjective bias as to the range of features tested.
>>> >> >
>>> >> > At least it is a sanity check, and for more, using real world data may also
>>> >> > end up letting us find bugs, edge cases and discover the need for new
>>> >> > features.
>>> >>
>>> >> Then if we want to keep them, we need to establish what the license is.
>>> >> Or find some other CSV files that have a suitable license.
>>> >>
>>> >> Note that the FercGovTest.java file contains details which appear to
>>> >> be extracted from the test files.
>>> >>
>>> >> If we want to release CSV in the near future, I suggest we remove the
>>> >> data files and the test case for now.
>>> >>
>>> >> Meanwhile someone can try again to contact FERC. I note that there is
>>> >> a phone number, so perhaps there is someone in the US who can progress
>>> >> this.
>>> >>
>>> >> The relevant page is at:
>>> >>
>>> >> http://www.ferc.gov/docs-filing/eqr/xml.asp
>>> >>
>>> >> I also note that there are two e-mail addresses on the page.
>>> >> They are different from the one mentioned in the JIRA from which there
>>> >> was no response, so might be worth trying.
>>> >>
>>> >> > Gary
>>> >> >
>>> >> >
>>> >> >>
>>> >> >> > Gary
>>> >> >> >
>>> >> >> >>
>>> >> >> >> Gary
>>> >> >> >>
>>> >> >> >> On Mon, May 23, 2016 at 10:49 AM, Gary Gregory <garydgreg...@gmail.com> wrote:
>>> >> >> >>
>>> >> >> >>> Hi All:
>>> >> >> >>>
>>> >> >> >>> These files were discussed here:
>>> >> >> >>> https://issues.apache.org/jira/browse/LEGAL-175
>>> >> >> >>>
>>> >> >> >>> I never got a reply IIRC from FERC, see the above link for my email.
>>> >> >> >>>
>>> >> >> >>> So we decided to ship the files in the configuration as they still are in
>>> >> >> >>> 1.4-RC1.
>>> >> >> >>>
>>> >> >> >>> I looks like we want to do it differently now, so:
>>> >> >> >>>
>>> >> >> >>> - I removed the entries from the NOTICE file.
>>> >> >> >>>
>>> >> >> >>> - I added a comment in the POM RAT excludes section.
>>> >> >> >>>
>>> >> >> >>> FWIW: Here are the current FERC examples:
>>> >> >> >>> http://www.ferc.gov/docs-filing/eqr/xml.asp but still no license or
>>> >> >> >>> copyright on the files.
>>> >> >> >>>
>>> >> >> >>> In the future (>1.4), I'd like to use these files and keep gathering a
>>> >> >> >>> pile of IRL examples.
>>> >> >> >>>
>>> >> >> >>> Is the state of trunk (now at revision 1745238) OK for an RC2?
>>> >> >> >>>
>>> >> >> >>> Gary
>>> >> >> >>>
>>> >> >> >>> On Mon, May 23, 2016 at 7:14 AM, Benedikt Ritter <brit...@apache.org> wrote:
>>> >> >> >>>
>>> >> >> >>>> Hi,
>>> >> >> >>>>
>>> >> >> >>>> Moving this away from the vote thread...
>>> >> >> >>>>
>>> >> >> >>>> I'm pretty sure we've already discussed the problem with the ferc.gov file
>>> >> >> >>>> but I can't find a reference in the archives. Can anybody help?
>>> >> >> >>>> I agree with Stian, that we should better document this.
>>> >> >> >>>> I'm not sure whether this is a blocker, since 1.3 has been released in the
>>> >> >> >>>> same state wrt NOTICE.txt.
>>> >> >> >>>>
>>> >> >> >>>> Regards,
>>> >> >> >>>> Benedikt
>>> >> >> >>>>
>>> >> >> >>>> ---------- Forwarded message ---------
>>> >> >> >>>> From: Stian Soiland-Reyes <st...@apache.org>
>>> >> >> >>>> Date: Mon, 23 May 2016 at 15:59
>>> >> >> >>>> Subject: Re: [VOTE] Release Apache Commons CSV 1.4 RC1
>>> >> >> >>>> To: Commons Developers List <dev@commons.apache.org>
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> On 23 May 2016 at 06:53, Gary Gregory <garydgreg...@gmail.com> wrote:
>>> >> >> >>>> > Apache Commons CSV 1.4 RC1 is available for review here:
>>> >> >> >>>> >
>>> >> >> >>>> > https://dist.apache.org/repos/dist/dev/commons/csv/1.4-RC1/
>>> >> >> >>>> > (revision 13733)
>>> >> >> >>>>
>>> >> >> >>>> I assume you mean
>>> >> >> >>>>
>>> >> >> >>>> https://dist.apache.org/repos/dist/dev/commons/csv/CSV_1_4_RC1/
>>> >> >> >>>> (@13733)
>>> >> >> >>>>
>>> >> >> >>>> My vote: -1 due to NOTICE issues.
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> Checked:
>>> >> >> >>>>
>>> >> >> >>>> +1 Signatures
>>> >> >> >>>> +1 Hashes
>>> >> >> >>>> +1 mvn clean install
>>> >> >> >>>> +1 mvn apache-rat:check
>>> >> >> >>>> -1 NOTICE is outdated, and material copyright
>>> >> >> >>>>
>>> >> >> >>>> Notice includes:
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> src/main/resources/contract.txt
>>> >> >> >>>> This file was downloaded from
>>> >> >> >>>> http://www.ferc.gov/docs-filing/eqr/soft-tools/sample-csv/contract.txt
>>> >> >> >>>> and contains neither copyright notice nor license.
>>> >> >> >>>>
>>> >> >> >>>> src/main/resources/transaction.txt
>>> >> >> >>>> This file was downloaded from
>>> >> >> >>>> http://www.ferc.gov/docs-filing/eqr/soft-tools/sample-csv/transaction.txt
>>> >> >> >>>> and contains neither copyright notice nor license.
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> (I don't care that the URLs are 404)
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> however these files are now in
>>> >> >> >>>>
>>> >> >> >>>> ./src/test/resources/ferc.gov/
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> We can't include files that "contain neither copyright notice nor
>>> >> >> >>>> license" - that means regular copyright remains and we don't have
>>> >> >> >>>> permission to use it.
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> As a US government organization, ferc.gov SHOULD be publishing under
>>> >> >> >>>> Public Domain - but we can't include their work if that has not been
>>> >> >> >>>> expressed.
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> Note that I have not searched the email archive or Jira in case the IP
>>> >> >> >>>> of these files have already been cleared.
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> This bit of the NOTICE should be removed as it is not a required
>>> >> >> >>>> attribution notice. Move it to a comment in the apache-rat exclude in
>>> >> >> >>>> the pom.xml
>>> >> >> >>>>
>>> >> >> >>>> src/test/resources/CSVFileParser/bom.csv
>>> >> >> >>>> src/test/resources/CSVFileParser/test.csv
>>> >> >> >>>> src/test/resources/CSVFileParser/test_default.txt
>>> >> >> >>>> src/test/resources/CSVFileParser/test_default_comment.txt
>>> >> >> >>>> src/test/resources/CSVFileParser/test_rfc4180.txt
>>> >> >> >>>> src/test/resources/CSVFileParser/test_rfc4180_trim.txt
>>> >> >> >>>> src/test/resources/CSVFileParser/testCSV85.csv
>>> >> >> >>>> src/test/resources/CSVFileParser/testCSV85_default.txt
>>> >> >> >>>> src/test/resources/CSVFileParser/testCSV85_ignoreEmpty.txt
>>> >> >> >>>> These files are used as test data and test result specifications.
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> Checked using Ubuntu 16:04 x/64:
>>> >> >> >>>>
>>> >> >> >>>> Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5;
>>> >> >> >>>> 2015-11-10T16:41:47+00:00)
>>> >> >> >>>> Maven home: /home/stain/software/maven
>>> >> >> >>>> Java version: 1.8.0_91, vendor: Oracle Corporation
>>> >> >> >>>> Java home: /usr/lib/jvm/java-8-openjdk-amd64/jre
>>> >> >> >>>> Default locale: en_GB, platform encoding: UTF-8
>>> >> >> >>>> OS name: "linux", version: "4.4.0-22-generic", arch: "amd64", family:
>>> >> >> >>>> "unix"
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> > commons-csv-1.4-bin.tar.gz
>>> >> >> >>>> > (SHA1: 19806d3a6b2f8c6569f50b294da1d3f3a5be4429)
>>> >> >> >>>> > commons-csv-1.4-bin.zip
>>> >> >> >>>> > (SHA1: f551f471081c75a4cb6710b9981a3e0c858debd3)
>>> >> >> >>>> > commons-csv-1.4-src.tar.gz
>>> >> >> >>>> > (SHA1: 08151857d96af4c95ddbd5131f40e56b05eb088f)
>>> >> >> >>>> > commons-csv-1.4-src.zip
>>> >> >> >>>> > (SHA1: c379ec116117e0a9bbd66f7bb3279cfe1e9697ef)
>>> >> >> >>>> >
>>> >> >> >>>> > Maven artifacts are here:
>>> >> >> >>>> >
>>> >> >> >>>> > https://repository.apache.org/content/repositories/orgapachecommons-1172/org/apache/commons/commons-csv/1.4/
>>> >> >> >>>> >
>>> >> >> >>>> > These are the artifacts and their hashes:
>>> >> >> >>>> >
>>> >> >> >>>> > commons-csv-1.4-test-sources.jar
>>> >> >> >>>> > (SHA1: fa468674f62177f6182a318f4d1bb7b385e146b6)
>>> >> >> >>>> > commons-csv-1.4-sources.jar
>>> >> >> >>>> > (SHA1: f8e3c6b3d3c1a5bbd80ad5b73c72a98af471c401)
>>> >> >> >>>> > commons-csv-1.4.pom
>>> >> >> >>>> > (SHA1: c065422ac0fd4ff25016fb2fcb00af3874103935)
>>> >> >> >>>> > commons-csv-1.4.jar
>>> >> >> >>>> > (SHA1: 5221b8e5d24f26aab600d367313c6620c7f1fdb6)
>>> >> >> >>>> > commons-csv-1.4-javadoc.jar
>>> >> >> >>>> > (SHA1: 878a92f52149c3d3050332fbfb9702ed3a64c515)
>>> >> >> >>>> >
>>> >> >> >>>> > commons-csv-1.4-tests.jar
>>> >> >> >>>> > (SHA1: 2eb791225c8f002be1fa0f4b6d68110e63b14f5a)
>>> >> >> >>>> >
>>> >> >> >>>> > Details of changes since 1.3 are in the release notes:
>>> >> >> >>>> >
>>> >> >> >>>> > https://dist.apache.org/repos/dist/dev/commons/csv/CSV_1_4_RC1/RELEASE-NOTES.txt
>>> >> >> >>>> >
>>> >> >> >>>> > http://home.apache.org/~ggregory/csv-1.4-rc1/site/changes-report.html
>>> >> >> >>>> >
>>> >> >> >>>> > The tag is here:
>>> >> >> >>>> >
>>> >> >> >>>> > http://svn.apache.org/repos/asf/commons/proper/csv/tags/csv-1.4-RC1/
>>> >> >> >>>> > (revision 1745108)
>>> >> >> >>>> >
>>> >> >> >>>> > Site:
>>> >> >> >>>> > http://home.apache.org/~ggregory/csv-1.4-rc1/site/
>>> >> >> >>>> >
>>> >> >> >>>> > (some *relative* links are broken - these will be OK once the site
>>> >> >> >>>> > is deployed)
>>> >> >> >>>> >
>>> >> >> >>>> > Clirr Report (compared to 1.3):
>>> >> >> >>>> >
>>> >> >> >>>> > http://home.apache.org/~ggregory/csv-1.4-rc1/site/clirr-report.html
>>> >> >> >>>> >
>>> >> >> >>>> > RAT Report:
>>> >> >> >>>> > http://home.apache.org/~ggregory/csv-1.4-rc1/site/rat-report.html
>>> >> >> >>>> >
>>> >> >> >>>> > KEYS:
>>> >> >> >>>> > https://www.apache.org/dist/commons/KEYS
>>> >> >> >>>> >
>>> >> >> >>>> > Please review the release candidate and vote.
>>> >> >> >>>> >
>>> >> >> >>>> > This vote will close no sooner than 72 hours from now,
>>> >> >> >>>> > i.e. sometime after 23:00 PST 25 May 2016
>>> >> >> >>>> >
>>> >> >> >>>> >
>>> >> >> >>>> > [ ] +1 Release these artifacts
>>> >> >> >>>> > [ ] +0 OK, but...
>>> >> >> >>>> > [ ] -0 OK, but really should fix...
>>> >> >> >>>> > [ ] -1 I oppose this release because...
>>> >> >> >>>> >
>>> >> >> >>>> > Thanks!
>>> >> >> >>>> > Gary Gregory
>>> >> >> >>>> >
>>> >> >> >>>> > --
>>> >> >> >>>> > E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
>>> >> >> >>>> > Java Persistence with Hibernate, Second Edition
>>> >> >> >>>> > <http://www.manning.com/bauer3/>
>>> >> >> >>>> > JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
>>> >> >> >>>> > Spring Batch in Action <http://www.manning.com/templier/>
>>> >> >> >>>> > Blog: http://garygregory.wordpress.com
>>> >> >> >>>> > Home: http://garygregory.com/
>>> >> >> >>>> > Tweet! http://twitter.com/GaryGregory
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> --
>>> >> >> >>>> Stian Soiland-Reyes
>>> >> >> >>>> Apache Commons, Apache Taverna (incubating), Apache Commons RDF
>>> >> >> >>>> (incubating)
>>> >> >> >>>> http://orcid.org/0000-0001-9842-9718
>>> >> >> >>>> ./src/test/resources/ferc.gov/contract.txt
>>> >> >> >>>>
>>> >> >> >>>> ---------------------------------------------------------------------
>>> >> >> >>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>>> >> >> >>>> For additional commands, e-mail: dev-h...@commons.apache.org
>>> >> >> >>>>
>>> >> >> >>>
>>> >> >> >>>
>>> >> >> >>> --
>>> >> >> >>> E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
>>> >> >> >>> Java Persistence with Hibernate, Second Edition
>>> >> >> >>> <http://www.manning.com/bauer3/>
>>> >> >> >>> JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
>>> >> >> >>> Spring Batch in Action <http://www.manning.com/templier/>
>>> >> >> >>> Blog: http://garygregory.wordpress.com
>>> >> >> >>> Home: http://garygregory.com/
>>> >> >> >>> Tweet! 
http://twitter.com/GaryGregory
>>> >> >> >>>
>>> >> >
>>> >> > --
>>> >> > E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
>>> >> > Java Persistence with Hibernate, Second Edition
>>> >> > <http://www.manning.com/bauer3/>
>>> >> > JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
>>> >> > Spring Batch in Action <http://www.manning.com/templier/>
>>> >> > Blog: http://garygregory.wordpress.com
>>> >> > Home: http://garygregory.com/
>>> >> > Tweet! 
http://twitter.com/GaryGregory

--
Stian Soiland-Reyes
Apache Commons, Apache Taverna (incubating), Apache Commons RDF (incubating)
http://orcid.org/0000-0001-9842-9718