+1 Thank you, Tyler!
Apologies to Hong-Thai and community for not recognizing the severity of TIKA-1600 when I voted in favor of rc1!

Details...

I reran against govdocs1, and there aren't any major surprises. On our Rackspace vm, I _finally_ unzipped the Common Crawl slice that Julien Nioche created for us, and I ran against that as well. That turned up TIKA-1605 and another exceedingly rare NPE in the PDFParser. I don't think either of these are blockers, and they're now fixed in trunk.

There are slightly fewer metadata values for some jpegs. For the one file that I manually reviewed, 1.8-rc was missing these values (that were available in 1.7):

JPEG quality
IPTC-NAA record
Plug-in 1 Data

Comparison reports are available here (much more work remains to be done on tika-eval):
https://github.com/tballison/share/tree/master/tika_comparisons

________________________________________
From: Tyler Palsulich <tpalsul...@apache.org>
Sent: Monday, April 13, 2015 1:56 PM
To: dev@tika.apache.org; u...@tika.apache.org
Subject: [VOTE] Apache Tika 1.8 Release Candidate #2

Hi Folks,

A candidate for the Tika 1.8 release is available at:
https://dist.apache.org/repos/dist/dev/tika/

The release candidate is a zip archive of the sources in:
http://svn.apache.org/repos/asf/tika/tags/1.8-rc2/

The SHA1 checksum of the archive is
5e22fee9079370398472e59082d171ae2d7fdd31.

In addition, a staged maven repository is available here:
https://repository.apache.org/content/repositories/orgapachetika-1009

Please vote on releasing this package as Apache Tika 1.8.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 Tika PMC votes are cast.

[ ] +1 Release this package as Apache Tika 1.8
[ ] ±0 I don't object to this release, but I haven't checked it
[ ] -1 Do not release this package because...

Thanks,
Tyler