[ https://issues.apache.org/jira/browse/TIKA-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16604531#comment-16604531 ]

Hudson commented on TIKA-2722:
------------------------------

SUCCESS: Integrated in Jenkins build tika-branch-1x #80 (See [https://builds.apache.org/job/tika-branch-1x/80/])
TIKA-2722 -- remove dead code and prevent potentially bad (tallison: [https://github.com/apache/tika/commit/0951bf96f4654c6ef314e2cd49b54b2add4b5d4c])
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
TIKA-2722 -- clean up setting calendar values (tallison: [https://github.com/apache/tika/commit/8d70109af35dc94e0c7ce9437764cc2b7d064112])
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
TIKA-2722 -- clean up setting calendar values, take2 (tallison: [https://github.com/apache/tika/commit/2fd54ff31865089f154390fed42849e4572929e7])
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java


> Don't call Date.toString (Possible issue with JDK 11)
> -----------------------------------------------------
>
>                 Key: TIKA-2722
>                 URL: https://issues.apache.org/jira/browse/TIKA-2722
>             Project: Tika
>          Issue Type: Bug
>         Environment: Tika 1.18, JDK 11 with locale set to "ar-EG".
>            Reporter: David Smiley
>            Priority: Major
>
> I'm troubleshooting [a test failure in Apache
> Lucene/Solr|https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/22799/]
> "extracting" contrib that occurs in JDK 11 with locale "ar-EG". JDK 8 & 9
> pass; I don't know about JDK 10. It has to do with extracting date metadata
> from a PDF, particularly the created date but perhaps others too.
> I stepped through the code into Tika and I think I've found out where the
> troublesome code is. First note PDFParser line 271: {{addMetadata(metadata,
> "created", info.getCreationDate());}}. That addMetadata overload variant
> will call toString on a Date. IMO that's asking for trouble since the output
> of that is Locale-dependent. I think that's okay to show to a user but not
> for machine-to-machine information exchange.
> In the case of the test, it yielded this odd looking date string:
> Thu Nov 13 18:35:51 GMT+٠٥:٠٠ 2008
> I pasted that in and it looks consistent with what I see in IntelliJ and in
> Jenkins logs; hopefully will post correctly to JIRA. The odd part is the
> hour & minutes relative to GMT. I won't be certain until after I click
> "Create".
> Perhaps this problem is also indicative of a JDK 11 bug? Nevertheless I
> think Tika should avoid calling Date.toString().



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
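
As an illustration of the reporter's suggestion to avoid Date.toString() for machine-to-machine exchange, here is a minimal Java sketch. It is not Tika's actual fix (the commits above only touch PDFParser.java and are not reproduced here). The class name IsoDateSketch and the method toIsoUtc are hypothetical. The sample instant is the one from the quoted date string, where 18:35:51 at GMT+05:00 corresponds to 13:35:51 UTC.

```java
import java.time.Instant;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Tika.
class IsoDateSketch {

    // Formats the date as an ISO 8601 string in UTC.
    // Instant.toString() uses the same format as DateTimeFormatter.ISO_INSTANT,
    // whose layout and digits are fixed rather than taken from the default locale.
    static String toIsoUtc(Date date) {
        return date.toInstant().toString();
    }

    public static void main(String[] args) {
        // Mimic the environment described in the issue report.
        Locale.setDefault(Locale.forLanguageTag("ar-EG"));

        Date created = Date.from(Instant.parse("2008-11-13T13:35:51Z"));
        System.out.println(toIsoUtc(created)); // prints 2008-11-13T13:35:51Z
    }
}
```

Going through java.time.Instant keeps the value in UTC, so the result carries no time zone display name at all, which is the part of the string that looked odd in the report.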