[
https://issues.apache.org/jira/browse/TIKA-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16604471#comment-16604471
]
Tim Allison edited comment on TIKA-2722 at 9/5/18 2:36 PM:
-----------------------------------------------------------
Thank you, [~dsmiley]. That's dead code in 2.0, but you're right: it was
operative in the 1.x branch. I've removed it, and I've simplified the
calendar->date->calendar->string conversion to just calendar->string (properly
formatted in Metadata's {{set(Property property, Calendar calendar)}}).
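
For readers following along, here is a minimal sketch of what a direct calendar->string conversion can look like. It is an illustration only, not the code in Tika's Metadata class: the class name CalendarIsoSketch and the helper toIsoUtc are hypothetical, and the choice of ISO 8601 in UTC is an assumption made for the example.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

public class CalendarIsoSketch {

    // Hypothetical helper for illustration; this is NOT Tika's Metadata code.
    // Instant.toString() emits ISO 8601 in UTC and does not consult the
    // default locale, so the digits stay ASCII.
    public static String toIsoUtc(Calendar calendar) {
        return calendar.toInstant().toString();
    }

    public static void main(String[] args) {
        // Simulate the environment from the report.
        Locale.setDefault(Locale.forLanguageTag("ar-EG"));

        // 18:35:51 at GMT+05:00, the instant quoted in the issue description.
        Calendar created =
                new GregorianCalendar(TimeZone.getTimeZone("GMT+05:00"), Locale.ROOT);
        created.clear();
        created.set(2008, Calendar.NOVEMBER, 13, 18, 35, 51);

        System.out.println(toIsoUtc(created)); // 2008-11-13T13:35:51Z
    }
}
```

The point of the sketch is that the string never passes through Date.toString(), so neither the default locale nor the default time zone can change it.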
was (Author: [email protected]):
Thank you, [~dsmiley]. That's dead code in 2.0, but you're right: it was
operative in the 1.x branch. I've removed it with 44c2279, and I've simplified
the calendar->date->calendar->string conversion to just calendar->string
(properly formatted in Metadata's {{set(Property property, Calendar calendar)}}).
> Don't call Date.toString (Possible issue with JDK 11)
> -----------------------------------------------------
>
> Key: TIKA-2722
> URL: https://issues.apache.org/jira/browse/TIKA-2722
> Project: Tika
> Issue Type: Bug
> Environment: Tika 1.18, JDK 11 with locale set to "ar-EG".
> Reporter: David Smiley
> Priority: Major
>
> I'm troubleshooting [a test failure in Apache
> Lucene/Solr|https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/22799/]
> "extracting" contrib that occurs in JDK 11 with locale "ar-EG". JDK 8 & 9
> pass; I don't know about JDK 10. It has to do with extracting date metadata
> from a PDF, particularly the created date but perhaps others too.
> I stepped through the code into Tika and I think I've found out where the
> troublesome code is. First note PDFParser line 271: {{addMetadata(metadata,
> "created", info.getCreationDate());}}. That addMetadata overload variant
> will call toString on a Date. IMO that's asking for trouble since the output
> of that is Locale-dependent. I think that's okay to show to a user but not
> for machine-to-machine information exchange. In the case of the test, it
> yielded this odd looking date string:
> Thu Nov 13 18:35:51 GMT+٠٥:٠٠ 2008
> I pasted that in and it looks consistent with what I see in IntelliJ and in
> Jenkins logs; hopefully it will post correctly to JIRA, though I won't be
> certain until after I click "Create". The odd part is the hour & minutes
> relative to GMT.
> Perhaps this problem is also indicative of a JDK 11 bug? Nevertheless I
> think Tika should avoid calling Date.toString().
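
To make the concern above concrete, the following sketch contrasts Date.toString() with a formatter whose locale and time zone are pinned. It is an illustration under stated assumptions, not a proposed patch: the class name DateToStringSketch and the helper machineReadable are hypothetical, and the ISO 8601 pattern is just one reasonable choice for machine-to-machine exchange.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class DateToStringSketch {

    // Hypothetical helper for illustration only.
    // Locale.ROOT and UTC are set explicitly so the result does not depend
    // on the JVM's default locale or default time zone.
    public static String machineReadable(Date date) {
        SimpleDateFormat fmt =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(date);
    }

    public static void main(String[] args) {
        // Simulate the environment from the report.
        Locale.setDefault(Locale.forLanguageTag("ar-EG"));

        // Epoch milliseconds for 2008-11-13 13:35:51 UTC (18:35:51 at GMT+05:00).
        Date created = new Date(1226583351000L);

        // Output depends on the JVM's default time zone (and, per the report
        // above, possibly the default locale), so no expected value is given.
        System.out.println(created.toString());

        // Stable regardless of JVM defaults.
        System.out.println(machineReadable(created)); // 2008-11-13T13:35:51Z
    }
}
```

Running main under different default locales and time zones should change only the first line of output.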
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)