[ https://issues.apache.org/jira/browse/TIKA-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894877#action_12894877 ]
Staffan Olsson commented on TIKA-451:
-------------------------------------

The JPEG parser (TiffExtractor.handleCommonImageTags and JpegParserTest) has the same issue. The test asserts a date format that is not ISO 8601. The javadoc for the field (DublinCore.DATE) says ISO 8601, so the test is clearly wrong. There is a "TODO Make me a Date Property" on it.

I have code for parsing Metadata Extractor's date to ISO, so I could fix this, but which field should we use? This issue discusses MSOffice.CREATION_DATE, but I think DublinCore makes more sense for images. However, Tika will be easier to use if there is only one creation date field.

> Inconsistent date format for Metadata.CREATION_DATE and Metadata.LAST_MODIFIED
> ------------------------------------------------------------------------------
>
>                 Key: TIKA-451
>                 URL: https://issues.apache.org/jira/browse/TIKA-451
>             Project: Tika
>          Issue Type: Improvement
>          Components: metadata, parser
>    Affects Versions: 0.7
>            Reporter: Nick Burch
>            Assignee: Nick Burch
>            Priority: Minor
>
> Currently, the PDF Parser does calendar.getTime().toString(), which means
> dates end up in your local timezone and are hard to parse.
> The Open Document parsers output in ISO 8601 format, which avoids these two
> problems.
> The POI OLE2 based parsers also output in date.toString() format, with the
> same timezone/parsing problems.
> We should probably select one format, and update the parsers to all output in
> it.
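
To make the two conversions discussed above concrete, here is a minimal sketch. It is not the code mentioned in the comment and not anything from Tika itself. The class and method names (IsoDateSketch, exifToIso, toIsoUtc) are hypothetical, and it assumes the image date arrives as an EXIF-style string such as "2009:08:11 09:09:45" with no timezone information. Only standard java.text and java.util classes are used.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper for illustration only; not part of Tika or Metadata Extractor.
final class IsoDateSketch {

    // Assumption: EXIF-style date strings look like "2009:08:11 09:09:45".
    private static final String EXIF_PATTERN = "yyyy:MM:dd HH:mm:ss";

    // ISO 8601 local date-time (no zone, because EXIF does not carry one).
    private static final String ISO_LOCAL_PATTERN = "yyyy-MM-dd'T'HH:mm:ss";

    // ISO 8601 date-time normalised to UTC.
    private static final String ISO_UTC_PATTERN = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    private IsoDateSketch() {
    }

    /** Rewrites an EXIF-style date string as an ISO 8601 local date-time. */
    static String exifToIso(String exifDate) {
        // Both formatters use the same zone, so the wall-clock value is
        // carried over unchanged regardless of the JVM default timezone.
        SimpleDateFormat in = new SimpleDateFormat(EXIF_PATTERN, Locale.ROOT);
        in.setTimeZone(UTC);
        in.setLenient(false);
        SimpleDateFormat out = new SimpleDateFormat(ISO_LOCAL_PATTERN, Locale.ROOT);
        out.setTimeZone(UTC);
        try {
            return out.format(in.parse(exifDate));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not an EXIF-style date: " + exifDate, e);
        }
    }

    /** Formats an instant as ISO 8601 in UTC instead of using Date.toString(). */
    static String toIsoUtc(Date date) {
        SimpleDateFormat out = new SimpleDateFormat(ISO_UTC_PATTERN, Locale.ROOT);
        out.setTimeZone(UTC);
        return out.format(date);
    }
}
```

With this sketch, exifToIso("2009:08:11 09:09:45") gives "2009-08-11T09:09:45", and toIsoUtc(new Date(0L)) gives "1970-01-01T00:00:00Z" on any machine, whereas new Date(0L).toString() varies with the local timezone. Which Metadata field the result should be stored in is the open question above and is not addressed here.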