[ 
https://issues.apache.org/jira/browse/TIKA-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884339#action_12884339
 ] 

Nick Burch commented on TIKA-451:
---------------------------------

If we make the change, it could affect users, but I don't think by much.

Currently, there are a number of different date formats that crop up in the 
date field. This means that anyone who cares about the format is already having 
to try multiple date patterns to parse it. So, they shouldn't be affected if we 
settle on one of the formats already in use.
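
To illustrate what "trying multiple date patterns" looks like on the consumer side, here is a minimal sketch. The class name LenientDateParser and the pattern list are assumptions for illustration only; they are not part of Tika, and real metadata may need more patterns than these three.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not a Tika class: tries several known date
// patterns in turn until one of them parses the metadata value.
public class LenientDateParser {
    // Illustrative pattern list (an assumption, not Tika's actual formats).
    private static final String[] PATTERNS = {
        "yyyy-MM-dd'T'HH:mm:ss'Z'",     // ISO 8601 in UTC
        "yyyy-MM-dd'T'HH:mm:ss",        // ISO 8601 without a zone, read as UTC here
        "EEE MMM dd HH:mm:ss zzz yyyy"  // java.util.Date.toString() layout
    };

    // Returns the parsed date, or null if no pattern matches.
    public static Date parse(String value) {
        for (String pattern : PATTERNS) {
            // SimpleDateFormat is not thread-safe, so create one per call.
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            try {
                return format.parse(value);
            } catch (ParseException e) {
                // Fall through and try the next pattern.
            }
        }
        return null;
    }
}
```

A caller written this way already accepts both the ISO 8601 and the date.toString() layouts, which is why a change of output format should not break it.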

The only people I can see being affected are people who only ever use one of 
the date.toString() parsers, and no others, and who assume that format on all 
dates. Hopefully that's a rare enough use case that we don't need to worry 
about it when making this change?

> Inconsistent date format for Metadata.CREATION_DATE and Metadata.LAST_MODIFIED
> ------------------------------------------------------------------------------
>
>                 Key: TIKA-451
>                 URL: https://issues.apache.org/jira/browse/TIKA-451
>             Project: Tika
>          Issue Type: Improvement
>          Components: metadata, parser
>    Affects Versions: 0.7
>            Reporter: Nick Burch
>            Priority: Minor
>
> Currently, the PDF Parser does calendar.getTime().toString(), which means 
> dates end up in your local timezone, and are hard to parse.
> The Open Document parsers output in ISO 8601 format, which avoids these two 
> problems.
> The POI OLE2 based parsers also output in date.toString() format, with the 
> same timezone/parsing problems.
> We should probably select one format, and update the parsers to all output in 
> it.
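
The difference between the two output styles described in the issue can be sketched as follows. The class and method names (DateFormatComparison, toIso8601) are hypothetical, and the choice of UTC with a literal 'Z' suffix is an assumption about what a unified format might look like, not a decision recorded in this issue.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical comparison, not Tika code: contrasts Date.toString()
// with an ISO 8601 rendering pinned to UTC.
public class DateFormatComparison {
    // Formats a date as ISO 8601 in UTC, independent of the JVM default zone.
    public static String toIso8601(Date date) {
        SimpleDateFormat format =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(date);
    }

    public static void main(String[] args) {
        Date date = new Date(0L); // the Unix epoch
        // Output depends on the JVM's default time zone.
        System.out.println(date.toString());
        // Output is the same on every machine.
        System.out.println(toIso8601(date));
    }
}
```

The first line printed varies from machine to machine, while the second does not, which is the timezone problem the issue describes.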

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
