[ 
https://issues.apache.org/jira/browse/TIKA-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luís Filipe Nassif updated TIKA-3815:
-------------------------------------
    Description: 
Running tika-app-2.4.1.jar on the attached image, the following metadata is returned:

Exif IFD0:Date/Time: 2022:06:16 11:18:49
Exif SubIFD:Date/Time Digitized: 2022:06:16 11:18:49
Exif SubIFD:Date/Time Original: 2022:06:16 11:18:49
Exif SubIFD:Time Zone: -03:00
Exif SubIFD:Time Zone Digitized: -03:00
Exif SubIFD:Time Zone Original: -03:00
File Modified Date: Thu Jun 16 11:18:50 -03:00 2022
GPS:GPS Date Stamp: 2022:06:16
GPS:GPS Time-Stamp: 14:18:47.000 UTC
dcterms:created: 2022-06-16T08:18:49
dcterms:modified: 2022-06-16T08:18:49
exif:DateTimeOriginal: 2022-06-16T08:18:49

 

The correct value is 2022-06-16T14:18:49Z. Although some of the values carry no
timezone, I see no reason to convert them to any timezone other than GMT, the
one in which the picture was taken (-03:00), or the one in which the
application runs (-03:00), so Tika may be applying an incorrect timezone
conversion to the last three fields.
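
For reference, here is a minimal sketch of the conversion I would expect. It uses only java.time, not anything from Tika's code. The class name ExifOffsetSketch and the helper toUtcIso are hypothetical, and it assumes the Exif "Time Zone" offset applies to the zone-less Exif date/time values listed above.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class ExifOffsetSketch {

    // Exif date/time values look like "2022:06:16 11:18:49" and carry no zone.
    private static final DateTimeFormatter EXIF_FORMAT =
            DateTimeFormatter.ofPattern("yyyy:MM:dd HH:mm:ss");

    /**
     * Hypothetical helper: attaches the Exif offset (e.g. "-03:00") to the
     * zone-less Exif date/time and normalizes the result to UTC.
     */
    static String toUtcIso(String exifDateTime, String exifOffset) {
        LocalDateTime local = LocalDateTime.parse(exifDateTime, EXIF_FORMAT);
        OffsetDateTime withOffset = local.atOffset(ZoneOffset.of(exifOffset));
        return withOffset
                .withOffsetSameInstant(ZoneOffset.UTC)
                .format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    public static void main(String[] args) {
        // Values taken from the metadata listed in this issue.
        System.out.println(toUtcIso("2022:06:16 11:18:49", "-03:00"));
        // prints 2022-06-16T14:18:49Z
    }
}
```

For comparison, the 08:18:49 reported in the last three fields is three hours behind the local time (11:18:49), whereas the UTC value is three hours ahead of it.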

  was:
Running tika-app-2.4.1.jar on the attached image, the following metadata is returned:

Exif IFD0:Date/Time: 2022:06:16 11:18:49
Exif SubIFD:Date/Time Digitized: 2022:06:16 11:18:49
Exif SubIFD:Date/Time Original: 2022:06:16 11:18:49
Exif SubIFD:Time Zone: -03:00
Exif SubIFD:Time Zone Digitized: -03:00
Exif SubIFD:Time Zone Original: -03:00
File Modified Date: Thu Jun 16 11:18:50 -03:00 2022
GPS:GPS Date Stamp: 2022:06:16
GPS:GPS Time-Stamp: 14:18:47.000 UTC
dcterms:created: 2022-06-16T08:18:49
dcterms:modified: 2022-06-16T08:18:49
exif:DateTimeOriginal: 2022-06-16T08:18:49

 

The correct value is 2022-06-16T14:18:49Z. Although some of the values carry no
timezone, I see no reason to convert them to any timezone other than GMT or
the one in which the picture was taken (-03:00), so Tika may be applying an
incorrect timezone conversion to the last three fields.


> Inconsistent Date/Time information extracted from Exif data
> -----------------------------------------------------------
>
>                 Key: TIKA-3815
>                 URL: https://issues.apache.org/jira/browse/TIKA-3815
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 2.4.1
>            Reporter: Luís Filipe Nassif
>            Priority: Major
>         Attachments: IMG_20220616_111848_HDR.jpg
>
>
> Running tika-app-2.4.1.jar on the attached image, the following metadata is returned:
> Exif IFD0:Date/Time: 2022:06:16 11:18:49
> Exif SubIFD:Date/Time Digitized: 2022:06:16 11:18:49
> Exif SubIFD:Date/Time Original: 2022:06:16 11:18:49
> Exif SubIFD:Time Zone: -03:00
> Exif SubIFD:Time Zone Digitized: -03:00
> Exif SubIFD:Time Zone Original: -03:00
> File Modified Date: Thu Jun 16 11:18:50 -03:00 2022
> GPS:GPS Date Stamp: 2022:06:16
> GPS:GPS Time-Stamp: 14:18:47.000 UTC
> dcterms:created: 2022-06-16T08:18:49
> dcterms:modified: 2022-06-16T08:18:49
> exif:DateTimeOriginal: 2022-06-16T08:18:49
>  
> The correct value is 2022-06-16T14:18:49Z. Although some of the values carry
> no timezone, I see no reason to convert them to any timezone other than GMT,
> the one in which the picture was taken (-03:00), or the one in which the
> application runs (-03:00), so Tika may be applying an incorrect timezone
> conversion to the last three fields.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
