[ 
https://issues.apache.org/jira/browse/TIKA-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16667903#comment-16667903
 ] 

Dave Meikle commented on TIKA-2630:
-----------------------------------

Thanks for raising this one. Short term we can add in the reading from these 
fields for compressed images knowing it will set the tiff:ImageHeight and 
tiff:ImageLength to the correct value.

Longer term we need to address the metadata clashes are part of the 2.x series 
as whilst we could add in the directory name as a key to the metadata (e.g. 
Exif IFD0:Image Height: 520 pixels) I would be worried about the impact on 
downstream code that has got used to what we do. This means we can also build 
up on the Metadata proposals for 2.x.

Will this work for you?

 

> Wrong height and width metadata for JPEG images
> -----------------------------------------------
>
>                 Key: TIKA-2630
>                 URL: https://issues.apache.org/jira/browse/TIKA-2630
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.17
>            Reporter: Ancuta Morarasu
>            Assignee: Dave Meikle
>            Priority: Major
>         Attachments: Tika-metadata.txt, metadata-exctractor-metadata.txt, 
> sizesampleissue.jpg
>
>
> According to [Exif 
> specs|http://www.exif.org/Exif2-2.PDF#page=73&zoom=auto,-176,103], for 
> compressed images the values for width and height should come from the tags:
> * *PixelXDimension* mapped in metadata-extractor to 
> {{com.drew.metadata.Directory.ExifDirectoryBase.TAG_EXIF_IMAGE_WIDTH}} and
> * *PixelYDimension* mapped to {{ExifDirectoryBase.TAG_EXIF_IMAGE_HEIGHT}}.
> {{ImageMetadataExtractor$ExifHandler.[handlePhotoTags(...)|https://github.com/apache/tika/blob/master/tika-parsers/src/main/java/org/apache/tika/parser/image/ImageMetadataExtractor.java#L487]}}
>  should extract and set these in the metadata:
> {code:java}
>  if (directory.containsTag(ExifSubIFDDirectory.TAG_EXIF_IMAGE_WIDTH)) {
>     metadata.set(Metadata.IMAGE_WIDTH,
>                  
> trimPixels(directory.getDescription(ExifSubIFDDirectory.TAG_EXIF_IMAGE_WIDTH)));
>   }
>   if (directory.containsTag(ExifSubIFDDirectory.TAG_EXIF_IMAGE_WIDTH)) {
>       metadata.set(Metadata.IMAGE_LENGTH,
>                    
> trimPixels(directory.getDescription(ExifSubIFDDirectory.TAG_EXIF_IMAGE_HEIGHT)));
>    }
> {code}
> Also the {{CopyUnknownFieldsHandler}} overrides the values for "Image Width" 
> ({{JpegDirectory.TAG_IMAGE_WIDTH}}) and "Image Height" 
> ({{JpegDirectory.TAG_IMAGE_HEIGHT}}) with the values from 
> {{ExifIFD0Descriptor.TAG_IMAGE_WIDTH}} and 
> {{ExifIFD0Descriptor.TAG_IMAGE_HEIGHT}} because they have the same tag name.
> I attached a sample image, these are the metadata values:
> * extracted by metadata-extractor:
> [JPEG] Image Height = 367 pixels
> [JPEG] Image Width = 1535 pixels
> [Exif IFD0] Image Width = 2173 pixels
> [Exif IFD0] Image Height = 520 pixels
> [Exif SubIFD] Exif Image Width = 1535 pixels
> [Exif SubIFD] Exif Image Height = 367 pixels
> * Tika metadata:
> Image Height: 520 pixels
> Image Width: 2173 pixels
> tiff:ImageLength: 520
> tiff:ImageWidth: 2173
> Exif Image Height: 367 pixels
> Exif Image Width: 1535 pixels



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to