[ 
https://issues.apache.org/jira/browse/TIKA-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated TIKA-4209:
------------------------------
    Description: 
[~johanvanderknijff] recently published a great post on multi-image TIFFs: 
[https://www.bitsgalore.org/2024/03/11/multi-image-tiffs-subfiles-and-image-file-directories]

I hadn't worked on TIFF in a while. I tried out a few sample multi-image TIFFs 
and found that we are not processing anything beyond the first page/image in a 
TIFF. Even worse, we're not populating our "imagereader:NumImages" metadata 
value for TIFFs.
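
For illustration, here is a minimal sketch of how the number of top-level images in a classic TIFF could be counted by walking the chain of image file directories (IFDs). This is not Tika's or metadata-extractor's implementation; the class name TiffIfdCounter and the method countTopLevelIfds are hypothetical. It assumes a classic (non-BigTIFF) file held fully in memory and does not follow SubIFD pointers.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper, not part of Tika: counts top-level IFDs in a classic TIFF. */
final class TiffIfdCounter {

    static int countTopLevelIfds(byte[] tiff) {
        if (tiff.length < 8) {
            throw new IllegalArgumentException("Too short to be a TIFF");
        }
        ByteBuffer buf = ByteBuffer.wrap(tiff);
        // Bytes 0-1 give the byte order: "II" is little-endian, "MM" is big-endian.
        if (tiff[0] == 'I' && tiff[1] == 'I') {
            buf.order(ByteOrder.LITTLE_ENDIAN);
        } else if (tiff[0] == 'M' && tiff[1] == 'M') {
            buf.order(ByteOrder.BIG_ENDIAN);
        } else {
            throw new IllegalArgumentException("Missing TIFF byte-order mark");
        }
        // Bytes 2-3 hold the magic number 42 (BigTIFF uses 43 and is not handled here).
        if (buf.getShort(2) != 42) {
            throw new IllegalArgumentException("Not a classic TIFF");
        }
        // Bytes 4-7 hold the offset of the first IFD.
        long offset = buf.getInt(4) & 0xFFFFFFFFL;
        Set<Long> seen = new HashSet<>();
        int count = 0;
        while (offset != 0) {
            // Guard against circular IFD chains in malformed files.
            if (!seen.add(offset)) {
                throw new IllegalArgumentException("Circular IFD chain");
            }
            if (offset + 2 > tiff.length) {
                throw new IllegalArgumentException("IFD offset out of bounds");
            }
            // Each IFD is a 2-byte entry count, then 12 bytes per entry,
            // then a 4-byte offset to the next IFD (0 marks the end of the chain).
            int entries = buf.getShort((int) offset) & 0xFFFF;
            long nextPointer = offset + 2 + 12L * entries;
            if (nextPointer + 4 > tiff.length) {
                throw new IllegalArgumentException("Truncated IFD");
            }
            count++;
            offset = buf.getInt((int) nextPointer) & 0xFFFFFFFFL;
        }
        return count;
    }
}
```

A count greater than 1 would indicate a multi-image TIFF, which is the case where only the first page/image is currently processed.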

It looks like Drew Noakes' metadata-extractor is not yet handling these well: 
[https://github.com/drewnoakes/metadata-extractor/issues/648]

There's an example file on that issue: 
[https://github.com/drewnoakes/metadata-extractor/files/14052854/color-pages-jpg.zip]

And [~johanvanderknijff] also pointed to TIFFs available here: 
[https://www.leadtools.com/support/forum/posts/t10960-]

  was:
[~johanvanderknijff] recently published a great post on multipage TIFFs: 
[https://www.bitsgalore.org/2024/03/11/multi-image-tiffs-subfiles-and-image-file-directories]

I hadn't worked on TIFF in a while. I tried out a few sample multipage TIFFs 
and found that we are not processing anything beyond the first page/image in a 
TIFF. Even worse, we're not populating our "imagereader:NumImages" metadata 
value for TIFFs.

It looks like Drew Noakes' metadata-extractor is not yet handling these well: 
[https://github.com/drewnoakes/metadata-extractor/issues/648]

There's an example file on that issue: 
[https://github.com/drewnoakes/metadata-extractor/files/14052854/color-pages-jpg.zip]

And [~johanvanderknijff] also pointed to TIFFs available here: 
[https://www.leadtools.com/support/forum/posts/t10960-]


> Improve handling of multi-image tiffs
> -------------------------------------
>
>                 Key: TIKA-4209
>                 URL: https://issues.apache.org/jira/browse/TIKA-4209
>             Project: Tika
>          Issue Type: New Feature
>            Reporter: Tim Allison
>            Priority: Major
>
> [~johanvanderknijff] recently published a great post on multi-image TIFFs: 
> [https://www.bitsgalore.org/2024/03/11/multi-image-tiffs-subfiles-and-image-file-directories]
> I hadn't worked on TIFF in a while. I tried out a few sample multi-image 
> TIFFs and found that we are not processing anything beyond the first 
> page/image in a TIFF. Even worse, we're not populating our 
> "imagereader:NumImages" metadata value for TIFFs.
> It looks like Drew Noakes' metadata-extractor is not yet handling these well: 
> [https://github.com/drewnoakes/metadata-extractor/issues/648]
>  
> There's an example file on that issue: 
> [https://github.com/drewnoakes/metadata-extractor/files/14052854/color-pages-jpg.zip]
> And [~johanvanderknijff] also pointed to TIFFs available here: 
> [https://www.leadtools.com/support/forum/posts/t10960-]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
