By following the test cases, I have got the External2.ExternalParser
working with ExifTool for a single file.

However, when I tried using Tika batch and Tika Pipes to process a
directory of images, I got exceptions in both cases. I can post the
details here once I have gathered them.

I will raise a JIRA about my original issue.
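
For context on the 'env' problem in my original message below, here is
a rough sketch of how a command such as 'env FOO=... exiftool ...' could
be launched without relying on 'env' being on the PATH. This is only an
illustration and is not Tika's code: the class EnvFreeLauncher and its
methods are hypothetical names, and it assumes the command has already
been split into tokens. It moves any leading NAME=VALUE assignments into
the ProcessBuilder environment, which works the same way on Windows and
Unix.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Tika: illustrates avoiding the Unix-only 'env' command.
class EnvFreeLauncher {

    /**
     * Returns a copy of the command without a leading "env NAME=VALUE ..." prefix.
     * Any NAME=VALUE assignments that followed "env" are stored in envOut.
     */
    static List<String> stripEnvPrefix(List<String> command, Map<String, String> envOut) {
        List<String> result = new ArrayList<>(command);
        if (!result.isEmpty() && result.get(0).equals("env")) {
            result.remove(0); // drop "env" itself
            while (!result.isEmpty()
                    && result.get(0).matches("[A-Za-z_][A-Za-z0-9_]*=.*")) {
                String[] kv = result.remove(0).split("=", 2);
                envOut.put(kv[0], kv[1]);
            }
        }
        return result;
    }

    /** Builds (but does not start) a process with the assignments applied to its environment. */
    static ProcessBuilder build(List<String> command) {
        Map<String, String> extra = new HashMap<>();
        ProcessBuilder pb = new ProcessBuilder(stripEnvPrefix(command, extra));
        pb.environment().putAll(extra); // portable replacement for the 'env' prefix
        return pb;
    }
}
```

Calling build() with the tokens "env", "FOO=out.txt", "exiftool",
"in.jpg" would give a ProcessBuilder whose command is just "exiftool
in.jpg", with FOO set in the child environment. Whether something like
this fits the external parser configuration is for the Tika developers
to judge.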



On Tue, 26 Aug 2025 at 22:03, Tim Allison <[email protected]> wrote:
>
> This is bad. I’m sorry for your experience with this.
>
> Let me see if I can get something working with our v2 external parsers.
>
> At the least, I agree that we need to fix our documentation.
>
> On Tue, Aug 26, 2025 at 6:11 AM Adrian Bird <[email protected]> wrote:
>>
>> Hi,
>>   I tried requesting a Jira account (birdya22) to report this issue
>> but the request was denied.
>>
>> The reply said I could submit PRs on github (I have an account), but I
>> didn't see how to do it (https://github.com/apache/tika/).
>>
>> So I've subscribed to this list and here are the details.
>>
>> I tried to get Tika and ExifTool to work together to process some JPEG
>> image files and came across a number of issues.
>> 1) Tika and ExifTool don't work on Windows
>> I used the Wiki page
>> 'https://cwiki.apache.org/confluence/display/TIKA/EXIFToolParser' to
>> understand how to do the integration.
>> Because I wasn't getting the metadata I expected, I used the
>> '--verbose' option and got a Java Exception which contained this text:
>>  "WARN  [main] 07:13:34,699
>> org.apache.tika.parser.external.ExternalParser problem with process
>> exec
>> java.io.IOException: Cannot run program "env": CreateProcess error=2,
>> The system cannot find the file specified"
>> The exception occurs because 'env' is not a valid Windows command.
>> I tracked this down to the file
>> 'org\apache\tika\parser\external\tika-external-parsers.xml' in the
>> Tika App jar where the command is:
>> '<command>env FOO=${OUTPUT} exiftool ${INPUT}</command>'
>> This doesn't work on Windows because 'env' does not exist as a command.
>>
>> 2) In the same file I noticed an entry for 'sox'. For the same reason
>> as ExifTool, Tika and sox won't work on Windows
>> The command is:
>> <command>env FOO=${OUTPUT} sox --info ${INPUT}</command>
>> Note I didn't find any information on 'sox' on the Wiki.
>>
>> 3) Looking at the file
>> 'org\apache\tika\parser\external\tika-external-parsers.xml' I noticed
>> that it only contains video related mime-types, meaning that I cannot
>> use it with image files. The Wiki page says:
>> 'EXIFTool is a wonderful tool that reads videos, images, audio and
>> other media files and that extracts EXIF metadata from them.'
>> I took this to mean that Tika can extract metadata from all 3 file
>> types, but that isn't the case as it only supports video files.
>> Given this can I suggest the Wiki page should be updated to make this clear.
>>
>> Adrian
