By looking at the test cases, I have got the External2.ExternalParser to work with ExifTool for a single file.
But, when I tried using Tika batch and Tika Pipes to process a directory of images I got exceptions in both cases. I can add details here, but I need to gather the details of the issues.

I will raise a JIRA about my original issue.

On Tue, 26 Aug 2025 at 22:03, Tim Allison <[email protected]> wrote:
>
> This is bad. I’m sorry for your experience with this.
>
> Let me see if I can get something working with our v2 external parsers.
>
> At the least, I agree that we need to fix our documentation.
>
> On Tue, Aug 26, 2025 at 6:11 AM Adrian Bird <[email protected]>
> wrote:
>>
>> Hi,
>> I tried requesting a Jira account (birdya22) to report this issue
>> but the request was denied.
>>
>> The reply said I could submit PRs on github (I have an account), but I
>> didn't see how to do it (https://github.com/apache/tika/).
>>
>> So I've subscribed to this list and here are the details.
>>
>> I tried to get Tika and ExifTool to work together to process some JPEG
>> image files and came across a number of issues.
>> 1) Tika and ExifTool don't work on Windows
>> I used the Wiki page
>> 'https://cwiki.apache.org/confluence/display/TIKA/EXIFToolParser' to
>> understand how to do the integration.
>> Because I wasn't getting the metadata I expected, I used the
>> '--verbose' option and got a Java Exception which contained this text:
>> "WARN [main] 07:13:34,699
>> org.apache.tika.parser.external.ExternalParser problem with process
>> exec
>> java.io.IOException: Cannot run program "env": CreateProcess error=2,
>> The system cannot find the file specified"
>> The exception occurs because 'env' is not a valid Windows command.
>> I tracked this down to the file
>> 'org\apache\tika\parser\external\tika-external-parsers.xml' in the
>> Tika App jar where the command is:
>> '<command>env FOO=${OUTPUT} exiftool ${INPUT}</command>'
>> This doesn't work on Windows because 'env' does not exist as a command.
>>
>> 2) In the same file I noticed an entry for 'sox'. For the same reason
>> as ExifTool, Tika and sox won't work on Windows
>> The command is:
>> <command>env FOO=${OUTPUT} sox --info ${INPUT}</command>
>> Note I didn't find any information on 'sox' on the Wiki.
>>
>> 3) Looking at the file
>> 'org\apache\tika\parser\external\tika-external-parsers.xml' I noticed
>> that it only contains video related mime-types, meaning that I cannot
>> use it with image files. The Wiki page says:
>> 'EXIFTool is a wonderful tool that reads videos, images, audio and
>> other media files and that extracts EXIF metadata from them.'
>> I took this to mean that Tika can extract metadata from all 3 file
>> types, but that isn't the case as it only supports video files.
>> Given this can I suggest the Wiki page should be updated to make this clear.
>>
>> Adrian
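
For reference, the 'env' failure in point 1 comes from using a Unix command to set an environment variable for the child process. In plain Java the same effect can be had on any platform by setting the variable on the ProcessBuilder itself. The following is only a minimal sketch of that idea, not Tika's actual code: the class name PortableExternalCommand, the helper build(), and the file names image.jpg and out.txt are all made up for illustration, and it assumes 'exiftool' is on the PATH.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class PortableExternalCommand {

    // Hypothetical helper (not part of Tika): prepares "program args..." with
    // extra environment variables, without relying on the Unix-only 'env' command.
    static ProcessBuilder build(Map<String, String> extraEnv, String program, String... args) {
        List<String> command = new ArrayList<>();
        command.add(program);
        for (String arg : args) {
            command.add(arg);
        }
        ProcessBuilder pb = new ProcessBuilder(command);
        // Equivalent of the "env FOO=..." prefix, but handled by the JVM on every OS.
        pb.environment().putAll(extraEnv);
        // Merge stderr into stdout so a caller only has one stream to read.
        pb.redirectErrorStream(true);
        return pb;
    }

    public static void main(String[] args) {
        // image.jpg and out.txt are placeholder names for this sketch.
        ProcessBuilder pb = build(Map.of("FOO", "out.txt"), "exiftool", "image.jpg");
        // Only inspect what would be run; calling pb.start() needs exiftool installed.
        System.out.println(pb.command());
        System.out.println(pb.environment().get("FOO"));
    }
}
```

The sketch only prepares the process and prints what would be run, so it can be tried without ExifTool installed. Calling pb.start() would launch the tool with FOO set in its environment.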
