This is bad. I’m sorry for your experience with this. Let me see if I can get something working with our v2 external parsers.
At the least, I agree that we need to fix our documentation.

On Tue, Aug 26, 2025 at 6:11 AM Adrian Bird <[email protected]> wrote:

> Hi,
> I tried requesting a Jira account (birdya22) to report this issue
> but the request was denied.
>
> The reply said I could submit PRs on github (I have an account), but I
> didn't see how to do it (https://github.com/apache/tika/).
>
> So I've subscribed to this list and here are the details.
>
> I tried to get Tika and ExifTool to work together to process some JPEG
> image files and came across a number of issues.
> 1) Tika and ExifTool don't work on Windows
> I used the Wiki page
> 'https://cwiki.apache.org/confluence/display/TIKA/EXIFToolParser' to
> understand how to do the integration.
> Because I wasn't getting the metadata I expected, I used the
> '--verbose' option and got a Java Exception which contained this text:
> "WARN [main] 07:13:34,699
> org.apache.tika.parser.external.ExternalParser problem with process
> exec
> java.io.IOException: Cannot run program "env": CreateProcess error=2,
> The system cannot find the file specified"
> The exception occurs because 'env' is not a valid Windows command.
> I tracked this down to the file
> 'org\apache\tika\parser\external\tika-external-parsers.xml' in the
> Tika App jar where the command is:
> '<command>env FOO=${OUTPUT} exiftool ${INPUT}</command>'
> This doesn't work on Windows because 'env' does not exist as a command.
>
> 2) In the same file I noticed an entry for 'sox'. For the same reason
> as ExifTool, Tika and sox won't work on Windows
> The command is:
> <command>env FOO=${OUTPUT} sox --info ${INPUT}</command>
> Note I didn't find any information on 'sox' on the Wiki.
>
> 3) Looking at the file
> 'org\apache\tika\parser\external\tika-external-parsers.xml' I noticed
> that it only contains video related mime-types, meaning that I cannot
> use it with image files.
> The Wiki page says:
> 'EXIFTool is a wonderful tool that reads videos, images, audio and
> other media files and that extracts EXIF metadata from them.'
> I took this to mean that Tika can extract metadata from all 3 file
> types, but that isn't the case as it only supports video files.
> Given this can I suggest the Wiki page should be updated to make this
> clear.
>
> Adrian
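
For reference on issue 1, here is a minimal sketch of how an external tool can be launched with an environment variable set without going through 'env', which is the part that fails on Windows. This is not Tika's code: the class name PortableExternalCommand, the build method, and the "out.txt" value are hypothetical, and FOO is only reused from the quoted command. It assumes exiftool is on the PATH.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of Apache Tika.
public class PortableExternalCommand {

    // Builds a process for "<tool> <args...> <input>" and sets FOO through the
    // ProcessBuilder environment map rather than the Unix-only `env` wrapper.
    public static ProcessBuilder build(List<String> toolAndArgs,
                                       String inputPath,
                                       String outputPath) {
        List<String> command = new ArrayList<>(toolAndArgs);
        command.add(inputPath);
        ProcessBuilder pb = new ProcessBuilder(command);
        // Equivalent of the FOO=... prefix in the quoted command; works on any OS.
        pb.environment().put("FOO", outputPath);
        return pb;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: PortableExternalCommand <input-file>");
            return;
        }
        // Runs exiftool on the given file and forwards its output to this console.
        Process p = build(List.of("exiftool"), args[0], "out.txt")
                .inheritIO()
                .start();
        System.exit(p.waitFor());
    }
}
```

The same call with List.of("sox", "--info") would cover the sox case from issue 2. Passing the arguments as a list also avoids depending on a shell to split the command line.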
