I went ahead and tried to piece together what I needed to do to test Tika with the code provided above.
tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml

One of the linked references suggested modifying the file above. Do I need to? I don't believe it is necessary, because the file only contains MIME type definitions and the image types are already defined. Does that seem correct?

tika-parsers/src/main/resources/META-INF/services/org.apache.tika.parser.Parser

I copied the HTMLRenderingEngine.java file into Tika-Parsers/src/main/java/org/apache/tika/parser/image/ under the same name, HTMLRenderingEngine.java. Then I opened the file at the path above and added a line with the following contents:

org.apache.tika.image.TikaImageExtractingParser

I rebuilt the project, packaged it, and ran it to see whether the new functionality worked. It ran, but did nothing new.

I am sorry for all the basic questions, but what am I missing? All I really need is an ImageParser that will save the embedded images to some arbitrary directory in addition to parsing the files. Is there some other package that I should use to perform this extraction before I parse the files with Tika?

--
View this message in context: http://lucene.472066.n3.nabble.com/Image-Extraction-tp3006668p3015474.html
Sent from the Apache Tika - Development mailing list archive at Nabble.com.
