I went ahead and tried to piece together what I needed to do to test Tika
with the code provided above. 


tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
One of the linked references suggested modifying the file above. Do I need
to? I don't believe it is necessary, because the file only contains
definitions of MIME types and the image types are already defined. Does that
seem correct?

tika-parsers/src/main/resources/META-INF/services/org.apache.tika.parser.Parser
I copied the HTMLRenderingEngine.java file into
Tika-Parsers/src/main/java/org/apache/tika/parser/image/ under the same name,
HTMLRenderingEngine.java. Then I opened the services file listed above and
added a line with the following contents:
org.apache.tika.image.TikaImageExtractingParser
I then rebuilt the project, packaged it, and ran it to see whether the new
functionality worked. It ran but did nothing new. I am sorry for all the
basic questions, but what am I missing?
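
One thing that can be checked here: each non-comment line in a
META-INF/services file is expected to be the fully qualified name of a class
that is actually on the classpath, so the package in the entry has to match
the package the class was compiled into. The sketch below is only a
diagnostic. ServiceEntryCheck and unresolved are hypothetical names, not part
of Tika, and the sample entries in main are for illustration; to be
meaningful it would need to run with the built tika-parsers classes on the
classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Tika): reports service-file entries
// that do not resolve to a loadable class on the current classpath.
public class ServiceEntryCheck {

    static List<String> unresolved(List<String> lines) {
        List<String> missing = new ArrayList<>();
        for (String raw : lines) {
            // Service files allow '#' comments; strip them and trim whitespace.
            int hash = raw.indexOf('#');
            String name = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (name.isEmpty()) {
                continue;
            }
            try {
                // Load without initializing; we only care whether the name resolves.
                Class.forName(name, false, ServiceEntryCheck.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Sample entries for illustration. Without the Tika jars on the
        // classpath the second entry is reported regardless of its package.
        List<String> entries = Arrays.asList(
                "java.lang.String",
                "org.apache.tika.image.TikaImageExtractingParser");
        for (String name : unresolved(entries)) {
            System.out.println("Not found on classpath: " + name);
        }
    }
}
```

If the entry added to the services file shows up in this list while the
parser jar is on the classpath, the registered name and the class's actual
package probably disagree.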

All I really need is an ImageParser that will save the embedded images to
some arbitrary directory in addition to parsing the files. Is there some
other package that I should use to perform this extraction before I parse
the files with Tika?
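
As a rough illustration of the kind of pre-extraction step meant here, the
sketch below uses only java.util.zip and does not involve Tika at all. It
assumes the input is a ZIP-based container (for example OOXML or ODF files,
which store embedded images as ordinary ZIP entries) and guesses images by
file extension; neither assumption holds for formats such as PDF or legacy
.doc. ZipImageExtractor, looksLikeImage and extractImages are hypothetical
names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper (not part of Tika): copies image entries out of a
// ZIP-based document into a flat output directory.
public class ZipImageExtractor {

    // Assumption: an entry is an image if its name has a common image extension.
    static boolean looksLikeImage(String name) {
        String n = name.toLowerCase(Locale.ROOT);
        return n.endsWith(".png") || n.endsWith(".jpg")
                || n.endsWith(".jpeg") || n.endsWith(".gif");
    }

    static List<Path> extractImages(Path document, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        List<Path> written = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(document))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (entry.isDirectory() || !looksLikeImage(name)) {
                    continue;
                }
                // Keep only the last path segment so entries cannot escape outDir.
                int cut = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\')) + 1;
                Path target = outDir.resolve(name.substring(cut));
                // Reads up to the end of the current entry; same-named files are overwritten.
                Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                written.add(target);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ZipImageExtractor <document> <output-dir>");
            return;
        }
        for (Path p : extractImages(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println("Saved " + p);
        }
    }
}
```

The same file could then be handed to Tika for parsing as usual; this step
only pulls the images out beforehand.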


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Image-Extraction-tp3006668p3015474.html
Sent from the Apache Tika - Development mailing list archive at Nabble.com.
