Hi,

I am migrating a Jackrabbit instance from 2.2.13 to 2.6.2 using:

jackrabbit-core
jackrabbit-jcr-commons
jackrabbit-jcr-rmi

For indexing I am using the tika-core module and parts of tika-parsers. Because the full tika-parsers module causes problems (among others, its aspectjrt-1.6.x.jar conflicts with my one-jar package meccano), I am trying to include only the parser classes and dependencies I need to index .pdf and .xml files.

Indexing via the PDFParser works, but the DcXMLParser is never executed and no XML content ends up in the index. When I configure the EmptyParser for the application/xml MIME type instead, the EmptyParser is not called either.

What confuses me is that the PDFParser configuration is read from tika-config.xml (I can prove this by deliberately misspelling the class name) and the parser is called at runtime. The XML parser entry is read as well, but that parser is not called at runtime.

tika-config.xml:

...
<mimeTypeRepository resource="/org/apache/tika/mime/tika-mimetypes.xml" magic="false"/>
<parsers>
  <parser name="parse-pdf" class="org.apache.tika.parser.pdf.PDFParser">
    <mime>application/pdf</mime>
  </parser>
  <parser name="parse-dcxml" class="org.apache.tika.parser.xml.DcXMLParser">
    <mime>application/xml</mime>
    <mime>image/svg+xml</mime>
  </parser>
  <parser class="org.apache.tika.parser.DefaultParser"/>
  <parser class=" org.apache.tika.parser.EmptyParser ">
    <!-- <mime>application/xml</mime> -->
  </parser>
</parsers>
....

The XML files have the MIME type application/xml.

The other configuration file, /resources/META-INF/services/org.apache.tika.parser.Parser, is in a sub-jar of the one-jar package. Because that had no effect, I moved it outside the jar and referenced it explicitly on the classpath at startup, but that had no effect either. Is this file needed for the parsers to work?

Thanks for any hints!
Paul
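
P.S. To narrow down the service-file question, here is a minimal sketch of the check I would run inside the application. It only uses the JDK and lists every copy of META-INF/services/org.apache.tika.parser.Parser that the class loader can see, together with the parser class names in each copy. The class name ServiceFileCheck and its helper methods are made up for illustration, and I am assuming that the context class loader is the one that matters; under one-jar's custom class loader the result may well differ from a plain classpath run.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper, not part of Tika or Jackrabbit.
public class ServiceFileCheck {

    // Standard Java service-provider location for the Tika Parser interface.
    static final String SERVICE_FILE =
            "META-INF/services/org.apache.tika.parser.Parser";

    // Returns every copy of the service file visible to the given class loader.
    static List<URL> findServiceFiles(ClassLoader loader) throws IOException {
        return Collections.list(loader.getResources(SERVICE_FILE));
    }

    // Extracts provider class names: strips '#' comments and blank lines.
    static List<String> providerNames(List<String> lines) {
        List<String> names = new ArrayList<>();
        for (String line : lines) {
            int hash = line.indexOf('#');
            String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Reads all lines of one service file.
    static List<String> readLines(URL url) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the context class loader is the relevant one here.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ServiceFileCheck.class.getClassLoader();
        }
        List<URL> files = findServiceFiles(loader);
        if (files.isEmpty()) {
            System.out.println("No service file visible at " + SERVICE_FILE);
        }
        for (URL url : files) {
            System.out.println(url);
            for (String name : providerNames(readLines(url))) {
                System.out.println("  " + name);
            }
        }
    }
}
```

If this prints nothing for the file in the sub-jar, the file is probably not visible at META-INF/services/ from that class loader (note that my path starts with /resources/, which may be the reason), but that is only a guess.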
