I am trying to use the NLP functionality of the Tika framework through the Solr ExtractingRequestHandler. I am indexing PDF documents and have been successful in extracting and indexing the content, but I have not been successful in engaging the NLP routines. I have reached the point where I am even trying to generate an exception just to validate my understanding of the interfaces. I have included parts of my solrconfig.xml and my Tika config below. I am using the techproducts example and Solr 6.3.0.

solrconfig.xml --- NLP models (en-ner-organization.bin etc.)

    <lib dir="${solr.install.dir:../../../..}/contrib/extraction/lib" regex=".*\.bin" />
    <lib dir="${solr.install.dir:../../../..}/contrib/extraction/lib" regex=".*\.jar" />
    <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-cell-\d.*\.jar" />

solrconfig.xml

    <requestHandler name="/update/extract"
                    startup="lazy"
                    class="solr.extraction.ExtractingRequestHandler" >
      <lst name="defaults">
        <str name="lowernames">true</str>
        <str name="uprefix">attr_</str>
        <str name="tika.config">tika-config.xml</str>
        <!-- capture link hrefs but ignore div attributes -->
        <str name="captureAttr">true</str>
        <str name="fmap.a">links</str>
        <str name="fmap.div">ignored_</str>
      </lst>
    </requestHandler>

Tika-Config.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <properties>
      <parsers>
        <parser class="org.apache.tika.parser.ner.NamedEntityParser">
          <mime>text/plain</mime>
          <mime>text/html</mime>
          <mime>application/xhtml+xml</mime>
          <mime>application/pdf</mime>
        </parser>
      </parsers>
    </properties>
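
As a sanity check on the Tika config itself, here is a minimal sketch (assuming Python 3 and only the standard library) that parses the parser section shown above and lists which MIME types are mapped to which parser class. The function name `parser_mime_map` is hypothetical, and the config is inlined rather than read from disk so the check is self-contained.

```python
import xml.etree.ElementTree as ET

# The Tika config from the question, inlined so the check needs no files.
TIKA_CONFIG = """<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.ner.NamedEntityParser">
      <mime>text/plain</mime>
      <mime>text/html</mime>
      <mime>application/xhtml+xml</mime>
      <mime>application/pdf</mime>
    </parser>
  </parsers>
</properties>
"""


def parser_mime_map(config_xml):
    """Return {parser class name: [mime types]} for each <parser> under <parsers>.

    Raises xml.etree.ElementTree.ParseError if the config is not well-formed.
    """
    root = ET.fromstring(config_xml.encode("utf-8"))
    return {
        parser.get("class"): [mime.text.strip() for mime in parser.findall("mime")]
        for parser in root.findall("./parsers/parser")
    }


if __name__ == "__main__":
    for parser_class, mimes in parser_mime_map(TIKA_CONFIG).items():
        print(parser_class, mimes)
```

This only confirms that the file is well-formed and that application/pdf is listed for the NER parser. It says nothing about whether Solr actually picks up the file named in `tika.config` or whether the model files are found at runtime.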