[ 
https://issues.apache.org/jira/browse/SOLR-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960411#comment-14960411
 ] 

Uwe Schindler commented on SOLR-8166:
-------------------------------------

You are also using the wrong classloader for this. It will break 
with Solr plugins. You have to pass the SolrResourceLoader down to your parser, 
and the parser has to call the class loading methods it provides. Don't use 
native ClassLoaders.
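
To make the point concrete, here is a minimal sketch of the pattern, not the 
actual patch. It compiles without Solr or Tika on the classpath, so 
PluginAwareLoader, DemoLoader, FakePdfParserConfig and ParseContextEntryFactory 
are all hypothetical stand-ins. In Solr itself the loader role would be played 
by SolrResourceLoader, whose findClass(String, Class) and 
newInstance(String, Class) methods should be used instead of Class.forName() 
with a native classloader.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for the class-loading side of SolrResourceLoader,
// so that this sketch compiles without Solr on the classpath.
interface PluginAwareLoader {
    <T> Class<? extends T> findClass(String className, Class<T> expectedType)
            throws ClassNotFoundException;
}

// Hypothetical stand-in for org.apache.tika.parser.pdf.PDFParserConfig.
class FakePdfParserConfig {
    private boolean extractInlineImages;
    public void setExtractInlineImages(boolean value) { extractInlineImages = value; }
    public boolean getExtractInlineImages() { return extractInlineImages; }
}

// Demo loader (assumption): wraps whatever classloader it is handed.
// A real resource loader would also search the plugin/lib directories.
final class DemoLoader implements PluginAwareLoader {
    private final ClassLoader delegate;

    DemoLoader(ClassLoader delegate) { this.delegate = delegate; }

    @Override
    public <T> Class<? extends T> findClass(String className, Class<T> expectedType)
            throws ClassNotFoundException {
        return Class.forName(className, true, delegate).asSubclass(expectedType);
    }
}

// The config code never touches a ClassLoader directly; it only uses
// the loader that was passed down to it.
final class ParseContextEntryFactory {
    private final PluginAwareLoader loader;

    ParseContextEntryFactory(PluginAwareLoader loader) { this.loader = loader; }

    // Instantiates className through the injected loader and applies one
    // boolean property via its JavaBean-style setter.
    <T> T create(String className, Class<T> expectedType, String property, boolean value)
            throws Exception {
        Class<? extends T> clazz = loader.findClass(className, expectedType);
        T instance = clazz.getDeclaredConstructor().newInstance();
        String setter = "set" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        Method method = clazz.getMethod(setter, boolean.class);
        method.invoke(instance, value);
        return instance;
    }
}

final class ResourceLoaderSketch {
    public static void main(String[] args) throws Exception {
        PluginAwareLoader loader = new DemoLoader(ResourceLoaderSketch.class.getClassLoader());
        ParseContextEntryFactory factory = new ParseContextEntryFactory(loader);
        FakePdfParserConfig config = factory.create(
                FakePdfParserConfig.class.getName(), FakePdfParserConfig.class,
                "extractInlineImages", true);
        System.out.println("extractInlineImages = " + config.getExtractInlineImages());
    }
}
```

The design point is only that the loader is injected: classes that live in 
plugin jars are visible to the resource loader but not necessarily to the 
classloader that loaded the handler itself.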

> Introduce possibility to configure ParseContext in 
> ExtractingRequestHandler/ExtractingDocumentLoader
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-8166
>                 URL: https://issues.apache.org/jira/browse/SOLR-8166
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - Solr Cell (Tika extraction)
>    Affects Versions: 5.3
>            Reporter: Andriy Binetsky
>
> Currently there is no way to pass additional configuration to document 
> extraction with ExtractingRequestHandler/ExtractingDocumentLoader.
> For example, I need to put an org.apache.tika.parser.pdf.PDFParserConfig with 
> "extractInlineImages" set to true into the ParseContext to trigger extraction 
> and OCR of images embedded in PDFs. 
> It would be nice to be able to configure the created ParseContext through an 
> XML config file, as TikaConfig does.
> I would suggest the following:
> solrconfig.xml:
>   <requestHandler name="/update/extract" 
> class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
>     <str name="parseContext.config">parseContext.config</str>
>   </requestHandler>
> parseContext.config:
> <entries>
>   <entry class="org.apache.tika.parser.pdf.PDFParserConfig" 
> value="org.apache.tika.parser.pdf.PDFParserConfig">
>     <property name="extractInlineImages" value="true"/>
>   </entry>
> </entries>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
