[ https://issues.apache.org/jira/browse/SOLR-7189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352813#comment-14352813 ]
Tim Allison commented on SOLR-7189:
-----------------------------------

Got it. If anyone has an interest, I'll draft a patch, but otherwise this should do for now. Thank you, again!

> Allow DIH to extract content from embedded documents via Tika
> -------------------------------------------------------------
>
>                 Key: SOLR-7189
>                 URL: https://issues.apache.org/jira/browse/SOLR-7189
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 5.0
>            Reporter: Tim Allison
>            Assignee: Shalin Shekhar Mangar
>            Priority: Minor
>             Fix For: Trunk, 5.1
>
>         Attachments: SOLR-7189.patch, test_recursive_embedded.docx
>
>
> DIH's TikaEntityProcessor doesn't currently extract content from embedded
> documents/attachments within a file. It might be useful if users could
> configure whether or not to include extraction of content from embedded
> documents.