[ https://issues.apache.org/jira/browse/SOLR-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13611488#comment-13611488 ]
Jan Høydahl commented on SOLR-1605:
-----------------------------------
I'm not sure this is a great idea. You could add an option to store the source
as a BinaryField or something, but what good does it do to have a 500 MB media
file in your index? Or do you want to store the parsed and structured XHTML
output from Tika in a stored field? I'm afraid that output is not meant for
pretty display.
> ExtractingRequestHandler does not embed original document
> ---------------------------------------------------------
>
> Key: SOLR-1605
> URL: https://issues.apache.org/jira/browse/SOLR-1605
> Project: Solr
> Issue Type: Wish
> Components: contrib - Solr Cell (Tika extraction)
> Affects Versions: 1.4
> Reporter: Lance Norskog
>
> The ExtractingRequestHandler does not have the option to embed the original
> document file as a saved field.
> This would be generally useful for content management system purposes, since
> the search index can also directly serve the content, making for a much
> simpler system architecture.
> My use case is to highlight indexed HTML. Since the raw HTML text is not
> indexed, it is not possible to request it highlighted.