[ 
https://issues.apache.org/jira/browse/SOLR-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13611488#comment-13611488
 ] 

Jan Høydahl commented on SOLR-1605:
-----------------------------------

I'm not sure this is a great idea. You could add an option to store the source 
as a BinaryField or something, but what good does it do to have a 500 MB media 
file in your index? Or do you want to store the parsed and structured XHTML 
output from Tika in a stored field? I'm afraid that output is not meant for 
pretty display.
                
> ExtractingRequestHandler does not embed original document
> ---------------------------------------------------------
>
>                 Key: SOLR-1605
>                 URL: https://issues.apache.org/jira/browse/SOLR-1605
>             Project: Solr
>          Issue Type: Wish
>          Components: contrib - Solr Cell (Tika extraction)
>    Affects Versions: 1.4
>            Reporter: Lance Norskog
>
> The ExtractingRequestHandler does not have the option to embed the original 
> document file as a saved field. 
> This would be generally useful for content management system purposes, since 
> the search index can also directly serve the content, making for a much 
> simpler system architecture.
> My use case is to highlight indexed HTML. Since the raw HTML text is not 
> indexed, it is not possible to request it highlighted.
