I indexed a structured PDF document in Solr. The problem is that when I search for a simple string, I get the entire content field back as the response, and I don't know how to change that.
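For context, Solr's highlighting component is switched on per request through the hl parameters. Below is a minimal sketch of such a request, not a confirmed fix. It assumes the default /select handler on localhost:8983, that the extracted content is stored in a field named "text", and that three fragments of about 150 characters are wanted. The helper name highlight_url is made up for illustration.

```shell
# Assumptions (not confirmed for this setup): default /select handler,
# extracted content stored in a field called "text".
SOLR_SELECT="http://localhost:8983/solr/select"

# Hypothetical helper: build a select URL that returns only the id field
# in the main result and asks for highlighted fragments from "text".
highlight_url() {
  term="$1"
  printf '%s?q=text:%s&fl=id&hl=true&hl.fl=text&hl.snippets=3&hl.fragsize=150\n' \
    "$SOLR_SELECT" "$term"
}

# Print the request URL; against a live Solr it could be run with:
#   curl "$(highlight_url metadata)"
highlight_url metadata
```

Here fl=id keeps the full stored text out of the main result list, while hl=true, hl.fl, hl.snippets and hl.fragsize request the fragments, which Solr returns in a separate highlighting section of the response. Highlighting only works on fields that are stored.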
My requirement is this: say I search for "metadata". The result should be a snippet such as

"*Metadata*Discussion . . . 4 matches ... make sure that Tika users have a chance to get to all of the* metadata* created and/or extracted by Tika. == Original Problem == The original inspiration for this page was a Tika ... 10.7k - rev: 2 (current) last modified: 2010-08-02 18:09:45"

Instead, it gives me the whole document, that is, the entire string that was indexed. It seems that Lucene can only tell me which field the match occurred in, not where in the field it occurred.

I posted the document like this:

    curl "http://localhost:8983/solr/update/extract?stream.file=/home/Desktop/DOCUMENTS/T.pdf&stream.contentType=application/pdf&literal.id=DOC_N&commit=true&captureAttr=true"

A query of *:* returns the entire content of the indexed document in the <text> field, and any other search returns the same thing.

Any help will be greatly appreciated!

--
View this message in context: http://lucene.472066.n3.nabble.com/Not-able-to-use-the-highlighting-feature-Want-to-return-snippets-of-text-tp3985012.html
Sent from the Solr - User mailing list archive at Nabble.com.