I indexed a structured PDF document in Solr. The problem is that when I
search for a simple string, I get the entire content field back in the
response, and I don't know how to change that.

What I need is this: if I search for "metadata", it should give me a snippet like

"*Metadata* Discussion . . . 4 matches ... make sure that Tika users have a
chance to get to all of the *metadata* created and/or extracted by Tika. ==
Original Problem == The original inspiration for this page was a Tika ...
10.7k - rev: 2 (current) last modified: 2010-08-02 18:09:45"

But it gives me the whole document, that is, the entire string that was
indexed. It seems like Lucene can only tell me in which field the match
occurred, not where in the field it occurred.

I posted the document like this:

   curl "http://localhost:8983/solr/update/extract?stream.file=/home/Desktop/DOCUMENTS/T.pdf&stream.contentType=application/pdf&literal.id=DOC_N&commit=true&captureAttr=true"
 

A query of *:* returns the entire content of the indexed document in the
<text> field, and any other search returns the same thing.
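
For reference, here is a minimal sketch of the kind of query that asks Solr for
snippets instead of the whole field. It rests on several assumptions: that the
select handler lives at /solr/select on the same host and port as the update
URL above, that the field is named text and is stored (highlighting needs a
stored field), and that the values for hl.snippets and hl.fragsize are only
illustrative. The helper name build_query_url is made up for this example.

```shell
#!/bin/sh
# Base URL of the select handler. Host and port come from the update URL
# above; the /solr/select path assumes a default single-core setup.
SOLR_SELECT="http://localhost:8983/solr/select"

# Hypothetical helper: build a query URL that requests highlighted fragments.
#   q           - the query string passed as $1
#   fl=id       - return only the id field, not the full stored content
#   hl=true     - turn on highlighting
#   hl.fl=text  - field to build snippets from (assumed to be stored)
#   hl.snippets - maximum number of fragments per document (illustrative)
#   hl.fragsize - approximate fragment size in characters (illustrative)
build_query_url() {
    printf '%s?q=%s&fl=id&hl=true&hl.fl=text&hl.snippets=3&hl.fragsize=200&wt=json' \
        "$SOLR_SELECT" "$1"
}

QUERY_URL=$(build_query_url "text:metadata")
echo "$QUERY_URL"

# To send it to a running Solr instance:
#   curl "$QUERY_URL"
```

If highlighting is active, the fragments should come back in a separate
highlighting section of the response, keyed by document id, rather than inside
the document list itself.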

Any help will be greatly appreciated!

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Not-able-to-use-the-highlighting-feature-Want-to-return-snippets-of-text-tp3985012.html
Sent from the Solr - User mailing list archive at Nabble.com.
