If you don't care about sentence boundaries and just want a window around the target terms, with concordance functionality (sort by the words before the target, after it, etc.), you might check out LUCENE-5317, which is available as a standalone jar on my GitHub site [1] and through Maven Central.
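
To make the window-plus-sorting idea concrete, here is a minimal sketch in plain Python. It is not the LUCENE-5317 code or its API: the function names (kwic, sort_by_before, sort_by_after) and the width parameter are made up for illustration, and it tokenizes naively on word characters instead of using an analyzer.

```python
import re


def kwic(text, target, width=3):
    """Return (before, hit, after) token windows around each match of target.

    Hypothetical helper for illustration only; matching is case-insensitive
    and tokens are runs of word characters.
    """
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == target.lower():
            before = tokens[max(0, i - width):i]
            after = tokens[i + 1:i + 1 + width]
            rows.append((before, tok, after))
    return rows


def sort_by_after(rows):
    """Sort windows by the words that follow the target."""
    return sorted(rows, key=lambda r: [t.lower() for t in r[2]])


def sort_by_before(rows):
    """Sort windows by the words that precede the target, nearest word first."""
    return sorted(rows, key=lambda r: [t.lower() for t in reversed(r[0])])


if __name__ == "__main__":
    sample = ("Revenue growth was strong. Slow growth hurt margins. "
              "Growth in Asia continued.")
    for before, hit, after in sort_by_after(kwic(sample, "growth", width=2)):
        print(" ".join(before).rjust(20), "[" + hit + "]", " ".join(after))
```

The windows here ignore sentence boundaries on purpose, which is the trade-off described above: you get context around every hit, but a window can span two sentences.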
Using a highlighter, too, will get you close.

See a crummy image of LUCENE-5317 [2] or the full presentation [3].

[1] https://github.com/tballison/lucene-addons/tree/6.5-0.1
[2] https://twitter.com/_tallison/status/852492398793981952
[3] https://github.com/tballison/share/blob/master/slides/TextProcessingAndAdvancedSearch_tallison_MITRE_201510_final_abbrev.pdf slide 23ff.

-----Original Message-----
From: ankur [mailto:ankur.sancheti.netw...@gmail.com]
Sent: Thursday, April 13, 2017 12:08 PM
To: solr-user@lucene.apache.org
Subject: Re: keyword-in-content for PDF document

Thanks Alex. Yes, I am using TIKA. So, to some extent it preserves the text flow.

There is something interesting in your reply, "Or you could try using highlighter to return only the sentence. ". I didnt understand that bit. How do we use Highlighter to return the sentence?

To make sure, I want to return all sentences where the word "Growth" appears.

--
View this message in context: http://lucene.472066.n3.nabble.com/keyword-in-context-for-PDF-document-tp4329754p4329794.html
Sent from the Solr - User mailing list archive at Nabble.com.