Hi Furkan,

I have implemented this before with a custom filler sequence (special characters) inserted between sentences. A better solution I arrived at was to increase the position of each sentence's first token by a large number, like 10000 (a smaller number could perhaps be used too). A user search can then be run as a proximity query: "some tokens"~5000 (the recently committed complexphrase parser supports rich phrase syntax, for example). This of course assumes that every sentence fits within the 5000-position window and that the total number of sentences in the field multiplied by 10000 does not exceed Integer.MAX_VALUE. On the highlighter side you would then naturally get hits that fall within a single sentence.
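To make the position-gap idea concrete, here is a minimal sketch in plain Python. It only simulates the arithmetic and is not Lucene or Solr code: the names `assign_positions` and `matches_within` are hypothetical, the whitespace/word tokenization is an assumption, and the proximity check is a simplification (all terms within a span of `slop` positions) rather than Lucene's actual slop semantics. The 10000 and 5000 values are the ones mentioned above.

```python
import re
from itertools import product

SENTENCE_GAP = 10_000   # position jump applied to each sentence's first token
SLOP = 5_000            # proximity window used at query time
INT_MAX = 2**31 - 1     # Java's Integer.MAX_VALUE


def assign_positions(sentences, gap=SENTENCE_GAP):
    """Return (token, position) pairs for a list of sentences.

    Tokens normally advance by 1; the first token of every sentence
    after the first advances by `gap` instead.
    """
    indexed = []
    pos = -1
    for i, sentence in enumerate(sentences):
        tokens = re.findall(r"\w+", sentence.lower())
        for j, token in enumerate(tokens):
            pos += gap if (j == 0 and i > 0) else 1
            indexed.append((token, pos))
    return indexed


def matches_within(indexed, terms, slop=SLOP):
    """Simplified proximity test: True if one occurrence of every term
    can be chosen so that the chosen positions span at most `slop`."""
    occurrences = [[p for t, p in indexed if t == term] for term in terms]
    if any(not occ for occ in occurrences):
        return False
    return any(max(combo) - min(combo) <= slop for combo in product(*occurrences))


# Upper bound on sentences per field before positions overflow a Java int.
max_sentences = INT_MAX // SENTENCE_GAP

sentences = ["Solr builds snippets.", "Sentences stay apart."]
indexed = assign_positions(sentences)
same_sentence = matches_within(indexed, ["solr", "snippets"])
cross_sentence = matches_within(indexed, ["snippets", "sentences"])
```

In this sketch, terms from the same sentence match, while the last token of one sentence and the first token of the next sit 10000 positions apart, outside the 5000 window, so a cross-sentence match is rejected. In a real setup the position jump would be applied at index time in the analysis chain.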
Is this something you are looking for?

Dmitry

On Mon, Mar 24, 2014 at 5:43 PM, Furkan KAMACI <furkankam...@gmail.com> wrote:

> Hi;
>
> When I generate snippet via Solr I do not want to remove beginning of any
> sentence at the snippet. So I need to do a sentence detection. I think that
> I can do it before I send documents into Solr. I can put some special
> characters that signs beginning or end of a sentence. Then I can use that
> information when generating snippet. On the other hand I should not show
> that special character to the user.
>
> What do you think that how can I do it or do you have any other ideas for
> my purpose?
>
> PS: I do not do it for English sentences.
>
> Thanks;
> Furkan KAMACI
>

--
Dmitry
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan