Hi Furkan,

I have implemented this with a custom filler sequence (a special character)
between sentences. A better solution I landed on was to increase the position
of each sentence's first token by a large number, such as 10000 (a smaller
number might work too). A user search can then be run as a proximity query:
"some tokens"~5000 (the recently committed complexphrase parser, for example,
supports rich phrase syntax). This of course assumes that every sentence fits
within the 5000-position window and that the total number of sentences in the
field * 10k does not exceed Integer.MAX_VALUE. On the highlighter side you
would then get the hits within sentences naturally.
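
To make the arithmetic concrete, here is a minimal sketch in plain Python. It
is not Lucene or Solr code: the function names are made up for illustration,
whitespace splitting stands in for a real analyzer, and the distance check is
only a simplified stand-in for how a sloppy phrase query is scored. It just
shows that with a gap of 10000 and a slop of 5000, two tokens can only match
when they sit in the same sentence.

```python
SENTENCE_GAP = 10_000   # position jump at each sentence start (value from above)
SLOP = 5_000            # proximity window used at query time (value from above)
INT_MAX = 2**31 - 1     # Java's Integer.MAX_VALUE

def assign_positions(sentences):
    """Hypothetical helper: return (token, position) pairs for a field.

    The first token of every sentence after the first gets a position
    increment of SENTENCE_GAP; all other tokens get an increment of 1.
    """
    positions = []
    pos = -1
    for index, sentence in enumerate(sentences):
        tokens = sentence.split()  # assumption: whitespace tokenization
        if len(tokens) > SLOP:
            raise ValueError("sentence does not fit the proximity window")
        for offset, token in enumerate(tokens):
            starts_later_sentence = offset == 0 and index > 0
            pos += SENTENCE_GAP if starts_later_sentence else 1
            if pos > INT_MAX:
                raise OverflowError("position exceeds Integer.MAX_VALUE")
            positions.append((token, pos))
    return positions

def within_slop(positions, a, b, slop=SLOP):
    """Simplified proximity test: some a and some b are at most slop apart."""
    pos_a = [p for t, p in positions if t == a]
    pos_b = [p for t, p in positions if t == b]
    return any(abs(x - y) <= slop for x in pos_a for y in pos_b)

# Rough upper bound on sentences per field before positions overflow.
MAX_SENTENCES = INT_MAX // SENTENCE_GAP

field = assign_positions(["the cat sat", "a dog ran"])
print(within_slop(field, "cat", "sat"))  # same sentence: True
print(within_slop(field, "sat", "dog"))  # adjacent sentences: False
```

With these numbers the bound works out to roughly 214 thousand sentences per
field, so a smaller gap buys more headroom at the cost of a smaller maximum
sentence length.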

Is this something you are looking for?

Dmitry



On Mon, Mar 24, 2014 at 5:43 PM, Furkan KAMACI <furkankam...@gmail.com> wrote:

> Hi;
>
> When I generate a snippet via Solr, I do not want the beginning of any
> sentence to be cut off in the snippet. So I need to do sentence detection.
> I think I can do it before I send documents into Solr: I can insert some
> special characters that mark the beginning or end of a sentence, and then
> use that information when generating the snippet. On the other hand, I
> should not show those special characters to the user.
>
> How do you think I can do this, or do you have any other ideas for my
> purpose?
>
> PS: I am not doing this for English sentences.
>
> Thanks;
> Furkan KAMACI
>



-- 
Dmitry
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
