Re: How to extract 15/20 words around the matched query after getting results from lucene searcher?

KK Mon, 25 May 2009 02:21:17 -0700

Thanks for your response @Seid.
Can any Lucene user give me directions on this regard? I'm stuck.
 Really appreciate your help.


Thanks,
KK

On Mon, May 25, 2009 at 2:43 PM, Seid Muhie <seidy...@gmail.com> wrote:

> actually I used the normal java standard libraries for this work. I
> used lucene only to retrieve the relevant document.
> what you will do is, thought it is to manuall, as i don't know the way
> it can be done by the Lucene API, you just record the location of the
> query terms in the document (it is as easy as indexOf(query terms)).
> But you have to be very aware of the speed of the system. then you can
> go ahead or back, as you want.
>
> Once again, I think there will be also a LUcene workaround, that I am
> not aware of it at all.
>
> Seid M
>
> On 5/25/09, KK <dioxide.softw...@gmail.com> wrote:
> > Thanks for your quick response, Seid.
> >
> > There is one more mail I found in the archive[3/4 days old] where someone
> > asked about extracting 3 neighbors words around the match. I think once
> you
> > have the position of matching term/phrase then extracting 3 or 30
> neighbors
> > wont be different, right? because you just have to move back/forward and
> get
> > the words, this sounds logically simple but I dont know how simple is
> this
> > implementation-wise.
> > Also people are talking about someting called spanQueries/termvectors etc
> to
> > use for this purpose. I'm still to get the exact idea of how to do this.
> > As per your mail, you used Java to extract the neighbors, Is that using
> the
> > standard techniques i.e using those spanqueries/termvectors or something
> > else.
> > If you can elaborate all this a bit It'd be very helpful.
> >
> > Thank you.
> > KK>
> >
> > On Mon, May 25, 2009 at 10:51 AM, Seid Muhie <seidy...@gmail.com> wrote:
> >
> >> for my thesis work (Question Answering) I used to retrieve first the
> >> document and then play with java to extract the needed answer.
> >> for your case what you will do is first locate the positions of the
> >> query terms in the document (in this case it might be distributed
> >> throughout the document - hence difficult to get the 15/20 words) then
> >> count something10 words forward and backward and extract the match.
> >> this is the way I handle my problem. Hope there might be different I
> >> dea too
> >>
> >> Seid M.
> >>
> >> On 5/25/09, KK <dioxide.softw...@gmail.com> wrote:
> >> > Hi All,
> >> > I'm trying to index some non-english web pages and I'm keeping all the
> >> > content of the page in a single field and the searches are working
> fine
> >> as
> >> > well. Now when I search for some query it gives the complete page,
> which
> >> is
> >> > expected. Now I want to restrict the showing of results to say 20
> words
> >> > around the match, something like google does, otherwise we cann't make
> >> users
> >> > to look for a match in the whole page content[I'll use highlighter
> after
> >> > this is done]. So getting positions of the matched word/phrase might
> >> > help
> >> so
> >> > that I can extract some words before and some words after that and
> will
> >> show
> >> > that to end user. Any idea on doing the same will be very helpful.
> Thank
> >> > you.
> >> >
> >> > KK.
> >> >
> >>
> >>
> >> --
> >> "RABI ZIDNI ILMA"
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: java-user-h...@lucene.apache.org
> >>
> >>
> >
>
>
> --
> "RABI ZIDNI ILMA"
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>

Re: How to extract 15/20 words around the matched query after getting results from lucene searcher?

Reply via email to