Thanks for your response @Seid. Can any Lucene user give me directions on this regard? I'm stuck. Really appreciate your help.
Thanks, KK On Mon, May 25, 2009 at 2:43 PM, Seid Muhie <seidy...@gmail.com> wrote: > actually I used the normal java standard libraries for this work. I > used lucene only to retrieve the relevant document. > what you will do is, thought it is to manuall, as i don't know the way > it can be done by the Lucene API, you just record the location of the > query terms in the document (it is as easy as indexOf(query terms)). > But you have to be very aware of the speed of the system. then you can > go ahead or back, as you want. > > Once again, I think there will be also a LUcene workaround, that I am > not aware of it at all. > > Seid M > > On 5/25/09, KK <dioxide.softw...@gmail.com> wrote: > > Thanks for your quick response, Seid. > > > > There is one more mail I found in the archive[3/4 days old] where someone > > asked about extracting 3 neighbors words around the match. I think once > you > > have the position of matching term/phrase then extracting 3 or 30 > neighbors > > wont be different, right? because you just have to move back/forward and > get > > the words, this sounds logically simple but I dont know how simple is > this > > implementation-wise. > > Also people are talking about someting called spanQueries/termvectors etc > to > > use for this purpose. I'm still to get the exact idea of how to do this. > > As per your mail, you used Java to extract the neighbors, Is that using > the > > standard techniques i.e using those spanqueries/termvectors or something > > else. > > If you can elaborate all this a bit It'd be very helpful. > > > > Thank you. > > KK> > > > > On Mon, May 25, 2009 at 10:51 AM, Seid Muhie <seidy...@gmail.com> wrote: > > > >> for my thesis work (Question Answering) I used to retrieve first the > >> document and then play with java to extract the needed answer. > >> for your case what you will do is first locate the positions of the > >> query terms in the document (in this case it might be distributed > >> throughout the document - hence difficult to get the 15/20 words) then > >> count something10 words forward and backward and extract the match. > >> this is the way I handle my problem. Hope there might be different I > >> dea too > >> > >> Seid M. > >> > >> On 5/25/09, KK <dioxide.softw...@gmail.com> wrote: > >> > Hi All, > >> > I'm trying to index some non-english web pages and I'm keeping all the > >> > content of the page in a single field and the searches are working > fine > >> as > >> > well. Now when I search for some query it gives the complete page, > which > >> is > >> > expected. Now I want to restrict the showing of results to say 20 > words > >> > around the match, something like google does, otherwise we cann't make > >> users > >> > to look for a match in the whole page content[I'll use highlighter > after > >> > this is done]. So getting positions of the matched word/phrase might > >> > help > >> so > >> > that I can extract some words before and some words after that and > will > >> show > >> > that to end user. Any idea on doing the same will be very helpful. > Thank > >> > you. > >> > > >> > KK. > >> > > >> > >> > >> -- > >> "RABI ZIDNI ILMA" > >> > >> --------------------------------------------------------------------- > >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > >> For additional commands, e-mail: java-user-h...@lucene.apache.org > >> > >> > > > > > -- > "RABI ZIDNI ILMA" > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >