I would do some googling to find examples, or read the javadocs for the highlighter package?
Or pick up copy of the early-release of Lucene in Action 2nd edition from http://manning.com [disclaimer: I'm one of the authors on that book!]. We've revamped the highlighter coverage (in chapter 8)... Mike On Mon, May 25, 2009 at 6:19 AM, KK <dioxide.softw...@gmail.com> wrote: > Thanks @Michael. > I've no idea about this contrib though I'm looking into highlighter. Can you > throw some lights on the same. The steps to be taken for achieving the same. > I'm completely new to this thing. Can you point me to some examples for the > same? Thank you. > > KK. > > On Mon, May 25, 2009 at 3:26 PM, Michael McCandless < > luc...@mikemccandless.com> wrote: > >> Can't you use contrib/highlighter to achieve this? >> >> It can do both excerpting (grabbing chunk of text around each hit) and >> highlighting (highlighting the specific tokens that matched, within >> that excerpt). >> >> Mike >> >> On Mon, May 25, 2009 at 5:20 AM, KK <dioxide.softw...@gmail.com> wrote: >> > Thanks for your response @Seid. >> > Can any Lucene user give me directions on this regard? I'm stuck. >> > Really appreciate your help. >> > >> > Thanks, >> > KK >> > >> > On Mon, May 25, 2009 at 2:43 PM, Seid Muhie <seidy...@gmail.com> wrote: >> > >> >> actually I used the normal java standard libraries for this work. I >> >> used lucene only to retrieve the relevant document. >> >> what you will do is, thought it is to manuall, as i don't know the way >> >> it can be done by the Lucene API, you just record the location of the >> >> query terms in the document (it is as easy as indexOf(query terms)). >> >> But you have to be very aware of the speed of the system. then you can >> >> go ahead or back, as you want. >> >> >> >> Once again, I think there will be also a LUcene workaround, that I am >> >> not aware of it at all. >> >> >> >> Seid M >> >> >> >> On 5/25/09, KK <dioxide.softw...@gmail.com> wrote: >> >> > Thanks for your quick response, Seid. >> >> > >> >> > There is one more mail I found in the archive[3/4 days old] where >> someone >> >> > asked about extracting 3 neighbors words around the match. I think >> once >> >> you >> >> > have the position of matching term/phrase then extracting 3 or 30 >> >> neighbors >> >> > wont be different, right? because you just have to move back/forward >> and >> >> get >> >> > the words, this sounds logically simple but I dont know how simple is >> >> this >> >> > implementation-wise. >> >> > Also people are talking about someting called spanQueries/termvectors >> etc >> >> to >> >> > use for this purpose. I'm still to get the exact idea of how to do >> this. >> >> > As per your mail, you used Java to extract the neighbors, Is that >> using >> >> the >> >> > standard techniques i.e using those spanqueries/termvectors or >> something >> >> > else. >> >> > If you can elaborate all this a bit It'd be very helpful. >> >> > >> >> > Thank you. >> >> > KK> >> >> > >> >> > On Mon, May 25, 2009 at 10:51 AM, Seid Muhie <seidy...@gmail.com> >> wrote: >> >> > >> >> >> for my thesis work (Question Answering) I used to retrieve first the >> >> >> document and then play with java to extract the needed answer. >> >> >> for your case what you will do is first locate the positions of the >> >> >> query terms in the document (in this case it might be distributed >> >> >> throughout the document - hence difficult to get the 15/20 words) >> then >> >> >> count something10 words forward and backward and extract the match. >> >> >> this is the way I handle my problem. Hope there might be different I >> >> >> dea too >> >> >> >> >> >> Seid M. >> >> >> >> >> >> On 5/25/09, KK <dioxide.softw...@gmail.com> wrote: >> >> >> > Hi All, >> >> >> > I'm trying to index some non-english web pages and I'm keeping all >> the >> >> >> > content of the page in a single field and the searches are working >> >> fine >> >> >> as >> >> >> > well. Now when I search for some query it gives the complete page, >> >> which >> >> >> is >> >> >> > expected. Now I want to restrict the showing of results to say 20 >> >> words >> >> >> > around the match, something like google does, otherwise we cann't >> make >> >> >> users >> >> >> > to look for a match in the whole page content[I'll use highlighter >> >> after >> >> >> > this is done]. So getting positions of the matched word/phrase >> might >> >> >> > help >> >> >> so >> >> >> > that I can extract some words before and some words after that and >> >> will >> >> >> show >> >> >> > that to end user. Any idea on doing the same will be very helpful. >> >> Thank >> >> >> > you. >> >> >> > >> >> >> > KK. >> >> >> > >> >> >> >> >> >> >> >> >> -- >> >> >> "RABI ZIDNI ILMA" >> >> >> >> >> >> --------------------------------------------------------------------- >> >> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> >> >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> >> >> >> >> >> >> > >> >> >> >> >> >> -- >> >> "RABI ZIDNI ILMA" >> >> >> >> --------------------------------------------------------------------- >> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> >> >> >> > >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org