Quoting mark harwood <[EMAIL PROTECTED]>: Hi, Mark,
I just used StandardAnalyzer and code is as following: ===================================================== Analyzer analyzer = new StandardAnalyzer(); BufferedReader in = new BufferedReader(new InputStreamReader(System.in)); String line = in.readLine(); if (line.length() == -1) return; Query query = QueryParser.parse(line, "contents", analyzer); Hits hits = searcher.search(query); Highlighter highlighter =new Highlighter(new QueryScorer(query)); ======================================================== Part of the fulltext result is: ======================================================== sectors in South Africa have failed to incorporate many co-management principles, such as joint... of the RNP. However, poor representation of community interests on the joint management committee... of a joint management committee and the improvement of infrastructure in the area. The key differences ...... ======================================================== Of course the result is far more than this. The paper itself looks normal to me. It makes me confused why this is happening. Thanks for your help, Ying > >> One of my > >> search results from our > >> records contains far too much of the text > > This is a problem I haven't seen before. I suspect it > may have something to do with your choice of analyzer. > Your paper will only ever be fragmented on "token gap" > boundaries ie points in the token stream where the > current token position does not overlap with the > previous token's . If the section in your text which > contains the search terms contains a long stream of > overlapping tokens you will end up with a long > highlighted selection. > > Which analyzer are using out of interest? > > > Cheers > Mark > > > > --- [EMAIL PROTECTED] wrote: > > > > > > Hi, All, > > > > I use lucene highlight package to generate KWIC for > > our application. > > > > The part of the code is as following: > > > ===================================================== > > if(text != null ){ > > TokenStream tokenStream = > > analyzer.tokenStream("contents", > > new StringReader(text)); > > // Get 3 best fragments and seperate with > > a "..." > > result = > > highlighter.getBestFragments(tokenStream, > > text, 3, "..."); > > } > > > > > ===================================================== > > > > However, I got a very strange problem. One of my > > search results from our > > records contains far too much of the text of the > > paper. It doesn't happen > > for the same paper when I changed the search > > criteria. > > > > Thanks very mcuh for your help, > > Ying > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: > > [EMAIL PROTECTED] > > For additional commands, e-mail: > > [EMAIL PROTECTED] > > > > > > > > > > ___________________________________________________________ > Yahoo! Messenger - want a free and easy way to contact your friends online? > http://uk.messenger.yahoo.com > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]