ping!
Any hope for help here?
I'm a bit stuck before deploying a release.
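
In case it helps narrow things down: as far as I can tell, InvalidTokenOffsetsException is raised when a token's start or end offset points past the end of the text handed to the highlighter. Below is a minimal plain-Java sketch of that check, with no Lucene dependency. The class OffsetCheck and both of its methods are hypothetical names of mine, not Lucene API. The idea that offsets of a multi-valued name-xx field are counted over all values laid end to end, with a gap of one between values, is an assumption I have not confirmed.

```java
import java.util.List;

/** Hypothetical helper, not part of Lucene: mimics an offset sanity check. */
final class OffsetCheck {

    /** True when a token's offsets point past the end of the text being highlighted. */
    static boolean exceedsText(int startOffset, int endOffset, String text) {
        return startOffset > text.length() || endOffset > text.length();
    }

    /**
     * Assumption: offsets of a multi-valued field are counted over all values
     * laid end to end, with a gap of one between consecutive values.
     * Returns the {start, end} offsets of the value at `index` in that layout.
     */
    static int[] offsetsInConcatenation(List<String> values, int index) {
        int start = 0;
        for (int i = 0; i < index; i++) {
            start += values.get(i).length() + 1; // +1 for the assumed gap
        }
        return new int[] { start, start + values.get(index).length() };
    }
}
```

Under that assumption, a concept with the two names "water" and "H2O" in the same field would give the second name the offsets 6 to 9, which lie past the end of the 3-character string "H2O" when that single value is passed as the text. That would explain why only some documents fail, but it remains a guess.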
thanks in advance
paul
On 3 sept. 2010, at 14:05, Paul Libbrecht wrote:
>
> Hello list,
>
> I'm struggling again with the highlighter. I don't understand why I
> sporadically get an InvalidTokenOffsetsException.
>
> The mission: given a query, detect which field was matched among the names
> of the concepts. A concept can have several names, even within one
> language. Concepts are documents, and names are stored in fields name-xx,
> where xx is the two-letter language code.
>
> Here's the method I'm using:
>
> public String computeMatchedField(int docNum, Document doc, Analyzer analyzer,
>         Query query) throws IOException {
>     //System.out.println("----- computing matched field for query " + query + " on document " + doc.get("uri"));
>     query = query.rewrite(this.reader);
>     String found = null;
>     float maxScore = 0;
>     try {
>         for (Field f : (List<Field>) doc.getFields()) {
>             // Only the name-xx fields are candidates; skip the rest
>             // before building a scorer for them.
>             if (!f.name().startsWith("name-")) continue;
>             QueryScorer scorer = new QueryScorer(query, reader, f.name());
>             //System.out.println("Measuring field " + f.name() + ": " + f.stringValue());
>             String text = f.stringValue();
>             TokenStream tokenStream =
>                 TokenSources.getAnyTokenStream(reader, docNum, f.name(), doc, analyzer);
>             SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
>             Highlighter highlighter = new Highlighter(htmlFormatter, scorer);
>             TextFragment[] frags =
>                 highlighter.getBestTextFragments(tokenStream, text, false, 1);
>             if (frags == null || frags.length == 0) continue;
>             float score = frags[0].getScore();
>             //System.out.println("Score: " + score);
>             // Keep the best-scoring fragment seen so far.
>             if (score > maxScore) {
>                 maxScore = score;
>                 found = frags[0].toString();
>             }
>         }
>     } catch (Exception ex) {
>         // This is where InvalidTokenOffsetsException ends up; the loop is abandoned.
>         ex.printStackTrace();
>     }
>     return found;
> }
>
> Unfortunately, I have to catch InvalidTokenOffsetsException, which occurs
> sometimes but not always.
> When it occurs, it stops the highlighting (the detected field is "null") and
> also costs quite some time.
>
> What am I doing wrong?
> I tried building my own TokenStream, but it made no difference.
>
> thanks in advance
>
> paul
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>