Ian Lea wrote:
See https://issues.apache.org/jira/browse/LUCENE-1068 which appears to
be talking about the same sort of thing, and
StandardAnalyzer.setReplaceInvalidAcronym(b).
Quite how you deal with this in your own analyzer is left as an exercise ...
Yes, I think you are right, though I don't understand it fully.
// analyzer is created earlier (not shown in this message)
TokenStream ts = analyzer.tokenStream("content", new StringReader("R.E.S."));
Token t;
while ((t = ts.next()) != null) {
    System.out.println("R.E.S. parsed to :" + t);
}
ts = analyzer.tokenStream("content", new StringReader("R.E.S"));
while ((t = ts.next()) != null) {
    System.out.println("R.E.S parsed to :" + t);
}
This code outputs:
R.E.S. parsed to :(res,0,6,type=<ACRONYM>)
R.E.S parsed to :(r.e.s,0,5,type=<HOST>)
So from my perspective I cannot see why it thinks R.E.S is a HOST; it
should be an acronym. Also, for the one that is recognised as an acronym,
I thought it would end up as r.e.s, not res.
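
One possible reading of the output above, offered as a sketch and not as the actual Lucene grammar: if the ACRONYM rule requires every letter to be followed by a dot, then "R.E.S" (no trailing dot) cannot match it and falls through to a HOST-like rule of dot-separated alphanumerics, and if acronym tokens have their dots stripped afterwards, "R.E.S." becomes "res". The class name, the two regular expressions and both helper methods below are assumptions made up for illustration; none of this is Lucene code.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; this is NOT Lucene code.
final class AcronymSketch {
    // Assumed ACRONYM shape: two or more letter-dot pairs, so the text must end in a dot.
    private static final Pattern ACRONYM = Pattern.compile("(?:[A-Za-z]\\.){2,}");
    // Assumed HOST shape: alphanumeric runs joined by dots, with no trailing dot.
    private static final Pattern HOST = Pattern.compile("[A-Za-z0-9]+(?:\\.[A-Za-z0-9]+)+");

    // Returns the token type this sketch would assign to the whole input.
    static String classify(String text) {
        if (ACRONYM.matcher(text).matches()) return "<ACRONYM>";
        if (HOST.matcher(text).matches()) return "<HOST>";
        return "<ALPHANUM>";
    }

    // Lowercases the text and strips dots only when it was classified as an acronym.
    static String normalize(String text) {
        String lower = text.toLowerCase(Locale.ROOT);
        return classify(text).equals("<ACRONYM>") ? lower.replace(".", "") : lower;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"R.E.S.", "R.E.S"}) {
            System.out.println(s + " parsed to :(" + normalize(s) + ",type=" + classify(s) + ")");
        }
    }
}
```

Under these assumed rules the sketch assigns the same types and terms as the output above, which would suggest that both results follow from the token rules and are not random. Whether the real tokenizer works this way should be checked against its grammar and LUCENE-1068.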