Can someone please point me in the right direction? We are creating an application that needs to be able to search on "C++" and get back only docs that contain C++.

The StandardAnalyzer does not seem to index the "+", so a search for "C++" will bring back docs that contain C++, C, C#, etc. The WhitespaceAnalyzer will index the "+", but if C++ is at the end of a sentence, it will index the term as "C++." (with the trailing period), so a search for "C++" will not return that doc.

I have heard that a custom Analyzer might be the answer; however, it seems like there would actually need to be a custom TokenFilter or Tokenizer behind it. I looked at:

- StandardAnalyzer.java
- StandardFilter.java
- StandardTokenizer.java
- StandardTokenizerImpl.java
- StandardTokenizerImpl.jflex

I would guess that the StandardTokenizer is where the changes would need to be made to allow the "+" character, but I am unclear as to how. Any and all help is greatly appreciated. Going through all the documents and replacing "+" with the word "plus" is not really an option for us.
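
To make the desired behavior concrete, here is a minimal plain-Java sketch of the tokenization we are after. It does not use any Lucene classes: the class name PlusAwareTokenizer, its methods, and the choice to keep only trailing "+" and "#" are all assumptions made for illustration, and the lowercasing simply mirrors what StandardAnalyzer appears to do. Presumably the same rule would have to live in a custom Tokenizer or TokenFilter, applied identically at index time and query time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: it only illustrates the token
// boundaries we would like an analyzer to produce.
final class PlusAwareTokenizer {

    // Assumption: these symbols are kept when they trail a token, so that
    // "C++" and "C#" stay distinct from "C".
    private static final String KEPT_TRAILING = "+#";

    // Strips leading non-alphanumerics and trailing punctuation,
    // except for the kept trailing symbols.
    static String trim(String raw) {
        int start = 0;
        int end = raw.length();
        while (start < end && !Character.isLetterOrDigit(raw.charAt(start))) {
            start++;
        }
        while (end > start) {
            char c = raw.charAt(end - 1);
            if (Character.isLetterOrDigit(c) || KEPT_TRAILING.indexOf(c) >= 0) {
                break;
            }
            end--;
        }
        return raw.substring(start, end);
    }

    // Splits on whitespace, trims each piece, lowercases it, and drops empties.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("\\s+")) {
            String token = trim(raw).toLowerCase(Locale.ROOT);
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints: [we, use, c++, also, c#, and, c, etc]
        System.out.println(tokenize("We use C++. Also C# and C, etc."));
    }
}
```

With this rule, "C++." at the end of a sentence yields the token "c++", while a plain "C," yields "c", so a query for "C++" would no longer match docs that only mention C or C#.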