This type of question is not appropriate for the developers' list, which
is devoted to development. Please post this kind of question on the
users' list.
As it happens, this very topic is being discussed in the thread
"Recover special terms from StandardTokenizer", which should give
you some ideas.
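To make the idea concrete, here is a minimal sketch in plain Java (not
Lucene API, and not taken from that thread) of one possible rule: split on
whitespace, strip surrounding punctuation, but keep a trailing "+" or "#".
The class and method names (SpecialTermTokens, normalize, tokenize) are
made up for illustration, and the choice of "+" and "#" as the only
protected characters is an assumption. In Lucene the same rule would
presumably have to live in a custom tokenizer or token filter used at both
index and query time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SpecialTermTokens {

    // Assumption: only '+' and '#' are allowed to survive at the end of a token.
    private static boolean isKeptSuffix(char c) {
        return c == '+' || c == '#';
    }

    // Strips leading characters up to the first letter or digit, and strips
    // trailing characters that are neither letters, digits, nor '+'/'#'.
    // The result is lowercased so that "C++" and "c++" compare equal.
    static String normalize(String raw) {
        int start = 0;
        int end = raw.length();
        while (start < end && !Character.isLetterOrDigit(raw.charAt(start))) {
            start++;
        }
        while (end > start
                && !Character.isLetterOrDigit(raw.charAt(end - 1))
                && !isKeptSuffix(raw.charAt(end - 1))) {
            end--;
        }
        return raw.substring(start, end).toLowerCase(Locale.ROOT);
    }

    // Splits on whitespace, normalizes each piece, and drops empty results.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String token = normalize(raw);
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [we, use, c++, some, prefer, c#, others, plain, c]
        System.out.println(tokenize("We use C++. Some prefer C#, others plain C."));
    }
}
```

With this rule "C++." at the end of a sentence and a query for "C++" both
come out as "c++", while "C" and "C#" remain distinct terms.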
Erick
On Fri, Dec 11, 2009 at 11:19 AM, maxSchlein wrote:
>
> Can someone please point me in the right direction?
>
> We are creating an application that needs to be able to search on "C++" and
> get back docs that contain "C++". The StandardAnalyzer does not seem to
> index the "+", so a search for "C++" will bring back docs that contain C++,
> C, C#, etc. The WhitespaceAnalyzer will index the "+", but if "C++" is at
> the end of a sentence, it will index "C++." (with the period), so a search
> for "C++" will not return the doc. I have heard that a custom Analyzer may
> be an option; however, it seems a custom filter or tokenizer would actually
> be needed. I looked at:
> - StandardAnalyzer.java
> - StandardFilter.java
> - StandardTokenizer.java
> - StandardTokenizerImpl.java
> - StandardTokenizerImpl.jflex
>
> I would guess that the StandardTokenizer is where the changes would need to
> be made to allow the "+" character, but I am unclear as to how.
>
> Any and all help is greatly appreciated.
> --
> View this message in context:
> http://old.nabble.com/Lucene-Analyzer-that-can-handle-C%2B%2B-vs-C--tp26747079p26747079.html
> Sent from the Lucene - Java Developer mailing list archive at Nabble.com.