http://search.lucidimagination.com/search/document/2d325f6178afc00a/how_to_search_for_c
-Yonik
http://www.lucidimagination.com

On Thu, Aug 6, 2009 at 11:38 AM, Michael _<solrco...@gmail.com> wrote:
> Hi everyone,
> I'm indexing several documents that contain words that the StandardTokenizer
> cannot detect as tokens. These are words like
> C#
> .NET
> C++
> which are important for users to be able to search for, but get treated as
> "C", "NET", and "C".
>
> How can I create a list of words that should be understood to be indivisible
> tokens? Is my only option somehow stringing together a lot of
> PatternTokenizers? I'd love to do something like <tokenizer
> class="StandardTokenizer" tokenwhitelist=".NET C++ C#" />.
>
> Thanks in advance!
>
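
For illustration only, the "token whitelist" idea from the question can be sketched outside Solr in a few lines of Python. This is not Solr's or Lucene's API, and the names build_tokenizer and protected_terms are made up for the example. It assumes that protected terms should match case-insensitively and only when they are not glued to a neighbouring word character; everything else falls back to plain runs of word characters.

```python
import re


def build_tokenizer(protected_terms):
    """Return a tokenize(text) function that keeps protected terms whole.

    Hypothetical helper for illustration; it does not mirror any Solr class.
    """
    # Drop empty strings and try longer terms first so that a longer
    # protected term always wins over a shorter one sharing its prefix.
    terms = sorted((t for t in protected_terms if t), key=len, reverse=True)

    parts = []
    if terms:
        protected = "|".join(re.escape(t) for t in terms)
        # A protected term must not touch a word character on either side.
        parts.append(r"(?<!\w)(?:%s)(?!\w)" % protected)
    # Fallback: ordinary tokens are runs of word characters.
    parts.append(r"\w+")
    pattern = re.compile("|".join(parts), re.IGNORECASE)

    def tokenize(text):
        # Lowercase every token so indexing and querying agree on case.
        return [m.group(0).lower() for m in pattern.finditer(text)]

    return tokenize


tokenize = build_tokenizer([".NET", "C++", "C#"])
tokens = tokenize("Experience with C#, .NET and C++ required")
print(tokens)
# ['experience', 'with', 'c#', '.net', 'and', 'c++', 'required']
```

Whatever mechanism is used inside Solr, the same analysis has to run at both index time and query time; otherwise a query for "C#" would still be reduced to "c" and would not match the protected token.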