http://search.lucidimagination.com/search/document/2d325f6178afc00a/how_to_search_for_c

-Yonik
http://www.lucidimagination.com



On Thu, Aug 6, 2009 at 11:38 AM, Michael _<solrco...@gmail.com> wrote:
> Hi everyone,
> I'm indexing several documents that contain words that the StandardTokenizer
> cannot detect as tokens.  These are words like
>  C#
>  .NET
>  C++
> which are important for users to be able to search for, but get treated as
> "C", "NET", and "C".
>
> How can I create a list of words that should be understood to be indivisible
> tokens?  Is my only option somehow stringing together a lot of
> PatternTokenizers?  I'd love to do something like <tokenizer
> class="StandardTokenizer" tokenwhitelist=".NET C++ C#" />.
>
> Thanks in advance!
>
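
For illustration only, here is a minimal sketch of the "token whitelist" idea from the question, written in plain Python rather than as a Solr analyzer. It is not StandardTokenizer's behavior or any Solr API; the names PROTECTED and tokenize are hypothetical, and lowercasing the output is an assumption about what the index would want. Whitelisted terms are matched first, and everything else falls back to runs of word characters.

```python
import re

# Hypothetical whitelist, taken from the examples in the question.
PROTECTED = [".NET", "C++", "C#"]


def tokenize(text, protected=PROTECTED):
    """Split text into lowercase tokens, keeping whitelisted terms whole."""
    # Try longer protected terms first so a shorter one cannot shadow them.
    alternatives = sorted(protected, key=len, reverse=True)
    protected_re = "|".join(re.escape(term) for term in alternatives)
    # A protected term only counts when it is not glued to a neighbouring
    # word character; otherwise fall back to plain runs of word characters.
    pattern = re.compile(
        r"(?<!\w)(?:%s)(?!\w)|\w+" % protected_re,
        re.IGNORECASE,
    )
    return [match.group(0).lower() for match in pattern.finditer(text)]


if __name__ == "__main__":
    print(tokenize("I know C#, C++ and .NET well"))
```

The same list would have to be applied at both index time and query time, otherwise a query for "C#" would still be reduced to "c" and miss the protected token.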
