Try setting generateWordParts="1" in your WordDelimiterFilter (WDF). Also, a
WhitespaceTokenizer makes little sense for URLs: a URL contains no whitespace,
and the StandardTokenizer can tokenize a URL. Either way, the problem is your
WDF.
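
As a minimal sketch, assuming the rest of the fieldType stays exactly as you
posted it, the WDF line would look roughly like this. Applying it to both the
index and the query analyzer is my assumption, not something I have tested
against your schema:

```xml
<!-- Only generateWordParts changes (0 -> 1); every other attribute is
     copied unchanged from the original post. -->
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1" catenateWords="1"
        catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
```

With word parts enabled, the WDF should emit the alphabetic pieces of
http://mail.google.com/dlkjadf (http, mail, google, com, dlkjadf) as separate
tokens, so a query for "google" has a token to match. Documents need to be
reindexed after changing the index analyzer.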
 
-----Original message-----
From: Max Lynch <ihas...@gmail.com>
Sent: Thu 23-09-2010 23:00
To: solr-user@lucene.apache.org; 
Subject: Search a URL

Is there a tokenizer that will allow me to search for parts of a URL?  For
example, the search "google" would match on the data
"http://mail.google.com/dlkjadf".

This tokenizer factory doesn't seem to be sufficient:

    <fieldType name="text_standard" class="solr.TextField"
               positionIncrementGap="100">
        <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="0" generateNumberParts="1" catenateWords="1"
                    catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.SnowballPorterFilterFactory"
                    language="English" protected="protwords.txt"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="0" generateNumberParts="1" catenateWords="1"
                    catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.SnowballPorterFilterFactory"
                    language="English" protected="protwords.txt"/>
        </analyzer>
    </fieldType>

Thanks.
