Hmmm, have you tried EdgeNGrams? This works for me (at the expense
of a somewhat larger index, of course)...

    <fieldType name="edge" class="solr.TextField" positionIncrementGap="100">
      <!-- index time: keep each value as one token, lowercase it, then emit its prefixes -->
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="4" maxGramSize="15" side="front"/>
      </analyzer>
      <!-- query time: no n-grams, the lowercased query is matched against the indexed prefixes -->
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>


and a field of type "edge" named "thomasfield"....
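
A minimal sketch of that field definition, assuming it goes into the
<fields> section of schema.xml. The indexed, stored and multiValued
settings are assumptions (multiValued because the GOK field in the quoted
message is multivalued), so adjust them to your schema:

    <!-- hypothetical settings; only the name and type come from the text above -->
    <field name="thomasfield" type="edge" indexed="true" stored="true" multiValued="true"/>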


Now searches like
thomasfield:"GOK IA 3"
(include the quotes, otherwise the query parser splits the input at the
space before it reaches the analyzer!) should work. I chose the various
parameters (min/max gram size) arbitrarily; you'll want to tweak them.
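
As one possible tweak, assuming the values really are 1 to 3 letters, a
space and 3 digits as described in the quoted message (so at most 7
characters), the index-time filter could be sized like this. The numbers
are an untested assumption, not a recommendation:

    <!-- assumed sizes: minGramSize="1" so a bare prefix such as "IA" is indexed as a gram,
         maxGramSize="7" so the full value (3 letters + space + 3 digits) is still covered -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="7" side="front"/>

With these sizes, "IA 300" would be indexed as i, ia, "ia ", "ia 3",
"ia 30" and "ia 300". Keep in mind that a short query then matches every
value starting with those characters.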

I include a LowerCaseFilter at both index and query time for safety's sake,
in case people are actually going to type things in...

It's probably instructive to look at the admin/analysis page to see how
this all plays out....

Best
Erick


On Wed, Jun 8, 2011 at 9:29 AM, Thomas Fischer <fischer...@aon.at> wrote:
> Hi Erick,
>
> I have a multivalued field "GOK" (local classification scheme) with separate 
> entries of the sort
>  IA 300; IC 330; IA 317; IA 318, i.e. 1 to 3 capital characters, space, 3 
> digits.
> I want to be able to perform a truncated search on that field:
> either just the string before the space, or a combination of that string with 
> 1 or 2 digits, something like:
> GOK:IA
> or
> GOK:IA 3*
> or
> GOK:IA 31?
> My problem is the clash between the phrase (GOK:"IA 317" works) and the 
> wildcards.
>
> As a start I tried as type
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100" 
> autoGeneratePhraseQueries="true">
> from the solr 3.2 distribution schema
> (apache-solr-3.2.0/example/solr/conf/schema.xml),
> the field is just
> <field name="GOK" type="text" multiValued="true"/>
>
> BTW, I have another field "DDC" with entries of the form "t1:086643" with 
> analogous requirements which yields similar problems due to the colon, also 
> indexed as text.
> Here also
> DDC:T1\:086643
> works, but not
> DDC:T1\:08664?
>
> Thanks in advance
> Thomas
>
>> Yes there is, but you haven't provided enough information to
>> make a suggestion. What is the fieldType definition? What is
>> the field definition?
>>
>> Two resources that'll help you greatly are:
>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
>>
>> and the admin/analysis page...
>>
>> Best
>> Erick
>>
>> On Tue, Jun 7, 2011 at 6:23 PM, Thomas Fischer <fischer...@aon.at> wrote:
>>> Hello,
>>>
>>> I am testing solr 3.2 and have problems with wildcards.
>>> I am indexing values like "IA 300; IC 330; IA 317; IA 318" in a field 
>>> "GOK", and can't find a way to search with wildcards.
>>> I want to use a wild card search to match something like "IA 31?" but 
>>> cannot find a way to do so.
>>> GOK:IA\ 38* doesn't work with the contents of GOK indexed as text.
>>> Is there a way to index and search that would meet my requirements?
>>>
>>> Thomas
>>>
>>>
>>>
>
> Kind regards
> Thomas Fischer
>
>
>
