On 4/6/2019 6:59 AM, Dave Beckstrom wrote:
I'm really hating SOLR.   All I want is to define a text field that data
can be indexed into and which is searchable.  Should be super simple.  But
I run into issue after issue.  I'm running SOLR 7.3 because it's compatible
with the version of NUTCH I'm running.

The docs say that SOLR ships with a default TextField but that seems to be
wrong.  I define:

<field name="metadata.myfield" type="TextField" stored="true"
indexed="true"/>

That is a field definition. In order for that to work, you must also have a type definition named "TextField".

There are no default type definitions. Your schema must contain every type that your fields use.

Solr does include a field class named TextField, which can be used in a type definition.
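
As a minimal sketch of how the two pieces fit together for your field: the type name "text_general" and the analyzer chain here are only an illustration that I picked, not something Solr supplies by default.

```xml
<!-- Type definition: the name is your choice, the class is Solr's TextField -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Field definition: "type" refers to the fieldType name above, not to a Java class -->
<field name="metadata.myfield" type="text_general" indexed="true" stored="true"/>
```

The "type" attribute on the field only has to match the "name" attribute on some fieldType in the same schema.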

I'm going to paste a full (and quite short) schema below for a dovecot index that I have been experimenting on. Dovecot is a POP3/IMAP server, for email, and it can use Solr as a search backend. This schema defines four types - string, long, boolean, and text.

You'll notice that the definition for "text" uses "solr.TextField" for the class. The fully qualified class name is actually org.apache.solr.schema.TextField, if you want to find it in the source code. The "solr." prefix on the class name is special syntax that tells Solr to search several Java packages for the class.
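
To illustrate the prefix, the two definitions below should be equivalent ways of writing the same type. They are alternatives, so a real schema would contain only one of them. I'm using StrField here because it needs no analyzer; the same idea applies to TextField.

```xml
<!-- Shorthand: Solr resolves the "solr." prefix to the right package -->
<fieldType name="string" class="solr.StrField"/>

<!-- The same class with its package written out in full -->
<fieldType name="string" class="org.apache.solr.schema.StrField"/>
```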

----------------
<?xml version="1.0" encoding="UTF-8" ?>

<!--
For fts-solr:

This is the Solr schema file, place it into solr/conf/schema.xml. You may
want to modify the tokenizers and filters.
-->
<schema name="dovecot" version="1.5">
  <types>
    <!-- IMAP has 32bit unsigned ints but java ints are signed, so use longs -->
    <fieldType name="string" class="solr.StrField" />
    <fieldType name="long" class="solr.LongPointField" />
    <fieldType name="boolean" class="solr.BoolField" />

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.WordDelimiterGraphFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.FlattenGraphFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.WordDelimiterGraphFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPossessiveFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
      </analyzer>
    </fieldType>
 </types>


 <fields>
   <field name="id" type="string" indexed="true" stored="true" required="true" />
   <field name="uid" type="long" indexed="true" stored="true" required="true" />
   <field name="box" type="string" indexed="true" stored="true" required="true" />
   <field name="user" type="string" indexed="true" stored="true" required="true" />

   <field name="hdr" type="text" indexed="true" stored="false" />
   <field name="body" type="text" indexed="true" stored="false" />

   <field name="from" type="text" indexed="true" stored="false" />
   <field name="to" type="text" indexed="true" stored="false" />
   <field name="cc" type="text" indexed="true" stored="false" />
   <field name="bcc" type="text" indexed="true" stored="false" />
   <field name="subject" type="text" indexed="true" stored="false" />

   <!-- Used by Solr internally: -->
   <field name="_version_" type="long" indexed="true" stored="true"/>
 </fields>

 <uniqueKey>id</uniqueKey>
</schema>
----------------

When I defined Text_General:

  <updateProcessor class="solr.AddSchemaFieldsUpdateProcessorFactory"
name="add-schema-fields">
     <lst name="typeMapping">
       <str name="valueClass">java.lang.String</str>
       <str name="fieldType">text_general</str>
       <bool name="default">true</bool>
     </lst>

That is an update processor definition, not a type definition. That definition and its usage would both go in solrconfig.xml, not your schema. Update processors are not relevant to the message you posted.
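
For reference, here is roughly how that processor and its usage would sit in solrconfig.xml. This is a sketch modeled on the default configset, so check it against your own config. The chain name "my-add-fields" is hypothetical, and the sketch assumes that your schema already contains a fieldType named text_general and that the schema is managed (mutable).

```xml
<!-- solrconfig.xml, not the schema -->
<updateProcessor class="solr.AddSchemaFieldsUpdateProcessorFactory" name="add-schema-fields">
  <lst name="typeMapping">
    <str name="valueClass">java.lang.String</str>
    <str name="fieldType">text_general</str>
    <bool name="default">true</bool>
  </lst>
</updateProcessor>

<!-- Hypothetical chain name; "processor" lists the named processors to run first -->
<updateRequestProcessorChain name="my-add-fields" processor="add-schema-fields">
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Note that the typeMapping only names a type. The text_general fieldType itself still has to be defined in the schema.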

Thanks,
Shawn
