I am trying to override the tokenized setting of a single FieldType by
setting the tokenized attribute on individual field elements in
schema.xml, but it doesn't seem to work and I can't
figure out why. For example, if I define various fields to be of type
solr.TextField, and use tokenized="false" for some and tokenized="true"
for others, the fields are defined properly when schema.xml is read, but
when documents are added to the index, all indexed fields end up as
Field.Index.TOKENIZED, the default for solr.TextField, as if I had not
set the tokenized attribute on the field element at all. Likewise, if I
use solr.StrField as the field type, all indexed fields turn out to be
Field.Index.UN_TOKENIZED, the default for solr.StrField. I am checking
the tokenized state of the fields with Luke and by executing searches.
Any clues as to what I'm doing wrong?
-- Robert
PS: Yes, I know I could use solr.StrField for those fields I would like to
be Field.Index.UN_TOKENIZED and solr.TextField for those I would like to
be Field.Index.TOKENIZED, but my reading of the documentation and the code
is that I should be able to do things the way I'm attempting them, and I
have other reasons for wanting to consolidate all field attribute
definitions in the field element.
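For reference, a minimal sketch of the workaround I mean. The type names
wpsString and wpsText are made up for illustration and are not in my
actual schema, only two of the fields are shown, and the analyzer entities
are the same ones used in the extract further down:

```xml
<types>
  <!-- hypothetical type names, one per desired tokenization behavior -->
  <fieldtype name="wpsString" class="solr.StrField" />
  <fieldtype name="wpsText" class="solr.TextField">
    <analyzer type="index" class="&index_analyzer;" />
    <analyzer type="query" class="&query_analyzer;" />
  </fieldtype>
</types>
<fields>
  <!-- tokenization follows the type default, so no tokenized attribute -->
  <field name="term" type="wpsText" indexed="true" stored="true" />
  <field name="termType" type="wpsString" indexed="true" stored="true" />
</fields>
```

This moves the tokenized/untokenized decision into the type, which is
exactly what I would like to avoid.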
Solr version: 1.1.0
==================================
Extract from schema.xml
-----------------------
<types>
  <fieldtype name="wpsField" class="solr.StrField">
    <analyzer type="index" class="&index_analyzer;" />
    <analyzer type="query" class="&query_analyzer;" />
  </fieldtype>
</types>
<fields>
  <field name="term" type="wpsField" indexed="true" stored="true"
         tokenized="true" />
  <field name="termType" type="wpsField" indexed="true"
         stored="true" tokenized="false" />
  <field name="unstm_termExact" type="wpsField" indexed="true"
         stored="false" tokenized="false" />
  <field name="descriptorType" type="wpsField" indexed="true"
         stored="true" tokenized="false" />
  <field name="aqs" type="wpsField" indexed="true" stored="true"
         tokenized="false" multiValued="true" />
  <field name="treeNums" type="wpsField" indexed="true"
         stored="true" tokenized="false" multiValued="true" />
  <field name="scopeNote" type="wpsField" indexed="false"
         stored="true" tokenized="false" />
  <field name="see" type="wpsField" indexed="false" stored="true"
         tokenized="false" />
  <field name="permutation" type="wpsField" indexed="true"
         stored="true" tokenized="true" />
  <field name="all-fields" type="wpsField" indexed="false"
         stored="false" tokenized="true" />
  <field name="WPS_UNID_FIELD_CONSTANT" type="wpsField"
         indexed="true" stored="true" tokenized="false" />
  <dynamicField name="unstm_*" type="wpsField" indexed="true"
                stored="false" tokenized="true" multiValued="true" />
</fields>
<copyField source="term" dest="unstm_termExact" />
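For comparison, the solr.TextField variant described at the top of this
message would look roughly like the following. This is only a sketch: the
type name wpsTextField is a placeholder, I am assuming the analyzers stay
the same, and only two of the fields are repeated:

```xml
<types>
  <!-- placeholder type name; same analyzers as wpsField, different class -->
  <fieldtype name="wpsTextField" class="solr.TextField">
    <analyzer type="index" class="&index_analyzer;" />
    <analyzer type="query" class="&query_analyzer;" />
  </fieldtype>
</types>
<fields>
  <!-- expected: tokenized -->
  <field name="term" type="wpsTextField" indexed="true" stored="true"
         tokenized="true" />
  <!-- expected: untokenized, overriding the TextField default -->
  <field name="termType" type="wpsTextField" indexed="true"
         stored="true" tokenized="false" />
</fields>
```

With that variant, both fields come out as Field.Index.TOKENIZED in the
index.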
Output to catalina.out
----------------------
May 31, 2007 9:25:49 AM org.apache.solr.schema.IndexSchema readConfig
FINE: field defined: term{type=wpsField,properties=indexed,tokenized,stored}
May 31, 2007 9:25:49 AM org.apache.solr.schema.IndexSchema readConfig
FINE: field defined: termType{type=wpsField,properties=indexed,stored}
May 31, 2007 9:25:49 AM org.apache.solr.schema.IndexSchema readConfig
FINE: field defined: unstm_termExact{type=wpsField,properties=indexed}
May 31, 2007 9:25:49 AM org.apache.solr.schema.IndexSchema readConfig
FINE: field defined: descriptorType{type=wpsField,properties=indexed,stored}
May 31, 2007 9:25:49 AM org.apache.solr.schema.IndexSchema readConfig
FINE: field defined: aqs{type=wpsField,properties=indexed,stored,multiValued}
May 31, 2007 9:25:49 AM org.apache.solr.schema.IndexSchema readConfig
FINE: field defined: treeNums{type=wpsField,properties=indexed,stored,multiValued}
May 31, 2007 9:25:49 AM org.apache.solr.schema.IndexSchema readConfig
FINE: field defined: scopeNote{type=wpsField,properties=stored}
May 31, 2007 9:25:49 AM org.apache.solr.schema.IndexSchema readConfig
FINE: field defined: see{type=wpsField,properties=stored}
May 31, 2007 9:25:49 AM org.apache.solr.schema.IndexSchema readConfig
FINE: field defined: permutation{type=wpsField,properties=indexed,tokenized,stored,multiValued}
May 31, 2007 9:25:49 AM org.apache.solr.schema.IndexSchema readConfig
FINE: field defined: all-fields{type=wpsField,properties=tokenized}
May 31, 2007 9:25:49 AM org.apache.solr.schema.IndexSchema readConfig
FINE: field defined: WPS_UNID_FIELD_CONSTANT{type=wpsField,properties=indexed,stored}
May 31, 2007 9:25:49 AM org.apache.solr.schema.IndexSchema readConfig
FINE: dynamic field defined: unstm_*{type=wpsField,properties=indexed,multiValued}
Screenshot from Luke
--------------------
o Luke shows that the field "term" is not tokenized even though it should
be (according to schema.xml and the field definition logged by
org.apache.solr.schema.IndexSchema.readConfig).