Thanks, but I think I'm going to have to work out a different solution. I 
have written my own analyzer that does everything I need: it's not a 
different analyzer I need but a way to specify that certain fields should 
be tokenized and others not -- while still leaving all other options open.

As for the generic options parsing resulting in unused properties in a 
SchemaField object: no, it is not specifically documented anywhere, but 
the Solr Wiki says, for both fields and field types: "Common options that 
fields can have are...". I could not find a definitive list anywhere of 
what is allowed, used, or excluded, so I went to the code and found that 
the "tokenized" option would indeed be respected in SchemaField.

-- Robert

[EMAIL PROTECTED] wrote on 05/31/2007 11:30:04 AM:

> On 5/31/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > You say the "tokenized" attribute is not settable from the schema, but
> > the output from IndexSchema.readConfig shows that the properties are
> > indeed read, and the resulting SchemaField object retains these
> > properties: are they then ignored?
> 
> Not sure off the top of my head, but don't use it... it shouldn't be
> documented anywhere.
> It probably slipped through as part of generic options parsing.
> 
> > > "untokenized" means don't use the analyzer.   If you don't want an
> > > analyzer, then use the "string" type.
> > >
> > This is true only in the simplest of cases. An analyzer can do far
> > more than tokenize: it can stem, change to lower case, etc. What if
> > you want one or more of these things to happen, but you don't want
> > tokenization?
> 
> From a Lucene perspective, if you create an untokenized field, the
> analyzer will not be used at all.  It should have probably been named
> unanalyzed, as that's more accurate.
> 
> KeywordTokenizer (via KeywordTokenizerFactory) is probably what you
> are looking for.
> Create a new text field type with that as the tokenizer, followed by
> whatever filters you want (like lowercasing).
> 
> -Yonik
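
For reference, the field type Yonik describes might look roughly like the
schema.xml fragment below. This is only a sketch: the names "string_lc"
and "title_exact" are hypothetical, and LowerCaseFilterFactory stands in
for whatever filters are actually wanted.

```xml
<!-- Hypothetical type name; analyzed, but never split into words. -->
<fieldtype name="string_lc" class="solr.TextField">
  <analyzer>
    <!-- KeywordTokenizer emits the entire field value as one token. -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- Example filter only; add or swap filters as needed. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>

<!-- Hypothetical field using the type above. -->
<field name="title_exact" type="string_lc" indexed="true" stored="true"/>
```

With a type along these lines, the value still passes through the
analyzer chain, so filters such as lowercasing apply, but it is indexed
as a single token rather than being broken up.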
