I broke it with reusable token streams. Just checked in a fix - can
you try now?
-Yonik
http://www.lucidimagination.com
On Mon, Aug 17, 2009 at 10:17 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:
I'm interested in using a CharFilter, something like this:
<fieldType name="html_text" class="solr.TextField">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
I'm hoping to be able to put in a value like
<html><body>whatever</body></html> and have "whatever" come back out. In
analysis.jsp, I see that happening in the verbose output, but the stripped
text doesn't make it to the tokenizer input - the original string makes it
there instead.
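To spell out the behavior I expect, here is a toy model of the chain in
Python. It is not Solr code: the regex-based strip_html and the function
names are stand-ins made up for illustration, and the real HTML-stripping
char filter handles far more than this sketch does.

```python
import re


def strip_html(text):
    """Hypothetical stand-in for a char filter: replace each tag with a space."""
    return re.sub(r"<[^>]*>", " ", text)


def whitespace_tokenize(text):
    """Stand-in for a whitespace tokenizer: split on runs of whitespace."""
    return text.split()


def analyze(text, char_filters=()):
    """Run every char filter over the raw text, then hand the result to the tokenizer."""
    for char_filter in char_filters:
        text = char_filter(text)
    return whitespace_tokenize(text)


raw = "<html><body>whatever</body></html>"

# What I expect: the tokenizer sees the filtered text.
with_filter = analyze(raw, char_filters=[strip_html])

# What I seem to be getting: the tokenizer sees the original string.
without_filter = analyze(raw)
```

In this model with_filter is the single token "whatever", while
without_filter is the original markup as one token, which matches what I
see reaching the tokenizer.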
I must be misunderstanding something about CharFilters and how to use them
in Solr. HTMLStripWhitespaceTokenizerFactory does what I'm after, but I
think it is deprecated in favor of the above design.
Solr only seems to use CharFilters in analysis.jsp. Is that correct?
Shouldn't they be factored into the analyzer for each field (as in
FieldAnalysisRequestHandler)?
Thanks,
Erik