I broke it with reusable token streams. Just checked in a fix - can you try now?
-Yonik
http://www.lucidimagination.com

On Mon, Aug 17, 2009 at 10:17 PM, Erik Hatcher<erik.hatc...@gmail.com> wrote:
> I'm interested in using a CharFilter, something like this:
>
>   <fieldType name="html_text" class="solr.TextField">
>     <analyzer>
>       <charFilter class="solr.HTMLStripCharFilterFactory"/>
>       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     </analyzer>
>   </fieldType>
>
> In hopes of being able to put in a value like
> "<html><body>whatever</body></html>" and have "whatever" come back out. In
> analysis.jsp, I see that happening in the verbose output but it doesn't make
> it to the tokenizer input - the original string makes it there.
>
> I must be misunderstanding something about CharFilter's and how to use them
> in Solr. HTMLStripWhitespaceTokenizerFactory is deprecated in favor of the
> above design, I think, but does what I'm after.
>
> Solr only seems to use CharFilter's in analysis.jsp. Is that correct?
> Shouldn't they be factored into the analyzer for each field? (like in
> FieldAnalysisRequestHandler)
>
> Thanks,
> Erik