I broke it with reusable token streams.  Just checked in a fix - can
you try now?

-Yonik
http://www.lucidimagination.com


On Mon, Aug 17, 2009 at 10:17 PM, Erik Hatcher<erik.hatc...@gmail.com> wrote:
> I'm interested in using a CharFilter, something like this:
>
>    <fieldType name="html_text" class="solr.TextField">
>      <analyzer>
>        <charFilter class="solr.HTMLStripCharFilterFactory"/>
>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>      </analyzer>
>    </fieldType>
>
> In hopes of being able to put in a value like
> "<html><body>whatever</body></html>" and have "whatever" come back out.  In
> analysis.jsp, I see that happening in the verbose output but it doesn't make
> it to the tokenizer input - the original string makes it there.
>
> I must be misunderstanding something about CharFilters and how to use them
> in Solr.  HTMLStripWhitespaceTokenizerFactory is deprecated in favor of the
> above design, I think, but it does what I'm after.
>
> Solr only seems to use CharFilters in analysis.jsp.  Is that correct?
> Shouldn't they be factored into the analyzer for each field?  (like in
> FieldAnalysisRequestHandler)
>
> Thanks,
>        Erik
>
>
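
For readers following the thread, here is a minimal sketch of the behaviour
the config above is expected to produce: the char filter rewrites the
character stream first, and the tokenizer only ever sees the filtered text.
It uses plain JDK classes, not the Solr/Lucene API; the class and method
names (CharFilterSketch, stripTags, whitespaceTokens, analyze) are
hypothetical stand-ins, and the tag stripping is far cruder than what
HTMLStripCharFilterFactory does.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class CharFilterSketch {

    /** Hypothetical stand-in for a char filter: replaces each <...> run with a space. */
    static Reader stripTags(Reader in) throws IOException {
        StringBuilder out = new StringBuilder();
        boolean inTag = false;
        int c;
        while ((c = in.read()) != -1) {
            if (c == '<') {
                inTag = true;
                out.append(' ');          // keep a separator where the tag was
            } else if (c == '>') {
                inTag = false;
            } else if (!inTag) {
                out.append((char) c);     // ordinary text passes through
            }
        }
        return new StringReader(out.toString());
    }

    /** Hypothetical stand-in for a whitespace tokenizer. */
    static List<String> whitespaceTokens(Reader in) throws IOException {
        StringBuilder buf = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            buf.append((char) c);
        }
        List<String> tokens = new ArrayList<>();
        for (String t : buf.toString().trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** The chain: the tokenizer reads from the char-filtered reader, not the raw input. */
    static List<String> analyze(String text) {
        try {
            return whitespaceTokens(stripTags(new StringReader(text)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(analyze("<html><body>whatever</body></html>"));
    }
}
```

The symptom described above corresponds to the tokenizer being handed the
raw reader instead of the wrapped one, so the original string reaches it
unchanged.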
