Thanks, we were having the same issue.
We are trying to store article content, and we are storing a field like
<p>This article is for blah </p>.
When I look at the analysis.jsp page, the <p> tags are stripped out and the
text is indexed without them, but when we fetch the document, the field is
returned with the <p> tags.
From Solr's point of view this is correct, but these HTML tags are breaking
the display of our page. Is there an easy way to strip the HTML tags out of
the stored value, or do we have to take care of it manually?
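
If it does have to be done manually, one option would be to strip the tags
on the client side before the field is sent to Solr, so that the stored
value is already plain text. Below is a minimal sketch of that idea. The
class name HtmlTagStripper and the method strip are made up for
illustration and are not part of Solr or SolrJ, and the regex is a naive
assumption that the markup is simple and well-formed (it does not handle
comments, script blocks, or character entities).

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr or SolrJ.
class HtmlTagStripper {
    // Naive assumption: a tag is anything between '<' and the next '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Replaces each tag with a space, then collapses runs of whitespace.
    static String strip(String html) {
        if (html == null) {
            return "";
        }
        String withoutTags = TAG.matcher(html).replaceAll(" ");
        return WHITESPACE.matcher(withoutTags).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // prints: This article is for blah
        System.out.println(strip("<p>This article is for blah </p>"));
    }
}
```

The cleaned string would then be the value we add to the document before
posting it. For anything beyond simple markup, a real HTML parser is
probably a safer choice than a regex.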

Thanks
Rashid


aseem cheema wrote:
> 
> Alright. It turns out that escapedTags is not for what I thought it is
> for.
> The problem that I am having with HTMLStripCharFilterFactory is that
> it strips the html while indexing the field, but not while storing the
> field. That is why what I see in analysis.jsp, which is index
> analysis, does not match what gets stored... because.. well HTML is
> stripped only for indexing. Makes so much sense.
> 
> Thanks to Ryan McKinley for clarifying this.
> Aseem
> 
> On Wed, Nov 11, 2009 at 9:50 AM, aseem cheema <aseemche...@gmail.com>
> wrote:
>> I am trying to post a document with the following content using SolrJ:
>> <center>content</center>
>> I need the xml/html tags to be ignored. Even though this works fine in
>> analysis.jsp, this does not work with SolrJ, as the client escapes the
>> < and > with &lt; and &gt; and HTMLStripCharFilterFactory does not
>> strip those escaped tags. How can I achieve this? Any ideas will be
>> highly appreciated.
>>
>> There is escapedTags in HTMLStripCharFilterFactory constructor. Is
>> there a way to get that to work?
>> Thanks
>> --
>> Aseem
>>
> 
> 
> 
> -- 
> Aseem
> 
> 

-- 
View this message in context: 
http://old.nabble.com/XmlUpdateRequestHandler-with-HTMLStripCharFilterFactory-tp26305561p27116434.html
Sent from the Solr - User mailing list archive at Nabble.com.