Re: XmlUpdateRequestHandler with HTMLStripCharFilterFactory

2010-01-13 Thread Shalin Shekhar Mangar
On Wed, Jan 13, 2010 at 7:48 AM, Lance Norskog goks...@gmail.com wrote: You can do this stripping in the DataImportHandler. You would have to write your own stripping code using regular expressions. Note that DIH has an HTMLStripTransformer which wraps Solr's HTMLStripReader. -- Regards,
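To illustrate the "write your own stripping code using regular expressions" suggestion, here is a minimal Java sketch. It is not Solr's HTMLStripReader or the DIH HTMLStripTransformer; the class name `NaiveTagStripper`, the method `stripTags`, and the sample input are assumptions made up for this example. A regex of this kind is naive (it does not handle comments, scripts, or entities), so treat it as a starting point only.

```java
import java.util.regex.Pattern;

public class NaiveTagStripper {
    // Matches anything that looks like a tag: '<', one or more non-'>' characters, '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");
    // Matches runs of whitespace so they can be collapsed to a single space.
    private static final Pattern SPACES = Pattern.compile("\\s+");

    /** Replaces each tag with a space, then collapses whitespace and trims. */
    public static String stripTags(String input) {
        String noTags = TAG.matcher(input).replaceAll(" ");
        return SPACES.matcher(noTags).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // Hypothetical sample input, chosen only for illustration.
        System.out.println(stripTags("<center>content</center>"));
    }
}
```

Replacing tags with a space rather than an empty string keeps words on either side of a tag from being glued together.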

Re: XmlUpdateRequestHandler with HTMLStripCharFilterFactory

2010-01-12 Thread Lance Norskog
: http://old.nabble.com/XmlUpdateRequestHandler-with-HTMLStripCharFilterFactory-tp26305561p27116434.html Sent from the Solr - User mailing list archive at Nabble.com. -- View this message in context: http://old.nabble.com/XmlUpdateRequestHandler-with-HTMLStripCharFilterFactory

Re: XmlUpdateRequestHandler with HTMLStripCharFilterFactory

2010-01-11 Thread darniz
There is escapedTags in HTMLStripCharFilterFactory constructor. Is there a way to get that to work? Thanks -- Aseem

Re: XmlUpdateRequestHandler with HTMLStripCharFilterFactory

2010-01-11 Thread Erick Erickson
will be highly appreciated. There is escapedTags in HTMLStripCharFilterFactory constructor. Is there a way to get that to work? Thanks -- Aseem

Re: XmlUpdateRequestHandler with HTMLStripCharFilterFactory

2010-01-11 Thread darniz
View this message in context: http://old.nabble.com/XmlUpdateRequestHandler-with-HTMLStripCharFilterFactory-tp26305561p27116601.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: XmlUpdateRequestHandler with HTMLStripCharFilterFactory

2010-01-11 Thread Chris Hostetter
: stored without tags. But looks like the html tags are removed and terms are
: indexed purely for indexing, and the actual text is stored in raw format.

Correct. Analysis is all about indexing; it has nothing to do with stored content. You can write UpdateProcessors that modify the content
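The distinction between indexed terms and stored content can be sketched in plain Java. This is a toy model under stated assumptions, not Solr code: `analyze` stands in for an analysis chain with a tag-stripping step, `stored` for what is kept as the stored value, and `preprocess` for the kind of change a step that runs before storage (such as an update processor) could make. All three names are hypothetical, and the regex is a naive substitute for Solr's HTML stripping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class IndexedVsStored {
    // Toy "analysis": strip tags, lowercase, split on whitespace.
    // Only the resulting terms would go into the index.
    static List<String> analyze(String raw) {
        String text = raw.replaceAll("<[^>]+>", " ").trim().toLowerCase(Locale.ROOT);
        return Arrays.asList(text.split("\\s+"));
    }

    // The stored value is the raw input; analysis never touches it.
    static String stored(String raw) {
        return raw;
    }

    // Hypothetical pre-processing applied to the value itself,
    // so that both the stored and the indexed forms lose the tags.
    static String preprocess(String raw) {
        return raw.replaceAll("<[^>]+>", " ").replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String raw = "<center>Some Content</center>";
        System.out.println("indexed terms: " + analyze(raw));
        System.out.println("stored as-is:  " + stored(raw));
        System.out.println("stored after preprocessing: " + stored(preprocess(raw)));
    }
}
```

In this model the tags disappear from the stored value only when the value is rewritten before it is stored, which mirrors the point made in the reply.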

Re: XmlUpdateRequestHandler with HTMLStripCharFilterFactory

2010-01-11 Thread Erick Erickson
There is escapedTags in HTMLStripCharFilterFactory constructor. Is there a way to get that to work? Thanks -- Aseem

Re: XmlUpdateRequestHandler with HTMLStripCharFilterFactory

2010-01-11 Thread darniz

XmlUpdateRequestHandler with HTMLStripCharFilterFactory

2009-11-11 Thread aseem cheema
I am trying to post a document with the following content using SolrJ: <center>content</center> I need the xml/html tags to be ignored. Even though this works fine in analysis.jsp, this does not work with SolrJ, as the client escapes the < and > with &lt; and &gt; and HTMLStripCharFilterFactory does not

Re: XmlUpdateRequestHandler with HTMLStripCharFilterFactory

2009-11-11 Thread aseem cheema
Alright. It turns out that escapedTags is not for what I thought it was for. The problem that I am having with HTMLStripCharFilterFactory is that it strips the html while indexing the field, but not while storing the field. That is why what I see in analysis.jsp, which is index analysis, does not
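As a side note on the escaping mentioned in the original post: escaping `<` and `>` in an XML payload is ordinary XML encoding, and a standard XML parser undoes it when reading the element text. The sketch below shows that round trip with the JDK's DOM parser. The class and method names are hypothetical, and the single `field` element is a simplified stand-in for an update message, not the exact payload SolrJ sends.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EscapeRoundTrip {
    // Minimal XML text escaping of the kind a client applies before sending.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Parses the XML and returns the text content of the root element,
    // i.e. the value a receiver would see after XML decoding.
    static String fieldValueAfterParsing(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new RuntimeException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String raw = "<center>content</center>"; // hypothetical field value
        String xml = "<field name=\"text\">" + escape(raw) + "</field>";
        System.out.println("on the wire: " + xml);
        System.out.println("after parse: " + fieldValueAfterParsing(xml));
    }
}
```

After parsing, the value contains the literal tags again, which is consistent with the conclusion above that the remaining difference lies between indexed and stored content rather than in the escaping.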