Hey guys, I have an HTMLStripCharFilterFactory char filter declared in my schema.xml for the fieldType text (config below). I use this field type for the body field of my schema. I am seeing different behavior when I post a document with SolrJ (code below) and when I use analysis.jsp. The text I am putting in the field is <center>content</center>.
When SolrJ is used, the field gets the whole value <center>content</center>, but when analysis.jsp is used, it shows only "content" being used for the field. What could I be doing wrong here? How do I get HTMLStripCharFilterFactory to work even when I am pushing data using SolrJ?

Thanks, your help is highly appreciated.

-- Aseem

############# schema.xml ######################

<analyzer type="index">
  <charFilter class="solr.HTMLStripCharFilterFactory"/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
  <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
  <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>

################## SolrJ Code ######################

CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://aseem.desktop.amazon.com:8983/solr/sharepoint");
SolrInputDocument doc = new SolrInputDocument();
UpdateRequest req = new UpdateRequest();
doc.addField("url", "http://haha.com");
// Real content source, commented out while testing with a fixed value:
// doc.addField("body", sbr.toString());
doc.addField("body", "<center>content</center>");
req.add(doc);
// Commit right away, without waiting for flush or for a new searcher
req.setAction(ACTION.COMMIT, false, false);
UpdateResponse resp = req.process(server);
System.out.println(resp);
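
One possible explanation, not confirmed anywhere in this post: char filters and the rest of the analyzer chain only shape the indexed tokens, which is what analysis.jsp displays, while the stored value of a field is kept exactly as it was sent. If that is what is happening here, a workaround is to strip the markup on the client before calling addField. The sketch below is a minimal illustration under that assumption. The class HtmlStripSketch and the method stripTags are hypothetical names, and the regex is a crude stand-in, not the logic HTMLStripCharFilterFactory uses.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr or SolrJ.
final class HtmlStripSketch {

    // Crude tag matcher: anything between '<' and the next '>'.
    // It does not decode entities or handle comments and scripts.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Replace each tag with a space, then collapse runs of whitespace.
    static String stripTags(String html) {
        String withoutTags = TAG.matcher(html).replaceAll(" ");
        return withoutTags.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String raw = "<center>content</center>";
        String cleaned = stripTags(raw);
        System.out.println(cleaned); // prints "content"
        // With SolrJ on the classpath, the cleaned value would be sent instead:
        // doc.addField("body", cleaned);
    }
}
```

With this approach the stored value and the indexed tokens would both come from the cleaned text, and the charFilter in schema.xml stays in place as a safety net for any markup the regex misses.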