Re: HTMLStripCharFilterFactory not working when using SolrJ java client

2009-11-10 Thread aseem cheema
I printed the UpdateRequest object (getXML) and the XML is:
http://haha.com
content
I can see that the issue is because the HTML/XML <> are replaced by
&lt; &gt;. I understand that it is required to do so to keep them from
interfering with the Solr XML document, but how do I accomplish what I
want? I need to get the HTML in the body field stripped out.
Any help is highly appreciated.
Thanks
Aseem

On Tue, Nov 10, 2009 at 10:56 AM, aseem cheema wrote:
> Hey Guys,
> I have HTMLStripCharFilterFactory char filter declared in my
> schema.xml for fieldType text (code below). I am using this field type
> for body field of my schema. I am seeing different behavior when I use
> SolrJ to post a document (code below) and when I use the analysis.jsp.
> The text I am putting in the field is content.
>
> When SolrJ is used, the field gets the whole value
> content, but when analysis.jsp is used, it shows only
> "content" being used for the field.
>
> What am I possibly doing wrong here? How do I get
> HTMLStripCharFilterFactory to work, even if I am pushing data using
> SolrJ. Thanks.
>
> Your help is highly appreciated.
> Thanks
> --
> Aseem
>
> # schema.xml ##
>
>
>
>                            ignoreCase="true"
>                  words="stopwords.txt"
>                  enablePositionIncrements="true"
>                  />
>           generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1"                  catenateAll="0"
> splitOnCaseChange="1"/>
>
>           synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>           protected="protwords.txt"/>
>
>
>
> ## SolrJ Code ##
>     CommonsHttpSolrServer server = new
> CommonsHttpSolrServer("http://aseem.desktop.amazon.com:8983/solr/sharepoint");
>      SolrInputDocument doc = new SolrInputDocument();
>      UpdateRequest req = new UpdateRequest();
>      doc.addField("url", "http://haha.com");
>      /* doc.addField("body", sbr.toString()); */
>      doc.addField("body", "content");
>      req.add(doc);
>      req.setAction(ACTION.COMMIT, false, false);
>      UpdateResponse resp = req.process(server);
>      System.out.println(resp);
>

-- 
Aseem
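
To illustrate the escaping described in this message, here is a minimal sketch using only the Java standard library. It assumes the server reads the update XML with a standard XML parser, which is an assumption and not something confirmed in this thread. The class name EscapeRoundTrip and the sample HTML string are hypothetical; the original field value was lost from the archived message. If the assumption holds, the &lt; &gt; entities are decoded back to < > when the document is parsed, so the escaping alone would not explain the difference seen with SolrJ.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EscapeRoundTrip {

    /** Escapes XML metacharacters, as an XML writer must do for text content. */
    static String escapeXml(String s) {
        // '&' must be replaced first so that the entities added below are not re-escaped.
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Embeds the escaped value in an update-style XML document, parses it, and returns the field text. */
    static String roundTrip(String value) {
        String xml = "<add><doc><field name=\"body\">" + escapeXml(value) + "</field></doc></add>";
        try {
            Document parsed = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // The parser decodes &lt; and &gt; back into literal angle brackets.
            return parsed.getElementsByTagName("field").item(0).getTextContent();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><body>content</body></html>"; // hypothetical sample value
        System.out.println(escapeXml(html)); // what appears on the wire
        System.out.println(roundTrip(html)); // what a parser hands back
    }
}
```

Running main prints the escaped form first and the original HTML second, which shows the two representations carry the same field value.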

Re: HTMLStripCharFilterFactory not working when using SolrJ java client

2009-11-10 Thread aseem cheema
HTMLStripCharFilterFactory class has a constructor that accepts
escapedTags. I believe this will solve my problem. But I am not sure
how to pass this from schema.xml file. I have tried  but
that didn't work.
Anybody?
Thanks

On Tue, Nov 10, 2009 at 10:56 AM, aseem cheema wrote:
> Hey Guys,
> I have HTMLStripCharFilterFactory char filter declared in my
> schema.xml for fieldType text (code below). I am using this field type
> for body field of my schema. I am seeing different behavior when I use
> SolrJ to post a document (code below) and when I use the analysis.jsp.
> The text I am putting in the field is content.
>
> When SolrJ is used, the field gets the whole value
> content, but when analysis.jsp is used, it shows only
> "content" being used for the field.
>
> What am I possibly doing wrong here? How do I get
> HTMLStripCharFilterFactory to work, even if I am pushing data using
> SolrJ. Thanks.
>
> Your help is highly appreciated.
> Thanks
> --
> Aseem
>
> # schema.xml ##
>        
>          
>          
>                            ignoreCase="true"
>                  words="stopwords.txt"
>                  enablePositionIncrements="true"
>                  />
>           generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1"                  catenateAll="0"
> splitOnCaseChange="1"/>
>          
>           synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>           protected="protwords.txt"/>
>          
>        
>
> ## SolrJ Code ##
>     CommonsHttpSolrServer server = new
> CommonsHttpSolrServer("http://aseem.desktop.amazon.com:8983/solr/sharepoint");
>      SolrInputDocument doc = new SolrInputDocument();
>      UpdateRequest req = new UpdateRequest();
>      doc.addField("url", "http://haha.com");
>      /* doc.addField("body", sbr.toString()); */
>      doc.addField("body", "content");
>      req.add(doc);
>      req.setAction(ACTION.COMMIT, false, false);
>      UpdateResponse resp = req.process(server);
>      System.out.println(resp);
>



-- 
Aseem