have been working there
all along.)
Steve
-----Original Message-----
From: Mike Hugo [mailto:m...@piragua.com]
Sent: Tuesday, January 24, 2012 3:56 PM
To: solr-user@lucene.apache.org
Subject: Re: HTMLStripCharFilterFactory not working in Solr4?
Thanks for the responses everyone.
Steve, the test method you provided also works for me. However
We recently updated to the latest build of Solr4 and everything is working
really well so far! There is one case that is not working the same way it
was in Solr 3.4 - we strip out certain HTML constructs (like trademark and
registered, for example) in a field as defined below - it was working in
You can use LegacyHTMLStripCharFilterFactory to get the previous behavior.
See https://issues.apache.org/jira/browse/LUCENE-3690 for more details.
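
As a rough sketch only (the surrounding analyzer definition is assumed, not taken from the original schema), the swap would be a one-line change to the charFilter declaration in schema.xml:

```xml
<!-- Sketch: replaces an existing solr.HTMLStripCharFilterFactory entry.
     The rest of the analyzer chain is assumed to stay unchanged. -->
<charFilter class="solr.LegacyHTMLStripCharFilterFactory"/>
```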
-Yonik
http://www.lucidimagination.com
On Tue, Jan 24, 2012 at 1:34 PM, Mike Hugo m...@piragua.com wrote:
We recently updated to the latest build
Thanks for the response Yonik,
Interestingly enough, changing to the LegacyHTMLStripCharFilterFactory
does NOT solve the problem - in fact I get the same result
I can see that the LegacyHTMLStripCharFilterFactory is being applied at
startup:
Jan 24, 2012 1:25:29 PM
Try putting the HTMLStripCharFilterFactory before the StandardTokenizerFactory
instead of after it. I vaguely recall being burned by something like this
before.
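
Something along these lines is what I have in mind. This is only a sketch: the field type name text_html_stripped and the LowerCaseFilterFactory entry are made up for illustration and are not from the original schema. It lists the char filter ahead of the tokenizer, which is the conventional layout since char filters work on the raw character stream before tokenization:

```xml
<!-- Hypothetical field type; the name and the lowercase filter are illustrative only -->
<fieldType name="text_html_stripped" class="solr.TextField">
  <analyzer>
    <!-- Char filter declared first: strips markup from the raw text -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <!-- Tokenizer then sees the already-stripped text -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```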
-Michael
Oops, I didn't read carefully enough to see that you wanted those constructs
entirely stripped out.
Given that you're seeing numbers indexed, this strongly indicates an
escaping bug in the SolrJ client that must have been introduced at
some point.
I'll see if I can reproduce it in a unit test.