Yeah, I realize my "fix" is more of a bandage. While it wouldn't be a good long-term solution, how about ignoring unrecognized types and logging a warning message so the handler doesn't crash? The Jira ticket could then be left open (and hopefully assigned) to fix the actual problem. This would save consumers from having to avoid the scenario or manually patch the file to work around the problem.
On Wed, May 8, 2013 at 11:49 AM, Shawn Heisey <s...@elyograg.org> wrote:

> On 5/8/2013 9:20 AM, Shane Perry wrote:
>
>> I opened a Jira issue in Oct of 2011 which is still outstanding. I've
>> boosted the priority to Critical as each time I've upgraded Solr, I've had
>> to manually patch and build the jars. There is a patch (for 3.6) attached
>> to the ticket. Is there someone with commit access who can take a look and
>> poke the fix through (preferably on 4.2 as well as 4.3)? The ticket is
>> https://issues.apache.org/jira/browse/SOLR-2834
>
> Your patch just ignores the problem so the request doesn't crash, it
> doesn't fix it. We need to fix whatever the problem is in
> HTMLStripCharFilter.
>
> I had hoped I could come up with a quick fix, but it's proving too
> difficult for me to unravel. I can't even figure out it works on "good"
> analysis components like WhiteSpaceTokenizer, so I definitely can't see
> what the problem is for HTMLStripCharFilter.
>
> Thanks,
> Shawn