Please, does anyone have a JSPParser class that parses JSPs?

I hacked the HTMLParser class that comes with the Lucene demo and made it parse and 
index JSPs. But when I ran a search, JSP tags such as 

 <%pageContext.setAttribute( "req", request );%> 
<%@ page import="com.propelnewmedia.tags.BreadcrumbTrailer"%>


and so on were included in the summary. 
Then I figured out a way to keep the JSP tags out of the summary (and, I think, out of 
the index as well). 

What I did was designate JSP tags (anything starting with <% and ending with %>) as a 
third comment type in the void CommentTag() :, TOKEN :, and <WithinCommentN> TOKEN : 
sections of HTMLParser.jj. 

I just copied and pasted the relevant code for Comment2 and mimicked that for my new 
Comment type. I then recompiled HTMLParser.jj using javacc. 
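
For anyone who would rather not touch the grammar, a rough alternative is to strip the 
JSP blocks out of the text before it ever reaches the parser. This is not what I did 
above, just a minimal sketch using java.util.regex; the class and method names are made 
up for illustration, and it will end a block too early if a scriptlet contains the 
characters %> inside a Java string literal.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene or its demo): removes JSP blocks
// from raw page text so they never reach the HTML parser.
class JspTagStripper {

    // Matches from "<%" to the nearest "%>". DOTALL lets "." span line breaks,
    // and the reluctant ".*?" stops at the first closing "%>".
    private static final Pattern JSP_BLOCK =
            Pattern.compile("<%.*?%>", Pattern.DOTALL);

    // Replaces every JSP block with a single space so that the words on
    // either side of it do not get glued together.
    static String stripJspTags(String source) {
        return JSP_BLOCK.matcher(source).replaceAll(" ");
    }

    public static void main(String[] args) {
        String jsp =
                "<%@ page import=\"com.propelnewmedia.tags.BreadcrumbTrailer\"%>\n"
              + "<p>Hello <%pageContext.setAttribute( \"req\", request );%>world</p>";
        System.out.println(stripJspTags(jsp));
    }
}
```

The cleaned string could then be wrapped in a java.io.StringReader and handed to whatever 
parser is doing the indexing.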

I'm still not out of the woods, though. I still need to know how to make Lucene not 
include list element values, etc. in the search hits. For instance, if a keyword 
happens to be in a <select> list, it gets counted as a hit. 
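
The only idea I have so far is to cut the list markup out of the text before indexing. 
Here is a minimal sketch of that, assuming the lists in question are HTML <select> 
elements and that dropping their whole content is acceptable. The class and method 
names are invented for illustration, it uses only java.util.regex, and it would miss 
a <select> whose attributes contain a ">" character (for example an embedded JSP 
expression).

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): drops <select>...</select>
// elements, options included, from HTML text before it is indexed.
class SelectStripper {

    // "<select" followed by a word boundary and any attributes, then the
    // shortest possible body up to the matching "</select>".
    // CASE_INSENSITIVE also catches <SELECT>; DOTALL lets the body span lines.
    private static final Pattern SELECT_BLOCK = Pattern.compile(
            "<select\\b[^>]*>.*?</select\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Replaces each select element with a single space so that the text
    // around it stays separated.
    static String stripSelectLists(String html) {
        return SELECT_BLOCK.matcher(html).replaceAll(" ");
    }

    public static void main(String[] args) {
        String html = "<p>Pick one:</p>\n"
                    + "<select name=\"color\">\n"
                    + "  <option>red</option>\n"
                    + "  <option>blue</option>\n"
                    + "</select>\n"
                    + "<p>Thanks</p>";
        System.out.println(stripSelectLists(html));
    }
}
```

This throws away the option text entirely, so a keyword that appears only inside a list 
would no longer match anything.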

Any suggestions (or preferably, working code) would be massively appreciated! Thanks 
in advance.
