Does anyone have a JSPParser class that parses JSPs? I hacked the HTMLParser class that comes with the Lucene demo so that it parses and indexes JSPs. But when I ran a search, JSP tags such as

<%pageContext.setAttribute( "req", request );%>
<%@ page import="com.propelnewmedia.tags.BreadcrumbTrailer"%>

were included in the summary.

I then figured out a way to keep the JSP tags out of the summary (and, I think, out of the index as well). I designated JSP tags (anything starting with <% and ending with %>) as a third comment type in the void CommentTag() :, TOKEN :, and <WithinCommentN> TOKEN : sections of HTMLParser.jj. I simply copied the relevant code for Comment2 and mimicked it for my new comment type, then recompiled HTMLParser.jj with javacc.

I'm still not out of the woods, though. I still need to know how to keep Lucene from counting list element values and the like as search hits. For instance, if a keyword happens to appear in a <select> list, it gets counted as a hit.

Any suggestions (or, preferably, working code) would be massively appreciated! Thanks in advance.
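One possible workaround for both problems, offered only as a rough sketch, is to pre-filter the page text before it reaches the parser, so that neither JSP tags nor list elements such as <select> are ever seen by the indexer. The class name JspPreFilter and its strip method are hypothetical and not part of Lucene or its demo. The sketch uses plain regular expressions, so it assumes that no scriptlet contains the literal text %> inside a Java string and that <select> elements are not nested.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: removes JSP tags and
// <select> blocks from page text before it is parsed and indexed.
final class JspPreFilter {

    // Anything from "<%" up to the nearest "%>": directives, scriptlets,
    // expressions, and <%-- --%> comments. DOTALL lets '.' match newlines.
    private static final Pattern JSP_TAG =
            Pattern.compile("<%.*?%>", Pattern.DOTALL);

    // A whole <select ...> ... </select> element, matched case-insensitively.
    private static final Pattern SELECT_BLOCK =
            Pattern.compile("<select\\b[^>]*>.*?</select\\s*>",
                    Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    private JspPreFilter() {
    }

    // JSP tags are removed first, so an expression embedded in a
    // <select> attribute cannot confuse the second pattern.
    static String strip(String page) {
        String withoutJsp = JSP_TAG.matcher(page).replaceAll(" ");
        return SELECT_BLOCK.matcher(withoutJsp).replaceAll(" ");
    }
}
```

The filtered string could then be handed to the HTML parser in place of the raw file contents, for example through a java.io.StringReader if your copy of HTMLParser accepts a Reader. Filtering up front trades some precision for simplicity; doing the same thing inside HTMLParser.jj would be more robust but means maintaining a modified grammar.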