I tested against the phrase in my text, '<b>men's college soccer</b>', matching successfully on 'college AND soccer*'. However, I found no match for 'college AND soccer', 'college AND soccer<*', 'college AND soccer<', 'college AND soccerb', 'college AND soccerb*', or 'college AND soccer/'.
Regards, Terry ---- Original Message ----- From: "Otis Gospodnetic" <[EMAIL PROTECTED]> To: "Lucene Users List" <[EMAIL PROTECTED]> Sent: Friday, October 18, 2002 9:32 PM Subject: Re: Tags Screwing up Searches > Is it possible that the Analyzer is stripping <, >, and / characters > and leaving you with terms like: bCollege and Soccerb ? > > Otis > > --- Terry Steichen <[EMAIL PROTECTED]> wrote: > > Some content I'm indexing contains certain HTML tags, like <p>, <b>, > > <i>, etc. What I find is that when a term I'm searching for touches > > one of these tags (which is fairly typical), the term isn't > > recognized and the search fails. For example, <b>College Soccer</b> > > doesn't match on either "college" or "soccer". I seem to recall > > someone else bring up a similar problem with a word that ends a > > sentence (and is thus treated as if the period was part of the word), > > but don't recall what the response was and I can't find that thread. > > > > Does anyone have some ideas on what's the best way to handle this? > > Filter out the tags in the process of creating the Document for > > indexing? Or through a modification to the Analyzer (I'm using the > > StandardAnalyzer)? Or something else? > > > > TIA, > > > > Terry > > > > > > > __________________________________________________ > Do you Yahoo!? > Y! Web Hosting - Let the expert host your web site > http://webhosting.yahoo.com/ > > -- > To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@;jakarta.apache.org> > For additional commands, e-mail: <mailto:lucene-user-help@;jakarta.apache.org> > > -- To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@;jakarta.apache.org> For additional commands, e-mail: <mailto:lucene-user-help@;jakarta.apache.org>