I tested against a phrase in my text, '<b>men's college soccer</b>'.
The query 'college AND soccer*' matched.  However, none of the following
matched: 'college AND soccer', 'college AND soccer<*', 'college AND soccer<',
'college AND soccerb', 'college AND soccerb*', or 'college AND soccer/'.
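
For reference, here is a minimal sketch of the first option raised in the
quoted message below: filtering the tags out of the text before it goes into
the Document.  It uses only java.util.regex and does not touch Lucene at all.
The class name TagStripper and the method stripTags are hypothetical names
chosen for illustration, and the regex is a rough assumption that every tag
is a simple <...> run; it is not a real HTML parser and will mishandle
comments, scripts, or a literal '<' in the text.  Each tag is replaced with a
space rather than deleted, so that words on either side of a tag are not
fused together.

```java
import java.util.regex.Pattern;

public class TagStripper {
    // Assumption: a tag is '<', then any run of characters other than '>', then '>'.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    // Collapses the runs of whitespace left behind after tags are replaced.
    private static final Pattern SPACES = Pattern.compile("\\s+");

    /** Returns the text with simple tags replaced by spaces and whitespace normalized. */
    public static String stripTags(String html) {
        String noTags = TAG.matcher(html).replaceAll(" ");
        return SPACES.matcher(noTags).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // The cleaned string is what would be handed to the indexer
        // in place of the raw content.
        System.out.println(stripTags("<b>men's college soccer</b>"));
        System.out.println(stripTags("<p>Some <i>other</i> text.</p>"));
    }
}
```

With the tags gone before analysis, the analyzer never sees '<', '>', or '/'
next to a word, whatever it would otherwise have done with them.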



----- Original Message -----
From: "Otis Gospodnetic" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Friday, October 18, 2002 9:32 PM
Subject: Re: Tags Screwing up Searches

> Is it possible that the Analyzer is stripping <, >, and / characters
> and leaving you with terms like: bCollege and Soccerb ?
> Otis
> --- Terry Steichen <[EMAIL PROTECTED]> wrote:
> > Some content I'm indexing contains certain HTML tags, like <p>, <b>,
> > <i>, etc.  What I find is that when a term I'm searching for touches
> > one of these tags (which is fairly typical), the term isn't
> > recognized and the search fails.  For example, <b>College Soccer</b>
> > doesn't match on either "college" or "soccer".  I seem to recall
> > someone else bringing up a similar problem with a word that ends a
> > sentence (and is thus treated as if the period were part of the word),
> > but I don't recall what the response was and I can't find that thread.
> >
> > Does anyone have some ideas on what's the best way to handle this?
> > Filter out the tags in the process of creating the Document for
> > indexing? Or through a modification to the Analyzer (I'm using the
> > StandardAnalyzer)? Or something else?
> >
> > TIA,
> >
> > Terry
> >
> >

To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
