[ http://issues.apache.org/jira/browse/LUCENE-733?page=all ]
Hoss Man resolved LUCENE-733.
-----------------------------
Resolution: Invalid
The behavior described very likely depends on the Analyzers used when
indexing the source text and when parsing the query. Without specific code
demonstrating exactly which Analyzers were used, there isn't really any evidence
of a "bug".
When getting unexpected results back from a Lucene search, please consult the
user mailing list before submitting a bug. The number of people
reading/replying to the user list who can provide assistance in understanding
the results you are getting is much larger than the number of people watching
the Jira issue queue.
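To illustrate the point about Analyzers, here is a minimal sketch in plain Java.
It does not use the Lucene API: ToyAnalyzerDemo and tokenize() are hypothetical
names, and the tokenizer is only assumed to resemble an analyzer that lowercases
text and splits on anything that is not a letter or digit. The analyzer actually
used in the report is not known. Under that assumption, punctuation such as
commas and parentheses never reaches the index as part of a term, so a query
term that still carries the punctuation (for example because wildcard or fuzzy
terms were not analyzed at query time) may have nothing to match exactly.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyAnalyzerDemo {

    // Hypothetical stand-in for an analyzer (NOT a Lucene class):
    // lowercases letters/digits and treats every other character as a separator.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A separator ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Terms that would be indexed under this assumed analysis.
        System.out.println(tokenize("Smith, Bob"));      // prints [smith, bob]
        System.out.println(tokenize("(212) 422-5100"));  // prints [212, 422, 5100]

        // The unanalyzed query term "smith," is not among the indexed terms.
        System.out.println(tokenize("Smith, Bob").contains("smith,")); // prints false
    }
}
```

Printing the tokens your real Analyzer produces for both the document text and
the query string, in the same spirit, is usually the quickest way to see why a
term does or does not match.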
> problems with some non word ascii characters in searches
> ---------------------------------------------------------
>
> Key: LUCENE-733
> URL: http://issues.apache.org/jira/browse/LUCENE-733
> Project: Lucene - Java
> Issue Type: Bug
> Components: QueryParser, Search
> Reporter: Neil Despain
>
> Here are a number of examples of searches that are not acting as I would
> expect.
> 1.
> ---------
> I have a document with the text:
> Smith, Bob
> 1.a
> If I do a search:
> Smith,~0.9 Bob~0.9
> MultiPhraseQueryParser.parse(term) returns a query for:
> content:smith,~0.9 content:bob~0.9
> But it only gets a hit on: Bob
> 1.b
> If I do this search:
> "Smith,~0.9 Bob~0.9"~1
> MultiPhraseQueryParser.parse(term) returns a query for:
> content:"bob"~1
> and it also only returns a hit for: Bob
> In both cases, words that end with a comma are not found. (Other characters
> have the same effect as commas.)
> =========
> 2.
> ---------
> For a document with phone numbers:
> 2124225100
> 212 422 5100
> 212-422-5100
> (212) 422-5100
> (212)4225100
> (212)422-5100
> (212) 422.5100
> (212) 422 5100
> 212.422.5100
> 212.422-5100
> 2.a
> If I do a search:
> 212*422*5100~0.9
> MultiPhraseQueryParser.parse(term) returns a query for:
> content:"(212.422-5100 212-422-5100 2124225100 212.422.5100)"
> I do not get a match on 212)422-5100 -- Doesn't find anything that starts
> with (212)...
> 2.b
> Search term:
> 212*422*5100
> MultiPhraseQueryParser.parse(term) returns a query for:
> content:212*422*5100
> and does not match 212)422-5100 -- Doesn't find anything that starts with
> (212)...
> 2.c
> If I try to work around that by searching with proximity for:
> "212 422*5100"~1
> MultiPhraseQueryParser.parse(term) returns a query for:
> content:"(422-5100 422.5100 4225100)"~1
> and again does not find anything with (212)... like (212) 422-5100 or
> (212)422-5100
> =========