Problems with some non-word ASCII characters in searches
--------------------------------------------------------
Key: LUCENE-733
URL: http://issues.apache.org/jira/browse/LUCENE-733
Project: Lucene - Java
Issue Type: Bug
Components: QueryParser, Search
Reporter: Neil Despain
Here are a number of examples of searches that are not acting as I would expect.
1.
---------
I have a document with the text:
Smith, Bob
1.a
If I do a search:
Smith,~0.9 Bob~0.9
MultiPhraseQueryParser.parse(term) returns a query for:
content:smith,~0.9 content:bob~0.9
But it only gets a hit on: Bob
1.b
If I do this search:
"Smith,~0.9 Bob~0.9"~1
MultiPhraseQueryParser.parse(term) returns a query for:
content:"bob"~1
and it also only returns a hit for: Bob
In both cases, words that end with a comma are not found. (Other characters have
the same effect as commas.)
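
One possible explanation for case 1, sketched in plain Java rather than with
Lucene classes: if the analyzer used at index time strips the comma, the index
holds "smith" while the fuzzy query term is still "smith,". The sketch assumes
the fuzzy similarity is computed as 1 - editDistance / min(length); both that
formula and the class name FuzzySimilaritySketch are assumptions made for
illustration, not something confirmed by this report.

```java
class FuzzySimilaritySketch {

    // Classic Levenshtein edit distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // ASSUMPTION: similarity = 1 - distance / length of the shorter term.
    static double similarity(String queryTerm, String indexedTerm) {
        int shorter = Math.min(queryTerm.length(), indexedTerm.length());
        if (shorter == 0) {
            return 0.0;
        }
        return 1.0 - (double) editDistance(queryTerm, indexedTerm) / shorter;
    }

    public static void main(String[] args) {
        // Query term keeps the comma; the indexed term is assumed to have lost it.
        System.out.println(similarity("smith,", "smith"));
        System.out.println(similarity("bob", "bob"));
    }
}
```

Under these assumptions, "smith," against "smith" has an edit distance of 1 and
a similarity of 1 - 1/5 = 0.8, which is below the requested 0.9, so the term
would be dropped while "bob" still matches.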
=========
2.
---------
For a document with phone numbers:
2124225100
212 422 5100
212-422-5100
(212) 422-5100
(212)4225100
(212)422-5100
(212) 422.5100
(212) 422 5100
212.422.5100
212.422-5100
2.a
If I do a search:
212*422*5100~0.9
MultiPhraseQueryParser.parse(term) returns a query for:
content:"(212.422-5100 212-422-5100 2124225100 212.422.5100)"
I do not get a match on (212)422-5100; it doesn't find anything that starts
with (212)...
2.b
Search term:
212*422*5100
MultiPhraseQueryParser.parse(term) returns a query for:
content:212*422*5100
and does not match (212)422-5100; it doesn't find anything that starts with
(212)...
2.c
If I try to work around that by searching with proximity for:
"212 422*5100"~1
MultiPhraseQueryParser.parse(term) returns a query for:
content:"(422-5100 422.5100 4225100)"~1
and again does not find anything with (212)... like (212) 422-5100 or
(212)422-5100
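
The wildcard behavior in case 2 can be reproduced outside Lucene with a small
plain-Java sketch. It assumes that a wildcard pattern has to match an entire
indexed term from its first character to its last, and that "(212)422-5100" is
indexed either as one term or split into "212" and "422-5100". How the
analyzer actually tokenizes these strings is not stated in this report, and
WildcardTermSketch and its methods are hypothetical names, not Lucene API.

```java
import java.util.regex.Pattern;

class WildcardTermSketch {

    // Translate '*' (any run of characters) and '?' (any single character)
    // into a regular expression; every other character is matched literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // ASSUMPTION: the pattern must cover the whole term, not a substring.
    static boolean matchesTerm(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        String[] terms = {
            "2124225100", "212-422-5100", "212.422.5100", "212.422-5100",
            "(212)422-5100",   // kept as a single term
            "212", "422-5100"  // or split at the parentheses
        };
        for (String term : terms) {
            System.out.println(term + " -> " + matchesTerm("212*422*5100", term));
        }
    }
}
```

With these assumptions the pattern matches exactly the four terms listed in
the parsed query in 2.a. It fails on "(212)422-5100" because the term starts
with "(" rather than "212", and it fails on "212" and "422-5100" because
neither fragment contains the whole number.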
=========