We are using Lucene 5.0. Some of our documents are being indexed with a 
trailing comma attached to a term. For example, take the text "John Doe, bob 
smith, and jane go into a bar." Our analyzer is a WhitespaceTokenizer followed 
by a LowerCaseFilter. If we search for "Doe", nothing is found because the 
term in the index is "doe," with the comma attached. Is there a way to make 
Lucene ignore the comma? The current workaround is to have users end their 
search with *. This is slow and also returns unwanted matches such as "Does" 
when we search for "Doe*".
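
To make the problem concrete, here is a minimal sketch in plain Java that 
only mimics what whitespace splitting followed by lowercasing does to the 
example text. It does not use Lucene's API; the class name 
WhitespaceLowercaseDemo and the tokenize method are made up for illustration, 
and our real analyzer may differ in details.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WhitespaceLowercaseDemo {

    // Hypothetical helper: splits on whitespace, then lowercases each piece.
    // This only imitates the analyzer chain described above; it is not Lucene code.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("John Doe, bob smith, and jane go into a bar.");
        System.out.println(tokens);
        // The bare term is absent because punctuation stays attached to the token.
        System.out.println("contains \"doe\":  " + tokens.contains("doe"));
        System.out.println("contains \"doe,\": " + tokens.contains("doe,"));
    }
}
```

Because splitting happens only on whitespace, the comma stays part of the 
token, so an exact search for the bare name finds nothing.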

Thank you.

________________________________



Thomas W. Johnson, Senior Programmer
678-397-1663
tjohn...@paperhost.com<mailto:tjohn...@paperhost.com>

