We are using Lucene 5.0. Some of our documents are being indexed with a comma attached to the end of a value. For example: "John Doe, bob smith, and jane go into a bar." Our analyzer is a WhitespaceTokenizer followed by a LowerCaseFilter. If we search for "Doe", nothing is found, because the indexed term keeps the trailing comma ("doe," once lowercased). Is there a way to have the comma ignored? Our current workaround is to have users end their search with *. This is slow, and it also returns unwanted values: searching for "Doe*" matches "Does" as well.
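To make the problem concrete, here is a minimal sketch in plain Java. It does not use Lucene: `whitespaceLowercase` only imitates splitting on whitespace and lowercasing, and `stripEdgePunctuation` is a hypothetical extra step showing the behavior we are after. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class CommaTokenDemo {

    // Imitates whitespace tokenization followed by lowercasing.
    // Plain Java only; this is NOT the Lucene analyzer itself.
    static List<String> whitespaceLowercase(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Hypothetical extra step: remove punctuation from both ends of each token,
    // so "doe," becomes "doe" and "bar." becomes "bar".
    static List<String> stripEdgePunctuation(List<String> tokens) {
        List<String> cleaned = new ArrayList<>();
        for (String token : tokens) {
            String stripped = token.replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");
            if (!stripped.isEmpty()) {
                cleaned.add(stripped);
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        String text = "John Doe, bob smith, and jane go into a bar.";

        // What we effectively have today: the comma stays on the token.
        List<String> indexed = whitespaceLowercase(text);
        System.out.println(indexed);
        System.out.println("exact match for doe: " + indexed.contains("doe"));

        // What we would like: the same tokens without edge punctuation.
        List<String> cleaned = stripEdgePunctuation(indexed);
        System.out.println(cleaned);
        System.out.println("exact match for doe: " + cleaned.contains("doe"));
    }
}
```

With the punctuation stripped, an exact search for "doe" would match without a wildcard, so "Does" would no longer come back.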
Thank you.

Thomas W. Johnson, Senior Programmer