Mark, I understand your point. However, we do not maintain a separate field for the lower-case version of the words. Instead, we index each word twice at the same position within the same field: once as written and once lower-cased. This gives us case-exact matching for search queries containing upper-case characters, and case-insensitive matching for queries given in all lower case. So I'm afraid I can't use the technique you recommend.
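To make the scheme concrete, here is a minimal sketch in plain Python. It is not our Lucene analyzer: the names `analyze`, `build_index` and `search` are hypothetical, tokenization is a simple whitespace split, and the "index" is just a dictionary. It only illustrates the idea of emitting the original and lower-cased forms at the same position and looking up query terms exactly as typed.

```python
from collections import defaultdict


def analyze(text):
    """Yield (position, term) pairs for a whitespace-tokenized text.

    A token containing upper-case characters is emitted twice at the
    same position: once as written and once lower-cased (comparable to
    a second token with a position increment of 0).
    """
    for pos, token in enumerate(text.split()):
        yield pos, token
        lowered = token.lower()
        if lowered != token:
            yield pos, lowered


def build_index(docs):
    """Build a toy inverted index: term -> set of (doc_id, position)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for pos, term in analyze(text):
            index[term].add((doc_id, pos))
    return index


def search(index, word):
    """Look up the query word as typed; no lower-casing at query time."""
    return sorted({doc_id for doc_id, _ in index.get(word, ())})


# Hypothetical sample documents, for illustration only.
docs = {1: "Apple pie recipe", 2: "an apple a day"}
index = build_index(docs)

print(search(index, "apple"))  # all-lower-case query: case-insensitive
print(search(index, "Apple"))  # query with upper case: case-exact
```

With these sample documents, the query "apple" matches both documents, while "Apple" matches only the document that contains it in that exact case. Because both forms share one field, there is no separate lower-case field to point another component at.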
/Jong

-----Original Message-----
From: markharw00d [mailto:[EMAIL PROTECTED]
Sent: Monday, July 09, 2007 3:13 PM
To: [email protected]
Subject: Re: Stop-words comparison in MoreLikeThis class in Lucene's contrib/queries project

>>the case matters only for those words that should be included.

Jong, just want to check we're on the same page - you do know MoreLikeThis has a kind of automatic Stop-Wording built in, yes? MoreLikeThis looks at the document frequency of all terms in the "this" text you provide and only selects a shortlist (up to maxQueryTerms) of the rarer words. As such, users (admin or otherwise) surrender precise control over what terms are used, hence my earlier point "does case really matter in this 'inexact' scenario?" and can you use the lower-case version of the field you said you already create?

Cheers
Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
