: >>> Just curious: it would seem easier to use multiple fields for the : >>> original case and lowercase searching. Is there any particular reason : >>> you analyzed the documents to multiple indexes instead of multiple : >>> fields? : >> : >> I considered that approach, however to expose QueryParser I'd have to : >> get tricky. If I have title_orig and title_lc fields, how would I : >> allow freeform queries of title:something?
Why have seperate fields? Why not index the title into the "title" field twice, once with each term lowercased and once with the case left alone. (Using an analyzer that tokenizes "The Quick BrOwN fox" as "[the] [quick] [brown] [fox] [The] [Quick] [BrOwN] [fox]") Then at search time, depending on the value of of the checkbox, construct your QueryParser using the appropriate Analyzer. The only problem i can think of would be inflated scores for terms that are naturally lowercased, because they would wind up getting added to the index twice, but based on what i've seen of hte data you are working with, i imageing that if you used UPPERCASE instead of lowercase you could drasticly reduce the likelyhood of any problems with that. -Hoss --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]