On Thu, Jul 30, 2009 at 11:22 AM, Matthew Hall <[email protected]> wrote:
>
> 1. Sure, just have an analyzer that splits on all non letter characters.
> 2. Phrase queries keep the order intact. (And yes, the positional
> information for the terms is kept, which is what allows span queries to work)
>
> So searching on the following "foo bar com" will match [email protected] but not
> [email protected]

Thanks, I really appreciate your help with this. That's great to know.

Can I take this a little further? If I have "[email protected] [email protected] [email protected]" and analyze it, I get "foo bar com bar foo com com bar foo". A phrase query on that token stream will also match combinations that straddle two neighbouring addresses, e.g. [email protected], which is not one of the emails. So perhaps I need a different way of delimiting the emails.

Has anyone done anything similar? I can imagine that one option would be to filter the returned docs based on the original content of the string I'm analyzing. Does Lucene do this for me?

Thanks,
Phil
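
One common way to stop phrase matches from straddling two addresses is to leave a gap in the token positions between values. In Lucene this is what Analyzer.getPositionIncrementGap is for when each address is added as a separate value of the same field. The sketch below is not Lucene code. It is a small plain-Python simulation of positional matching, and the three addresses, the helper names (tokenize, index_values, phrase_matches) and the gap of 100 are all illustrative assumptions.

```python
import re


def tokenize(text):
    """Split on every non-letter character and lowercase the pieces."""
    return [t.lower() for t in re.split(r"[^A-Za-z]+", text) if t]


def index_values(values, position_gap=0):
    """Return (position, term) pairs for a multi-valued field.

    position_gap is the number of unused positions left between
    consecutive values (hypothetical stand-in for a position increment gap).
    """
    postings = []
    pos = 0
    for i, value in enumerate(values):
        if i > 0:
            pos += position_gap
        for term in tokenize(value):
            postings.append((pos, term))
            pos += 1
    return postings


def phrase_matches(postings, phrase):
    """Exact phrase match (no slop): terms must occupy consecutive positions."""
    terms = tokenize(phrase)
    if not terms:
        return False
    by_pos = dict(postings)
    return any(
        all(by_pos.get(start + k) == term for k, term in enumerate(terms))
        for start in by_pos
    )


# Illustrative addresses that produce the token stream quoted in the message.
emails = ["foo@bar.com", "bar@foo.com", "com@bar.foo"]

no_gap = index_values(emails)
with_gap = index_values(emails, position_gap=100)

# "bar@com.bar" is not in the list, yet it matches when positions are contiguous.
print(phrase_matches(no_gap, "bar@com.bar"))
print(phrase_matches(with_gap, "bar@com.bar"))
print(phrase_matches(with_gap, "bar@foo.com"))
```

With contiguous positions the bogus address matches because its terms line up across the boundary between the first and second address. With a gap, only terms from the same address can be adjacent, so an exact phrase query finds real addresses only. The gap just has to be larger than any slop used in the phrase queries.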
