Re: Do we have some sort of recomposing token filter?
Hi Alexandre,

CombiningFilter sounds close (though it has no option to put spaces between the original terms), but it hasn't been committed yet: https://issues.apache.org/jira/browse/LUCENE-3413

Steve

On Jan 8, 2013, at 4:55 PM, Alexandre Rafalovitch arafa...@gmail.com wrote:

> Hello,
>
> I want to take a composite email address such as John Doe john...@example.com and leave John Doe as a facet field. So far, I have the UAX29 Tokenizer combined with TypeTokenFilterFactory to filter out the email type. But that leaves me with John and Doe as separate tokens, and I cannot figure out how to combine them, with a space in between, back into John Doe.
>
> I thought about using regexp instead to just string , but that feels even less robust.
>
> Do we have anything ready to use for that, or do I need to write custom code?
>
> Regards,
> Alex.
> Personal blog: http://blog.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
Re: Do we have some sort of recomposing token filter?
Hi,

Are you just trying to extract the personal name? I think JavaMail has the ability to do that.

Otis
Solr ElasticSearch Support
http://sematext.com/
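
For reference, here is a rough sketch of the "extract the personal name before indexing" idea raised in this thread, using only java.util.regex. It assumes the address part is wrapped in angle brackets (the archive may have stripped them from the example in the original message). The class name DisplayNameExtractor, the method extractName and the sample addresses in the usage note are hypothetical and not part of Solr, Lucene or JavaMail. JavaMail's InternetAddress.getPersonal() is probably the more robust route Otis has in mind, since this pattern does not attempt to cover everything RFC 822 allows.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Solr or Lucene.
final class DisplayNameExtractor {

    // Assumed input shape: optional quoted display name, then <address>.
    // Group 1 captures the display name without surrounding quotes.
    private static final Pattern NAME_ADDR =
            Pattern.compile("^\\s*\"?([^\"<]*?)\"?\\s*<[^<>]*>\\s*$");

    // Returns the display name, or an empty string if the input does not
    // look like "Name <address>" (for example, a bare address).
    static String extractName(String raw) {
        if (raw == null) {
            return "";
        }
        Matcher m = NAME_ADDR.matcher(raw);
        return m.matches() ? m.group(1).trim() : "";
    }
}
```

Called as DisplayNameExtractor.extractName("John Doe <jd@example.com>"), this returns the name as one string, which could then be sent to an untokenized (string) facet field instead of being recombined from tokens inside the analysis chain. A bare address with no display name yields an empty string.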