Handling MeSH descriptor preferred terms and the like is a similar problem. I encountered this while evaluating Solr for a project here at NLM. We decided to use Solr for different projects instead. I considered the following approaches:

- Use a custom tokenizer at index time that indexes all of the multiple-term alternatives.
- Index the data, then run an enrichment process that queries on each source synonym and generates an update to add the target synonyms. Follow this with an optimize.
- During the indexing process, but before sending the data to Solr, process the data to tokenize it and add synonyms to another field.
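
For what it's worth, the third approach (pre-processing before the data reaches Solr) might look roughly like the sketch below. This is only an illustration under assumptions, not something we ran: the field names `text` and `text_synonyms`, the `SYNONYMS` table, and the `enrich` helper are all hypothetical, and the tokenization is a crude regex that does not reproduce Solr's analysis chain. The Tween 20 mapping is taken from the question as asked.

```python
import re

# Hypothetical synonym table: normalized source phrase -> target phrases.
# Keys must be lowercase with single spaces between tokens.
SYNONYMS = {
    "tween 20": ["Polysorbate 40"],  # mapping as given in the question
}


def find_synonyms(text, synonyms=SYNONYMS):
    """Return target synonyms for every source phrase found in text.

    Tokenization is a plain regex split on word characters; it is an
    assumption for illustration and will differ from Solr's tokenizer.
    """
    tokens = re.findall(r"\w+", text.lower())
    max_len = max((len(key.split()) for key in synonyms), default=0)
    found = []
    for start in range(len(tokens)):
        # Try every phrase length up to the longest source phrase.
        for length in range(1, max_len + 1):
            if start + length > len(tokens):
                break
            phrase = " ".join(tokens[start:start + length])
            for target in synonyms.get(phrase, []):
                if target not in found:
                    found.append(target)
    return found


def enrich(doc, source_field="text", target_field="text_synonyms"):
    """Return a copy of doc with matched synonyms added in a separate field."""
    enriched = dict(doc)
    matches = find_synonyms(doc.get(source_field, ""))
    if matches:
        enriched[target_field] = matches
    return enriched


if __name__ == "__main__":
    doc = {"id": "1", "text": "Buffer contains Tween 20 at 0.05%."}
    print(enrich(doc))
```

The enriched document would then be sent to Solr as usual, with the extra field searched alongside the original one. Because the matching happens outside Solr, the regex tokenizer here has to be kept roughly in step with whatever analysis the query side uses.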

The custom tokenizer and the enrichment process both rely on Solr's own tokenization rather than duplicating it. The enrichment process seems to me workable only in environments where you can re-index all data periodically, that is, where there is no continuous stream of data that must be indexed relatively quickly once it is generated. The last method, pre-processing the data, seems the least desirable to me from a blue-sky perspective, but it is probably the easiest to implement and the most independent of Solr.

Hope this helps,

Dan Davis, Systems/Applications Architect (Contractor),
Office of Computer and Communications Systems,
National Library of Medicine, NIH

-----Original Message-----
From: Kaushik [mailto:kaushika...@gmail.com]
Sent: Monday, April 20, 2015 10:47 AM
To: solr-user@lucene.apache.org
Subject: Mutli term synonyms

Hello,

Reading up on synonyms, it looks like there is no real solution for multi-term synonyms. Is that right? I have a use case where I need to map one multi-term phrase to another. For example, Tween 20 needs to be translated to Polysorbate 40.

Any thoughts as to how this can be achieved?

Thanks,
Kaushik