Hi Michael,

It IS possible to utilize multiple Analyzers within a single field, but it's not a "built in" capability of Solr right now. I wrote something I called a "MultiTextField" which provides this capability, and you can see the code here: https://github.com/treygrainger/solr-in-action/tree/master/src/main/java/sia/ch14
The general idea is that you can pass in a prefix for each piece of your content and then use that prefix to dynamically select one or more Analyzers for each piece of content. So, for example, you could pass in something like this when indexing your document (for a multiValued field):

    <field name="someMultiTextField">en|some text</field>
    <field name="someMultiTextField">es|some more text</field>
    <field name="someMultiTextField">de,fr|some other text</field>

Then, the MultiTextField will parse the prefixes and dynamically grab an Analyzer based upon the prefix. In this case, the first input will be processed using an English Analyzer, the second input will use a Spanish Analyzer, and the third input will use both a German and a French Analyzer, as configured when the field type is defined in the schema.xml:

    <fieldType name="multiText" class="sia.ch14.MultiTextField" sortMissingLast="true" defaultFieldType="text_general" fieldMappings="en:text_english, es:text_spanish, fr:text_french, de:text_german"/>
    <field name="someMultiTextField" type="multiText" indexed="true" multiValued="true" />

If you want to automagically map separate fields into one of these dynamic analyzer (MultiText) fields with prefixes, you could either pass the text in multiple times from the client to the same field (with a different Analyzer prefix each time, as shown above), OR you could write an Update Request Processor that does this for you. I don't think it is possible to just have the copyField add in prefixes automatically for you, though someone please correct me if I'm wrong.

If you implement an Update Request Processor, then inside it you would simply grab the text from each of the relevant fields (i.e. the author and title fields) and then add each field's value to the named MultiText field with the appropriate Analyzer prefix for that source field.
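To make the prefix convention concrete, here is a minimal sketch in plain Java, with no Solr dependency. It is not the actual MultiTextField code from the repository linked above; the class name `PrefixRoutingSketch`, its helpers, and the choice to fall back to the default field type when no prefix matches are all assumptions made for illustration. It only shows how a value such as `de,fr|some other text` could be split into analyzer keys and text, and how those keys could be looked up in a mapping like the `fieldMappings` attribute.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the "prefix|text" convention; not the real MultiTextField.
public class PrefixRoutingSketch {

    // Assumed to mirror defaultFieldType and fieldMappings from the schema example.
    static final String DEFAULT_FIELD_TYPE = "text_general";
    static final Map<String, String> FIELD_MAPPINGS = new LinkedHashMap<>();
    static {
        FIELD_MAPPINGS.put("en", "text_english");
        FIELD_MAPPINGS.put("es", "text_spanish");
        FIELD_MAPPINGS.put("fr", "text_french");
        FIELD_MAPPINGS.put("de", "text_german");
    }

    // Holds the analyzer keys found before the '|' and the text after it.
    static final class PrefixedValue {
        final List<String> keys;
        final String text;

        PrefixedValue(List<String> keys, String text) {
            this.keys = Collections.unmodifiableList(keys);
            this.text = text;
        }
    }

    // Splits "de,fr|some other text" into keys [de, fr] and the remaining text.
    // A value without '|' is treated as unprefixed text (an assumption).
    static PrefixedValue parse(String raw) {
        int bar = raw.indexOf('|');
        if (bar < 0) {
            return new PrefixedValue(new ArrayList<String>(), raw);
        }
        List<String> keys = new ArrayList<>();
        for (String key : raw.substring(0, bar).split(",")) {
            String trimmed = key.trim();
            if (!trimmed.isEmpty()) {
                keys.add(trimmed);
            }
        }
        return new PrefixedValue(keys, raw.substring(bar + 1));
    }

    // Maps each known key to a field type name; unknown keys are skipped and
    // the default type is used when nothing matches (an assumption).
    static List<String> resolveFieldTypes(PrefixedValue value) {
        List<String> types = new ArrayList<>();
        for (String key : value.keys) {
            String type = FIELD_MAPPINGS.get(key);
            if (type != null) {
                types.add(type);
            }
        }
        if (types.isEmpty()) {
            types.add(DEFAULT_FIELD_TYPE);
        }
        return types;
    }

    // What a client or an update processor would do before sending a value.
    static String withPrefix(String keys, String value) {
        return keys + "|" + value;
    }

    public static void main(String[] args) {
        String[] inputs = {"en|some text", "es|some more text", "de,fr|some other text"};
        for (String raw : inputs) {
            PrefixedValue value = parse(raw);
            System.out.println(resolveFieldTypes(value) + " <- " + value.text);
        }
    }
}
```

One limitation of this sketch: unprefixed text that happens to contain a `|` would be misread as prefixed, so a real implementation would need some rule for that case.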
I made an example Update Request Processor (see the previous GitHub link and look for MultiTextFieldLanguageIdentifierUpdateProcessor) that you could look at as an example of how to supply different analyzer prefixes to different values within a multiValued field, though you would obviously want to throw away all the language detection stuff since it doesn't match your specific use case.

All that being said, this solution may end up being overly complicated for your use case, so your idea of creating a custom analyzer to just handle your example might be much simpler. At any rate, that's the specific answer to your specific question about whether it is possible to utilize multiple Analyzers within a field based upon multiple inputs.

All the best,

Trey Grainger
Co-author, Solr in Action
Director of Engineering, Search & Analytics @ CareerBuilder

On Thu, Apr 10, 2014 at 9:05 PM, Michael Sokolov <msoko...@safaribooksonline.com> wrote:

> The lack of response to this question makes me think that either there is
> no good answer, or maybe the question was too obtuse. So I'll give it one
> more go with some more detail ...
>
> My main goal is to implement autocompletion with a mix of words and short
> phrases, where the words are drawn from the text of largish documents, and
> the phrases are author names and document titles.
>
> I think the best way to accomplish this is to concoct a single field that
> contains data from these other "source" fields (as usual with copyField),
> but with some of the fields treated as keywords (i.e. with their values
> inserted as single tokens), and others tokenized. I believe this would be
> possible at the Lucene level by calling Document.addField() with multiple
> fields having the same name: some marked as TOKENIZED and others not. I
> think the tokenized fields would have to share the same analyzer, but
> that's OK for my case.
>
> I can't see how this could be made to happen in Solr without a lot of
> custom coding though. It seems as if the conversion from Solr fields to
> Lucene fields is not an easy thing to influence. If anyone has an idea how
> to achieve the subgoal, or perhaps a different way of getting at the main
> goal, I'd love to hear about it.
>
> So far my only other idea is to write some kind of custom analyzer that
> treats short texts as keywords and tokenizes longer ones, which is probably
> what I'll look at if nothing else comes up.
>
> Thanks
>
> Mike
>
> On 4/9/2014 4:16 PM, Michael Sokolov wrote:
>
>> I think I would like to do something like copyField from a bunch of
>> fields into a single field, but with different analysis for each source,
>> and I'm pretty sure that's not a thing. Is there some alternate way to
>> accomplish my goal?
>>
>> Which is to have a suggester that suggests words from my full text field
>> and complete phrases drawn from my author and title fields all at the same
>> time. So if I could index author and title using KeywordAnalyzer, and full
>> text tokenized, that would be the bee's knees.
>>
>> -Mike