Thanks Erick,

Payloads might work, but I'm looking at a more general problem.

Here is another use case. We have a mix of Traditional and Simplified Chinese documents indexed in the same OCR field. When a user searches in Traditional Chinese, I would like to also search in Simplified Chinese, but rank the results matching the Traditional Chinese terms higher. Similarly, if a user enters a query in Simplified Chinese, I want to also search in Traditional Chinese, but rank matches of the Simplified Chinese query terms higher.

Since it is not always possible to determine whether a short query is in Simplified or Traditional Chinese, here is what I would like to do:

1) Convert the query to Traditional Chinese
2) Convert the query to Simplified Chinese

(One of these two steps would not be necessary if I could reliably determine the nature of the query.)

q1=QueryAsEntered^10 OR QueryTraditional^1 OR QuerySimplified^1

Again, this could be done with copy fields, but that would increase my index size too much. What I really want is to be able to query the same index (i.e. the document as created) with the user's query processed/analyzed in 3 different ways.

I could do this myself in the app layer, but I would really like to be able to use Solr.

Tom

On Mon, Mar 4, 2013 at 8:19 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Tom:
>
> I wonder if you could do something with payloads here. Index all terms
> with payloads of 10, but synonyms with 1?
>
> Random thought off the top of my head.
>
> Erick
>
>
>> <fieldType name="plain">
>>   <analyzer type="index">
>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> <fieldType name="syn">
>>   <analyzer type="index">
>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>             ignoreCase="true" expand="true"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>> <copyField source="plain" dest="syn"/>
>>
>> On Mon, Mar 4, 2013 at 4:43 PM, Jack Krupansky
>> <j...@basetechnology.com> wrote:
>>
>>> Please clarify, and try providing a couple more use cases. I mean,
>>> the case you provided suggests that the contents of the index will be
>>> different between the two fields, while you told us that you wanted to
>>> share the same indexed field. In other words, it sounds like you will have
>>> two copies of similar data anyway.
>>>
>>> Maybe you simply want one copy of the stored value for the field and
>>> then have one or more copyfields that index the same source data
>>> differently, but don’t re-store the copied source data.
>>>
>>> -- Jack Krupansky
>>>
>>> *From:* Tom Burton-West <tburt...@umich.edu>
>>> *Sent:* Monday, March 04, 2013 3:57 PM
>>> *To:* dev@lucene.apache.org
>>> *Subject:* Ability to specify 2 different query analyzers for same
>>> indexed field in Solr
>>>
>>> Hello,
>>>
>>> We would like to be able to specify two different fields that both use
>>> the same indexed field but use different analyzers. An example use-case
>>> for this might be doing query-time synonym expansion with the synonyms
>>> weighted lower than an exact match.
>>>
>>> q=exact_field^10 OR synonyms^1
>>>
>>> The normal way to do this in Solr, which is just to set up separate
>>> analyzer chains and use a copyfield, will not work for us because the field
>>> in question is huge. It is about 7 TB of OCR.
>>>
>>> Is there a way to do this currently in Solr? If not,
>>>
>>> 1) should I open a JIRA issue?
>>> 2) can someone point me towards the part of the code I might need to
>>> modify?
>>>
>>> Tom
>>>
>>> Tom Burton-West
>>> Information Retrieval Programmer
>>> Digital Library Production Service
>>> University of Michigan Library
>>> http://www.hathitrust.org/blogs/large-scale-search
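
For reference, the app-layer fallback mentioned at the top of this message (the q1 expression with the query as entered boosted to 10 and the two converted forms boosted to 1) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the field name `ocr`, the helper names, and especially the tiny character tables, which stand in for a real Traditional/Simplified converter. It only builds the query string; it does not talk to Solr and does not address the underlying request for multiple query analyzers on one field.

```python
# Toy character tables for illustration only (hypothetical). A real system
# would use a complete conversion library or mapping file instead.
TRAD_TO_SIMP = {"漢": "汉", "語": "语", "圖": "图", "書": "书", "館": "馆"}
SIMP_TO_TRAD = {simp: trad for trad, simp in TRAD_TO_SIMP.items()}


def convert(text, table):
    """Map each character through the table; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


def escape_phrase(text):
    """Escape backslashes and double quotes so the text is safe inside a quoted phrase."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_query(user_query, field="ocr", entered_boost=10, variant_boost=1):
    """Build field:"as entered"^10 OR field:"traditional"^1 OR field:"simplified"^1.

    The field name "ocr" is an assumption. Variants identical to an earlier
    clause are skipped so the same phrase is not sent twice.
    """
    variants = [
        (user_query, entered_boost),
        (convert(user_query, SIMP_TO_TRAD), variant_boost),  # Traditional form
        (convert(user_query, TRAD_TO_SIMP), variant_boost),  # Simplified form
    ]
    clauses = []
    seen = set()
    for text, boost in variants:
        if text in seen:
            continue
        seen.add(text)
        clauses.append(f'{field}:"{escape_phrase(text)}"^{boost}')
    return " OR ".join(clauses)
```

With these toy tables, `build_query("汉語")` returns `ocr:"汉語"^10 OR ocr:"漢語"^1 OR ocr:"汉语"^1`. Skipping duplicate variants is a design choice of this sketch: a query that is already entirely in one script produces two clauses rather than three, which drops a redundant boost-1 clause but keeps the intended ordering of exact-as-entered matches above converted ones.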