I am not sure I am following correctly. The field I upload the document to is "content"; the analyzed field is "ColonCancerField". The "content" field contains the entire text of the document, in my case a PubMed abstract. This is a tokenized field. I made this field untokenized and I still received the same results, that is, the results for "not" instead of "not necessarily" (in my current example I have 2 docs with "not" and 1 doc with "not necessarily"; "not" is of course also in the document that contains "not necessarily"):
http://imgur.com/a/1bfXT

I also tried this:

http://localhost:8983/solr/Cytokine/select?&q=ColonCancerField:"not+necessarily"

I still receive the two documents, which is the same as doing ColonCancerField:"not".

Just to clarify, the structure looks like this:

*content* (untokenized, unanalyzed) [copied to]==> *ColonCancerField* (tokenized, analyzed)

Then I browse the ColonCancerField and the facets state that there is 1 document for "not necessarily", but when selecting it, Solr returns 2 results.

-Kevin

On Mon, Dec 28, 2015 at 10:22 AM, Jamie Johnson <jej2...@gmail.com> wrote:

> Can you do the opposite? Index into an unanalyzed field and copy into the
> analyzed?
>
> If I remember correctly facets are based off of indexed values so if you
> tokenize the field then the facets will be as you are seeing now.
> On Dec 28, 2015 9:45 AM, "Kevin Lopez" <kevin.lopez...@gmail.com> wrote:
>
> > *What I am trying to accomplish:*
> > Generate a facet based on the documents uploaded and a text file containing
> > terms from a domain/ontology such that a facet is shown if a term is in the
> > text file and in a document (key phrase extraction).
> >
> > *The problem:*
> > When I select the facet for the term "*not necessarily*" (we see there is a
> > space) and I get the results for the term "*not*". The field is tokenized
> > and multivalued. This leads me to believe that I can not use a tokenized
> > field as a facet field. I tried to copy the values of the field to a text
> > field with a keywordtokenizer. I am told when checking the schema browser:
> > "Sorry, no Term Info available :(" This is after I delete the old index and
> > upload the documents again. The facet is coming from a field that is
> > already copied from another field, so I cannot copy this field to a text
> > field with a keywordtokenizer or strfield. What can I do to fix this? Is
> > there an alternate way to accomplish this?
> >
> > *Here is my configuration:*
> >
> > <copyField source="ColonCancerField" dest="cytokineField"/>
> >
> > <field name="cytokineField" indexed="true" stored="true"
> >        multiValued="true" type="Cytokine_Pass"/>
> > <fieldType name="Cytokine_Pass" class="solr.TextField">
> >   <analyzer>
> >     <tokenizer class="solr.KeywordTokenizerFactory" />
> >   </analyzer>
> > </fieldType>
> >
> > <field name="ColonCancerField" type="ColonCancer" indexed="true"
> >        stored="true" multiValued="true"
> >        termPositions="true"
> >        termVectors="true"
> >        termOffsets="true"/>
> > <fieldType name="ColonCancer" class="solr.TextField"
> >            sortMissingLast="true" omitNorms="true">
> >   <analyzer>
> >     <filter class="solr.ShingleFilterFactory"
> >             minShingleSize="2" maxShingleSize="5"
> >             outputUnigramsIfNoShingles="true"
> >     />
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SynonymFilterFactory"
> >             synonyms="synonyms_ColonCancer.txt" ignoreCase="true" expand="true"
> >             tokenizerFactory="solr.KeywordTokenizerFactory"/>
> >     <filter class="solr.KeepWordFilterFactory"
> >             words="prefLabels_ColonCancer.txt" ignoreCase="true"/>
> >   </analyzer>
> > </fieldType>
> > <copyField source="content" dest="ColonCancerField"/>
> >
> > Regards,
> >
> > Kevin
> >
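
For reference, a minimal Python sketch that builds the two select requests compared at the top of this message (the phrase query for "not necessarily" and the query for "not"). The host, core, and field names are the ones used in this thread; the helper name build_select_url is made up for illustration, and the script only prints the URLs without contacting Solr.

```python
from urllib.parse import urlencode

# Base URL as given in this thread; adjust host, port, and core for another setup.
SOLR_SELECT = "http://localhost:8983/solr/Cytokine/select"


def build_select_url(field, value):
    """Hypothetical helper: return a select URL for a quoted query on `field`."""
    # Wrap the value in double quotes so that a value containing a space
    # is sent as a single phrase rather than as separate terms.
    query = f'{field}:"{value}"'
    # urlencode percent-escapes the colon and quotes and turns spaces into '+'.
    return f"{SOLR_SELECT}?{urlencode({'q': query})}"


phrase_url = build_select_url("ColonCancerField", "not necessarily")
single_url = build_select_url("ColonCancerField", "not")

print(phrase_url)
print(single_url)
```

Opening both printed URLs in a browser and comparing numFound in the responses is one way to confirm whether the phrase query really returns the same documents as the single-term query.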