Hi,

Here's our example of exact match fields:
https://github.com/NatLibFi/finna-solr/blob/master/vufind/biblio/conf/schema.xml#L48

textProper_l requires a partial match from the beginning of the field.
textProper_lr requires a full match. I'm not sure whether this works for
you, but at least it shows a creative use of PathHierarchyTokenizerFactory
to allow left-anchored search.

HTH,
Ere

Paras Lehana wrote on 3.12.2019 at 13.49:
> Hi Omer,
>
> If you mean exact match with the same number of words (Emir's version),
> you can also add an identifier at the beginning and end of some other
> field, such as title_exact. This can be done in your indexing script or
> with Pattern Replace. On the query side, you use the same identifier.
> For example, index "united states" as "exactStart united states
> exactEnd" and query with the same string. Obviously, you can have
> scoring issues here, so only use it for debugging or retrieving docs.
>
> Just adding to all the possible ways. *Anyway, I like the Keyword method.*
>
> On Tue, 3 Dec 2019 at 03:59, Erick Erickson <erickerick...@gmail.com> wrote:
>
>> There are two different interpretations of “exact match” going on here,
>> don’t be confused!
>>
>> Emir’s version is “the text has to match the _entire_ input”. So a field
>> with “a b c d” will NOT match “a b” or “a b c” or “b c”, but only “a b c d”.
>>
>> David’s version is “The text has to contain some sequence of words that
>> exactly matches my query”, so a field with “a b c d” _would_ match “a b”,
>> “a b c”, “a b c d”, “b c”, “c d”, etc.
>>
>> Both are entirely valid use-cases, depending on what you mean by “exact
>> match”.
>>
>> Best,
>> Erick
>>
>>> On Dec 2, 2019, at 4:38 PM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
>>>
>>> Hi Omer,
>>> From a performance perspective, it is best to index the title as a
>>> single token: KeywordTokenizer + LowerCaseFilter
>>>
>>> If you need to query that field in some other way, you can index it
>>> differently as some other field using copyField.
>>>
>>> HTH,
>>> Emir
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>
>>>
>>>
>>>> On 2 Dec 2019, at 21:43, OTH <omer.t....@gmail.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> What would be the best way to get exact matches (if any) to a query?
>>>>
>>>> E.g.: Let's say the document text is: "united states of america".
>>>> Currently, any query containing one or more of the three words "united",
>>>> "states", or "america" will match the above document. I would like a
>>>> way so that the document matches if and only if the query is also
>>>> "united states of america" (case-insensitive).
>>>>
>>>> Document field type: TextField
>>>> Index Analyzer: TokenizerChain
>>>> Index Tokenizer: StandardTokenizerFactory
>>>> Index Token Filters: StopFilterFactory, LowerCaseFilterFactory,
>>>> SnowballPorterFilterFactory
>>>> The Query Analyzer / Tokenizer / Token Filters are the same as the Index
>>>> ones above.
>>>>
>>>> FYI I'm relatively novice at Solr / Lucene / Search.
>>>>
>>>> Much appreciated
>>>> Omer
>>>
>>
>>
>

--
Ere Maijala
Kansalliskirjasto / The National Library of Finland
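
To make the matching behaviours discussed in this thread concrete, here is a
minimal sketch in plain Python. It is only an illustration of the semantics,
not Solr code: the helper names (keyword_token, whole_field_match,
phrase_match, left_anchored_match) are hypothetical, splitting on whitespace
stands in for a real tokenizer, and treating "left-anchored" as a word-level
prefix match is an assumption about how such a field would behave.

```python
# Plain-Python illustration only; none of these helpers are Solr APIs.

def keyword_token(text):
    # Keyword-style analysis: the whole value stays one token, lowercased.
    return text.lower()

def whole_field_match(field, query):
    # Emir's interpretation: the query must equal the entire field value.
    return keyword_token(field) == keyword_token(query)

def phrase_match(field, query):
    # David's interpretation: the query words appear as a contiguous run
    # somewhere in the field (whitespace split stands in for a tokenizer).
    f, q = field.lower().split(), query.lower().split()
    return bool(q) and any(
        f[i:i + len(q)] == q for i in range(len(f) - len(q) + 1)
    )

def left_anchored_match(field, query):
    # Left-anchored (assumed word-level): the query must match the field
    # starting from its first word.
    f, q = field.lower().split(), query.lower().split()
    return bool(q) and f[:len(q)] == q

if __name__ == "__main__":
    title = "United States of America"
    print(whole_field_match(title, "united states of america"))
    print(whole_field_match(title, "united states"))
    print(phrase_match(title, "states of"))
    print(left_anchored_match(title, "united states"))
    print(left_anchored_match(title, "states of"))
```

With a field value of "a b c d", whole_field_match accepts only "a b c d",
phrase_match also accepts "b c" or "c d", and left_anchored_match accepts
"a b" but not "b c".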