I am using the phrase suggester to implement did-you-mean functionality. My 
source field is named did_you_mean_source; it holds the first and last name 
joined by a single space.

When I search for, say, "allex blak" I do get fairly decent suggestions, 
including the "alex black" I am hoping to get.

*The problem is that I also get "alex [any other first and/or last name 
present in the index that has a suggestion for blak]". These suggestions 
don't correspond to any actual search result (as opposed to a suggestion 
result) in my index.*

So you search for a first and last name that you happened to misspell, but 
the suggester proposes another spelling that doesn't actually return any 
search results from the index. In other words, I need to somehow preserve 
the fact that one word is a first name and the other is a last name.

What I am trying to achieve is did-you-mean functionality where I can 
type a first and last name and get suggestions based on 
did_you_mean_source, but only where both typed words match a single 
document.


Here's my analyzer:
did_you_mean:
  type: custom
  tokenizer: standard
  filter: ["lowercase", "trim"]


Here's the suggest part of my query:

"suggest": {
  "text": "allex blak",
  "did_you_mean" : {
    "phrase" : {
      "field": "did_you_mean_source",
      "real_word_error_likelihood": 0.90,
      "max_errors": 1,
      "direct_generator" : [{
        "field" : "did_you_mean_source",
        "suggest_mode" : "always",
        "min_word_length" : 3,
        "size": 5,
        "prefix_length": 2,
        "min_doc_freq": 1
      }]
    }
  }
}



Here's what my mappings look like: 

{
  "development_search_suggestions": {
    "mappings": {
      "search_suggestion": {
        "_all": {
          "enabled": false
        },
        "properties": {
          "did_you_mean_source": {
            "type": "string",
            "analyzer": "did_you_mean"
          },
          "keywords": {
            "type": "string",
            "index_options": "offsets",
            "analyzer": "full",
            "fields": {
              "partial": {
                "type": "string",
                "index_options": "offsets",
                "index_analyzer": "partial_auto_suggest",
                "search_analyzer": "full_with_auto_suggest_synonyms"
              },
              "synonymic": {
                "type": "string",
                "index_analyzer": "full_with_auto_suggest_synonyms",
                "search_analyzer": "full"
              }
            }
          },
          "keywords_auxiliary": {
            "type": "string",
            "index_options": "offsets",
            "analyzer": "full",
            "fields": {
              "partial": {
                "type": "string",
                "index_options": "offsets",
                "index_analyzer": "partial_auto_suggest",
                "search_analyzer": "full_with_auto_suggest_synonyms"
              },
              "synonymic": {
                "type": "string",
                "index_analyzer": "full_with_auto_suggest_synonyms",
                "search_analyzer": "full"
              }
            }
          }
        }
      }
    }
  }
}


*A few examples:*

Say my index contains the following documents, each with the first name 
followed by the last name:
bruno miranda
miranda bella
bran scott



If I search for "brno miranda" I should get a suggestion for "bruno 
miranda", but I should not get a suggestion for "bran miranda" because that 
document doesn't exist in the index. It simply pairs one document's first 
name with a different document's last name.
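
To make the expected behaviour concrete, here is a minimal Python sketch of 
the example above. It is not how the phrase suggester works internally and 
it does not call Elasticsearch at all: the DOCUMENTS list, the function 
names and the max_edits threshold are all assumptions made up for 
illustration. It first builds per-word candidates (roughly what I see 
happening today), then keeps only the combinations that exist as a whole 
document (what I would like to get).

```python
from itertools import product

# Hypothetical index contents, copied from the example above.
DOCUMENTS = ["bruno miranda", "miranda bella", "bran scott"]


def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def per_term_candidates(word, vocabulary, max_edits=2):
    """Every indexed term within max_edits of the typed word."""
    return sorted(t for t in vocabulary if edit_distance(word, t) <= max_edits)


def naive_suggestions(text, documents, max_edits=2):
    """Combine per-word candidates freely, ignoring which document they came from."""
    vocabulary = {term for doc in documents for term in doc.split()}
    options = [per_term_candidates(w, vocabulary, max_edits) for w in text.split()]
    return [" ".join(combo) for combo in product(*options)]


def document_backed_suggestions(text, documents, max_edits=2):
    """Keep only the combinations that exist as a complete document."""
    existing = set(documents)
    return [s for s in naive_suggestions(text, documents, max_edits) if s in existing]


if __name__ == "__main__":
    print(naive_suggestions("brno miranda", DOCUMENTS))
    print(document_backed_suggestions("brno miranda", DOCUMENTS))
```

With the three documents above, the first call includes the unwanted "bran 
miranda" next to "bruno miranda", while the second call returns only "bruno 
miranda". That second result is the behaviour I am after.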
