Hi!

I tried to define a stopword list for my custom analyzer like this:

     "analysis" : {
        "tokenizer" : {
          "host_tokenizer" : {
            "type": "pattern",
            "pattern": "[a-zA-Z0-9]+",
            "group": 0
          }
        },
        "analyzer" : {
          "host_analyzer" : {
            "type" : "custom",
            "tokenizer" : "host_tokenizer",
            "filter" : ["lowercase"],
            "stopwords": ["www", "fr", "com"]
          }
        }
      }

But when I do this, the "stopwords" line is ignored.

Apparently you need to declare a "stop" token filter and reference it in the analyzer's "filter" list:

     "analysis" : {
        "filter": {
            "hostname_stop": {
                "type": "stop",
                "stopwords":  ["www", "fr", "com"]
            }
        },
        "tokenizer" : {
          "host_tokenizer" : {
            "type": "pattern",
            "pattern": "[a-zA-Z0-9]+",
            "group": 0
          }
        },
        "analyzer" : {
          "host_analyzer" : {
            "type" : "custom",
            "tokenizer" : "host_tokenizer",
            "filter" : ["lowercase", "hostname_stop"]
          }
        }
      }
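
To make the expected behaviour concrete, here is a rough plain-Python sketch of the token chain I have in mind for the second configuration: the pattern tokenizer (group 0, so each whole match becomes a token), then lowercase, then the stop filter. It does not call Elasticsearch at all; the function name `host_analyzer_sketch` and the sample hostnames are made up for illustration.

```python
import re

# Same pattern as "host_tokenizer"; with group 0 every whole match is a token.
TOKEN_PATTERN = re.compile(r"[a-zA-Z0-9]+")

# Same list as the "hostname_stop" filter.
STOPWORDS = frozenset(["www", "fr", "com"])


def host_analyzer_sketch(text, stopwords=STOPWORDS):
    """Imitate tokenizer -> lowercase -> stop filter (illustration only)."""
    tokens = TOKEN_PATTERN.findall(text)               # pattern tokenizer
    tokens = [t.lower() for t in tokens]               # "lowercase" filter
    return [t for t in tokens if t not in stopwords]   # "stop" filter


if __name__ == "__main__":
    # With the stop filter in the chain, the stopwords are dropped.
    print(host_analyzer_sketch("WWW.Example.com"))
    # With an empty stopword set, every token survives, which is what
    # I observe when "stopwords" is set directly on the custom analyzer.
    print(host_analyzer_sketch("WWW.Example.com", stopwords=frozenset()))
```

Note that the lowercase step comes before the stop step, so "WWW" is matched against the lowercase stopword "www".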

This is confusing because the first syntax is closer to what the guide shows
for the standard analyzer:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/using-stopwords.html

Also, there is no error or warning telling me that the
"stopwords": ["www", "fr", "com"] line was ignored.

Gist: https://gist.github.com/Alix-Martin/7186e38459e88a474e13

Alix Martin
