
I am pretty new to elasticsearch and I'm facing a problem I can't figure 
out. I'm using logstash to store log files in elasticsearch following a 
specific format. Each log line includes a URL and some other elements that 
are translated into fields inside the elasticsearch index.
The storing process seems to work well and I am able to browse the data 
the way I want.
The problem is related to the way some fields are parsed when I try to 
analyze the data, and more particularly to the delimiters that are used to 
split the tokens.

One of the fields I want to analyze (named 'category') is composed of 
several parts separated by a special character, '|', and the actual tokens 
sometimes contain '-' characters. Example: "category1|cat-egory2". 
The pipe should remain a delimiter, but the dash must not be treated as one 
because it is part of some of the category names.

I've read some documentation about tokenizer delimiters and tried to apply 
the instructions. Before creating any index, I sent a request asking 
elasticsearch to change the delimiter pattern to my own regular expression 
( "pattern":"|\\\\s+" ), modeled on the whitespace example in the 
documentation. It differs very little from that example, so I'm fairly sure 
the pattern is correct.
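
One thing worth checking in the pattern above: in a regular expression an 
unescaped '|' means alternation, so a pattern that starts with '|' can match 
the empty string instead of a literal pipe. The sketch below illustrates 
this with Python's re module (elasticsearch uses Java regular expressions, 
but alternation works the same way there) and builds a possible settings 
body around the "pattern" tokenizer. The names pipe_tokenizer and 
category_analyzer are hypothetical, and this is only a minimal sketch, not 
a confirmed fix for this setup.

```python
import json
import re

# Escaped pipe: matches a literal '|' instead of acting as regex alternation.
PIPE_PATTERN = r"\|"

# Possible index settings body. "pipe_tokenizer" and "category_analyzer"
# are hypothetical names chosen for this illustration.
index_settings = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "pipe_tokenizer": {"type": "pattern", "pattern": PIPE_PATTERN}
            },
            "analyzer": {
                "category_analyzer": {
                    "type": "custom",
                    "tokenizer": "pipe_tokenizer",
                }
            },
        }
    }
}


def split_categories(value):
    """Split a category value on literal '|' only, keeping '-' inside tokens."""
    return re.split(PIPE_PATTERN, value)


def unescaped_pipe_matches_empty(text):
    """Return True if the unescaped pattern '|\\s+' matches an empty string at the start."""
    return re.match(r"|\s+", text).group() == ""


if __name__ == "__main__":
    print(unescaped_pipe_matches_empty("category1|cat-egory2"))
    print(split_categories("category1|cat-egory2"))
    # json.dumps shows how the backslash must be doubled inside JSON.
    print(json.dumps(index_settings, indent=2))
```

Note that defining an analyzer is not enough on its own: the mapping of the 
'category' field also has to reference it, otherwise the default analyzer 
is still used for that field.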

Here is the kind of request I am performing after the PUT request was made:
      {
        "query": {
          "match_all": {}
        },
        "facets": {
          "category name": {
            "terms": {
              "field": "category"
            }
          }
        }
      }

The response reports the number of occurrences of each 'category' value, 
split into separate tokens. But the tokens are not split according to the 
pattern I entered for the whitespace tokenizer. 
Instead, I get statistics that do not reflect the actual data, because 
elasticsearch's default behavior is still being applied. 
I would like to know what I'm doing wrong, which is why I'm asking for 
your help. 
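
To make the difference concrete, here is a small Python sketch comparing 
the term counts I expect (splitting on '|' only) with counts produced by 
splitting on every non-alphanumeric character. The second splitter is only 
a rough stand-in for a default analyzer, not its actual implementation, and 
the sample values are made up for the illustration.

```python
import re
from collections import Counter

# Made-up sample values for the 'category' field.
sample_values = ["category1|cat-egory2", "category1"]


def count_terms(values, delimiter_regex):
    """Count tokens obtained by splitting each value on delimiter_regex."""
    counts = Counter()
    for value in values:
        # Drop empty strings produced by leading or trailing delimiters.
        counts.update(tok for tok in re.split(delimiter_regex, value) if tok)
    return counts


# Intended behavior: only the literal pipe is a delimiter.
intended = count_terms(sample_values, r"\|")

# Rough approximation of default-like splitting: any non-alphanumeric
# character (including '-') separates tokens.
approx_default = count_terms(sample_values, r"[^A-Za-z0-9]+")

if __name__ == "__main__":
    print("intended:", dict(intended))
    print("approximate default:", dict(approx_default))
```

With the intended splitting, "cat-egory2" is counted as one term; with the 
approximate default it is broken into "cat" and "egory2", which is the kind 
of skewed statistic described above.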

