Hi Jochen,

Here's what my "graylog-internal" template currently looks like (as seen 
via the Elasticsearch API):

{
  "graylog-internal" : {
    "order" : 0,
    "template" : "graylog2_*",
    "settings" : { },
    "mappings" : {
      "message" : {
        "_source" : {
          "compress" : true,
          "enabled" : true
        },
        "dynamic_templates" : [ {
          "internal_fields" : {
            "mapping" : {
              "index" : "not_analyzed",
              "doc_values" : true
            },
            "match" : "gl2_*"
          }
        }, {
          "store_generic" : {
            "mapping" : {
              "index" : "not_analyzed"
            },
            "match" : "*"
          }
        } ],
        "_ttl" : {
          "enabled" : true
        },
        "properties" : {
          "message" : {
            "index" : "analyzed",
            "analyzer" : "whitespace",
            "type" : "string"
          },
          "timestamp" : {
            "format" : "yyyy-MM-dd HH:mm:ss.SSS",
            "doc_values" : true,
            "type" : "date"
          },
          "source" : {
            "index" : "analyzed",
            "analyzer" : "analyzer_keyword",
            "type" : "string"
          },
          "full_message" : {
            "index" : "analyzed",
            "analyzer" : "whitespace",
            "type" : "string"
          }
        }
      }
    },
    "aliases" : { }
  }
}


Here's what my graylog2_3 index currently looks like (as seen via the 
Elasticsearch API):

{
  "graylog2_3" : {
    "aliases" : {
      "graylog2_deflector" : { }
    },
    "mappings" : {
      "message" : {
        "dynamic_templates" : [ {
          "internal_fields" : {
            "mapping" : {
              "index" : "not_analyzed",
              "doc_values" : true
            },
            "match" : "gl2_*"
          }
        }, {
          "store_generic" : {
            "mapping" : {
              "index" : "not_analyzed"
            },
            "match" : "*"
          }
        } ],
        "_ttl" : {
          "enabled" : true
        },
        "_source" : {
          "compress" : true
        },
        "properties" : {
          "full_message" : {
            "type" : "string",
            "analyzer" : "whitespace"
          },
          "gl2_remote_ip" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "gl2_remote_port" : {
            "type" : "long",
            "doc_values" : true
          },
          "gl2_source_collector" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "gl2_source_collector_input" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "gl2_source_input" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "gl2_source_node" : {
            "type" : "string",
            "index" : "not_analyzed",
            "doc_values" : true
          },
          "level" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "message" : {
            "type" : "string",
            "analyzer" : "whitespace"
          },
          "source" : {
            "type" : "string",
            "analyzer" : "analyzer_keyword"
          },
          "source_file" : {
            "type" : "string",
            "index" : "not_analyzed"
          },
          "timestamp" : {
            "type" : "date",
            "doc_values" : true,
            "format" : "yyyy-MM-dd HH:mm:ss.SSS"
          },
          "version" : {
            "type" : "string",
            "index" : "not_analyzed"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1462197971182",
        "uuid" : "ylBuS8y3SBKRYMyLuMWApg",
        "analysis" : {
          "analyzer" : {
            "analyzer_keyword" : {
              "filter" : "lowercase",
              "tokenizer" : "keyword"
            }
          }
        },
        "number_of_replicas" : "0",
        "number_of_shards" : "4",
        "version" : {
          "created" : "1070399"
        }
      }
    },
    "warmers" : { }
  }
}
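
As a rough cross-check of the two dumps above, here is a minimal Python sketch that lists which analyzer each string field of a mapping ends up with. The function name `field_analyzers` and the trimmed-down sample dict are hypothetical, for illustration only; the sample copies a handful of fields from the graylog2_3 mapping above, and the function expects the object found under the index name in the API response. It assumes that a string field without an explicit "analyzer" falls back to the index default.

```python
def field_analyzers(mapping, doc_type="message"):
    """Return {field: analyzer} for the string fields of one mapping type.

    Fields marked "not_analyzed" are reported as such; fields with no
    explicit analyzer are reported as "(index default)".
    """
    result = {}
    properties = mapping["mappings"][doc_type]["properties"]
    for name, spec in properties.items():
        if spec.get("type") != "string":
            continue  # dates, longs, etc. are not run through an analyzer
        if spec.get("index") == "not_analyzed":
            result[name] = "not_analyzed"
        else:
            result[name] = spec.get("analyzer", "(index default)")
    return result


# Trimmed copy of the graylog2_3 mapping shown above (illustrative subset).
graylog2_3 = {
    "mappings": {
        "message": {
            "properties": {
                "full_message": {"type": "string", "analyzer": "whitespace"},
                "level": {"type": "string", "index": "not_analyzed"},
                "message": {"type": "string", "analyzer": "whitespace"},
                "source": {"type": "string", "analyzer": "analyzer_keyword"},
                "timestamp": {"type": "date", "doc_values": True,
                              "format": "yyyy-MM-dd HH:mm:ss.SSS"},
            }
        }
    }
}

for field, analyzer in sorted(field_analyzers(graylog2_3).items()):
    print(field, "->", analyzer)
```

Run against the full mapping, this should report "whitespace" for both "message" and "full_message", which matches what the template asks for.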


After cycling the deflector so that it points to the new index, graylog2_3, 
I proceeded to delete my old indices.

Using the Graylog API browser, I tried to tokenize a random string (This is 
a $test:[to.see.if graylog() work$.):

http://vtor-lx-tomcat-d01:12900/messages/graylog2_3/analyze?string=This%20is%20a%20%24test%3A%5Bto.see.if%20graylog()%20work%24%5D.&pretty=true

{
  "tokens" : [ "this", "is", "a", "test", "to.see.if", "graylog", "work" ]
}


This is consistent with Elasticsearch itself: if I tokenize the same string 
directly against the same index, I get the same result:

curl 'vtor-lx-tomcat-d01:9200/graylog2_3/_analyze?pretty=true' -d 'This is a $test:[to.see.if graylog() work$.'

{
  "tokens" : [ {
    "token" : "this",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "is",
    "start_offset" : 5,
    "end_offset" : 7,
    "type" : "<ALPHANUM>",
    "position" : 2
  }, {
    "token" : "a",
    "start_offset" : 8,
    "end_offset" : 9,
    "type" : "<ALPHANUM>",
    "position" : 3
  }, {
    "token" : "test",
    "start_offset" : 11,
    "end_offset" : 15,
    "type" : "<ALPHANUM>",
    "position" : 4
  }, {
    "token" : "to.see.if",
    "start_offset" : 17,
    "end_offset" : 26,
    "type" : "<ALPHANUM>",
    "position" : 5
  }, {
    "token" : "graylog",
    "start_offset" : 27,
    "end_offset" : 34,
    "type" : "<ALPHANUM>",
    "position" : 6
  }, {
    "token" : "work",
    "start_offset" : 37,
    "end_offset" : 41,
    "type" : "<ALPHANUM>",
    "position" : 7
  } ]
}
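
One caveat about the request above: it names the index but neither an analyzer nor a field, so, as far as I understand the analyze API, Elasticsearch uses the index's default analyzer rather than the one mapped to "message". A minimal Python sketch of a field-scoped request follows. It only builds the URL, relying on the `field` query parameter described in the Elasticsearch 1.x analyze API documentation; the helper name `analyze_url` is hypothetical, and the host and index are the ones used above.

```python
from urllib.parse import urlencode


def analyze_url(host, index, field=None, analyzer=None, pretty=True):
    """Build an _analyze URL, optionally scoped to a field or an analyzer."""
    params = {}
    if field is not None:
        params["field"] = field        # use the analyzer mapped to this field
    if analyzer is not None:
        params["analyzer"] = analyzer  # or name an analyzer explicitly
    if pretty:
        params["pretty"] = "true"
    path = "/{}/_analyze".format(index) if index else "/_analyze"
    query = "?" + urlencode(params) if params else ""
    return "http://{}{}{}".format(host, path, query)


url = analyze_url("vtor-lx-tomcat-d01:9200", "graylog2_3", field="message")
print(url)
# The text to analyze would go in the request body, as in the curl call above.
```

If the field-scoped request returns tokens such as "$test:[to.see.if", the mapping is being honored and only the test request was misleading.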

However, when I leave out the index and name the whitespace analyzer 
explicitly, I get the result that I am looking for:

curl 'vtor-lx-tomcat-d01:9200/_analyze?analyzer=whitespace&pretty=true' -d 'This is a $test:[to.see.if graylog() work$.'

{
  "tokens" : [ {
    "token" : "This",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "is",
    "start_offset" : 5,
    "end_offset" : 7,
    "type" : "word",
    "position" : 2
  }, {
    "token" : "a",
    "start_offset" : 8,
    "end_offset" : 9,
    "type" : "word",
    "position" : 3
  }, {
    "token" : "$test:[to.see.if",
    "start_offset" : 10,
    "end_offset" : 26,
    "type" : "word",
    "position" : 4
  }, {
    "token" : "graylog()",
    "start_offset" : 27,
    "end_offset" : 36,
    "type" : "word",
    "position" : 5
  }, {
    "token" : "work$.",
    "start_offset" : 37,
    "end_offset" : 43,
    "type" : "word",
    "position" : 6
  } ]
}
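
For reference, the whitespace output above is what a plain split on whitespace would give. The following is a minimal Python sketch of that behavior, not Elasticsearch's actual implementation; the function name `whitespace_tokens` is hypothetical, and it reports only the token text and character offsets.

```python
import re


def whitespace_tokens(text):
    """Split on runs of whitespace, keeping character offsets."""
    return [
        {"token": m.group(), "start_offset": m.start(), "end_offset": m.end()}
        for m in re.finditer(r"\S+", text)
    ]


sample = "This is a $test:[to.see.if graylog() work$."
for tok in whitespace_tokens(sample):
    print(tok["start_offset"], tok["end_offset"], tok["token"])
```

The key difference from the index-scoped result is that nothing is lowercased and punctuation such as "$", ":", "[" and "()" stays attached to its token.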

I feel like I am really close to an answer here.  It appears that there is 
something wrong with my index mapping/settings.

Sincerely,

On Tuesday, May 3, 2016 at 3:51:49 AM UTC-4, Jochen Schalanda wrote:
>
> Hi Dilip,
>
> are you 100% sure that the message is in a new index, that the index 
> template/mapping was properly applied (see 
> https://www.elastic.co/guide/en/elasticsearch/reference/1.7/indices-get-mapping.html),
>  
> and that it is the "message" field you were looking for (and not 
> "full_message" or another field)?
>
> Cheers,
> Jochen
>
> On Monday, 2 May 2016 18:57:40 UTC+2, Dilip Muthukrishnan wrote:
>>
>> Hi Jochen,
>>
>> Thanks for your reply.  I'm using graylog-1.3.4 (server).  I removed and 
>> added an updated version of the "graylog-internal" template and then cycled 
>> the deflector through the web interface.  The new index mapping reflects 
>> the changes:
>>
>> "message" : {
>>    "type" : "string",
>>    "analyzer" : "whitespace"
>> }
>>
>>
>> However, it doesn't appear to be reflected in the search.  This message 
>> is from the latest index but based on this tokenization, it appears to 
>> still be using the old "standard analyzer":
>>
>> 02.05.2016 12:47:33.488 *ERROR* [Shell Script Executor Thread for cpu.sh] 
>> com.day.crx.core.CRXSessionImpl session# 144563 opened (103) 
>> java.lang.Exception: Stack Trace at 
>> com.day.crx.core.CRXSessionImpl$Tracker.open(CRXSessionImpl.java:212) at 
>> com.day.crx.core.CRXSessionImpl$Tracker.<init>(CRXSessionImpl.java:205) at 
>> com.day.crx.core.CRXSessionImpl.<init>(CRXSessionImpl.java:179) at 
>> com.day.crx.core.CRXRepositoryImpl.createSessionInstance(CRXRepositoryImpl.java:911)
>>  
>> at 
>> org.apache.jackrabbit.core.RepositoryImpl.createSession(RepositoryImpl.java:959)
>>  
>> at 
>> org.apache.jackrabbit.core.SessionFactory.createAdminSession(SessionFactory.java:42)
>>  
>> at 
>> com.day.crx.sling.server.impl.SlingRepositoryWrapper.loginAdministrative(SlingRepositoryWrapper.java:76)
>>  
>> at 
>> com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl.extractScript(ShellScriptExecutorImpl.java:161)
>>  
>> at 
>> com.adobe.granite.monitoring.impl.ShellScriptExecutorImpl.execute(ShellScriptExecutorImpl.java:114)
>>  
>> at 
>> com.adobe.granite.monitoring.impl.ScriptMBean.invoke(ScriptMBean.java:99) 
>> at 
>> com.adobe.granite.monitoring.impl.ScriptMBean.invoke(ScriptMBean.java:158) 
>> at 
>> com.adobe.granite.monitoring.impl.ScriptConfigImpl$ExecutionThread.run(ScriptConfigImpl.java:208)
>>  
>> at java.lang.Thread.run(Thread.java:662)
>>
>>
>> Field terms: 02.05.2016124733.488errorshellscriptexecutorthreadforcpu.sh
>> com.day.crx.core.crxsessionimplsession144563opened103java.lang.exception
>> stacktraceattracker.opencrxsessionimpl.java212trackerinit205179
>> com.day.crx.core.crxrepositoryimpl.createsessioninstance
>> crxrepositoryimpl.java911
>> org.apache.jackrabbit.core.repositoryimpl.createsession
>> repositoryimpl.java959
>> org.apache.jackrabbit.core.sessionfactory.createadminsession
>> sessionfactory.java42
>> com.day.crx.sling.server.impl.slingrepositorywrapper.loginadministrative
>> slingrepositorywrapper.java76
>> com.adobe.granite.monitoring.impl.shellscriptexecutorimpl.extractscript
>> shellscriptexecutorimpl.java161
>> com.adobe.granite.monitoring.impl.shellscriptexecutorimpl.execute114
>> com.adobe.granite.monitoring.impl.scriptmbean.invokescriptmbean.java99158
>> com.adobe.granite.monitoring.impl.scriptconfigimplexecutionthread.run
>> scriptconfigimpl.java208java.lang.thread.runthread.java662
>>
>> As you can see, it has been stripped of various characters like colons 
>> and parentheses.
>>
>>
>> On Monday, May 2, 2016 at 12:36:38 PM UTC-4, Jochen Schalanda wrote:
>>>
>>> Hi Dilip,
>>>
>>> the index mapping of Graylog is applied by the means of an index 
>>> template. In Graylog 2.0.0, the index template will automatically be 
>>> updated but in older versions you'll have to remove the index template 
>>> yourself for it to be recreated by Graylog.
>>>
>>> See 
>>> https://www.elastic.co/guide/en/elasticsearch/reference/1.7/indices-templates.html
>>>  
>>> for details.
>>>
>>> Cheers,
>>> Jochen
>>>
>>> On Thursday, 28 April 2016 21:42:23 UTC+2, Dilip Muthukrishnan wrote:
>>>>
>>>> I'm trying to change the analyzer from "standard" to "whitespace". 
>>>>  I've set the following property in my Graylog server configuration:
>>>>
>>>> elasticsearch_analyzer = whitespace
>>>>
>>>> It states that my change will be applied to new indices so I manually 
>>>> cycled the deflector so that it is now pointing to graylog2_1 (previously 
>>>> graylog2_0).  However, the new index still uses the "standard" analyzer 
>>>> based on the mapping in Elasticsearch:
>>>>
>>>> "message" : {
>>>>             "type" : "string",
>>>>             "analyzer" : "standard"
>>>>           },
>>>>
>>>>
>>>> How do I change the analyzer?
>>>>
>>>>
