Shards do not get distributed across the cluster

2015-04-20 Thread horst knete
Hey guys,

I am currently struggling with the shard allocation in my ES cluster.

The problem is that the index [logstash-2015.04.13] lives on the node
Storage1 with 12 shards. I want the index to be distributed evenly across
the 2 other Storage nodes; the node with the SSDs in it should be excluded
from allocation.

The settings of the index are:

settings for logstash-2015.04.13

{
  "index" : {
    "routing" : {
      "allocation" : {
        "include" : {
          "tag" : ""
        },
        "require" : {
          "tag" : ""
        },
        "exclude" : {
          "tag" : "SSD"
        },
        "total_shards_per_node" : 4
      }
    }
  }
}

Unfortunately I get the following log entries from ES:

[2015-04-20 08:01:55,070][DEBUG][cluster.routing.allocation] [SSD] 
[logstash-2015.04.13][4] allocated on [Node Info of Storage1], but can no 
longer be allocated on it, moving...
[DEBUG][cluster.routing.allocation] [SSD] [logstash-2015.04.13][4] can't move

I disabled disk-based shard allocation to make sure that this isn't the
problem.

So, is there a way to see what is denying the shard allocation to the
other nodes? Maybe a log level for the allocation deciders that I can
enable?
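For reference, a minimal sketch of turning up the allocation-decider
logging at runtime via the cluster settings API (my assumption for an ES
1.x cluster; the exact logger name can vary between versions):

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : {
    "logger.cluster.routing.allocation.decider" : "TRACE"
  }
}'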

Thanks in advance.





ES comprehensive searches over different logs

2015-03-06 Thread horst knete
Hey guys,

we are currently setting up some security-relevant searches in our ES
database and came across the following case, which I could not manage on
my own:

We want to build a query that checks whether an IP address is accessing
different ports within a given amount of time.

So what we basically need to do is run a terms aggregation on a field
called remote_ip and match the terms with a filter/query like port:XXX
AND port:XXY AND port:XXZ, but the query must span different logs
(port:XXX is in log1, port:XXZ is in log2).

So the query should return all remote_ips that have accessed all 3 ports
in the given time (see the sketch below).
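One way this could be approached (a sketch of my own, not from the thread;
the index pattern, field names and time window are assumptions) is a terms
aggregation on remote_ip with one filter sub-aggregation per port. ES 1.x
cannot filter buckets server-side, so the last step - keeping only the IPs
whose three counts are all greater than zero - has to happen in the client:

curl -XGET 'localhost:9200/logstash-*/_search?pretty' -d '{
  "size" : 0,
  "query" : {
    "filtered" : {
      "filter" : { "range" : { "@timestamp" : { "gte" : "now-1h" } } }
    }
  },
  "aggs" : {
    "by_ip" : {
      "terms" : { "field" : "remote_ip", "size" : 0 },
      "aggs" : {
        "port_xxx" : { "filter" : { "term" : { "port" : "XXX" } } },
        "port_xxy" : { "filter" : { "term" : { "port" : "XXY" } } },
        "port_xxz" : { "filter" : { "term" : { "port" : "XXZ" } } }
      }
    }
  }
}'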

I really struggle with these cross-log searches, since I am not that fit
with aggregations yet.

Some tips would be really appreciated.

Thanks



Re: What causes high CPU load on ES-Storage Nodes?

2015-01-28 Thread horst knete
Hi,

thanks for this excellent answer, Jörg, I really appreciate it.

It is true that our ES concept is not ideal, nor how the developers
designed ES to be used.

Nevertheless, I try to make the best of the hardware I have available, so
I am going to merge all the ES instances that take care of the old indices
into one and move it to the new big server (big meaning 64 CPUs and 256 GB
RAM in my context), so that it has all the CPU power available for garbage
collection and searching.

Another idea would be to install a hypervisor on the server, split it into
4-5 virtual machines, and build a real, load-balancing ES cluster - has
anyone got thoughts on this idea?

Thanks so far, 

On Wednesday, 28 January 2015 at 10:32:32 UTC+1, Jörg Prante wrote:

 In most cases, it is a design mistake to place more than one node on a 
 single machine in production.

 Elasticsearch was not designed for scale-up on big machines (it depends 
 what you mean by big), but for scale-out, i.e. the more nodes you add, 
 the better.

 While Elasticsearch can benefit from more-powerful hardware, vertical 
 scale has its limits. Real scalability comes from horizontal scale—the 
 ability to add more nodes to the cluster and to spread load and reliability 
 between them.

 See The Definitive Guide, it is worth a read.


 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/distributed-cluster.html


 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_scale_horizontally.html

 Another recommendation of mine is to use comparable hardware equipment for 
 the machines (CPU, RAM, Disk capacity and performance, network), for ease 
 of configuration, balancing load, and server maintenance.

 Jörg


 On Wed, Jan 28, 2015 at 7:32 AM, horst knete badun...@hotmail.de wrote:

 Hi,

 thanks for your response, Mark.

 It looks like we are getting a second big server for our ELK stack
 (unfortunately without any more storage, so I really can't create a
 failover cluster yet), but I wonder what role I should give this server
 in our system.

 Would it be good to move the whole long-term storage to this server and
 let the indexing and most of the Kibana searching happen on the existing
 server, or are there any good configuration setups?

 I have read about quite a few people who have a similar setup to ours
 (with 1 or 2 big machines) and would be happy if they could share their
 thoughts with us!

 Thanks guys.

 On Tuesday, 27 January 2015 at 22:54:18 UTC+1, Mark Walkom wrote:

 Indexes are only refreshed when a new document is added.
 Best practice would be to use multiple machines; if you lose that one
 you lose your cluster!

 Without knowing more about your cluster stats, you're probably just 
 reaching the limits of things, and either need less data, more nodes or 
 more heap.

 On 28 January 2015 at 02:14, horst knete badun...@hotmail.de wrote:

 Hey guys,

 I've been around this problem for quite a while and didn't get a clear
 answer to it.

 Like many of you guys out there, we are running an ES cluster on a
 single strong server, moving older indices from the fast SSDs to slow,
 cheap HDDs (about 25 TB of data).

 To make this work, we have 3 instances of ES running with the index path
 set to the HDDs' mountpoint and 1 single instance for the
 indexing/searching SSDs.

 What makes me wonder all the time is that even though the storage nodes
 don't do anything (there is no indexing happening, 95% of the time there
 is no searching happening; they just keep the old indices available),
 the CPU load caused by these idle nodes is about 50% of the whole CPU
 working time.

 hot_threads: https://gist.github.com/german23/662732ca4d9dbdcb406b


 Is it possible that, due to the ES cluster mechanism, all the indexes
 keep getting refreshed all the time when a document is indexed or a
 search is executed?

 Are there any configuration options to avoid such behavior?

 Would it be better to export the old indices to a separate ES cluster
 and configure multiple ES paths in Kibana?

 Are there any best practices to maintain such a cluster?

 I would appreciate any form of feedback.

 Thanks


What causes high CPU load on ES-Storage Nodes?

2015-01-27 Thread horst knete
Hey guys,

I've been around this problem for quite a while and didn't get a clear
answer to it.

Like many of you guys out there, we are running an ES cluster on a single
strong server, moving older indices from the fast SSDs to slow, cheap HDDs
(about 25 TB of data).

To make this work, we have 3 instances of ES running with the index path
set to the HDDs' mountpoint and 1 single instance for the
indexing/searching SSDs.

What makes me wonder all the time is that even though the storage nodes
don't do anything (there is no indexing happening, 95% of the time there
is no searching happening; they just keep the old indices available), the
CPU load caused by these idle nodes is about 50% of the whole CPU working
time.

hot_threads: https://gist.github.com/german23/662732ca4d9dbdcb406b


Is it possible that, due to the ES cluster mechanism, all the indexes keep
getting refreshed all the time when a document is indexed or a search is
executed?

Are there any configuration options to avoid such behavior?

Would it be better to export the old indices to a separate ES cluster and
configure multiple ES paths in Kibana?

Are there any best practices to maintain such a cluster?

I would appreciate any form of feedback.

Thanks



Re: What causes high CPU load on ES-Storage Nodes?

2015-01-27 Thread horst knete
Hi,

thanks for your response, Mark.

It looks like we are getting a second big server for our ELK stack
(unfortunately without any more storage, so I really can't create a
failover cluster yet), but I wonder what role I should give this server in
our system.

Would it be good to move the whole long-term storage to this server and
let the indexing and most of the Kibana searching happen on the existing
server, or are there any good configuration setups?

I have read about quite a few people who have a similar setup to ours
(with 1 or 2 big machines) and would be happy if they could share their
thoughts with us!

Thanks guys.

On Tuesday, 27 January 2015 at 22:54:18 UTC+1, Mark Walkom wrote:

 Indexes are only refreshed when a new document is added.
 Best practice would be to use multiple machines; if you lose that one
 you lose your cluster!

 Without knowing more about your cluster stats, you're probably just 
 reaching the limits of things, and either need less data, more nodes or 
 more heap.

 On 28 January 2015 at 02:14, horst knete badun...@hotmail.de wrote:

 Hey guys,

 I've been around this problem for quite a while and didn't get a clear
 answer to it.

 Like many of you guys out there, we are running an ES cluster on a
 single strong server, moving older indices from the fast SSDs to slow,
 cheap HDDs (about 25 TB of data).

 To make this work, we have 3 instances of ES running with the index path
 set to the HDDs' mountpoint and 1 single instance for the
 indexing/searching SSDs.

 What makes me wonder all the time is that even though the storage nodes
 don't do anything (there is no indexing happening, 95% of the time there
 is no searching happening; they just keep the old indices available),
 the CPU load caused by these idle nodes is about 50% of the whole CPU
 working time.

 hot_threads: https://gist.github.com/german23/662732ca4d9dbdcb406b


 Is it possible that, due to the ES cluster mechanism, all the indexes
 keep getting refreshed all the time when a document is indexed or a
 search is executed?

 Are there any configuration options to avoid such behavior?

 Would it be better to export the old indices to a separate ES cluster
 and configure multiple ES paths in Kibana?

 Are there any best practices to maintain such a cluster?

 I would appreciate any form of feedback.

 Thanks



ES keeps deleting shards

2014-12-12 Thread horst knete
hey folks,

as we added an additional ES node yesterday (actually another instance of
ES with a different data path on our server), some of the shards that were
moved to this node keep getting stuck initializing.

According to the debug logs, it seems like they got corrupted during the
copy process (about 20 of the 140 moved shards have this problem...), so I
decided it could help to manually assign them to a node using the
reroute API.

https://lh6.googleusercontent.com/-qSB3BJT-F5o/VIqjbiphLTI/AC4/9Yno_hajBgM/s1600/es.PNG
 
https://lh5.googleusercontent.com/-ijpXObA2tms/VIqkw9lfDHI/ADY/Gc3sK_NE0do/s1600/size.PNG

curl -XPOST 'http://localhost:9200/_cluster/reroute?pretty=true' -d '{
  "commands" : [ {
    "allocate" : {
      "index" : "logstash-2014.06.07",
      "shard" : 2,
      "node" : "Storage",
      "allow_primary" : true
    }
  } ]
}'



After I issued the command, ES tried to initialize the shard, but then
this suddenly happened:


https://lh6.googleusercontent.com/-RnP051vjz1I/VIqlJQLi4-I/ADg/tC5q9TDr5HQ/s1600/sizeafter.PNG





It seems like ES decided to delete the whole shard due to its (potential)
corruption, and with it 1/3 of the data in it.


In my opinion, ES should never, ever be allowed to delete anything without
permission. Any clue what happened in this case?


Thanks for feedback



poor performance of full text searches

2014-11-28 Thread horst knete
Hey guys,

we are experiencing poor performance when we do full-text searches
(searches without specifying a field name).

If we search a 200 GB index that lives on SSDs for something like
' program:apache ', the search takes about 10-15 seconds; if we search for
' *apache* ', the whole search runs for 1.5-2 minutes.

What can cause this? Our mapping? The large heap of the SSD ES node (80 GB
heap + ~90 GB of system memory available to Lucene)? Is it normal?

If you need any information, please let me know.

thanks for any response




Re: poor performance of full text searches

2014-11-28 Thread horst knete
Thanks for the response,

Actually, the need to search with wildcards comes from our mapping. Our
events contain a whole lot of URLs, which the default analyzer indexes
into many terms; this leads to pretty awkward output if you do a top-10
search in the table panel in Kibana. Thanks to a post by another guy here
in the forum we adjusted the mapping so that everything gets indexed into
1 token per field, allowing URLs to be displayed correctly, but with the
disadvantage of needing wildcards, because a search for apache won't
return any events.

We are actually using 1 big server with 256 GB RAM + 64 CPUs, divided into
3 Elasticsearch instances.

Given your post, it seems to make sense to adjust the mapping back so that
everything gets tokenized the default way, eliminating the need for
wildcards, and to use not_analyzed fields for the URLs.

Maybe a somewhat stupid question: if you don't specify any field in a
Kibana search, it is directed to the _all field by default, right? Is
there a way of changing that behaviour so that apache is directed not to
_all but rather to the message field? That would eliminate the need for
the _all field, which would reduce disk space and CPU/RAM load during
indexing. (A sketch follows below.)

On Friday, 28 November 2014 at 10:56:09 UTC+1, David Pilato wrote:

 Some comments:

 Searching for `program:apache` actually searches in the field `program`.

 Searching with wildcards is something you really should not do! When you
 do a full-text search on Google, I guess you don't use wildcards, right?
 It's not really user friendly.
 So wildcards are extremely slow, and even slower when you start with *!
 Look here for reference:
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html#query-dsl-wildcard-query
 Searching in the _all field is also something you should not do, but it
 depends on your use case. I'd prefer using the copy_to feature in the
 mapping instead of the _all field.

 Now, I think it depends on your infra. How many nodes, RAM, … you have.


  

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
 http://Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr | @scrutmydocs 
 https://twitter.com/scrutmydocs


  
 On 28 Nov 2014 at 10:44, horst knete badun...@hotmail.de wrote:

 Hey guys,

 we are experiencing poor performance when we do full-text searches
 (searches without specifying a field name).

 If we search a 200 GB index that lives on SSDs for something like
 ' program:apache ', the search takes about 10-15 seconds; if we search
 for ' *apache* ', the whole search runs for 1.5-2 minutes.

 What can cause this? Our mapping? The large heap of the SSD ES node
 (80 GB heap + ~90 GB of system memory available to Lucene)? Is it normal?

 If you need any information, please let me know.

 thanks for any response





Re: Changing Analyzer behavior for hyphens - suggestions?

2014-11-21 Thread horst knete
I think our solution is now to just replace all the non-letter characters
with an _ in Elasticsearch.

"char_filter" : {
  "replace" : {
    "type" : "mapping",
    "mappings" : [
      "\\.=>_", "\\u2010=>_", "'=>_", "\\:=>_", "\\u0020=>_",
      "\\u005C=>_", "\\u0028=>_", "\\u0029=>_", "\\u0026=>_",
      "\\u002F=>_", "\\u002D=>_", "\\u003F=>_", "\\u003D=>_"
    ]
  }
},

This means the terms no longer get split into useless tokens by the
standard analyzer.

The downside of this solution is that some URLs or Windows paths look very
ugly to the human eye now, e.g.:

http://g.ceipmsn.com/8SE/411?MI=B9DC2E6D07184453A1EFC4E765A16D30-0LV=3.0.131.0OS=6.1.7601AG=1217

becomes

http___g_ceipmsn_com_8se_411_mi_b9dc2e6d07184453a1efc4e765a16d30_0_lv_3_0_131_0_os_6_1_7601_ag_1217

The good thing compared to not_analyzed is that if I search for url:*8se*,
the search will return the events with this URL in them.

I think this is not a perfect solution, but rather a good workaround until
Lucene gives us better analyzer types to work with.

Thanks for sharing your experience so far!

cheers

On Thursday, 20 November 2014 at 20:07:27 UTC+1, Jörg Prante wrote:

 The whitespace tokenizer has the problem that punctuation is not
 ignored. I find the word_delimiter filter not working at all with
 whitespace, only with the keyword tokenizer, and with massive pattern
 matching, which is complex and expensive :(

 Therefore I took the classic tokenizer and generalized the hyphen rules
 in the grammar. The tokenizer "hyphen" and the filter "hyphen" are two
 routines. The tokenizer "hyphen" keeps hyphenated words together and
 handles punctuation correctly. The filter "hyphen" adds combinations to
 the original form.

 Main point is to add combinations of dehyphenated forms so they can be 
 searched. 

 Single words are only taken into account when the word is positioned at 
 the edge.

 For example, the phrase "der-die-das" should be indexed in the following
 forms:

 "der-die-das", "derdiedas", "das", "derdie", "derdie-das", "die-das", "der"

 Jörg

 On Thu, Nov 20, 2014 at 9:29 AM, horst knete badun...@hotmail.de wrote:


 So the term "this-is-a-test" gets tokenized into "this-is-a-test", which
 is nice behaviour, but in order to run a full-text search on this field
 it should get tokenized into "this-is-a-test", "this", "is", "a" and
 "test", as I wrote before.





Re: Changing Analyzer behavior for hyphens - suggestions?

2014-11-20 Thread horst knete
Hi,

thanks for the response and this awesome plugin bundle (especially for me
as a German).

Unfortunately the hyphen analyzer plugin didn't do the job the way I
wanted it to.

The hyphen analyzer does something similar to the whitespace analyzer - it
just doesn't split on hyphens and instead treats them as ALPHANUM
characters (at least that is what I think right now).

So the term "this-is-a-test" gets tokenized into "this-is-a-test", which
is nice behaviour, but in order to run a full-text search on this field it
should get tokenized into "this-is-a-test", "this", "is", "a" and "test",
as I wrote before.

I think maybe abusing the word_delimiter token filter could do the job,
because there is an option preserve_original.

Unfortunately, if you adjust the filter like this:

PUT /logstash-2014.11.20
{
  "index" : {
    "analysis" : {
      "analyzer" : {
        "wordtest" : {
          "type" : "custom",
          "tokenizer" : "whitespace",
          "filter" : [
            "lowercase",
            "word"
          ]
        }
      },
      "filter" : {
        "word" : {
          "type" : "word_delimiter",
          "generate_word_parts" : false,
          "generate_number_parts" : false,
          "catenate_words" : false,
          "catenate_numbers" : false,
          "catenate_all" : false,
          "split_on_case_change" : false,
          "preserve_original" : true,
          "split_on_numerics" : false,
          "stem_english_possessive" : true
        }
      }
    }
  }
}

and run an analyze test:

curl -XGET 'localhost:9200/logstash-2014.11.20/_analyze?filters=word' -d 
'this-is-a-test'

the response is this:

{"tokens":[{"token":"this","start_offset":0,"end_offset":4,"type":"ALPHANUM","position":1},{"token":"is","start_offset":5,"end_offset":7,"type":"ALPHANUM","position":2},{"token":"a","start_offset":8,"end_offset":9,"type":"ALPHANUM","position":3},{"token":"test","start_offset":10,"end_offset":14,"type":"ALPHANUM","position":4}]}

which shows that it tokenized everything except the original term, which
makes me wonder whether the preserve_original setting is working at all.

Any idea on this?
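One thing that might be worth checking (an assumption of mine, not from
the thread): the analyze call above names only the filter, so it may not
run the whitespace tokenizer defined in the analyzer. Testing the named
analyzer directly should rule that out:

curl -XGET 'localhost:9200/logstash-2014.11.20/_analyze?analyzer=wordtest' -d 'this-is-a-test'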

On Wednesday, 19 November 2014 at 18:26:09 UTC+1, Jörg Prante wrote:

 You search for a hyphen-aware tokenizer, like this?

 https://gist.github.com/jprante/cd120eac542ba6eec965

 It is in my plugin bundle

 https://github.com/jprante/elasticsearch-plugin-bundle

 Jörg

 On Wed, Nov 19, 2014 at 5:46 PM, horst knete badun...@hotmail.de wrote:

 Hey guys,

 after working with the ELK stack for a while now, we still have a very
 annoying problem regarding the behavior of the standard analyzer - it
 splits terms into tokens using hyphens or dots as delimiters.

 e.g. logsource:firewall-physical-management gets split into "firewall",
 "physical" and "management". On the one hand that's cool, because if you
 search for logsource:firewall you get all the events with firewall as a
 token in the field logsource.

 The downside of this behaviour is that if you are doing e.g. a top-10
 search on a field in Kibana, all the tokens are counted as whole terms
 and get ranked by their count:
 top 10:
 1. firewall : 10
 2. physical : 10
 3. management: 10

 instead of top 10:
 1. firewall-physical-management: 10

 Well, in the standard mapping from logstash this is solved using a .raw
 field as not_analyzed, but the downside of this is that you get 2 fields
 instead of one (even if it is a multi_field) and the usability for Kibana
 users is not that great.

 So what we need is that logsource:firewall-physical-management gets
 tokenized into "firewall-physical-management", "firewall", "physical"
 and "management".

 I tried this using the word_delimiter token filter with the following
 mapping:

 "analysis" : {
   "analyzer" : {
     "my_analyzer" : {
       "type" : "custom",
       "tokenizer" : "whitespace",
       "filter" : ["lowercase", "asciifolding", "my_worddelimiter"]
     }
   },
   "filter" : {
     "my_worddelimiter" : {
       "type" : "word_delimiter",
       "generate_word_parts" : false,
       "generate_number_parts" : false,
       "catenate_words" : false,
       "catenate_numbers" : false,
       "catenate_all" : false,
       "split_on_case_change" : false,
       "preserve_original" : true,
       "split_on_numerics" : false,
       "stem_english_possessive" : true
     }
   }
 }
Changing Analyzer behavior for hyphens - suggestions?

2014-11-19 Thread horst knete
Hey guys,

after working with the ELK stack for a while now, we still have a very
annoying problem regarding the behavior of the standard analyzer - it
splits terms into tokens using hyphens or dots as delimiters.

e.g. logsource:firewall-physical-management gets split into "firewall",
"physical" and "management". On the one hand that's cool, because if you
search for logsource:firewall you get all the events with firewall as a
token in the field logsource.

The downside of this behaviour is that if you are doing e.g. a top-10
search on a field in Kibana, all the tokens are counted as whole terms and
get ranked by their count:
top 10:
1. firewall : 10
2. physical : 10
3. management: 10

instead of top 10:
1. firewall-physical-management: 10

Well, in the standard mapping from logstash this is solved using a .raw
field as not_analyzed, but the downside of this is that you get 2 fields
instead of one (even if it is a multi_field) and the usability for Kibana
users is not that great.

So what we need is that logsource:firewall-physical-management gets
tokenized into "firewall-physical-management", "firewall", "physical" and
"management".

I tried this using the word_delimiter token filter with the following
mapping:

"analysis" : {
  "analyzer" : {
    "my_analyzer" : {
      "type" : "custom",
      "tokenizer" : "whitespace",
      "filter" : ["lowercase", "asciifolding", "my_worddelimiter"]
    }
  },
  "filter" : {
    "my_worddelimiter" : {
      "type" : "word_delimiter",
      "generate_word_parts" : false,
      "generate_number_parts" : false,
      "catenate_words" : false,
      "catenate_numbers" : false,
      "catenate_all" : false,
      "split_on_case_change" : false,
      "preserve_original" : true,
      "split_on_numerics" : false,
      "stem_english_possessive" : true
    }
  }
}

But this unfortunately didn't do the job.

I've seen in my research that some other guys have a similar problem, but
apart from some replacement suggestions, no real solution was found.

If anyone has ideas on how to start working on this, I would be very
happy.

thanks.



Re: Index reallocation takes hours

2014-10-29 Thread horst knete
Hi,

I want to use this thread to clarify some of the Elasticsearch settings,
because I am struggling a bit to understand all of them.

1. indices.store.throttle.max_bytes_per_sec: This is the maximum amount of
MB/GB per second that Elasticsearch uses for the merging process - right?
(Is there a difference between indices.store.throttle.max_bytes_per_sec
and index.store.throttle.max_bytes_per_sec?)
2. indices.recovery.max_bytes_per_sec: This is the maximum amount of MB/GB
that Elasticsearch uses to initialize shards/indices after a restart of
the node - right?
3. indices.recovery.concurrent_streams: The number of parallel shard
initializations after a restart - right?
4. cluster.routing.allocation.node_concurrent_recoveries: The cluster-wide
number of parallel shard initializations after a restart - right?
5. cluster.routing.allocation.node_initial_primaries_recoveries: The
cluster-wide number of parallel primary shard initializations after a
restart - right?
6. cluster.routing.allocation.cluster_concurrent_rebalance: The number of
shards that are allowed to move in parallel from one node to another -
right?

But among all the settings found in Elasticsearch there is no setting like
cluster.routing.allocation.max_bytes_per_sec that limits the maximum
amount of MB/GB per second that can be used to relocate a shard from one
node to another - will Elasticsearch use some default value, or just the
maximum possible speed?
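For what it's worth, a sketch of setting the dynamic variants of these
settings at runtime (the values are illustrative assumptions, not
recommendations). As far as I know, shard relocation runs through the
recovery machinery, so indices.recovery.max_bytes_per_sec is the closest
thing to the missing cluster.routing.allocation.max_bytes_per_sec:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : {
    "indices.store.throttle.max_bytes_per_sec" : "100mb",
    "indices.recovery.max_bytes_per_sec" : "100mb",
    "cluster.routing.allocation.cluster_concurrent_rebalance" : 2
  }
}'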

Thanks for response

On Wednesday, 8 October 2014 at 11:04:26 UTC+2, Andrew Lakes wrote:

 Hi,

 we are using version 1.3.2 of ES; each index has 4 shards.

 I will try adjusting the store throttling to an acceptable number without
 crushing our disk I/O.

 On Tuesday, 7 October 2014 at 19:35:35 UTC+2, Ivan Brusic wrote:

 Currently shard allocation is throttled by default with a low target
 value. If you are using an older version of Elasticsearch, then a Lucene
 bug will make it even slower.


 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-store.html#store-throttling

 How many shards do your indices have? The default number of concurrent
 allocations is only 2. The ideal values depend on your hardware,
 especially your disk IO.

 Cheers,
 Ivan
 On Oct 7, 2014 8:42 AM, Andrew Lakes alak...@gmail.com wrote:

 Hey guys,

 we have set up a central ES server with 2 instances (1 instance on SSDs
 for indexing/recent indices and another one for long-term storage).

 Every day at 6 a.m. a cronjob is started that moves indices older than 7
 days to the HDD storage node.

 Our indices have an average size of 300 GB, but it takes about 6-7 hours
 until the reallocation of 1 index is done (if you copy it manually with
 cp it only takes about 30 minutes).

 Are there any settings I can adjust to speed up this reallocation
 process?



Re: Any experience with ES and Data Compressing Filesystems?

2014-08-04 Thread horst knete
Hi again,

a quick report regarding compression:

we are using a 3 TB btrfs volume with a 32k block size now, which reduced
the amount of data from 3.2 TB to 1.1 TB without any significant
performance losses (we are using an 8-CPU, 20 GB memory machine with an
iSCSI link to the volume).

So I can only suggest using a btrfs volume for long-term storage. (A
sketch of the setup follows below.)
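For reference, a minimal sketch of how such a volume might be created and
mounted (my reconstruction, not taken from the post; the device and
mountpoint are placeholders, and -n sets the btrfs node/leaf size):

mkfs.btrfs -n 32768 /dev/sdX
mount -o compress=zlib /dev/sdX /var/lib/elasticsearch/archive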

On Monday, 21 July 2014 at 08:48:12 UTC+2, Patrick Proniewski wrote:

 Hi, 

 gzip/zlib compression is very bad for performance, so it can be
 interesting for closed indices, but for live data I would not recommend
 it. Also, you must know that:

 - compression using lz4 is already enabled inside indices,
 - ES/Lucene/Java usually read/write 4k blocks,
 - hence, compression is achieved on 4k blocks. If your filesystem uses 4k
 blocks and you add FS compression, you will probably have a very small
 gain, if any. I've tried on ZFS:

 Filesystem SizeUsed   Avail Capacity  Mounted on 
 zdata/ES-lz4   1.1T1.9G1.1T 0%/zdata/ES-lz4 
 zdata/ES   1.1T1.9G1.1T 0%/zdata/ES 

 If you are using a larger block size, like 128k, a compressed filesystem 
 does show some benefit: 

 Filesystem SizeUsed   Avail Capacity  Mounted on 
 zdata/ES-lz4   1.1T1.1G1.1T 0%   
  /zdata/ES-lz4- compressratio  1.73x 
 zdata/ES-gzip  1.1T901M1.1T 0%   
  /zdata/ES-gzip- compressratio  2.27x 
 zdata/ES   1.1T1.9G1.1T 0%/zdata/ES 

 But a file system block larger than 4k is very suboptimal for IO (ES
 reads or writes one 4k block - your FS must read or write a 128k block).

 On 21 Jul 2014, at 07:58, horst knete badun...@hotmail.de wrote:

  Hey guys, 
  
  we have mounted a btrfs file system with the zlib compression method
  for testing purposes on our Elasticsearch server and copied one of the
  indices onto the btrfs volume; unfortunately it had no success and the
  index still has a size of 50 GB :/

  I will try other compression methods and will report here
  
  On Saturday, 19 July 2014 at 07:21:20 UTC+2, Otis Gospodnetic wrote:
  
  Hi Horst, 
  
  I wouldn't bother with this for the reasons Joerg mentioned, but should 
  you try it anyway, I'd love to hear your findings/observations. 
  
  Otis 
  -- 
  Performance Monitoring * Log Analytics * Search Analytics 
  Solr & Elasticsearch Support * http://sematext.com/
  
  
  
  On Wednesday, July 16, 2014 6:56:36 AM UTC-4, horst knete wrote: 
  
  Hey Guys, 
  
  to save a lot of hard disk space, we are going to use a compression
  file system, which allows us transparent compression for the ES
  indices. (It seems like ES indices are very compressible; we got up to
  a 65% compression rate in some tests.)

  Currently the indices live on an ext4 Linux filesystem, which
  unfortunately does not have the transparent compression ability.

  Has anyone got experience with compression file systems like BTRFS or
  ZFS/OpenZFS, and can you tell us if this led to big performance losses?

  Thanks for responding




Re: Any experience with ES and Data Compressing Filesystems?

2014-08-04 Thread horst knete
We are indexing all sorts of events (Windows, Linux, Apache, Netflow and
so on...), and impact is measured by the speed of the Kibana GUI / how
long it takes to load 7 or 14 days of data. That is what matters to my
colleagues.


On Monday, 4 August 2014 at 10:52:25 UTC+2, Mark Walkom wrote:

 What sort of data are you indexing? When you said the performance impact
 was minimal, how minimal, and at what points are you seeing it?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com
  

 On 4 August 2014 16:43, horst knete badun...@hotmail.de wrote:

 Hi again,

 a quick report regarding compression:

 we are using a 3 TB btrfs volume with a 32k block size now, which
 reduced the amount of data from 3.2 TB to 1.1 TB without any significant
 performance losses (we are using an 8-CPU, 20 GB memory machine with an
 iSCSI link to the volume).

 So I can only suggest using a btrfs volume for long-term storage.

 On Monday, 21 July 2014 at 08:48:12 UTC+2, Patrick Proniewski wrote:

 Hi, 

 gzip/zlib compression is very bad for performance, so it can be
 interesting for closed indices, but for live data I would not recommend
 it. Also, you must know that:

 - compression using lz4 is already enabled inside indices,
 - ES/Lucene/Java usually read/write 4k blocks,
 - hence, compression is achieved on 4k blocks. If your filesystem uses 4k
 blocks and you add FS compression, you will probably have a very small
 gain, if any. I've tried on ZFS:

 Filesystem SizeUsed   Avail Capacity  Mounted on 
 zdata/ES-lz4   1.1T1.9G1.1T 0%/zdata/ES-lz4 
 zdata/ES   1.1T1.9G1.1T 0%/zdata/ES 

 If you are using a larger block size, like 128k, a compressed filesystem 
 does show some benefit: 

 Filesystem SizeUsed   Avail Capacity  Mounted on 
 zdata/ES-lz4   1.1T1.1G1.1T 0%   
  /zdata/ES-lz4- compressratio  1.73x 
 zdata/ES-gzip  1.1T901M1.1T 0%   
  /zdata/ES-gzip- compressratio  2.27x 
 zdata/ES   1.1T1.9G1.1T 0%/zdata/ES 

 But a file system block larger than 4k is very suboptimal for IO (ES
 reads or writes one 4k block - your FS must read or write a 128k block).

 On 21 Jul 2014, at 07:58, horst knete badun...@hotmail.de wrote:

  Hey guys, 
  
  we have mounted a btrfs file system with the zlib compression method
  for testing purposes on our Elasticsearch server and copied one of the
  indices onto the btrfs volume; unfortunately it had no success and the
  index still has a size of 50 GB :/

  I will try other compression methods and will report here
  
  On Saturday, 19 July 2014 at 07:21:20 UTC+2, Otis Gospodnetic wrote:
  
  Hi Horst, 
  
  I wouldn't bother with this for the reasons Joerg mentioned, but 
 should 
  you try it anyway, I'd love to hear your findings/observations. 
  
  Otis 
  -- 
  Performance Monitoring * Log Analytics * Search Analytics 
  Solr & Elasticsearch Support * http://sematext.com/
  
  
  
  On Wednesday, July 16, 2014 6:56:36 AM UTC-4, horst knete wrote: 
  
  Hey Guys, 
  
  to save a lot of hard disk space, we are going to use a compression
  file system, which allows us transparent compression for the ES
  indices. (It seems like ES indices are very compressible; we got up to
  a 65% compression rate in some tests.)

  Currently the indices live on an ext4 Linux filesystem, which
  unfortunately does not have the transparent compression ability.

  Has anyone got experience with compression file systems like BTRFS or
  ZFS/OpenZFS, and can you tell us if this led to big performance losses?

  Thanks for responding



Re: Any experience with ES and Data Compressing Filesystems?

2014-07-20 Thread horst knete
Hey guys,

we have mounted a btrfs file system with the zlib compression method for
testing purposes on our Elasticsearch server and copied one of the indices
onto the btrfs volume; unfortunately it had no success and the index still
has a size of 50 GB :/

I will try other compression methods and report back here.

On Saturday, 19 July 2014 at 07:21:20 UTC+2, Otis Gospodnetic wrote:

 Hi Horst,

 I wouldn't bother with this for the reasons Joerg mentioned, but should 
 you try it anyway, I'd love to hear your findings/observations.

 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
 Solr & Elasticsearch Support * http://sematext.com/



 On Wednesday, July 16, 2014 6:56:36 AM UTC-4, horst knete wrote:

 Hey Guys,

 to save a lot of hard disk space, we are going to use a compression file
 system, which allows us transparent compression for the ES indices. (It
 seems like ES indices are very compressible; we got up to a 65%
 compression rate in some tests.)

 Currently the indices live on an ext4 Linux filesystem, which
 unfortunately does not have the transparent compression ability.

 Has anyone got experience with compression file systems like BTRFS or
 ZFS/OpenZFS, and can you tell us if this led to big performance losses?

 Thanks for responding





Recommended File System Settings for Elasticsearch

2014-07-18 Thread horst knete
Hey Guys,

we are going to expand our development ES system into a production
environment, and we have fresh new storage waiting to be formatted.

In order to get the best performance for Elasticsearch, we'd like to know
what the recommended settings are.

Does Elasticsearch read/write in a sequential way, or is there a lot of
random access on the indices?

Which block size would you recommend? (I've heard Elasticsearch
reads/writes in 4k blocks, right?)

Thanks for response



Any experience with ES and Data Compressing Filesystems?

2014-07-16 Thread horst knete
Hey Guys,

to save a lot of hard disk space, we are going to use a compression file
system, which allows us transparent compression for the ES indices. (It
seems like ES indices are very compressible; we got up to a 65%
compression rate in some tests.)

Currently the indices live on an ext4 Linux filesystem, which
unfortunately does not have the transparent compression ability.

Has anyone got experience with compression file systems like BTRFS or
ZFS/OpenZFS, and can you tell us if this led to big performance losses?

Thanks for responding



Re: Disabling _all-Field but keep Netflow-Events searchable

2014-07-14 Thread horst knete
Has anyone got an idea how to realize that? I think there are quite a few
users who have Netflow AND other types of events going into Elasticsearch,
and for those a disabled _all field would save a lot of hard disk space.



Re: Disabling _all-Field but keep Netflow-Events searchable

2014-07-01 Thread horst knete
Hey,

thanks for your response.

As I already mentioned, I tried setting index.query.default_field to
message.

{
  "template" : "logstash-*",
  "settings" : {
    "index.refresh_interval" : "5s",
    "index.number_of_shards" : 3,
    "index.number_of_replicas" : 0,
    "index.refresh_interval" : "30s",
    "index.store.compress.stored" : true,
    "index.store.compress.tv" : true,
    "index.query.default_field" : "message",
    "analysis" : {
      "analyzer" : {
        "default" : {
          "type" : "standard",
          "stopwords" : "_none_"
        }
      }
    }
  },
  "mappings" : {
    "_default_" : {
      "_all" : { "enabled" : false },
      "_source" : { "compress" : true },
      "dynamic_templates" : [ {
        "string_fields" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "string",
            "fields" : {
              "{name}" : { "type" : "string", "index" : "not_analyzed" }
            }
          }
        }
      } ],
      "properties" : {
        "@version" : { "type" : "string", "index" : "not_analyzed" },
        "@timestamp" : { "type" : "date", "index" : "not_analyzed" },
        "tags" : { "type" : "string", "index" : "not_analyzed" }
      }
    }
  }
}

That was the template I used. It works fine for all events except the
Netflow ones, because they don't have a message field for Kibana to search
in. That is my mess.

Is it possible to adjust the template/mapping per type of event? (See the
sketch below for what I mean.)
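A minimal sketch of what such a per-type override might look like in the
template's mappings section (my assumption based on how type-level _all
settings work; "netflow" is a hypothetical type name that would keep _all
enabled while every other type has it disabled):

"mappings" : {
  "_default_" : {
    "_all" : { "enabled" : false }
  },
  "netflow" : {
    "_all" : { "enabled" : true }
  }
}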

Cheers

On Monday, 30 June 2014 at 09:08:22 UTC+2, Alexander Reelsen wrote:

 Hey,

 you can set index.query.default_field in the mapping to circumvent this;
 see
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-all-field.html#mapping-all-field


 --Alex


 On Tue, Jun 24, 2014 at 12:39 PM, horst knete badun...@hotmail.de wrote:

 Hey guys,

 I really want to disable the _all field in the ES indices to save some
 disk space on our system.

 Normally that is not a problem - adjust the template in ES and set the
 message field as the new default query field, which is normally
 available in any event.

 The problem is that we also have many Netflow events from the netflow
 codec that have the following form:

 https://lh4.googleusercontent.com/-CDQQs5e5a7o/U6lUvjikncI/ACo/LHpMXlYLMWw/s1600/netflow.PNG

 As you might notice, there isn't any message field, so the Kibana Lucene
 query would run into an error.

 My question is - how do I make this work (disabling the _all field but
 keeping the Netflow events searchable)?

 Thanks for the response.



Disabling _all-Field but keep Netflow-Events searchable

2014-06-24 Thread horst knete
Hey guys,

I really want to disable the _all field in the ES indices to save some
disk space on our system.

Normally that is not a problem - adjust the template in ES and set the
message field as the new default query field, which is normally available
in any event.

The problem is that we also have many Netflow events from the netflow
codec that have the following form:

https://lh4.googleusercontent.com/-CDQQs5e5a7o/U6lUvjikncI/ACo/LHpMXlYLMWw/s1600/netflow.PNG

As you might notice, there isn't any message field, so the Kibana Lucene
query would run into an error.

My question is - how do I make this work (disabling the _all field but
keeping the Netflow events searchable)?

Thanks for the response.



Re: Using Jetty plugin

2014-06-11 Thread horst knete
Hi,

do you want to restrict access to the physical location (e.g.
/opt/elasticsearch/data/elasticsearch/0/indices/index-name), or do you
want to restrict access to certain types of logs?

For the second option you could look at this thread
https://groups.google.com/forum/#!topic/logstash-users/LFmNPv3z6YE where I
explained using kibana3_auth for this purpose.

On Tuesday, 10 June 2014 at 12:30:23 UTC+2, Gaurav Kamaljith wrote:

 Hi,

 Is it possible to restrict Kibana users from accessing particular 
 indices/sub indices folders using the Jetty plugin?

 Regards,
 Gaurav




Re: alerts from Kibana/ES

2014-06-02 Thread horst knete
Hi NF,

we also set up alerting with our Zabbix monitoring system.

What we use are simple Linux scripts that use curl to search given
Elasticsearch indices.

In the Zabbix system, triggers are built that run the scripts on our
Elasticsearch server and interpret their output (e.g. the number of events
with ID 4625); if this value hits a specific threshold, the trigger
alerts. (A sketch of such a script follows below.)

It is simple to set up, and maybe this is what you are looking for.
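As an illustration, a minimal sketch of such a check script (my
reconstruction, not the script from the post; the index pattern, field
name and event ID are assumptions):

#!/bin/sh
# Count Windows logon failures (event ID 4625) in the last 15 minutes;
# Zabbix reads the number printed on stdout and triggers on a threshold.
curl -s -XGET 'localhost:9200/logstash-*/_count' -d '{
  "query" : {
    "filtered" : {
      "query" : { "query_string" : { "query" : "EventID:4625" } },
      "filter" : { "range" : { "@timestamp" : { "gte" : "now-15m" } } }
    }
  }
}' | sed -e 's/.*"count":\([0-9]*\).*/\1/'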

If you need any help, feel free to contact me

On Friday, 30 May 2014 at 08:31:07 UTC+2, NF wrote:

 That's right, Otis.

 On Friday, May 30, 2014 7:20:27 AM UTC+2, Otis Gospodnetic wrote:

 Hi,

 There's no alerting in Kibana. Have a look at SPM
 (http://sematext.com/spm/) - it has ES monitoring, threshold and
 heartbeat alerting, anomaly detection, and a number of other features.
 Actually, re-reading your email - you are looking to get notified when a
 certain event is captured? By that do you mean having something like a
 saved query that matches incoming logs?

 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
 Solr & Elasticsearch Support * http://sematext.com/


 On Tuesday, May 27, 2014 5:02:35 AM UTC-4, NF wrote:

 Hi,

 We're using Kibana/Elasticsearch to visualize different kinds of logs in
 our company. Now we need a feature that would allow us to send an
 alert/notification (email or other) when a certain event/trigger is
 captured.

 I'd like to know if there is such a feature planned in the
 Kibana/Elasticsearch backlog? If so, when might we expect it to be
 available?

 If not, could you please suggest any (open source) solution to satisfy 
 our need?

 Thanks,

 Natalia





Store Elasticsearch Indices in a revision/audit-proof way

2014-05-22 Thread horst knete
Hey guys,

in order to meet the German laws for logging, I have been tasked with
storing the Elasticsearch indices in a revision-proof/audit-proof way
(indices cannot be edited/changed after storage).

Are there any best practices or tips for doing such a thing? (Maybe any
plugins?)

Thanks for your feedback.



Re: Store Elasticsearch Indices in a revision/audit-proof way

2014-05-22 Thread horst knete
Yeah, it looks like that would do the job. Thanks for the response. (A
sketch follows below.)
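For reference, a minimal sketch of that read-only switch (assuming the ES
1.x index settings API; the index name is a placeholder):

curl -XPUT 'localhost:9200/logstash-2014.05.21/_settings' -d '{
  "index" : { "blocks.read_only" : true }
}'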

On Thursday, 22 May 2014 at 10:40:19 UTC+2, Mark Walkom wrote:

 You can set indexes to read-only -
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html
 Is that what you're after?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 22 May 2014 18:36, horst knete badun...@hotmail.de wrote:

 Hey guys,

 in order to meet the German laws for logging, I have been tasked with
 storing the Elasticsearch indices in a revision-proof/audit-proof way
 (indices cannot be edited/changed after storage).

 Are there any best practices or tips for doing such a thing? (Maybe any
 plugins?)

 Thanks for your feedback.



Re: Store Elasticsearch Indices in a revision/audit-proof way

2014-05-22 Thread horst knete
Hi Jörg,

thanks for your offer.

I will contact you if there's a need for such a plugin in our company.

I will also keep you up to date if there are breaking changes in our project.



On Thursday, May 22, 2014 at 10:55:44 UTC+2, Jörg Prante wrote:

 You have to add a facility to your middleware that can trace all 
 authorized operations on your index (access, read, write, modify, delete), 
 and you must write this to an append-only logfile with timestamps.

 If there is interest I could write such a plugin (assuming it can run in a 
 trusted environment regarding authorization tokens), but I think the best 
 place is in the middleware (where an ES client runs in a broader 
 application context, e.g. with transaction awareness).

 Jörg
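
 A toy shell sketch of that append-only idea (not the proposed plugin; the 
 wrapper name and logfile path are made up for illustration):

 # hypothetical wrapper: record every ES call with a UTC timestamp in an
 # append-only audit log before executing it
 es_audit() {
   echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $USER $*" >> /var/log/es-audit.log
   curl "$@"
 }

 # enforce append-only at the filesystem level (Linux ext*, needs root)
 touch /var/log/es-audit.log
 chattr +a /var/log/es-audit.log

 # every query is then logged before it runs
 es_audit -XGET 'localhost:9200/logstash-2014.05.21/_count'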


 On Thu, May 22, 2014 at 10:36 AM, horst knete badun...@hotmail.de wrote:

 Hey guys,

 in order to meet the German legal requirements for logging, I have been 
 asked to store the Elasticsearch indices in a revision-/audit-proof way 
 (indices cannot be edited/changed after storage).

 Are there any best practices or tips for doing such a thing? (Maybe any 
 plugins?)

 Thanks for your feedback.







Does Elasticsearch support multiple non-RAID-0 data paths?

2014-05-15 Thread horst knete
Hello,

we are currently running Elasticsearch on a single node and ingest about 
20 million logs per day (40 GB of daily indices). Since this is a lot of 
data to handle, the indices take up a lot of disk space on our server.

What we would like to implement:

- one data directory stored on our SSDs, containing the indices of the 
last 7 days for quick access
- one data directory stored on normal HDDs, containing the indices of the 
last 3 months for normal-speed access
- one data directory stored on slow 5400 rpm HDDs, containing the indices 
of the last 2 years for access if needed

It's no problem to give ES multiple data paths, but if you do this, ES 
will stripe (RAID 0) the indices across all 3 data directories.

That's not what we want. We want to move the indices with a script to the 
matching directories (an index that is older than 8 days gets 
automatically moved to the normal HDDs, and so on).

Is there any way to make this work?

Thanks for your feedback.
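
A hedged sketch of how this is often handled: instead of multiple data 
paths on one node, run one ES node per storage tier, give each node a tag 
in its elasticsearch.yml (e.g. node.tag: ssd / hdd / archive), and have a 
nightly script re-tag indices as they age (the index name and tag values 
below are just examples):

curl -XPUT 'localhost:9200/logstash-2014.05.07/_settings' -d '
{ "index.routing.allocation.require.tag" : "hdd" }'

ES then relocates the shards of that index to the matching nodes on its 
own, so no manual copying of directories is needed.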



Re: Compress Elasticsearch Index/Disk Usage Elasticsearch

2014-03-20 Thread horst knete
Hi Adrien,

thanks for the response.

Curator is a very helpful tool, thanks for the link!

I tried it and saved about 1 GB of disk space (24 GB down to 23 GB); I 
think the effect increases with more indexes.
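
For reference, a minimal sketch of the optimize call that curator 
automates here; merging down to one segment is what typically reclaims 
the space (the index name is just an example):

curl -XPOST 'localhost:9200/logstash-2014.03.17/_optimize?max_num_segments=1'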



On Wednesday, March 19, 2014 at 17:42:56 UTC+1, Adrien Grand wrote:

 Hi,

 This setting doesn't exist anymore as indices are now always compressed.

 One thing that might help gain back some disk space (not much, don't 
 expect too much from it) is to run an _optimize[1] call on this index. You 
 can use curator[2] to automate this operation.

 [1] 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-optimize.html
  
 [2] https://github.com/elasticsearch/curator


 On Wed, Mar 19, 2014 at 1:10 PM, horst knete badun...@hotmail.de wrote:

 Hey guys,

 I just wanted to ask if there's a possibility to decrease the disk space 
 usage of an Elasticsearch index.

 E.g. I have an index called logstash-2014.03.17 with a size of 547 MB and 
 a total of 1,018,761 indexed documents/logs, and I want to reduce its disk 
 usage.

 According to this article I found: 
 https://github.com/elasticsearch/elasticsearch/issues/2037, all I have to 
 do is set index.store.compress.stored to true.

 I've done this with:

 curl -XPOST 'localhost:9200/logstash-2014.03.17/_close'
 curl -XPUT 'localhost:9200/logstash-2014.03.17/_settings' -d '
 { "index.store.compress.stored" : true }'
 curl -XPOST 'localhost:9200/logstash-2014.03.17/_open'

 Then I optimized the index with the command:

 curl -XPOST 'http://localhost:9200/logstash-2014.03.17/_optimize'

 and restarted the cluster.

 Unfortunately nothing happened and it still uses 547 MB of disk space.

 Have I done something wrong?

 Are there other ways to decrease the disk space usage?

 Thanks for the response.
  




 -- 
 Adrien Grand
  



Compress Elasticsearch Index/Disk Usage Elasticsearch

2014-03-19 Thread horst knete
Hey guys,

I just wanted to ask if there's a possibility to decrease the disk space 
usage of an Elasticsearch index.

E.g. I have an index called logstash-2014.03.17 with a size of 547 MB and 
a total of 1,018,761 indexed documents/logs, and I want to reduce its disk 
usage.

According to this article I found: 
https://github.com/elasticsearch/elasticsearch/issues/2037, all I have to 
do is set index.store.compress.stored to true.

I've done this with:

curl -XPOST 'localhost:9200/logstash-2014.03.17/_close'
curl -XPUT 'localhost:9200/logstash-2014.03.17/_settings' -d '
{ "index.store.compress.stored" : true }'
curl -XPOST 'localhost:9200/logstash-2014.03.17/_open'

Then I optimized the index with the command:

curl -XPOST 'http://localhost:9200/logstash-2014.03.17/_optimize'

and restarted the cluster.

Unfortunately nothing happened and it still uses 547 MB of disk space.

Have I done something wrong?

Are there other ways to decrease the disk space usage?

Thanks for the response.



Re: Kibana3 not displaying the terms panel

2014-02-21 Thread horst knete
After hours of debugging with Wireshark, Firebug, etc., I found out that 
the problem is that when you load the default page, all panels get cached 
in the user's browser except the terms panel (I think it is not declared 
as cacheable in Kibana's require.js/html).

If you then load a dashboard with a terms panel, all the data from ES is 
sent to the client except the data for the terms panel, which doesn't get 
displayed unless you refresh the page.

My solution was to add a terms panel to the default dashboard, so that the 
client is forced to load the data for the terms panel.

Hopefully the developers of Kibana will fix this bug.

On Monday, February 17, 2014 at 13:56:57 UTC+1, horst knete wrote:

 Hey guys,

 I've set up a syslog system with Kibana3 (Milestone 4), ES and logstash, 
 and everything worked fine.

 But now I have a small problem: if I load a dashboard in Kibana where 
 terms panels are used, they won't get displayed.

 [screenshot: https://lh6.googleusercontent.com/-_wCbI8-pZFs/UwIGDvLRrpI/AA8/ZeszJbKpABM/s1600/screen.PNG]
 As you can see, the table panels are displayed but not the terms panel 
 (the row is displayed, but not the panels in it).

 I am using the Kibana from 
 https://github.com/christian-marie/kibana3_auth, which is pretty much the 
 same as the original Milestone 4 but with an authentication page.

 Has anyone of you encountered the same problem, or found a solution?

 Thanks for your help

 cheers




Kibana3 not displaying the terms panel

2014-02-17 Thread horst knete
Hey guys,

I've set up a syslog system with Kibana3 (Milestone 4), ES and logstash, 
and everything worked fine.

But now I have a small problem: if I load a dashboard in Kibana where 
terms panels are used, they won't get displayed.

[screenshot: https://lh6.googleusercontent.com/-_wCbI8-pZFs/UwIGDvLRrpI/AA8/ZeszJbKpABM/s1600/screen.PNG]
As you can see, the table panels are displayed but not the terms panel 
(the row is displayed, but not the panels in it).

I am using the Kibana from https://github.com/christian-marie/kibana3_auth, 
which is pretty much the same as the original Milestone 4 but with an 
authentication page.

Has anyone of you encountered the same problem, or found a solution?

Thanks for your help

cheers
