Shards do not get distributed across the cluster
Hey guys, I am currently struggling with shard allocation in my ES cluster. The problem is that the index [logstash-2015.04.13] sits on the node Storage1 with 12 shards. I want the index to be distributed evenly across the 2 other Storage nodes; the node with the SSDs in it should be excluded from allocation. The settings of the index are:

settings for logstash-2015.04.13
{
  "index": {
    "routing": {
      "allocation": {
        "include": { "tag": "" },
        "require": { "tag": "" },
        "exclude": { "tag": "SSD" },
        "total_shards_per_node": 4
      }
    }
  }
}

Unfortunately I get the following logs in ES:

[2015-04-20 08:01:55,070][DEBUG][cluster.routing.allocation] [SSD] [logstash-2015.04.13][4] allocated on [Node Info of Storage1], but can no longer be allocated on it, moving...
[DEBUG][cluster.routing.allocation] [SSD] [logstash-2015.04.13][4] can't move

I disabled the disk-based shard allocation to make sure that this isn't the problem. So is there a way I can see what denies the shard allocation to the other nodes? Maybe a log of the allocation deciders I can enable? Thanks in advance.
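One way to watch the allocation deciders at work (a sketch, not from the thread; the logger granularity is an assumption): raise the logger for cluster.routing.allocation to TRACE at runtime via the cluster settings API and follow the node logs while the cluster tries to move the shard, then set it back when done.

# Raise allocation-decider logging to TRACE without restarting the nodes.
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": { "logger.cluster.routing.allocation": "TRACE" }
}'

# Set it back once the reason for the refused move has been found.
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": { "logger.cluster.routing.allocation": "INFO" }
}'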
ES searches spanning different logs
Hey guys, we are currently setting up some security-relevant searches on our ES database and came across the following case, which I can't get working on my own: we want to build a query that checks whether an IP address accesses different ports within a given amount of time. What we basically need to do is run a terms aggregation on a field called remote_ip and match the terms against a filter/query like port:XXX AND port:XXY AND port:XXZ, but that query has to span different logs (port:XXX is in log1, port:XXZ is in log2). The query should return all remote_ips that have accessed all 3 ports in the given time window. I really struggle with these cross-log searches because I am not that fit in aggregations yet. Some tips would be really appreciated. Thanks
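A minimal sketch of one way to approach this (assuming the events from all logs live in the same indices and share the remote_ip and port fields; the field names, index pattern, time window and port values are placeholders): run a terms aggregation on remote_ip with one filter sub-aggregation per port, restrict the request to the time window, and keep only the buckets whose three counts are all non-zero on the client side.

curl -XGET 'localhost:9200/logstash-*/_search?pretty' -d '{
  "size": 0,
  "query": {
    "filtered": {
      "filter": { "range": { "@timestamp": { "gte": "now-1h" } } }
    }
  },
  "aggs": {
    "by_ip": {
      "terms": { "field": "remote_ip", "size": 1000 },
      "aggs": {
        "port_xxx": { "filter": { "term": { "port": "XXX" } } },
        "port_xxy": { "filter": { "term": { "port": "XXY" } } },
        "port_xxz": { "filter": { "term": { "port": "XXZ" } } }
      }
    }
  }
}'

A small script (or a watcher on top of the result) would then report every by_ip bucket where all three doc_count values are greater than zero.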
Re: What causes high CPU load on ES-Storage Nodes?
Hi, thanks for this excellent answer, Jörg, I really appreciate it. It is true that our concept for ES is not ideal, nor how the developers designed ES to be used. Still, I try to make the best of the hardware I have available, so I am going to merge all the ES instances that take care of the old indices into one and move it to the new big server (big meaning 64 CPUs and 256 GB RAM in my context), so that it has all the CPU power available for garbage collection and searching. Another idea could be to install a hypervisor on the server and carve it into 4-5 virtual machines to build a real, load-balancing ES cluster - anyone got thoughts on this idea? Thanks so far.

On Wednesday, 28 January 2015 at 10:32:32 UTC+1, Jörg Prante wrote: In most cases, it is a design mistake to place more than one node on a single machine in production. Elasticsearch was not designed for scale-up on big machines (it depends what you mean by big), but for scale-out, i.e. the more nodes you add, the better. While Elasticsearch can benefit from more-powerful hardware, vertical scale has its limits. Real scalability comes from horizontal scale - the ability to add more nodes to the cluster and to spread load and reliability between them. See The Definitive Guide, it is worth a read. http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/distributed-cluster.html http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_scale_horizontally.html Another recommendation of mine is to use comparable hardware equipment for the machines (CPU, RAM, disk capacity and performance, network), for ease of configuration, balancing load, and server maintenance. Jörg
What causes high CPU load on ES-Storage Nodes?
Hey guys, I've been going around this problem for quite a while and didn't get a clear answer to it. Like many of you out there, we are running an ES cluster on a single strong server, moving older indices from the fast SSDs to slow, cheap HDDs (about 25 TB of data). To make this work we have 3 instances of ES running with their data path set to the HDD mountpoint, and a single instance for the indexing/searching SSDs. What makes me wonder all the time is that even though the storage nodes don't do anything (there is no indexing happening, 95% of the time there is no searching happening, they just keep the old indices around), the CPU load caused by these idle nodes is about 50% of the whole CPU time. hot_threads: https://gist.github.com/german23/662732ca4d9dbdcb406b Is it possible that, due to the ES cluster mechanism, all indexes keep getting refreshed all the time whenever a document is indexed or a search is executed? Are there any configuration options to avoid such behaviour? Would it be better to export the old indices to a separate ES cluster and configure multiple ES paths in Kibana? Are there any best practices for maintaining such a cluster? I would appreciate any form of feedback. Thanks
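Two things that can be checked from the command line here (a sketch; the index pattern is an assumption): the hot threads output that was already captured for the gist, and the refresh_interval of the old indices, which can be switched off on indices that no longer receive documents.

# Show the busiest threads per node (this is what produced the gist above).
curl -XGET 'localhost:9200/_nodes/hot_threads?threads=3'

# Turn off periodic refresh on old, no-longer-indexed indices (hypothetical pattern).
curl -XPUT 'localhost:9200/logstash-2014.*/_settings' -d '{
  "index": { "refresh_interval": "-1" }
}'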
Re: What causes high CPU load on ES-Storage Nodes?
Hi, thanks for your response, Mark. It looks like we are getting a second big server for our ELK stack (unfortunately without any more storage, so I really can't create a failover cluster yet), but I wonder what role I should give this server in our system. Would it be good to move the whole long-term storage to this server and let the indexing and most of the Kibana searching happen on the existing server, or are there any better configuration setups? I have read about quite a few people who have a similar setup to ours (with 1 or 2 big machines) and would be happy if they could share their thoughts with us. Thanks guys.

On Tuesday, 27 January 2015 at 22:54:18 UTC+1, Mark Walkom wrote: Indexes are only refreshed when a new document is added. Best practice would be to use multiple machines; if you lose that one you lose your cluster! Without knowing more about your cluster stats, you're probably just reaching the limits of things, and either need less data, more nodes or more heap.
ES keeps deleting shards
Hey folks, after we added an additional ES node yesterday (actually another instance of ES with a different data path on our server), some of the shards that were moved to this node stayed stuck initializing. Judging from the debug logs, it seems they got corrupted during the copy process (about 20 of 140 moved shards have this problem), so I decided it might help to manually assign them to a node using the reroute API.

https://lh6.googleusercontent.com/-qSB3BJT-F5o/VIqjbiphLTI/AC4/9Yno_hajBgM/s1600/es.PNG
https://lh5.googleusercontent.com/-ijpXObA2tms/VIqkw9lfDHI/ADY/Gc3sK_NE0do/s1600/size.PNG

curl -XPOST 'http://localhost:9200/_cluster/reroute?pretty=true' -d '{
  "commands": [
    { "allocate": { "index": "logstash-2014.06.07", "shard": 2, "node": "Storage", "allow_primary": 2 } }
  ]
}'

After I issued the command, ES tried to initialize the shard, but then this suddenly happened:

https://lh6.googleusercontent.com/-RnP051vjz1I/VIqlJQLi4-I/ADg/tC5q9TDr5HQ/s1600/sizeafter.PNG

It looks like ES decided to delete the whole shard due to its (potential) corruption, and with it a third of the data in it. In my opinion ES should never be allowed to delete anything without permission. Any clue what happened in this case? Thanks for feedback
poor performance of full text searches
Hey guys, we are experiencing poor performance when we do full-text searches (searches without specifying a field name). If we search in a 200 GB index that sits on SSDs for something like 'program:apache', the search takes about 10-15 seconds; if we search for '*apache*', the whole search runs for 1.5-2 minutes. What can cause this? Our mapping? The high heap of the SSD ES node (80 GB heap + ~90 GB of system memory available to Lucene)? Is it normal? If you need any information, please let me know. Thanks for any response.
Re: poor performance of full text searches
Thanks for the response. Actually, the need to search with wildcards comes from our mapping. Our events contain a whole lot of URLs, which the default analyzer splits into many terms; this leads to a pretty awkward output when you do a top-10 search in the Kibana table panel. Thanks to a post from another user here in the forum, we adjusted the mapping so that everything gets indexed into one token per field, which lets URLs be displayed correctly but has the disadvantage that we have to use wildcards, because a plain 'apache' no longer returns any events. We are actually using one big server with 256 GB RAM and 64 CPUs, divided into 3 Elasticsearch instances. Given your post, it seems to make sense to adjust the mapping back so that everything is tokenized the default way, eliminating the need for wildcards, and to use not_analyzed fields for URLs. Maybe a somewhat stupid question: if you don't specify any field in the Kibana search, the query is directed by default to the _all field, right? Is there a way of changing that behaviour so that 'apache' is not directed to _all but rather to the message field? That would eliminate the need for the _all field, which would reduce disk space and CPU/RAM load during indexing.

On Friday, 28 November 2014 at 10:56:09 UTC+1, David Pilato wrote: Some comments: Searching for `program:apache` actually searches in field `program`. Searching with wildcards is something you really should not do! When you do a full-text search on Google, I guess you don't use wildcards, right? It's not really user friendly. Wildcards are extremely slow, and even slower when you start with *! Look here for reference: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html#query-dsl-wildcard-query Searching in the _all field is also something you should not do, but it depends on your use case. I'd prefer using the copy_to feature in the mapping instead of the _all field. Now, I think it depends on your infra: how many nodes, RAM, ... you have. -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr | @scrutmydocs
Re: Changing Analyzer behavior for hyphens - suggestions?
I think our solution now is to simply replace all the non-letter characters in Elasticsearch with an underscore:

"char_filter": {
  "replace": {
    "type": "mapping",
    "mappings": [
      "\\.=>_", "\\u2010=>_", "'=>_", "\\:=>_", "\\u0020=>_", "\\u005C=>_",
      "\\u0028=>_", "\\u0029=>_", "\\u0026=>_", "\\u002F=>_", "\\u002D=>_",
      "\\u003F=>_", "\\u003D=>_"
    ]
  }
},

This prevents the terms from being split into useless tokens by the standard analyzer. The downside of this solution is that some URLs or Windows paths now look very ugly to the human eye, e.g. http://g.ceipmsn.com/8SE/411?MI=B9DC2E6D07184453A1EFC4E765A16D30-0LV=3.0.131.0OS=6.1.7601AG=1217 becomes http___g_ceipmsn_com_8se_411_mi_b9dc2e6d07184453a1efc4e765a16d30_0_lv_3_0_131_0_os_6_1_7601_ag_1217. The good thing compared to not_analyzed is that if I search for url:*8se*, the search returns the events with this URL in them. I think this is not a perfect solution, but rather a good workaround until Lucene gives us better analyzer types to work with. Thanks for sharing your experience so far! Cheers

On Thursday, 20 November 2014 at 20:07:27 UTC+1, Jörg Prante wrote: The whitespace tokenizer has the problem that punctuation is not ignored. I find the word_delimiter filter not working at all with whitespace, only with the keyword tokenizer, with massive pattern matching which is complex and expensive :( Therefore I took the classic tokenizer and generalized the hyphen rules in the grammar. The tokenizer hyphen and filter hyphen are two routines. The tokenizer hyphen keeps hyphenated words together and handles punctuation correctly. The filter hyphen adds combinations to the original form. The main point is to add combinations of dehyphenated forms so they can be searched. Single words are only taken into account when the word is positioned at the edge. For example, the phrase der-die-das should be indexed in the following forms: der-die-das, derdiedas, das, derdie, derdie-das, die-das, der. Jörg

On Thursday, 20 November 2014 at 09:29, horst knete wrote: So the term this-is-a-test gets tokenized into this-is-a-test, which is nice behaviour, but in order to do a full-text search on this field it should get tokenized into this-is-a-test, this, is, a and test, as I wrote before.
Re: Changing Analyzer behavior for hyphens - suggestions?
Hi, thanks for the response and this awesome plugin bundle (especially for me as a German). Unfortunately the hyphen analyzer plugin didn't do the job the way I wanted. The hyphen analyzer does something similar to the whitespace analyzer - it just doesn't split on hyphens and instead treats them as ALPHANUM characters (at least that is what I think right now). So the term this-is-a-test gets tokenized into this-is-a-test, which is nice behaviour, but in order to do a full-text search on this field it should get tokenized into this-is-a-test, this, is, a and test, as I wrote before. I think maybe abusing the word_delimiter token filter could do the job, because there is an option preserve_original. Unfortunately, if you adjust the filter like this:

PUT /logstash-2014.11.20
{
  "index": {
    "analysis": {
      "analyzer": {
        "wordtest": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "word"]
        }
      },
      "filter": {
        "word": {
          "type": "word_delimiter",
          "generate_word_parts": false,
          "generate_number_parts": false,
          "catenate_words": false,
          "catenate_numbers": false,
          "catenate_all": false,
          "split_on_case_change": false,
          "preserve_original": true,
          "split_on_numerics": false,
          "stem_english_possessive": true
        }
      }
    }
  }
}

and run an analyze test:

curl -XGET 'localhost:9200/logstash-2014.11.20/_analyze?filters=word' -d 'this-is-a-test'

the response is this:

{"tokens":[{"token":"this","start_offset":0,"end_offset":4,"type":"ALPHANUM","position":1},{"token":"is","start_offset":5,"end_offset":7,"type":"ALPHANUM","position":2},{"token":"a","start_offset":8,"end_offset":9,"type":"ALPHANUM","position":3},{"token":"test","start_offset":10,"end_offset":14,"type":"ALPHANUM","position":4}]}

which says it tokenized everything except the original term, which makes me wonder whether the preserve_original setting is working at all. Any idea on this?

On Wednesday, 19 November 2014 at 18:26:09 UTC+1, Jörg Prante wrote: You search for a hyphen-aware tokenizer, like this? https://gist.github.com/jprante/cd120eac542ba6eec965 It is in my plugin bundle https://github.com/jprante/elasticsearch-plugin-bundle Jörg
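One possible explanation for the test above (an assumption, not confirmed in the thread): the _analyze call does not name a tokenizer, so the default standard tokenizer splits this-is-a-test on the hyphens before the word_delimiter filter ever runs, and there is no original hyphenated token left for preserve_original to keep. Re-running the test with the whitespace tokenizer, or directly against the custom analyzer defined on the index, should show the real behaviour of the chain:

# Exercise the same chain as the "wordtest" analyzer (whitespace tokenizer + filters).
curl -XGET 'localhost:9200/logstash-2014.11.20/_analyze?tokenizer=whitespace&filters=lowercase,word' -d 'this-is-a-test'

# Or analyze with the custom analyzer by name, since it is defined on this index.
curl -XGET 'localhost:9200/logstash-2014.11.20/_analyze?analyzer=wordtest' -d 'this-is-a-test'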
Changing Analyzer behavior for hyphens - suggestions?
Hey guys, after working with the ELK stack for a while now, we still have a very annoying problem with the behaviour of the standard analyzer - it splits terms into tokens using hyphens or dots as delimiters. E.g. logsource:firewall-physical-management gets split into firewall, physical and management. On one side that's cool, because if you search for logsource:firewall you get all the events with firewall as a token in the field logsource. The downside of this behaviour is that if you do e.g. a top-10 search on a field in Kibana, each token is counted as a whole term and ranked by its count:

top 10: 1. firewall: 10, 2. physical: 10, 3. management: 10

instead of

top 10: 1. firewall-physical-management: 10

In the standard logstash mapping this is solved with a .raw field that is not_analyzed, but the downside of that is you have 2 fields instead of one (even if it is a multi_field) and the usability for Kibana users is not that great. So what we need is that logsource:firewall-physical-management gets tokenized into firewall-physical-management, firewall, physical and management. I tried this using the word_delimiter token filter with the following mapping:

"analysis": {
  "analyzer": {
    "my_analyzer": {
      "type": "custom",
      "tokenizer": "whitespace",
      "filter": ["lowercase", "asciifolding", "my_worddelimiter"]
    }
  },
  "filter": {
    "my_worddelimiter": {
      "type": "word_delimiter",
      "generate_word_parts": false,
      "generate_number_parts": false,
      "catenate_words": false,
      "catenate_numbers": false,
      "catenate_all": false,
      "split_on_case_change": false,
      "preserve_original": true,
      "split_on_numerics": false,
      "stem_english_possessive": true
    }
  }
}

But this unfortunately didn't do the job. I saw in my research that some other people have a similar problem, but apart from some replacement suggestions, no real solution was found. If anyone has ideas on how to start working on this, I would be very happy. Thanks.
Re: Index reallocation takes hours
Hi, I want to use this thread to clarify some of the Elasticsearch settings, because I am struggling a bit to understand all of them.

1. indices.store.throttle.max_bytes_per_sec: this is the maximum amount of MB/GB per second that Elasticsearch uses for the merging process, right? (Is there a difference between indices.store.throttle.max_bytes_per_sec and index.store.throttle.max_bytes_per_sec?)
2. indices.recovery.max_bytes_per_sec: this is the maximum amount of MB/GB per second that Elasticsearch uses to initialize shards/indices after a restart of the node, right?
3. indices.recovery.concurrent_streams: the number of parallel shard initializations after a restart, right?
4. cluster.routing.allocation.node_concurrent_recoveries: the cluster-wide number of parallel shard initializations after a restart, right?
5. cluster.routing.allocation.node_initial_primaries_recoveries: the cluster-wide number of parallel primary shard initializations after a restart, right?
6. cluster.routing.allocation.cluster_concurrent_rebalance: 2: the number of shards that are allowed to move in parallel from one node to another, right?

But among all the settings found in Elasticsearch there is no setting like cluster.routing.allocation.max_bytes_per_sec that limits the maximum amount of MB/GB per second that can be used to relocate a shard from one node to another - will Elasticsearch use a default value or just the maximum possible speed? Thanks for the response. (A sketch of tuning these throttles at runtime follows at the end of this post.)

On Wednesday, 8 October 2014 at 11:04:26 UTC+2, Andrew Lakes wrote: Hi, we are using version 1.3.2 of ES; each index has 4 shards. I will try adjusting the store throttling to an acceptable number without crushing our disk IOs.

On Tuesday, 7 October 2014 at 19:35:35 UTC+2, Ivan Brusic wrote: Currently shard allocation is throttled by default with a low target value. If you are using an older version of Elasticsearch, then a Lucene bug will make it even slower. http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-store.html#store-throttling How many shards do your indices have? The default number of concurrent allocations is only 2. The ideal values depend on your hardware, especially your disk IO. Cheers, Ivan

On Oct 7, 2014 at 8:42 AM, Andrew Lakes wrote: Hey guys, we have set up a central ES server with 2 instances (1 instance with SSDs for indexing/recent indices and another one for long-term storage). Every day at 6 a.m. a cronjob is started that moves indices older than 7 days to the HDD storage node. Our indices have an average size of 300 GB, but it takes about 6-7 hours until the relocation of 1 index is done (if you copy it manually with cp it only takes about 30 minutes). Are there any settings I can adjust to speed up this relocation process?
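A sketch of how these throttles can be raised at runtime via the cluster settings API (the values are placeholders, not recommendations). As far as I know, relocating an existing shard copy goes through the same recovery machinery as a restart recovery, so indices.recovery.max_bytes_per_sec is the closest thing to a relocation bandwidth limit; treat that as an assumption rather than a documented guarantee.

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "indices.store.throttle.max_bytes_per_sec": "100mb",
    "indices.recovery.max_bytes_per_sec": "100mb",
    "cluster.routing.allocation.cluster_concurrent_rebalance": 4
  }
}'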
Re: Any experience with ES and Data Compressing Filesystems?
Hi again, a quick report regarding compression: we are now using a 3 TB btrfs volume with a 32k block size, which reduced the amount of data from 3.2 TB to 1.1 TB without any significant performance loss (we are using an 8-CPU, 20 GB memory machine with an iSCSI link to the volume). So I can only recommend using a btrfs volume for long-term storage.

On Monday, 21 July 2014 at 08:48:12 UTC+2, Patrick Proniewski wrote: Hi, gzip/zlib compression is very bad for performance, so it can be interesting for closed indices, but for live data I would not recommend it. Also, you must know that compression using lz4 is already enabled in the indices, and ES/Lucene/Java usually read/write 4k blocks - hence, compression is achieved on 4k blocks. If your filesystem uses 4k blocks and you add FS compression, you will probably have a very small gain, if any. I've tried on ZFS:

Filesystem      Size   Used   Avail  Capacity  Mounted on
zdata/ES-lz4    1.1T   1.9G   1.1T   0%        /zdata/ES-lz4
zdata/ES        1.1T   1.9G   1.1T   0%        /zdata/ES

If you are using a larger block size, like 128k, a compressed filesystem does show some benefit:

Filesystem      Size   Used   Avail  Capacity  Mounted on
zdata/ES-lz4    1.1T   1.1G   1.1T   0%        /zdata/ES-lz4   (compressratio 1.73x)
zdata/ES-gzip   1.1T   901M   1.1T   0%        /zdata/ES-gzip  (compressratio 2.27x)
zdata/ES        1.1T   1.9G   1.1T   0%        /zdata/ES

But a filesystem block larger than 4k is very suboptimal for IO (ES reads or writes one 4k block, while your FS must read or write a 128k block).
Re: Any experience with ES and Data Compressing Filesystems?
We are indexing all sorts of events (Windows, Linux, Apache, NetFlow and so on), and the impact is measured by the speed of the Kibana GUI / how long it takes to load 7 or 14 days of data. That is what matters to my colleagues.

On Monday, 4 August 2014 at 10:52:25 UTC+2, Mark Walkom wrote: What sort of data are you indexing? When you said the performance impact was minimal, how minimal, and at what points are you seeing it? Regards, Mark Walkom, Infrastructure Engineer, Campaign Monitor, web: www.campaignmonitor.com
Re: Any experience with ES and Data Compressing Filesystems?
Hey guys, we mounted a btrfs filesystem with the zlib compression method for testing purposes on our Elasticsearch server and copied one of the indices onto the btrfs volume; unfortunately it had no effect and the index still has a size of 50 GB :/ I will try other compression methods and report back here.

On Saturday, 19 July 2014 at 07:21:20 UTC+2, Otis Gospodnetic wrote: Hi Horst, I wouldn't bother with this for the reasons Joerg mentioned, but should you try it anyway, I'd love to hear your findings/observations. Otis -- Performance Monitoring * Log Analytics * Search Analytics, Solr & Elasticsearch Support * http://sematext.com/
Recommended File System Settings for Elasticsearch
Hey guys, we are going to expand our development ES system into a production environment, and we have fresh new storage waiting to be formatted. In order to get the best performance for Elasticsearch, we'd like to know what the recommended settings are. Does Elasticsearch read/write sequentially, or is there a lot of random access on the indices? Which block size would you recommend? (I've heard Elasticsearch reads/writes in 4k blocks, right?) Thanks for the response.
Any experience with ES and Data Compressing Filesystems?
Hey guys, to save a lot of hard disk space we are going to use a compressing filesystem, which gives us transparent compression for the ES indices. (It seems that ES indices compress very well; we got up to a 65% compression rate in some tests.) Currently the indices sit on an ext4 Linux filesystem, which unfortunately does not offer transparent compression. Has anyone here had experience with compressing filesystems like btrfs or ZFS/OpenZFS, and can you tell us whether this led to big performance losses? Thanks for responding.
Re: Disabling _all-Field but keep Netflow-Events searchable
Has anyone got an idea how to realize this? I think there are quite a few users who feed NetFlow AND other types of events into Elasticsearch, and for those a disabled _all field would save a lot of hard disk space.
Re: Disabling _all-Field but keep Netflow-Events searchable
Hey, thanks for your response. As I already mentioned, I tried setting index.query.default_field to message:

{
  "template": "logstash-*",
  "settings": {
    "index.refresh_interval": "30s",
    "index.number_of_shards": 3,
    "index.number_of_replicas": 0,
    "index.store.compress.stored": true,
    "index.store.compress.tv": true,
    "index.query.default_field": "message",
    "analysis": {
      "analyzer": {
        "default": { "type": "standard", "stopwords": "_none_" }
      }
    }
  },
  "mappings": {
    "_default_": {
      "_all": { "enabled": false },
      "_source": { "compress": true },
      "dynamic_templates": [
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "fields": {
                "{name}": { "type": "string", "index": "not_analyzed" }
              }
            }
          }
        }
      ],
      "properties": {
        "@version": { "type": "string", "index": "not_analyzed" },
        "@timestamp": { "type": "date", "index": "not_analyzed" },
        "tags": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}

That is the template I used. It works fine for all events except the NetFlow ones, because they don't have a message field for Kibana to search in. That is my problem. Is it possible to adjust the template/mapping per type of event? (See the sketch after this post.) Cheers

On Monday, 30 June 2014 at 09:08:22 UTC+2, Alexander Reelsen wrote: Hey, you can set index.query.default_field in the mapping to circumvent this, see http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-all-field.html#mapping-all-field --Alex
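Mappings in a template are defined per document type, so one hedged option (assuming the NetFlow events arrive with a type named netflow; that name is a placeholder) would be to keep _all disabled in _default_ but re-enable it only for that type, so unscoped Kibana queries still have something to hit for NetFlow events while every other type stays without _all:

"mappings": {
  "_default_": {
    "_all": { "enabled": false }
  },
  "netflow": {
    "_all": { "enabled": true }
  }
}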
Disabling _all-Field but keep Netflow-Events searchable
Hey guys, I really want to disable the _all field in the ES indices to save some disk space on our system. Normally that's not a problem: adjust the template in ES and set the message field as the new default query field, since it is normally present in every event. The problem is that we also have many NetFlow events from the netflow codec, which have the following form: https://lh4.googleusercontent.com/-CDQQs5e5a7o/U6lUvjikncI/ACo/LHpMXlYLMWw/s1600/netflow.PNG As you might notice, there isn't any message field, so the Kibana Lucene query would run into an error. My question is: how do I make this work (disabling the _all field but keeping the NetFlow events searchable)? Thanks for the response.
Re: Using Jetty plugin
Hi, do you want to restrict access to the physical location (e.g. /opt/elasticsearch/data/elasticsearch/0/indices/index-name), or do you want to restrict access to certain types of logs? For the second option you could look at this thread, https://groups.google.com/forum/#!topic/logstash-users/LFmNPv3z6YE, where I explained how to use kibana3_auth for this purpose.

On Tuesday, 10 June 2014 at 12:30:23 UTC+2, Gaurav Kamaljith wrote: Hi, is it possible to restrict Kibana users from accessing particular indices/sub-index folders using the Jetty plugin? Regards, Gaurav
Re: alerts from Kibana/ES
Hi NF, we also set up alerting with our Zabbix monitoring system. We use simple Linux scripts that use curl to search the relevant Elasticsearch indices. In Zabbix we built triggers that run these scripts on our Elasticsearch server and interpret the output (e.g. the number of events with ID 4625); if that value crosses a threshold, the trigger alerts. It's simple to set up, and maybe this is what you are looking for (see the sketch after this post for the idea). If you need any help, feel free to contact me.

On Friday, 30 May 2014 at 08:31:07 UTC+2, NF wrote: That's right, Otis.

On Friday, 30 May 2014 at 07:20:27 UTC+2, Otis Gospodnetic wrote: Hi, there's no alerting in Kibana. Have a look at SPM http://sematext.com/spm/ - it has ES monitoring, threshold and heartbeat alerting, anomaly detection, and a number of other features. Actually, re-reading your email - you are looking to get notified when a certain event is captured? By that do you mean having something like a saved query that matches incoming logs? Otis -- Performance Monitoring * Log Analytics * Search Analytics, Solr & Elasticsearch Support * http://sematext.com/

On Tuesday, 27 May 2014 at 05:02:35 UTC-4, NF wrote: Hi, we're using Kibana/Elasticsearch to visualize different kinds of logs in our company. Now we need a feature that allows us to send an alert/notification (email or other) when a certain event/trigger is captured. I'd like to know whether such a feature is planned in the Kibana/Elasticsearch backlog. If so, when might we expect it to be available? If not, could you please suggest any (open source) solution to satisfy our need? Thanks, Natalia
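A minimal sketch of the kind of check script described above (the index pattern, field name, event ID and time window are assumptions; the monitoring system simply reads the number the script prints and compares it against a threshold):

#!/bin/bash
# Count events with a given ID in today's logstash index over the last 5 minutes
# and print only the number, so Zabbix (or any monitor) can evaluate it.
INDEX="logstash-$(date +%Y.%m.%d)"
curl -s -XGET "localhost:9200/${INDEX}/_count" -d '{
  "query": {
    "filtered": {
      "query":  { "term":  { "event_id": "4625" } },
      "filter": { "range": { "@timestamp": { "gte": "now-5m" } } }
    }
  }
}' | sed -e 's/.*"count":\([0-9]*\).*/\1/'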
Store Elasticsearch Indices in a revision/audit-proof way
Hey guys, in order to meet the German legal requirements for logging, I have been tasked with storing the Elasticsearch indices in a revision-proof/audit-proof way (indices must not be editable or changeable once stored). Are there any best practices or tips for doing such a thing (maybe any plugins)? Thanks for your feedback.
Re: Store Elasticsearch Indices in a revision/audit-proof way
Yeah, it looks like that would do the job, thanks for the response.

On Thursday, 22 May 2014 at 10:40:19 UTC+2, Mark Walkom wrote: You can set indexes to read-only - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html Is that what you're after? Regards, Mark Walkom, Infrastructure Engineer, Campaign Monitor, web: www.campaignmonitor.com
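For reference, a sketch of what the linked settings update looks like for one daily index (the index name is a placeholder; this blocks writes and metadata changes at the ES level, it is not a cryptographic audit trail):

curl -XPUT 'localhost:9200/logstash-2014.05.21/_settings' -d '{
  "index": { "blocks.read_only": true }
}'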
Re: Store Elasticsearch Indices in a revision/audit-proof way
Hi Jörg, thanks for your offer. I will contact you if there is a need for such a plugin in our company. I will also keep you up to date on any relevant changes in our project.

On Thursday, 22 May 2014 at 10:55:44 UTC+2, Jörg Prante wrote: You have to add a facility to your middleware that can trace all authorized operations on your index (access, read, write, modify, delete), and you must write this to an append-only logfile with timestamps. If there is interest I could write such a plugin (assuming it can run in a trusted environment regarding authorization tokens), but I think the best place is in a middleware (where an ES client runs in a broader application context, e.g. with transaction awareness). Jörg
Does Elasticsearch support multiple non-RAID-0 data paths?
Hello, we are currently running Elasticsearch on a single node and receive about 20 million logs per day (40 GB daily indices). Since this is a lot to handle, the indices take up a lot of disk space on our server. What we would like to implement:
- 1 data directory stored on our SSDs, containing the indices of the last 7 days, for quick access.
- 1 data directory stored on normal HDDs, containing the indices of the last 3 months, for normal-speed access.
- 1 data directory stored on slow 5400 rpm HDDs, containing the indices of the last 2 years, for access if needed.
It is no problem to give ES multiple data paths, but if you do, ES stripes (RAID 0) the indices across all 3 data directories, and that is not what we want. We want to copy the indices with a script to the matching directories (an index older than 8 days gets automatically moved to the normal HDDs, and so on). Is there any way to make this work? (A hedged sketch of one alternative follows below.) Thanks for your feedback.
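One alternative that later threads in this archive also ended up with (a sketch, not from this post; the tags, paths and index name are placeholders): run one ES instance per storage tier on the same machine, give each instance its own path.data and a node tag, and pin an index to a tier with an allocation filter instead of copying files by hand.

# elasticsearch.yml of the instance that owns the SSD tier (hypothetical):
#   node.tag: ssd
#   path.data: /data/ssd
# elasticsearch.yml of the instance that owns the slow-HDD tier (hypothetical):
#   node.tag: archive
#   path.data: /data/hdd_slow

# A nightly script then retags old indices instead of moving files;
# ES relocates the shards to the matching instance on its own.
curl -XPUT 'localhost:9200/logstash-2014.05.10/_settings' -d '{
  "index.routing.allocation.require.tag": "archive"
}'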
Re: Compress Elasticsearch Index / Disk Usage
Hi Adrien, thanks for the response. Curator is a very helpful tool, thanks for the link! I tried it and saved about 1 GB of disk space (24 GB down to 23 GB); I think the effect increases with more indices.

On Wednesday, March 19, 2014 at 5:42:56 PM UTC+1, Adrien Grand wrote: Hi, this setting doesn't exist anymore, as indices are now always compressed. One thing that might help gain back some disk space (not much, don't expect too much from it) is to run an _optimize [1] call on this index. You can use curator [2] to automate this operation. [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-optimize.html [2] https://github.com/elasticsearch/curator

On Wed, Mar 19, 2014 at 1:10 PM, horst knete badun...@hotmail.de wrote: Hey guys, I just wanted to ask if there is a possibility to decrease the disk space usage of an Elasticsearch index. For example, I have an index called logstash-2014.03.17 with a size of 547 MB and a total of 1018761 indexed documents/logs, and I want to reduce its disk usage. According to this article I found, https://github.com/elasticsearch/elasticsearch/issues/2037, all I have to do is set index.store.compress.stored to true. I did this with
curl -XPOST 'localhost:9200/logstash-2014.03.17/_close'
curl -XPUT 'localhost:9200/logstash-2014.03.17/_settings' -d '{ "index" : { "index.store.compress.stored" : true } }'
curl -XPOST 'localhost:9200/logstash-2014.03.17/_open'
Then I optimized the index with curl -XPOST 'http://localhost:9200/logstash-2014.03.17/_optimize' and restarted the cluster. Unfortunately nothing happened and it still uses 547 MB of disk space. Have I done something wrong? Are there other ways to decrease the disk space usage? Thanks for the response.
-- Adrien Grand
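As a side note for readers of this thread: on its own, _optimize may not shrink an index much unless segments are actually merged down. A hedged sketch, assuming the index is no longer being written to, is to pass max_num_segments=1 explicitly (an operation curator can also automate); the host and index name are simply the values from the thread.

import requests

# Force-merge the read-only daily index down to a single segment.
resp = requests.post(
    "http://localhost:9200/logstash-2014.03.17/_optimize",
    params={"max_num_segments": 1},
)
resp.raise_for_status()
print(resp.json())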
Compress Elasticsearch Index / Disk Usage
Hey guys, I just wanted to ask if there is a possibility to decrease the disk space usage of an Elasticsearch index. For example, I have an index called logstash-2014.03.17 with a size of 547 MB and a total of 1018761 indexed documents/logs, and I want to reduce its disk usage. According to this article I found, https://github.com/elasticsearch/elasticsearch/issues/2037, all I have to do is set index.store.compress.stored to true. I did this with
curl -XPOST 'localhost:9200/logstash-2014.03.17/_close'
curl -XPUT 'localhost:9200/logstash-2014.03.17/_settings' -d '{ "index" : { "index.store.compress.stored" : true } }'
curl -XPOST 'localhost:9200/logstash-2014.03.17/_open'
Then I optimized the index with curl -XPOST 'http://localhost:9200/logstash-2014.03.17/_optimize' and restarted the cluster. Unfortunately nothing happened and it still uses 547 MB of disk space. Have I done something wrong? Are there other ways to decrease the disk space usage? Thanks for the response.
Re: Kibana3 not displaying the terms panel
After hours of debugging with Wireshark, Firebug, etc., I found out that the problem is that when you load the default page, all panels get cached in the user's browser except the terms panel (I think it is not declared in Kibana's require.js/HTML as something to cache). If you then load a dashboard with a terms panel, all the data from ES is sent to the client except the data for the terms panel, which doesn't get displayed unless you refresh the page. My workaround was to add a terms panel to the default dashboard so that the client is forced to load the data for the terms panel. Hopefully the Kibana developers will fix this bug.

On Monday, February 17, 2014 at 1:56:57 PM UTC+1, horst knete wrote: Hey guys, I've set up a syslog system with Kibana3 (Milestone 4), ES and Logstash, and everything worked fine. But now I have a small problem: if I load a dashboard in Kibana that uses terms panels, they don't get displayed. https://lh6.googleusercontent.com/-_wCbI8-pZFs/UwIGDvLRrpI/AA8/ZeszJbKpABM/s1600/screen.PNG As you can see, only the table panels are displayed but not the terms panel (it displays the row but not the panels). I am using the Kibana from https://github.com/christian-marie/kibana3_auth, which is pretty much the same as the original Milestone 4 but with an authentication page. Has anyone encountered the same problem, or do you have a solution? Thanks for your help. Cheers
Kibana3 not displaying the terms panel
Hey guys, I've set up a syslog system with Kibana3 (Milestone 4), ES and Logstash, and everything worked fine. But now I have a small problem: if I load a dashboard in Kibana that uses terms panels, they don't get displayed. https://lh6.googleusercontent.com/-_wCbI8-pZFs/UwIGDvLRrpI/AA8/ZeszJbKpABM/s1600/screen.PNG As you can see, only the table panels are displayed but not the terms panel (it displays the row but not the panels). I am using the Kibana from https://github.com/christian-marie/kibana3_auth, which is pretty much the same as the original Milestone 4 but with an authentication page. Has anyone encountered the same problem, or do you have a solution? Thanks for your help. Cheers