Shards do not get distributed across the cluster

2015-04-20 Thread horst knete
Hey guys,

I am currently struggling with the shard allocation in my ES cluster.

The problem is that the index [logstash-2015.04.13] lives on the node
Storage1 with 12 shards. I want the index to be distributed evenly across
the 2 other Storage nodes; the node with the SSDs in it should be excluded
from allocation.

The settings of the index are:

settings for logstash-2015.04.13

{
  "index" : {
    "routing" : {
      "allocation" : {
        "include" : {
          "tag" : ""
        },
        "require" : {
          "tag" : ""
        },
        "exclude" : {
          "tag" : "SSD"
        },
        "total_shards_per_node" : 4
      }
    }
  }
}

Unfortunately I get the following log entries from ES:

[2015-04-20 08:01:55,070][DEBUG][cluster.routing.allocation] [SSD] 
[logstash-2015.04.13][4] allocated on [Node Info of Storage1], but can no 
longer be allocated on it, moving...
[DEBUG][cluster.routing.allocation] [SSD] [logstash-2015.04.13][4] can't move

I disabled disk-based shard allocation to make sure that this isn't the
problem.

So, is there a way to see what is denying the shard allocation to the
other nodes? Maybe a log level for the allocation deciders that I can
enable?
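For reference, a minimal sketch of turning up the allocation-decider
logging at runtime via the cluster settings API (my assumption for an ES
1.x cluster; the exact logger name can vary between versions):

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : {
    "logger.cluster.routing.allocation.decider" : "TRACE"
  }
}'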

Thanks in advance.





ES comprehensive searches over different logs

2015-03-06 Thread horst knete
Hey guys,

we are currently setting up some security-relevant searches in our ES
database and came across the following case, which I could not manage on
my own:

We want to build a query that checks whether an IP address is accessing
different ports within a given amount of time.

So what we basically need to do is run a terms aggregation on a field
called remote_ip and match the terms with a filter/query like port:XXX
AND port:XXY AND port:XXZ, but the query must span different logs
(port:XXX is in log1, port:XXZ is in log2).

So the query should return all remote_ips that have accessed all 3 ports
in the given time (see the sketch below).
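One way this could be approached (a sketch of my own, not from the thread;
the index pattern, field names and time window are assumptions) is a terms
aggregation on remote_ip with one filter sub-aggregation per port. ES 1.x
cannot filter buckets server-side, so the last step - keeping only the IPs
whose three counts are all greater than zero - has to happen in the client:

curl -XGET 'localhost:9200/logstash-*/_search?pretty' -d '{
  "size" : 0,
  "query" : {
    "filtered" : {
      "filter" : { "range" : { "@timestamp" : { "gte" : "now-1h" } } }
    }
  },
  "aggs" : {
    "by_ip" : {
      "terms" : { "field" : "remote_ip", "size" : 0 },
      "aggs" : {
        "port_xxx" : { "filter" : { "term" : { "port" : "XXX" } } },
        "port_xxy" : { "filter" : { "term" : { "port" : "XXY" } } },
        "port_xxz" : { "filter" : { "term" : { "port" : "XXZ" } } }
      }
    }
  }
}'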

I really struggle with these cross-log searches, since I am not that fit
with aggregations yet.

Some tips would be really appreciated.

Thanks



Re: What causes high CPU load on ES-Storage Nodes?

2015-01-28 Thread horst knete
Hi,

thanks for this excellent answer, Jörg, I really appreciate it.

It is true that our ES concept is not ideal, nor how the developers
designed ES to be used.

Nevertheless, I try to make the best of the hardware I have available, so
I am going to merge all the ES instances that take care of the old indices
into one and move it to the new big server (big meaning 64 CPUs and 256 GB
RAM in my context), so that it has all the CPU power available for garbage
collection and searching.

Another idea would be to install a hypervisor on the server, split it into
4-5 virtual machines, and build a real, load-balancing ES cluster - has
anyone got thoughts on this idea?

Thanks so far, 

On Wednesday, 28 January 2015 at 10:32:32 UTC+1, Jörg Prante wrote:

 In most cases, it is a design mistake to place more than one node on a 
 single machine in production.

 Elasticsearch was not designed for scale-up on big machines (it depends 
 what you mean by big), but for scale-out, i.e. the more nodes you add, 
 the better.

 While Elasticsearch can benefit from more-powerful hardware, vertical 
 scale has its limits. Real scalability comes from horizontal scale—the 
 ability to add more nodes to the cluster and to spread load and reliability 
 between them.

 See The Definitive Guide, it is worth a read.


 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/distributed-cluster.html


 http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_scale_horizontally.html

 Another recommendation of mine is to use comparable hardware equipment for 
 the machines (CPU, RAM, Disk capacity and performance, network), for ease 
 of configuration, balancing load, and server maintenance.

 Jörg


 On Wed, Jan 28, 2015 at 7:32 AM, horst knete badun...@hotmail.de wrote:

 Hi,

 thanks for your response, Mark.

 It looks like we are getting a second big server for our ELK stack
 (unfortunately without any more storage, so I really can't create a
 failover cluster yet), but I wonder what role I should give this server
 in our system.

 Would it be good to move the whole long-term storage to this server and
 let the indexing and most of the Kibana searching happen on the existing
 server, or are there any good configuration setups?

 I have read about quite a few people who have a similar setup to ours
 (with 1 or 2 big machines) and would be happy if they could share their
 thoughts with us!

 Thanks guys.

 On Tuesday, 27 January 2015 at 22:54:18 UTC+1, Mark Walkom wrote:

 Indexes are only refreshed when a new document is added.
 Best practice would be to use multiple machines; if you lose that one
 you lose your cluster!

 Without knowing more about your cluster stats, you're probably just 
 reaching the limits of things, and either need less data, more nodes or 
 more heap.

 On 28 January 2015 at 02:14, horst knete badun...@hotmail.de wrote:

 Hey guys,

 I've been around this problem for quite a while and didn't get a clear
 answer to it.

 Like many of you guys out there, we are running an ES cluster on a
 single strong server, moving older indices from the fast SSDs to slow,
 cheap HDDs (about 25 TB of data).

 To make this work, we have 3 instances of ES running with the index path
 set to the HDDs' mountpoint and 1 single instance for the
 indexing/searching SSDs.

 What makes me wonder all the time is that even though the storage nodes
 don't do anything (there is no indexing happening, 95% of the time there
 is no searching happening; they just keep the old indices available),
 the CPU load caused by these idle nodes is about 50% of the whole CPU
 working time.

 hot_threads: https://gist.github.com/german23/662732ca4d9dbdcb406b


 Is it possible that, due to the ES cluster mechanism, all the indexes
 keep getting refreshed all the time when a document is indexed or a
 search is executed?

 Are there any configuration options to avoid such behavior?

 Would it be better to export the old indices to a separate ES cluster
 and configure multiple ES paths in Kibana?

 Are there any best practices to maintain such a cluster?

 I would appreciate any form of feedback.

 Thanks


What causes high CPU load on ES-Storage Nodes?

2015-01-27 Thread horst knete
Hey guys,

I've been around this problem for quite a while and didn't get a clear
answer to it.

Like many of you guys out there, we are running an ES cluster on a single
strong server, moving older indices from the fast SSDs to slow, cheap HDDs
(about 25 TB of data).

To make this work, we have 3 instances of ES running with the index path
set to the HDDs' mountpoint and 1 single instance for the
indexing/searching SSDs.

What makes me wonder all the time is that even though the storage nodes
don't do anything (there is no indexing happening, 95% of the time there
is no searching happening; they just keep the old indices available), the
CPU load caused by these idle nodes is about 50% of the whole CPU working
time.

hot_threads: https://gist.github.com/german23/662732ca4d9dbdcb406b


Is it possible that, due to the ES cluster mechanism, all the indexes keep
getting refreshed all the time when a document is indexed or a search is
executed?

Are there any configuration options to avoid such behavior?

Would it be better to export the old indices to a separate ES cluster and
configure multiple ES paths in Kibana?

Are there any best practices to maintain such a cluster?

I would appreciate any form of feedback.

Thanks



Re: What causes high CPU load on ES-Storage Nodes?

2015-01-27 Thread horst knete
Hi,

thanks for your response, Mark.

It looks like we are getting a second big server for our ELK stack
(unfortunately without any more storage, so I really can't create a
failover cluster yet), but I wonder what role I should give this server in
our system.

Would it be good to move the whole long-term storage to this server and
let the indexing and most of the Kibana searching happen on the existing
server, or are there any good configuration setups?

I have read about quite a few people who have a similar setup to ours
(with 1 or 2 big machines) and would be happy if they could share their
thoughts with us!

Thanks guys.

On Tuesday, 27 January 2015 at 22:54:18 UTC+1, Mark Walkom wrote:

 Indexes are only refreshed when a new document is added.
 Best practice would be to use multiple machines; if you lose that one
 you lose your cluster!

 Without knowing more about your cluster stats, you're probably just 
 reaching the limits of things, and either need less data, more nodes or 
 more heap.

 On 28 January 2015 at 02:14, horst knete badun...@hotmail.de wrote:

 Hey guys,

 I've been around this problem for quite a while and didn't get a clear
 answer to it.

 Like many of you guys out there, we are running an ES cluster on a
 single strong server, moving older indices from the fast SSDs to slow,
 cheap HDDs (about 25 TB of data).

 To make this work, we have 3 instances of ES running with the index path
 set to the HDDs' mountpoint and 1 single instance for the
 indexing/searching SSDs.

 What makes me wonder all the time is that even though the storage nodes
 don't do anything (there is no indexing happening, 95% of the time there
 is no searching happening; they just keep the old indices available),
 the CPU load caused by these idle nodes is about 50% of the whole CPU
 working time.

 hot_threads: https://gist.github.com/german23/662732ca4d9dbdcb406b


 Is it possible that, due to the ES cluster mechanism, all the indexes
 keep getting refreshed all the time when a document is indexed or a
 search is executed?

 Are there any configuration options to avoid such behavior?

 Would it be better to export the old indices to a separate ES cluster
 and configure multiple ES paths in Kibana?

 Are there any best practices to maintain such a cluster?

 I would appreciate any form of feedback.

 Thanks



ES keeps deleting shards

2014-12-12 Thread horst knete
hey folks,

as we added an additional ES node yesterday (actually another instance of
ES with a different data path on our server), some of the shards that were
moved to this node keep getting stuck initializing.

According to the debug logs, it seems like they got corrupted during the
copy process (about 20 of the 140 moved shards have this problem...), so I
decided it could help to manually assign them to a node using the
reroute API.

https://lh6.googleusercontent.com/-qSB3BJT-F5o/VIqjbiphLTI/AC4/9Yno_hajBgM/s1600/es.PNG
 
https://lh5.googleusercontent.com/-ijpXObA2tms/VIqkw9lfDHI/ADY/Gc3sK_NE0do/s1600/size.PNG

curl -XPOST 'http://localhost:9200/_cluster/reroute?pretty=true' -d '{
  "commands" : [ {
    "allocate" : {
      "index" : "logstash-2014.06.07",
      "shard" : 2,
      "node" : "Storage",
      "allow_primary" : true
    }
  } ]
}'



After I issued the command, ES tried to initialize the shard, but then
this suddenly happened:


https://lh6.googleusercontent.com/-RnP051vjz1I/VIqlJQLi4-I/ADg/tC5q9TDr5HQ/s1600/sizeafter.PNG





It seems like ES decided to delete the whole shard due to its (potential)
corruption, and with it 1/3 of the data in it.


In my opinion, ES should never, ever be allowed to delete anything without
permission. Any clue what happened in this case?


Thanks for feedback



poor performance of full text searches

2014-11-28 Thread horst knete
Hey guys,

we are experiencing poor performance when we do full-text searches
(searches without specifying a field name).

If we search a 200 GB index that lives on SSDs for something like
' program:apache ', the search takes about 10-15 seconds; if we search for
' *apache* ', the whole search runs for 1.5-2 minutes.

What can cause this? Our mapping? The large heap of the SSD ES node (80 GB
heap + ~90 GB of system memory available to Lucene)? Is it normal?

If you need any information, please let me know.

thanks for any response




Re: poor performance of full text searches

2014-11-28 Thread horst knete
Thanks for the response,

Actually, the need to search with wildcards comes from our mapping. Our
events contain a whole lot of URLs, which the default analyzer indexes
into many terms; this leads to pretty awkward output if you do a top-10
search in the table panel in Kibana. Thanks to a post by another guy here
in the forum we adjusted the mapping so that everything gets indexed into
1 token per field, allowing URLs to be displayed correctly, but with the
disadvantage of needing wildcards, because a search for apache won't
return any events.

We are actually using 1 big server with 256 GB RAM + 64 CPUs, divided into
3 Elasticsearch instances.

Given your post, it seems to make sense to adjust the mapping back so that
everything gets tokenized the default way, eliminating the need for
wildcards, and to use not_analyzed fields for the URLs.

Maybe a somewhat stupid question: if you don't specify any field in a
Kibana search, it is directed to the _all field by default, right? Is
there a way of changing that behaviour so that apache is directed not to
_all but rather to the message field? That would eliminate the need for
the _all field, which would reduce disk space and CPU/RAM load during
indexing. (A sketch follows below.)

On Friday, 28 November 2014 at 10:56:09 UTC+1, David Pilato wrote:

 Some comments:

 Searching for `program:apache` actually searches in the field `program`.

 Searching with wildcards is something you really should not do! When you
 do a full-text search on Google, I guess you don't use wildcards, right?
 It's not really user friendly.
 So wildcards are extremely slow, and even slower when you start with *!
 Look here for reference:
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html#query-dsl-wildcard-query
 Searching in the _all field is also something you should not do, but it
 depends on your use case. I'd prefer using the copy_to feature in the
 mapping instead of the _all field.

 Now, I think it depends on your infra. How many nodes, RAM, … you have.


  

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com 
 http://Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | @elasticsearchfr 
 https://twitter.com/elasticsearchfr | @scrutmydocs 
 https://twitter.com/scrutmydocs


  
 On 28 Nov 2014 at 10:44, horst knete badun...@hotmail.de wrote:

 Hey guys,

 we are experiencing poor performance when we do full-text searches
 (searches without specifying a field name).

 If we search a 200 GB index that lives on SSDs for something like
 ' program:apache ', the search takes about 10-15 seconds; if we search
 for ' *apache* ', the whole search runs for 1.5-2 minutes.

 What can cause this? Our mapping? The large heap of the SSD ES node
 (80 GB heap + ~90 GB of system memory available to Lucene)? Is it normal?

 If you need any information, please let me know.

 thanks for any response





Re: Changing Analyzer behavior for hyphens - suggestions?

2014-11-21 Thread horst knete
I think our solution is now to just replace all the non-letter characters
with an _ in Elasticsearch.

"char_filter" : {
  "replace" : {
    "type" : "mapping",
    "mappings" : [
      "\\.=>_", "\\u2010=>_", "'=>_", "\\:=>_", "\\u0020=>_",
      "\\u005C=>_", "\\u0028=>_", "\\u0029=>_", "\\u0026=>_",
      "\\u002F=>_", "\\u002D=>_", "\\u003F=>_", "\\u003D=>_"
    ]
  }
},

This means the terms no longer get split into useless tokens by the
standard analyzer.

The downside of this solution is that some URLs or Windows paths look very
ugly to the human eye now, e.g.:

http://g.ceipmsn.com/8SE/411?MI=B9DC2E6D07184453A1EFC4E765A16D30-0LV=3.0.131.0OS=6.1.7601AG=1217

becomes

http___g_ceipmsn_com_8se_411_mi_b9dc2e6d07184453a1efc4e765a16d30_0_lv_3_0_131_0_os_6_1_7601_ag_1217

The good thing compared to not_analyzed is that if I search for url:*8se*,
the search will return the events with this URL in them.

I think this is not a perfect solution, but rather a good workaround until
Lucene gives us better analyzer types to work with.

Thanks for sharing your experience so far!

cheers

On Thursday, 20 November 2014 at 20:07:27 UTC+1, Jörg Prante wrote:

 The whitespace tokenizer has the problem that punctuation is not
 ignored. I find the word_delimiter filter not working at all with
 whitespace, only with the keyword tokenizer, and with massive pattern
 matching, which is complex and expensive :(

 Therefore I took the classic tokenizer and generalized the hyphen rules
 in the grammar. The tokenizer "hyphen" and the filter "hyphen" are two
 routines. The tokenizer "hyphen" keeps hyphenated words together and
 handles punctuation correctly. The filter "hyphen" adds combinations to
 the original form.

 Main point is to add combinations of dehyphenated forms so they can be 
 searched. 

 Single words are only taken into account when the word is positioned at 
 the edge.

 For example, the phrase "der-die-das" should be indexed in the following
 forms:

 "der-die-das", "derdiedas", "das", "derdie", "derdie-das", "die-das", "der"

 Jörg

 On Thu, Nov 20, 2014 at 9:29 AM, horst knete badun...@hotmail.de wrote:


 So the term "this-is-a-test" gets tokenized into "this-is-a-test", which
 is nice behaviour, but in order to run a full-text search on this field
 it should get tokenized into "this-is-a-test", "this", "is", "a" and
 "test", as I wrote before.





Re: Changing Analyzer behavior for hyphens - suggestions?

2014-11-20 Thread horst knete
Hi,

thanks for the response and this awesome plugin bundle (especially for me
as a German).

Unfortunately the hyphen analyzer plugin didn't do the job the way I
wanted it to.

The hyphen analyzer does something similar to the whitespace analyzer - it
just doesn't split on hyphens and instead treats them as ALPHANUM
characters (at least that is what I think right now).

So the term "this-is-a-test" gets tokenized into "this-is-a-test", which
is nice behaviour, but in order to run a full-text search on this field it
should get tokenized into "this-is-a-test", "this", "is", "a" and "test",
as I wrote before.

I think maybe abusing the word_delimiter token filter could do the job,
because there is an option preserve_original.

Unfortunately, if you adjust the filter like this:

PUT /logstash-2014.11.20
{
  "index" : {
    "analysis" : {
      "analyzer" : {
        "wordtest" : {
          "type" : "custom",
          "tokenizer" : "whitespace",
          "filter" : [
            "lowercase",
            "word"
          ]
        }
      },
      "filter" : {
        "word" : {
          "type" : "word_delimiter",
          "generate_word_parts" : false,
          "generate_number_parts" : false,
          "catenate_words" : false,
          "catenate_numbers" : false,
          "catenate_all" : false,
          "split_on_case_change" : false,
          "preserve_original" : true,
          "split_on_numerics" : false,
          "stem_english_possessive" : true
        }
      }
    }
  }
}

and run an analyze test:

curl -XGET 'localhost:9200/logstash-2014.11.20/_analyze?filters=word' -d 
'this-is-a-test'

the response is this:

{"tokens":[{"token":"this","start_offset":0,"end_offset":4,"type":"ALPHANUM","position":1},{"token":"is","start_offset":5,"end_offset":7,"type":"ALPHANUM","position":2},{"token":"a","start_offset":8,"end_offset":9,"type":"ALPHANUM","position":3},{"token":"test","start_offset":10,"end_offset":14,"type":"ALPHANUM","position":4}]}

which shows that it tokenized everything except the original term, which
makes me wonder whether the preserve_original setting is working at all.

Any idea on this?
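One thing that might be worth checking (an assumption of mine, not from
the thread): the analyze call above names only the filter, so it may not
run the whitespace tokenizer defined in the analyzer. Testing the named
analyzer directly should rule that out:

curl -XGET 'localhost:9200/logstash-2014.11.20/_analyze?analyzer=wordtest' -d 'this-is-a-test'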

On Wednesday, 19 November 2014 at 18:26:09 UTC+1, Jörg Prante wrote:

 You search for a hyphen-aware tokenizer, like this?

 https://gist.github.com/jprante/cd120eac542ba6eec965

 It is in my plugin bundle

 https://github.com/jprante/elasticsearch-plugin-bundle

 Jörg

 On Wed, Nov 19, 2014 at 5:46 PM, horst knete badun...@hotmail.de wrote:

 Hey guys,

 after working with the ELK stack for a while now, we still have a very
 annoying problem regarding the behavior of the standard analyzer - it
 splits terms into tokens using hyphens or dots as delimiters.

 e.g. logsource:firewall-physical-management gets split into "firewall",
 "physical" and "management". On the one hand that's cool, because if you
 search for logsource:firewall you get all the events with firewall as a
 token in the field logsource.

 The downside of this behaviour is that if you are doing e.g. a top-10
 search on a field in Kibana, all the tokens are counted as whole terms
 and get ranked by their count:
 top 10:
 1. firewall : 10
 2. physical : 10
 3. management: 10

 instead of top 10:
 1. firewall-physical-management: 10

 Well, in the standard mapping from logstash this is solved using a .raw
 field as not_analyzed, but the downside of this is that you get 2 fields
 instead of one (even if it is a multi_field) and the usability for Kibana
 users is not that great.

 So what we need is that logsource:firewall-physical-management gets
 tokenized into "firewall-physical-management", "firewall", "physical"
 and "management".

 I tried this using the word_delimiter token filter with the following
 mapping:

 "analysis" : {
   "analyzer" : {
     "my_analyzer" : {
       "type" : "custom",
       "tokenizer" : "whitespace",
       "filter" : ["lowercase", "asciifolding", "my_worddelimiter"]
     }
   },
   "filter" : {
     "my_worddelimiter" : {
       "type" : "word_delimiter",
       "generate_word_parts" : false,
       "generate_number_parts" : false,
       "catenate_words" : false,
       "catenate_numbers" : false,
       "catenate_all" : false,
       "split_on_case_change" : false,
       "preserve_original" : true,
       "split_on_numerics" : false,
       "stem_english_possessive" : true
     }
   }
 }
Changing Analyzer behavior for hyphens - suggestions?

2014-11-19 Thread horst knete
Hey guys,

after working with the ELK stack for a while now, we still have a very
annoying problem regarding the behavior of the standard analyzer - it
splits terms into tokens using hyphens or dots as delimiters.

e.g. logsource:firewall-physical-management gets split into "firewall",
"physical" and "management". On the one hand that's cool, because if you
search for logsource:firewall you get all the events with firewall as a
token in the field logsource.

The downside of this behaviour is that if you are doing e.g. a top-10
search on a field in Kibana, all the tokens are counted as whole terms and
get ranked by their count:
top 10:
1. firewall : 10
2. physical : 10
3. management: 10

instead of top 10:
1. firewall-physical-management: 10

Well, in the standard mapping from logstash this is solved using a .raw
field as not_analyzed, but the downside of this is that you get 2 fields
instead of one (even if it is a multi_field) and the usability for Kibana
users is not that great.

So what we need is that logsource:firewall-physical-management gets
tokenized into "firewall-physical-management", "firewall", "physical" and
"management".

I tried this using the word_delimiter token filter with the following
mapping:

"analysis" : {
  "analyzer" : {
    "my_analyzer" : {
      "type" : "custom",
      "tokenizer" : "whitespace",
      "filter" : ["lowercase", "asciifolding", "my_worddelimiter"]
    }
  },
  "filter" : {
    "my_worddelimiter" : {
      "type" : "word_delimiter",
      "generate_word_parts" : false,
      "generate_number_parts" : false,
      "catenate_words" : false,
      "catenate_numbers" : false,
      "catenate_all" : false,
      "split_on_case_change" : false,
      "preserve_original" : true,
      "split_on_numerics" : false,
      "stem_english_possessive" : true
    }
  }
}

But this unfortunately didn't do the job.

I've seen in my research that some other guys have a similar problem, but
apart from some replacement suggestions, no real solution was found.

If anyone has ideas on how to start working on this, I would be very
happy.

thanks.



Re: Index reallocation takes hours

2014-10-29 Thread horst knete
Hi,

I want to use this thread to clarify some of the Elasticsearch settings,
because I am struggling a bit to understand all of them.

1. indices.store.throttle.max_bytes_per_sec: This is the maximum amount of
MB/GB per second that Elasticsearch uses for the merging process - right?
(Is there a difference between indices.store.throttle.max_bytes_per_sec
and index.store.throttle.max_bytes_per_sec?)
2. indices.recovery.max_bytes_per_sec: This is the maximum amount of MB/GB
that Elasticsearch uses to initialize shards/indices after a restart of
the node - right?
3. indices.recovery.concurrent_streams: The number of parallel shard
initializations after a restart - right?
4. cluster.routing.allocation.node_concurrent_recoveries: The cluster-wide
number of parallel shard initializations after a restart - right?
5. cluster.routing.allocation.node_initial_primaries_recoveries: The
cluster-wide number of parallel primary shard initializations after a
restart - right?
6. cluster.routing.allocation.cluster_concurrent_rebalance: The number of
shards that are allowed to move in parallel from one node to another -
right?

But among all the settings found in Elasticsearch there is no setting like
cluster.routing.allocation.max_bytes_per_sec that limits the maximum
amount of MB/GB per second that can be used to relocate a shard from one
node to another - will Elasticsearch use some default value, or just the
maximum possible speed?
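For what it's worth, a sketch of setting the dynamic variants of these
settings at runtime (the values are illustrative assumptions, not
recommendations). As far as I know, shard relocation runs through the
recovery machinery, so indices.recovery.max_bytes_per_sec is the closest
thing to the missing cluster.routing.allocation.max_bytes_per_sec:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : {
    "indices.store.throttle.max_bytes_per_sec" : "100mb",
    "indices.recovery.max_bytes_per_sec" : "100mb",
    "cluster.routing.allocation.cluster_concurrent_rebalance" : 2
  }
}'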

Thanks for response

On Wednesday, 8 October 2014 at 11:04:26 UTC+2, Andrew Lakes wrote:

 Hi,

 we are using version 1.3.2 of ES; each index has 4 shards.

 I will try adjusting the store throttling to an acceptable number without
 crushing our disk I/O.

 On Tuesday, 7 October 2014 at 19:35:35 UTC+2, Ivan Brusic wrote:

 Currently shard allocation is throttled by default with a low target
 value. If you are using an older version of Elasticsearch, then a Lucene
 bug will make it even slower.


 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-store.html#store-throttling

 How many shards do your indices have? The default number of concurrent
 allocations is only 2. The ideal values depend on your hardware,
 especially your disk IO.

 Cheers,
 Ivan
 On Oct 7, 2014 8:42 AM, Andrew Lakes alak...@gmail.com wrote:

 Hey guys,

 we have set up a central ES server with 2 instances (1 instance on SSDs
 for indexing/recent indices and another one for long-term storage).

 Every day at 6 a.m. a cronjob is started that moves indices older than 7
 days to the HDD storage node.

 Our indices have an average size of 300 GB, but it takes about 6-7 hours
 until the reallocation of 1 index is done (if you copy it manually with
 cp it only takes about 30 minutes).

 Are there any settings I can adjust to speed up this reallocation
 process?



Re: Any experience with ES and Data Compressing Filesystems?

2014-08-04 Thread horst knete
Hi again,

a quick report regarding compression:

we are using a 3 TB btrfs volume with a 32k block size now, which reduced
the amount of data from 3.2 TB to 1.1 TB without any significant
performance losses (we are using an 8-CPU, 20 GB memory machine with an
iSCSI link to the volume).

So I can only suggest using a btrfs volume for long-term storage. (A
sketch of the setup follows below.)
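For reference, a minimal sketch of how such a volume might be created and
mounted (my reconstruction, not taken from the post; the device and
mountpoint are placeholders, and -n sets the btrfs node/leaf size):

mkfs.btrfs -n 32768 /dev/sdX
mount -o compress=zlib /dev/sdX /var/lib/elasticsearch/archive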

On Monday, 21 July 2014 at 08:48:12 UTC+2, Patrick Proniewski wrote:

 Hi, 

 gzip/zlib compression is very bad for performance, so it can be
 interesting for closed indices, but for live data I would not recommend
 it. Also, you must know that:

 - compression using lz4 is already enabled inside indices,
 - ES/Lucene/Java usually read/write 4k blocks,
 - hence, compression is achieved on 4k blocks. If your filesystem uses 4k
 blocks and you add FS compression, you will probably have a very small
 gain, if any. I've tried on ZFS:

 Filesystem SizeUsed   Avail Capacity  Mounted on 
 zdata/ES-lz4   1.1T1.9G1.1T 0%/zdata/ES-lz4 
 zdata/ES   1.1T1.9G1.1T 0%/zdata/ES 

 If you are using a larger block size, like 128k, a compressed filesystem 
 does show some benefit: 

 Filesystem SizeUsed   Avail Capacity  Mounted on 
 zdata/ES-lz4   1.1T1.1G1.1T 0%   
  /zdata/ES-lz4- compressratio  1.73x 
 zdata/ES-gzip  1.1T901M1.1T 0%   
  /zdata/ES-gzip- compressratio  2.27x 
 zdata/ES   1.1T1.9G1.1T 0%/zdata/ES 

 But a file system block larger than 4k is very suboptimal for IO (ES
 reads or writes one 4k block - your FS must read or write a 128k block).

 On 21 Jul 2014, at 07:58, horst knete badun...@hotmail.de wrote:

  Hey guys, 
  
  we have mounted a btrfs file system with the zlib compression method
  for testing purposes on our Elasticsearch server and copied one of the
  indices onto the btrfs volume; unfortunately it had no success and the
  index still has a size of 50 GB :/

  I will try other compression methods and will report here
  
  On Saturday, 19 July 2014 at 07:21:20 UTC+2, Otis Gospodnetic wrote:
  
  Hi Horst, 
  
  I wouldn't bother with this for the reasons Joerg mentioned, but should 
  you try it anyway, I'd love to hear your findings/observations. 
  
  Otis 
  -- 
  Performance Monitoring * Log Analytics * Search Analytics 
  Solr & Elasticsearch Support * http://sematext.com/
  
  
  
  On Wednesday, July 16, 2014 6:56:36 AM UTC-4, horst knete wrote: 
  
  Hey Guys, 
  
  to save a lot of hard disk space, we are going to use a compression
  file system, which allows us transparent compression for the ES
  indices. (It seems like ES indices are very compressible; we got up to
  a 65% compression rate in some tests.)

  Currently the indices live on an ext4 Linux filesystem, which
  unfortunately does not have the transparent compression ability.

  Has anyone got experience with compression file systems like BTRFS or
  ZFS/OpenZFS, and can you tell us if this led to big performance losses?

  Thanks for responding




Re: Any experience with ES and Data Compressing Filesystems?

2014-08-04 Thread horst knete
We are indexing all sorts of events (Windows, Linux, Apache, Netflow and
so on...), and impact is measured by the speed of the Kibana GUI / how
long it takes to load 7 or 14 days of data. That is what matters to my
colleagues.


On Monday, 4 August 2014 at 10:52:25 UTC+2, Mark Walkom wrote:

 What sort of data are you indexing? When you said the performance impact
 was minimal, how minimal, and at what points are you seeing it?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com
  

 On 4 August 2014 16:43, horst knete badun...@hotmail.de wrote:

 Hi again,

 a quick report regarding compression:

 we are using a 3 TB btrfs volume with a 32k block size now, which
 reduced the amount of data from 3.2 TB to 1.1 TB without any significant
 performance losses (we are using an 8-CPU, 20 GB memory machine with an
 iSCSI link to the volume).

 So I can only suggest using a btrfs volume for long-term storage.

 On Monday, 21 July 2014 at 08:48:12 UTC+2, Patrick Proniewski wrote:

 Hi, 

 gzip/zlib compression is very bad for performance, so it can be
 interesting for closed indices, but for live data I would not recommend
 it. Also, you must know that:

 - compression using lz4 is already enabled inside indices,
 - ES/Lucene/Java usually read/write 4k blocks,
 - hence, compression is achieved on 4k blocks. If your filesystem uses 4k
 blocks and you add FS compression, you will probably have a very small
 gain, if any. I've tried on ZFS:

 Filesystem SizeUsed   Avail Capacity  Mounted on 
 zdata/ES-lz4   1.1T1.9G1.1T 0%/zdata/ES-lz4 
 zdata/ES   1.1T1.9G1.1T 0%/zdata/ES 

 If you are using a larger block size, like 128k, a compressed filesystem 
 does show some benefit: 

 Filesystem SizeUsed   Avail Capacity  Mounted on 
 zdata/ES-lz4   1.1T1.1G1.1T 0%   
  /zdata/ES-lz4- compressratio  1.73x 
 zdata/ES-gzip  1.1T901M1.1T 0%   
  /zdata/ES-gzip- compressratio  2.27x 
 zdata/ES   1.1T1.9G1.1T 0%/zdata/ES 

 But a file system block larger than 4k is very suboptimal for IO (ES
 reads or writes one 4k block - your FS must read or write a 128k block).

 On 21 Jul 2014, at 07:58, horst knete badun...@hotmail.de wrote:

  Hey guys, 
  
  we have mounted a btrfs file system with the zlib compression method
  for testing purposes on our Elasticsearch server and copied one of the
  indices onto the btrfs volume; unfortunately it had no success and the
  index still has a size of 50 GB :/

  I will try other compression methods and will report here
  
  On Saturday, 19 July 2014 at 07:21:20 UTC+2, Otis Gospodnetic wrote:
  
  Hi Horst, 
  
  I wouldn't bother with this for the reasons Joerg mentioned, but 
 should 
  you try it anyway, I'd love to hear your findings/observations. 
  
  Otis 
  -- 
  Performance Monitoring * Log Analytics * Search Analytics 
  Solr & Elasticsearch Support * http://sematext.com/
  
  
  
  On Wednesday, July 16, 2014 6:56:36 AM UTC-4, horst knete wrote: 
  
  Hey Guys, 
  
  to save a lot of hard disk space, we are going to use a compression
  file system, which allows us transparent compression for the ES
  indices. (It seems like ES indices are very compressible; we got up to
  a 65% compression rate in some tests.)

  Currently the indices live on an ext4 Linux filesystem, which
  unfortunately does not have the transparent compression ability.

  Has anyone got experience with compression file systems like BTRFS or
  ZFS/OpenZFS, and can you tell us if this led to big performance losses?

  Thanks for responding



Re: Any experience with ES and Data Compressing Filesystems?

2014-07-20 Thread horst knete
Hey guys,

we have mounted a btrfs file system with the zlib compression method for
testing purposes on our Elasticsearch server and copied one of the indices
onto the btrfs volume; unfortunately it had no success and the index still
has a size of 50 GB :/

I will try other compression methods and report back here.

On Saturday, 19 July 2014 at 07:21:20 UTC+2, Otis Gospodnetic wrote:

 Hi Horst,

 I wouldn't bother with this for the reasons Joerg mentioned, but should 
 you try it anyway, I'd love to hear your findings/observations.

 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
 Solr & Elasticsearch Support * http://sematext.com/



 On Wednesday, July 16, 2014 6:56:36 AM UTC-4, horst knete wrote:

 Hey Guys,

 to save a lot of hard disk space, we are going to use a compression file
 system, which allows us transparent compression for the ES indices. (It
 seems like ES indices are very compressible; we got up to a 65%
 compression rate in some tests.)

 Currently the indices live on an ext4 Linux filesystem, which
 unfortunately does not have the transparent compression ability.

 Has anyone got experience with compression file systems like BTRFS or
 ZFS/OpenZFS, and can you tell us if this led to big performance losses?

 Thanks for responding





Recommended File System Settings for Elasticsearch

2014-07-18 Thread horst knete
Hey Guys,

we are going to expand our development ES system into a production
environment, and we have fresh new storage waiting to be formatted.

In order to get the best performance for Elasticsearch, we'd like to know
what the recommended settings are.

Does Elasticsearch read/write in a sequential way, or is there a lot of
random access on the indices?

Which block size would you recommend? (I've heard Elasticsearch
reads/writes in 4k blocks, right?)

Thanks for response



Any experience with ES and Data Compressing Filesystems?

2014-07-16 Thread horst knete
Hey Guys,

to save a lot of hard disk space, we are going to use a compression file
system, which allows us transparent compression for the ES indices. (It
seems like ES indices are very compressible; we got up to a 65%
compression rate in some tests.)

Currently the indices live on an ext4 Linux filesystem, which
unfortunately does not have the transparent compression ability.

Has anyone got experience with compression file systems like BTRFS or
ZFS/OpenZFS, and can you tell us if this led to big performance losses?

Thanks for responding



Re: Disabling _all-Field but keep Netflow-Events searchable

2014-07-14 Thread horst knete
Has anyone got an idea how to realize that? I think there are quite a few
users who have Netflow AND other types of events going into Elasticsearch,
and for those a disabled _all field would save a lot of hard disk space.



Re: Disabling _all-Field but keep Netflow-Events searchable

2014-07-01 Thread horst knete
Hey,

thanks for your response.

As I already mentioned, I tried setting index.query.default_field to
message.

{
  "template" : "logstash-*",
  "settings" : {
    "index.refresh_interval" : "5s",
    "index.number_of_shards" : 3,
    "index.number_of_replicas" : 0,
    "index.refresh_interval" : "30s",
    "index.store.compress.stored" : true,
    "index.store.compress.tv" : true,
    "index.query.default_field" : "message",
    "analysis" : {
      "analyzer" : {
        "default" : {
          "type" : "standard",
          "stopwords" : "_none_"
        }
      }
    }
  },
  "mappings" : {
    "_default_" : {
      "_all" : { "enabled" : false },
      "_source" : { "compress" : true },
      "dynamic_templates" : [ {
        "string_fields" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "string",
            "fields" : {
              "{name}" : { "type" : "string", "index" : "not_analyzed" }
            }
          }
        }
      } ],
      "properties" : {
        "@version" : { "type" : "string", "index" : "not_analyzed" },
        "@timestamp" : { "type" : "date", "index" : "not_analyzed" },
        "tags" : { "type" : "string", "index" : "not_analyzed" }
      }
    }
  }
}

That was the template I used. It works fine for all events except the
Netflow ones, because they don't have a message field for Kibana to search
in. That is my mess.

Is it possible to adjust the template/mapping per type of event? (See the
sketch below for what I mean.)
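A minimal sketch of what such a per-type override might look like in the
template's mappings section (my assumption based on how type-level _all
settings work; "netflow" is a hypothetical type name that would keep _all
enabled while every other type has it disabled):

"mappings" : {
  "_default_" : {
    "_all" : { "enabled" : false }
  },
  "netflow" : {
    "_all" : { "enabled" : true }
  }
}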

Cheers

On Monday, 30 June 2014 at 09:08:22 UTC+2, Alexander Reelsen wrote:

 Hey,

 you can set index.query.default_field in the mapping to circumvent this;
 see
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-all-field.html#mapping-all-field


 --Alex


 On Tue, Jun 24, 2014 at 12:39 PM, horst knete badun...@hotmail.de wrote:

 Hey guys,

 I really want to disable the _all field in the ES indices to save some
 disk space on our system.

 Normally that is not a problem - adjust the template in ES and set the
 message field as the new default query field, which is normally
 available in any event.

 The problem is that we also have many Netflow events from the netflow
 codec that have the following form:

 https://lh4.googleusercontent.com/-CDQQs5e5a7o/U6lUvjikncI/ACo/LHpMXlYLMWw/s1600/netflow.PNG

 As you might notice, there isn't any message field, so the Kibana Lucene
 query would run into an error.

 My question is - how do I make this work (disabling the _all field but
 keeping the Netflow events searchable)?

 Thanks for the response.



Disabling _all-Field but keep Netflow-Events searchable

2014-06-24 Thread horst knete
Hey guys,

I really want to disable the _all field in the ES indices to save some
disk space on our system.

Normally that is not a problem - adjust the template in ES and set the
message field as the new default query field, which is normally available
in any event.

The problem is that we also have many Netflow events from the netflow
codec that have the following form:

https://lh4.googleusercontent.com/-CDQQs5e5a7o/U6lUvjikncI/ACo/LHpMXlYLMWw/s1600/netflow.PNG

As you might notice, there isn't any message field, so the Kibana Lucene
query would run into an error.

My question is - how do I make this work (disabling the _all field but
keeping the Netflow events searchable)?

Thanks for the response.



Re: Using Jetty plugin

2014-06-11 Thread horst knete
Hi,

do you want to restrict access to the physical location (e.g.
/opt/elasticsearch/data/elasticsearch/0/indices/index-name), or do you
want to restrict access to certain types of logs?

For the second option you could look at this thread
https://groups.google.com/forum/#!topic/logstash-users/LFmNPv3z6YE where I
explained using kibana3_auth for this purpose.

On Tuesday, 10 June 2014 at 12:30:23 UTC+2, Gaurav Kamaljith wrote:

 Hi,

 Is it possible to restrict Kibana users from accessing particular 
 indices/sub indices folders using the Jetty plugin?

 Regards,
 Gaurav




Re: alerts from Kibana/ES

2014-06-02 Thread horst knete
Hi NF,

we also set up alerting with our Zabbix monitoring system.

What we use are simple Linux scripts that use curl to search given
Elasticsearch indices.

In the Zabbix system, triggers are built that run the scripts on our
Elasticsearch server and interpret their output (e.g. the number of events
with ID 4625); if this value hits a specific threshold, the trigger
alerts. (A sketch of such a script follows below.)

It is simple to set up, and maybe this is what you are looking for.
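As an illustration, a minimal sketch of such a check script (my
reconstruction, not the script from the post; the index pattern, field
name and event ID are assumptions):

#!/bin/sh
# Count Windows logon failures (event ID 4625) in the last 15 minutes;
# Zabbix reads the number printed on stdout and triggers on a threshold.
curl -s -XGET 'localhost:9200/logstash-*/_count' -d '{
  "query" : {
    "filtered" : {
      "query" : { "query_string" : { "query" : "EventID:4625" } },
      "filter" : { "range" : { "@timestamp" : { "gte" : "now-15m" } } }
    }
  }
}' | sed -e 's/.*"count":\([0-9]*\).*/\1/'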

If you need any help, feel free to contact me

On Friday, 30 May 2014 at 08:31:07 UTC+2, NF wrote:

 That's right, Otis.

 On Friday, May 30, 2014 7:20:27 AM UTC+2, Otis Gospodnetic wrote:

 Hi,

 There's no alerting in Kibana. Have a look at SPM
 (http://sematext.com/spm/) - it has ES monitoring, threshold and
 heartbeat alerting, anomaly detection, and a number of other features.
 Actually, re-reading your email - you are looking to get notified when a
 certain event is captured? By that do you mean having something like a
 saved query that matches incoming logs?

 Otis
 --
 Performance Monitoring * Log Analytics * Search Analytics
 Solr & Elasticsearch Support * http://sematext.com/


 On Tuesday, May 27, 2014 5:02:35 AM UTC-4, NF wrote:

 Hi,

 We're using Kibana/Elasticsearch to visualize different kinds of logs in
 our company. Now we need a feature that would allow us to send an
 alert/notification (email or other) when a certain event/trigger is
 captured.

 I'd like to know if there is such a feature planned in the
 Kibana/Elasticsearch backlog? If so, when might we expect it to be
 available?

 If not, could you please suggest any (open source) solution to satisfy 
 our need?

 Thanks,

 Natalia





Store Elasticsearch Indices in a revision/audit-proof way

2014-05-22 Thread horst knete
Hey guys,

in order to meet the German laws for logging, I have been tasked with
storing the Elasticsearch indices in a revision-proof/audit-proof way
(indices cannot be edited/changed after storage).

Are there any best practices or tips for doing such a thing? (Maybe any
plugins?)

Thanks for your feedback.



Re: Store Elasticsearch Indices in a revision/audit-proof way

2014-05-22 Thread horst knete
Yeah, it looks like that would do the job. Thanks for the response. (A
sketch follows below.)
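For reference, a minimal sketch of that read-only switch (assuming the ES
1.x index settings API; the index name is a placeholder):

curl -XPUT 'localhost:9200/logstash-2014.05.21/_settings' -d '{
  "index" : { "blocks.read_only" : true }
}'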

On Thursday, 22 May 2014 at 10:40:19 UTC+2, Mark Walkom wrote:

 You can set indexes to read-only -
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html
 Is that what you're after?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 22 May 2014 18:36, horst knete badun...@hotmail.de wrote:

 Hey guys,

 in order to meet the German laws for logging, I have been tasked with
 storing the Elasticsearch indices in a revision-proof/audit-proof way
 (indices cannot be edited/changed after storage).

 Are there any best practices or tips for doing such a thing? (Maybe any
 plugins?)

 Thanks for your feedback.



Re: Store Elasticsearch Indices in a revision/audit-proof way

2014-05-22 Thread horst knete
Hi Jörg,

thanks for your offer.

I will contact you if there's a need for such a plugin in our company.

I will also keep you up to date if there are breaking changes in our project.



On Thursday, May 22, 2014 at 10:55:44 UTC+2, Jörg Prante wrote:

 You have to add a facility to your middleware that can trace all 
 authorized operations on your index (access, read, write, modify, delete), 
 and you must write this to an append-only logfile with timestamps.

 If there is interest I could write such a plugin (assuming it can run in a 
 trusted environment regarding authorization tokens), but I think the best 
 place is in the middleware (where an ES client runs in a broader 
 application context, e.g. with transaction awareness).

 Jörg
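
 A toy shell sketch of that append-only idea (not the proposed plugin; the 
 wrapper name and logfile path are made up for illustration):

 # hypothetical wrapper: record every ES call with a UTC timestamp in an
 # append-only audit log before executing it
 es_audit() {
   echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $USER $*" >> /var/log/es-audit.log
   curl "$@"
 }

 # enforce append-only at the filesystem level (Linux ext*, needs root)
 touch /var/log/es-audit.log
 chattr +a /var/log/es-audit.log

 # every query is then logged before it runs
 es_audit -XGET 'localhost:9200/logstash-2014.05.21/_count'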


 On Thu, May 22, 2014 at 10:36 AM, horst knete badun...@hotmail.de wrote:

 Hey guys,

 in order to meet the German legal requirements for logging, I have been 
 asked to store the Elasticsearch indices in a revision-/audit-proof way 
 (indices cannot be edited/changed after storage).

 Are there any best practices or tips for doing such a thing? (Maybe any 
 plugins?)

 Thanks for your feedback.







Does Elasticsearch support multiple non-RAID-0 data paths?

2014-05-15 Thread horst knete
Hello,

we are currently running Elasticsearch on a single node and ingest about 
20 million logs per day (40 GB of daily indices). Since this is a lot of 
data to handle, the indices take up a lot of disk space on our server.

What we would like to implement:

- one data directory stored on our SSDs, containing the indices of the 
last 7 days for quick access
- one data directory stored on normal HDDs, containing the indices of the 
last 3 months for normal-speed access
- one data directory stored on slow 5400 rpm HDDs, containing the indices 
of the last 2 years for access if needed

It's no problem to give ES multiple data paths, but if you do this, ES 
will stripe (RAID 0) the indices across all 3 data directories.

That's not what we want. We want to move the indices with a script to the 
matching directories (an index that is older than 8 days gets 
automatically moved to the normal HDDs, and so on).

Is there any way to make this work?

Thanks for your feedback.
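
A hedged sketch of how this is often handled: instead of multiple data 
paths on one node, run one ES node per storage tier, give each node a tag 
in its elasticsearch.yml (e.g. node.tag: ssd / hdd / archive), and have a 
nightly script re-tag indices as they age (the index name and tag values 
below are just examples):

curl -XPUT 'localhost:9200/logstash-2014.05.07/_settings' -d '
{ "index.routing.allocation.require.tag" : "hdd" }'

ES then relocates the shards of that index to the matching nodes on its 
own, so no manual copying of directories is needed.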



Re: Compress Elasticsearch Index/Disk Usage Elasticsearch

2014-03-20 Thread horst knete
Hi Adrien,

thanks for the response.

Curator is a very helpful tool, thanks for the link!

I tried it and saved about 1 GB of disk space (24 GB down to 23 GB); I 
think the effect increases with more indexes.
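
For reference, a minimal sketch of the optimize call that curator 
automates here; merging down to one segment is what typically reclaims 
the space (the index name is just an example):

curl -XPOST 'localhost:9200/logstash-2014.03.17/_optimize?max_num_segments=1'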



On Wednesday, March 19, 2014 at 17:42:56 UTC+1, Adrien Grand wrote:

 Hi,

 This setting doesn't exist anymore as indices are now always compressed.

 One thing that might help gain back some disk space (not much, don't 
 expect too much from it) is to run an _optimize[1] call on this index. You 
 can use curator[2] to automate this operation.

 [1] 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-optimize.html
  
 [2] https://github.com/elasticsearch/curator


 On Wed, Mar 19, 2014 at 1:10 PM, horst knete badun...@hotmail.de wrote:

 Hey guys,

 I just wanted to ask if there's a possibility to decrease the disk space 
 usage of an Elasticsearch index.

 E.g. I have an index called logstash-2014.03.17 with a size of 547 MB and 
 a total of 1,018,761 indexed documents/logs, and I want to reduce its disk 
 usage.

 According to this article I found: 
 https://github.com/elasticsearch/elasticsearch/issues/2037, all I have to 
 do is set index.store.compress.stored to true.

 I've done this with:

 curl -XPOST 'localhost:9200/logstash-2014.03.17/_close'
 curl -XPUT 'localhost:9200/logstash-2014.03.17/_settings' -d '
 { "index.store.compress.stored" : true }'
 curl -XPOST 'localhost:9200/logstash-2014.03.17/_open'

 Then I optimized the index with the command:

 curl -XPOST 'http://localhost:9200/logstash-2014.03.17/_optimize'

 and restarted the cluster.

 Unfortunately nothing happened and it still uses 547 MB of disk space.

 Have I done something wrong?

 Are there other ways to decrease the disk space usage?

 Thanks for the response.
  




 -- 
 Adrien Grand
  



Compress Elasticsearch Index/Disk Usage Elasticsearch

2014-03-19 Thread horst knete
Hey guys,

I just wanted to ask if there's a possibility to decrease the disk space 
usage of an Elasticsearch index.

E.g. I have an index called logstash-2014.03.17 with a size of 547 MB and 
a total of 1,018,761 indexed documents/logs, and I want to reduce its disk 
usage.

According to this article I found: 
https://github.com/elasticsearch/elasticsearch/issues/2037, all I have to 
do is set index.store.compress.stored to true.

I've done this with:

curl -XPOST 'localhost:9200/logstash-2014.03.17/_close'
curl -XPUT 'localhost:9200/logstash-2014.03.17/_settings' -d '
{ "index.store.compress.stored" : true }'
curl -XPOST 'localhost:9200/logstash-2014.03.17/_open'

Then I optimized the index with the command:

curl -XPOST 'http://localhost:9200/logstash-2014.03.17/_optimize'

and restarted the cluster.

Unfortunately nothing happened and it still uses 547 MB of disk space.

Have I done something wrong?

Are there other ways to decrease the disk space usage?

Thanks for the response.



Re: Kibana3 not displaying the terms panel

2014-02-21 Thread horst knete
After hours of debugging with Wireshark, Firebug, etc., I found out that 
the problem is that when you load the default page, all panels get cached 
in the user's browser except the terms panel (I think it is not declared 
as cacheable in Kibana's require.js/html).

If you then load a dashboard with a terms panel, all the data from ES is 
sent to the client except the data for the terms panel, which doesn't get 
displayed unless you refresh the page.

My solution was to add a terms panel to the default dashboard, so that the 
client is forced to load the data for the terms panel.

Hopefully the developers of Kibana will fix this bug.

On Monday, February 17, 2014 at 13:56:57 UTC+1, horst knete wrote:

 Hey guys,

 I've set up a syslog system with Kibana3 (Milestone 4), ES and logstash, 
 and everything worked fine.

 But now I have a small problem: if I load a dashboard in Kibana where 
 terms panels are used, they won't get displayed.

 [screenshot: https://lh6.googleusercontent.com/-_wCbI8-pZFs/UwIGDvLRrpI/AA8/ZeszJbKpABM/s1600/screen.PNG]
 As you can see, the table panels are displayed but not the terms panel 
 (the row is displayed, but not the panels in it).

 I am using the Kibana from 
 https://github.com/christian-marie/kibana3_auth, which is pretty much the 
 same as the original Milestone 4 but with an authentication page.

 Has anyone of you encountered the same problem, or found a solution?

 Thanks for your help

 cheers




Kibana3 not displaying the terms panel

2014-02-17 Thread horst knete
Hey guys,

I've set up a syslog system with Kibana3 (Milestone 4), ES and logstash, 
and everything worked fine.

But now I have a small problem: if I load a dashboard in Kibana where 
terms panels are used, they won't get displayed.

[screenshot: https://lh6.googleusercontent.com/-_wCbI8-pZFs/UwIGDvLRrpI/AA8/ZeszJbKpABM/s1600/screen.PNG]
As you can see, the table panels are displayed but not the terms panel 
(the row is displayed, but not the panels in it).

I am using the Kibana from https://github.com/christian-marie/kibana3_auth, 
which is pretty much the same as the original Milestone 4 but with an 
authentication page.

Has anyone of you encountered the same problem, or found a solution?

Thanks for your help

cheers
