Re: Creating index dynamically in ES.

2014-11-09 Thread Magnus Bäck
On Tuesday, November 04, 2014 at 00:57 CET, Alejandro Alves wrote: > On Wednesday, February 19, 2014 05:02:40 UTC+13, Binh Ly wrote: > > You can specify the index name in the elasticsearch output: > > http://logstash.net/docs/1.3.3/outputs/elasticsearch#index > > For example, l

Re: How to filter logs on remote server before shipping to logstash server

2014-11-09 Thread Magnus Bäck
On Thursday, November 06, 2014 at 20:27 CET, Vilas Reddy wrote: > I am new to ELK and my organization is interested in implementing this > framework. > I set up ELK on my machine and am trying to collect logs from a remote > server. > But the logs on the remote server are huge in size (in GigaByte

Re: Benefits of using bulk with node client?

2014-11-09 Thread Rotem Hermon
Ah, thanks, that's helpful to know! But if that's the case - why does the node need to parse the document when doing an individual request and not bulk? It can also stream the doc to the right shard based on the meta data (id and routing) without parsing the doc, same as

Re: Benefits of using bulk with node client?

2014-11-09 Thread David Pilato
Definitely! Try with and without and you will see the difference. The node does not parse the full doc but only headers and streams your docs to the right shards. I noticed a huge difference between the two myself. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs > On Nov 10
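To illustrate the point about headers, here is a minimal Python sketch of the newline-delimited bulk request body: each document is preceded by a small action line carrying the index, type and id, so that information can be read without parsing the document itself. The helper name `build_bulk_body` and the sample index and type names are hypothetical and only for illustration; the code builds the payload and does not talk to a cluster.

```python
import json

def build_bulk_body(index, doc_type, docs):
    """Build a bulk request body: one action line, then one source line, per doc.

    `docs` is an iterable of (doc_id, source_dict) pairs. The function name and
    arguments are illustrative assumptions, not part of any client library.
    """
    lines = []
    for doc_id, source in docs:
        # Action/metadata line: the information used for routing lives here.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(source))
    # The bulk format expects a newline after the last line as well.
    return "\n".join(lines) + "\n"

body = build_bulk_body("twitter", "tweet", [("1", {"user": "a"}), ("2", {"user": "b"})])
```

Each document contributes exactly two lines, so a receiver can read the odd-numbered lines alone to decide where each document goes.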

Benefits of using bulk with node client?

2014-11-09 Thread Rotem
I can definitely see the point of using the bulk API when indexing via HTTP. But is there an advantage to using bulk instead of individual index requests when using the node client? Since the node parses the bulk and routes each request to its proper destination - and it's basically doing the s

Re: how to search non indexed field in elasticsearch

2014-11-09 Thread ramky
Thanks Nikolas. I tried the query but it failed to search on the non-indexed field. The query I used is { "filter": { "script": { "script": "doc['service'].value == http" } } } "service" is a non-indexed field. The exception after execution is {[x3a9BIGLRwOdhwpsaUZbrw][siem0511][0]: QueryPh
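One thing that stands out in the query as quoted is that `http` appears without quotes inside the script, so a script engine would likely read it as a variable name rather than a string. A minimal Python sketch of building the same filter body with the literal quoted follows. The helper name is hypothetical, and quoting alone may well not resolve the reported error, since `doc[...]` lookups generally rely on the field being indexed.

```python
import json

def script_filter(field, value):
    """Return a search body with a script filter comparing a field to a string.

    Hypothetical helper for illustration only. The string literal is
    single-quoted inside the script so it is not read as a variable name.
    """
    script = "doc['%s'].value == '%s'" % (field, value)
    return {"filter": {"script": {"script": script}}}

body = script_filter("service", "http")
print(json.dumps(body))
```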

every shards are unassigned

2014-11-09 Thread Marcus Park
I have over 800 indices on a 3-node cluster, but after I killed a node at loading time, all the shards became unassigned. I shut down all the nodes and started each of them again, but the nodes did not join the cluster and all the shards are still unassigned. How can I assign all the shards as pri

Re: Deleting docs in elasticsearch...

2014-11-09 Thread Mark Walkom
This sort of thing will only really happen if you have dangling indexes or other nodes in the cluster that leave and then join later, which is just another form of dangling indexes. What sort of data is it, how are you indexing it? On 9 November 2014 12:41, kilsedar wrote: > Hello All, > I swit

Re: Many clusters each with one small index (current) VS Single "Big index" cluster VS Multi clusters each with 3 small indexes

2014-11-09 Thread Mark Walkom
ES scales horizontally, so you should consider one cluster of many nodes and multiple indexes rather than many clusters. This will also save on management overhead. Some other points: Set the shard count to a multiple of the node count; you have 3 nodes, so use 3/6/9/etc shards, this ensures you have ba
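As a rough illustration of the shard-count advice, and assuming the truncated sentence is about spreading shards evenly, the Python sketch below shows how a given number of shards would divide over a number of nodes. The function names are hypothetical, and the model assumes a single copy of each shard placed as evenly as possible.

```python
def shards_per_node(shard_count, node_count):
    """Return a list with the number of shards each node would hold.

    Assumes one copy of each shard, spread as evenly as possible.
    """
    base, extra = divmod(shard_count, node_count)
    # The first `extra` nodes carry one additional shard each.
    return [base + 1 if i < extra else base for i in range(node_count)]

def is_balanced(shard_count, node_count):
    """True when every node ends up with the same number of shards."""
    return shard_count % node_count == 0
```

With 3 nodes, 6 shards give 2 per node, while 5 shards give an uneven 2/2/1 split.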

how to tokenize html, javascript and css

2014-11-09 Thread Kevin S
We would like to use elasticsearch internally to search our own work product, which is primarily HTML, JavaScript and CSS. When we index these today we cannot find things like "somefile.js" because of the tokenizer (I think). Has there been a tokenizer developed for this yet? Thank you.

Looking for a sexy solution for Aggregations

2014-11-09 Thread kazoompa
Hi, Consider the aggregation below: "Sociodemographic_economic_characteristics": { "terms": { "field": "Sociodemographic_economic_characteristics", "size": 0, "min_doc_count": 0, "order": { "_term": "asc" } } } This is the result

Re: accidently ran another instance of elasticsearch on a few nodes

2014-11-09 Thread Mark Walkom
Yellow means unassigned replicas, try removing them and then adding them back. Once your cluster is green you can stop one of the nodes with the extra data and then delete the extra directory, just make sure you let the other nodes rebalance and your cluster is green again before deleting, otherwi

Re: Updating threadpool settings in ES

2014-11-09 Thread joergpra...@gmail.com
I read "database_name " - do you use JDBC river? If so, please increase max_bulk_actions parameter in JDBC plugin 1.3.4.4 to a large value, like 5000 or 1. Changing threadpool.bulk is wrong and is not a solution. Where did you find this "solution"? Jörg On Sun, Nov 9, 2014 at 6:22 PM, Max

Recreating Google's Ngram Viewer with elasticsearch

2014-11-09 Thread jari
Hello, I'm looking for tips on how to recreate something like Google's Ngram viewer with elasticsearch. I have a text corpus of < 500 MB for which this kind of tool would be very valuable. I've had some success with the shingle token filter

Re: Infinite scroll best practices with ES

2014-11-09 Thread Nikolas Everett
Scan/scroll queries use too much memory to serve all clients. They also keep files around on disk after they would normally be deleted. On Nov 9, 2014 12:12 PM, "pulkitsinghal" wrote: > In this discussion, I will rely on this page for reference: > http://www.elasticsearch.org/guide/en/elasticsea

Re: Large results sets and paging for Aggregations

2014-11-09 Thread pulkitsinghal
Sharing a response I received from Igor Motov: "scroll works only to page results. paging aggs doesn't make sense since aggs are executed on the entire result set. therefore if it managed to fit into the memory you should just get it. paging will mean that you throw away a lot of results

Updating threadpool settings in ES

2014-11-09 Thread Maxim Fedchyshyn
Hello everyone, I would like some advice about an ES river and increasing queue capacity. I'm getting this error: [_river][1]: failed to execute [get [_river][database_name][_meta]: routing [null]] org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (que

Re: Infinite scroll best practices with ES

2014-11-09 Thread pulkitsinghal
In this discussion, I will rely on this page for reference: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html At my level, I cannot really make a recommendation but I can share some questions going through my head, which if you fill in the blanks,

Large results sets and paging for Aggregations

2014-11-09 Thread pulkitsinghal
Based on the reference docs I couldn't figure out what happens when the aggregation result set is very large. Does it get cut off? What is the upper bound? Does ES crash? I see closed issues that indicate that pagination for aggregations will not be supported (https://github.com/elasticsearch/el

Re: ES cluster become red

2014-11-09 Thread Moshe Recanati
Update: After a couple of seconds or minutes the cluster became green. I assume this is after ES stabilized with data. Thank you. On Sunday, November 9, 2014 3:06:46 PM UTC+2, Moshe Recanati wrote: > > Hi, > I wrote simple program that enters 1000 documents into clean ES cluster. > While querying the

Re: ES 1.3.4 scrolling never ends

2014-11-09 Thread Yarden Bar
Update: Only when I set the SearchType to something other than QUERY_AND_FETCH does the scroll manage to finish. Any idea why QUERY_THEN_FETCH (the default) brings me to an endless loop? The full code is: val client = ESClientFactory.createByNode(ESNode.Builder,cluster = "test_acm_es") val

ES cluster become red

2014-11-09 Thread Moshe Recanati
Hi, I wrote a simple program that enters 1000 documents into a clean ES cluster. While querying the cluster during execution I'm getting Green health all the time. C:\Users\mosher>curl -XGET http://mosher:9200/_cat/indices?pretty=true green twitter2 1 0 1000 0 72.7kb 72.7kb However after stop and st