Re: Issue creating S3 repository on ElasticSearch 1.3

2014-09-04 Thread David Pilato
Are you trying to create a repo? If so, you need to use the PUT method. Something like: $ curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{ "type": "s3", "settings": { "bucket": "my_bucket_name", "region": "us-west" } }' -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydoc

Re: aggregations

2014-09-04 Thread navdeep agarwal
I am asking because my output after querying or filtering contains very few entries (in the hundreds), so if it is hitting an OOM error, then aggregations must be loading everything into cache regardless of the query or filter. On Wednesday, September 3, 2014 10:28:02 PM UTC+5:30, navdeep agarwal w

Fwd: aggregations

2014-09-04 Thread navdeep agarwal
Thank you for the reply. My heap size is 8 GB for a 74 GB index, and yes, I am hitting the circuit breaker. So when I query or filter before aggregating, are the aggregations passed only the filtered/query results? On Thursday, September 4, 2014 3:15:43 PM UTC+5:30, Colin Goodheart-Smithe

Re: co-locate elasticsearch with hadoop/yarn

2014-09-04 Thread David Pilato
If you give 2 disks to Elasticsearch, it will fill both at the same time, not one after the other. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 5 Sept. 2014 at 08:19, Ronny Vaningh wrote: Thanks Andrew If I'm correct e.s. will first fill up the first disk, and so on

Re: co-locate elasticsearch with hadoop/yarn

2014-09-04 Thread Ronny Vaningh
Thanks Andrew. If I'm correct, e.s. will first fill up the first disk, and so on, if you specify multiple data paths, vs. scattering over them like HDFS, correct? You overcome this by giving each e.s. instance a data path on a separate disk. Thanks. Regards, Ronny On 5 Sep 2014 02:30, "Andrew Selden

How to filter out duplicate documents across multiple types?

2014-09-04 Thread Anand Natarajan
We have certain documents stored across multiple types with translated values; for example, the US and ES types have the same document but with different values in the title fields. Example: US: { "title":"Manning: Spring in Action, Third Edition" } ES: { "title":"Manning : Primavera en Acción , Tercera E

How to filter out duplicate documents across types

2014-09-04 Thread Anand Natarajan
We have certain documents stored across multiple types with translated values; for example, the US and ES types have the same document but with different values in the title fields. Example: US: { "title":"Manning: Spring in Action, Third Edition" } ES: {

Re: Missing Documents

2014-09-04 Thread David Pilato
Does this really work? client().admin().indices().delete(new DeleteIndexRequest("_all")).actionGet(); Was expecting something like : client().admin().indices().delete(new DeleteIndexRequest("_all")).get(); Or client().admin().indices().delete(new DeleteIndexRequest("_all")).execute().actionGet()

Re: ElasticSearch store Index data externally

2014-09-04 Thread David Pilato
Why do you want to separate the data from the Elasticsearch process? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 4 Sept. 2014 at 12:47, "a.aneiros" wrote: Thanks for your answers, Yesterday I built an NFS folder, and set the ElasticSearch index path to this NFS folder, so wh

Re: ElasticSearch & DropBox

2014-09-04 Thread David Pilato
I can update the plugin if you need it, though I'm not sure when. Could you open an issue in the Dropbox plugin project? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 4 Sept. 2014 at 12:54, "a.aneiros" wrote: Hello, I've been reading about *Dropbox River*, and checked its docume

Re: ElasticSearch store Index data externally

2014-09-04 Thread a.aneiros
Thanks for your answers, Yesterday I built an NFS folder, and set the ElasticSearch index path to this NFS folder, so when I create an index and add data to it, ElasticSearch stores the information in this NFS folder (so the information is stored on an external server) but the speed is not the best

Re: Missing Documents

2014-09-04 Thread Luan Garrido
Hi, I'm waiting until it happens again, because it does not happen every time. I have just one PC in my cluster =D. Can you answer the other questions in the meantime? 1- I'm using the Java API, and I'm deleting the index using "client().admin().indices().delete(new DeleteIndexRequest("_all")).actionGet();". This is really r

ElasticSearch & DropBox

2014-09-04 Thread a.aneiros
Hello, I've been reading about *Dropbox River*, and checked its documentation on PilatoFr Dropbox but it seems this plugin is only available for ElasticSearch 0.20 and now I'm using 1.2.0. Does anyone know about some other plugin/feature that allows me to index D

Re: Elasticsearch High CPU high load when searching

2014-09-04 Thread 王星龙
I want to add a cache for ES requests, but this feature is coming in 1.4.0. On Thursday, September 4, 2014 at 6:29:06 PM UTC+8, Mark Walkom wrote: > > Your second two links do not work. > > Can you add another node, or close some old indexes? > > Regards, > Mark Walkom > > Infrastructure Engineer > Campaign Monitor > email: m

Re: writing a custom scoring plugin

2014-09-04 Thread Srinivasan Ramaswamy
Hi Joerg, I tried the data loading part as a separate module and it works, but i have a small issue with it. When there is an exception the constructor gets called multiple times. public class CustomDataModule extends AbstractModule { private final Settings settings; public SalesDataModu

Re: co-locate elasticsearch with hadoop/yarn

2014-09-04 Thread Andrew Selden
I have not tried this but my initial thoughts would be - Set ES_HEAP_SIZE = 30 GB, give Hadoop an appropriate amount, leave the rest for the OS cache. - Set the filesystem paths where ES and Hadoop store data to separate physical disk(s). You don't want them contending for bandwidth. - You don'

Re: shard recovery

2014-09-04 Thread Mark Walkom
Easiest way is to drop the replicas and then re-add them. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 5 September 2014 08:30, daa qqu wrote: > Hello everyone, > > I am new to ES and need some help with it: > >
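
Mark's suggestion can be sketched with the index update-settings API. This is only a minimal sketch: the host, the index name (taken from the original question in this thread), and the restored replica count of 1 are assumptions, and `replicas_body` and `set_replicas` are hypothetical helper names, not part of Elasticsearch.

```shell
# Sketch only: host, index name, and replica counts are assumptions.
ES_HOST="http://localhost:9200"
INDEX="wl-2014.09.04"

# Build the update-settings body for a given replica count.
replicas_body() {
  printf '{"index":{"number_of_replicas":%d}}' "$1"
}

# PUT the body to the index settings endpoint (hypothetical helper).
set_replicas() {
  curl -s -XPUT "$ES_HOST/$INDEX/_settings" -d "$(replicas_body "$1")"
}

# Usage, not executed here:
#   set_replicas 0   # drop the replicas
#   set_replicas 1   # re-add them so they are rebuilt from the primaries
```

Dropping to 0 discards the stuck replica copies; raising the count again asks the cluster to allocate fresh replicas from the primaries.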

shard recovery

2014-09-04 Thread daa qqu
Hello everyone, I am new to ES and need some help with it: - Elasticsearch version 1.0.1 ( I know I need to upgrade!) - Two node cluster - Cluster state yellow for last couple of hours - Two shards in INITIALIZING state. - One shard UNASSIGNED wl-2014.09.04 is an active index with new documents

Re: Learning optimal boost weight [ML]

2014-09-04 Thread Ivan Brusic
I have something similar which uses search analytics to determine relevant filters. No plugin or framework since everything works on the client side during the creation of the query. The process is far from ideal and is currently very conservative, providing only a slight boost. It does not work on

Re: Total number of documents be included in each query

2014-09-04 Thread David Pilato
You could try http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-multi-search.html Or this: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-global-aggregation.html -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @s
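
The second link (the global aggregation) could be used roughly as follows. This is a sketch under assumptions: the `status` field, its value, the `my_index` name, and the `all_docs` aggregation name are hypothetical and do not come from the original question.

```shell
# Sketch only: the "status" field, its value, and my_index are hypothetical.
ES_HOST="http://localhost:9200"

# Print a search body whose "global" aggregation ignores the query scope.
search_body() {
  cat <<'EOF'
{
  "size": 30,
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": { "term": { "status": "active" } }
    }
  },
  "aggs": {
    "all_docs": { "global": {} }
  }
}
EOF
}

# Usage, not executed here:
#   search_body | curl -s -XGET "$ES_HOST/my_index/_search?pretty" -d @-
```

In the response, `hits.total` would reflect the filtered query, while the `doc_count` of the `all_docs` bucket would count every document in the searched indices and types.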

Total number of documents be included in each query

2014-09-04 Thread kazoompa
Hi, I would like to know whether I can perform a query and get not only the total hits but also the total document count. Here is an example of one of my queries: { "from" : 0, "size" : 30, "query" : { "filtered" : { "query" : { "match_all" : { } }, "f

Elastic search 1.2.1: Snapshot not creating

2014-09-04 Thread manish kumar
I am running this command http://localhost:9200/_snapshot/titan_es_backup?pretty **output** { "titan_es_backup" : { "type" : "fs", "settings" : { "compress" : "true", "location" : "/home/manish/Softwares/DataBackUp/ES" } } } ag

Re: Missing Documents

2014-09-04 Thread Luan Garrido
Hi Vineeth, I got it working. I manually deleted the folders in $ES$/data/elasticsearch, but they were recreated every time. I took a look: we need to delete them from the project workspace and from the JBoss data directory. Remove these 3 directories and everything will work =D Thank you. On Thursday, 4 September

Plaso / Elasticsearch / Kibana - psort.exe -o list (not showing Elastic module installed)

2014-09-04 Thread David O'Neil
Hey, I'm working on building my first ElasticSearch / Kibana / Plaso instance. I'm running into an issue with the Elastic Output for psort -o list. When I download the build from the website and run the command, I don't see Elastic in the output. I'm running the command after

Re: Limiting buckets on histogram agg

2014-09-04 Thread Doug Nelson
Thanks vineeth, I +1'd these in the issues. I was able to solve this by using two queries, and in the histogram buckets I simply used a range filter to limit the number of buckets. Each bucket has the information I need, but I need to get the total count from the overall (without the

co-locate elasticsearch with hadoop/yarn

2014-09-04 Thread Ronny Vaningh
Hi, I have some beefy boxes with 512 GB of RAM and I would like to co-locate YARN/Hadoop with Elasticsearch. Does anyone have experience doing the same? How did you split the resources (memory/disk) across both functions? HDFS likes JBOD while E.S. likes RAID 10. Thank Reg

Re: How do I help the users understand some unexpected search hits (Or how can I do "highlighting" on _all)

2014-09-04 Thread Nikolas Everett
On Thu, Sep 4, 2014 at 1:41 PM, mooky wrote: > I am indexing some entities that have up to 140 fields in the resultant > document - ie lots. > I am providing a simple/powerful google-style search of such entities > using the _all field - however, to make the user's life easier, we do > prefix se

Re: Is there any way to get the total number of buckets a aggs generated ?

2014-09-04 Thread Adrien Grand
I think this is something the cardinality[1] aggregation can be used for? [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html On Thu, Sep 4, 2014 at 7:46 PM, mooky wrote: > > I tend to agree that knowing the total #
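
Adrien's suggestion might look like the sketch below: a `terms` aggregation returns the top buckets, and a sibling `cardinality` aggregation on the same field estimates how many distinct buckets exist in total. The `category` field and both aggregation names are hypothetical, and the cardinality result is approximate rather than exact.

```shell
# Sketch only: the "category" field and my_index are hypothetical.
ES_HOST="http://localhost:9200"

# Print a body that returns 10 buckets plus an estimate of the total bucket count.
buckets_body() {
  cat <<'EOF'
{
  "size": 0,
  "aggs": {
    "top_categories": { "terms": { "field": "category", "size": 10 } },
    "category_count": { "cardinality": { "field": "category" } }
  }
}
EOF
}

# Usage, not executed here:
#   buckets_body | curl -s -XGET "$ES_HOST/my_index/_search?pretty" -d @-
```

With 10 buckets shown and an estimated count of 25, a UI could display "15 more...", as described elsewhere in this thread.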

Re: should ES_HEAP_SIZE be less than 31G?

2014-09-04 Thread Jinyuan Zhou
Say I have two boxes for data nodes in the cluster: A and B. However, A has double the capacity of B. Now I can start two data nodes on A and one on B. With the help of the rack id attribute, ES will make sure a replica and its primary are not located on the same box. In this case, it seems to me that Mac

Re: Exists filter does not respect must_not bool filter

2014-09-04 Thread joergpra...@gmail.com
Yes, range filter operates on all fields. The missing/exists operation has been slightly changed in recent versions. For high cardinality fields, operations on the field content were very expensive. So, an optimization was introduced: each doc carries a list of the field names in a hidden field, a

Re: High CPU load during search(elasticsearch 1.2.1)

2014-09-04 Thread joergpra...@gmail.com
In your hot threads dump, you see the culprit, it has something to do with a plugin you use, not with Elasticsearch. com.clarabridge.elasticsearch.facet.sampling Ask the people who provided you with this software. Jörg On Thu, Sep 4, 2014 at 4:51 PM, Anton A wrote: > Hi, Everyone. I have n

Re: Is there any way to get the total number of buckets a aggs generated ?

2014-09-04 Thread mooky
I tend to agree that knowing the total # of buckets would be universally useful. Even though you may only want to "show" 10 buckets to the user, you may also want to show that there are "15 more..." - and then allow the user to expand on that to see all 25. On Tuesday, 2 September 2014 04:

How do I help the users understand some unexpected search hits (Or how can I do "highlighting" on _all)

2014-09-04 Thread mooky
I am indexing some entities that have up to 140 fields in the resultant document - ie lots. I am providing a simple/powerful google-style search of such entities using the _all field - however, to make the user's life easier, we do prefix searches. (e.g. rather than the user having to type "joh

Re: ClusterHealthStatus.GREEN for a single (embedded) node?

2014-09-04 Thread mooky
I thought it would be simple... Thanks! On Thursday, 4 September 2014 11:02:42 UTC+1, David Pilato wrote: > > Set number of replicas to 0. > > -- > *David Pilato* | *Technical Advocate* | *Elasticsearch.com* > @dadoonet | @elasticsearchfr >

Re: Exists filter does not respect must_not bool filter

2014-09-04 Thread ElasticRabbit
Hi Jörg, I was under the assumption that the range filter had to be used for numeric fields. But this works, thanks for the help. Could anyone enlighten me as to why the must_not bool filter doesn't respect the exists filter? Thanks, Ayush Sangani On Wednesday, September 3, 2014 5:20:54 PM UTC-4, Jörg Prante wrote

Re: How do I remove _index, _type, _id and _score from output?

2014-09-04 Thread Ivan Brusic
There is a plugin which can help: https://github.com/jprante/elasticsearch-index-termlist -- Ivan On Wed, Sep 3, 2014 at 11:47 AM, David Pilato wrote: > I don't think you can as far as I remember the same thread about it some > months ago. > > -- > David ;-) > Twitter : @dadoonet / @elasticse

Re: Is it possible to add yet another score value based on similarity (same words) to differentiate between two _scores ?

2014-09-04 Thread Ivan Brusic
Can you simply boost the non analyzed field? If the scores are still too similar, try using a dis_max query with the non analyzed query getting a higher boost: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-dis-max-query.html -- Ivan On Wed, Sep 3, 2014 at 7:16 A
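
A dis_max query along the lines Ivan describes might be sketched as follows. The field names are assumptions: `title` stands for the analyzed field and `title.raw` for a hypothetical not_analyzed sub-field; the query text and the boost value of 5 are placeholders to tune.

```shell
# Sketch only: "title", "title.raw", the query text, and the boost are hypothetical.
ES_HOST="http://localhost:9200"

# Print a dis_max body where the exact (non-analyzed) match carries a higher boost.
boost_body() {
  cat <<'EOF'
{
  "query": {
    "dis_max": {
      "queries": [
        { "match": { "title": "spring in action" } },
        { "term": { "title.raw": { "value": "Spring in Action", "boost": 5 } } }
      ]
    }
  }
}
EOF
}

# Usage, not executed here:
#   boost_body | curl -s -XGET "$ES_HOST/my_index/_search?pretty" -d @-
```

dis_max scores a document by its best-matching clause, so an exact hit on the non-analyzed field should pull ahead of documents that match only the analyzed field.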

Re: should ES_HEAP_SIZE be less than 31G?

2014-09-04 Thread Ivan Brusic
On Wed, Sep 3, 2014 at 11:47 AM, joergpra...@gmail.com < joergpra...@gmail.com> wrote: > > ES scales best over multiple machines horizontally, not vertically. More > RAM does not automatically mean better performance at linear scale at a > certain point - it depends on the JVM if it can keep up. >

How does percolation Notification work?

2014-09-04 Thread IronMan2014
I would like to know how others go about percolation notification. I have a couple of questions that I am not clear on. My setup: - I have a webApp that talks to the index server to do searches. - Index through TransportClient/Java/BulkProcessor Simple use case: Users register for interests in webapp

shard allocated for local recovery (post api), should exists, but doesn't (org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException)

2014-09-04 Thread Mathias Gawlista
Hi, after I uninstalled elasticsearch 0.9.x through "brew uninstall elasticsearch" I installed elasticsearch 0.20.6 through "brew install elasticsearch-0.20". When I start the server through elasticsearch -f -D es.config=/usr/local/opt/elasticsearch-0.20/config/elasticsearch.yml the server en

Issue creating S3 repository on ElasticSearch 1.3

2014-09-04 Thread Dylan Lingelbach
Hi, We are trying to migrate our data running on ES cluster running version 1.0.3 to a cluster running version 1.3.2. Our cluster on version 1.0.3 is snapshotted to S3. I would like to restore that snapshot to our cluster running 1.3.2. I am unable to create the S3 repository however on my

Learning optimal boost weight [ML]

2014-09-04 Thread NM
Hi, I have several fields in play in the query. Some fields are more important than others, which requires setting boost factors. I would like to automatically learn the optimal boost weight for each field based on a training data set. Is there a plugin or framework to do that nicely with ES?

High CPU load during search(elasticsearch 1.2.1)

2014-09-04 Thread Anton A
Hi, Everyone. I don't have much experience with Elasticsearch, but we have hit a performance problem in our environment and I need to resolve it or figure out what the problem is. Some background: we have multiple environments with the same configuration and only one has this issue. We are using elastic s

Re: Percolation when Indexing

2014-09-04 Thread IronMan2014
Any ideas why the percolator isn't showing me matches while indexing in Transport java client, but shows fine after indexing process is done by REST query command like so: GET /myIndex/type/1232/_percolate "matches": [ { "_index": "myIndex", "_id": "appealQuery" },

index size load from what file?

2014-09-04 Thread Jason Wee
Hello ES, With curl showing the index statistics as below: $ curl 'http://localhost:9200/_cat/indices?v' health index pri rep docs.count docs.deleted store.size pri.store.size green twitter 1 0 0 0 123b 123b 123b is the index size? and where does thi

Re: Missing Documents

2014-09-04 Thread vineeth mohan
1. How did you come to the conclusion that every index was deleted? If you are using head or some other UI, please paste a screenshot. 2. Which folder are you deleting to remove the data? 3. Please check whether multiple instances of Elasticsearch are running on your machine. Thanks

Re: multiple terms in filter in filtered query

2014-09-04 Thread Pontus Lundin
Ah, thanks for the hint, vineeth! On Thursday, September 4, 2014 at 04:31:31 UTC+2, vineeth mohan wrote: > > Hello Pontus , > > You need to use terms instead of term filter > > TERMS filter - > http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_most_important_queries_and_f
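
The difference vineeth points to can be sketched like this: a `terms` filter takes an array of values and matches documents containing any of them, whereas `term` takes a single value. The `tag` field, its values, and `my_index` are hypothetical.

```shell
# Sketch only: the "tag" field, its values, and my_index are hypothetical.
ES_HOST="http://localhost:9200"

# Print a filtered query that matches documents with any of the listed terms.
terms_body() {
  cat <<'EOF'
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": { "terms": { "tag": ["search", "lucene"] } }
    }
  }
}
EOF
}

# Usage, not executed here:
#   terms_body | curl -s -XGET "$ES_HOST/my_index/_search?pretty" -d @-
```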

Re: StrictDynamicMapping and MultiFields

2014-09-04 Thread John O'Gara
Is any extra information required? On Thursday, September 4, 2014 9:11:17 AM UTC+1, John O'Gara wrote: > > I'm having a problem with using strict dynamic mapping and multi-field > values. My multi fields work fine when I don't have StrictDynamicMapping > enabled but when I enable it I get the er

no node available exception after node shutdown

2014-09-04 Thread Emanuel Buzek
Hi folks, we have a two node cluster and we connect to it using the TransportClient from java by adding two hosts, i.e. ((TransportClient) client).addTransportAddress(new InetSocketTransportAddress("elastic1", port)); ((TransportClient) client).addTransportAddress(new InetSocketTransportAddress("e

multi routing in template aliases

2014-09-04 Thread 'Nicolas Fraison' via elasticsearch
Hi, Is there a way to set multi routing in a template alias like we can do in search requests? Ex. of a "working" multi-routing search request: curl -XGET 'http://localhost:9200/forum/_search?routing=secutenant,usertenant' -d '{...}' Here is the kind of template I would like to set but that doesn't

Re: Elasticsearch High CPU high load when searching

2014-09-04 Thread 王星龙
I just added the link to the second pic. Yes, I can add more nodes to my cluster, but I want to know whether I can do something to improve the performance, like GC or something else. What do you mean by "close the old indexes"? Do you mean delete the old indexes? If so, does the old indexes inf

Re: Elasticsearch High CPU high load when searching

2014-09-04 Thread 王星龙
I just added the link to the second pic. On Thursday, September 4, 2014 at 6:29:06 PM UTC+8, Mark Walkom wrote: > > Your second two links do not work. > > Can you add another node, or close some old indexes? > > Regards, > Mark Walkom > > Infrastructure Engineer > Campaign Monitor > email: ma...@campaignmonitor.com > web: w

Re: Elasticsearch High CPU high load when searching

2014-09-04 Thread Mark Walkom
Your second two links do not work. Can you add another node, or close some old indexes? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 4 September 2014 20:18, 王星龙 wrote: > Hi,everyone > I'm new to es ,i got a pro

Elasticsearch High CPU high load when searching

2014-09-04 Thread 王星龙
Hi, everyone. I'm new to ES. I have a problem with ES and have tried a lot but still don't know how to improve the situation, so I would be grateful if you could give me some advice on getting better search performance from ES. I built an ES cluster with 2 nodes in order to analyse logs for development engineers

Re: ClusterHealthStatus.GREEN for a single (embedded) node?

2014-09-04 Thread David Pilato
Set the number of replicas to 0. -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr On 4 September 2014 at 11:59:17, mooky (nick.minute...@gmail.com) wrote: I have a bit of an unusual situation - I am running a single embedded node of elastic. (it will later e

ClusterHealthStatus.GREEN for a single (embedded) node?

2014-09-04 Thread mooky
I have a bit of an unusual situation - I am running a single embedded node of elastic. (it will later expand to more nodes, but for now, this is the picture). Is there a way I can get a ClusterHealthStatus.GREEN status rather than ClusterHealthStatus.YELLOW ? Cheers

Re: _ttl inherited from index settings

2014-09-04 Thread Octavian
Thank you very much, this is what I'm looking for. On Monday, September 1, 2014 6:18:33 PM UTC+3, David Pilato wrote: > > You can do that I guess using index templates: > http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-templates.html#indices-templates > > -- > *Davi

Re: Confused as to what works - Percolator

2014-09-04 Thread Glenn Jacobs
Turns out if I percolate using a document which is already indexed it works! Problem solved :-) On 3 September 2014 23:56, Glenn Jacobs wrote: > OK I've just come back to this (4 months later!) and I'm still struggling. > > I've created the index, added suitable mappings, registered percolation >

Re: aggregations

2014-09-04 Thread Colin Goodheart-Smithe
Hi, Sounds like your problem might be your heap size is too low. How much memory have you assigned to your heap (i.e. what have you set as ES_HEAP_SIZE)? To perform aggregations, Elasticsearch has to load the values for a field for every document into memory in a data structure called field ca

Re: ElasticSearch store Index data externally

2014-09-04 Thread Bharvi Dixit
Hi, you need to create a shared file system so that the data path of server2 is accessible from server1, and then provide that data path in the elasticsearch.yml file. Once you create the shared file system, the data directory of server2 will appear as if it exists on server1 itself. Re

StrictDynamicMapping and MultiFields

2014-09-04 Thread John O'Gara
I'm having a problem with using strict dynamic mapping and multi-field values. My multi fields work fine when I don't have StrictDynamicMapping enabled but when I enable it I get the error "error": "StrictDynamicMappingException[mapping set to strict, dynamic introduction of [email.downcased