High CPU on windows server 2012

2014-02-12 Thread Wesley Creteur
Hi everyone, I'm using elasticsearch for my webshop products to have a fast search/navigation and have approximately 5000 products (85000 documents indexed). I'm using elasticsearch as a service and it works fine, but it's eating my CPU. I think my use of elasticsearch is very minimal and still my

Re: get MapperParsingException failed to parse in 0.90.10

2014-02-12 Thread Stefan Sabolowitsch
Ivan, thank you for your help and good explanations. -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this

Re: Optimized Rest requests in case of embedded elastic search

2014-02-12 Thread joergpra...@gmail.com
I wrap up my Elasticsearch apps in a Wildfly web app with JAX-RS API, using the native java transport protocol. You do not need to use HTTP with Java app logs. Just write a log4j appender or something like this and connect this to a bulk indexer. Jörg

Filter by minimum value per parent (or term)

2014-02-12 Thread Manuel Brunner
Use Case - Hotels (~some thousand) with Offers (~some million) - Hotels and offers each have a lot of data where filters are applied - Sorting needs to be done by minimum offer price or some other ratings - Need to retrieve the cheapest offer per hotel (while matching all hotel

Design question - relationships across indices

2014-02-12 Thread Ludwig Magnusson
In my application I want to index user profiles and events. Each event is performed by a specific user at a specific time. I would like to be able to look at a specific event and see statistics on the users that have performed them. To make this possible I have done some initial experiments

Re: Data Loss

2014-02-12 Thread Brad Lhotsky
There’s also an edge-case with shard allocation during cluster restarts that can result in the loss of data if a shard is being re-allocated. I saw this behaviour in 0.90.1, recently upgraded to 0.90.10 and haven’t had a failure case like this yet.  My use case is logstash style daily indexes

Re: coordinate/projection system in geo point

2014-02-12 Thread Florian Schilling
Hi Francois, currently ES just supports the WGS84 projection, which is also the base for the GPS system. In the future we may allow custom projections. May I ask for your use-case? cheers, Florian On Wednesday, February 12, 2014 8:08:18 PM UTC+9, Francois Brunet wrote: Which

Re: coordinate/projection system in geo point

2014-02-12 Thread Francois Brunet
It was just to be sure about the projection system. I have no need to store another projection for the moment. WGS84 is the best choice :) Thanks a lot for your answer. On Wednesday, February 12, 2014 13:34:32 UTC+1, Florian Schilling wrote: Hi Francois, currently ES just supports the WGS84

Re: how to search on number of nested terms matches?

2014-02-12 Thread Binh Ly
I can't think of a way this can be done at the moment (unless of course the categories are finite and you can build a massive query using combinations of them). However, you can always precompute the distinct category count per author prior to indexing and then include it as an extra field in
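
The precompute suggestion above can be sketched roughly as follows. This is only an illustration under assumptions: the field names `books`, `category`, and `category_count` are hypothetical and do not come from the thread.

```python
def add_category_count(author_doc):
    """Return a copy of the document with a precomputed distinct-category count."""
    # "books" and "category" are hypothetical field names used for illustration.
    categories = {book["category"] for book in author_doc.get("books", [])}
    enriched = dict(author_doc)
    # Store the count as an extra field so it can be filtered on at query time.
    enriched["category_count"] = len(categories)
    return enriched


author = {
    "name": "Jane",
    "books": [{"category": "scifi"}, {"category": "scifi"}, {"category": "crime"}],
}
indexed = add_category_count(author)

# A range filter on the extra field then selects authors by distinct category count.
range_filter = {"range": {"category_count": {"gte": 2}}}
```

The trade-off is that the count has to be recomputed and the document reindexed whenever an author's categories change.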

Re: common terms query with cutoff_frequency

2014-02-12 Thread Alexander Reelsen
Hey, you may have used an analyzer which removed stopwords? Try with another one and see if that works... side note: The default analyzer of elasticsearch 0.90 removes stopwords, the default one of 1.x does not, so take care of the version you are trying this with. --Alex On Tue, Feb 11, 2014

Elasticseach query optimizations

2014-02-12 Thread Roopendra Vishwakarma
Is there any way to optimize a query in Elasticsearch? I am using the query below. It takes `15-20s` on average, and sometimes it is a little faster, `4-5s`. My server configuration :- Centos 6.3, 8 Core 16GB RAM { fields: [ _id, aff_id, post_uri, blog_cat,

Application wise dashboard in kibana.

2014-02-12 Thread pankaj ghadge
Hello All, I want to create an application-wise dashboard in Kibana, like: *1) Dashboard for domain example1.com:* It should show php, css, js, and jpg/png hit count for domain example1.com *2) Dashboard for domain example2.com:* It should show php, css, js, and jpg/png hit count for domain

Re: Correctly indexing data into one place with multiple analyzers

2014-02-12 Thread Binh Ly
I'm not sure your mapping actually does what you think/expect it to do. Actually, I don't believe you can combine multiple analyzed-already tokens from different fields into 1 field at all. Your best bet for correctness is probably just to leave all the multi-fields alone and then run queries

Re: define analyzer in a percolator query_string

2014-02-12 Thread Dunaeth
Btw, I found a different way to achieve my primary goal using a different percolate query: curl -XPUT 'http://127.0.0.1:9200/_percolator/tester/test1' -d '{ "query": { "query_string": { "fields": [ "uri_field" ], "query": "*test*" } } }' But it would be better if I could use simple query strings with

Re: elastic result search by text with nearest distance using latitude and longitude

2014-02-12 Thread Binh Ly
Oh I see, sounds like you want to sort by relevance and then have the distance factored into the relevance score also. You might want to take a look at the function_score query:
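
As a rough illustration of that idea, the sketch below builds a `function_score` request body in which a Gaussian decay on distance scales the text relevance score. The field names `name` and `location`, the `5km` scale, and the helper name are assumptions made for this example, not details from the thread.

```python
def text_with_distance_query(text, lat, lon, scale="5km"):
    """Build a search body whose text score is multiplied by a distance decay."""
    return {
        "query": {
            "function_score": {
                # Hypothetical text field; this part produces the relevance score.
                "query": {"match": {"name": text}},
                "functions": [
                    {
                        # Gaussian decay: the factor shrinks as the (hypothetical)
                        # "location" geo_point gets farther from the origin.
                        "gauss": {
                            "location": {
                                "origin": f"{lat},{lon}",
                                "scale": scale,
                            }
                        }
                    }
                ],
            }
        }
    }


body = text_with_distance_query("pizza", 52.5, 13.4)
```

A smaller `scale` makes distance dominate the ranking; a larger one lets text relevance dominate.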

Re: High CPU on windows server 2012

2014-02-12 Thread Binh Ly
May I ask what is your ES_MAX_MEM setting? Is it possible that it is set too low - like the default 1G?

Re: when set include_in_all to true, what's the value to be stored into _all for an analyzed field?

2014-02-12 Thread Binh Ly
It will be the original field value from the source document. And then it is analyzed by whatever analyzer you assign to the _all field.

Re: Data Loss

2014-02-12 Thread Binh Ly
FYI, ES has very frequent releases to fix bugs discovered by the community. If you find a data loss problem in your current install (and assuming it is indeed an ES problem), please try the latest build and see if it fixes it. Chances are it has already been discovered and fixed in the latest

[Hadoop] How to collect stats in elasticsearch MR job

2014-02-12 Thread Abhijit Bose
Hello, I would like to collect some stats on the entries being written when running a MapReduce job using the elasticsearch-hadoop library. I am using the default Mapper.class with a batch of entries in JSON files as input to MR, e.g. job.setInputFormatClass(TextInputFormat.class);

Re: Elasticseach query optimizations

2014-02-12 Thread Binh Ly
A couple of suggestions: 1) You probably want the range condition to go down to the filter part also (so bool it with the query_string filter) 2) The term (url.cat=sports) query can potentially move down to the filter section too (so bool it with the query_string filter) 3) The query_string/query
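
A minimal sketch of the first two suggestions, using the 0.90/1.x-era `filtered` query. The original query is truncated in this listing, so the exact structure is an assumption: the `published` date field and the helper name are hypothetical, and only `url.cat` comes from the thread.

```python
def build_filtered_search(text, category, date_from, date_to):
    """Keep the scored text search in the query; move exact conditions to a bool filter."""
    return {
        "query": {
            "filtered": {
                # Only the free-text part needs relevance scoring.
                "query": {"query_string": {"query": text}},
                # Term and range conditions do not need scoring, so they go
                # into a bool filter, which can also be cached.
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"url.cat": category}},
                            # "published" is a hypothetical date field.
                            {"range": {"published": {"gte": date_from, "lte": date_to}}},
                        ]
                    }
                },
            }
        }
    }


body = build_filtered_search("football", "sports", "2014-01-01", "2014-02-12")
```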

Re: High CPU on windows server 2012

2014-02-12 Thread Wesley Creteur
I'm using 3Gb. Is that sufficient?

Re: [Hadoop] How to collect stats in elasticsearch MR job

2014-02-12 Thread Costin Leau
We can introduce such counters. What exactly are you interested in? The default counters in Hadoop provide information on the amount of data read/written. Do you want to extract the information directly in Hadoop as opposed to ES proper? On 12/02/2014 5:13 PM, Abhijit Bose wrote: Hello, I

Re: how to search on number of nested terms matches?

2014-02-12 Thread Colin Surprenant
The number of categories is finite and relatively low count. As you are suggesting, querying for all combinations is an option as well as precomputing. I wanted to see if there was a way to do it efficiently at query time. Thanks, Colin On Wednesday, February 12, 2014 8:36:44 AM UTC-5, Binh

Re: Troubleshooting search load balancer node connectivity problems

2014-02-12 Thread Jelle Smet
Never mind, Root cause was a network issue which got identified. Setting keepalive in ES kernel params solved the problem. Cheers,

elasticsearch-zookeeper plugin update (0.90.11)

2014-02-12 Thread David Carlson
New patch for elasticsearch-zookeeper supports 0.90.11: https://github.com/sonian/elasticsearch-zookeeper/pull/21 My group recently chose to use zookeeper to manage elasticsearch clusters. I started using sonian/elasticsearch-zookeeper plugin, but ran into compatibility problems (supports

Distinct key value pairs

2014-02-12 Thread atoomkern
I would like to be able to search all fields for a certain string and get all distinct matching key value pairs as a result. It should also be possible to add filters/queries to constrain the results. The original data consists of millions of documents and a few thousand possible keys so I

Re: Spring-Data-Elasticsearch is ready to serve !

2014-02-12 Thread Mohsin Husen
Don't miss out: *Spring* *Data* *Elasticsearch* https://twitter.com/search?q=%23Elasticsearch&src=hash 1.0.0.M1 released https://spring.io/blog/2014/02/11/spring-data-elasticsearch-1-0-m1-released On Friday, 22 March 2013 15:56:48 UTC, Mohsin Husen wrote: Following support is added 1)

Re: Data Loss

2014-02-12 Thread Brad Lhotsky
Appreciated, but keep in mind large installations can’t just constantly upgrade.  And if ES is being used in critical infrastructure upgrading may mean many hours of recertification work with auditors and assessors.  The project is still relatively young, but just upgrade isn’t always

Re: Suggestion: DistanceUnit.NAUTICALMILES is a worthy addition

2014-02-12 Thread Ivan Brusic
I'm glad I was able to steer you in the right direction. I flubbed a PR recently since I have not used git consistently in the past few years, so I am glad someone else can learn from my mistakes. Your PR seemed to have gained some attention! :) Ivan On Tue, Feb 11, 2014 at 1:17 PM,

Re: Application wise dashboard in kibana.

2014-02-12 Thread Binh Ly
Pankaj: You should be able to pin a query on each dashboard to filter down to only the log events that you're interested in. So for example in your first case, you can pin a filter (query) like: vhost.raw:example1.com And everything in your current dashboard will narrow down to example1.com

custom_score in has_child

2014-02-12 Thread Luiz Carlos Jr
Hi all, I have the following parent/child/grandchild documents on my ES: DOC. TYPE: /users/user (parent is none) _source: { prefix: TST_LUIZ, id: c0338fde-981c-478a-bcbe-2548ab967dce, cookieids: [ b6d3c8b4-a075-49a3-9493-f1bf0f56312e ] } DOC. TYPE:

Re: [Hadoop] How to collect stats in elasticsearch MR job

2014-02-12 Thread A Bose
This is to capture the time taken by ES to process the items in that batch of records. Yes the total size written in bytes will already be in a MR counter. On Feb 12, 2014 8:30 AM, Costin Leau costin.l...@gmail.com wrote: We can introduce such counters. What exactly are you interested in? The

setFilter in Java API

2014-02-12 Thread Ryan Chazen
Hey I updated to 1.0.0, but I'm getting a strange issue: As from http://www.elasticsearch.org/guide/en/elasticsearch/client/java-api/current/search.html I tried to run SearchResponse response = client.prepareSearch(index1, index2) .setTypes(type1, type2)

Re: High CPU on windows server 2012

2014-02-12 Thread Binh Ly
I'd try to up it to 8GB, assuming of course you still have a lot (like close to 8GB) free. If that still doesn't work, once you get to the high CPU state, try to run this and it'll tell you which threads in ES are doing what with the CPU: curl localhost:9200/_nodes/hot_threads?pretty
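
For scripting the same check, here is a small sketch using only the Python standard library. The host, port, and function names are assumptions for illustration; the endpoint is the one from the curl command above, and its response is plain text rather than JSON.

```python
from urllib.request import urlopen


def hot_threads_url(host="localhost", port=9200):
    """Build the URL of the nodes hot_threads endpoint."""
    return f"http://{host}:{port}/_nodes/hot_threads"


def fetch_hot_threads(host="localhost", port=9200, timeout=10):
    """Return the plain-text hot threads report from a running node."""
    with urlopen(hot_threads_url(host, port), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example (requires a reachable node):
# print(fetch_hot_threads())
```

Capturing the report a few times while the CPU is high makes it easier to tell a persistent hot thread from a momentary spike.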

Re: setFilter in Java API

2014-02-12 Thread Ivan Brusic
The documentation has not been correct for version 1.0 [1]. The method should now be called setPostFilter. Better yet, you should look into filtered queries [2]. [1] https://github.com/elasticsearch/elasticsearch/pull/4461 [2]

Re: setFilter in Java API

2014-02-12 Thread Ryan Chazen
Ah great, thanks. Is this stackoverflow answer incorrect then, or still correct? http://stackoverflow.com/questions/14595988/queries-vs-filters Namely, which is more efficient: a query, a post filter, or a filter on a query? Eg, which one is the best? 1)

Re: setFilter in Java API

2014-02-12 Thread Ivan Brusic
The answer is still correct. What the git commit that I referenced essentially accomplished was to remove the ambiguity between the different filters. The filter that is part of a filtered query can be thought of as a prefilter. Here is the breakdown of what happens in your three cases: 1)
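
The three variants under discussion can be written out as request bodies, as sketched below. This is one assumed reading of the three cases, since the original question is truncated in this listing; the field names `title` and `status` are hypothetical.

```python
# Hypothetical clauses used in all three variants.
match = {"match": {"title": "elasticsearch"}}
term = {"term": {"status": "published"}}

# 1) Everything as a query: the term clause is scored like any other clause.
query_only = {"query": {"bool": {"must": [match, {"term": {"status": "published"}}]}}}

# 2) Post filter: the query runs first and the filter is applied to the hits
#    afterwards, so facets are still computed on the unfiltered query results.
post_filtered = {"query": match, "post_filter": term}

# 3) Filtered query: the filter narrows the candidate documents without
#    contributing to the score (the "prefilter" described above).
filtered = {"query": {"filtered": {"query": match, "filter": term}}}
```

When the filter only restricts results and facets should see the restricted set too, variant 3 is usually the one to reach for.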

Re: Data Loss

2014-02-12 Thread Tony Su
IMO evaluating this issue starts with applying the CAP Theorem, which in summary states that networked clusters with multiple nodes can offer only 2 of the following 3 desirable objectives: Consistency, Availability, Partition tolerance (data distributed across nodes). ES clearly does the

Re: Correctly indexing data into one place with multiple analyzers

2014-02-12 Thread Kevin Claggett
So what you are saying is, there is no way to aggregate together into one place all the tokens generated by one document? I mostly wanted to do this so that an end user doesn't have to understand what fields are in the document, or Lucene query syntax, to get the results they are looking for.

Re: custom_score in has_child

2014-02-12 Thread Binh Ly
Maybe try something like this (assuming frequency is available for all children)? GET /users/browser_TST_LUIZ/_search { min_score: 6, query: { has_child: { type: browser_TST_LUIZ_fr, score_type: sum, query: { custom_score: { query: {

Re: Using the elasticsearch logo r

2014-02-12 Thread Tony Su
Hello, Would like to ask if there is any update/change to using an Elasticsearch logo since this Q was originally asked a year ago. BTW - I notice that the current Elasticsearch website doesn't even display a logo... Tony On Sunday, March 25, 2012 4:48:27 AM UTC-7, kimchy wrote: Go

Re: Data Loss

2014-02-12 Thread Josh Harrison
I'm sure it isn't the case for everyone that is having data/shard problems, but I had some real trouble doing a full cluster restart on an 18 node cluster. Kinda nightmarish, actually, shards failing all over the place, lost data because of lost shards, etc. I finally realized that the

Re: setFilter in Java API

2014-02-12 Thread Ryan Chazen
Great, thank you, that makes it very clear. That explanation should be added to the query/filter/postfilter docs! On Wed, Feb 12, 2014 at 9:46 PM, Ivan Brusic i...@brusic.com wrote: The answer is still correct. What the git commit that I referenced essentially accomplished was to remove the

Re: Data Loss

2014-02-12 Thread Mohit Anchlia
Thanks for sharing this info. It's really helpful. In any case data loss shouldn't be acceptable to anyone, especially index corruption with no way to recover at all. I think one shouldn't confuse consistency with data loss as suggested in this thread. It's also good to hear that most of the bugs

Re: Data Loss

2014-02-12 Thread Tony Su
Josh, Your experience of recovering in only about 10 minutes is very interesting, because my little 5-node cluster/15GB data/3500 indices is taking about an hour to recover, and I know the bottleneck is the disk subsystem I'm currently on. I am curious: what is the total data size in your

Re: Data Loss

2014-02-12 Thread joergpra...@gmail.com
I use replica shard level 1, and always use latest ES version. I never had data loss, and that is also due to the fact I have access to dedicated real servers in our DC just a few meters away, and there are no servers at cloud server farms with unknown and unstable network environment. I do

Re: Data Loss

2014-02-12 Thread Josh Harrison
This particular cluster is 16 data nodes with SSD RAIDs connected to each other and the two master nodes with infiniband. Under 100 indexes and usually 3 shards per index with 1 replica. Overall data volume is in the 1TB range. I haven't tweaked the shard allocation settings from default. -Josh

Re: Data Loss

2014-02-12 Thread Mohit Anchlia
On Wed, Feb 12, 2014 at 1:58 PM, joergpra...@gmail.com joergpra...@gmail.com wrote: I use replica shard level 1, and always use latest ES version. I never had data loss, and that is also due to the fact I have access to dedicated real servers in our DC just a few meters away, and there are no

Alias across tribe node clusters?

2014-02-12 Thread Nelson Jeppesen
I want to scale ES between data centers for use with logstash, and tribe node(s) sound like a great solution to this. One problem is that I'm not sure how aliases work across clusters. Here's the general idea: two clusters and an alias index on the tribe cluster. Can you do this? Thanks for

Re: Faceted search returns no facets (if I'm using a match message query)

2014-02-12 Thread Björn Schmitt
Thanks for your help! { _index: myindex, _type: knowledge, _id: 1288, _version: 1, exists: true, _source: { id: 1288, question: Est-ce que je dois installer des applications ou des bibliotheques de KOMPLETE 9 a nouveau si celles-ci sont déja sur mon ordinateur, answer:

Elasticsearch 1.0.0 is now GA

2014-02-12 Thread Mark Walkom
I didn't see anything on the list, but there's a blog post about 1.0.0 hitting general availability! http://www.elasticsearch.org/blog/1-0-0-released/ Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com

[ANN] JDBC river 1.0.0.1 released

2014-02-12 Thread joergpra...@gmail.com
Hi, JDBC river plugin 1.0.0.1 for Elasticsearch 1.0.0 has been released. https://github.com/jprante/elasticsearch-river-jdbc Changes: - compiled against Elasticsearch 1.0.0 - refactored some classes for preparing the move to a more robust data gathering plugin - improved JSON building when

Outdated doc - what is the 1.0 equivalent?

2014-02-12 Thread Ben McCann
Hi, The method setFilter on this page no longer exists: http://www.elasticsearch.org/guide/en/elasticsearch/client/java-api/current/search.html Any tips on what that has changed to? Thanks, Ben

Re: How to change name?

2014-02-12 Thread Nick Chang
Hello, can I push to the same index but a different type with elasticsearch? Thanks On Tuesday, February 11, 2014 at 12:56:15 PM UTC+8, David Pilato wrote: Which process pull your data from MongoDB and MySQL and push to elasticsearch? If you can't do that at index time, then I guess you will need to manage

Re: Elasticsearch 1.0.0 is now GA

2014-02-12 Thread Mark Walkom
However it looks like the repos are still on RC2; markw@na0-esd-a-001:~$ sudo apt-get dist-upgrade Reading package lists... Done Building dependency tree Reading state information... Done Calculating upgrade... Done The following packages will be upgraded: elasticsearch 1 upgraded, 0 newly

Re: Outdated doc - what is the 1.0 equivalent?

2014-02-12 Thread Ben McCann
Thanks! I take it that's just a name change for clarity and not a functional change... Thanks, Ben On Wed, Feb 12, 2014 at 7:25 PM, Kevin Wang kevin807...@gmail.com wrote: It has been changed to setPostFilter(...) On Thursday, February 13, 2014 1:39:11 PM UTC+11, Ben McCann wrote: Hi,

Re: Elasticsearch 1.0.0 is now GA

2014-02-12 Thread Petter Abrahamsson
Mark, I believe this has to do with the way the packages have been named. I see the same issue with rpm packages. Rpm (and I believe dpkg) will consider 1.0.0.RC2 to be newer than 1.0.0. [root@dlpuppet01 rpmbuild]# rpmdev-vercmp 1.0.0-1 1.0.0.RC2-1 0:1.0.0.RC2-1 is newer
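
Why a package manager ranks 1.0.0.RC2 above 1.0.0 can be shown with a simplified segment-by-segment comparison. This is a sketch in the spirit of rpm's version ordering, not rpm's or dpkg's actual implementation: it ignores epochs, release suffixes such as `-1`, and special characters like `~`, and the function names are made up for this example.

```python
import re


def _segments(version):
    """Split a version into runs of digits and runs of letters, dropping separators."""
    return re.findall(r"\d+|[A-Za-z]+", version)


def simple_vercmp(a, b):
    """Return 1 if a is newer than b, -1 if older, 0 if equal (simplified rules)."""
    sa, sb = _segments(a), _segments(b)
    for x, y in zip(sa, sb):
        if x.isdigit() and y.isdigit():
            x, y = int(x), int(y)  # numeric segments compare as integers
        elif x.isdigit() != y.isdigit():
            # A numeric segment is treated as newer than an alphabetic one.
            return 1 if x.isdigit() else -1
        if x != y:
            return 1 if x > y else -1
    # All shared segments are equal: the version with segments left over wins,
    # which is why "1.0.0.RC2" sorts after "1.0.0".
    if len(sa) == len(sb):
        return 0
    return 1 if len(sa) > len(sb) else -1


result = simple_vercmp("1.0.0.RC2", "1.0.0")
```

Under these rules the extra `RC2` segments make the release candidate look like a later version, so the final 1.0.0 package is not offered as an upgrade.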

Re: Data Loss

2014-02-12 Thread Ivan Brusic
On Wed, Feb 12, 2014 at 1:58 PM, joergpra...@gmail.com joergpra...@gmail.com wrote: For my requirements, downtime of 15 min is acceptable. I can only wish! I run an ecommerce site, so my requirement is no downtime. Ever. -- Ivan

Re: How to change name?

2014-02-12 Thread David Pilato
I don't think you can. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On Feb 13, 2014, at 03:54, Nick Chang nick.ch...@kland.com.tw wrote: Hello If I push to same index, but different type with elasticsearch. Can I do it?? Thanks David Pilato, on

Re: Application wise dashboard in kibana.

2014-02-12 Thread pankaj ghadge
Hi Binh, Thanks for the reply. But I want to get the php, css, js and jpg counts for example1.com, so I think I need queries like the ones below: 1) vhost:example1.com AND *.php for php count. 2) vhost:example1.com AND *.css for css count. 3) vhost:example1.com AND *.js for js count. I think it