Re: Can i elastic search as my primary store?

2014-10-25 Thread shikhar
Depends on how valuable your data is.

I wouldn't say Elasticsearch is quite there yet to be considered a reliable
primary data store, even with 1.4. Lots of work to be done on hardening
replication.

On Sat, Oct 25, 2014 at 4:08 AM, Nikolas Everett nik9...@gmail.com wrote:

 I'd wait for 1.4 before considering it.  There are lots of stability
 improvements there.  One thing to consider is that updates are quite costly
 compared to Mongo/MySQL whatever.

 Nik

 On Fri, Oct 24, 2014 at 6:34 PM, Zennet Wheatcroft zwheatcr...@atypon.com
  wrote:

 I have heard it from the source: "Do not use Elasticsearch as a data store."
 But some people do, and it works OK. I would recommend that you use the
 snapshot and restore features, and back up your JSON source data so you can
 re-index in case your index gets corrupted. Also be careful when upgrading,
 especially between versions with breaking changes.
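
A minimal sketch of the snapshot recommendation above, using only the Python standard library. The node address, repository name, snapshot name, and filesystem location are placeholders, not values from this thread. For a multi-node cluster, the location is assumed to be a shared filesystem that every node can reach.

```python
import json
import urllib.request

# Assumptions (not from the thread): node address and repository name.
ES_URL = "http://localhost:9200"
REPO = "my_backup"


def register_repo_request(location):
    """Build the PUT request that registers a shared-filesystem repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return urllib.request.Request(
        f"{ES_URL}/_snapshot/{REPO}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def create_snapshot_request(name):
    """Build the PUT request that snapshots all indices and waits for the result."""
    return urllib.request.Request(
        f"{ES_URL}/_snapshot/{REPO}/{name}?wait_for_completion=true",
        method="PUT",
    )


def send(request):
    """Send a prepared request to the cluster and return the response body."""
    with urllib.request.urlopen(request) as resp:
        return resp.read().decode("utf-8")


# Usage against a live cluster (placeholder path and snapshot name):
#   send(register_repo_request("/mount/backups/my_backup"))
#   send(create_snapshot_request("snapshot_1"))
```

Building the requests separately from sending them keeps the sketch easy to inspect without a running cluster.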


 On Friday, October 24, 2014 2:32:56 PM UTC-7, Akram Hussein wrote:

 Is using Elasticsearch as a primary store a supported use case today,
 basically using it like MongoDB? Is that a use case the product is
 moving towards, or is it mostly just for search?

  --
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/a6124515-fea1-47ab-9b0a-6718e4123164%40googlegroups.com
 https://groups.google.com/d/msgid/elasticsearch/a6124515-fea1-47ab-9b0a-6718e4123164%40googlegroups.com?utm_medium=emailutm_source=footer
 .

 For more options, visit https://groups.google.com/d/optout.






Re: High cpu load but low memory usage.

2014-10-25 Thread joergpra...@gmail.com
Please post the result of hot threads action on a gist/paste page so we
can understand your problem better.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html
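
For reference, a small sketch of calling the hot threads action from Python. The node address is an assumption; `threads` and `interval` are optional query parameters of the hot threads API, and the values here are only examples.

```python
import urllib.parse
import urllib.request


def hot_threads_url(base="http://localhost:9200", threads=3, interval="500ms"):
    """Build the URL for the nodes hot threads action (base is a placeholder)."""
    query = urllib.parse.urlencode({"threads": threads, "interval": interval})
    return f"{base}/_nodes/hot_threads?{query}"


def fetch_hot_threads(url):
    """Return the plain-text hot threads report; paste it into a gist."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


# Usage against a live cluster:
#   print(fetch_hot_threads(hot_threads_url()))
```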

Jörg

On Fri, Oct 24, 2014 at 10:43 AM, Atrus anhhu...@gmail.com wrote:

 Hi pros,

 I've used ES for several months; it works perfectly and is lightning fast.

 There are 3 nodes in the cluster, each with a 12-core CPU and 24-32 GB of RAM.



 https://lh3.googleusercontent.com/-_uMqos5lNrA/VEoN2IA030I/E7Q/P-dzKwCA5Uo/s1600/es1.png

 For the past few days, CPU usage has been very high on all three nodes.


 https://lh4.googleusercontent.com/-vZ2nh9KaL5I/VEoOdnnLnpI/E7Y/hhKD9quUW04/s1600/es3.png

 Here is a sample record:


 https://lh4.googleusercontent.com/-CB5yFYk54i4/VEoOzF4RnDI/E7g/-Ko82c2hiC8/s1600/es4.png

 My questions are:

 - Why is CPU usage so high while memory consumption stays low, even though I
 have set a big heap size?

 - There are 15 shards per index. Is this too many, or just enough? I've used
 the default config. I know that this could affect the load, but I don't know
 how to figure out the right number.

 - Is there any way to show the running queries, something like MySQL's SHOW
 PROCESSLIST, to see which queries eat a lot of CPU? I have enabled the slow
 query log with a 1s threshold but found nothing.

 - Any suggestions are appreciated.

 If you need more info, please tell me.

 Thank you so much.





Re: Aggregation buckets, with an additional key:value inside.

2014-10-25 Thread Ivan Brusic
I maintain a mapping on the client side to do the lookups. Thankfully my
taxonomy is static (but somewhat large). There is a PR to do server-side
mappings, but it is quite old and I don't think it would apply to aggregations.

An alternative solution would be to create compound values such as
"58:Car Rental" and decompose the value on the client side, but this
would create a string aggregation, which could have slower performance.
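
Both approaches can be sketched in a few lines of Python. This is only an illustration: the `CATEGORY_NAMES` dictionary and the function names are hypothetical, and the buckets are assumed to have the usual terms-aggregation shape with `key` and `doc_count`.

```python
# Hypothetical client-side taxonomy: category_id -> category_name.
CATEGORY_NAMES = {58: "Car Rental", 1008: "Fast Food"}


def decorate_buckets(buckets, names=CATEGORY_NAMES):
    """Attach the category name to each terms-aggregation bucket."""
    return [
        {
            "category_name": names.get(b["key"]),
            "category_id": b["key"],
            "offer_count": b["doc_count"],
        }
        for b in buckets
    ]


def split_compound(key):
    """Decompose an 'id:name' compound bucket key into (id, name)."""
    category_id, _, category_name = key.partition(":")
    return int(category_id), category_name


buckets = [{"key": 58, "doc_count": 48885}, {"key": 1008, "doc_count": 44530}]
print(decorate_buckets(buckets))
print(split_compound("58:Car Rental"))
```

The lookup variant keeps the aggregation on the integer field; the compound variant avoids the client-side table at the cost of aggregating on strings.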

Cheers,

Ivan

On Fri, Oct 24, 2014 at 5:50 PM, Cody Stringham cs.nega...@gmail.com
wrote:

 Hey everyone,

 These aggregations are working out great, but I need to return more than
 one value in the bucket so we can use them in our API. The basic idea is
 that we aggregate all of the category IDs, but we also want the
 category_name to be included in that same bucket for ease of use.


 *Mapping:*
 "categories": {
   "properties": {
     "category_name": {
       "analyzer": "keyword",
       "type": "string"
     },
     "category_id": {
       "type": "integer"
     },
     "parent_id": {
       "type": "integer"
     }
   }
 }

 *Aggs:*
 "aggs": {
   "categories": {
     "terms": {
       "size": 130,
       "field": "categories.category_id"
     }
   },


 *Returns (actual):*

 "category_stats": [
   {
     "category_id": 58,
     "offer_count": 48885
   },
   {
     "category_id": 1008,
     "offer_count": 44530
   },

   ...



 *Returns (desired):*

 "category_stats": [
   {
     "category_name": "Car Rental",
     "category_id": 58,
     "offer_count": 48885
   },
   {
     "category_name": "Fast Food",
     "category_id": 1008,
     "offer_count": 44530
   },

   ...





Re: High cpu load but low memory usage.

2014-10-25 Thread Mathieu Lecarme


On Friday, October 24, 2014 10:43:21 UTC+2, Atrus wrote:

 - There are 15 shards per index. Is this too many, or just enough? I've used
 the default config. I know that this could affect the load, but I don't know
 how to figure out the right number.

That's a huge value. Shards can be spread across nodes; do you plan to use
15 nodes?

 - Is there any way to show the running queries, something like MySQL's SHOW
 PROCESSLIST, to see which queries eat a lot of CPU? I have enabled the slow
 query log with a 1s threshold but found nothing.

You can watch the HTTP traffic with pcap (I hacked packetbeat for that). That
is the view from the outside; from the inside, use hot threads. strace can
help, too.

 - Any suggestions are appreciated.

Do you poll the _nodes/stats URL with a monitoring tool, or a web page like kopf?



Re: Heavy load on a small Elasticsearch cluster

2014-10-25 Thread Mathieu Lecarme


What monitoring tool do you use? Try to reduce the frequency at least.


The _nodes/stats?all URL is VERY slow for an Elasticsearch request,
something like 1500 ms. Some tools like kopf poll it every 3 seconds. If
your tool also polls it every minute, you can break something.
_nodes/stats/indices is the slowest sub-part, and also the most interesting
one.

Is it a regression in ES 1.3? I can't find any ES 1.1 node on my network to
benchmark the difference.



Sorting weirdness

2014-10-25 Thread Michael Irwin
I have a mapping like this:

"venue": {
  "type": "nested",
  "include_in_parent": true,
  "properties": {
    "name": {
      "type": "string"
    }
  }
}

If I'm sorting by 'venue.name' ascending, why would a name like 'Terminal
5' be sorted before 'B.B. King Blues Club & Grill'? Does it have something
to do with the number '5' in the name?
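
One plausible explanation, sketched under assumptions: if the name field is an analyzed string (the default for this mapping), the sort works on the individual terms rather than the whole name, and an ascending sort on a multi-term value uses the smallest term. The tokenizer below is only a rough stand-in for a real analyzer (lowercase, split on non-alphanumerics), and the function names are hypothetical.

```python
import re


def approx_terms(text):
    """Rough stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def ascending_sort_key(text):
    """An ascending sort over several terms compares the smallest term."""
    return min(approx_terms(text))


names = ["B.B. King Blues Club & Grill", "Terminal 5"]
for name in sorted(names, key=ascending_sort_key):
    print(ascending_sort_key(name), "<-", name)
```

Under these assumptions 'Terminal 5' sorts on the term "5", which precedes any term starting with a letter. A common remedy is to sort on a not_analyzed copy of the field so that the whole name is compared.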



Aggregation on last element

2014-10-25 Thread Michaël Gallego
Hi,

I have a type whose data looks like this:

{
  "date": "2014-01-01",
  "element": "abc",
  "type": "A"
},
{
  "date": "2014-01-02",
  "element": "abc",
  "type": "B"
},
{
  "date": "2014-01-03",
  "element": "def",
  "type": "A"
}

I'd like to group the data by element and count the elements whose LAST
document by date has a type of A. In this case, I want the result to be 1:
for element abc, the second document is dated after the first, but as its
type is not A, I don't want it to be counted; for element def, the last
document is the only one and its type is A.
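
The intended result can be pinned down with a small client-side sketch in Python. This only illustrates the counting logic on the sample data; it is not an Elasticsearch aggregation, and the function name is hypothetical.

```python
docs = [
    {"date": "2014-01-01", "element": "abc", "type": "A"},
    {"date": "2014-01-02", "element": "abc", "type": "B"},
    {"date": "2014-01-03", "element": "def", "type": "A"},
]


def count_last_of_type(docs, wanted="A"):
    """Count elements whose most recent document has the wanted type."""
    latest = {}
    for doc in docs:
        current = latest.get(doc["element"])
        # ISO-8601 dates compare correctly as plain strings.
        if current is None or doc["date"] > current["date"]:
            latest[doc["element"]] = doc
    return sum(1 for doc in latest.values() if doc["type"] == wanted)


print(count_last_of_type(docs))  # prints 1
```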

I'm not sure this is even possible. Please note that the cardinality of 
element can be quite high (up to 20 000 different values).

Thank you in advance!



Re: High cpu load but low memory usage.

2014-10-25 Thread Atrus
Thanks, Mathieu, for your response. I will try your suggestions.

That's a huge value. Shards can be spread across nodes; do you plan to use
15 nodes?

Hi Mat,

- For example, if I have just one node with shards = 5 and replicas = 0, I can
easily back up the data with cp /var/lib/elasticsearch/nodename
/somewhere/backup -rfp

- Now I add one more node, so the cluster has two nodes, with shards = 5 and
replicas = 0. The shards are redistributed; maybe the 1st node holds 0 2 4 and
the 2nd node holds 1 3. => How can I back up? Each node no longer holds the
whole data set, so I cannot simply cp ...

- If I set replicas = 1, each node now has all 5 shards, and I can easily cp a
backup on any node.

If you know a better way to back up that can handle distributed shards,
please let me know.

Thank you.

PS: Can I reduce shards from 5 to 4 without losing data?

On Sunday, October 26, 2014 12:57:40 AM UTC+7, Mathieu Lecarme wrote:



 On Friday, October 24, 2014 10:43:21 UTC+2, Atrus wrote:

 - There are 15 shards per index. Is this too many, or just enough? I've used
 the default config. I know that this could affect the load, but I don't know
 how to figure out the right number.

 That's a huge value. Shards can be spread across nodes; do you plan to use
 15 nodes?

 - Is there any way to show the running queries, something like MySQL's SHOW
 PROCESSLIST, to see which queries eat a lot of CPU? I have enabled the slow
 query log with a 1s threshold but found nothing.

 You can watch the HTTP traffic with pcap (I hacked packetbeat for that). That
 is the view from the outside; from the inside, use hot threads. strace can
 help, too.

 - Any suggestions are appreciated.

 Do you poll the _nodes/stats URL with a monitoring tool, or a web page like
 kopf?





Re: Heavy load on a small Elasticsearch cluster

2014-10-25 Thread Otis Gospodnetic
Hi,

On Friday, October 24, 2014 3:30:01 PM UTC-4, Mathieu Lecarme wrote:



 On Friday, October 24, 2014 19:59:01 UTC+2, Jörg Prante wrote:

 You're doomed :) 

 What monitoring tool do you use? Try to reduce the frequency at least.

 Jörg


 New Relic monitors the OS but doesn't touch ES.


May want to look at SPM http://sematext.com/spm/ for both.

For now, maybe look at thread dump (I'd actually create a few of them and 
compare them), GC, merges stats...

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/


 I used a patched version of Diamond. I unplugged it, and I read the manual
 for how to set a specific polling frequency.
 There are also some browsers running kopf.

 I can watch the user agent to find out who is hurting the server.

 These tools were used for months without breaking anything. I'm still
 suspicious about the recovery status.

 M.

