Re: Can i elastic search as my primary store?
Depends on how valuable your data is. I wouldn't say Elasticsearch is reliable enough yet to be considered a primary data store, even with 1.4. There is still a lot of work to be done on hardening replication.

On Sat, Oct 25, 2014 at 4:08 AM, Nikolas Everett nik9...@gmail.com wrote:
> I'd wait for 1.4 before considering it. There are lots of stability improvements there. One thing to consider is that updates are quite costly compared to Mongo, MySQL, or whatever.
>
> Nik
>
> On Fri, Oct 24, 2014 at 6:34 PM, Zennet Wheatcroft zwheatcr...@atypon.com wrote:
>> I have heard from the source: "Do not use Elasticsearch as a data store." But some people do, and it works OK. I would recommend that you use the snapshot and restore features, and back up your JSON file data so you can re-index in case your index gets corrupted. Also be careful when upgrading, especially between breaking versions.
>>
>> On Friday, October 24, 2014 2:32:56 PM UTC-7, Akram Hussein wrote:
>>> Is it a use case today to use Elasticsearch as a primary store, basically using it like MongoDB? Is that a use case the product is moving towards, or is it mostly just for search?

-- 
You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a6124515-fea1-47ab-9b0a-6718e4123164%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
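Zennet's advice to keep the original JSON so the index can be rebuilt could look roughly like the sketch below. It is only an illustration, not something posted in the thread: the index name `myindex`, the type name `doc`, the `id` field, and the helper `build_bulk_body` are all hypothetical. The function only builds a request body in the newline-delimited format that the `_bulk` endpoint expects; sending it to a cluster is left out.

```python
import json


def build_bulk_body(docs, index, doc_type, id_field="id"):
    """Build a newline-delimited _bulk body that re-indexes `docs`.

    `docs` is an iterable of dicts loaded from the JSON backup.
    `index`, `doc_type` and `id_field` are hypothetical names chosen
    for this sketch; adapt them to your own data.
    """
    lines = []
    for doc in docs:
        # Each document needs an action line followed by its source line.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc[id_field]}}
        lines.append(json.dumps(action, sort_keys=True))
        lines.append(json.dumps(doc, sort_keys=True))
    if not lines:
        return ""
    # The bulk format requires a newline after the last line as well.
    return "\n".join(lines) + "\n"


# Hypothetical backup content, standing in for the saved JSON files.
backup = [{"id": 1, "title": "first"}, {"id": 2, "title": "second"}]
body = build_bulk_body(backup, index="myindex", doc_type="doc")
print(body)
```

The resulting string would be POSTed to `/_bulk`. Large backups are better sent in several smaller batches than in one huge request.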
Re: High cpu load but low memory usage.
Please post the output of the hot threads API to a gist or paste page so we can understand your problem better.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html

Jörg

On Fri, Oct 24, 2014 at 10:43 AM, Atrus anhhu...@gmail.com wrote:
> Hi Pros,
>
> I've used ES for several months; it works perfectly and is as fast as lightning. There are 3 nodes in the cluster, each with a 12-core CPU and 24-32 GB of RAM.
> https://lh3.googleusercontent.com/-_uMqos5lNrA/VEoN2IA030I/E7Q/P-dzKwCA5Uo/s1600/es1.png
>
> For the last few days, CPU usage has been very high on all three nodes.
> https://lh4.googleusercontent.com/-vZ2nh9KaL5I/VEoOdnnLnpI/E7Y/hhKD9quUW04/s1600/es3.png
>
> Here is a sample record:
> https://lh4.googleusercontent.com/-CB5yFYk54i4/VEoOzF4RnDI/E7g/-Ko82c2hiC8/s1600/es4.png
>
> My questions:
> - Why is CPU usage so high while memory consumption is low, although I have set a big heap size?
> - There are 15 shards per index. Is this too many, or enough? I've used the default config. I know this could affect the load, but I don't know how to figure out the right number.
> - Is there any way to show the running queries, something like MySQL's SHOW PROCESSLIST, to see which queries eat a lot of CPU? I have enabled the slow query log at 1s but found nothing.
> - Any suggestion is appreciated. If you need more info, please tell me.
>
> Thank you so much.
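For readers who have not used the hot threads API that Jörg links to, here is a minimal sketch of how the report could be fetched. The base URL `http://localhost:9200` and the helper names are assumptions for illustration; `threads`, `interval`, and `type` are query parameters of the `_nodes/hot_threads` endpoint, shown here with their documented defaults.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def hot_threads_url(base_url, threads=3, interval="500ms", thread_type="cpu"):
    """Build the URL of the hot threads report for all nodes."""
    query = urlencode({"threads": threads, "interval": interval, "type": thread_type})
    return f"{base_url.rstrip('/')}/_nodes/hot_threads?{query}"


def fetch_hot_threads(base_url, **params):
    """Return the plain-text report. Needs a reachable node, so it is not called here."""
    with urlopen(hot_threads_url(base_url, **params)) as response:
        return response.read().decode("utf-8")


# Assumed address of one node in the cluster.
print(hot_threads_url("http://localhost:9200"))
```

The report is plain text rather than JSON, so it can be pasted into a gist as it is. Taking it a few times while the CPU is busy makes recurring stack traces easier to spot.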
Re: Aggregation buckets, with an additional key:value inside.
I maintain a mapping on the client side to do the lookups. Thankfully my taxonomy is static (but somewhat large). There is a PR to do server-side mappings, but it is quite old and I don't think it would apply to aggregations.

An alternative solution would be to create compound values such as 48885:Car Rental and decompose the value on the client side, but this would turn it into a string aggregation, which could be slower.

Cheers,

Ivan

On Fri, Oct 24, 2014 at 5:50 PM, Cody Stringham cs.nega...@gmail.com wrote:
> Hey everyone,
>
> These aggregations are working out great, but I need to return more than one value in each bucket so we can use them in our API. The basic idea is that we aggregate on all of the category IDs, but we also want the category_name to be included in that same bucket for ease of use.
>
> *Mapping:*
>
>     "categories": {
>       "properties": {
>         "category_name": { "analyzer": "keyword", "type": "string" },
>         "category_id": { "type": "integer" },
>         "parent_id": { "type": "integer" }
>       }
>     }
>
> *Aggs:*
>
>     "aggs": {
>       "categories": {
>         "terms": { "size": 130, "field": "categories.category_id" }
>       },
>
> *Returns (actual):*
>
>     "category_stats": [
>       { "category_id": 58, "offer_count": 48885 },
>       { "category_id": 1008, "offer_count": 44530 },
>       ...
>
> *Returns (desired):*
>
>     "category_stats": [
>       { "category_name": "Car Rental", "category_id": 58, "offer_count": 48885 },
>       { "category_name": "Fast Food", "category_id": 1008, "offer_count": 44530 },
>       ...
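Ivan's two suggestions can be sketched on the client side as follows. This is an illustration under assumptions, not his actual code: the helper names are made up, the buckets are assumed to have the usual `key` and `doc_count` fields of a terms aggregation response, and the compound key is assumed to have the form `category_id:category_name` (for example `58:Car Rental`).

```python
def enrich_with_lookup(buckets, names_by_id):
    """Option 1: join integer-keyed buckets with a client-side id -> name mapping."""
    return [
        {
            "category_name": names_by_id.get(bucket["key"]),
            "category_id": bucket["key"],
            "offer_count": bucket["doc_count"],
        }
        for bucket in buckets
    ]


def split_compound(buckets, separator=":"):
    """Option 2: decompose string keys of the assumed form 'id:name'."""
    result = []
    for bucket in buckets:
        # partition splits on the first separator only, so names may contain ':'.
        category_id, _, name = bucket["key"].partition(separator)
        result.append(
            {
                "category_name": name,
                "category_id": int(category_id),
                "offer_count": bucket["doc_count"],
            }
        )
    return result


# Bucket values taken from Cody's example; the response shape is assumed.
id_buckets = [{"key": 58, "doc_count": 48885}, {"key": 1008, "doc_count": 44530}]
compound_buckets = [
    {"key": "58:Car Rental", "doc_count": 48885},
    {"key": "1008:Fast Food", "doc_count": 44530},
]
taxonomy = {58: "Car Rental", 1008: "Fast Food"}

print(enrich_with_lookup(id_buckets, taxonomy))
print(split_compound(compound_buckets))
```

Both functions produce the "desired" shape from Cody's message. The lookup variant keeps the aggregation on the integer field, while the compound variant needs an extra string field indexed alongside the ID.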
Re: High cpu load but low memory usage.
On Friday, October 24, 2014 10:43:21 UTC+2, Atrus wrote:
> - There are 15 shards per index. Is this too many, or enough? I've used the default config. I know this could affect the load, but I don't know how to figure out the right number.

It's a huge value. Shards can be spread across nodes; do you plan to use 15 nodes?

> - Is there any way to show the running queries, something like MySQL's SHOW PROCESSLIST, to see which queries eat a lot of CPU? I have enabled the slow query log at 1s but found nothing.

You can watch the HTTP traffic with pcap (I hacked on packetbeat for that). That is the view from the outside; from the inside, use hot threads. strace can help, too.

> - Any suggestion is appreciated.

Do you poll the _nodes/stats URL, with a monitoring tool or a web page like kopf?
Re: Heavy load on a small Elasticsearch cluster
What monitoring tool do you use? Try to reduce the frequency at least.

The _nodes/stats?all URL is VERY slow for an Elasticsearch request, something like 1500 ms. Some tools, like kopf, poll it every 3 seconds. If your tool polls it too, every minute, you can break something.

_nodes/stats/indices is the slowest sub-part, and also the most interesting one.

Is it a regression in ES 1.3? I can't find any ES 1.1 node on my network to benchmark the difference.
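To put the figures above in perspective, here is a small back-of-the-envelope calculation. The 1500 ms and 3 second values come from the message; the model itself is an assumption (every poll costs the same and polls from different clients simply add up), and `busy_fraction` is a made-up helper name.

```python
def busy_fraction(request_ms, poll_interval_s, clients=1):
    """Fraction of wall-clock time spent serving stats polls.

    Simplified model: every poll takes `request_ms` and the polls of
    `clients` independent pollers add up. A value >= 1.0 means the
    polls alone would never finish before the next ones arrive.
    """
    return clients * request_ms / (poll_interval_s * 1000.0)


# One kopf tab polling every 3 seconds, each request taking 1500 ms.
print(busy_fraction(1500, 3))             # 0.5
# Two open browser tabs polling at the same rate.
print(busy_fraction(1500, 3, clients=2))  # 1.0
# A monitoring agent polling once a minute.
print(busy_fraction(1500, 60))
```

Under this model a single 3-second poller already keeps a stats request in flight half of the time, which is why reducing the polling frequency is the first thing to try.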
Sorting weirdness
I have a mapping like this:

    "venue": {
      "type": "nested",
      "include_in_parent": true,
      "properties": {
        "name": { "type": "string" }
      }
    }

If I'm sorting by 'venue.name' ascending, why would a name like 'Terminal 5' be sorted before 'B.B. King Blues Club Grill'? Does it have something to do with the number '5' in the name?
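One possible explanation, which nobody in this thread confirms: the mapping declares `name` as a plain `string`, so it is presumably analyzed, and sorting on an analyzed field works on the individual terms rather than on the original string. An ascending sort would then use the smallest term of each document. The sketch below imitates that with a rough stand-in for the analyzer (lowercasing and splitting on non-alphanumeric characters); it is not the real standard analyzer, and the function names are made up.

```python
import re


def rough_tokens(text):
    """Very rough stand-in for an analyzer: lowercase alphanumeric runs."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ascending_sort_key(text):
    """Assumed behaviour: an ascending sort compares the smallest term."""
    return min(rough_tokens(text))


names = ["B.B. King Blues Club Grill", "Terminal 5"]
for name in names:
    print(name, "->", rough_tokens(name), "min:", ascending_sort_key(name))

print(sorted(names, key=ascending_sort_key))
```

In this model 'Terminal 5' is represented by the term `5`, which sorts before any letter, so the digit would indeed be the cause. If that is what is happening, the usual remedy is to sort on a not_analyzed copy of the field.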
Aggregation on last element
Hi,

I have a type whose data looks like this:

    { "date": "2014-01-01", "element": "abc", "type": "A" },
    { "date": "2014-01-02", "element": "abc", "type": "B" },
    { "date": "2014-01-03", "element": "def", "type": "A" }

I'd like to group the data by element and count the elements whose LAST document by date has a type of A.

In this case, I want the result to be 1. The second document has the same element as the first and a later date, but its type is not A, so element abc should not be counted. The last document is the only one with element def, and its type is A.

I'm not sure this is even possible. Please note that the cardinality of element can be quite high (up to 20,000 different values).

Thank you in advance!
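To pin down the expected result, the requirement can be written as a few lines of client-side code. This is only a reference for the semantics, not an Elasticsearch aggregation and not a claim about how to do it server-side; the function name is made up, and dates are assumed to be ISO formatted strings so that they compare correctly as text.

```python
def count_elements_whose_last_type_is(docs, wanted_type):
    """Group docs by element, keep the latest doc of each group by date,
    and count the groups whose latest doc has `wanted_type`."""
    latest = {}
    for doc in docs:
        current = latest.get(doc["element"])
        # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
        if current is None or doc["date"] > current["date"]:
            latest[doc["element"]] = doc
    return sum(1 for doc in latest.values() if doc["type"] == wanted_type)


docs = [
    {"date": "2014-01-01", "element": "abc", "type": "A"},
    {"date": "2014-01-02", "element": "abc", "type": "B"},
    {"date": "2014-01-03", "element": "def", "type": "A"},
]
print(count_elements_whose_last_type_is(docs, "A"))
```

For the three sample documents this prints 1: the latest document for abc has type B, and the latest for def has type A. With around 20,000 distinct elements the dictionary stays small, but all matching documents still have to be streamed to the client.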
Re: High cpu load but low memory usage.
Thanks, Mathieu, for your response. I will try your suggestions.

> It's a huge value. Shards can be spread across nodes; do you plan to use 15 nodes?

Hi Mat,

- For example, if I have just one node with shards = 5 and replicas = 0, I can easily back up the data with: cp /var/lib/elasticsearch/nodename /somewhere/backup -rfp
- Now I add one more node, so the cluster has two nodes, shards = 5, replicas = 0. The shards are redistributed; maybe the 1st node holds shards 0, 2, 4 and the 2nd node holds 1, 3. How can I back up then? No node holds the whole data, so I cannot simply cp.
- If I set replicas = 1, each node now has all 5 shards, and I can easily cp a backup from any node.

If you know a better way to back up that can handle distributed shards, please let me know. Thank you.

PS: Can I reduce the number of shards from 5 to 4 without losing data?

On Sunday, October 26, 2014 12:57:40 AM UTC+7, Mathieu Lecarme wrote:
> On Friday, October 24, 2014 10:43:21 UTC+2, Atrus wrote:
>> - There are 15 shards per index. Is this too many, or enough? I've used the default config. I know this could affect the load, but I don't know how to figure out the right number.
>
> It's a huge value. Shards can be spread across nodes; do you plan to use 15 nodes?
>
>> - Is there any way to show the running queries, something like MySQL's SHOW PROCESSLIST, to see which queries eat a lot of CPU? I have enabled the slow query log at 1s but found nothing.
>
> You can watch the HTTP traffic with pcap (I hacked on packetbeat for that). That is the view from the outside; from the inside, use hot threads. strace can help, too.
>
>> - Any suggestion is appreciated.
>
> Do you poll the _nodes/stats URL, with a monitoring tool or a web page like kopf?
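The question about backing up shards that are spread over several nodes is what the snapshot and restore API (recommended in the primary-store thread above) is meant for: it snapshots the whole cluster, regardless of which node holds which shard. The sketch below only builds the two requests involved; the repository name `my_backup`, the snapshot name, the location path, and the helper names are hypothetical. For an `fs` repository the location must be a shared filesystem that every node can reach at the same path.

```python
import json


def register_fs_repository(repo, location):
    """Request that registers a shared-filesystem snapshot repository."""
    body = json.dumps({"type": "fs", "settings": {"location": location}})
    return ("PUT", f"/_snapshot/{repo}", body)


def create_snapshot(repo, snapshot, wait=True):
    """Request that snapshots all indices into `repo` under the name `snapshot`."""
    path = f"/_snapshot/{repo}/{snapshot}"
    if wait:
        # Block until the snapshot has finished instead of returning at once.
        path += "?wait_for_completion=true"
    return ("PUT", path, "")


# Hypothetical names and path, for illustration only.
for method, path, body in (
    register_fs_repository("my_backup", "/mount/backups/my_backup"),
    create_snapshot("my_backup", "snapshot_2014_10_26"),
):
    print(method, path, body)
```

Each tuple is a method, path, and body to send to any node of the cluster, for example with curl. Unlike copying a data directory, this does not depend on setting replicas = 1 so that one node happens to hold every shard.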
Re: Heavy load on a small Elasticsearch cluster
> Hi,
>
> On Friday, October 24, 2014 3:30:01 PM UTC-4, Mathieu Lecarme wrote:
>> On Friday, October 24, 2014 19:59:01 UTC+2, Jörg Prante wrote:
>>> You're doomed :) What monitoring tool do you use? Try to reduce the frequency at least.
>>>
>>> Jörg
>>
>> New Relic monitors the OS but doesn't touch ES.
>
> You may want to look at SPM http://sematext.com/spm/ for both. For now, maybe look at thread dumps (I'd actually create a few of them and compare them), GC, merge stats...
>
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/

I used a patched version of Diamond. I unplugged it and read the manual for the specific frequency setup. Some browsers had kopf running. I can watch the user agent to find out who is hurting the server. These tools were used for months without breaking anything.

I'm still suspicious about the recovery status.

M.