Data Node Allocation Weight
Hi, I'm in a bit of a crunch. We added some servers to our cluster whose processors are not as good as the older servers'. I'd rather not get into how we got swindled into accepting them, but it leaves me with imbalanced hardware between my nodes. The crunch comes in during peak load, when we begin to see slow queries, which lead to rejected queries once we max out the CPUs on the new machines. I'm trying to find a way to tell ES to put more shards on the more powerful nodes. The simplest idea would be giving a weight to servers, or something else short of moving shards around by hand, creating zones and forcing some indexes into them, etc. --Shannon Monasco
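There's no per-node weight setting that I know of, but shard allocation filtering on a custom node attribute gets close to the zones idea without hand-moving shards. A minimal sketch, assuming an invented attribute box_class and an index myindex:

# elasticsearch.yml on the stronger machines (box_class is an arbitrary attribute name)
node.box_class: strong

# pin an index's shards to the stronger nodes (dynamic per-index setting)
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index.routing.allocation.include.box_class": "strong"
}'

# alternatively, cap how many shards of this index any single node may hold,
# so hot indexes end up spread rather than piled on one box
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index.routing.allocation.total_shards_per_node": 2
}'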
term complexity and filter caching
Hi peeps, Leaf-level filter caching is nice, but caching complex filters is better if you can get some hit ratio. We have some items (query strings that may become bool filters later) with over 50 terms (sometimes hundreds) that get reused often. We essentially aggregate financial research for hundreds of sites, each of which has only paid for certain data, and this is the primary case for these complex and highly reused queries. However, we are looking at adding some leaf-level filter caching on other items, say an industry or a report type: fields with few permutations and high reuse. Elasticsearch employs an LRU policy on the filter cache, but the complex queries with high term complexity and lots of ANDs and ORs save far more time, CPU and memory per use than leaf-level caching on something like reporttype:research, so I'd like to add some weight to the more complex queries when determining what gets evicted. Thoughts? --Shannon Monasco
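Weighted eviction isn't something the filter cache offers, but on 1.x you can at least force the composite filter itself into the cache rather than only its leaves. A minimal sketch, assuming entitlements live in an invented site_id field:

curl -XGET 'localhost:9200/research/_search' -d '{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "bool": {
          "_cache": true,
          "should": [
            { "term": { "site_id": "site-001" } },
            { "term": { "site_id": "site-002" } }
          ]
        }
      }
    }
  }
}'

Setting _cache on the bool caches the combined bitset, so the expensive AND/OR work is paid once per reuse instead of being recomposed from leaf caches.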
Re: What is better - create several document types or several indices?
Every index has a minimum of one shard, and multiple types can live in the same shard. Shards carry maintenance overhead and slow down queries. However, if you have a lot of targeted queries, it is easier to reduce the number of shards a query touches by splitting data into separate indexes than by multi-tenanting one index. That said, a search request can take a comma-separated list of routing values, which helps when someone wants to query multiple log types at once. So it depends.
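A sketch of search-time routing with multiple values, assuming an invented index logs routed by app name; each comma-separated value maps to a shard, so only those shards are searched:

# index with a routing value per tenant / log type
curl -XPUT 'localhost:9200/logs/event/1?routing=app-a' -d '{ "msg": "started" }'

# search two routing values at once; only the matching shards are queried
curl -XGET 'localhost:9200/logs/event/_search?routing=app-a,app-b' -d '{
  "query": { "match_all": {} }
}'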
Re: Constant High (~99%) CPU on 1 of 5 Nodes in Cluster
What jumps out at me is that the CPU work you're doing seems to be very index related, the garbage collector on the errant machine is trying hard and not getting anywhere, and you have a lot of deleted docs. Tell us about your indexing strategy: things like routing, how bursty it is, and maybe why you have so many deleted docs.
Re: Diagnosing a slow query
10 to 20% of the time, queries take the better part of a second, even on a single node, and you indicate the queries are very similar in nature. For that second, do a lot of queries show up all at once, or are they spaced out? Do you see a spike in query requests? Do you see the CPU max out? Do you see queries get queued? Can you run the same exact query regularly and still see the problem, or does it only show up with a heterogeneous set? Is there disk IO contention when it happens?
Re: Rolling Upgrade from 1.2.1 to 1.3.0 – java.lang.IllegalArgumentException: No enum constant org.apache.lucene.util.Version.4.3.1
Do you happen to know if optimize will create a segment larger than 5 gigs?
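For context on why 5 gigs matters: the tiered merge policy's index.merge.policy.max_merged_segment defaults to 5gb and constrains background merges, but my understanding is that an explicit optimize with max_num_segments ignores that cap, so yes, it can produce much larger segments. A sketch, assuming an invented index myindex:

# force-merge down to a single (possibly very large) segment
curl -XPOST 'localhost:9200/myindex/_optimize?max_num_segments=1'

# inspect the resulting segment sizes
curl -XGET 'localhost:9200/myindex/_segments?pretty'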
Re: Clustering/Sharding impact on query performance
Your query looks weird: it says filtered but has no filter, and it specifies boost but has no sort. I would remove the filtered wrapper, either remove the boost or explicitly sort on score, and make sure I was using index_options of offsets (which also stores term frequencies and positions, pretty much everything precomputed for the TF-IDF calculation).
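A sketch of that mapping change, assuming an invented string field headline on a type article; note that index_options has to be set when the field is first created, it can't be changed on an existing field:

curl -XPUT 'localhost:9200/myindex/article/_mapping' -d '{
  "article": {
    "properties": {
      "headline": {
        "type": "string",
        "index_options": "offsets"
      }
    }
  }
}'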
Re: How can I make this search requirement work?
A little late to the party, but I would have used a custom index analyzer of lowercase + pattern + edge ngram, and a search analyzer of lowercase + pattern (you may have to flip lowercase and pattern). With the pattern tokenizer you can specify a regex.
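A sketch of that analysis setup on 1.x, with invented names and an invented regex; the pattern tokenizer splits on the regex, and the edge-ngram filter is applied only at index time:

curl -XPUT 'localhost:9200/myindex' -d '{
  "settings": {
    "analysis": {
      "tokenizer": {
        "my_pattern": { "type": "pattern", "pattern": "[\\W_]+" }
      },
      "filter": {
        "my_edge": { "type": "edgeNGram", "min_gram": 2, "max_gram": 15 }
      },
      "analyzer": {
        "my_index_analyzer": {
          "tokenizer": "my_pattern",
          "filter": ["lowercase", "my_edge"]
        },
        "my_search_analyzer": {
          "tokenizer": "my_pattern",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "name": {
          "type": "string",
          "index_analyzer": "my_index_analyzer",
          "search_analyzer": "my_search_analyzer"
        }
      }
    }
  }
}'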
Re: High memory usage on dedicated master nodes
Maybe I'm missing something, but if you give Java a max heap size (-Xmx), it will use it, if only to store garbage. The trough after a garbage collection is usually more indicative of what is actually in use. This looks like a CMS/ParNew setup: ParNew is fast and only briefly blocking, but leaves stuff behind; CMS blocks more but is really good at taking out the garbage.
Re: elasticsearch dies every other day
Is the non-data node a client node? Here we are counting master-eligible nodes. Whether you have 4 or 5, I would go with minimum master-eligible nodes of 3. I'm not sure what your replica setup is, but only 2 nodes is probably not a healthy cluster. With this set to 3 you won't have a split brain.
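A sketch of setting it, either statically in elasticsearch.yml or at runtime through the cluster settings API:

# elasticsearch.yml
discovery.zen.minimum_master_nodes: 3

# or dynamically, without a restart
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "persistent": { "discovery.zen.minimum_master_nodes": 3 }
}'

With 4 or 5 master-eligible nodes, 3 is a strict majority, so two partitioned halves can never both elect a master.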
Re: elasticsearch dies every other day
Do you have thread counts, and do any of them correlate to the crash times? I'm guessing we'll find index threads leap up.
Re: What do these metrics mean?
So it doesn't appear that the sum of current queries = K * active search threads. In other words, they are not proportional. Current queries were flat, and search.active jumped two orders of magnitude.

We have highly varied indexes and searches. Some indexes have only one shard with one replica, and some have 40 shards with 1 replica. Some queries are nothing more than a search string with no boosting, ordered by a date, with no facets. Others have 6 facets, boosts, interesting analyzers, and are ordered by score. I'm sure some searches use more resources than others, but do they expand to multiple threads? I would imagine one could easily expand this out at the shard (or maybe segment) level. I could also imagine that facets could get their own threads once the reduced document set was determined.

--Shannon Monasco

P.S. If you don't know about version 0.19.3 but know about 0.90+ or even 1.x, that's fine. I'll take the answers I can get.

On Friday, July 11, 2014 9:58:03 PM UTC-6, smonasco wrote: Thanks Ivan!

On Jul 11, 2014 5:56 PM, Ivan Brusic wrote: The default in 0.90 (not sure about 0.19.x) should still be a fixed search thread pool: http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/modules-threadpool.html I find that current queries and active threads tend to be the same number. When I look at my graphs, they have the same movements. If you do not have any queued requests, you might want to set a lower cap if your cluster is experiencing slowness. BTW, the best course of action you can take is to simply upgrade and not worry about thread settings. The Lucene 4 improvements (Elasticsearch 0.90, I believe) were monumental in the space savings. New versions offer tons of other improvements, but the 0.90 release was especially great! Cheers, Ivan

On Fri, Jul 11, 2014 at 3:09 PM, Shannon Monasco wrote: I have never seen any queuing on search threads. Not sure if 0.19 defaults to a cached pool or not, but that's the behavior I see. What about current queries from indices stats?

On Jul 11, 2014 2:53 PM, Ivan Brusic wrote: Your second paragraph is correct. The threads value is the total number of search threads at your disposal, active is the number of ongoing threads, and queue is the number of requests that cannot run because your thread pool is exhausted, which should be when active == threads, but that is not always the case. The default number of search threads is based upon the number of processors (3x the number of available processors). There is no good metric for determining a balance, since searches can be either lightweight (milliseconds) or heavyweight (minutes), but I would argue that the key metric to monitor is your queue. Is it normally empty? Spiky behavior? Requests constantly queued? http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html Cheers, Ivan

On Fri, Jul 11, 2014 at 1:19 PM, smonasco wrote: I should probably preface everything with: I'm running a 5-node cluster on version 0.19.3 and should be up to version 1.1.2 by the middle of August, but I have some confusion around metrics I'm seeing, what they mean, and what good values are. In thread_pools I see threads, active and queued. Queued + active != threads. I assume this really is a work pool: you have active threads, a thread count in the work pool, and queued work. So some explanation around this would be nice. I've correlated some spikes in search threads with heap memory utilization explosions. Current searches sort of also correlate, but I have more current searches than search threads and there is no search threadpool queueing, so I'm not sure how current searches correlate (or if they should/do) with search threads. I've observed the following. Devastating: 10,000 current searches on our worst index sustained over hours without much change, ending at the same time as spikes of 1,000 search threads (where we generally average 50) and a heap explosion. Oddly OK: current searches averaging 15 on the worst index spiking to 105, with search threads averaging 50 with maxes of 300, spiking to averages of 120 and maxes of 1,000. So... I guess, what are good ranges for search threads and current searches? --Shannon Monasco
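For reference, a sketch of pulling these numbers and of pinning the search pool to a fixed size on 0.90/1.x (the size and queue values below are placeholders, not recommendations):

# per-node thread pool stats: threads, active, queue, rejected
curl -XGET 'localhost:9200/_nodes/stats/thread_pool?pretty'

# elasticsearch.yml: fixed search pool with an explicit queue
threadpool.search.type: fixed
threadpool.search.size: 48
threadpool.search.queue_size: 1000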
Re: Upgrade 0.26.6 - 1.2.2 any catches?
@Ivan, what do you mean the stores are throttled at a very low level? Do you mean index threads, or merge factor, or what?
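For anyone who finds this later: I believe this refers to the store-level merge throttle, which capped merge IO at a modest default (20mb/s). A sketch of inspecting the idea and raising it through the dynamic cluster settings API:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "persistent": {
    "indices.store.throttle.type": "merge",
    "indices.store.throttle.max_bytes_per_sec": "40mb"
  }
}'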
Re: elasticsearch dies every other day
You are certainly having a heap utilization issue. As utilization gets close to 100%, the GCs get aggressive. Is your heap utilization staying close to the edge? Does it jump up at the end? What about the field cache? What about hot threads? 200+ documents per second is a heavier indexing rate than you need; you may want to try scaling back index threads. Otherwise, if it's not a huge spike, you may need more memory.
Re: Reading and writing the same document too fast -- data loss
Sounds like you're searching and using the results to reindex the doc. Search is not real time, but get is, so you could get the document after the search. Also make sure your application isn't stepping on itself: either serialize the updates or use some form of locking.
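A sketch of the difference, with invented index/type names:

# index a document; it is immediately visible to GET...
curl -XPUT 'localhost:9200/myindex/doc/1' -d '{ "counter": 1 }'
curl -XGET 'localhost:9200/myindex/doc/1'

# ...but not to search until the next refresh
curl -XGET 'localhost:9200/myindex/doc/_search' -d '{
  "query": { "term": { "_id": "1" } }
}'

# optimistic versioning can catch concurrent writers stepping on each other:
# this write fails if someone else already bumped the doc past version 1
curl -XPUT 'localhost:9200/myindex/doc/1?version=1' -d '{ "counter": 2 }'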
What do these metrics mean?
I should probably preface everything with: I'm running a 5-node cluster on version 0.19.3 and should be up to version 1.1.2 by the middle of August, but I have some confusion around metrics I'm seeing, what they mean, and what good values are. In thread_pools I see threads, active and queued. Queued + active != threads. I assume this really is a work pool: you have active threads, a thread count in the work pool, and queued work. So some explanation around this would be nice. I've correlated some spikes in search threads with heap memory utilization explosions. Current searches sort of also correlate, but I have more current searches than search threads and there is no search threadpool queueing, so I'm not sure how current searches correlate (or if they should/do) with search threads. I've observed the following. Devastating: 10,000 current searches on our worst index sustained over hours without much change, ending at the same time as spikes of 1,000 search threads (where we generally average 50) and a heap explosion. Oddly OK: current searches averaging 15 on the worst index spiking to 105, with search threads averaging 50 with maxes of 300, spiking to averages of 120 and maxes of 1,000. So... I guess, what are good ranges for search threads and current searches? --Shannon Monasco
Re: Recommended Hardware Specs Sharding\Index Strategy
I wonder if you might get better performance via parent/child relationships. Reindexing a large nested document would mean many Lucene deletes and inserts every week, whereas adding new child documents based on this past week's info should be cheaper in index operations, and probably everywhere else too, since nothing has to load the entire 1.74 MB document when a child is likely hundreds of times smaller.
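A sketch of the parent/child shape, with invented type names:

# the child type declares its parent in the mapping
curl -XPUT 'localhost:9200/myindex/weekly_report/_mapping' -d '{
  "weekly_report": {
    "_parent": { "type": "account" }
  }
}'

# each week, add a small child instead of reindexing the big document
curl -XPUT 'localhost:9200/myindex/weekly_report/2014-w29?parent=account-42' -d '{
  "week": "2014-29",
  "metrics": { "views": 1234 }
}'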
Re: Visibility
Thanks. I'll have to check that out.
Visibility
Hi, I'm trying to get a lot more visibility and metrics into what's going on under the hood. Occasionally we see spikes in memory. I'd like to get heap memory used on a per-shard basis. If I'm not mistaken, somewhere, somehow, the Lucene index that backs a shard is using memory in the heap, and I'd like to collect that metric. There may also be an operation somewhere higher up at the Elasticsearch level where we merge results from shards or from indexes (maybe Elasticsearch doesn't bother to merge twice but merges once); that's also a memory space I'd like to collect data on. I think per-query memory use would also be interesting, though perhaps obviously too much to track for every query (maybe a future opt-in feature, unless it's already there and I'm missing it).

Other cluster events, like nodes entering and exiting the cluster or the master changing, would also be nice to collect. I'm guessing some of this isn't available and some of it is, but my Google-fu seems to be lacking. I'm pretty sure I can poll to figure out that these events happened, but I was wondering if there is something in the Java client node where I could get a Future or some other hook to turn it into a push instead of a pull. Any help will be appreciated; I'm aware it's a wide net. --Shannon Monasco
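For the pull side, a sketch of the 1.x stats endpoints that come closest to per-shard memory (index name invented; level=shards breaks the index stats out per shard):

# per-shard segment and fielddata memory for one index
curl -XGET 'localhost:9200/myindex/_stats/segments,fielddata?level=shards&pretty'

# JVM heap per node, for correlating the spikes
curl -XGET 'localhost:9200/_nodes/stats/jvm?pretty'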
Re: How to see all document types for a given index?
I believe http://localhost:9200/index/_mapping will give you the types. It is an indirect method, for sure, but that kind of metadata is going to be in memory already and won't touch the fielddata cache.
Re: Elasticsearch Memory issue
My understanding was that ES 1.1 can use memory-mapped files for the index store, which is off-heap, but I believe the field data cache itself still lives on the heap unless you opt into doc values.
Facet cache size and other memory metrics
Hi, we're having problems with some nodes hitting the maximum heap size, and we're looking into ways to get visibility into the field cache impact of different indexes/shards. Any suggestions? --Shannon Monasco
Re: Facet cache size and other memory metrics
For instance, I think I remember some plugin that would give you an idea how big an impact a facet might have on your field cache, and I think that was supposed to become part of Elasticsearch itself, but I may be dreaming.
Re: Facet cache size and other memory metrics
This has something like what I'd like to find, in its "Cache stats per field" section: https://github.com/bleskes/elasticfacets , but I'm unsure if it's any good for 1.x.
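For what it's worth, 1.x does expose per-field fielddata sizes natively, which covers much of what that plugin's per-field cache stats did. A sketch:

# fielddata bytes broken down by field, per node
curl -XGET 'localhost:9200/_nodes/stats/indices/fielddata?fields=*&pretty'

# the same breakdown per index
curl -XGET 'localhost:9200/_stats/fielddata?fields=*&pretty'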
Re: Extremely Simple Java Request
I rolled my own.
Extremely Simple Java Request
I've been through the API documentation and the Elasticsearch source code. I'm looking to create a service that, when given a list of (URL path, JSON path, metric type), will take the value found at the JSON path from the JSON returned by the URL in my cluster and put it into our metrics system. I'm trying to find something really simple in the Java space that will give me a JSON object of some variety, or a response object I can call .toXContent on, when I pass it the URL. Not sure if I'm explaining this well, but something like:

    Request request = new SimpleRequest(url);
    Response response = client.get(request).actionGet();
    XContentBuilder responseContent = XContentFactory.jsonBuilder();
    response.toXContent(responseContent, null);

If I need to build it myself that's OK. I'd just like to know it's not built, and maybe have a pointer or two. --Shannon Monasco
Elasticsearch/Lucene Delete space reuse? recovery?
I'm starting a project to index log files. I don't particularly want to wait until the log files roll over. There will be files from hundreds of apps running across hundreds of machines (not all apps intersect with all machines, but you get the drift). Some roll over very fast; some may take days. The problem is that if I am constantly reindexing the same document (same id), am I losing all the old space (store and/or index), or is Elasticsearch/Lucene smart enough to say, here's a new version, we'll overwrite the old store/index entries where they are the same, point to this one, and add the new ones? Certainly there is a more sophisticated model that treats every line as a unique document/row so this doesn't become an issue, but I'm not ready to spend that kind of dev and hardware on it. (Our Elasticsearch solution is wrapped in a system that becomes really heavy-handed when indexing such small pieces.) --Shannon Monasco
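If it helps: my understanding is that Lucene never overwrites in place; each reindex of the same id writes a fresh copy and tombstones the old one, and the space only comes back when merges drop the deleted docs. A sketch of watching and forcing that reclamation on 1.x, assuming an invented index named logs:

# see how many deleted docs are hanging around
curl -XGET 'localhost:9200/logs/_stats/docs?pretty'

# merge only the segments with lots of deletes, reclaiming their space
curl -XPOST 'localhost:9200/logs/_optimize?only_expunge_deletes=true'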
Percolate Existing Documents Seems Broke
Maybe I'm off, or I missed something where this was covered, but...

curl -XPUT 'localhost:9200/test'

curl -XPUT 'localhost:9200/test/.percolator/1' -d '{
  "query": { "query_string": { "query": "headline:\"apples\"" } },
  "attributes": { "query_name": "headline apples" }
}'

curl -XGET 'localhost:9200/test/test/_percolate' -d '{
  "doc": { "headline": "apples" }
}'

I get the match I expect.

curl -XPUT 'localhost:9200/test/test/1' -d '{
  "doc": { "headline": "apples" }
}'

curl -XGET 'localhost:9200/test/test/1/_percolate'

No match.

--Shannon Monasco

P.S. I am using elasticsearch-1.0.1 and running it on my local Windows workstation. I can roll the yml back to the most basic one if that helps.
Re: Too Many Open Files
Sorry to have taken so long to reply. So I went ahead and followed your link. I'd been there before, but decided to give it a deeper look. What I actually found, however, was that bigdesk told me the max open files limit the process was running with, and from there I was able to determine that my settings in limits.conf were not being honored, even though if I switched to the context Elasticsearch runs under I would get the appropriate limits. I then dug into the service script and found someone had dropped a ulimit statement into it that was overriding the limits.conf setting. Thank you, Shannon Monasco

On Wednesday, January 22, 2014 10:09:42 AM UTC-7, Ivan Brusic wrote: The first thing to do is check if your limits are actually being persisted and used. The elasticsearch site has a good writeup: http://www.elasticsearch.org/tutorials/too-many-open-files/ Second, it might be possible that you are reaching the 128k limit. How many shards per node do you have? Do you have non-standard merge settings? You can use the status API to find out how many open files you have. I do not have a link since it might have changed since 0.19. Also, be aware that it is not possible to do rolling upgrades between nodes on different major versions of Elasticsearch. The underlying data will be fine and does not need to be upgraded, but the nodes will not be able to communicate with each other. Cheers, Ivan

On Tue, Jan 21, 2014 at 7:42 AM, smonasco wrote: Sorry, wrong error message.

[2014-01-18 06:47:06,232][WARN ][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to accept a connection.
java.io.IOException: Too many open files
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:226)
    at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:227)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)

The other posted error message is newer and seems to follow the too many open files error message. --Shannon Monasco

On Tuesday, January 21, 2014 8:35:18 AM UTC-7, smonasco wrote: Hi, I am using version 0.19.3. I have the nofile limit set to 128K and am getting errors like:

[2014-01-18 06:52:54,857][WARN ][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to initialize an accepted socket.
org.elasticsearch.common.netty.channel.ChannelException: Failed to create a selector.
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:154)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.register(AbstractNioWorker.java:131)
    at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.registerAcceptedChannel(NioServerSocketPipelineSink.java:269)
    at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:231)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: Too many open files
    at sun.nio.ch.IOUtil.makePipe(Native Method)
    at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:65)
    at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36)
    at java.nio.channels.Selector.open(Selector.java:227)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:152)
    ... 7 more

I am aware that version 0.19.3 is old. We have been having trouble getting our infrastructure group to build out new nodes so we can do a rolling upgrade with testing on both versions. I am now setting the limit to 1048576 as per http://stackoverflow.com/questions/1212925/on-linux-set-maximum-open-files-to-unlimited-possible; however, I'm concerned this may cause other issues. If anyone has any suggestions, I'd love to hear them. I am using this as fuel for the "please pay attention and get us the support we need so we can upgrade" campaign. --Shannon Monasco
Re: Too Many Open Files
Sorry, wrong error message.

[2014-01-18 06:47:06,232][WARN ][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to accept a connection.
java.io.IOException: Too many open files
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:226)
    at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:227)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)

The other posted error message is newer and seems to follow the too many open files error message. --Shannon Monasco

On Tuesday, January 21, 2014 8:35:18 AM UTC-7, smonasco wrote: Hi, I am using version 0.19.3. I have the nofile limit set to 128K and am getting errors like:

[2014-01-18 06:52:54,857][WARN ][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to initialize an accepted socket.
org.elasticsearch.common.netty.channel.ChannelException: Failed to create a selector.
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:154)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.register(AbstractNioWorker.java:131)
    at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.registerAcceptedChannel(NioServerSocketPipelineSink.java:269)
    at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:231)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.IOException: Too many open files
    at sun.nio.ch.IOUtil.makePipe(Native Method)
    at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:65)
    at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36)
    at java.nio.channels.Selector.open(Selector.java:227)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:152)
    ... 7 more

I am aware that version 0.19.3 is old. We have been having trouble getting our infrastructure group to build out new nodes so we can do a rolling upgrade with testing on both versions. I am now setting the limit to 1048576 as per http://stackoverflow.com/questions/1212925/on-linux-set-maximum-open-files-to-unlimited-possible; however, I'm concerned this may cause other issues. If anyone has any suggestions, I'd love to hear them. I am using this as fuel for the "please pay attention and get us the support we need so we can upgrade" campaign. --Shannon Monasco