Data Node Allocation Weight

2015-03-20 Thread smonasco
Hi,

I'm in a bit of a crunch.  We have some servers we added to our cluster 
whose processors were not as good as the older servers.  I'd rather not get 
into how we got swindled into accepting them, but this leaves me in a place 
with imbalanced hardware between my nodes.

The crunch bit comes in that during our peak load we begin to have slow 
queries which lead to rejected queries once we max out the CPUs on the new 
machines.

I'm trying to find a way to tell ES to put more shards on the more powerful 
nodes.

The simplest idea would be if I could give a weight to servers, or something along those lines, rather than try to move shards around by hand, create zones and force some indexes into them, etc.
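
For reference, the zone/filtering approach I'd otherwise be forced into looks roughly like this on 1.x (box_type and busy_index are made-up names): tag the stronger machines with a node attribute in elasticsearch.yml, then pin the heavy indexes to them.  As far as I can tell there is no per-node weight setting, which is why I'm asking.

# elasticsearch.yml on the stronger machines
node.box_type: strong

# pin a heavy index to those machines
curl -XPUT 'localhost:9200/busy_index/_settings' -d '
{
    "index.routing.allocation.require.box_type": "strong"
}'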

--Shannon Monasco



term complexity and filter caching

2014-10-28 Thread smonasco
Hi peeps,

Leaf level filter cache is nice, but caching complex filters is better if 
you can get some hit ratio.

We have some items (querystrings that may become bool filters later) with 
over 50 terms (sometimes hundreds) that get reused often.  We essentially 
aggregate financial research for hundreds of sites, each of which have only 
paid for certain data and this is the primary case for these complex and 
highly reused queries.

However, we are also looking at adding some leaf-level filter caching on 
other items, say an industry or a report type: fields with few permutations 
and high re-use.

Elasticsearch employs an LRU policy on filter cache.

However, the complex queries with high term complexity and lots of ANDs 
and ORs save far more time, CPU and memory per use than leaf-level 
caching of something like reporttype:research, so I'd like to add some 
weight to the more complex queries when determining what gets evicted.
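
For concreteness, a cached version of one of these complex filters would look roughly like this on 1.x (index and field names are placeholders); the _cache flag asks Elasticsearch to keep the bitset for the whole bool filter rather than just the leaf terms:

curl -XGET 'localhost:9200/research/_search' -d '
{
    "query": {
        "filtered": {
            "query": { "match_all": {} },
            "filter": {
                "bool": {
                    "_cache": true,
                    "must": [
                        { "terms": { "site_id": ["site-001", "site-002", "site-003"] } },
                        { "term": { "report_type": "research" } }
                    ]
                }
            }
        }
    }
}'

As far as I know the eviction policy itself is plain LRU, with no way to weight entries by how expensive they were to build, which is exactly what I'm after.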

Thoughts?

--Shannon Monasco



Re: What is better - create several document types or several indices?

2014-09-11 Thread smonasco
Every index has a minimum of one shard, while multiple types can live in the same 
shard.  Shards carry maintenance overhead, and every extra shard touched slows a 
query down.  However, if you have a lot of targeted queries, it is easier to reduce 
the shards accessed by splitting into indexes and querying only the relevant ones 
than it is with a multi-tenant (multi-type) index.  I could be missing something, 
but I don't think you can have multiple routing values in a query, and someone may 
well want to query multiple log types.

So it depends.
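
As a rough sketch of the targeted-query case (index names invented), searching only the relevant per-app indexes keeps the number of shards touched down:

curl -XGET 'localhost:9200/logs-app1,logs-app2/_search' -d '
{
    "query": { "match": { "message": "timeout" } }
}'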



Re: Constant High (~99%) CPU on 1 of 5 Nodes in Cluster

2014-08-03 Thread smonasco
What jumps out at me is that the CPU work you're doing seems to be very 
index-related, your garbage collections are trying hard on the errant machine 
without getting anywhere, and you have a lot of deleted docs.

Tell us about your indexing strategy: things like routing, how bursty it is, 
and maybe why you have so many deleted docs.
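
If you're on a 1.x cluster, a quick way to eyeball both (deleted docs and merge activity) is:

curl 'localhost:9200/_cat/indices?v'              # docs.count vs docs.deleted per index
curl 'localhost:9200/_stats/docs,merge?pretty'    # deleted doc counts and merge stats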



Re: Diagnosing a slow query

2014-08-02 Thread smonasco
So 10 to 20% of the time queries take the better part of a second, even when only 
on one node, and you indicate the queries are very similar in nature.

For that second do a lot of queries show up all at once or are they spaced out?
Do you see a spike in query requests?
Do you see the CPU max out?
Do you see queries get queued?
Can you run the same exact query regularly and still see the problem, or do you 
only get the issue with a heterogeneous set of queries?
Is there IO disk contention when they happen?
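
If you can catch it in the act, a couple of quick checks along those lines (hot_threads assumes a reasonably recent version):

curl 'localhost:9200/_nodes/hot_threads'   # what the CPUs are actually busy doing
iostat -x 5                                # disk utilization/await during the slow second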



Re: Rolling Upgrading from 1.2.1 to 1.3.0 – java.lang.IllegalArgumentException: No enum constant org.apache.lucene.util.Version.4.3.1

2014-07-28 Thread smonasco
Do you happen to know if optimize will create a segment larger than 5 gigs?



Re: Clustering/Sharding impact on query performance

2014-07-20 Thread smonasco
Your query looks weird: it says filtered but has no filter, and it specifies a 
boost but has no sort.  I would remove the filtered wrapper, either remove the 
boost or force the sort on score, and ensure I was using index options of 
offsets (this also includes term frequencies and doc counts and pretty much 
everything precomputed for the TF-IDF scoring).
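
The mapping bit would look roughly like this (index, type and field names are placeholders; index_options has to be set when the field is first created):

curl -XPUT 'localhost:9200/myindex/_mapping/mytype' -d '
{
    "mytype": {
        "properties": {
            "body": { "type": "string", "index_options": "offsets" }
        }
    }
}'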



Re: How can I make this search requirement work?

2014-07-16 Thread smonasco
A little late to the party, but I would have used a custom index analyzer with 
lowercase, pattern and edgengram, and a search analyzer of lowercase and pattern 
(maybe you have to flip lowercase and pattern).

With the pattern tokenizer you can specify a regex.
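
Something along these lines is what I had in mind (the regex, gram sizes and field names are just placeholders):

curl -XPUT 'localhost:9200/myindex' -d '
{
    "settings": {
        "analysis": {
            "tokenizer": {
                "my_pattern": { "type": "pattern", "pattern": "[\\W_]+" }
            },
            "filter": {
                "my_edge": { "type": "edgeNGram", "min_gram": 1, "max_gram": 20 }
            },
            "analyzer": {
                "my_index_analyzer": {
                    "type": "custom",
                    "tokenizer": "my_pattern",
                    "filter": ["lowercase", "my_edge"]
                },
                "my_search_analyzer": {
                    "type": "custom",
                    "tokenizer": "my_pattern",
                    "filter": ["lowercase"]
                }
            }
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "title": {
                    "type": "string",
                    "index_analyzer": "my_index_analyzer",
                    "search_analyzer": "my_search_analyzer"
                }
            }
        }
    }
}'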



Re: High memory usage on dedicated master nodes

2014-07-16 Thread smonasco
Maybe I'm missing something, but if you give Java a maximum heap it will use it, 
if only to store garbage.  The trough after a garbage collection is usually more 
indicative of what is actually in use.

This looks like a CMS/ParNew setup.  ParNew is fast and low-blocking, but 
leaves stuff behind.  CMS blocks, but is really good at taking out the garbage.



Re: elasticsearch dies every other day

2014-07-14 Thread smonasco
Is the non-data node a client node?  Here we are counting master-eligible 
nodes.  Whether you have 4 or 5, I would go with minimum master-eligible nodes 
of 3.  I'm not sure what your replica setup is, but only 2 nodes is probably not 
a healthy cluster.

With this set to 3 you won't have a split brain.
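
Concretely, on each master-eligible node (or dynamically through the cluster settings API):

# elasticsearch.yml
discovery.zen.minimum_master_nodes: 3

curl -XPUT 'localhost:9200/_cluster/settings' -d '
{
    "persistent": { "discovery.zen.minimum_master_nodes": 3 }
}'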



Re: elasticsearch dies every other day

2014-07-14 Thread smonasco
Do you have thread counts and do any of them correlate to the crash times?  I'm 
guessing that we'll find index threads leap up.



Re: What do these metrics mean?

2014-07-14 Thread smonasco
So it doesn't appear that the sum of current queries = K * active search 
threads.  In other words, they are not proportional.

Currentqueries was flat, and search.active jumped 2 orders of magnitude. 
 We have highly varied indexes and searches.  Some indexes have only one 
shard with one replica and some have 40 shards with 1 replica.  Some 
queries are nothing more than a search string with no boosting ordered by a 
date and with no facets.  Others have 6 facets, boosts, interesting 
analyzers and ordered by score.

I'm sure some searches use more resources than others, but do they expand 
to multiple threads?  I would imagine one could easily expand this out at 
the shard (or maybe segment) level.  I could also imagine that facets could 
get their own threads once the reduced document set was determined.

--Shannon Monasco

P.S. If you don't know about version 0.19.3 but know about 0.90+ or even 
1.x, that's fine.  I'll take whatever answers I can get.

On Friday, July 11, 2014 9:58:03 PM UTC-6, smonasco wrote:

 Thanks Ivan!
 On Jul 11, 2014 5:56 PM, Ivan Brusic i...@brusic.com wrote:

 The default in 0.90 (not sure about 0.19.) should still be a fixed search 
 thread pool: 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/modules-threadpool.html

 I find that the current queries and active threads tends to be the same 
 number. When I look at my graphs, they have the same movements.

 If you do not have any queued requests you might want to set a lower cap 
 if your cluster is experiencing slowness.

 BTW, the best course of action that you can take is to simply upgrade and 
 not worry about thread settings. The Lucene 4 improvements (Elasticsearch 
 0.90 I believe) were monumental in the space savings. New versions will 
 offer tons of other improvements, but the 0.90 release was especially great!

 Cheers,

 Ivan


 On Fri, Jul 11, 2014 at 3:09 PM, Shannon Monasco smona...@gmail.com 
 wrote:

 I have never seen any queuing on search threads.  Not sure if in .19 it 
 defaults to cache or not but that's the behavior I see.

 What about current queries from indices stats?
 On Jul 11, 2014 2:53 PM, Ivan Brusic i...@brusic.com wrote:

  Your second paragraph is correct. The threads are the total number of 
 search threads at your disposal, active is the number of ongoing threads 
 and queue are the number of threads that cannot be run since your thread 
 pool is exhausted, which should be when active == threads, but not always 
 the case.

 The default number of search threads is based upon the number of 
 processors (3x # of available processors). There is no good metric for 
 determining a balance since searches can be either lightweight 
 (milliseconds) or heavyweight (minutes), but I would argue that the key 
 metric to monitor is your queue. Is it normally empty? Spiky behavior? 
 Requests constantly queued?


 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html

 Cheers,

 Ivan


 On Fri, Jul 11, 2014 at 1:19 PM, smonasco smona...@gmail.com wrote:

 I should probably preface everything with I'm running a 5 node cluster 
 with version 0.19.3 and should be up to version 1.1.2 by the middle of 
 August, but I have some confusion around metrics I'm seeing, what they 
 mean 
 and what are good values.

 In thread_pools I see threads, active and queued.  Queued + active != 
 threads.  I assume this really is a work pool and you have active 
 threads, 
 a thread count in the work pool and queued work.  So some explanation 
 around this would be nice.

 I've correlated some spikes in search threads with heap mem 
 utilization explosions.  current searches sort of also correlate, but I 
 have more current searches than search threads and there is not search 
 threadpool queueing.

 I'm not sure how current searches correlate (or if they should/do) 
 with search threads.

 I've observed the following:

 Devastating: 10,000 current searches on worst index sustained over 
 hours with not much change, ending at the same time as spikes of > 1000 
 search threads (where we generally average < 50) and a heap explosion.

 Oddly OK: current searches averaging 15 on worst index spiking to 105 
 with search threads averaging 50 with maxes of 300, spiking to averages of 
 120 and maxes of > 1000


 So...  I guess, what are good ranges for search threads and current 
 searches?

 --Shannon Monasco


Re: Upgrade 0.26.6 - 1.2.2 any catches?

2014-07-14 Thread smonasco
@Ivan, what do you mean by the stores being throttled at a very low level?  Do you 
mean index threads, or merge factor, or what?



Re: elasticsearch dies every other day

2014-07-13 Thread smonasco
You are certainly having a heap utilization issue.  As the utilization gets 
close to 100%, the GCs get aggressive.

Is your heap utilization staying close to the edge?  Does it jump up at the 
end?  What about field cache?  What about hot threads?

Indexing 200+ documents per second is a heavier rate than you need to sustain; 
you may want to try scaling back index threads.

Otherwise, if it's not a huge spike, you may need more memory.



Re: Reading and writing the same document too fast -- data loss

2014-07-13 Thread smonasco
It sounds like you're searching and using the results to reindex the doc.  Search 
is not real time, but get is, so you could get the document after the search.  
Also make sure your application isn't stepping on itself and either runs in 
serial or has some sort of locking.
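
A rough sketch of what I mean (index/type/id and the version number are made up): fetch with a realtime GET, then reindex conditionally on the version you read, so a concurrent writer shows up as a 409 conflict instead of silently lost data.

curl 'localhost:9200/myindex/mytype/1?pretty'

curl -XPUT 'localhost:9200/myindex/mytype/1?version=7' -d '
{
    "headline": "updated copy"
}'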



What do these metrics mean?

2014-07-11 Thread smonasco
I should probably preface everything with I'm running a 5 node cluster with 
version 0.19.3 and should be up to version 1.1.2 by the middle of August, 
but I have some confusion around metrics I'm seeing, what they mean and 
what are good values.

In thread_pools I see threads, active and queued.  Queued + active != 
threads.  I assume this really is a work pool and you have active threads, 
a thread count in the work pool and queued work.  So some explanation 
around this would be nice.

I've correlated some spikes in search threads with heap mem utilization 
explosions.  current searches sort of also correlate, but I have more 
current searches than search threads and there is not search threadpool 
queueing.

I'm not sure how current searches correlate (or if they should/do) with 
search threads.

I've observed the following:

Devastating: 10,000 current searches on worst index sustained over hours 
with not much change, ending at the same time as spikes of > 1000 search 
threads (where we generally average < 50) and a heap explosion.

Oddly OK: current searches averaging 15 on worst index spiking to 105, with 
search threads averaging 50 with maxes of 300, spiking to averages of 120 
and maxes of > 1000


So...  I guess, what are good ranges for search threads and current 
searches?

--Shannon Monasco



Re: Recommended Hardware Specs Sharding\Index Strategy

2014-07-03 Thread smonasco
I wonder if you might get some better performance via parent-child 
relationships.  Reindexing a large nested document would mean many Lucene 
deletes and inserts every week, but adding new data based on this past week's 
info as child documents should be cheaper in index operations, and probably in 
every other place that would otherwise have to load the entire 1.74 MB document, 
since the weekly piece is probably something like 365 times smaller.
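
Roughly what I'm picturing (index, type and field names invented): map the weekly data as a child type, then add small child documents instead of reindexing the big one.

curl -XPUT 'localhost:9200/metrics' -d '
{
    "mappings": {
        "account": {
            "properties": { "name": { "type": "string" } }
        },
        "week": {
            "_parent": { "type": "account" }
        }
    }
}'

curl -XPUT 'localhost:9200/metrics/week/2014-26?parent=account-42' -d '
{
    "visits": 1200
}'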



Re: Visibility

2014-07-03 Thread smonasco
Thanks.  I'll have to check that out. 





Visibility

2014-07-02 Thread smonasco
Hi,

I'm trying to get a lot more visibility and metrics into what's going on 
under the hood.

Occasionally, we see spikes in memory.  I'd like to get heap memory used on a 
per-shard basis.  If I'm not mistaken, somewhere somehow, the Lucene index 
that is a shard is using memory in the heap, and I'd like to collect that metric.

It may also be an operation somewhere higher up at the Elasticsearch level, 
where we are merging results from shards or results from indexes (maybe 
Elasticsearch doesn't bother to merge twice but merges once); that's also a 
memory space I'd like to collect data on.

I think a per query mem use would also be something interesting, though, 
perhaps obviously too much to keep up with for every query (maybe a future 
opt-in feature, unless it's already there and I'm missing it).

Other cluster events like nodes entering and exiting the cluster or the 
changing of the master would be nice to collect.

I'm guessing some of this isn't available and some of it is, but my 
Google-Fu seems to be lacking.  I'm pretty sure I can poll to figure out that 
the events happened, but I was wondering if there is something in the Java 
client node where I could get a Future or some other hook to turn it into a 
push instead of a pull.

Any help will be appreciated.  I'm aware it's a wide net though.

--Shannon Monasco



Re: How to see all document types for a give index?

2014-07-01 Thread smonasco
I believe http://localhost:9200/index/_mapping will give you types.

It is an indirect method for sure, but that kind of metadata is going to be in 
memory and not require fielddata cache.



Re: Elasticsearch Memory issue

2014-07-01 Thread smonasco
My understanding was that ES 1.1 was using memory mapped files and so the field 
cache would not be part of the heap.



Facet cache size and other memory metrics

2014-06-17 Thread smonasco
Hi,

We're having problems with some nodes hitting the maximum heap size and 
were looking into ways to get visibility into the field cache impact of 
different indexes/shards.

Any suggestions?
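
For what it's worth, I believe that once we're on 1.x the stats APIs can break fielddata out per field and per shard (paths from memory):

curl 'localhost:9200/_nodes/stats/indices/fielddata?fields=*&pretty'
curl 'localhost:9200/_stats/fielddata?fields=*&level=shards&pretty'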

--Shannon Monasco



Re: Facet cache size and other memory metrics

2014-06-17 Thread smonasco
For instance, I think I remember some plugin that would give you an idea of how 
big an impact a facet might have on your field cache, and I think that was 
supposed to become part of Elasticsearch itself, but I may be dreaming.

On Tuesday, June 17, 2014 4:06:35 PM UTC-6, smonasco wrote:

 Hi,

 We're having problems with some nodes hitting the maximum heap size and 
 were looking into ways to get visibility into the field cache impact of 
 different indexes/shards.

 Any suggestions?

 --Shannon Monasco




Re: Facet cache size and other memory metrics

2014-06-17 Thread smonasco
This has something like what I'd like to find, in the "Cache stats per field" 
section of https://github.com/bleskes/elasticfacets , but I'm unsure if it's 
any good for 1.x.

On Tuesday, June 17, 2014 4:10:13 PM UTC-6, smonasco wrote:

 For instance I think I remember some plugin that would give you an idea 
 how big an impact a facet might have on your field cache, and I think that 
 was suppose to become part of Elasticsearch itself, but I may be dreaming.

 On Tuesday, June 17, 2014 4:06:35 PM UTC-6, smonasco wrote:

 Hi,

 We're having problems with some nodes hitting the maximum heap size and 
 were looking into ways to get visibility into the field cache impact of 
 different indexes/shards.

 Any suggestions?

 --Shannon Monasco





Re: Extremely Simple Java Request

2014-06-14 Thread smonasco
I rolled my own.

On Thursday, June 12, 2014 5:23:00 PM UTC-6, smonasco wrote:

 I've been through the API documentation and the elasticsearch source code.

 I'm looking to create a service that when given a list of (URL path, JSON 
 path, metric type) will take the value found at the JSON path from the JSON 
 returned from the URL in my cluster and put it into our metrics system.

 I'm trying to find something really simple in the Java space that will 
 give me a JSON object of some variety or a response object I can call 
 .toXContent on when I pass it the URL.

 Not sure if I'm explaining this well, but something like:

 Request request = new SimpleRequest(url);
 Response response = client.get(request).actionGet();
 XContentBuilder responseContent = XContentFactory.jsonBuilder();
 response.toXContent(responseContent, null);

 If I need to build it myself that's ok.  I'd just like to know it's not 
 built and maybe have a pointer or two.

 --Shannon Monasco




Extremely Simple Java Request

2014-06-12 Thread smonasco
I've been through the API documentation and the elasticsearch source code.

I'm looking to create a service that when given a list of (URL path, JSON 
path, metric type) will take the value found at the JSON path from the JSON 
returned from the URL in my cluster and put it into our metrics system.

I'm trying to find something really simple in the Java space that will give 
me a JSON object of some variety or a response object I can call 
.toXContent on when I pass it the URL.

Not sure if I'm explaining this well, but something like:

Request request = new SimpleRequest(url);
Response response = client.get(request).actionGet();
XContentBuilder responseContent = XContentFactory.jsonBuilder();
response.toXContent(responseContent, null);

If I need to build it myself that's OK.  I'd just like to know whether it's 
already built, and maybe have a pointer or two.

--Shannon Monasco



Elasticsearch/Lucene Delete space reuse? recovery?

2014-06-03 Thread smonasco
I'm starting a project to index log files.  I don't particularly want to 
wait until the log files roll over.  There will be files from 100's of apps 
running across 100's of machines (not all apps intersect with all machines, 
but you get the drift).  Some roll over very fast; some may take days.

The problem comes in that if I am constantly reindexing the same document 
(same id): am I losing all the old space (store and/or index), or is 
Elasticsearch/Lucene smart enough to say "here's a new version; we'll 
overwrite the old store/index entries, point to this one where they are the 
same, and add new ones"?

Certainly, there is a more sophisticated model that treats every line as a 
unique document/row so that this doesn't become an issue, but I'm not 
ready to spend that kind of dev and hardware on this issue.  (Our 
Elasticsearch solution is wrapped in a system that becomes really heavy-handed 
when indexing such small pieces.)

--Shannon Monasco



Percolate Existing Documents Seems Broke

2014-04-23 Thread smonasco
Maybe I'm off or I missed something where this was covered, but...

curl -XPUT 'localhost:9200/test'

curl -XPUT 'localhost:9200/test/.percolator/1' -d '
{
    "query": {
        "query_string": {
            "query": "headline:\"apples\""
        }
    },
    "attributes": {
        "query_name": "headline apples"
    }
}'

curl -XGET 'localhost:9200/test/test/_percolate' -d '
{
    "doc": {
        "headline": "apples"
    }
}'

get the match I expect

curl -XPUT 'localhost:9200/test/test/1' -d '
{
    "doc": {
        "headline": "apples"
    }
}'

curl -XGET 'localhost:9200/test/test/1/_percolate'

no match


--Shannon Monasco

P.S. I am using elasticsearch-1.0.1 and running it on my local windows 
workstation.  I can roll back the yml to the most basic one if that helps.



Re: Too Many Open Files

2014-03-04 Thread smonasco
Sorry to have taken so long to reply.  So I went ahead and followed your 
link.  I'd been there before, but decided to give it a deeper look.  What I 
actually found, however, was that bigdesk showed the max open files the 
process was allowed, and from there I was able to determine that my settings 
in limits.conf were not being honored, even though if I switched to the 
context Elasticsearch was running under I would see the appropriate limits.

I then dug into the service script and found someone had dropped a ulimit 
statement into it that was overriding the limits.conf setting.
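
For anyone hitting the same thing, a couple of ways to double-check what the running process actually got (the pgrep pattern and the newer stats paths are from memory and may need adjusting):

# what the running JVM is actually allowed, regardless of limits.conf
cat /proc/$(pgrep -f org.elasticsearch.bootstrap)/limits | grep 'open files'

# what Elasticsearch itself reports (1.x-style paths)
curl 'localhost:9200/_nodes/process?pretty'         # max_file_descriptors
curl 'localhost:9200/_nodes/stats/process?pretty'   # open_file_descriptors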

Thank you,
Shannon Monasco



On Wednesday, January 22, 2014 10:09:42 AM UTC-7, Ivan Brusic wrote:

 The first thing to do is check if your limits are actually being persisted 
 and used. The elasticsearch site has a good writeup: 
 http://www.elasticsearch.org/tutorials/too-many-open-files/

 Second, it might be possible that you are reaching the 128k limit. How 
 many shards per node do you have? Do you have non standard merge settings? 
 You can use the status API to find out how many open files you have. I do 
 not have a link since it might have changed since 0.19.

 Also, be aware that it is not possible to do rolling upgrades with nodes 
 have different major versions of elasticsearch. The underlying data will be 
 fine and does not need to be upgraded, but nodes will not be able to 
 communicate with each other.

 Cheers,

 Ivan



  On Tue, Jan 21, 2014 at 7:42 AM, smonasco smon...@gmail.com wrote:

 Sorry wrong error message.

 [2014-01-18 06:47:06,232][WARN 
 ][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to accept a 
 connection.
 java.io.IOException: Too many open files
 at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
 at 
 sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:226)
 at 
 org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:227)
 at 
 org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)


 The other posted error message is newer and seems to follow the too many 
 open files error message.

 --Shannon Monasco

 On Tuesday, January 21, 2014 8:35:18 AM UTC-7, smonasco wrote:

 Hi,

 I am using version 0.19.3 I have the nofile limit set to 128K and am 
 getting errors like

 [2014-01-18 06:52:54,857][WARN ][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to initialize an accepted socket.
 org.elasticsearch.common.netty.channel.ChannelException: Failed to create a selector.
 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:154)
 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.register(AbstractNioWorker.java:131)
 at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.registerAcceptedChannel(NioServerSocketPipelineSink.java:269)
 at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:231)
 at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: java.io.IOException: Too many open files
 at sun.nio.ch.IOUtil.makePipe(Native Method)
 at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:65)
 at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36)
 at java.nio.channels.Selector.open(Selector.java:227)
 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:152)
 ... 7 more

 I am aware that version 0.19.3 is old.  We have been having trouble 
 getting our infrastructure group to build out new nodes so we can have a 
 rolling upgrade with testing for both versions going on.  I am now setting 
 the limit to 1048576 as per http://stackoverflow.com/
 questions/1212925/on-linux-set-maximum-open-files-to-unlimited-possible, 
 however, I'm concerned this may cause other issues.

  If anyone has any suggestions I'd love to hear them.  I am using this as 
  fuel for the "please pay attention and get us the support we need so we can 
  upgrade" campaign.

 --Shannon Monasco


Re: Too Many Open Files

2014-01-21 Thread smonasco
Sorry wrong error message.

[2014-01-18 06:47:06,232][WARN 
][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to accept a 
connection.
java.io.IOException: Too many open files
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at 
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:226)
at 
org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:227)
at 
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)


The other posted error message is newer and seems to follow the too many 
open files error message.

--Shannon Monasco

On Tuesday, January 21, 2014 8:35:18 AM UTC-7, smonasco wrote:

 Hi,

 I am using version 0.19.3 I have the nofile limit set to 128K and am 
 getting errors like

 [2014-01-18 06:52:54,857][WARN 
 ][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to 
 initialize an accepted socket.
 org.elasticsearch.common.netty.channel.ChannelException: Failed to create 
 a selector.
 at 
 org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:154)
 at 
 org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.register(AbstractNioWorker.java:131)
 at 
 org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.registerAcceptedChannel(NioServerSocketPipelineSink.java:269)
 at 
 org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:231)
 at 
 org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: java.io.IOException: Too many open files
 at sun.nio.ch.IOUtil.makePipe(Native Method)
  at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:65)
 at 
 sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36)
 at java.nio.channels.Selector.open(Selector.java:227)
 at 
 org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:152)
 ... 7 more

 I am aware that version 0.19.3 is old.  We have been having trouble 
 getting our infrastructure group to build out new nodes so we can have a 
 rolling upgrade with testing for both versions going on.  I am now setting 
 the limit to 1048576 as per 
 http://stackoverflow.com/questions/1212925/on-linux-set-maximum-open-files-to-unlimited-possible,
  
 however, I'm concerned this may cause other issues.

 If anyone has any suggestions I'd love to hear them.  I am using this as 
 fuel for the "please pay attention and get us the support we need so we can 
 upgrade" campaign.

 --Shannon Monasco

