Data Node Allocation Weight

2015-03-20 Thread smonasco
Hi,

I'm in a bit of a crunch.  We have some servers we added to our cluster 
whose processors were not as good as the older servers.  I'd rather not get 
into how we got swindled into accepting them, but this leaves me in a place 
with imbalanced hardware between my nodes.

The crunch bit comes in that during our peak load we begin to have slow 
queries which lead to rejected queries once we max out the CPUs on the new 
machines.

I'm trying to find a way to tell ES to put more shards on the more powerful 
nodes.

The simplest idea would be if I could give a weight to servers, or something along those lines, rather than try to move shards around by hand, create zones and force some indexes into them, etc.
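
For reference, the zone/filtering approach I'd otherwise be forced into looks roughly like this on 1.x (box_type and busy_index are made-up names): tag the stronger machines with a node attribute in elasticsearch.yml, then pin the heavy indexes to them.  As far as I can tell there is no per-node weight setting, which is why I'm asking.

# elasticsearch.yml on the stronger machines
node.box_type: strong

# pin a heavy index to those machines
curl -XPUT 'localhost:9200/busy_index/_settings' -d '
{
    "index.routing.allocation.require.box_type": "strong"
}'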

--Shannon Monasco



term complexity and filter caching

2014-10-28 Thread smonasco
Hi peeps,

Leaf level filter cache is nice, but caching complex filters is better if 
you can get some hit ratio.

We have some items (querystrings that may become bool filters later) with 
over 50 terms (sometimes hundreds) that get reused often.  We essentially 
aggregate financial research for hundreds of sites, each of which have only 
paid for certain data and this is the primary case for these complex and 
highly reused queries.

However, we are also looking at adding some leaf-level filter caching on 
other items, say an industry or a report type: fields with few permutations 
and high re-use.

Elasticsearch employs an LRU policy on filter cache.

However, the complex queries with high term complexity and lots of ANDs 
and ORs save far more time, CPU and memory per use than leaf-level 
caching of something like reporttype:research, so I'd like to add some 
weight to the more complex queries when determining what gets evicted.
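
For concreteness, a cached version of one of these complex filters would look roughly like this on 1.x (index and field names are placeholders); the _cache flag asks Elasticsearch to keep the bitset for the whole bool filter rather than just the leaf terms:

curl -XGET 'localhost:9200/research/_search' -d '
{
    "query": {
        "filtered": {
            "query": { "match_all": {} },
            "filter": {
                "bool": {
                    "_cache": true,
                    "must": [
                        { "terms": { "site_id": ["site-001", "site-002", "site-003"] } },
                        { "term": { "report_type": "research" } }
                    ]
                }
            }
        }
    }
}'

As far as I know the eviction policy itself is plain LRU, with no way to weight entries by how expensive they were to build, which is exactly what I'm after.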

Thoughts?

--Shannon Monasco



Re: What is better - create several document types or several indices?

2014-09-11 Thread smonasco
Every index has a minimum of one shard, while multiple types can live in the same 
shard.  Shards carry maintenance overhead, and every extra shard touched slows a 
query down.  However, if you have a lot of targeted queries, it is easier to reduce 
the shards accessed by splitting into indexes and querying only the relevant ones 
than it is with a multi-tenant (multi-type) index.  I could be missing something, 
but I don't think you can have multiple routing values in a query, and someone may 
well want to query multiple log types.

So it depends.
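
As a rough sketch of the targeted-query case (index names invented), searching only the relevant per-app indexes keeps the number of shards touched down:

curl -XGET 'localhost:9200/logs-app1,logs-app2/_search' -d '
{
    "query": { "match": { "message": "timeout" } }
}'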



Re: Constant High (~99%) CPU on 1 of 5 Nodes in Cluster

2014-08-03 Thread smonasco
What jumps out at me is that the CPU work you're doing seems to be very 
index-related, your garbage collections are trying hard on the errant machine 
without getting anywhere, and you have a lot of deleted docs.

Tell us about your indexing strategy: things like routing, how bursty it is, 
and maybe why you have so many deleted docs.
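
If you're on a 1.x cluster, a quick way to eyeball both (deleted docs and merge activity) is:

curl 'localhost:9200/_cat/indices?v'              # docs.count vs docs.deleted per index
curl 'localhost:9200/_stats/docs,merge?pretty'    # deleted doc counts and merge stats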



Re: Diagnosing a slow query

2014-08-02 Thread smonasco
So 10 to 20% of the time queries take the better part of a second, even when only 
on one node, and you indicate the queries are very similar in nature.

For that second do a lot of queries show up all at once or are they spaced out?
Do you see a spike in query requests?
Do you see the CPU max out?
Do you see queries get queued?
Can you run the same exact query regularly and still see the problem, or do you 
only get the issue with a heterogeneous set of queries?
Is there IO disk contention when they happen?
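
If you can catch it in the act, a couple of quick checks along those lines (hot_threads assumes a reasonably recent version):

curl 'localhost:9200/_nodes/hot_threads'   # what the CPUs are actually busy doing
iostat -x 5                                # disk utilization/await during the slow second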



Re: Rolling Upgrading from 1.2.1 to 1.3.0 – java.lang.IllegalArgumentException: No enum constant org.apache.lucene.util.Version.4.3.1

2014-07-28 Thread smonasco
Do you happen to know if optimize will create a segment larger than 5 gigs?



Re: Clustering/Sharding impact on query performance

2014-07-20 Thread smonasco
Your query looks weird: it says filtered but has no filter, and it specifies a 
boost but has no sort.  I would remove the filtered wrapper, either remove the 
boost or force the sort on score, and ensure I was using index options of 
offsets (this also includes term frequencies and doc counts and pretty much 
everything precomputed for the TF-IDF scoring).
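
The mapping bit would look roughly like this (index, type and field names are placeholders; index_options has to be set when the field is first created):

curl -XPUT 'localhost:9200/myindex/_mapping/mytype' -d '
{
    "mytype": {
        "properties": {
            "body": { "type": "string", "index_options": "offsets" }
        }
    }
}'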



Re: How can I make this search requirement work?

2014-07-16 Thread smonasco
A little late to the party, but I would have used a custom index analyzer with 
lowercase, pattern and edgengram, and a search analyzer of lowercase and pattern 
(maybe you have to flip lowercase and pattern).

With the pattern tokenizer you can specify a regex.
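
Something along these lines is what I had in mind (the regex, gram sizes and field names are just placeholders):

curl -XPUT 'localhost:9200/myindex' -d '
{
    "settings": {
        "analysis": {
            "tokenizer": {
                "my_pattern": { "type": "pattern", "pattern": "[\\W_]+" }
            },
            "filter": {
                "my_edge": { "type": "edgeNGram", "min_gram": 1, "max_gram": 20 }
            },
            "analyzer": {
                "my_index_analyzer": {
                    "type": "custom",
                    "tokenizer": "my_pattern",
                    "filter": ["lowercase", "my_edge"]
                },
                "my_search_analyzer": {
                    "type": "custom",
                    "tokenizer": "my_pattern",
                    "filter": ["lowercase"]
                }
            }
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "title": {
                    "type": "string",
                    "index_analyzer": "my_index_analyzer",
                    "search_analyzer": "my_search_analyzer"
                }
            }
        }
    }
}'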



Re: High memory usage on dedicated master nodes

2014-07-16 Thread smonasco
Maybe I'm missing something, but if you give Java a maximum heap it will use it, 
if only to store garbage.  The trough after a garbage collection is usually more 
indicative of what is actually in use.

This looks like a CMS/ParNew setup.  ParNew is fast and low-blocking, but 
leaves stuff behind.  CMS blocks, but is really good at taking out the garbage.



Re: elasticsearch dies every other day

2014-07-14 Thread smonasco
Is the non-data node a client node?  Here we are counting master-eligible 
nodes.  Whether you have 4 or 5, I would go with minimum master-eligible nodes 
of 3.  I'm not sure what your replica setup is, but only 2 nodes is probably not 
a healthy cluster.

With this set to 3 you won't have a split brain.
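
Concretely, on each master-eligible node (or dynamically through the cluster settings API):

# elasticsearch.yml
discovery.zen.minimum_master_nodes: 3

curl -XPUT 'localhost:9200/_cluster/settings' -d '
{
    "persistent": { "discovery.zen.minimum_master_nodes": 3 }
}'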



Re: elasticsearch dies every other day

2014-07-14 Thread smonasco
Do you have thread counts and do any of them correlate to the crash times?  I'm 
guessing that we'll find index threads leap up.



Re: What do these metrics mean?

2014-07-14 Thread smonasco
So it doesn't appear that the sum of current queries = K * active search 
threads.  In other words, they are not proportional.

Currentqueries was flat, and search.active jumped 2 orders of magnitude. 
 We have highly varied indexes and searches.  Some indexes have only one 
shard with one replica and some have 40 shards with 1 replica.  Some 
queries are nothing more than a search string with no boosting ordered by a 
date and with no facets.  Others have 6 facets, boosts, interesting 
analyzers and ordered by score.

I'm sure some searches use more resources than others, but do they expand 
to multiple threads?  I would imagine one could easily expand this out at 
the shard (or maybe segment) level.  I could also imagine that facets could 
get their own threads once the reduced document set was determined.

--Shannon Monasco

P.S. If you don't know about version 0.19.3 but know about 0.90+ or even 
1.x, that's fine.  I'll take whatever answers I can get.

On Friday, July 11, 2014 9:58:03 PM UTC-6, smonasco wrote:

 Thanks Ivan!
 On Jul 11, 2014 5:56 PM, Ivan Brusic i...@brusic.com wrote:

 The default in 0.90 (not sure about 0.19.) should still be a fixed search 
 thread pool: 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/modules-threadpool.html

 I find that the current queries and active threads tends to be the same 
 number. When I look at my graphs, they have the same movements.

 If you do not have any queued requests you might want to set a lower cap 
 if your cluster is experiencing slowness.

 BTW, the best course of action that you can take is to simply upgrade and 
 not worry about thread settings. The Lucene 4 improvements (Elasticsearch 
 0.90 I believe) were monumental in the space savings. New versions will 
 offer tons of other improvements, but the 0.90 release was especially great!

 Cheers,

 Ivan


 On Fri, Jul 11, 2014 at 3:09 PM, Shannon Monasco smona...@gmail.com 
 wrote:

 I have never seen any queuing on search threads.  Not sure if in .19 it 
 defaults to cache or not but that's the behavior I see.

 What about current queries from indices stats?
 On Jul 11, 2014 2:53 PM, Ivan Brusic i...@brusic.com wrote:

  Your second paragraph is correct. The threads are the total number of 
 search threads at your disposal, active is the number of ongoing threads 
 and queue are the number of threads that cannot be run since your thread 
 pool is exhausted, which should be when active == threads, but not always 
 the case.

 The default number of search threads is based upon the number of 
 processors (3x # of available processors). There is no good metric for 
 determining a balance since searches can be either lightweight 
 (milliseconds) or heavyweight (minutes), but I would argue that the key 
 metric to monitor is your queue. Is it normally empty? Spiky behavior? 
 Requests constantly queued?


 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html

 Cheers,

 Ivan


 On Fri, Jul 11, 2014 at 1:19 PM, smonasco smona...@gmail.com wrote:

 I should probably preface everything with I'm running a 5 node cluster 
 with version 0.19.3 and should be up to version 1.1.2 by the middle of 
 August, but I have some confusion around metrics I'm seeing, what they 
 mean 
 and what are good values.

 In thread_pools I see threads, active and queued.  Queued + active != 
 threads.  I assume this really is a work pool and you have active 
 threads, 
 a thread count in the work pool and queued work.  So some explanation 
 around this would be nice.

 I've correlated some spikes in search threads with heap mem 
 utilization explosions.  current searches sort of also correlate, but I 
 have more current searches than search threads and there is not search 
 threadpool queueing.

 I'm not sure how current searches correlate (or if they should/do) 
 with search threads.

 I've observed the following:

 Devastating: 10,000 current searches on worst index sustained over 
 hours with not much change, ending at the same time as spikes of > 1000 
 search threads (where we generally average < 50) and a heap explosion.

 Oddly OK: current searches averaging 15 on worst index spiking to 105 
 with search threads averaging 50 with maxes of 300, spiking to averages of 
 120 and maxes of > 1000


 So...  I guess, what are good ranges for search threads and current 
 searches?

 --Shannon Monasco


Re: Upgrade 0.26.6 - 1.2.2 any catches?

2014-07-14 Thread smonasco
@Ivan, what do you mean by the stores being throttled at a very low level?  Do you 
mean index threads, or merge factor, or what?



Re: elasticsearch dies every other day

2014-07-13 Thread smonasco
You are certainly having a heap utilization issue.  As the utilization gets 
close to 100%, the GCs get aggressive.

Is your heap utilization staying close to the edge?  Does it jump up at the 
end?  What about field cache?  What about hot threads?

Indexing 200+ documents per second is a heavier rate than you need to sustain; 
you may want to try scaling back index threads.

Otherwise, if it's not a huge spike, you may need more memory.



Re: Reading and writing the same document too fast -- data loss

2014-07-13 Thread smonasco
It sounds like you're searching and using the results to reindex the doc.  Search 
is not real time, but get is, so you could get the document after the search.  
Also make sure your application isn't stepping on itself and either runs in 
serial or has some sort of locking.
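
A rough sketch of what I mean (index/type/id and the version number are made up): fetch with a realtime GET, then reindex conditionally on the version you read, so a concurrent writer shows up as a 409 conflict instead of silently lost data.

curl 'localhost:9200/myindex/mytype/1?pretty'

curl -XPUT 'localhost:9200/myindex/mytype/1?version=7' -d '
{
    "headline": "updated copy"
}'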



What do these metrics mean?

2014-07-11 Thread smonasco
I should probably preface everything with I'm running a 5 node cluster with 
version 0.19.3 and should be up to version 1.1.2 by the middle of August, 
but I have some confusion around metrics I'm seeing, what they mean and 
what are good values.

In thread_pools I see threads, active and queued.  Queued + active != 
threads.  I assume this really is a work pool and you have active threads, 
a thread count in the work pool and queued work.  So some explanation 
around this would be nice.

I've correlated some spikes in search threads with heap mem utilization 
explosions.  current searches sort of also correlate, but I have more 
current searches than search threads and there is not search threadpool 
queueing.

I'm not sure how current searches correlate (or if they should/do) with 
search threads.

I've observed the following:

Devastating: 10,000 current searches on worst index sustained over hours 
with not much change, ending at the same time as spikes of > 1000 search 
threads (where we generally average < 50) and a heap explosion.

Oddly OK: current searches averaging 15 on worst index spiking to 105, with 
search threads averaging 50 with maxes of 300, spiking to averages of 120 
and maxes of > 1000


So...  I guess, what are good ranges for search threads and current 
searches?

--Shannon Monasco



Re: Recommended Hardware Specs Sharding\Index Strategy

2014-07-03 Thread smonasco
I wonder if you might get some better performance via parent-child 
relationships.  Reindexing a large nested document would mean many Lucene 
deletes and inserts every week, but adding new data based on this past week's 
info as child documents should be cheaper in index operations, and probably in 
every other place that would otherwise have to load the entire 1.74 MB document, 
since the weekly piece is probably something like 365 times smaller.
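
Roughly what I'm picturing (index, type and field names invented): map the weekly data as a child type, then add small child documents instead of reindexing the big one.

curl -XPUT 'localhost:9200/metrics' -d '
{
    "mappings": {
        "account": {
            "properties": { "name": { "type": "string" } }
        },
        "week": {
            "_parent": { "type": "account" }
        }
    }
}'

curl -XPUT 'localhost:9200/metrics/week/2014-26?parent=account-42' -d '
{
    "visits": 1200
}'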



Re: Visibility

2014-07-03 Thread smonasco
Thanks.  I'll have to check that out. 





Visibility

2014-07-02 Thread smonasco
Hi,

I'm trying to get a lot more visibility and metrics into what's going on 
under the hood.

Occasionally, we see spikes in memory.  I'd like to get heap memory used on a 
per-shard basis.  If I'm not mistaken, somewhere somehow, the Lucene index 
that is a shard is using memory in the heap, and I'd like to collect that metric.

It may also be an operation somewhere higher up at the Elasticsearch level, 
where we are merging results from shards or results from indexes (maybe 
Elasticsearch doesn't bother to merge twice but merges once); that's also a 
memory space I'd like to collect data on.

I think a per query mem use would also be something interesting, though, 
perhaps obviously too much to keep up with for every query (maybe a future 
opt-in feature, unless it's already there and I'm missing it).

Other cluster events like nodes entering and exiting the cluster or the 
changing of the master would be nice to collect.

I'm guessing some of this isn't available and some of it is, but my 
Google-Fu seems to be lacking.  I'm pretty sure I can poll to figure out that 
the events happened, but I was wondering if there is something in the Java 
client node where I could get a Future or some other hook to turn it into a 
push instead of a pull.

Any help will be appreciated.  I'm aware it's a wide net though.

--Shannon Monasco



Re: How to see all document types for a give index?

2014-07-01 Thread smonasco
I believe http://localhost:9200/index/_mapping will give you types.

It is an indirect method for sure, but that kind of metadata is going to be in 
memory and not require fielddata cache.



Re: Elasticsearch Memory issue

2014-07-01 Thread smonasco
My understanding was that ES 1.1 was using memory mapped files and so the field 
cache would not be part of the heap.



Facet cache size and other memory metrics

2014-06-17 Thread smonasco
Hi,

We're having problems with some nodes hitting the maximum heap size and 
were looking into ways to get visibility into the field cache impact of 
different indexes/shards.

Any suggestions?
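
For what it's worth, I believe that once we're on 1.x the stats APIs can break fielddata out per field and per shard (paths from memory):

curl 'localhost:9200/_nodes/stats/indices/fielddata?fields=*&pretty'
curl 'localhost:9200/_stats/fielddata?fields=*&level=shards&pretty'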

--Shannon Monasco



Re: Facet cache size and other memory metrics

2014-06-17 Thread smonasco
For instance, I think I remember some plugin that would give you an idea of how 
big an impact a facet might have on your field cache, and I think that was 
supposed to become part of Elasticsearch itself, but I may be dreaming.

On Tuesday, June 17, 2014 4:06:35 PM UTC-6, smonasco wrote:

 Hi,

 We're having problems with some nodes hitting the maximum heap size and 
 were looking into ways to get visibility into the field cache impact of 
 different indexes/shards.

 Any suggestions?

 --Shannon Monasco




Re: Facet cache size and other memory metrics

2014-06-17 Thread smonasco
This has something like what I'd like to find, in the "Cache stats per field" 
section of https://github.com/bleskes/elasticfacets , but I'm unsure if it's 
any good for 1.x.

On Tuesday, June 17, 2014 4:10:13 PM UTC-6, smonasco wrote:

 For instance I think I remember some plugin that would give you an idea 
 how big an impact a facet might have on your field cache, and I think that 
 was suppose to become part of Elasticsearch itself, but I may be dreaming.

 On Tuesday, June 17, 2014 4:06:35 PM UTC-6, smonasco wrote:

 Hi,

 We're having problems with some nodes hitting the maximum heap size and 
 were looking into ways to get visibility into the field cache impact of 
 different indexes/shards.

 Any suggestions?

 --Shannon Monasco





Re: Extremely Simple Java Request

2014-06-14 Thread smonasco
I rolled my own.

On Thursday, June 12, 2014 5:23:00 PM UTC-6, smonasco wrote:

 I've been through the API documentation and the elasticsearch source code.

 I'm looking to create a service that when given a list of (URL path, JSON 
 path, metric type) will take the value found at the JSON path from the JSON 
 returned from the URL in my cluster and put it into our metrics system.

 I'm trying to find something really simple in the Java space that will 
 give me a JSON object of some variety or a response object I can call 
 .toXContent on when I pass it the URL.

 Not sure if I'm explaining this well, but something like:

 Request request = new SimpleRequest(url);
 Response response = client.get(request).actionGet();
 XContentBuilder responseContent = XContentFactory.jsonBuilder();
 response.toXContent(responseContent, null);

 If I need to build it myself that's ok.  I'd just like to know it's not 
 built and maybe have a pointer or two.

 --Shannon Monasco




Extremely Simple Java Request

2014-06-12 Thread smonasco
I've been through the API documentation and the elasticsearch source code.

I'm looking to create a service that when given a list of (URL path, JSON 
path, metric type) will take the value found at the JSON path from the JSON 
returned from the URL in my cluster and put it into our metrics system.

I'm trying to find something really simple in the Java space that will give 
me a JSON object of some variety or a response object I can call 
.toXContent on when I pass it the URL.

Not sure if I'm explaining this well, but something like:

Request request = new SimpleRequest(url);
Response response = client.get(request).actionGet();
XContentBuilder responseContent = XContentFactory.jsonBuilder();
response.toXContent(responseContent, null);

If I need to build it myself that's OK.  I'd just like to know whether it's 
already built, and maybe have a pointer or two.

--Shannon Monasco



Elasticsearch/Lucene Delete space reuse? recovery?

2014-06-03 Thread smonasco
I'm starting a project to index log files.  I don't particularly want to 
wait until the log files roll over.  There will be files from 100's of apps 
running across 100's of machines (not all apps intersect with all machines, 
but you get the drift).  Some roll over very fast; some may take days.

The problem comes in that if I am constantly reindexing the same document 
(same id): am I losing all the old space (store and/or index), or is 
Elasticsearch/Lucene smart enough to say "here's a new version; we'll 
overwrite the old store/index entries, point to this one where they are the 
same, and add new ones"?

Certainly, there is a more sophisticated model that treats every line as a 
unique document/row so that this doesn't become an issue, but I'm not 
ready to spend that kind of dev and hardware on this issue.  (Our 
Elasticsearch solution is wrapped in a system that becomes really heavy-handed 
when indexing such small pieces.)

--Shannon Monasco



Percolate Existing Documents Seems Broke

2014-04-23 Thread smonasco
Maybe I'm off or I missed something where this was covered, but...

curl -XPUT 'localhost:9200/test'

curl -XPUT 'localhost:9200/test/.percolator/1' -d '
{
    "query": {
        "query_string": {
            "query": "headline:\"apples\""
        }
    },
    "attributes": {
        "query_name": "headline apples"
    }
}'

curl -XGET 'localhost:9200/test/test/_percolate' -d '
{
    "doc": {
        "headline": "apples"
    }
}'

get the match I expect

curl -XPUT 'localhost:9200/test/test/1' -d '
{
    "doc": {
        "headline": "apples"
    }
}'

curl -XGET 'localhost:9200/test/test/1/_percolate'

no match


--Shannon Monasco

P.S. I am using elasticsearch-1.0.1 and running it on my local windows 
workstation.  I can roll back the yml to the most basic one if that helps.



Re: Too Many Open Files

2014-03-04 Thread smonasco
Sorry to have taken so long to reply.  So I went ahead and followed your 
link.  I'd been there before, but decided to give it a deeper look.  What I 
actually found, however, was that bigdesk showed the max open files the 
process was allowed, and from there I was able to determine that my settings 
in limits.conf were not being honored, even though if I switched to the 
context Elasticsearch was running under I would see the appropriate limits.

I then dug into the service script and found someone had dropped a ulimit 
statement into it that was overriding the limits.conf setting.
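
For anyone hitting the same thing, a couple of ways to double-check what the running process actually got (the pgrep pattern and the newer stats paths are from memory and may need adjusting):

# what the running JVM is actually allowed, regardless of limits.conf
cat /proc/$(pgrep -f org.elasticsearch.bootstrap)/limits | grep 'open files'

# what Elasticsearch itself reports (1.x-style paths)
curl 'localhost:9200/_nodes/process?pretty'         # max_file_descriptors
curl 'localhost:9200/_nodes/stats/process?pretty'   # open_file_descriptors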

Thank you,
Shannon Monasco



On Wednesday, January 22, 2014 10:09:42 AM UTC-7, Ivan Brusic wrote:

 The first thing to do is check if your limits are actually being persisted 
 and used. The elasticsearch site has a good writeup: 
 http://www.elasticsearch.org/tutorials/too-many-open-files/

 Second, it might be possible that you are reaching the 128k limit. How 
 many shards per node do you have? Do you have non standard merge settings? 
 You can use the status API to find out how many open files you have. I do 
 not have a link since it might have changed since 0.19.

 Also, be aware that it is not possible to do rolling upgrades with nodes 
 have different major versions of elasticsearch. The underlying data will be 
 fine and does not need to be upgraded, but nodes will not be able to 
 communicate with each other.

 Cheers,

 Ivan



  On Tue, Jan 21, 2014 at 7:42 AM, smonasco smon...@gmail.com wrote:

 Sorry wrong error message.

 [2014-01-18 06:47:06,232][WARN 
 ][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to accept a 
 connection.
 java.io.IOException: Too many open files
 at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
 at 
 sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:226)
 at 
 org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:227)
 at 
 org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)


 The other posted error message is newer and seems to follow the too many 
 open files error message.

 --Shannon Monasco

 On Tuesday, January 21, 2014 8:35:18 AM UTC-7, smonasco wrote:

 Hi,

 I am using version 0.19.3 I have the nofile limit set to 128K and am 
 getting errors like

 [2014-01-18 06:52:54,857][WARN ][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to initialize an accepted socket.
 org.elasticsearch.common.netty.channel.ChannelException: Failed to create a selector.
 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:154)
 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.register(AbstractNioWorker.java:131)
 at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.registerAcceptedChannel(NioServerSocketPipelineSink.java:269)
 at org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:231)
 at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: java.io.IOException: Too many open files
 at sun.nio.ch.IOUtil.makePipe(Native Method)
 at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:65)
 at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36)
 at java.nio.channels.Selector.open(Selector.java:227)
 at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:152)
 ... 7 more

 I am aware that version 0.19.3 is old.  We have been having trouble 
 getting our infrastructure group to build out new nodes so we can have a 
 rolling upgrade with testing for both versions going on.  I am now setting 
 the limit to 1048576 as per http://stackoverflow.com/
 questions/1212925/on-linux-set-maximum-open-files-to-unlimited-possible, 
 however, I'm concerned this may cause other issues.

  If anyone has any suggestions I'd love to hear them.  I am using this as 
  fuel for the "please pay attention and get us the support we need so we can 
  upgrade" campaign.

 --Shannon Monasco


Re: Too Many Open Files

2014-01-21 Thread smonasco
Sorry wrong error message.

[2014-01-18 06:47:06,232][WARN 
][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to accept a 
connection.
java.io.IOException: Too many open files
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at 
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:226)
at 
org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:227)
at 
org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)


The other posted error message is newer and seems to follow the too many 
open files error message.

--Shannon Monasco

On Tuesday, January 21, 2014 8:35:18 AM UTC-7, smonasco wrote:

 Hi,

 I am using version 0.19.3 I have the nofile limit set to 128K and am 
 getting errors like

 [2014-01-18 06:52:54,857][WARN 
 ][netty.channel.socket.nio.NioServerSocketPipelineSink] Failed to 
 initialize an accepted socket.
 org.elasticsearch.common.netty.channel.ChannelException: Failed to create 
 a selector.
 at 
 org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:154)
 at 
 org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.register(AbstractNioWorker.java:131)
 at 
 org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.registerAcceptedChannel(NioServerSocketPipelineSink.java:269)
 at 
 org.elasticsearch.common.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:231)
 at 
 org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: java.io.IOException: Too many open files
 at sun.nio.ch.IOUtil.makePipe(Native Method)
  at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:65)
 at 
 sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36)
 at java.nio.channels.Selector.open(Selector.java:227)
 at 
 org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:152)
 ... 7 more

 I am aware that version 0.19.3 is old.  We have been having trouble 
 getting our infrastructure group to build out new nodes so we can have a 
 rolling upgrade with testing for both versions going on.  I am now setting 
 the limit to 1048576 as per 
 http://stackoverflow.com/questions/1212925/on-linux-set-maximum-open-files-to-unlimited-possible,
  
 however, I'm concerned this may cause other issues.

 If anyone has any suggestions I'd love to hear them.  I am using this as 
 fuel for the "please pay attention and get us the support we need so we can 
 upgrade" campaign.

 --Shannon Monasco

