Upgrade to 1.4.2 from 1.1.1 causes spike in Disk IO

2015-02-11 Thread George Stathis
Hi there,

This past Monday we completed an ES upgrade from 1.1.1 to 1.4.2 and a Java 
upgrade from 7 to 8. No other changes: same boxes, same specs, no new 
queries or filters, no increase in request volume. What I see in Marvel since 
the upgrade is a large spike in Disk IO:

https://lh6.googleusercontent.com/-i6OYBiqputk/VNuJC-47eII/A_g/TWxTcres-Uk/s1600/Screen%2BShot%2B2015-02-11%2Bat%2B11.37.58%2BAM.png

and a large dip in field data, filter cache and overall index memory 
consumption:

https://lh5.googleusercontent.com/-2DRu7af1npc/VNuJNqsvOQI/A_o/56trKYYMiQw/s1600/Screen%2BShot%2B2015-02-11%2Bat%2B11.37.47%2BAM.png

Should I be looking at the new circuit breaker settings?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-fielddata.html#circuit-breaker
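
For reference, this is roughly what I was planning to check first (host name 
and values are placeholders; 60% of heap is the 1.4 fielddata breaker default 
as I understand it):

  # circuit breaker usage and limits per node
  curl 'http://localhost:9200/_nodes/stats/breaker?pretty'

  # fielddata memory per field
  curl 'http://localhost:9200/_nodes/stats/indices/fielddata?fields=*&pretty'

  # the fielddata breaker limit can be adjusted dynamically if it turns out
  # to be too tight
  curl -XPUT 'http://localhost:9200/_cluster/settings' -d '
  {
    "transient": { "indices.breaker.fielddata.limit": "60%" }
  }'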

Any pointers are welcome!

-GS



Marvel - empty node stats section in Dashboard - JS error in console

2014-02-20 Thread George Stathis
- Running 0.90.11
- Marvel monitor machine and the cluster it monitors are separated
- Installed Marvel the day after it was released and it was originally 
running fine (all panels filled with data)

Checked the dashboard again today (nothing has changed since it was 
installed) and some panels in the dashboard, like the node stats and the 
indices stats, are blank. I tried re-installing Marvel on the monitoring 
box, cleaned out all marvel-* indices, and even wiped out the data path 
directory and let it rebuild, but the problem persisted.
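
(For reference, the cleanup steps were roughly the following; host name is a 
placeholder and the plugin commands are from memory:)

  # remove the Marvel indices on the monitoring cluster
  curl -XDELETE 'http://localhost:9200/.marvel-*'

  # re-install the Marvel plugin on the monitoring box
  bin/plugin -remove marvel
  bin/plugin -install elasticsearch/marvel/latest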

So I took a look at the JS console and it shows the following errors (host 
name redacted):

event.returnValue is deprecated. Please use the standard 
event.preventDefault() instead. app.js?r=face658:5
POST 
http://some_internal_host_name_/.marvel-2014.02.20/_search?search_type=count 
400 (Bad Request) app.js?r=face658:9
TypeError: Cannot read property 'timestamp' of undefined
at 
http://some_internal_host_name_/_plugin/marvel/kibana/app/panels/marvel/stats_table/module.js?r=face658:4:10741
at i 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:458)
at 
http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:1014
at Object.f.$eval 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:6963)
at Object.f.$digest 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:5755)
at Object.f.$apply 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:7111)
at f 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:11507)
at r 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:13216)
at XMLHttpRequest.v.onreadystatechange 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:13851)
 
app.js?r=face658:8
POST 
http://some_internal_host_name_/.marvel-2014.02.20/_search?search_type=count 
400 (Bad Request) app.js?r=face658:9
TypeError: Cannot read property 'timestamp' of undefined
at 
http://some_internal_host_name_/_plugin/marvel/kibana/app/panels/marvel/stats_table/module.js?r=face658:4:10741
at i 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:458)
at 
http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:1014
at Object.f.$eval 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:6963)
at Object.f.$digest 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:5755)
at Object.f.$apply 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:7111)
at f 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:11507)
at r 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:13216)
at XMLHttpRequest.v.onreadystatechange 
(http://some_internal_host_name_/_plugin/marvel/kibana/app/app.js?r=face658:9:13851)
 
app.js?r=face658:8
Retrying call from Thu Feb 20 2014 11:46:56 GMT-0500 (EST) 
module.js?r=face658:4
Retrying call from Thu Feb 20 2014 11:46:56 GMT-0500 (EST) 
module.js?r=face658:4
Retrying call from Thu Feb 20 2014 11:47:16 GMT-0500 (EST) 
module.js?r=face658:4
Retrying call from Thu Feb 20 2014 11:47:16 GMT-0500 (EST) 
module.js?r=face658:4


I don't understand how this error just showed up without having changed 
anything on the monitor machine or the cluster it monitors. Any clues are 
welcome.



Re: Marvel - empty node stats section in Dashboard - JS error in console

2014-02-20 Thread George Stathis
Found the issue. The monitoring machine log was filled with these:

[2014-02-20 14:05:07,137][DEBUG][action.search.type   ] 
[elasticsearch.marvel.1.traackr.com] [.marvel-2014.02.20][0], 
node[SwW74zsHRSGZd7OIpukWVQ], [P], s[STARTED]: Failed to execute 
[org.elasticsearch.action.search.SearchRequest@516c99db]
org.elasticsearch.search.SearchParseException: [.marvel-2014.02.20][0]: 
query[ConstantScore(BooleanFilter(+no_cache(@timestamp:[1392919507058 TO 
1392923107136]) +cache(_type:index_stats) +cache(@timestamp:[139292250 
TO 139292316])))],from[-1],size[0]: Parse Failure [Failed to parse 
source 
[{size:0,query:{filtered:{query:{match_all:{}},filter:{bool:{must:[{range:{@timestamp:{from:1392919507058,to:now}}},{term:{_type:index_stats}},{range:{@timestamp:{from:now-10m/m,to:now/m}}}],facets:{timestamp:{terms_stats:{key_field:index.raw,value_field:@timestamp,order:term,size:2000}},primaries.docs.count:{terms_stats:{key_field:index.raw,value_field:primaries.docs.count,order:term,size:2000}},primaries.indexing.index_total:{terms_stats:{key_field:index.raw,value_field:primaries.indexing.index_total,order:term,size:2000}},total.search.query_total:{terms_stats:{key_field:index.raw,value_field:total.search.query_total,order:term,size:2000}},total.merges.total_size_in_bytes:{terms_stats:{key_field:index.raw,value_field:total.merges.total_size_in_bytes,order:term,size:2000}},total.fielddata.memory_size_in_bytes:{terms_stats:{key_field:index.raw,value_field:total.fielddata.memory_size_in_bytes,order:term,size:2000]]
at 
org.elasticsearch.search.SearchService.parseSource(SearchService.java:581)
at 
org.elasticsearch.search.SearchService.createContext(SearchService.java:484)
at 
org.elasticsearch.search.SearchService.createContext(SearchService.java:469)
at 
org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:462)
at 
org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:234)
at 
org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
at 
org.elasticsearch.action.search.type.TransportSearchCountAction$AsyncAction.sendExecuteFirstPhase(TransportSearchCountAction.java:70)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
at 
org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.search.facet.FacetPhaseExecutionException: 
Facet [timestamp]: failed to find mapping for index.raw
at 
org.elasticsearch.search.facet.termsstats.TermsStatsFacetParser.parse(TermsStatsFacetParser.java:119)
at 
org.elasticsearch.search.facet.FacetParseElement.parse(FacetParseElement.java:94)
at 
org.elasticsearch.search.SearchService.parseSource(SearchService.java:569)
... 12 more
[2014-02-20 14:05:07,137][DEBUG][action.search.type   ] 
[elasticsearch.marvel.1.traackr.com] All shards failed for phase: [query]

Looking at "failed to find mapping" made me think of the Marvel mappings. 
Sure enough, I did overwrite the settings but messed up: instead of 
renaming my custom template, I accidentally named it "marvel", essentially 
clobbering the default template. No wonder it wasn't able to index the 
fields properly.

I restored the template and it all works fine now.
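
In case it helps someone else, the check that confirmed it was along these 
lines (host name is a placeholder):

  # the default Marvel template should exist under the name "marvel"
  curl -XGET 'http://localhost:9200/_template/marvel?pretty'

  # custom overrides are better registered under a separate name, e.g.
  # (hypothetical name and file):
  curl -XPUT 'http://localhost:9200/_template/marvel_custom' -d @marvel_overrides.json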

Should have looked at the logs first. Thanks for pointing that out.


On Thursday, February 20, 2014 1:31:00 PM UTC-5, Boaz Leskes wrote:

 Hi George, 

 I'm wondering why these search requests return 400 Bad Request. My guess is 
 that some data is missing.

 Can you look at the ES logs on your server and see whether you have errors?

 Also please run the following (note that it doesn't have a body, that's 
 intentional): 

 curl http://...:9200/.marvel-2014.02.20/_search?search_type=count

 Does this return correctly?

 Cheers,
 Boaz


 On Thursday, February 20, 2014 6:04:10 PM UTC+1, George Stathis wrote:

 - Running 0.90.11
 - Marvel monitor machine and the cluster it monitors are separated
 - Installed Marvel the day after it was released and it was originally 
 running fine (all panels filled with data)

 Checked the dashboard again today (nothing has changed since it was 
 installed) and some panels in the dashboard like the node stats and the 
 indices stats are blank. I tried re-installing marvel on the monitoring 
 box, cleaned out all marvel-* indexes and even wiped out the data path 
 directory and let it re-build and the problem persisted.

 So I took a look at the JS console and it shows the following errors 
 (host name redacted

Optimizing shard size and count per node

2014-02-17 Thread George Stathis
I'm going through a cluster sizing exercise and I'm looking for some 
pointers.

I have tested the target hardware with a single shard, loading production 
data and simulating some of our actual queries using Gatling. Our heaviest 
queries are a blend of top_children with filters, match_all 
with query_string filters, and query_string queries 
with has_parent query_string filters. I've narrowed down an optimal shard 
size based on those tests, which should give me the target number of shards 
for my index (num_shards = index_size / max_shard_size; see the worked 
example after the notes below). Some other notes are:

   - memory on the box is split 50/50 between JVM heap and OS.
   - the available OS page cache memory can maybe fit two shards worth of 
   data
   - I have plenty of room left for field and filter caches
   - heap and GC seem to be behaving nicely with a slow and consistent step 
   curve in Bigdesk
   - tests seem to show the box being CPU bound rather than disk IO bound: 
   iostat does not show any disk reads during the high-load periods, Bigdesk 
   shows most time spent in user space, and the highest activity seems to be 
   around young-generation collections
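
To put numbers on the shard-count formula above (illustrative figures only, 
not my actual measurements):

  index_size     = 200 GB
  max_shard_size =  25 GB   (per-shard ceiling from the single-shard Gatling tests)
  num_shards     = 200 / 25 = 8 primary shards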

At this point, what are some of the most important things to consider in 
terms of how many of these shards to co-locate on the same node? 

   - Would adding more shards on the same box negate the tests I just ran?
   - Do I limit the number of shards to how much can fit in page cache on 
   the node? That really blows up the number of nodes I will need which is 
   pretty costly. Is this approach really worth it?
   - Since I'm CPU bound, shouldn't I be worried about more cores if I want 
   more shards per box?

Any pointers are welcome. 
