Hi,
I'd like to share my experience, and at the same time I hope I can get some
tips.

The query was run against an index with about 700 million documents.
Two things happened:
1. The node that ran this query crashed. It is the node configured not to
hold data (a client node).

2. The data nodes went into a GC frenzy. Eventually old-generation GC could
not reduce the heap usage and the nodes became unresponsive. In some cases,
old-generation heap usage even increased after a collection:


[2014-12-20 07:21:03,370][WARN ][monitor.jvm              ] [******]
[gc][young][2796041][224976] duration [1.1s], collections [1]/[1.3s], total
[1.1s]/[3.4h], memory [21.5gb]->[21.2gb]/[29.8gb], all_pools {[young]
[1.4gb]->[3.4mb]/[1.4gb]}{[survivor]
[191.3mb]->[191.3mb]/[191.3mb]}{[old] [19.9gb]->[21gb]/[28.1gb]}


It is a bad query in itself, but I expected the ES cluster to handle it
gracefully. It does throw this exception:

Caused by: org.elasticsearch.common.breaker.CircuitBreakingException:
[FIELDDATA] Data too large, data for [_uid] would be larger than limit of
[19206989414/17.8gb]
I guess ES stopped at some point because the field data exceeded the default
limit, but by then it was too late to stop the query that caused the heap
memory issue.
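For what it's worth, that limit looks like the default fielddata breaker of
60% of the heap (0.6 * 29.8gb is about 17.8gb). I am thinking of tightening
things along these lines in elasticsearch.yml (just a sketch using ES 1.x
settings; the values are guesses I have not tested):

    # Evict old fielddata entries instead of letting the cache grow
    # without bound (it is unbounded by default)
    indices.fielddata.cache.size: 30%

    # Trip the fielddata breaker earlier than the 60% default
    indices.breaker.fielddata.limit: 40%

    # Parent breaker covering all request memory (defaults to 70%)
    indices.breaker.total.limit: 60%

Does that sound like a reasonable direction?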
I am wondering if there is anything obviously wrong with my ES cluster
configuration. I have 5 boxes, each with 125 GB of RAM and 32 cores. I deploy
two data nodes on each of them, with the heap fixed at 31 GB, and the
configuration favors bulk ingesting. I actually saw 60K+ documents per second
of ingest throughput. It was working fine until that query came.
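In case the topology matters, the node roles look roughly like this (a
simplified sketch of the relevant elasticsearch.yml bits; the real files
carry more bulk-indexing tuning, and master eligibility is left at defaults):

    ## data nodes (two instances per box, started with ES_HEAP_SIZE=31g)
    node.data: true

    ## the client node that ran the query (routes requests, holds no data)
    node.master: false
    node.data: false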

Thanks,

Jack


 
