Re: Solr hangs / LRU operations are heavy on cpu

2015-03-22 Thread Umesh Prasad
We use filters very heavily because we run an e-commerce site that has a
lot of faceting and drill-downs configured at different paths on the store.

We are using master-slave replication, and we use the slaves to support
higher QPS.

filterCache :
 Concurrent LFU Cache(maxSize=1, initialSize=4000, minSize=9000,
acceptableSize=9500, cleanupThread=true, timeDecay=true).

We see a 95-99% hit ratio on the filter cache, and most of our evictions
are on the filter cache.

These are figures from one of our prod boxes:

   - size:9260
   - warmupTime:272007
   - timeDecay:true
   - cumulative_lookups:9220776
   - cumulative_hits:9048703
   - cumulative_hitratio:0.98
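
As a quick sanity check on the figures above, the reported ratio can be recomputed from the two cumulative counters. This is only a minimal sketch; the class and method names are hypothetical and not part of Solr. With these numbers the ratio comes out at roughly 0.981, which leaves 172,073 misses.

```java
// Minimal sketch: recompute the hit ratio from the cumulative counters.
// CacheStats and hitRatio are hypothetical names, not a Solr API.
final class CacheStats {

    /** Returns hits / lookups, or 0.0 when there have been no lookups. */
    static double hitRatio(long hits, long lookups) {
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }

    public static void main(String[] args) {
        long lookups = 9_220_776L; // cumulative_lookups from the stats above
        long hits = 9_048_703L;    // cumulative_hits from the stats above
        long misses = lookups - hits;
        System.out.println("hit ratio = " + hitRatio(hits, lookups));
        System.out.println("misses    = " + misses);
    }
}
```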


We had the default (untuned) cache settings two years back, and our
performance numbers were really bad. We got roughly a 25% latency improvement
by tuning our caches properly, so tuning the caches was well worth the effort.





Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Shawn Heisey
On 3/19/2015 8:49 PM, Umesh Prasad wrote:
 It might be because LRUCache by default will try to evict its entries on
 each call to put and putAll. LRUCache is built on top of java's
 LinkedHashMap. Check the javadoc of removeEldestEntry
 http://docs.oracle.com/javase/7/docs/api/java/util/LinkedHashMap.html#removeEldestEntry%28java.util.Map.Entry%29


 Try using LFUCache and a separate cleanup thread .. We have been using that
 for over 2 yrs now without any issues ..

All cache implementations evict old entries on put if the cache is
full, including LFUCache.  What's different is how the evicted entry is
chosen and how efficient the eviction process is.

I wrote the LFUCache implementation that's currently in Solr.  It is the
most basic naive implementation of LFU that you can write, the kind of
thing that a beginning Computer Science student would write to show a
correct implementation. :)  It's probably suitable for very small cache
sizes (double digits), but if the cache size is large, LFUCache is very
inefficient at eviction.  With a large size, it might hit the CPU even
harder than LRUCache.
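
To illustrate why a naive LFU gets expensive at eviction time, here is a rough sketch of the textbook approach. It is not the Solr LFUCache source; the class name and structure are assumptions made purely for illustration, and it is not thread-safe. The point is that every eviction scans all entries to find the one with the fewest hits, so the cost of a put on a full cache grows with the cache size.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative naive LFU cache. NOT the Solr implementation; not thread-safe.
final class NaiveLfuCache<K, V> {

    private static final class Entry<V> {
        V value;
        long hits; // number of successful get() calls for this entry
        Entry(V value) { this.value = value; }
    }

    private final int maxSize;
    private final Map<K, Entry<V>> map = new HashMap<>();

    NaiveLfuCache(int maxSize) {
        if (maxSize < 1) throw new IllegalArgumentException("maxSize must be >= 1");
        this.maxSize = maxSize;
    }

    V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        e.hits++;
        return e.value;
    }

    void put(K key, V value) {
        Entry<V> existing = map.get(key);
        if (existing != null) {
            existing.value = value; // overwrite in place, keep the hit count
            return;
        }
        if (map.size() >= maxSize) evictLeastFrequentlyUsed();
        map.put(key, new Entry<>(value));
    }

    // O(n) scan over every entry on each eviction: cheap for a few dozen
    // entries, costly when the cache holds thousands.
    private void evictLeastFrequentlyUsed() {
        K victim = null;
        long min = Long.MAX_VALUE;
        for (Map.Entry<K, Entry<V>> me : map.entrySet()) {
            if (me.getValue().hits < min) {
                min = me.getValue().hits;
                victim = me.getKey();
            }
        }
        map.remove(victim);
    }

    boolean containsKey(K key) { return map.containsKey(key); }

    int size() { return map.size(); }
}
```

An LRU built on a linked list only has to unlink the oldest node, which is why LRU eviction is generally cheaper than this kind of scan.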

I have written a much better implementation that's more efficient, but I
still need to polish the code and commit it.

As a general rule, I would expect the LRU implementations to always be
more efficient at eviction than any implementation of LFU, but some
query patterns will have a higher cache hitCount with an LFU
implementation, so the tradeoff might be worth making.

Thanks,
Shawn



Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Sergey Shvets
Hello Umesh,

Thank you, that has indeed given positive results so far.

We changed over completely to LFU, and today it went quite okay. We will wait
until it shows more stability and then work out the optimal cache size.

Below is a summary of the changes.

- <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
- <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
- <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
- <cache name="perSegFilter" class="solr.search.LRUCache" size="10" initialSize="0" autowarmCount="10" regenerator="solr.NoOpRegenerator"/>
+ <filterCache class="solr.LFUCache" size="512" initialSize="512" autowarmCount="0" cleanupThread="True"/>
+ <queryResultCache class="solr.LFUCache" size="512" initialSize="512" autowarmCount="0" cleanupThread="True"/>
+ <documentCache class="solr.LFUCache" size="512" initialSize="512" autowarmCount="0" cleanupThread="True"/>
+ <fieldValueCache class="solr.LFUCache" size="512" autowarmCount="256" showItems="10" cleanupThread="True"/>
+ <cache name="perSegFilter" class="solr.LFUCache" size="10" initialSize="0" autowarmCount="10" regenerator="solr.NoOpRegenerator" cleanupThread="True"/>



-- 
Best regards,
 Sergey  mailto:ser...@bintime.com



Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Sergey Shvets
Hello Shawn,

In that case, the behavior we noticed is a bit strange.
LRU was heavy on the CPU in the thread dumps, and I don't have any
reasonable explanation for that.

However, the switch to LFU has seemingly solved the case.



-- 
Best regards,
 Sergey  mailto:ser...@bintime.com



Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Yonik Seeley
The document cache is not really going to be taking up time here.
How many concurrent requests (threads) are you testing with here?

One thing I've seen over the years is a false sense of what is taking
up time when benchmarks with a lot of threads are used.  The reason is
that when there are a lot more threads than CPUs, it's natural for
context switches to happen where synchronizations happen.  You look at
a profiler or thread dumps, and you see a bunch of threads piled up on
synchronization.  This does not mean that removing that
synchronization will really help anything... the threads can't all run
at once.
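
The effect described above can be reproduced with a small sketch. The names here are hypothetical and this is not Solr code. One thread holds a lock while many workers try to enter the same synchronized block. A thread dump taken at that moment shows every worker BLOCKED on the monitor, even though none of them is burning CPU there.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: threads waiting on a monitor look "piled up" in a thread dump,
// but they are parked, not consuming CPU. Hypothetical demo class.
final class BlockedThreadsDemo {

    /** Starts the given number of workers contending for one lock and
     *  returns how many of them were observed in the BLOCKED state. */
    static int countBlocked(int workers) {
        final Object lock = new Object();
        List<Thread> threads = new ArrayList<>();
        int blocked = 0;
        try {
            synchronized (lock) { // hold the lock so every worker has to wait
                for (int i = 0; i < workers; i++) {
                    Thread t = new Thread(() -> {
                        synchronized (lock) {
                            // stand-in for a short synchronized cache put
                        }
                    });
                    t.start();
                    threads.add(t);
                }
                // Poll (for at most 5 seconds) until all workers are BLOCKED.
                long deadline = System.nanoTime() + 5_000_000_000L;
                do {
                    blocked = 0;
                    for (Thread t : threads) {
                        if (t.getState() == Thread.State.BLOCKED) blocked++;
                    }
                    if (blocked < workers) Thread.sleep(10);
                } while (blocked < workers && System.nanoTime() < deadline);
            }
            // Lock released: the workers now run through one at a time.
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return blocked;
    }

    public static void main(String[] args) {
        System.out.println("BLOCKED workers: " + countBlocked(50));
    }
}
```

Seeing many threads in this state says where they wait, not that the synchronized section itself is the expensive part.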

-Yonik


On Thu, Mar 19, 2015 at 6:35 PM, Sergey Shvets ser...@bintime.com wrote:
 Hi,

 We have quite a problem with Solr. We are running it in a 6x3 configuration,
 and Solr suddenly started to hang, taking all the available CPU on the nodes.

 In the thread dumps we noticed that frames like these can eat a lot of CPU time:


 - org.apache.solr.search.LRUCache.put(LRUCache.java:116)
 - org.apache.solr.search.SolrIndexSearcher.doc(SolrIndexSearcher.java:705)
 - org.apache.solr.response.BinaryResponseWriter$Resolver.writeResultsBody(BinaryResponseWriter.java:155)
 - org.apache.solr.response.BinaryResponseWriter$Resolver.writeResults(BinaryResponseWriter.java:183)
 - org.apache.solr.response.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:88)
 - org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:158)
 - org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:148)
 - org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:242)
 - org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:153)
 - org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:96)
 - org.apache.solr.response.BinaryResponseWriter.write(BinaryResponseWriter.java:52)
 - org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:758)
 - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:426)
 - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
 - org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
 - org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
 - org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
 - org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
 - org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
 - org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
 - org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
 - org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)


 The cache configuration itself is very minimal:


 <filterCache class="solr.FastLRUCache" size="512" initialSize="512"
              autowarmCount="0"/>
 <queryResultCache class="solr.LRUCache" size="512"
                   initialSize="512" autowarmCount="0"/>
 <documentCache class="solr.LRUCache" size="512" initialSize="512"
                autowarmCount="0"/>
 <fieldValueCache class="solr.FastLRUCache" size="1024"
                  autowarmCount="256" showItems="10"/>
 <cache name="perSegFilter" class="solr.search.LRUCache" size="10"
        initialSize="0" autowarmCount="10"
        regenerator="solr.NoOpRegenerator"/>
 <enableLazyFieldLoading>true</enableLazyFieldLoading>
 <queryResultWindowSize>20</queryResultWindowSize>
 <queryResultMaxDocsCached>200</queryResultMaxDocsCached>

 Solr version is 4.10.3

 Any help is appreciated!

 sergey


Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Chris Hostetter
: we have quite a problem with Solr. We are running it in a config 6x3, and
: suddenly solr started to hang, taking all the available cpu on the nodes.
: 
: In the threads dump noticed things like this can eat lot of CPU time
: 
: 
:- org.apache.solr.search.LRUCache.put(LRUCache.java:116)
:- org.apache.solr.search.SolrIndexSearcher.doc(SolrIndexSearcher.java:705)

That specific code path pertains to the documentCache - this 
particular thread appears to be blocked on inserting docs into that 
(synchronized) map because of some other thread already doing an insert.

Depending on your usage patterns, you may find it better to just disable 
the documentCache completely -- it's primarily useful when you have lots 
of stored fields in your docs, and a lot of hot documents that are 
frequently returned by a lot of different searches (i.e., because you always 
sort on the same sets of fields) ... but if you aren't seeing any hits on 
your documentCache, just get rid of it.


The choice of having a documentCache, and what type of cacheImpl to use for 
the doc cache, can be completely independent of what impl you use for other 
caches (maybe you disable the doc cache, use LRU for the filterCache, and 
LFU for the queryResultCache -- they are all independently configurable; 
one size doesn't necessarily fit all).



-Hoss
http://www.lucidworks.com/

Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Erick Erickson
Are you faceting? That can sometimes use one of the caches
(just glanced at stack trace...) as entries are pushed into and
removed from the cache during the same request. Shot
in the dark.

Best,
Erick



Re: Solr hangs / LRU operations are heavy on cpu

2015-03-19 Thread Umesh Prasad
It might be because LRUCache by default will try to evict entries on
each call to put and putAll. LRUCache is built on top of Java's
LinkedHashMap. Check the Javadoc of removeEldestEntry:
http://docs.oracle.com/javase/7/docs/api/java/util/LinkedHashMap.html#removeEldestEntry%28java.util.Map.Entry%29
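
For readers who have not used that hook, here is a minimal sketch of an LRU map built on LinkedHashMap. It is not the Solr LRUCache source, and the class name is hypothetical. LinkedHashMap calls removeEldestEntry after every put and putAll, which is where the per-insert eviction check comes from. The map is not thread-safe on its own, so callers have to synchronize around it.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of an LRU map on top of LinkedHashMap. NOT Solr's LRUCache source.
final class SimpleLruMap<K, V> extends LinkedHashMap<K, V> {

    private final int maxSize;

    SimpleLruMap(int maxSize) {
        // accessOrder = true: iteration order runs from least recently
        // accessed to most recently accessed entry.
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    // Invoked by LinkedHashMap after each insertion; returning true
    // removes the eldest (least recently used) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    public static void main(String[] args) {
        // Wrap in a synchronized view so concurrent callers serialize on one lock.
        Map<String, Integer> cache = Collections.synchronizedMap(new SimpleLruMap<>(2));
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // evicts "b", the least recently used entry
        System.out.println(cache.keySet());
    }
}
```

Because even a read reorders the linked list in access-order mode, every get and put has to go through the same lock, which is the synchronization that shows up in thread dumps under load.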


Try using LFUCache with a separate cleanup thread. We have been using that
for over two years now without any issues.

For a comparison of the cache implementations in Solr, see:
https://cwiki.apache.org/confluence/display/solr/Query+Settings+in+SolrConfig



-- 
Thanks & Regards
Umesh Prasad
Tech Lead @ flipkart.com

 in.linkedin.com/pub/umesh-prasad/6/5bb/580/