Approximately 45,000,000 documents per shard, running on Tomcat; caching was tweaked in solrconfig.xml and each shard was given a maximum of 12GB of RAM.
<!-- Filter Cache

     Cache used by SolrIndexSearcher for filters (DocSets),
     unordered sets of *all* documents that match a query.

     When a new searcher is opened, its caches may be prepopulated
     or "autowarmed" using data from caches in the old searcher.
     autowarmCount is the number of items to prepopulate. For
     LRUCache, the autowarmed items will be the most recently
     accessed items.

     Parameters:
       class - the SolrCache implementation (LRUCache or FastLRUCache)
       size - the maximum number of entries in the cache
       initialSize - the initial capacity (number of entries) of
         the cache. (see java.util.HashMap)
       autowarmCount - the number of entries to prepopulate from
         an old cache.
  -->
<filterCache class="solr.FastLRUCache"
             size="1200"
             initialSize="1200"
             autowarmCount="128"/>

<!-- Query Result Cache

     Caches results of searches - ordered lists of document ids
     (DocList) based on a query, a sort, and the range of documents
     requested.
  -->
<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="32"/>

<!-- Document Cache

     Caches Lucene Document objects (the stored fields for each
     document). Since Lucene internal document ids are transient,
     this cache will not be autowarmed.
  -->
<documentCache class="solr.LRUCache"
               size="512"
               initialSize="512"
               autowarmCount="0"/>

<!-- Field Value Cache

     Cache used to hold field values that are quickly accessible
     by document id. The fieldValueCache is created by default
     even if not configured here.
  -->
<!--
   <fieldValueCache class="solr.FastLRUCache"
                    size="512"
                    autowarmCount="128"
                    showItems="32" />
  -->

<!-- Custom Cache

     Example of a generic cache. These caches may be accessed by
     name through SolrIndexSearcher.getCache(), cacheLookup(), and
     cacheInsert(). The purpose is to enable easy caching of
     user/application level data. The regenerator argument should
     be specified as an implementation of solr.CacheRegenerator
     if autowarming is desired.
  -->
<!--
   <cache name="myUserCache"
          class="solr.LRUCache"
          size="4096"
          initialSize="1024"
          autowarmCount="1024"
          regenerator="com.mycompany.MyRegenerator" />
  -->

<!-- Lazy Field Loading

     If true, stored fields that are not requested will be loaded
     lazily. This can result in a significant speed improvement
     if the usual case is to not load all stored fields, especially
     if the skipped fields are large compressed text fields.
  -->
<enableLazyFieldLoading>true</enableLazyFieldLoading>

<!-- Use Filter For Sorted Query

     A possible optimization that attempts to use a filter to
     satisfy a search. If the requested sort does not include score,
     then the filterCache will be checked for a filter matching the
     query. If found, the filter will be used as the source of
     document ids, and then the sort will be applied to that.

     For most situations, this will not be useful unless you
     frequently get the same search repeatedly with different sort
     options, and none of them ever use "score".
  -->
<!-- <useFilterForSortedQuery>true</useFilterForSortedQuery> -->

<!-- Result Window Size

     An optimization for use with the queryResultCache. When a search
     is requested, a superset of the requested number of document ids
     are collected. For example, if a search for a particular query
     requests matching documents 10 through 19, and queryWindowSize
     is 50, then documents 0 through 49 will be collected and cached.
     Any further requests in that range can be satisfied via the cache.
  -->
<queryResultWindowSize>50</queryResultWindowSize>

<!-- Maximum number of documents to cache for any entry in the
     queryResultCache. -->
<queryResultMaxDocsCached>200</queryResultMaxDocsCached>

In your case I would first check whether network throughput is the bottleneck. It would help to compare the timestamp at which each shard completes a request with the arrival time (via an HTTP sniffer) at the frontend SOLR server. Then you will see whether the frontend is taking so much time or whether it was a network issue.
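Before reaching for a packet sniffer, note that Solr can report per-shard timings itself: adding shards.info=true to a distributed query makes the aggregator include each shard's elapsed time and hit count in the response. A minimal sketch of comparing those numbers against the total QTime; the response dict below is a hand-made example of that response shape, not data captured from a real cluster:

```python
# Sketch: locate the slowest shard in a distributed Solr response.
# Assumes the query was sent with shards.info=true and wt=json; the
# dict below is an illustrative stand-in for the parsed JSON response.
response = {
    "responseHeader": {"QTime": 5340},  # total time at the aggregator, in ms
    "shards.info": {
        "shard1:8080/solr": {"numFound": 1200543, "time": 310},
        "shard2:8080/solr": {"numFound": 1187020, "time": 4950},
        "shard3:8080/solr": {"numFound": 1203411, "time": 295},
    },
}

def shard_times(resp):
    """Map each shard URL to its reported elapsed time in ms."""
    return {shard: info["time"] for shard, info in resp["shards.info"].items()}

times = shard_times(response)
slowest = max(times, key=times.get)
overhead = response["responseHeader"]["QTime"] - times[slowest]

# If 'overhead' stays large even when every shard answers quickly, the
# extra time is spent merging on the frontend or on the network.
print(f"slowest shard: {slowest} ({times[slowest]} ms)")
print(f"aggregator time beyond slowest shard: {overhead} ms")
```

If the aggregator's QTime dwarfs the slowest shard's time, look at the frontend merge step and the network; if a single shard dominates every query, the shards are likely unbalanced.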
Are your shards well balanced, btw?

On Thu, Nov 24, 2011 at 7:06 PM, Artem Lokotosh <arco...@gmail.com> wrote:
> >> Can you merge, e.g. 3 shards together or is it much effort for your team?
> > Yes, we can merge. We'll try to do this and review how it works.
> Merging does not help :( I've tried to merge two shards into one, and
> three shards into one, but the results are similar to the results of the
> first configuration with 30 shards. This solution also has one big minus:
> the optimization process may take more time.
>
> >> In our setup we currently have 16 shards with ~30GB each, but we rarely
> >> search in all of them at once
> How many documents per shard in your setup? Any difference between
> Tomcat, Jetty or other? Have you configured your servlet container more
> specifically than the default configuration?
>
> On Wed, Nov 23, 2011 at 4:38 PM, Artem Lokotosh <arco...@gmail.com> wrote:
> >> Is this log from the frontend SOLR (aggregator) or from a shard?
> > from aggregator
> >
> >> Can you merge, e.g. 3 shards together or is it much effort for your team?
> > Yes, we can merge. We'll try to do this and review how it works.
> > Thanks, Dmitry
> >
> > Any other ideas?
> >
> > On Wed, Nov 23, 2011 at 4:01 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
> >> Hello,
> >>
> >> Is this log from the frontend SOLR (aggregator) or from a shard?
> >> Can you merge, e.g. 3 shards together or is it much effort for your team?
> >>
> >> In our setup we currently have 16 shards with ~30GB each, but we rarely
> >> search in all of them at once.
> >>
> >> Best,
> >> Dmitry
> >>
> >> On Wed, Nov 23, 2011 at 3:12 PM, Artem Lokotosh <arco...@gmail.com> wrote:
> >
> > --
> > Best regards,
> > Artem Lokotosh mailto:arco...@gmail.com
>
> --
> Best regards,
> Artem Lokotosh mailto:arco...@gmail.com

--
Regards,
Dmitry Kan
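On the merge experiment discussed above: one way to combine existing shard indexes offline is Lucene's IndexMergeTool. The paths, port, and jar names below are illustrative assumptions, not details from this thread, and the Lucene jars must match the version your Solr build uses:

```shell
# Merge two shard indexes into a new index directory (illustrative paths).
java -cp lucene-core.jar:lucene-misc.jar \
  org.apache.lucene.misc.IndexMergeTool \
  /data/solr/merged/index \
  /data/solr/shard1/index /data/solr/shard2/index

# Afterwards, optimize the merged index to compact it into fewer
# segments; as noted above, this step can take a long time on a
# large index.
curl 'http://localhost:8080/solr/merged/update?optimize=true'
```

Since merging multiplies the per-shard document count, it trades shorter aggregation fan-out against longer per-shard query and optimize times, which matches the mixed results reported in the thread.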