RE: NRT and commit behavior

Tirthankar Chatterjee Wed, 21 Sep 2011 20:24:42 -0700

Okay, but is there a threshold for index size, total number of docs in the 
index, or physical memory beyond which sharding should be considered?


I am trying to find the winning combination.
Tirthankar
-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Friday, September 16, 2011 7:46 AM
To: solr-user@lucene.apache.org
Subject: Re: NRT and commit behavior

Uhm, you're putting a lot of index into not very much memory. I really think 
you're going to have to shard your index across several machines to get past 
this problem. Simply increasing the size of your caches is still limited by the 
physical memory you're working with.
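
As a back-of-the-envelope sketch of that mismatch, using only the figures reported further down this thread (267 GB index, 16 GB RAM, 8 GB given to Tomcat). It assumes everything not taken by the JVM heap is available to the OS for caching index files, which is optimistic:

```python
# Figures reported elsewhere in this thread.
index_gb = 267       # on-disk index size
ram_gb = 16          # physical memory on the machine
jvm_heap_gb = 8      # heap allocated to Tomcat

# ASSUMPTION: whatever the JVM does not take is what the OS can use to
# cache index files (ignores the OS itself and any other processes).
page_cache_gb = ram_gb - jvm_heap_gb
cached_fraction = page_cache_gb / index_gb

print(f"OS cache available: ~{page_cache_gb} GB")
print(f"Fraction of index that fits in OS cache: {cached_fraction:.1%}")
```

Under those assumptions only a few percent of the index can be held in memory at once, so most searcher opens and warm-up queries will hit the disk.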

You really have to put a profiler on the system to see what's going on. At that 
size there are too many things that it *could* be to definitively answer it 
with e-mails....

Best
Erick

On Wed, Sep 14, 2011 at 7:35 AM, Tirthankar Chatterjee 
<tchatter...@commvault.com> wrote:
> Erick,
> Also, we have tried increasing the cache sizes in our solrconfig (below). 
> Setting the autowarmCount values below to 0 makes the commit call return 
> within a second, but that will slow down our searches.
>
> <filterCache
>      class="solr.FastLRUCache"
>      size="16384"
>      initialSize="4096"
>      autowarmCount="4096"/>
>
>    <!-- Cache used to hold field values that are quickly accessible
>         by document id.  The fieldValueCache is created by default
>         even if not configured here.
>      <fieldValueCache
>        class="solr.FastLRUCache"
>        size="512"
>        autowarmCount="128"
>        showItems="32"
>      />
>    -->
>
>   <!-- queryResultCache caches results of searches - ordered lists of
>         document ids (DocList) based on a query, a sort, and the range
>         of documents requested.  -->
>    <queryResultCache
>      class="solr.LRUCache"
>      size="16384"
>      initialSize="4096"
>      autowarmCount="4096"/>
>
>  <!-- documentCache caches Lucene Document objects (the stored fields for
>       each document). Since Lucene internal document ids are transient,
>       this cache will not be autowarmed.  -->
>    <documentCache
>      class="solr.LRUCache"
>      size="512"
>      initialSize="512"
>      autowarmCount="512"/>
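
A rough sketch of why these autowarmCount values can dominate commit time. It assumes each autowarmed entry is regenerated one after another against the new searcher, and the per-entry cost used here is an illustrative guess, not a measurement from this system:

```python
def estimated_warm_seconds(autowarm_counts, avg_seconds_per_entry):
    """Upper-bound warm-up time if every autowarmed entry is regenerated
    sequentially against the new searcher."""
    return sum(autowarm_counts.values()) * avg_seconds_per_entry

# autowarmCount values from the config above. documentCache is left out
# because, as its comment says, it is not autowarmed.
counts = {"filterCache": 4096, "queryResultCache": 4096}

# ASSUMPTION: 0.3 s per regenerated entry is a made-up figure for
# illustration; measure the real cost on your own index.
seconds = estimated_warm_seconds(counts, avg_seconds_per_entry=0.3)
print(f"~{seconds / 60:.0f} minutes of warming per commit")
```

Even a modest per-entry cost multiplies into tens of minutes at these counts, which fits the observation that setting autowarmCount to 0 makes commits return quickly.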
>
> -----Original Message-----
> From: Tirthankar Chatterjee [mailto:tchatter...@commvault.com]
> Sent: Wednesday, September 14, 2011 7:31 AM
> To: solr-user@lucene.apache.org
> Subject: RE: NRT and commit behavior
>
> Erick,
> Here is the answer to your questions:
> Our index is 267 GB.
> We are not optimizing.
> No, we have not profiled yet to find the bottleneck, but the logs indicate 
> that opening the searchers is taking time.
> Nothing except Solr is running on the machine.
> Total memory is 16 GB; Tomcat has 8 GB allocated. Everything is 64-bit: OS, 
> JVM, and Tomcat.
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Sunday, September 11, 2011 11:37 AM
> To: solr-user@lucene.apache.org
> Subject: Re: NRT and commit behavior
>
> Hmm, OK. You might want to look at the non-cached filter query stuff; it's 
> quite recent.
> The point here is that it is a filter that is applied only after all of the 
> less expensive filter queries are run. One of its uses is exactly ACL 
> calculations. Rather than calculating the ACL for the entire doc set, it only 
> calculates access for docs that have made it past all the other elements of 
> the query. See SOLR-2429, and note that it is available only in 3.4 
> (currently being released).
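
A minimal sketch of what such a request might look like, assuming the `cache=false` and `cost` local params introduced by SOLR-2429. The field name `acl`, its values, and the other query terms are hypothetical. Note that true post-filtering (checking only docs that passed everything else) additionally requires a query type that supports it, such as `frange`; for an ordinary query, `cache=false` with a `cost` mainly skips the filterCache and orders the filter after cheaper ones:

```python
from urllib.parse import urlencode

def acl_filter(acl_query, cost=100):
    """Wrap a filter query in local params so it is not cached and is
    ordered after cheaper filters (a higher cost runs later)."""
    return f"{{!cache=false cost={cost}}}{acl_query}"

# ASSUMPTION: the "acl" and "type" fields and their values are
# placeholders, not taken from the poster's schema.
params = [
    ("q", "report"),
    ("fq", "type:document"),                          # cheap, cacheable
    ("fq", acl_filter("acl:(group_a OR group_b)")),   # expensive, uncached
]
print("/select?" + urlencode(params))
```

Because the per-user ACL filter never enters the filterCache, it also never needs to be autowarmed on commit.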
>
> As to why your commits are taking so long, I have no idea given that you 
> really haven't given us much to work with.
>
> How big is your index? Are you optimizing? Have you profiled the application 
> to see what the bottleneck is (I/O, CPU, etc?). What else is running on your 
> machine? It's quite surprising that it takes that long. How much memory are 
> you giving the JVM? etc...
>
> You might want to review: 
> http://wiki.apache.org/solr/UsingMailingLists
>
> Best
> Erick
>
>
> On Fri, Sep 9, 2011 at 9:41 AM, Tirthankar Chatterjee 
> <tchatter...@commvault.com> wrote:
>> Erick,
>> What you said is correct. For us, searches are based on Active 
>> Directory permissions, which are passed in the filter query parameter. So we 
>> don't have any warming queries, as we cannot fire one for every user ahead 
>> of time.
>>
>> What we do instead is that when a user logs in, we run an invalid query (one 
>> that returns no results, instead of '*') with the correct filter query (the 
>> user's permissions based on the login). This way the cache gets warmed up 
>> with valid docs.
>>
>> It works then.
>>
>>
>> Also, can you please let me know why a commit is taking 45 minutes to 1 hour 
>> on well-resourced hardware with multiple processors, 16 GB RAM, a 64-bit VM, 
>> etc.? We tried passing waitSearcher as false and found that inside the code 
>> it is hard-coded to true. Is there any specific reason? Can we change that 
>> value to honor what is being passed?
>>
>> Thanks,
>> Tirthankar
>>
>> -----Original Message-----
>> From: Erick Erickson [mailto:erickerick...@gmail.com]
>> Sent: Thursday, September 01, 2011 8:38 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: NRT and commit behavior
>>
>> Hmm, I'm guessing a bit here, but using an invalid query doesn't sound very 
>> safe, but I suppose it *might* be OK.
>>
>> What does "invalid" mean? A syntax error? Not safe.
>>
>> A search that returns 0 results? I'd guess that filling your caches, which 
>> is the point of warming queries, might be short-circuited if the query 
>> returns 0 results, but I don't know for sure.
>>
>> But the fact that "invalid queries return quicker" does not inspire 
>> confidence since the *point* of warming queries is to spend the time up 
>> front so your users don't have to wait.
>>
>> So here's a test. Comment out your warming queries.
>> Restart your server and fire the warming query from the browser 
>> with &debugQuery=on and look at the QTime parameter.
>>
>> Now fire the same form of the query (as in the same sort, facet, grouping, 
>> etc, but presumably a valid term). See the QTime.
>>
>> Now fire the same form of the query with a *different* value in the query. 
>> That is, it should search on different terms but with the same sort, facet, 
>> etc. to avoid getting your data straight from the queryResultCache.
>>
>> My guess is that the last query will return much more quickly than the 
>> second query. Which would indicate that the first form isn't doing you any 
>> good.
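
The three-step test above could be scripted roughly as follows instead of being run by hand in a browser. This is only a sketch: the Solr URL, field names, and query terms are placeholders, and it assumes the JSON response writer (`wt=json`) is enabled:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# ASSUMPTION: placeholder URL; point this at your own Solr instance.
SOLR_SELECT = "http://localhost:8983/solr/select"

def build_url(q, fq, sort="score desc"):
    """Same query form every time; only q changes between runs."""
    params = {"q": q, "fq": fq, "sort": sort,
              "debugQuery": "on", "wt": "json"}
    return SOLR_SELECT + "?" + urlencode(params)

def qtime_ms(response_body):
    """Pull QTime (milliseconds) out of a Solr JSON response."""
    return json.loads(response_body)["responseHeader"]["QTime"]

def fetch(url):
    """Fetch a URL and return the response body as text."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")

def compare(fq, queries, fetch=fetch):
    """Run each (label, q) pair in order and return {label: QTime}."""
    return {label: qtime_ms(fetch(build_url(q, fq))) for label, q in queries}
```

After restarting Solr with the warming queries commented out, one would call something like `compare("acl:group_a", [("no-result query", "text:zzzznomatch"), ("valid term", "text:invoice"), ("different term", "text:contract")])` and compare the three QTime values.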
>>
>> But a test is worth a thousand opinions.
>>
>> Best
>> Erick
>>
>> On Wed, Aug 31, 2011 at 11:04 AM, Tirthankar Chatterjee 
>> <tchatter...@commvault.com> wrote:
>>> Also noticed that the "waitSearcher" parameter value is not honored inside 
>>> commit. It always defaults to true, which makes indexing slow.
>>>
>>> What we are trying to do is use an invalid query (which won't return any 
>>> results) as a warming query. This way the commit returns faster. Are we 
>>> doing something wrong here?
>>>
>>> Thanks,
>>> Tirthankar
>>>
>>> -----Original Message-----
>>> From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
>>> Sent: Monday, July 18, 2011 11:38 AM
>>> To: solr-user@lucene.apache.org; yo...@lucidimagination.com
>>> Subject: Re: NRT and commit behavior
>>>
>>> In practice, in my experience at least, a very 'expensive' commit 
>>> can still slow down searches significantly, I think just due to CPU 
>>> (or I/O?) starvation. Not sure anything can be done about that. That's my 
>>> experience in Solr 1.4.1, but since searches have always been async with 
>>> commits, it probably is the same situation even in more recent versions, 
>>> I'd guess.
>>>
>>> On 7/18/2011 11:07 AM, Yonik Seeley wrote:
>>>> On Mon, Jul 18, 2011 at 10:53 AM, Nicholas Chase <nch...@earthlink.net> 
>>>> wrote:
>>>>> Very glad to hear that NRT is finally here!  But my question is this:
>>>>> will things still come to a standstill during a commit?
>>>> New updates can now proceed in parallel with a commit, and searches 
>>>> have always been completely asynchronous w.r.t. commits.
>>>>
>>>> -Yonik
>>>> http://www.lucidimagination.com
>>>>
>>>
>>
>
