Re: Optimizing in SolrCloud
Don't. "Optimize" is a poorly chosen name for a full merge. It doesn't make that much difference, and there is almost never a need to do it on a periodic basis. A full merge will also mean a longer time between the commit and the time that the data is first searchable. Do the commit, then search.

wunder

On Mar 29, 2012, at 4:04 PM, Jamie Johnson wrote:

What is the best way to periodically optimize a Solr index? I've seen a few places where this is done from a cron job, but I wanted to know if there are any other techniques that are used in practice for doing this. My use case is that we generally load a large corpus of data up front and then information trickles in after that, but we want this information to be available for search within a reasonable amount of time (say 10 minutes). I believe that the cron job would probably suffice, but if there are any other thoughts/suggestions I'd be interested to hear them.
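For the "searchable within 10 minutes" requirement, one option that avoids both a cron job and an optimize is to ask Solr to commit within a time window when the documents are added. The sketch below builds a Solr XML add message carrying the commitWithin attribute. The helper names, the localhost URL, and the field names are assumptions made for illustration, and the exact update endpoint path depends on your Solr version and core/collection layout.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint; adjust host, port, and core/collection name for your setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"

TEN_MINUTES_MS = 10 * 60 * 1000


def build_add_xml(docs, commit_within_ms=TEN_MINUTES_MS):
    """Return a Solr XML <add> message asking for a commit within the given window.

    `docs` is a list of dicts mapping field name -> value (hypothetical schema).
    """
    add = ET.Element("add", {"commitWithin": str(commit_within_ms)})
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field_el = ET.SubElement(doc_el, "field", {"name": name})
            field_el.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_update_request(xml_body, url=SOLR_UPDATE_URL):
    """Build (but do not send) the HTTP POST for the update handler."""
    return urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    body = build_add_xml([{"id": "doc-1", "title": "late-arriving document"}])
    print(body)
    # To actually send it to a running Solr instance:
    # urllib.request.urlopen(build_update_request(body))
```

No optimize is involved here. The commit makes the new documents searchable, and segment merging is left to the normal merge policy.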
Re: Optimizing in SolrCloud
Thanks. Does it matter that we are also making updates to documents at various times? Do the deleted documents get removed when doing a merge, or does that only get done on an optimize?
Re: Optimizing in SolrCloud
On Thu, Mar 29, 2012 at 7:15 PM, Jamie Johnson jej2...@gmail.com wrote:

Thanks. Does it matter that we are also making updates to documents at various times? Do the deleted documents get removed when doing a merge, or does that only get done on an optimize?

Yes, any merge removes documents that have been marked as deleted (from the segments involved in the merge). Optimize can still make sense, but more often in scenarios where documents are updated infrequently.

-Yonik
Re: Optimizing in SolrCloud
Deleted documents stop appearing in search results as soon as the delete is committed. The space they occupy is reclaimed the next time the segment that held them is merged.

wunder

On Mar 29, 2012, at 4:15 PM, Jamie Johnson wrote:

Thanks, does it matter that we are also updates to documents at various times? Do the deleted documents get removed when doing a merge or does that only get done on an optimize?
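To make the two replies above concrete, here is a toy model of the behavior they describe: a delete only marks a document, the marked document disappears from search, and its space comes back only when the segment holding it takes part in a merge. This is a simplified illustration, not Lucene's actual data structures or merge policy; every class and function name in it is made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy segment: stored docs plus a set of ids marked as deleted."""
    docs: dict                                  # doc_id -> content
    deleted: set = field(default_factory=set)   # ids marked deleted, still stored

    def live_docs(self):
        return {k: v for k, v in self.docs.items() if k not in self.deleted}

    def stored_count(self):
        # Deleted docs still occupy space until the segment is merged.
        return len(self.docs)


def delete(segments, doc_id):
    """Mark doc_id as deleted wherever it lives; nothing is physically removed."""
    for seg in segments:
        if doc_id in seg.docs:
            seg.deleted.add(doc_id)


def searchable_ids(segments):
    """Ids visible to search: everything not marked deleted."""
    return sorted(doc_id for seg in segments for doc_id in seg.live_docs())


def merge(segments, indexes):
    """Merge the chosen segments into one new segment, dropping their deleted docs.

    Segments not listed in `indexes` are left untouched, deletes included.
    """
    merged = {}
    for i in indexes:
        merged.update(segments[i].live_docs())
    untouched = [s for j, s in enumerate(segments) if j not in indexes]
    return untouched + [Segment(merged)]


def total_stored(segments):
    return sum(seg.stored_count() for seg in segments)


if __name__ == "__main__":
    index = [Segment({"a": 1, "b": 2}), Segment({"c": 3, "d": 4}), Segment({"e": 5})]
    delete(index, "b")
    delete(index, "c")
    print(searchable_ids(index), total_stored(index))
    partial = merge(index, [0, 2])      # ordinary merge of two segments
    print(searchable_ids(partial), total_stored(partial))
    full = merge(index, [0, 1, 2])      # what "optimize" amounts to
    print(searchable_ids(full), total_stored(full))
```

In this model the partial merge reclaims the space for "b" but not for "c", because the segment holding "c" was not part of that merge. Only the full merge reclaims everything, which is the one thing an optimize buys you over ordinary merging.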