Re: Index optimize runs in background.

2015-06-11 Thread Walter Underwood
Why would you care when the forced merge (not an “optimize”) is done? Start it and get back to work. Or even better, never force merge and let the algorithm take care of it. Seriously, I’ve been giving this advice since before Lucene was written, because Ultraseek had the same approach for

Re: Index optimize runs in background.

2015-06-11 Thread Upayavira
Until somewhere around Lucene 3.5, you needed to optimise, because the merge strategy used wasn't that clever and left lots of deletes in your largest segment. Around that point, the TieredMergePolicy became the default. Because its algorithm is much more sophisticated, it took away the need to

Re: Index optimize runs in background.

2015-06-10 Thread Erick Erickson
If I knew, I would fix it ;). The sub-optimizes (i.e. the ones sent out to each replica) should be sent in parallel and then each thread should wait for completion from the replicas. There is no real check for optimize, I believe that the return from the call is considered sufficient. If we can

Re: Index optimize runs in background.

2015-06-10 Thread Modassar Ather
Hi, There are 5 cores and a separate server for indexing on this SolrCloud. Can you please share your suggestions on: How can the indexer know that the optimize has completed, even if the commit/optimize runs in the background, without going to the Solr servers, maybe by using SolrJ or some other API? I

Re: Index optimize runs in background.

2015-06-04 Thread Erick Erickson
Can't get any failures to happen on my end so I really haven't a clue. Best, Erick On Thu, Jun 4, 2015 at 3:17 AM, Modassar Ather modather1...@gmail.com wrote: Hi, Please provide your inputs on optimize and commit running as background. Your suggestion will be really helpful. Thanks,

Re: Index optimize runs in background.

2015-06-04 Thread Modassar Ather
Hi, Please provide your inputs on optimize and commit running as background. Your suggestion will be really helpful. Thanks, Modassar On Tue, Jun 2, 2015 at 6:05 PM, Modassar Ather modather1...@gmail.com wrote: Erick! I could not find any underlying setting of 10 minutes. It is not only

Re: Index optimize runs in background.

2015-06-02 Thread Modassar Ather
Erick! I could not find any underlying setting of 10 minutes. It is not only optimize; commit is also behaving in the same fashion and is taking less time than it usually did. As per my observation both are running in the background. On Fri, May 29, 2015 at 7:21 PM, Erick Erickson

Re: Index optimize runs in background.

2015-05-29 Thread Modassar Ather
I have not added any timeout in the indexer except zk client time out which is 30 seconds. I am simply calling client.close() at the end of indexing. The same code was not running in background for optimize with solr-4.10.3 and org.apache.solr.client.solrj.impl.CloudSolrServer. On Fri, May 29,

Re: Index optimize runs in background.

2015-05-29 Thread Erick Erickson
I'm not talking about you setting a timeout, but the underlying connection timing out... The "10 minutes, then the indexer exits" comment points in that direction. Best, Erick On Thu, May 28, 2015 at 11:43 PM, Modassar Ather modather1...@gmail.com wrote: I have not added any timeout in the

Re: Index optimize runs in background.

2015-05-28 Thread Modassar Ather
The indexer takes almost 2 hours to optimize. It has a multi-threaded add of batches of documents to org.apache.solr.client.solrj.impl.CloudSolrClient. Once all the documents are indexed it invokes commit and optimize. I have seen that the optimize goes into background after 10 minutes and indexer

Re: Index optimize runs in background.

2015-05-28 Thread Erick Erickson
Are you timing out on the client request? The theory here is that it's still a synchronous call, but you're just timing out at the client level. At that point, the optimize is still running; it's just that the connection has been dropped. Shot in the dark. Erick On Thu, May 28, 2015 at 10:31 PM,
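
The theory above (a synchronous call whose client gives up while the server keeps working) can be illustrated with a small stand-alone sketch. This is an analogy only, not Solr or SolrJ code: `slow_merge` and `call_with_client_timeout` are hypothetical names, the worker thread plays the role of the Solr node, and the durations are arbitrary.

```python
import concurrent.futures
import threading
import time


def slow_merge(done: threading.Event, seconds: float) -> str:
    """Hypothetical stand-in for the server-side forced merge (not a Solr call)."""
    time.sleep(seconds)
    done.set()  # the "server" marks the work finished
    return "merged"


def call_with_client_timeout(job_seconds: float, client_timeout: float):
    """Start the job, wait at most client_timeout seconds, then stop waiting.

    Returns (timed_out, done_event, future). The job is never cancelled:
    if the wait times out, it simply keeps running in its own thread.
    """
    done = threading.Event()
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(slow_merge, done, job_seconds)
    try:
        future.result(timeout=client_timeout)
        timed_out = False
    except concurrent.futures.TimeoutError:
        # Only the waiting side gave up; the worker is unaffected.
        timed_out = True
    pool.shutdown(wait=False)  # do not block the caller on the running job
    return timed_out, done, future


if __name__ == "__main__":
    timed_out, done, future = call_with_client_timeout(0.3, 0.05)
    print("client timed out:", timed_out)
    print("job finished when client returned:", done.is_set())
    print("job result once it completes:", future.result())
```

From the caller's side this looks like the job "went into the background", although the call itself was synchronous. Whether this is what actually happened in the reported case is not established in the thread.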

Re: Index optimize runs in background.

2015-05-28 Thread Modassar Ather
I did not notice it, but going by my past experience, commit, which used to take around 2 minutes, is now taking around 8 seconds. I think this is also running in the background. On Fri, May 29, 2015 at 10:52 AM, Modassar Ather modather1...@gmail.com wrote: The indexer takes almost 2 hours to

Re: Index optimize runs in background.

2015-05-27 Thread Upayavira
In this case, optimising makes sense: once the index is generated, you are not updating it. Upayavira On Wed, May 27, 2015, at 06:14 AM, Modassar Ather wrote: Our index has almost 100M documents running on SolrCloud of 5 shards and each shard has an index size of about 170+GB (for the record,

Re: Index optimize runs in background.

2015-05-27 Thread Erick Erickson
All strange of course. What do your Solr logs show when this happens? And how reproducible is this? Best, Erick On Wed, May 27, 2015 at 4:00 AM, Upayavira u...@odoko.co.uk wrote: In this case, optimising makes sense: once the index is generated, you are not updating it. Upayavira On Wed,

Re: Index optimize runs in background.

2015-05-26 Thread Modassar Ather
Our index has almost 100M documents running on SolrCloud of 5 shards and each shard has an index size of about 170+GB (for the record, we are not using stored fields - our documents are pretty large). We perform a full indexing every weekend and during the week there are no updates made to the

Re: Index optimize runs in background.

2015-05-26 Thread Modassar Ather
Hi, Erick, you mentioned a unit test for the optimize running in the background. Kindly share your findings if any. Thanks, Modassar On Mon, May 25, 2015 at 11:47 AM, Modassar Ather modather1...@gmail.com wrote: Thanks everybody for your replies. I have noticed the optimization running

Re: Index optimize runs in background.

2015-05-26 Thread Upayavira
Modassar, Are you saying that the reason you are optimising is because you have been doing it for years? If this is the only reason, you should stop doing it immediately. The one scenario in which optimisation still makes some sense is when you reindex every night and optimise straight after.

Re: Index optimize runs in background.

2015-05-26 Thread Erick Erickson
No results yet. I finished the test harness last night (not really a unit test, a stand-alone program that endlessly adds stuff and tests that every commit returns the correct number of docs). 8,000 cycles later there aren't any problems reported. Siiigh. On Tue, May 26, 2015 at 1:51 AM,

Re: Index optimize runs in background.

2015-05-26 Thread Shawn Heisey
On 5/26/2015 6:29 AM, Upayavira wrote: Are you saying that the reason you are optimising is because you have been doing it for years? If this is the only reason, you should stop doing it immediately. The one scenario in which optimisation still makes some sense is when you reindex every

Re: Index optimize runs in background.

2015-05-26 Thread Alessandro Benedetti
I completely agree with Upayavira and Shawn. Modassar, can you explain to us how often you index? Have you ever played with the merge factor? I hardly think you need to optimise at all. Simply tuning the merge factor should solve all your issues. I assume you were optimising only to have

Re: Index optimize runs in background.

2015-05-25 Thread Modassar Ather
Thanks everybody for your replies. I have noticed the optimization running in the background every time I indexed. This is a 5-node cluster with solr-5.1.0 and uses the CloudSolrClient. Kindly share your findings on this issue. Our index has almost 100M documents running on SolrCloud. We have been

Re: Index optimize runs in background.

2015-05-22 Thread Upayavira
On Fri, May 22, 2015, at 03:55 PM, Shawn Heisey wrote: On 5/21/2015 6:21 AM, Modassar Ather wrote: I am using Solr-5.1.0. I have an indexer class which invokes cloudSolrClient.optimize(true, true, 1). My indexer exits after the invocation of optimize and the optimization keeps on running

Re: Index optimize runs in background.

2015-05-22 Thread Shawn Heisey
On 5/21/2015 6:21 AM, Modassar Ather wrote: I am using Solr-5.1.0. I have an indexer class which invokes cloudSolrClient.optimize(true, true, 1). My indexer exits after the invocation of optimize and the optimization keeps on running in the background. Kindly let me know if it is per design

Re: Index optimize runs in background.

2015-05-22 Thread Erick Erickson
Actually, I've recently seen very similar behavior in Solr 4.10.3, but involving hard commits with openSearcher=true, see: https://issues.apache.org/jira/browse/SOLR-7572. Of course I can't reproduce this at will, sii. A unit test should be very simple to write though, maybe I can get to it

Index optimize runs in background.

2015-05-21 Thread Modassar Ather
Hi, I am using Solr-5.1.0. I have an indexer class which invokes cloudSolrClient.optimize(true, true, 1). My indexer exits after the invocation of optimize and the optimization keeps on running in the background. Kindly let me know if it is per design and how I can make my indexer wait until

Re: Index optimize runs in background.

2015-05-21 Thread Modassar Ather
Hi, Any insight on the question would be really helpful. Thanks, Modassar On Thu, May 21, 2015 at 5:51 PM, Modassar Ather modather1...@gmail.com wrote: Hi, I am using Solr-5.1.0. I have an indexer class which invokes cloudSolrClient.optimize(true, true, 1). My indexer exits after the