Why would you care when the forced merge (not an “optimize”) is done? Start it
and get back to work.
Or even better, never force merge and let the algorithm take care of it.
Seriously, I’ve been giving this advice since before Lucene was written,
because Ultraseek had the same approach for
Until somewhere around Lucene 3.5, you needed to optimise, because the
merge strategy used wasn't that clever and left lots of deletes in your
largest segment. Around that point, the TieredMergePolicy became the
default. Because its algorithm is much more sophisticated, it took away
the need to
If I knew, I would fix it ;). The sub-optimizes (i.e. the ones
sent out to each replica) should be sent in parallel and then
each thread should wait for completion from the replicas. There
is no real check for optimize, I believe that the return from the
call is considered sufficient. If we can
Hi,
There are 5 cores and a separate server for indexing on this SolrCloud. Can
you please share your suggestions on the following:
How can the indexer know that the optimize has completed, even if the
commit/optimize runs in the background, without going to the Solr servers,
perhaps by using SolrJ or another API?
I
Can't get any failures to happen on my end so I really haven't a clue.
Best,
Erick
On Thu, Jun 4, 2015 at 3:17 AM, Modassar Ather modather1...@gmail.com wrote:
Hi,
Please provide your input on optimize and commit running in the background.
Your suggestions will be really helpful.
Thanks,
Modassar
On Tue, Jun 2, 2015 at 6:05 PM, Modassar Ather modather1...@gmail.com
wrote:
Erick! I could not find any underlying setting of 10 minutes.
It is not only optimize; commit is also behaving in the same fashion and
is taking less time than it usually did.
As per my observation, both are running in the background.
On Fri, May 29, 2015 at 7:21 PM, Erick Erickson
I have not added any timeout in the indexer except the ZooKeeper client
timeout, which is 30 seconds. I am simply calling client.close() at the end
of indexing.
The same code was not running in background for optimize with solr-4.10.3
and org.apache.solr.client.solrj.impl.CloudSolrServer.
On Fri, May 29,
I'm not talking about you setting a timeout, but the underlying
connection timing out...
The "10 minutes, then the indexer exits" comment points in that direction.
Best,
Erick
On Thu, May 28, 2015 at 11:43 PM, Modassar Ather modather1...@gmail.com wrote:
The indexer takes almost 2 hours to optimize. It has a multi-threaded add
of batches of documents to
org.apache.solr.client.solrj.impl.CloudSolrClient.
Once all the documents are indexed it invokes commit and optimize. I have
seen that the optimize goes into background after 10 minutes and indexer
Are you timing out on the client request? The theory here is that it's
still a synchronous call, but you're just timing out at the client
level. At that point, the optimize is still running; it's just that the
connection has been dropped.
Shot in the dark.
Erick
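To make the theory above concrete, here is a minimal sketch in plain Java. It does not use SolrJ or talk to Solr at all: the "server" is just a background thread that sleeps, and the class name ClientTimeoutSketch, the Outcome type, and the run method are all hypothetical names made up for illustration. It only shows the general mechanism being suggested, namely that a caller can stop waiting on a blocking request while the work it started keeps running to completion.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class ClientTimeoutSketch {

    /** What happened to one simulated request (hypothetical type). */
    public static final class Outcome {
        public final boolean clientTimedOut;
        public final boolean serverFinished;

        Outcome(boolean clientTimedOut, boolean serverFinished) {
            this.clientTimedOut = clientTimedOut;
            this.serverFinished = serverFinished;
        }
    }

    /**
     * Simulates a long-running server-side job and a client that waits
     * for it with a timeout. The job is NOT cancelled when the client
     * gives up, which is the situation described in the message above.
     */
    public static Outcome run(long serverWorkMillis, long clientTimeoutMillis)
            throws Exception {
        ExecutorService server = Executors.newSingleThreadExecutor();
        AtomicBoolean finished = new AtomicBoolean(false);

        // Stand-in for the long-running work on the server side.
        Future<?> request = server.submit(() -> {
            try {
                Thread.sleep(serverWorkMillis);
                finished.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        boolean timedOut = false;
        try {
            // The client blocks here, but only up to its timeout.
            request.get(clientTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The client stops waiting; the job itself keeps running.
            timedOut = true;
        }

        // shutdown() lets the already-submitted job run to completion.
        server.shutdown();
        server.awaitTermination(serverWorkMillis + 1000, TimeUnit.MILLISECONDS);
        return new Outcome(timedOut, finished.get());
    }

    public static void main(String[] args) throws Exception {
        Outcome o = run(300, 50);
        System.out.println("client timed out: " + o.clientTimedOut);
        System.out.println("server finished anyway: " + o.serverFinished);
    }
}
```

If something like this is what is happening, the indexer would see the call return (or fail) early while the work carries on in the background. Whether that is really the cause here is still only a guess.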
On Thu, May 28, 2015 at 10:31 PM,
I did not notice that, but the commit, which used to take around 2 minutes,
is now taking around 8 seconds. I think this is also running in the
background.
On Fri, May 29, 2015 at 10:52 AM, Modassar Ather modather1...@gmail.com
wrote:
The indexer takes almost 2 hours to
In this case, optimising makes sense: once the index is generated, you
are not updating it.
Upayavira
On Wed, May 27, 2015, at 06:14 AM, Modassar Ather wrote:
Our index has almost 100M documents running on SolrCloud of 5 shards and
each shard has an index size of about 170+GB (for the record,
All strange of course. What do your Solr logs show when this happens?
And how reproducible is this?
Best,
Erick
On Wed, May 27, 2015 at 4:00 AM, Upayavira u...@odoko.co.uk wrote:
In this case, optimising makes sense: once the index is generated, you
are not updating it.
Upayavira
On Wed,
Our index has almost 100M documents running on SolrCloud of 5 shards and
each shard has an index size of about 170+GB (for the record, we are not
using stored fields - our documents are pretty large). We perform a full
indexing every weekend and during the week there are no updates made to the
Hi,
Erick, you mentioned a unit test to test the optimize running in the
background. Kindly share your findings, if any.
Thanks,
Modassar
On Mon, May 25, 2015 at 11:47 AM, Modassar Ather modather1...@gmail.com
wrote:
Thanks everybody for your replies.
I have noticed the optimization running
Modassar,
Are you saying that the reason you are optimising is because you have
been doing it for years? If this is the only reason, you should stop
doing it immediately.
The one scenario in which optimisation still makes some sense is when
you reindex every night and optimise straight after.
No results yet. I finished the test harness last night (not really a
unit test, a stand-alone program that endlessly adds stuff and tests
that every commit returns the correct number of docs).
8,000 cycles later there aren't any problems reported.
Siiigh.
On Tue, May 26, 2015 at 1:51 AM,
On 5/26/2015 6:29 AM, Upayavira wrote:
Are you saying that the reason you are optimising is because you have
been doing it for years? If this is the only reason, you should stop
doing it immediately.
The one scenario in which optimisation still makes some sense is when
you reindex every
I completely agree with Upayavira and Shawn.
Modassar, can you explain to us how often you index?
Have you ever played with the mergeFactor?
I hardly think you need to optimise at all.
Simply tuning the mergeFactor should solve all your issues.
I assume you were optimising only to have
Thanks everybody for your replies.
I have noticed the optimization running in background every time I indexed.
This is 5 node cluster with solr-5.1.0 and uses the CloudSolrClient. Kindly
share your findings on this issue.
Our index has almost 100M documents running on SolrCloud. We have been
On Fri, May 22, 2015, at 03:55 PM, Shawn Heisey wrote:
On 5/21/2015 6:21 AM, Modassar Ather wrote:
I am using Solr-5.1.0. I have an indexer class which invokes
cloudSolrClient.optimize(true, true, 1). My indexer exits after the
invocation of optimize and the optimization keeps on running in the
background.
Kindly let me know if it is per design
Actually, I've recently seen very similar behavior in Solr 4.10.3, but
involving hard commits with openSearcher=true, see:
https://issues.apache.org/jira/browse/SOLR-7572. Of course I can't
reproduce this at will, sii.
A unit test should be very simple to write though, maybe I can get to it
Hi,
I am using Solr-5.1.0. I have an indexer class which invokes
cloudSolrClient.optimize(true, true, 1). My indexer exits after the
invocation of optimize and the optimization keeps on running in the
background.
Kindly let me know if it is per design and how can I make my indexer to
wait until
Hi
An insight on the question will be really helpful.
Thanks,
Modassar
On Thu, May 21, 2015 at 5:51 PM, Modassar Ather modather1...@gmail.com
wrote:
Hi,
I am using Solr-5.1.0. I have an indexer class which invokes
cloudSolrClient.optimize(true, true, 1). My indexer exits after the