Thanks, Shawn. Very useful information.

Please find the log details below:

2018-06-20 17:19:06.661 ERROR (updateExecutor-2-thread-8226-processing-crm_v2_01_shard3_replica1 x:crm_v2_01_shard3_replica2 r:core_node4 n:masked:8983_solr s:shard3 c:crm_v2_01) [c:crm_v2_01 s:shard3 r:core_node4 x:crm_v2_01_shard3_replica2] o.a.s.u.StreamingSolrClients error
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at crm_v2_01_shard3_replica1: Bad Request

request: crm_v2_01_shard3_replica1/update?update.chain=add-unknown-fields-to-the-schema&update.distrib=FROMLEADER&distrib.from=http%3A%2F%2Fmasked%3A8983%2Fsolr%2Fcrm_v2_01_shard3_replica2%2F&wt=javabin&version=2

Remote error message: missing _version_ on update from leader
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:345)
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:184)
        at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.dt_access$292(ExecutorUtil.java)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

2018-06-20 17:19:06.662 WARN  (qtp1002191352-169102) [c:crm_v2_01 s:shard3 r:core_node4 x:crm_v2_01_shard3_replica2] o.a.s.u.p.DistributedUpdateProcessor Error sending update to http://masked:8983/solr
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://masked:8983/solr/crm_v2_01_shard3_replica3: Bad Request

request: http://masked:8983/solr/crm_v2_01_shard3_replica3/update?update.chain=add-unknown-fields-to-the-schema&update.distrib=FROMLEADER&distrib.from=http%3A%2F%2Fmasked%3A8983%2Fsolr%2Fcrm_v2_01_shard3_replica2%2F&wt=javabin&version=2

Remote error message: missing _version_ on update from leader
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:345)
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:184)
        at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.dt_access$292(ExecutorUtil.java)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

2018-06-20 17:19:06.662 ERROR (qtp1002191352-169102) [c:crm_v2_01 s:shard3 r:core_node4 x:crm_v2_01_shard3_replica2] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica http://masked:8983/solr/crm_v2_01_shard3_replica3/
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://masked:8983/solr/crm_v2_01_shard3_replica3: Bad Request

request: http://masked:8983/solr/crm_v2_01_shard3_replica3/update?update.chain=add-unknown-fields-to-the-schema&update.distrib=FROMLEADER&distrib.from=http%3A%2F%2Fmasked%3A8983%2Fsolr%2Fcrm_v2_01_shard3_replica2%2F&wt=javabin&version=2

Remote error message: missing _version_ on update from leader
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:345)
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:184)
        at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.dt_access$292(ExecutorUtil.java)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

2018-06-20 17:19:06.662 INFO  (qtp1002191352-169102) [c:crm_v2_01 s:shard3 r:core_node4 x:crm_v2_01_shard3_replica2] o.a.s.c.ZkController Put replica core=crm_v2_01_shard3_replica3 coreNodeName=core_node12 on masked:8983_solr into leader-initiated recovery.

2018-06-20 17:19:06.662 WARN  (qtp1002191352-169102) [c:crm_v2_01 s:shard3 r:core_node4 x:crm_v2_01_shard3_replica2] o.a.s.u.p.DistributedUpdateProcessor Error sending update to http://masked:8983/solr
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at crm_v2_01_shard3_replica1: Bad Request

request: crm_v2_01_shard3_replica1/update?update.chain=add-unknown-fields-to-the-schema&update.distrib=FROMLEADER&distrib.from=http%3A%2F%2Fmasked%3A8983%2Fsolr%2Fcrm_v2_01_shard3_replica2%2F&wt=javabin&version=2

Remote error message: missing _version_ on update from leader
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:345)
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:184)
        at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.dt_access$292(ExecutorUtil.java)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

2018-06-20 17:19:06.663 ERROR (qtp1002191352-169102) [c:crm_v2_01 s:shard3 r:core_node4 x:crm_v2_01_shard3_replica2] o.a.s.u.p.DistributedUpdateProcessor Setting up to try to start recovery on replica crm_v2_01_shard3_replica1/
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at crm_v2_01_shard3_replica1: Bad Request

request: crm_v2_01_shard3_replica1/update?update.chain=add-unknown-fields-to-the-schema&update.distrib=FROMLEADER&distrib.from=http%3A%2F%2Fmasked%3A8983%2Fsolr%2Fcrm_v2_01_shard3_replica2%2F&wt=javabin&version=2

Remote error message: missing _version_ on update from leader
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:345)
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:184)
        at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.dt_access$292(ExecutorUtil.java)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

2018-06-20 17:19:06.663 INFO  (qtp1002191352-169102) [c:crm_v2_01 s:shard3 r:core_node4 x:crm_v2_01_shard3_replica2] o.a.s.c.ZkController Put replica core=crm_v2_01_shard3_replica1 coreNodeName=core_node13 on masked:8983_solr into leader-initiated recovery.

2018-06-20 17:19:06.663 INFO  (qtp1002191352-169102) [c:crm_v2_01 s:shard3 r:core_node4 x:crm_v2_01_shard3_replica2] o.a.s.u.p.LogUpdateProcessorFactory [crm_v2_01_shard3_replica2]  webapp=/solr path=/update params={update.distrib=TOLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://masked:8983/solr/crm_v2_01_shard3_replica3/&wt=javabin&version=2}{delete=[note-20151333-8M821761N (-1603827973916459008)]} 0 4

2018-06-20 17:19:06.668 INFO  (updateExecutor-2-thread-8226-processing-x:crm_v2_01_shard3_replica2 r:core_node4 crm_v2_01_shard3_replica3// n:masked:8983_solr s:shard3 c:crm_v2_01) [c:crm_v2_01 s:shard3 r:core_node4 x:crm_v2_01_shard3_replica2] o.a.s.c.LeaderInitiatedRecoveryThread Put replica core=crm_v2_01_shard3_replica3 coreNodeName=core_node12 on masked:8983_solr into leader-initiated recovery.

2018-06-20 17:19:06.668 WARN  (updateExecutor-2-thread-8226-processing-x:crm_v2_01_shard3_replica2 r:core_node4 crm_v2_01_shard3_replica3// n:masked:8983_solr s:shard3 c:crm_v2_01) [c:crm_v2_01 s:shard3 r:core_node4 x:crm_v2_01_shard3_replica2] o.a.s.c.LeaderInitiatedRecoveryThread Leader is publishing core=crm_v2_01_shard3_replica3 coreNodeName =core_node12 state=down on behalf of un-reachable replica http://masked:8983/solr/crm_v2_01_shard3_replica3/
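
In case it helps when going through the full log, here is a rough Python sketch for summarising entries like the ones above: it counts the "missing _version_" remote errors and the replicas that were put into leader-initiated recovery. The function name summarize_lir and the regular expressions are illustrative assumptions based only on the messages quoted in this mail; this is not part of Solr or any Solr tooling.

```python
import re
from collections import Counter

# Pattern for the "Put replica ... into leader-initiated recovery" message
# as it appears in the excerpt above (assumed, not taken from Solr source).
LIR_RE = re.compile(
    r"Put\s+replica\s+core=(\S+)\s+coreNodeName=(\S+)\s+on\s+(\S+)"
    r"\s+into\s+leader-initiated\s+recovery"
)
# Pattern for the remote error reported by the replicas.
VERSION_RE = re.compile(
    r"Remote error message: missing _version_ on update from leader"
)


def summarize_lir(log_text):
    """Return (missing_version_count, Counter of LIR messages per core)."""
    # Collapse all whitespace runs to single spaces so that messages
    # wrapped across several lines (as in a mail client) still match.
    flat = " ".join(log_text.split())
    missing_version = len(VERSION_RE.findall(flat))
    lir_by_core = Counter(m.group(1) for m in LIR_RE.finditer(flat))
    return missing_version, lir_by_core
```

It could be run against a saved copy of the log, for example summarize_lir(open("solr.log").read()), where the file name is a placeholder. Note that the same recovery event can be logged by more than one class, so the per-core numbers count log messages rather than distinct recoveries.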


Thanks,
Sujatha

On Wed, Jun 20, 2018 at 11:18 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 6/15/2018 3:14 PM, sujatha sankaran wrote:
>
>> We were initially having an issue with DBQ and heavy batch updates, which
>> used to result in many missing updates.
>>
>> After reading many mails on the mailing list that mention that DBQ and
>> batch updates do not work well together, we switched to DBI. But we are
>> seeing an issue as mentioned in this Jira issue:
>> https://issues.apache.org/jira/browse/SOLR-7384
>>
>
> If you're using the implicit router on your multi-shard collection,
> deleting by ID may not work for you.  There are a number of issues in Jira
> discussing various aspects of the problem.  On a collection using the
> compositeId router, I would expect those deletes to work well.
>
>> Specifically, we are seeing the following pattern:
>>
>> - There are several ERRORs and WARNs of the "missing _version_" type.
>> - There is typically a single ERROR message.
>> - Several WARNs follow it, and after a couple of WARNs there is a message
>>   that leader-initiated recovery has been kicked off.
>>
>
> Can you share these log entries?  The message on some of them is probably
> a dozen or more lines long, and may have multiple "Caused by" clauses that
> will also need to be included.  Seeing the whole log could be useful.
>
>> *Setup info*:
>>
>> - SolrCloud 6.6.2
>> - 5 nodes, 5 shards, 3 replicas
>> - ~35 million docs in the collection
>> - Nodes have 90GB RAM, 32GB to the JVM
>> - Soft commit interval 2 seconds, hard commit (open searcher false) 15
>>   seconds
>>
>
> Side notes:
>
> Solr would actually have more heap memory available if you set the heap to
> 31GB instead of 32GB.
>
> https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/
>
> A 2 second soft commit interval is extremely aggressive.  If your soft
> commits are happening really quickly (far less than 1 second) then this
> might not be a problem, but with an index as large as yours, it is very
> likely that soft commits are taking much longer than 2 seconds.
>
> Thanks,
> Shawn
>
>
