[ 
https://issues.apache.org/jira/browse/SOLR-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-14058:
--------------------------------
           Attachment: SOLR-14058.patch
    Affects Version/s: master (9.0)
             Assignee: Yonik Seeley
               Status: Open  (was: Open)

We don't have a test that tickles this bug, but after reviewing the code, the 
fix (attached) is relatively straightforward.  

otherUpdatesIndex is initialized to a value less than otherVersions.size() and 
is only ever decremented (otherVersions itself is not modified), so it can 
never reach the upper bound. Only the lower bound needs guarding, hence the 
correct check is otherUpdatesIndex >= 0.
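
To illustrate the reasoning, here is a minimal, self-contained sketch. It is 
not the PeerSync code and not the attached patch: the class, the method names, 
and the minVersion parameter are hypothetical, and the "buggy" loop condition 
is only an assumption about what an upper-bound check on a descending index 
looks like. Only the names otherUpdatesIndex and otherVersions come from the 
comment above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is NOT the actual PeerSync code.
class DescendingIndexSketch {

    // Assumed buggy shape: the index only decreases, so the upper-bound
    // check is always true and the loop eventually reads index -1.
    static List<Long> collectBuggy(List<Long> otherVersions, long minVersion) {
        List<Long> out = new ArrayList<>();
        int otherUpdatesIndex = otherVersions.size() - 1;
        while (otherUpdatesIndex < otherVersions.size()) {
            long v = otherVersions.get(otherUpdatesIndex); // throws once the index is -1
            if (v < minVersion) {
                break;
            }
            out.add(v);
            otherUpdatesIndex--;
        }
        return out;
    }

    // Fixed shape: guard the lower bound, the only one a decreasing index can cross.
    static List<Long> collectFixed(List<Long> otherVersions, long minVersion) {
        List<Long> out = new ArrayList<>();
        int otherUpdatesIndex = otherVersions.size() - 1;
        while (otherUpdatesIndex >= 0) {
            long v = otherVersions.get(otherUpdatesIndex);
            if (v < minVersion) {
                break;
            }
            out.add(v);
            otherUpdatesIndex--;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> versions = Arrays.asList(10L, 20L, 30L);
        // No version is below minVersion, so the walk runs off the front of the list.
        System.out.println(collectFixed(versions, 0L));
        try {
            collectBuggy(versions, 0L);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("buggy variant threw: " + e.getMessage());
        }
    }
}
```

In this sketch the buggy variant fails only when the loop is not ended early 
by the break, that is, when every version in the list is at or above 
minVersion. A failure that depends on the data in this way would be 
consistent with the bug not showing up in existing tests.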


> AIOOBE in PeerSync
> ------------------
>
>                 Key: SOLR-14058
>                 URL: https://issues.apache.org/jira/browse/SOLR-14058
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 8.3, master (9.0)
>            Reporter: Yonik Seeley
>            Assignee: Yonik Seeley
>            Priority: Major
>         Attachments: SOLR-14058.patch
>
>
> We hit an exception with 8.3 that someone else also hit on stackoverflow:
> https://stackoverflow.com/questions/58891563/problem-in-syncing-replicas-with-solr-8-3-with-zookeeper-3-5-6
> {quote}
> I recently converted a solr 7.x + zookeeper 3.4.14 to solr 8.3 + zk 3.5.6, 
> and depending on how I start the solr nodes I'm getting a sync exception.
> My setup uses 3 zk nodes and 2 solr nodes (let's call it A and B). The 
> collection that has this problem has 1 shard and 2 replicas. I've noticed 2 
> situations: (1) which works fine and (2) which does not work.
> 1) This works: I start solr node A, and wait until its replica is elected 
> leader ("green" in the Solr interface 'Cloud'->'Graph') - which takes about 2 
> min; and only then start solr node B. Both replicas are active and the one in 
> A is the leader.
> 2) This does NOT work: I start solr node A, and a few secs after I start solr 
> node B (that is, before the 'A' replica is elected leader - still "Down" in 
> the solr interface). In this case I get the following exception:
> ERROR (coreZkRegister-1-thread-2-processing-n:192.168.15.20:8986_solr 
> x:alldata_shard1_replica_n1 c:alldata s:shard1 r:core_node3) [c:alldata 
> s:shard1 r:core_node3 x:alldata_shard1_replica_n1] o.a.s.c.SyncStrategy Sync 
> Failed:java.lang.IndexOutOfBoundsException: Index -1 out of bounds for length 
> 99
> It seems that if both solr nodes are started soon after each other, then ZK 
> cannot elect one as leader. This error only appears in the solr.log of node 
> A, even if I invert the order of starting nodes.
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

