Khalid Alharbi created SOLR-9591:
------------------------------------

             Summary: Shards and replicas go down when indexing large number of files
                 Key: SOLR-9591
                 URL: https://issues.apache.org/jira/browse/SOLR-9591
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: SolrCloud
    Affects Versions: 5.5.2
            Reporter: Khalid Alharbi


Solr shards and replicas go down when indexing a large number of text files 
using the default [extracting request 
handler|https://cwiki.apache.org/confluence/x/c4DxAQ].
{code}
curl 'http://localhost:8983/solr/myCollection/update/extract?literal.id=someId' \
  -F "myfile=@/data/file1.txt"
{code}
A commit is issued after indexing 5,000 files, using:
{code}
curl 'http://localhost:8983/solr/myCollection/update?commit=true&wt=json'
{code}

This was on Solr (SolrCloud) version 5.5.2 with an external ZooKeeper cluster 
of five nodes. I also tried this on a single-node SolrCloud with the embedded 
ZooKeeper, but the collection went down as well. In both cases the error message 
is always "ERROR null DistributedUpdateProcessor ClusterState says we are the 
leader, but locally we don't think so".

I came up with a workaround that let me index over 400K files without 
replicas going down with that error message. The workaround is to index 5K 
files, restart Solr, wait for the shards and replicas to become active, then 
index the next 5K files, and repeat until all files are indexed.
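
The batching part of this workaround could be scripted roughly as follows. This is only a sketch, not the exact script I ran: the function names (`run`, `index_file`, `commit`, `after_batch`, `index_in_batches`), the `doc-N` id scheme, and the `DRY_RUN` switch are hypothetical, and `after_batch` is a placeholder for the manual restart-and-wait step. The `@` prefix tells curl to upload the file's contents.

```shell
#!/usr/bin/env bash
# Sketch of the batch / commit / restart workaround (hypothetical helper names).
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
COLLECTION="${COLLECTION:-myCollection}"
BATCH_SIZE="${BATCH_SIZE:-5000}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command when DRY_RUN=1, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Send one file ($1) to the extracting request handler under the id $2.
index_file() {
  run curl -sS "$SOLR_URL/$COLLECTION/update/extract?literal.id=$2" -F "myfile=@$1"
}

# Issue an explicit commit.
commit() {
  run curl -sS "$SOLR_URL/$COLLECTION/update?commit=true&wt=json"
}

# Placeholder: restart Solr and wait until all shards and replicas are
# active again. This step was done by hand, so it is left empty here.
after_batch() {
  :
}

# Index all files passed as arguments, committing after every full batch
# and once more at the end if a partial batch remains.
index_in_batches() {
  local count=0 file
  for file in "$@"; do
    count=$((count + 1))
    index_file "$file" "doc-$count"
    if [ $((count % BATCH_SIZE)) -eq 0 ]; then
      commit
      after_batch
    fi
  done
  if [ $((count % BATCH_SIZE)) -ne 0 ]; then
    commit
  fi
}
```

With the default `DRY_RUN=1`, calling `index_in_batches /data/*.txt` only prints the curl commands it would run; setting `DRY_RUN=0` executes them.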

If this is not enough to investigate the issue, I will be happy to provide 
more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
