[ 
https://issues.apache.org/jira/browse/SOLR-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053776#comment-14053776
 ] 

Shalin Shekhar Mangar commented on SOLR-5596:
---------------------------------------------

I was looking into the logs of this failure today:
http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10616/

{code}
   [junit4]   2> 472241 T2893 oazsp.FileTxnLog.commit WARN fsync-ing the write 
ahead log in SyncThread:0 took 11588ms which will adversely effect operation 
latency. See the ZooKeeper troubleshooting guide
{code}

This warning can be caused by a slow machine, but it also shows up on fast 
machines when a test issues many ZooKeeper writes in quick succession, which 
is what testShardLeaderChange does. Perhaps we should add a small wait between 
operations?
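
A minimal sketch of what such a wait could look like, assuming the test's 
writes can be expressed as a list of Runnables. ThrottledOps, runWithPause and 
pauseMillis are hypothetical names for illustration, not existing Solr test 
code:

```java
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: not part of Solr or ZooKeeper.
final class ThrottledOps {
    private ThrottledOps() {}

    /**
     * Runs each operation in order and sleeps pauseMillis between
     * consecutive operations (no sleep after the last one).
     */
    static void runWithPause(List<Runnable> ops, long pauseMillis)
            throws InterruptedException {
        for (int i = 0; i < ops.size(); i++) {
            ops.get(i).run();
            if (i < ops.size() - 1) {
                TimeUnit.MILLISECONDS.sleep(pauseMillis);
            }
        }
    }
}
```

The pause length would have to be tuned against the fsync latency seen on the 
Jenkins machines; a fixed sleep only spreads the writes out and does not 
remove the underlying latency.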

Would it make sense to set forceSync to no for ZooKeeper in our tests? At the 
very least, it would reduce the spurious failures and let us concentrate on 
fixing real bugs.

See 
http://mail-archives.apache.org/mod_mbox/zookeeper-user/201401.mbox/%3ccabtfevwoxh1d8d+to0wylmbap_crby6l9i9wh2le7s1zkpn...@mail.gmail.com%3E
and 
http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/zookeeper_psuedo_scalability_and_absolute


> OverseerTest.testOverseerFailure - leader node already exists.
> --------------------------------------------------------------
>
>                 Key: SOLR-5596
>                 URL: https://issues.apache.org/jira/browse/SOLR-5596
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>             Fix For: 4.9, 5.0
>
>
> Seeing this a bunch on jenkins - previous leader ephemeral node is still 
> around for some reason.



