[
https://issues.apache.org/jira/browse/SOLR-17478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18063687#comment-18063687
]
David Smiley commented on SOLR-17478:
-------------------------------------
It seems any SolrCloud test / benchmark now logs this error:
{noformat}
1100 ERROR (SUITE-HttpSolrCallCloudTest-seed#[19E1C07A6CABB617]-worker) []
o.a.s.c.ZkTestServer ZkTestServer requires the 'stat' command, temporarily
manipulating your whitelist
{noformat}
I'm not sure whether this was already the case in 9.8 or whether something
changed since. This doesn't seem error-worthy. I'm not even sure it's
warn-worthy TBH.
> ZkTestServer bugs can cause 30sec test pauses
> ---------------------------------------------
>
> Key: SOLR-17478
> URL: https://issues.apache.org/jira/browse/SOLR-17478
> Project: Solr
> Issue Type: Test
> Reporter: Chris M. Hostetter
> Assignee: Chris M. Hostetter
> Priority: Major
> Fix For: 9.8, 10.0
>
> Attachments: SOLR-17478.patch
>
>
> Pop Quiz: which of these two (pseudo-code) ZK based test classes will be
> faster...
> {code:java}
> public class Test_X extends SolrTestCase {
>   public void test() throws Exception {
>     ZkTestServer zkServer = new ZkTestServer(createTempDir());
>     zkServer.run();
>     zkServer.shutdown();
>   }
> }
> public class Test_Y extends SolrTestCaseJ4 {
>   public void test() throws Exception {
>     ZkTestServer zkServer = new ZkTestServer(createTempDir());
>     zkServer.run();
>     zkServer.shutdown();
>   }
> }
> {code}
> ...if you guessed "Test_X, because SolrTestCase has less overhead than
> SolrTestCaseJ4" then you are *wrong by ~30 seconds.*
> Actually, that's not _always_ true: if you run both tests in the same JVM
> then they will both take the same amount of time:
> * If Test_Y runs first, they will both take ~1 second each
> * If Test_X runs first, they will both take 30+ seconds *_each_*
> The reason for the dependency comes down to:
> * The {{"zookeeper.4lw.commands.whitelist"}} sysprop is set/cleared in
> SolrTestCaseJ4 (BeforeClass/AfterClass), but _NOT_ in SolrTestCase
> * ZkTestServer depends on the ZK helper method
> {{ClientBase.waitForServerUp()}}, which depends on the 4LW {{"stat"}}
> command being whitelisted
> ** If "stat" is not whitelisted, then {{waitForServerUp()}} times out after
> 30 sec and returns a failure
> ** Bonus problem: ZkTestServer never checks the return value of
> {{waitForServerUp()}} !
> * ZooKeeper only checks for the 4LW whitelist sysprop the first time it's
> needed by the {{FourLetterCommands}} class _and then caches it statically_
> ** So if the first testclass to run a ZK server doesn't extend
> SolrTestCaseJ4, then _every_ test that uses ZkTestServer in that JVM will
> stall for 30 seconds
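
The order dependence described in the bullets above can be reproduced in isolation. The following is only a minimal sketch of the "read a sysprop once, then cache it statically" pattern, not ZooKeeper's actual {{FourLetterCommands}} code: the class name, the property name {{sketch.4lw.commands.whitelist}}, and the {{*}} wildcard handling are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a class that caches a sysprop on first use. */
class CachedWhitelistSketch {
    // Hypothetical property name, used only by this sketch.
    static final String PROP = "sketch.4lw.commands.whitelist";

    // Populated by the first lookup and never refreshed afterwards.
    private static Set<String> cached;

    static synchronized boolean isEnabled(String command) {
        if (cached == null) {
            // The sysprop is consulted only here, on the very first call.
            String raw = System.getProperty(PROP, "");
            cached = new HashSet<>();
            for (String entry : raw.split(",")) {
                String trimmed = entry.trim();
                if (!trimmed.isEmpty()) {
                    cached.add(trimmed);
                }
            }
        }
        // Assumption of this sketch: "*" enables every command.
        return cached.contains("*") || cached.contains(command);
    }

    public static void main(String[] args) {
        // Analogue of Test_X running first: the sysprop is unset at first lookup.
        System.clearProperty(PROP);
        System.out.println(isEnabled("stat"));

        // Analogue of Test_Y running later: it sets the sysprop, but the
        // cached (empty) whitelist is what gets consulted.
        System.setProperty(PROP, "*");
        System.out.println(isEnabled("stat"));
    }
}
```

Both lookups report the command as disabled, because the second one never re-reads the property. If the property were set before the first call instead, both lookups would succeed, which mirrors the "whichever test class runs first decides for the whole JVM" behaviour.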