[ 
https://issues.apache.org/jira/browse/SOLR-11278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151552#comment-16151552
 ] 

Erick Erickson commented on SOLR-11278:
---------------------------------------

I just ran the patch against 7x in two different modes:

1> the original patch x 100
2> removed the three-second wait in the test case x 250

<1> had no errors.
<2> had one error. It's not the same one and I haven't pursued it yet; excerpt 
below. I have the full test case. I'm going to put the three-second wait back in 
and try 1,000 iterations to see if this error occurs again.

NOTE: I don't think the sleep is something we _want_ to leave in the code; 
I'm just seeing if it alters the results, for a clue about where to look next.
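
For illustration only, here is a minimal sketch of what a fixed sleep is often 
replaced with: a bounded polling wait that returns as soon as a condition 
holds. This assumes the test can express "bootstrap finished" as a boolean 
check. PollingWait and waitFor are hypothetical names; they are not part of 
Solr or of the attached patch.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not part of Solr or the patch. */
final class PollingWait {
  private PollingWait() {}

  /**
   * Polls the condition until it returns true or the timeout elapses.
   * Returns true if the condition held, false on timeout or interruption.
   */
  static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
    final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      // Compare via subtraction so nanoTime overflow cannot break the check.
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt for the caller and stop waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }
}
```

In a test, the condition could be something like "the target's document count 
equals the expected count", with the timeout acting only as an upper bound 
rather than a delay every run pays for.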

This is good progress! Oh, I haven't reviewed the patch in detail yet either; 
I'm just trying to get a sense of what the behavior is before diving in.

[junit4]   2> 49759 INFO  (Thread-83) [n:127.0.0.1:59962_solr c:cdcr-target 
s:shard1 r:core_node2 x:cdcr-target_shard1_replica_n1] o.a.s.c.SolrCore 
[cdcr-target_shard1_replica_n1]  CLOSING SolrCore 
org.apache.solr.core.SolrCore@2fdc6dad
   [junit4]   2> 49759 INFO  (Thread-83) [n:127.0.0.1:59962_solr c:cdcr-target 
s:shard1 r:core_node2 x:cdcr-target_shard1_replica_n1] 
o.a.s.m.SolrMetricManager Closing metric reporters for 
registry=solr.core.cdcr-target.shard1.replica_n1, tag=802975149
   [junit4]   2> 49759 INFO  (Thread-83) [n:127.0.0.1:59962_solr c:cdcr-target 
s:shard1 r:core_node2 x:cdcr-target_shard1_replica_n1] 
o.a.s.m.r.SolrJmxReporter Closing reporter 
[org.apache.solr.metrics.reporters.SolrJmxReporter@7b017431: rootName = 
solr_59962, domain = solr.core.cdcr-target.shard1.replica_n1, service url = 
null, agent id = null] for registry solr.core.cdcr-target.shard1.replica_n1 / 
com.codahale.metrics.MetricRegistry@67a56b63
   [junit4]   2> 49760 INFO  
(searcherExecutor-150-thread-1-processing-n:127.0.0.1:59962_solr 
x:cdcr-target_shard1_replica_n1 s:shard1 c:cdcr-target r:core_node2) 
[n:127.0.0.1:59962_solr c:cdcr-target s:shard1 r:core_node2 
x:cdcr-target_shard1_replica_n1] o.a.s.c.SolrCore 
[cdcr-target_shard1_replica_n1] Registered new searcher 
Searcher@51a42485[cdcr-target_shard1_replica_n1] 
main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_k(7.1.0):C1900)
 Uninverting(_l(7.1.0):C100)))}
   [junit4]   2> 49774 INFO  (Thread-83) [n:127.0.0.1:59962_solr c:cdcr-target 
s:shard1 r:core_node2 x:cdcr-target_shard1_replica_n1] 
o.a.s.m.SolrMetricManager Closing metric reporters for 
registry=solr.collection.cdcr-target.shard1.leader, tag=802975149
   [junit4]   2> 49775 INFO  (Thread-83) [n:127.0.0.1:59962_solr c:cdcr-target 
s:shard1 r:core_node2 x:cdcr-target_shard1_replica_n1] 
o.a.s.h.CdcrRequestHandler Solr core is being closed - shutting down CDCR 
handler @ cdcr-target:shard1
   [junit4]   2> 62525 ERROR (Thread-83) [n:127.0.0.1:59962_solr c:cdcr-target 
s:shard1 r:core_node2 x:cdcr-target_shard1_replica_n1] 
o.a.s.c.CachingDirectoryFactory Timeout waiting for all directory ref counts to 
be released - gave up waiting on 
CachedDir<<refCount=1;path=/Users/Erick/apache/solrJiras/beast/results/beast-tmp/203/J0/temp/solr.cloud.CdcrBootstrapTest_DCBEC103DFB44964-001/cdcr-target-003/node1/./cdcr-target_shard1_replica_n1/data/index.20170902102657976;done=false>>
   [junit4]   2> 62526 ERROR (Thread-83) [n:127.0.0.1:59962_solr c:cdcr-target 
s:shard1 r:core_node2 x:cdcr-target_shard1_replica_n1] 
o.a.s.c.CachingDirectoryFactory Error closing 
directory:org.apache.solr.common.SolrException: Timeout waiting for all 
directory ref counts to be released - gave up waiting on 
CachedDir<<refCount=1;path=/Users/Erick/apache/solrJiras/beast/results/beast-tmp/203/J0/temp/solr.cloud.CdcrBootstrapTest_DCBEC103DFB44964-001/cdcr-target-003/node1/./cdcr-target_shard1_replica_n1/data/index.20170902102657976;done=false>>
   [junit4]   2>        at 
org.apache.solr.core.CachingDirectoryFactory.close(CachingDirectoryFactory.java:178)
   [junit4]   2>        at 
org.apache.solr.core.SolrCore.close(SolrCore.java:1613)
   [junit4]   2>        at 
org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:859)
   [junit4]   2>        at 
org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1232)
   [junit4]   2>        at 
org.apache.solr.handler.IndexFetcher.lambda$reloadCore$0(IndexFetcher.java:900)
   [junit4]   2>        at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> 
   [junit4]   2> 75243 ERROR (Thread-83) [n:127.0.0.1:59962_solr c:cdcr-target 
s:shard1 r:core_node2 x:cdcr-target_shard1_replica_n1] 
o.a.s.c.CachingDirectoryFactory Timeout waiting for all directory ref counts to 
be released - gave up waiting on 
CachedDir<<refCount=1;path=/Users/Erick/apache/solrJiras/beast/results/beast-tmp/203/J0/temp/solr.cloud.CdcrBootstrapTest_DCBEC103DFB44964-001/cdcr-target-003/node1/./cdcr-target_shard1_replica_n1/data/index;done=false>>
   [junit4]   2> 75243 ERROR (Thread-83) [n:127.0.0.1:59962_solr c:cdcr-target 
s:shard1 r:core_node2 x:cdcr-target_shard1_replica_n1] 
o.a.s.c.CachingDirectoryFactory Error closing 
directory:org.apache.solr.common.SolrException: Timeout waiting for all 
directory ref counts to be released - gave up waiting on 
CachedDir<<refCount=1;path=/Users/Erick/apache/solrJiras/beast/results/beast-tmp/203/J0/temp/solr.cloud.CdcrBootstrapTest_DCBEC103DFB44964-001/cdcr-target-003/node1/./cdcr-target_shard1_replica_n1/data/index;done=false>>
   [junit4]   2>        at 
org.apache.solr.core.CachingDirectoryFactory.close(CachingDirectoryFactory.java:178)
   [junit4]   2>        at 
org.apache.solr.core.SolrCore.close(SolrCore.java:1613)
   [junit4]   2>        at 
org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:859)
   [junit4]   2>        at 
org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1232)
   [junit4]   2>        at 
org.apache.solr.handler.IndexFetcher.lambda$reloadCore$0(IndexFetcher.java:900)
   [junit4]   2>        at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> 
   [junit4]   2> 75244 ERROR (Thread-83) [n:127.0.0.1:59962_solr c:cdcr-target 
s:shard1 r:core_node2 x:cdcr-target_shard1_replica_n1] o.a.s.c.SolrCore 
java.lang.AssertionError: 1
   [junit4]   2>        at 
org.apache.solr.core.CachingDirectoryFactory.close(CachingDirectoryFactory.java:192)
   [junit4]   2>        at 
org.apache.solr.core.SolrCore.close(SolrCore.java:1613)
   [junit4]   2>        at 
org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:859)
   [junit4]   2>        at 
org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1232)
   [junit4]   2>        at 
org.apache.solr.handler.IndexFetcher.lambda$reloadCore$0(IndexFetcher.java:900)
   [junit4]   2>        at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> 
   [junit4]   2> 75245 ERROR 
(recoveryExecutor-81-thread-1-processing-n:127.0.0.1:59962_solr 
x:cdcr-target_shard1_replica_n1 s:shard1 c:cdcr-target r:core_node2) 
[n:127.0.0.1:59962_solr c:cdcr-target s:shard1 r:core_node2 
x:cdcr-target_shard1_replica_n1] o.a.s.h.ReplicationHandler Index fetch failed 
:org.apache.solr.common.SolrException: Index fetch failed : 
   [junit4]   2>        at 
org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:655)
   [junit4]   2>        at 
org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:332)
   [junit4]   2>        at 
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:419)
   [junit4]   2>        at 
org.apache.solr.handler.CdcrRequestHandler$BootstrapCallable.call(CdcrRequestHandler.java:773)
   [junit4]   2>        at 
org.apache.solr.handler.CdcrRequestHandler$BootstrapCallable.call(CdcrRequestHandler.java:724)
   [junit4]   2>        at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
   [junit4]   2>        at 
java.util.concurrent.FutureTask.run(FutureTask.java:266)
   [junit4]   2>        at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
   [junit4]   2>        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
   [junit4]   2>        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
   [junit4]   2>        at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> Caused by: java.lang.NullPointerException
   [junit4]   2>        at 
org.apache.solr.handler.IndexFetcher.openNewSearcherAndUpdateCommitPoint(IndexFetcher.java:888)
   [junit4]   2>        at 
org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:632)
   [junit4]   2>        ... 10 more
   [junit4]   2> 
   [junit4]   2> 75245 INFO  
(recoveryExecutor-81-thread-1-processing-n:127.0.0.1:59962_solr 
x:cdcr-target_shard1_replica_n1 s:shard1 c:cdcr-target r:core_node2) 
[n:127.0.0.1:59962_solr c:cdcr-target s:shard1 r:core_node2 
x:cdcr-target_shard1_replica_n1] o.a.s.h.CdcrRequestHandler boostrap cal



> CdcrBootstrapTest failing intermittently
> ----------------------------------------
>
>                 Key: SOLR-11278
>                 URL: https://issues.apache.org/jira/browse/SOLR-11278
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: CDCR
>    Affects Versions: 7.0, 6.6.1
>            Reporter: Amrit Sarkar
>            Assignee: Varun Thacker
>            Priority: Critical
>              Labels: test
>         Attachments: master-bs.patch, 
> SOLR-11278-cancel-bootstrap-on-stop.patch, SOLR-11278.patch, test_results
>
>
> {{CdcrBootstrapTest}} fails when beasted for a significant number of 
> iterations.
> The bootstrapping fails in the test after the first batch is indexed 
> for each {{testmethod}}, which results in a document mismatch:
> {code}
>   [beaster]   2> 39167 ERROR 
> (updateExecutor-39-thread-1-processing-n:127.0.0.1:42155_solr 
> x:cdcr-target_shard1_replica_n1 s:shard1 c:cdcr-target r:core_node2) 
> [n:127.0.0.1:42155_solr c:cdcr-target s:shard1 r:core_node2 
> x:cdcr-target_shard1_replica_n1] o.a.s.h.CdcrRequestHandler Bootstrap 
> operation failed
>   [beaster]   2> java.util.concurrent.ExecutionException: 
> java.lang.AssertionError
>   [beaster]   2>      at 
> java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   [beaster]   2>      at 
> java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   [beaster]   2>      at 
> org.apache.solr.handler.CdcrRequestHandler.lambda$handleBootstrapAction$0(CdcrRequestHandler.java:654)
>   [beaster]   2>      at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>   [beaster]   2>      at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   [beaster]   2>      at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   [beaster]   2>      at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>   [beaster]   2>      at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   [beaster]   2>      at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   [beaster]   2>      at java.lang.Thread.run(Thread.java:748)
>   [beaster]   2> Caused by: java.lang.AssertionError
>   [beaster]   2>      at 
> org.apache.solr.handler.CdcrRequestHandler$BootstrapCallable.call(CdcrRequestHandler.java:813)
>   [beaster]   2>      at 
> org.apache.solr.handler.CdcrRequestHandler$BootstrapCallable.call(CdcrRequestHandler.java:724)
>   [beaster]   2>      at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
>   [beaster]   2>      ... 5 more
> {code}
> {code}
>   [beaster] [01:37:16.282] FAILURE  153s | 
> CdcrBootstrapTest.testBootstrapWithSourceCluster <<<
>   [beaster]    > Throwable #1: java.lang.AssertionError: Document mismatch on 
> target after sync expected:<2000> but was:<1000>
> {code}



