[ https://issues.apache.org/jira/browse/SOLR-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki resolved SOLR-12075.
--------------------------------------
    Resolution: Fixed

Fixed as a part of SOLR-12923.

> TestLargeCluster is too flaky
> -----------------------------
>
>                 Key: SOLR-12075
>                 URL: https://issues.apache.org/jira/browse/SOLR-12075
>             Project: Solr
>          Issue Type: Bug
>          Components: AutoScaling
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>
> This test is failing a lot in Jenkins builds, with two types of failures:
> * specific test method failures - these may be caused by bugs in the
> autoscaling code, bugs in the simulator, or timing issues. It should be
> possible to narrow down the cause by using different speeds of simulated time.
> * suite-level failures due to leaked threads - most of these failures
> point to ongoing Policy calculations, e.g.:
> {code}
> com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.autoscaling.sim.TestLargeCluster:
>    1) Thread[id=21406, name=AutoscalingActionExecutor-7277-thread-1, state=RUNNABLE, group=TGRP-TestLargeCluster]
>         at java.util.ArrayList.iterator(ArrayList.java:834)
>         at org.apache.solr.common.util.Utils.getDeepCopy(Utils.java:131)
>         at org.apache.solr.common.util.Utils.makeDeepCopy(Utils.java:110)
>         at org.apache.solr.common.util.Utils.getDeepCopy(Utils.java:92)
>         at org.apache.solr.common.util.Utils.makeDeepCopy(Utils.java:108)
>         at org.apache.solr.common.util.Utils.getDeepCopy(Utils.java:92)
>         at org.apache.solr.common.util.Utils.getDeepCopy(Utils.java:74)
>         at org.apache.solr.client.solrj.cloud.autoscaling.Row.copy(Row.java:91)
>         at org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.lambda$getMatrixCopy$1(Policy.java:297)
>         at org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session$$Lambda$466/1757323495.apply(Unknown Source)
>         at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>         at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
>         at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>         at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>         at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>         at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>         at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>         at org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.getMatrixCopy(Policy.java:298)
>         at org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.copy(Policy.java:287)
>         at org.apache.solr.client.solrj.cloud.autoscaling.Row.removeReplica(Row.java:156)
>         at org.apache.solr.client.solrj.cloud.autoscaling.MoveReplicaSuggester.tryEachNode(MoveReplicaSuggester.java:60)
>         at org.apache.solr.client.solrj.cloud.autoscaling.MoveReplicaSuggester.init(MoveReplicaSuggester.java:34)
>         at org.apache.solr.client.solrj.cloud.autoscaling.Suggester.getSuggestion(Suggester.java:129)
>         at org.apache.solr.cloud.autoscaling.ComputePlanAction.process(ComputePlanAction.java:98)
>         at org.apache.solr.cloud.autoscaling.ScheduledTriggers.lambda$null$3(ScheduledTriggers.java:307)
>         at org.apache.solr.cloud.autoscaling.ScheduledTriggers$$Lambda$439/951218654.run(Unknown Source)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$9/1677458082.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
>         at __randomizedtesting.SeedInfo.seed([C6FA0364D13DAFCC]:0)
> {code}
> It's possible that an InterruptedException is caught somewhere and not
> propagated, so the Policy calculations don't terminate when the thread is
> interrupted while the parent components are being closed.
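
The hypothesis in the issue description (an InterruptedException that is caught and dropped, so a long-running calculation never notices that its executor is shutting down) can be illustrated with a small, self-contained sketch. This is not Solr code: the class and method names below are hypothetical, and `Thread.sleep(1)` merely stands in for any interruptible step inside a longer loop.

```java
public class InterruptPropagationSketch {

    // Anti-pattern (hypothetical): the exception is caught and dropped.
    // Catching InterruptedException clears the thread's interrupt flag,
    // so the loop carries on as if nothing had happened.
    static int swallowing(int steps) {
        int done = 0;
        for (int i = 0; i < steps; i++) {
            try {
                Thread.sleep(1); // stand-in for one interruptible unit of work
            } catch (InterruptedException e) {
                // swallowed: the interrupt request is lost here
            }
            done++;
        }
        return done;
    }

    // Cooperative variant: check the flag before each step, and restore it
    // if a blocking call throws, so callers further up can also observe it.
    static int cooperative(int steps) {
        int done = 0;
        for (int i = 0; i < steps; i++) {
            if (Thread.currentThread().isInterrupted()) {
                break; // stop promptly, leaving the flag set
            }
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // re-assert the flag
                break;
            }
            done++;
        }
        return done;
    }

    public static void main(String[] args) {
        // Simulate a shutdown request arriving before the work starts.
        Thread.currentThread().interrupt();
        System.out.println("swallowing completed steps: " + swallowing(5));

        Thread.currentThread().interrupt();
        System.out.println("cooperative completed steps: " + cooperative(5));
        Thread.interrupted(); // clear the flag before exiting
    }
}
```

With the interrupt requested up front, the swallowing variant still runs every step, while the cooperative variant stops before doing any work and leaves the interrupt flag set for its caller. A purely CPU-bound loop, such as the deep-copy frames in the trace above, would only terminate early if it performed an explicit flag check of this kind.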