[jira] [Commented] (SOLR-16122) TestLeaderElectionZkExpiry failing frequently

2024-06-03 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851833#comment-17851833
 ] 

David Smiley commented on SOLR-16122:
-

This test seems to fail due to thread leaks.

It happened yesterday in CI:
{noformat}
  2> INFO: All leaked threads terminated.
   > com.carrotsearch.randomizedtesting.ThreadLeakError: 2 threads leaked from SUITE scope at org.apache.solr.cloud.TestLeaderElectionZkExpiry: 
   >1) Thread[id=9557, name=zkConnectionManagerCallback-5960-thread-1-EventThread, state=WAITING, group=TGRP-TestLeaderElectionZkExpiry]
   > at java.base@11.0.16.1/jdk.internal.misc.Unsafe.park(Native Method)
   > at java.base@11.0.16.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
   > at java.base@11.0.16.1/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2081)
   > at java.base@11.0.16.1/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:433)
   > at app//org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:535)
   >2) Thread[id=9549, name=zkConnectionManagerCallback-5960-thread-1-EventThread, state=WAITING, group=TGRP-TestLeaderElectionZkExpiry]
   > at java.base@11.0.16.1/java.lang.Object.wait(Native Method)
   > at java.base@11.0.16.1/java.lang.Object.wait(Object.java:328)
   > at app//org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1583)
   > at app//org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1555)
   > at app//org.apache.zookeeper.ClientCnxn.close(ClientCnxn.java:1522)
   > at app//org.apache.zookeeper.ZooKeeper.close(ZooKeeper.java:1227)
   > at app//org.apache.solr.common.cloud.SolrZkClient.updateKeeper(SolrZkClient.java:863)
   > at app//org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:190)
   > at app//org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:59)
   > at app//org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:179)
   > at app//org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:564)
   > at app//org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:539)
   > at __randomizedtesting.SeedInfo.seed([B35AE6C0068D8659]:0)
{noformat}

It also happened for me on Crave recently (this time the leaked thread was the OverseerExitThread):

{noformat}
2> SEVERE: 1 thread leaked from SUITE scope at org.apache.solr.cloud.TestLeaderElectionZkExpiry: 
  2>1) Thread[id=349, name=OverseerExitThread, state=TIMED_WAITING, group=Overseer state updater.]
  2> at java.base@11.0.23/java.lang.Thread.sleep(Native Method)
  2> at app//org.apache.solr.common.cloud.ZkCmdExecutor.retryDelay(ZkCmdExecutor.java:101)
  2> at app//org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:80)
  2> at app//org.apache.solr.common.cloud.SolrZkClient.delete(SolrZkClient.java:345)
  2> at app//org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:118)
  2> at app//org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:310)
  2> at app//org.apache.solr.cloud.LeaderElector.retryElection(LeaderElector.java:395)
  2> at app//org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:133)
  2> at app//org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:310)
  2> at app//org.apache.solr.cloud.LeaderElector.retryElection(LeaderElector.java:395)
  2> at app//org.apache.solr.cloud.ZkController.rejoinOverseerElection(ZkController.java:2364)
  2> at app//org.apache.solr.cloud.Overseer$ClusterStateUpdater.checkIfIamStillLeader(Overseer.java:511)
  2> at app//org.apache.solr.cloud.Overseer$ClusterStateUpdater$$Lambda$1667/0x00010099b840.run(Unknown Source)
  2> at java.base@11.0.23/java.lang.Thread.run(Thread.java:829)
{noformat}

It seems clear to me how the leak above could happen: a new Thread is 
spawned and nothing waits for it to finish 
[here|https://github.com/apache/solr/blob/70b6e4f6952cb7f9b3647865404487c68264668d/solr/core/src/java/org/apache/solr/cloud/Overseer.java#L417].
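
To illustrate the kind of leak described above, here is a minimal, self-contained Java sketch. It is not Solr code: {{ExitThreadSketch}}, {{Component}}, {{startExitWork}} and {{aliveAfterReturn}} are hypothetical names, and the sleep merely stands in for a retrying ZooKeeper call. Under those assumptions, it shows that a thread which is started and never joined is still alive when the caller returns, while the same thread is gone once {{close()}} waits for it.

```java
public class ExitThreadSketch {

    /** Hypothetical stand-in for a component that hands its exit work to a new thread. */
    static final class Component implements AutoCloseable {
        private volatile Thread exitThread;

        /** Starts the background work and returns immediately, without waiting for it. */
        void startExitWork(long workMillis) {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(workMillis); // placeholder for a retry loop with delays
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // exit promptly when interrupted
                }
            }, "SketchExitThread");
            t.setDaemon(true); // keeps this sketch from blocking JVM exit
            exitThread = t;
            t.start();
        }

        boolean exitThreadAlive() {
            Thread t = exitThread;
            return t != null && t.isAlive();
        }

        /** Waits briefly for the thread, then interrupts it and waits until it has ended. */
        @Override
        public void close() {
            Thread t = exitThread;
            if (t == null) {
                return;
            }
            try {
                t.join(100);
                if (t.isAlive()) {
                    t.interrupt();
                    t.join();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Reports whether the spawned thread is still alive once the caller is "done". */
    static boolean aliveAfterReturn(boolean waitOnClose) {
        Component c = new Component();
        c.startExitWork(60_000);
        if (waitOnClose) {
            c.close();
        }
        boolean alive = c.exitThreadAlive();
        c.close(); // clean up so the sketch itself leaves no thread behind
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("alive without waiting: " + aliveAfterReturn(false));
        System.out.println("alive after close():   " + aliveAfterReturn(true));
    }
}
```

A test framework that checks for leaked threads at suite end would flag the first case. Whether joining, interrupting, or running the work on a managed executor is the right fix for the Overseer is a separate question that this sketch does not answer.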



> TestLeaderElectionZkExpiry failing frequently
> -
>
> Key: SOLR-16122
> URL: https://issues.apache.org/jira/browse/SOLR-16122
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 9.0
>Reporter: Jan Høydahl
>Priority: Major
>
> Failing in 10% of runs - marking as {{@BadApple}} before the 9.0 release



--
This message was sent by Atlassian Jira

[jira] [Commented] (SOLR-16122) TestLeaderElectionZkExpiry failing frequently

2022-03-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17513955#comment-17513955
 ] 

ASF subversion and git services commented on SOLR-16122:


Commit b6f6cf1b23da5ad61d9e451ba6a6dd7e3e9f0a3d in solr's branch 
refs/heads/branch_9x from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=b6f6cf1 ]

SOLR-16122 Bad-apple TestLeaderElectionZkExpiry test





--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-16122) TestLeaderElectionZkExpiry failing frequently

2022-03-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17513954#comment-17513954
 ] 

ASF subversion and git services commented on SOLR-16122:


Commit 60aa6693ba2fab2af71e83bc24c1c408d1d23d7b in solr's branch 
refs/heads/main from Jan Høydahl
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=60aa669 ]

SOLR-16122 Bad-apple TestLeaderElectionZkExpiry test




