[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks

2020-10-27 Thread Anver Sotnikov (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17221604#comment-17221604
 ] 

Anver Sotnikov commented on SOLR-14940:
---

[~mdrob] It looks good to me. Thank you for adding tests.

> ReplicationHandler memory leak through SolrCore.closeHooks
> --
>
> Key: SOLR-14940
> URL: https://issues.apache.org/jira/browse/SOLR-14940
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: replication (java)
> Environment: Solr Cloud Cluster on v.8.6.2 configured as 3 TLOG nodes 
> with 2 cores in each JVM.
>  
>Reporter: Anver Sotnikov
>Assignee: Mike Drob
>Priority: Major
> Attachments: Actual references to hooks that in turn hold references 
> to ReplicationHandlers.png, Memory Analyzer SolrCore.closeHooks .png
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> We are experiencing a memory leak in Solr Cloud cluster configured as 3 TLOG 
> nodes.
> Leader does not seem to be affected while Followers are.
>  
> Looking at memory dump we noticed that SolrCore holds lots of references to 
> ReplicationHandler through anonymous inner classes in SolrCore.closeHooks, 
> which in turn holds ReplicationHandlers.
> ReplicationHandler registers hooks as anonymous inner classes in 
> SolrCore.closeHooks through ReplicationHandler.inform() -> 
> ReplicationHandler.registerCloseHook().
>  
> Whenever ZkController.stopReplicationFromLeader is called - it would shutdown 
> ReplicationHandler (ReplicationHandler.shutdown()), BUT reference to 
> ReplicationHandler will stay in SolrCore.closeHooks. Once replication is 
> started again on same SolrCore - new ReplicationHandler will be created and 
> registered in closeHooks.
>  
> It looks like there are few scenarios when replication is stopped and 
> restarted on same core and in our TLOG setup it shows up quite often.
>  
> Potential solutions:
>  # Allow unregistering SolrCore.closeHooks so it can be used from 
> ReplicationHandler.shutdown
>  # Hack but easier - break the link between ReplicationHandler close hooks 
> and full ReplicationHandler object so ReplicationHandler can be GCed even 
> when hooks are still registered in SolrCore.closeHooks



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks

2020-10-26 Thread Anver Sotnikov (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220704#comment-17220704
 ] 

Anver Sotnikov commented on SOLR-14940:
---

Mike, you were right. We instrumented Solr with extra logging on 
registerCloseHook and shutdown in ReplicationHandler and confirmed that the 
leak was caused by a flaky ZK connection. We bumped the timeouts (SOLR-10471) 
and fine-tuned GC as well. Replication going into recovery now happens far 
less often than before.

Stack trace from registerHook:
{code}
at org.apache.solr.handler.ReplicationHandler.registerCloseHook(ReplicationHandler.java:1397)
java.lang.RuntimeException: ReplicationHandler.registerCloseHooks
    at org.apache.solr.handler.ReplicationHandler.registerCloseHook(ReplicationHandler.java:1397) ~[?:?]
    at org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.java:1239) ~[?:?]
    at org.apache.solr.cloud.ReplicateFromLeader.startReplication(ReplicateFromLeader.java:109) ~[?:?]
    at org.apache.solr.cloud.ZkController.startReplicationFromLeader(ZkController.java:1327) ~[?:?]
    at org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:713) ~[?:?]
    at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:334) ~[?:?]
    at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:317) ~[?:?]
    at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180) ~[metrics-core-4.1.5.jar:4.1.5]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
    at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:212) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
    at java.lang.Thread.run(Unknown Source)
{code}
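
A trace of this shape can be obtained by creating an exception at the point of interest and printing its stack trace without throwing it. The sketch below illustrates that general technique only; it is not the instrumentation used in this thread. CallSiteTracer and its methods are hypothetical names, and a real patch would presumably hand the text to Solr's logger instead of returning it.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical helper, not part of Solr: renders the current call stack as text.
class CallSiteTracer {

    // Creates (but never throws) an exception so its stack trace records the caller.
    static String traceOf(String label) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        new RuntimeException(label).printStackTrace(writer);
        writer.flush();
        return buffer.toString();
    }

    // Stand-in for the instrumented method; a real patch would log the returned text.
    static String registerCloseHook() {
        return traceOf("registerCloseHook called");
    }

    public static void main(String[] args) {
        System.out.println(registerCloseHook());
    }
}
```

Each call produces the exception line followed by one "at ..." frame per caller, which is enough to see which code path registered a hook.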






[jira] [Updated] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks

2020-10-20 Thread Anver Sotnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anver Sotnikov updated SOLR-14940:
--
Attachment: (was: image-2020-10-20-13-58-58-522.png)




[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks

2020-10-20 Thread Anver Sotnikov (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217833#comment-17217833
 ] 

Anver Sotnikov commented on SOLR-14940:
---

Attached the closeHooks contents as shown in MAT, with the hooks 
(ReplicationHandler$1 and ReplicationHandler$2) on the left.




[jira] [Updated] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks

2020-10-20 Thread Anver Sotnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anver Sotnikov updated SOLR-14940:
--
Attachment: image-2020-10-20-13-58-58-522.png




[jira] [Updated] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks

2020-10-20 Thread Anver Sotnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anver Sotnikov updated SOLR-14940:
--
Attachment: Actual references to hooks that in turn hold references to 
ReplicationHandlers.png




[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks

2020-10-20 Thread Anver Sotnikov (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217828#comment-17217828
 ] 

Anver Sotnikov commented on SOLR-14940:
---

The ReplicationHandler reference is kept in memory because the hooks 
ReplicationHandler$1 and ReplicationHandler$2 are anonymous inner classes, and 
as such they keep an implicit reference to the enclosing class instance (needed 
to use the fields referenced in the hooks, such as executorService, 
currentIndexFetcher, etc.):

 
{code:java}
  private void registerCloseHook() {
    core.addCloseHook(new CloseHook() {
      @Override
      public void preClose(SolrCore core) {
        // we don't wait for shutdown - this can deadlock core reload
        if (executorService != null) executorService.shutdown();
      }

      @Override
      public void postClose(SolrCore core) {
        if (pollingIndexFetcher != null) {
          pollingIndexFetcher.destroy();
        }
        if (currentIndexFetcher != null && currentIndexFetcher != pollingIndexFetcher) {
          currentIndexFetcher.destroy();
        }
      }
    });

    core.addCloseHook(new CloseHook() {
      @Override
      public void preClose(SolrCore core) {
        ExecutorUtil.shutdownAndAwaitTermination(restoreExecutor);
        if (restoreFuture != null) {
          restoreFuture.cancel(false);
        }
      }

      @Override
      public void postClose(SolrCore core) {}
    });
  }
{code}
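
The implicit reference can be made visible with plain reflection. The following is a minimal sketch with hypothetical stand-in types (DemoCloseHook, DemoHandler, ImplicitReferenceDemo), not Solr classes. It assumes only the standard javac behavior that an anonymous class which touches a field of its enclosing instance gets a compiler-generated (synthetic) field pointing at that instance.

```java
import java.lang.reflect.Field;

// Hypothetical stand-ins for CloseHook and ReplicationHandler; not Solr classes.
interface DemoCloseHook {
    void preClose();
}

class DemoHandler {
    private Object executorService = new Object();

    // Anonymous inner class: reading executorService requires DemoHandler.this.
    DemoCloseHook anonymousHook() {
        return new DemoCloseHook() {
            @Override
            public void preClose() {
                if (executorService != null) executorService = null;
            }
        };
    }
}

class ImplicitReferenceDemo {
    // True if the hook's class has a compiler-generated field of the outer type.
    static boolean capturesOuter(Object hook, Class<?> outerType) {
        for (Field field : hook.getClass().getDeclaredFields()) {
            if (field.isSynthetic() && field.getType() == outerType) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        DemoCloseHook hook = new DemoHandler().anonymousHook();
        System.out.println(capturesOuter(hook, DemoHandler.class));
    }
}
```

As long as such a hook object stays reachable from a long-lived list, the whole enclosing handler stays reachable as well, which matches what the heap dump shows.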




[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks

2020-10-20 Thread Anver Sotnikov (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217710#comment-17217710
 ] 

Anver Sotnikov commented on SOLR-14940:
---

Attached a screenshot of a Memory Analyzer heap dump with 163 hooks registered. 
Let me know if you'd like to see any details of the ReplicationHandlers 
referenced from the hooks.




[jira] [Updated] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks

2020-10-20 Thread Anver Sotnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anver Sotnikov updated SOLR-14940:
--
Attachment: Memory Analyzer SolrCore.closeHooks .png




[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks

2020-10-20 Thread Anver Sotnikov (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217708#comment-17217708
 ] 

Anver Sotnikov commented on SOLR-14940:
---

I assume a core reload actually creates a new core and discards the old one, so 
its hooks would be executed and discarded. Stopping and restarting replication 
on the same core seems to be a better representation of the problem. I am 
planning to go through the logs to understand what exactly triggers it (we do 
see quite a few recoveries triggered by ZK connectivity issues, but we need to 
trace whether that is the real cause).

 




[jira] [Created] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks

2020-10-16 Thread Anver Sotnikov (Jira)
Anver Sotnikov created SOLR-14940:
-

 Summary: ReplicationHandler memory leak through SolrCore.closeHooks
 Key: SOLR-14940
 URL: https://issues.apache.org/jira/browse/SOLR-14940
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: replication (java)
 Environment: Solr Cloud Cluster on v.8.6.2 configured as 3 TLOG nodes 
with 2 cores in each JVM.

 
Reporter: Anver Sotnikov


We are experiencing a memory leak in a Solr Cloud cluster configured as 3 TLOG 
nodes.

The leader does not seem to be affected, while the followers are.

 

Looking at a memory dump, we noticed that SolrCore holds many references to 
ReplicationHandler instances through the anonymous inner classes stored in 
SolrCore.closeHooks.

ReplicationHandler registers hooks as anonymous inner classes in 
SolrCore.closeHooks through ReplicationHandler.inform() -> 
ReplicationHandler.registerCloseHook().

 

Whenever ZkController.stopReplicationFromLeader is called, it shuts down the 
ReplicationHandler (ReplicationHandler.shutdown()), but the reference to that 
ReplicationHandler stays in SolrCore.closeHooks. Once replication is started 
again on the same SolrCore, a new ReplicationHandler is created and registered 
in closeHooks.
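
The accumulation described above can be reproduced in miniature. This is a simplified model under stated assumptions, not Solr code: LeakyCore, LeakyHandler and LeakDemo are hypothetical names, and a plain Runnable stands in for the close hook. Each stop/start cycle creates a fresh handler whose hook is added to the core's list and never removed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a long-lived core that only ever adds hooks.
class LeakyCore {
    final List<Runnable> closeHooks = new ArrayList<>();

    void addCloseHook(Runnable hook) {
        closeHooks.add(hook);
    }
}

// Hypothetical model of a handler that registers a hook but never unregisters it.
class LeakyHandler {
    private final LeakyCore core;
    private boolean running;

    LeakyHandler(LeakyCore core) {
        this.core = core;
    }

    void inform() {
        running = true;
        // The anonymous class touches a field, so it captures LeakyHandler.this.
        core.addCloseHook(new Runnable() {
            @Override
            public void run() {
                running = false;
            }
        });
    }

    void shutdown() {
        running = false; // the hook stays registered in the core
    }
}

class LeakDemo {
    // Simulates repeated stop/start of replication on one core.
    static int hooksAfterCycles(int cycles) {
        LeakyCore core = new LeakyCore();
        for (int i = 0; i < cycles; i++) {
            LeakyHandler handler = new LeakyHandler(core);
            handler.inform();
            handler.shutdown();
        }
        return core.closeHooks.size();
    }

    public static void main(String[] args) {
        System.out.println(hooksAfterCycles(5));
    }
}
```

In this model the hook count equals the number of cycles, and every retained hook pins one discarded handler.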

 

It looks like there are a few scenarios in which replication is stopped and 
restarted on the same core, and in our TLOG setup this happens quite often.

 

Potential solutions:
 # Allow unregistering SolrCore.closeHooks so it can be used from 
ReplicationHandler.shutdown
 # Hack but easier - break the link between ReplicationHandler close hooks and 
full ReplicationHandler object so ReplicationHandler can be GCed even when 
hooks are still registered in SolrCore.closeHooks
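
Both ideas can be sketched together on a simplified model. Everything below is hypothetical (SketchCloseHook, SketchCore, ExecutorShutdownHook, SketchHandler, and in particular removeCloseHook, which is an assumed method, not an existing SolrCore API as far as this report goes). Option 1 is modeled by removing the hook in shutdown(); option 2 by a standalone hook class that holds only the executor it needs, so it carries no reference to the handler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical, simplified hook interface.
interface SketchCloseHook {
    void preClose();
}

// Hypothetical core model; removeCloseHook is the assumed addition (option 1).
class SketchCore {
    private final List<SketchCloseHook> closeHooks = new ArrayList<>();

    void addCloseHook(SketchCloseHook hook) {
        closeHooks.add(hook);
    }

    void removeCloseHook(SketchCloseHook hook) {
        closeHooks.remove(hook);
    }

    int hookCount() {
        return closeHooks.size();
    }
}

// Option 2: a top-level class has no enclosing instance, so it holds only the executor.
final class ExecutorShutdownHook implements SketchCloseHook {
    private final ExecutorService executor;

    ExecutorShutdownHook(ExecutorService executor) {
        this.executor = executor;
    }

    @Override
    public void preClose() {
        executor.shutdown();
    }
}

class SketchHandler {
    private final SketchCore core;
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private SketchCloseHook hook;

    SketchHandler(SketchCore core) {
        this.core = core;
    }

    void inform() {
        hook = new ExecutorShutdownHook(executor);
        core.addCloseHook(hook);
    }

    // Option 1: unregister the hook when the handler itself is shut down.
    void shutdown() {
        executor.shutdown();
        if (hook != null) {
            core.removeCloseHook(hook);
            hook = null;
        }
    }
}
```

Either measure alone would break the retention chain in this model; combining them just makes the sketch cover both proposals.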


