[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
[ https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17221604#comment-17221604 ]

Anver Sotnikov commented on SOLR-14940:
---

[~mdrob] It looks good to me. Thank you for adding tests.

> ReplicationHandler memory leak through SolrCore.closeHooks
> --
>
> Key: SOLR-14940
> URL: https://issues.apache.org/jira/browse/SOLR-14940
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: replication (java)
> Environment: Solr Cloud Cluster on v.8.6.2 configured as 3 TLOG nodes
> with 2 cores in each JVM.
>
> Reporter: Anver Sotnikov
> Assignee: Mike Drob
> Priority: Major
> Attachments: Actual references to hooks that in turn hold references
> to ReplicationHandlers.png, Memory Analyzer SolrCore.closeHooks .png
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> We are experiencing a memory leak in Solr Cloud cluster configured as 3 TLOG
> nodes.
> Leader does not seem to be affected while Followers are.
>
> Looking at memory dump we noticed that SolrCore holds lots of references to
> ReplicationHandler through anonymous inner classes in SolrCore.closeHooks,
> which in turn holds ReplicationHandlers.
> ReplicationHandler registers hooks as anonymous inner classes in
> SolrCore.closeHooks through ReplicationHandler.inform() ->
> ReplicationHandler.registerCloseHook().
>
> Whenever ZkController.stopReplicationFromLeader is called - it would shutdown
> ReplicationHandler (ReplicationHandler.shutdown()), BUT reference to
> ReplicationHandler will stay in SolrCore.closeHooks. Once replication is
> started again on same SolrCore - new ReplicationHandler will be created and
> registered in closeHooks.
>
> It looks like there are few scenarios when replication is stopped and
> restarted on same core and in our TLOG setup it shows up quite often.
>
> Potential solutions:
> # Allow unregistering SolrCore.closeHooks so it can be used from
> ReplicationHandler.shutdown
> # Hack but easier - break the link between ReplicationHandler close hooks
> and full ReplicationHandler object so ReplicationHandler can be GCed even
> when hooks are still registered in SolrCore.closeHooks

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
[ https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220704#comment-17220704 ]

Anver Sotnikov commented on SOLR-14940:
---

Mike, you were right. We instrumented Solr with extra logging on registerHook and shutdown in ReplicationController to confirm that the leak was due to a flaky ZK connection. We bumped timeouts (SOLR-10471) and fine-tuned GC as well. Replication now goes into recovery far less often than it did before.

Stacktrace from registerHook:
{code}
at org.apache.solr.handler.ReplicationHandler.registerCloseHook(ReplicationHandler.java:1397)
java.lang.RuntimeException: ReplicationHandler.registerCloseHooks
    at org.apache.solr.handler.ReplicationHandler.registerCloseHook(ReplicationHandler.java:1397) ~[?:?]
    at org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.java:1239) ~[?:?]
    at org.apache.solr.cloud.ReplicateFromLeader.startReplication(ReplicateFromLeader.java:109) ~[?:?]
    at org.apache.solr.cloud.ZkController.startReplicationFromLeader(ZkController.java:1327) ~[?:?]
    at org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:713) ~[?:?]
    at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:334) ~[?:?]
    at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:317) ~[?:?]
    at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180) ~[metrics-core-4.1.5.jar:4.1.5]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
    at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:212) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
    at java.lang.Thread.run(Unknown Source)
{code}
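The trace in the comment above is headed by a RuntimeException whose message names the instrumented method, which suggests the call path was captured by logging a freshly created exception rather than by catching a real failure. A minimal sketch of that technique follows, assuming java.util.logging; the class names CallSiteTracer and TracedHandlerSketch are hypothetical and are not part of Solr.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper: records who called a method by logging a throwable
// that is created but never thrown.
class CallSiteTracer {
  private static final Logger LOG = Logger.getLogger(CallSiteTracer.class.getName());

  // Creates a marker exception (its stack trace is filled in at construction),
  // logs it, and returns it so callers can inspect the captured frames.
  static RuntimeException trace(String label) {
    RuntimeException marker = new RuntimeException(label);
    LOG.log(Level.INFO, "call-site trace", marker);
    return marker;
  }
}

// Hypothetical stand-in for a handler whose hook registration we want to trace.
class TracedHandlerSketch {
  void registerCloseHook() {
    // The logged trace shows every frame that led to this registration.
    CallSiteTracer.trace("TracedHandlerSketch.registerCloseHook");
    // ... the actual hook registration would go here ...
  }

  public static void main(String[] args) {
    new TracedHandlerSketch().registerCloseHook();
  }
}
```

Counting such log entries per core, and comparing them with the matching shutdown entries, is one way to confirm how often replication is restarted on the same core.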
[jira] [Updated] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
[ https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anver Sotnikov updated SOLR-14940:
--
Attachment: (was: image-2020-10-20-13-58-58-522.png)
[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
[ https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217833#comment-17217833 ]

Anver Sotnikov commented on SOLR-14940:
---

Attached closeHooks contents showing hooks (ReplicationHandler$1 and ReplicationHandler$2) on the left in MAT
[jira] [Updated] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
[ https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anver Sotnikov updated SOLR-14940:
--
Attachment: image-2020-10-20-13-58-58-522.png
[jira] [Updated] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
[ https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anver Sotnikov updated SOLR-14940:
--
Attachment: Actual references to hooks that in turn hold references to ReplicationHandlers.png
[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
[ https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217828#comment-17217828 ]

Anver Sotnikov commented on SOLR-14940:
---

The ReplicationHandler reference is kept in memory because the hooks ReplicationHandler$1 and ReplicationHandler$2 are anonymous inner classes, and as such they keep an implicit reference to the enclosing class instance (so they can use the fields referenced in the hooks, such as executorService, currentIndexFetcher, etc.)
{code:java}
private void registerCloseHook() {
  core.addCloseHook(new CloseHook() {
    @Override
    public void preClose(SolrCore core) {
      if (executorService != null) executorService.shutdown(); // we don't wait for shutdown - this can deadlock core reload
    }

    @Override
    public void postClose(SolrCore core) {
      if (pollingIndexFetcher != null) {
        pollingIndexFetcher.destroy();
      }
      if (currentIndexFetcher != null && currentIndexFetcher != pollingIndexFetcher) {
        currentIndexFetcher.destroy();
      }
    }
  });

  core.addCloseHook(new CloseHook() {
    @Override
    public void preClose(SolrCore core) {
      ExecutorUtil.shutdownAndAwaitTermination(restoreExecutor);
      if (restoreFuture != null) {
        restoreFuture.cancel(false);
      }
    }

    @Override
    public void postClose(SolrCore core) {}
  });
}
{code}
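The implicit reference described in this comment can be observed directly with reflection. The sketch below uses hypothetical stand-ins (Hook, Core, Handler), not Solr's real CloseHook, SolrCore, or ReplicationHandler. It compares an anonymous hook, which reads a field of its enclosing instance and therefore captures that instance, with a static nested hook that is handed only the one object it needs, along the lines of the second potential solution in the issue.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-ins for illustration only.
interface Hook {
  void preClose();
}

class Core {
  final List<Hook> closeHooks = new ArrayList<>();

  void addCloseHook(Hook hook) {
    closeHooks.add(hook);
  }
}

class Handler {
  private final ExecutorService executorService = Executors.newSingleThreadExecutor();

  // Leaky variant: the anonymous class reads a Handler field, so the compiler
  // gives it a hidden field that points back at this Handler.
  void registerLeaky(Core core) {
    core.addCloseHook(new Hook() {
      @Override
      public void preClose() {
        executorService.shutdown();
      }
    });
  }

  // Decoupled variant: the hook receives only the executor it must stop.
  void registerDecoupled(Core core) {
    core.addCloseHook(new ShutdownHook(executorService));
  }

  void shutdown() {
    executorService.shutdown();
  }

  // Static nested classes have no enclosing instance.
  static final class ShutdownHook implements Hook {
    private final ExecutorService executor;

    ShutdownHook(ExecutorService executor) {
      this.executor = executor;
    }

    @Override
    public void preClose() {
      executor.shutdown();
    }
  }
}

class HookCaptureDemo {
  // True if the hook's class declares a field whose type is Handler,
  // i.e. the hook keeps the whole handler reachable.
  static boolean holdsHandler(Hook hook) {
    for (Field field : hook.getClass().getDeclaredFields()) {
      if (field.getType() == Handler.class) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Core core = new Core();
    Handler handler = new Handler();
    handler.registerLeaky(core);
    handler.registerDecoupled(core);
    System.out.println("anonymous hook holds handler: " + holdsHandler(core.closeHooks.get(0)));
    System.out.println("static hook holds handler: " + holdsHandler(core.closeHooks.get(1)));
    handler.shutdown();
  }
}
```

With the decoupled variant the hook still accumulates in the list after each restart, but it pins only the executor rather than the whole handler, which is why the issue calls this approach a hack.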
[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
[ https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217710#comment-17217710 ]

Anver Sotnikov commented on SOLR-14940:
---

Attached screenshot of Memory Analyzer heap dump with 163 hooks registered. Let me know if you'd want to see any details of ReplicationHandlers referenced from hooks.
[jira] [Updated] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
[ https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anver Sotnikov updated SOLR-14940:
--
Attachment: Memory Analyzer SolrCore.closeHooks .png
[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
[ https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217708#comment-17217708 ]

Anver Sotnikov commented on SOLR-14940:
---

I assume a core reload actually creates a new core and discards the old one, so its hooks would be executed and discarded. Stopping and restarting replication on the same core seems to be a better representation of the problem.

I am planning to go through the logs to understand what exactly triggers the problem (we do see quite a few recoveries triggered by ZK connectivity issues, but we need to trace whether that is the real reason).
[jira] [Created] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
Anver Sotnikov created SOLR-14940:
-

Summary: ReplicationHandler memory leak through SolrCore.closeHooks
Key: SOLR-14940
URL: https://issues.apache.org/jira/browse/SOLR-14940
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: replication (java)
Environment: Solr Cloud Cluster on v.8.6.2 configured as 3 TLOG nodes with 2 cores in each JVM.
Reporter: Anver Sotnikov

We are experiencing a memory leak in Solr Cloud cluster configured as 3 TLOG nodes.
Leader does not seem to be affected while Followers are.

Looking at memory dump we noticed that SolrCore holds lots of references to ReplicationHandler through anonymous inner classes in SolrCore.closeHooks, which in turn holds ReplicationHandlers.
ReplicationHandler registers hooks as anonymous inner classes in SolrCore.closeHooks through ReplicationHandler.inform() -> ReplicationHandler.registerCloseHook().

Whenever ZkController.stopReplicationFromLeader is called - it would shutdown ReplicationHandler (ReplicationHandler.shutdown()), BUT reference to ReplicationHandler will stay in SolrCore.closeHooks. Once replication is started again on same SolrCore - new ReplicationHandler will be created and registered in closeHooks.

It looks like there are few scenarios when replication is stopped and restarted on same core and in our TLOG setup it shows up quite often.

Potential solutions:
# Allow unregistering SolrCore.closeHooks so it can be used from ReplicationHandler.shutdown
# Hack but easier - break the link between ReplicationHandler close hooks and full ReplicationHandler object so ReplicationHandler can be GCed even when hooks are still registered in SolrCore.closeHooks
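The first potential solution in the description can be sketched as follows. Everything here is a hypothetical stand-in: CoreSketch, HandlerSketch, and in particular removeCloseHook are assumptions made for illustration and are not taken from Solr's API. For brevity each handler registers one hook, whereas the real ReplicationHandler registers two. The sketch models repeated stop and restart of replication on one core and compares how many hooks remain registered with and without unregistering on shutdown.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-ins for illustration; these are not Solr's real classes.
interface CloseHookSketch {
  void preClose();
}

class CoreSketch {
  private final List<CloseHookSketch> closeHooks = new CopyOnWriteArrayList<>();

  void addCloseHook(CloseHookSketch hook) {
    closeHooks.add(hook);
  }

  // Assumed API: the issue proposes allowing hooks to be unregistered.
  void removeCloseHook(CloseHookSketch hook) {
    closeHooks.remove(hook);
  }

  int hookCount() {
    return closeHooks.size();
  }
}

class HandlerSketch {
  private final CoreSketch core;
  private CloseHookSketch hook;

  HandlerSketch(CoreSketch core) {
    this.core = core;
  }

  // Mirrors inform() -> registerCloseHook(): the hook captures this handler.
  void inform() {
    hook = this::releaseResources;
    core.addCloseHook(hook);
  }

  // With unregister == true, the handler removes its own hook so the core
  // no longer keeps it reachable after shutdown.
  void shutdown(boolean unregister) {
    releaseResources();
    if (unregister && hook != null) {
      core.removeCloseHook(hook);
      hook = null;
    }
  }

  private void releaseResources() {
    // Stopping executors and index fetchers would go here.
  }
}

class UnregisterDemo {
  // Simulates replication being started and stopped `cycles` times on one core.
  static int hooksAfterCycles(int cycles, boolean unregister) {
    CoreSketch core = new CoreSketch();
    for (int i = 0; i < cycles; i++) {
      HandlerSketch handler = new HandlerSketch(core);
      handler.inform();
      handler.shutdown(unregister);
    }
    return core.hookCount();
  }

  public static void main(String[] args) {
    System.out.println(hooksAfterCycles(100, false)); // 100 hooks (and handlers) retained
    System.out.println(hooksAfterCycles(100, true));  // 0 retained
  }
}
```

Keeping the hook in a field is what makes removal possible: the handler must hand back the same object it registered, since the list removes by equality.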