[ https://issues.apache.org/jira/browse/SLING-10133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timothee Maret updated SLING-10133:
-----------------------------------
    Affects Version/s: Content Distribution Core 0.4.0

> Memory leak in MonitoringDistributionPackageBuilder
> ---------------------------------------------------
>
>                 Key: SLING-10133
>                 URL: https://issues.apache.org/jira/browse/SLING-10133
>             Project: Sling
>          Issue Type: Bug
>    Affects Versions: Content Distribution Core 0.4.0
>            Reporter: Timothee Maret
>            Priority: Major
>
> The MonitoringDistributionPackageBuilder maintains a list of MBeans for the
> latest packages. The number of packages to be monitored is passed as the
> [queueCapacity|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/b80cd8f3bae6b7875387ee7caaea271b7e9baec6/src/main/java/org/apache/sling/distribution/monitor/impl/MonitoringDistributionPackageBuilder.java#L49]
> via the constructor. When the queueCapacity is 0, the monitoring is disabled.
> [VaultDistributionPackageBuilderFactory|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/b80cd8f3bae6b7875387ee7caaea271b7e9baec6/src/main/java/org/apache/sling/distribution/serialization/impl/vlt/VaultDistributionPackageBuilderFactory.java#L201]
> and
> [DistributionPackageBuilderFactory|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/b80cd8f3bae6b7875387ee7caaea271b7e9baec6/src/main/java/org/apache/sling/distribution/serialization/impl/DistributionPackageBuilderFactory.java]
> disable this feature by default. Nevertheless, an environment that runs for
> multiple weeks without restart and with the default configuration will
> experience a memory leak that leads to the JVM running out of memory.
> The implementation has two flaws that explain the memory leak.
>
> h2. #1 - Registering an MBean when the queueCapacity is 0
> The code [unconditionally registers an
> MBean|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/b80cd8f3bae6b7875387ee7caaea271b7e9baec6/src/main/java/org/apache/sling/distribution/monitor/impl/MonitoringDistributionPackageBuilder.java#L106]
> even if the queueCapacity is 0. We need to register an MBean only when the
> capacity is > 0.
>
> h2. #2 - Concurrency issue when un-registering MBeans
> The code [attempts to
> remove|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/b80cd8f3bae6b7875387ee7caaea271b7e9baec6/src/main/java/org/apache/sling/distribution/monitor/impl/MonitoringDistributionPackageBuilder.java#L108]
> an MBean by checking whether the queueCapacity equals the size of the list of
> MBeans. This check works in a single-threaded context, but it falls short when
> registerDistributionPackageMBean is invoked concurrently. In the concurrent
> case, the check may never hold true, so the mBeans queue grows indefinitely.
> One solution is to leverage the features of the LinkedBlockingDeque: create a
> LinkedBlockingDeque with bounded capacity and rely on the status returned by
> the offer method to decide whether an item needs to be removed.
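
The two fixes described in the issue could be sketched roughly as follows. This is a minimal illustration, not the actual Sling patch: the class BoundedMBeanRegistry, its register and size methods, and the onEvict callback (standing in for whatever un-registers an MBean) are hypothetical names introduced for this example. Only LinkedBlockingDeque and its offer and poll methods come from the JDK.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.function.Consumer;

// Hypothetical helper, not part of Sling: keeps at most `capacity` items.
class BoundedMBeanRegistry<T> {

    // null when monitoring is disabled (capacity 0)
    private final LinkedBlockingDeque<T> items;

    // Assumed hook that would un-register the evicted MBean
    private final Consumer<T> onEvict;

    BoundedMBeanRegistry(int capacity, Consumer<T> onEvict) {
        // LinkedBlockingDeque rejects a capacity below 1,
        // so a capacity of 0 is modelled as "no deque at all".
        this.items = capacity > 0 ? new LinkedBlockingDeque<>(capacity) : null;
        this.onEvict = onEvict;
    }

    /** Returns true if the item was registered, false if monitoring is disabled. */
    boolean register(T item) {
        if (items == null) {
            return false; // flaw #1: register nothing when the capacity is 0
        }
        // flaw #2: offer() returns false when the bounded deque is full.
        // Evict the oldest entry and retry instead of comparing sizes.
        while (!items.offer(item)) {
            T oldest = items.poll();
            if (oldest != null) {
                onEvict.accept(oldest);
            }
        }
        return true;
    }

    int size() {
        return items == null ? 0 : items.size();
    }
}
```

Because the deque itself enforces the bound, the number of retained entries cannot exceed the capacity even when register is called from several threads. A thread that loses a race simply sees offer fail again and evicts one more entry.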