[ https://issues.apache.org/jira/browse/GEODE-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100919#comment-16100919 ]
ASF GitHub Bot commented on GEODE-3286:
---------------------------------------

GitHub user WireBaron opened a pull request:

    https://github.com/apache/geode/pull/657

    GEODE-3286: Failing to cleanup connections from ConnectionTable recei…

    …ver table

    @kohlmu-pivotal @galen-pivotal @pivotal-amurmann @bschuchardt @hiteshk25

    - prevent adding a closed connection to the connection table's receivers
    - add a new unit test for connection table
    - adding connection factory object for creating receiving connections
    - have the idle connection timeout ensure connections are removed from connection table receivers
    - modify tcpConduit stat accesses to allow easier mocking

    Signed-off-by: Hitesh Khamesra <hitesh...@yahoo.com>

    Thank you for submitting a contribution to Apache Geode.

    In order to streamline the review of the contribution we ask you
    to ensure the following steps have been taken:

    ### For all changes:
    - [x] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
    - [x] Has your PR been rebased against the latest commit within the target branch (typically `develop`)?
    - [x] Is your initial contribution a single, squashed commit?
    - [x] Does `gradlew build` run cleanly?
    - [x] Have you written or updated unit tests to verify your changes?
    - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?

    ### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and
    submit an update to your PR as soon as possible. If you need help, please send an
    email to d...@geode.apache.org.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/WireBaron/geode feature/GEODE-3286

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/geode/pull/657.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #657

----
commit 8aed26846de6e9ff1c123acae98a7b5ce6d82a83
Author: Brian Rowe <br...@pivotal.io>
Date:   2017-07-25T22:43:35Z

    GEODE-3286: Failing to cleanup connections from ConnectionTable receiver table

    - prevent adding a closed connection to the connection table's receivers
    - add a new unit test for connection table
    - adding connection factory object for creating receiving connections
    - have the idle connection timeout ensure connections are removed from connection table receivers
    - modify tcpConduit stat accesses to allow easier mocking

    Signed-off-by: Hitesh Khamesra <hitesh...@yahoo.com>

----

> Failing to cleanup connections from ConnectionTable receiver table
> ------------------------------------------------------------------
>
>                 Key: GEODE-3286
>                 URL: https://issues.apache.org/jira/browse/GEODE-3286
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Brian Rowe
>
> This bug tracks gemfire issue 1554
> (https://jira-pivotal.atlassian.net/browse/GEM-1544).
> Hello team,
> A customer (VMWare) is experiencing several {{OutOfMemoryError}} on
> production servers, and they believe there's a memory leak within GemFire.
> Apparently 9.5GB of the heap is occupied by 487,828 instances of
> {{sun.security.ssl.SSLSocketImpl}}, and 7.7GB of the heap is occupied by
> 487,804 instances of {{sun.security.ssl.AppOutputStream}}, both referenced
> from the {{receivers}} attribute within the {{ConnectionTable}} class. I got
> this information from the Eclipse Memory Analyzer plugin; the images are
> attached.
> Below are some OQLs that I was able to run within the plugin; it is weird
> that the collection of receivers is composed of 486.368 elements...
> {code}
> SELECT * FROM com.gemstone.gemfire.internal.tcp.ConnectionTable
> -> 1
> SELECT receivers.size FROM com.gemstone.gemfire.internal.tcp.ConnectionTable
> -> 486.368
> SELECT * FROM com.gemstone.gemfire.internal.tcp.Connection
> -> 487.758
> SELECT * FROM com.gemstone.gemfire.internal.tcp.Connection con WHERE con.stopped = true
> -> 486.461
> SELECT * FROM com.gemstone.gemfire.internal.tcp.Connection con WHERE con.stopped = false
> -> 1297
> {code}
> That said, nothing in the statistics (maybe there's something, but I can't
> find it...) seems to point to a spike in the number of entries within the
> regions or in the current number of connections, nor to anything else that
> would explain the continuous drop of the available heap over time
> (chart#freeMemory).
> The heap dump (approximately 20GB) and the statistics (don't have logs yet,
> but they might not be required by looking at the heap and the statistics)
> have been uploaded to [Google
> Drive|https://drive.google.com/drive/folders/0BxDMZZTfEL4WUFZjbjhLMXptbEk?usp=sharing].
> Just for the record, apparently we delivered 8.2.0.6 to them a year and a
> half ago as a fix to [GEM-94|https://jira-pivotal.atlassian.net/browse/GEM-94] /
> [GEODE-332|https://issues.apache.org/jira/browse/GEODE-332]; they've been
> running fine since then, until now. The last change in the
> {{ConnectionTable}} was done to fix these issues, so if there's actually a
> bug within the class, it will also exist on 8.2.5 (just a reminder to change
> the affected version field if required).
> The issue is not reproducible at will but happens in several of their
> environments, yet I haven't been able to reproduce it in my lab environment
> for now.
> Please let me know if you need anything else to proceed.
> Best regards.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
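
The two behavioural changes listed in the pull request (not adding a closed connection to the receivers collection, and making sure a connection closed by the idle timeout is also removed from it) can be illustrated with a minimal Java sketch. This is not the code from the pull request: `SketchConnection`, `SketchReceiverTable` and their methods are hypothetical stand-ins, not Geode's `Connection` or `ConnectionTable`, and guarding both operations with one lock is an assumption made here to keep the check and the add atomic.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a receiving connection (not Geode's Connection class).
final class SketchConnection {
    private volatile boolean stopped;

    boolean isStopped() {
        return stopped;
    }

    void close() {
        stopped = true;
    }
}

// Hypothetical stand-in for the receiver bookkeeping of a connection table.
final class SketchReceiverTable {
    private final Set<SketchConnection> receivers = new HashSet<>();

    // Registers the connection only if it is still open. A connection that was
    // closed before registration is rejected, so it cannot linger in the set.
    boolean addReceiver(SketchConnection conn) {
        synchronized (receivers) {
            if (conn.isStopped()) {
                return false;
            }
            receivers.add(conn);
            return true;
        }
    }

    // Called when a connection is shut down, for example by an idle timeout.
    // Closing and removing happen under the same lock used by addReceiver, so a
    // concurrent add either sees the stopped flag or is removed afterwards.
    void closeReceiver(SketchConnection conn) {
        synchronized (receivers) {
            conn.close();
            receivers.remove(conn);
        }
    }

    int receiverCount() {
        synchronized (receivers) {
            return receivers.size();
        }
    }
}

final class ReceiverTableSketchDemo {
    public static void main(String[] args) {
        SketchReceiverTable table = new SketchReceiverTable();
        SketchConnection live = new SketchConnection();
        SketchConnection alreadyClosed = new SketchConnection();
        alreadyClosed.close();

        System.out.println("live added: " + table.addReceiver(live));
        System.out.println("closed added: " + table.addReceiver(alreadyClosed));
        table.closeReceiver(live); // simulates the idle-timeout path
        System.out.println("receivers left: " + table.receiverCount());
    }
}
```

In a table built this way, a query such as the `con.stopped = true` OQL above would not be expected to find stopped connections still held by the receivers collection, since every path that stops a connection also removes it.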