[ 
https://issues.apache.org/jira/browse/GEODE-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Baynes updated GEODE-3286:
--------------------------------
    Description: 
Improperly handled closed connections in ConnectionTable can lead to 
{{OutOfMemoryError}}.


  was:
This bug tracks GemFire issue 1544 
(https://jira-pivotal.atlassian.net/browse/GEM-1544).


Hello team,

A customer (VMware) is experiencing several {{OutOfMemoryError}}s on production 
servers, and they believe there's a memory leak within GemFire.
Apparently 9.5GB of the heap is occupied by 487,828 instances of 
{{sun.security.ssl.SSLSocketImpl}}, and 7.7GB of the heap is occupied by 
487,804 instances of {{sun.security.ssl.AppOutputStream}}, both referenced from 
the {{receivers}} attribute within the {{ConnectionTable}} class. I got this 
information from the Eclipse Memory Analyzer plugin; the images are attached.
Below are some OQL queries that I was able to run within the plugin. It is odd 
that the collection of receivers is composed of 486,368 elements...

{code}
SELECT * FROM com.gemstone.gemfire.internal.tcp.ConnectionTable
        -> 1
SELECT receivers.size FROM com.gemstone.gemfire.internal.tcp.ConnectionTable
        -> 486,368
SELECT * FROM com.gemstone.gemfire.internal.tcp.Connection
        -> 487,758
SELECT * FROM com.gemstone.gemfire.internal.tcp.Connection con WHERE con.stopped = true
        -> 486,461
SELECT * FROM com.gemstone.gemfire.internal.tcp.Connection con WHERE con.stopped = false
        -> 1,297
{code}
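
As a rough illustration of what these counts suggest, here is a minimal Java 
sketch. It is hypothetical: the class and method names are made up and this is 
not the actual {{ConnectionTable}} code. It only shows how a receiver set that 
marks connections as stopped without removing them keeps every closed 
connection (and whatever it references) reachable.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration only; these are NOT the real
// com.gemstone.gemfire.internal.tcp classes.
final class ReceiverLeakSketch {

    /** Minimal stand-in for a receiver connection with a "stopped" flag. */
    static final class FakeConnection {
        private volatile boolean stopped;

        void close() {
            stopped = true;
        }

        boolean isStopped() {
            return stopped;
        }
    }

    /** Minimal stand-in for a table that tracks receiver connections. */
    static final class FakeConnectionTable {
        private final Set<FakeConnection> receivers = ConcurrentHashMap.newKeySet();

        /** Registers a new receiver connection and returns it. */
        FakeConnection accept() {
            FakeConnection c = new FakeConnection();
            receivers.add(c);
            return c;
        }

        /** Leaky path: the connection is stopped but stays in the set. */
        void closeLeaky(FakeConnection c) {
            c.close();
        }

        /** Clean path: the connection is stopped and deregistered. */
        void closeAndRemove(FakeConnection c) {
            c.close();
            receivers.remove(c);
        }

        int receiverCount() {
            return receivers.size();
        }

        long stoppedCount() {
            return receivers.stream().filter(FakeConnection::isStopped).count();
        }
    }

    /** Opens and closes n connections; returns {receiverCount, stoppedCount}. */
    static int[] simulate(int n, boolean leaky) {
        FakeConnectionTable table = new FakeConnectionTable();
        for (int i = 0; i < n; i++) {
            FakeConnection c = table.accept();
            if (leaky) {
                table.closeLeaky(c);
            } else {
                table.closeAndRemove(c);
            }
        }
        return new int[] {table.receiverCount(), (int) table.stoppedCount()};
    }

    public static void main(String[] args) {
        int[] leaky = simulate(1000, true);
        int[] clean = simulate(1000, false);
        System.out.println("leaky: receivers=" + leaky[0] + " stopped=" + leaky[1]);
        System.out.println("clean: receivers=" + clean[0] + " stopped=" + clean[1]);
    }
}
```

In the leaky variant the set grows with every connection that was ever 
accepted, which would match a heap where almost all retained connections are 
stopped. Whether this is what actually happens in GemFire still needs to be 
confirmed.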

That said, nothing in the statistics (maybe there's something, but I can't find 
it...) points to a spike in the number of entries within the regions or in the 
current number of connections, or to anything else that would explain the 
continuous drop in available heap over time (chart#freeMemory).
The heap dump (approximately 20GB) and the statistics have been uploaded to 
[Google 
Drive|https://drive.google.com/drive/folders/0BxDMZZTfEL4WUFZjbjhLMXptbEk?usp=sharing].
I don't have logs yet, but they might not be needed given the heap dump and the 
statistics.
Just for the record, apparently we delivered 8.2.0.6 to them a year and a half 
ago as a fix to [GEM-94|https://jira-pivotal.atlassian.net/browse/GEM-94] / 
[GEODE-332|https://issues.apache.org/jira/browse/GEODE-332], and they had been 
running fine since then until now. The last change in the {{ConnectionTable}} 
was done to fix these issues, so if there's actually a bug within the class, it 
will also exist on 8.2.5 (just a reminder to change the affected version field 
if required).
The issue is not reproducible at will but happens in several of their 
environments; so far I haven't been able to reproduce it in my lab environment.
Please let me know if you need anything else to proceed.
Best regards.



> Failing to cleanup connections from ConnectionTable receiver table
> ------------------------------------------------------------------
>
>                 Key: GEODE-3286
>                 URL: https://issues.apache.org/jira/browse/GEODE-3286
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Brian Rowe
>
> Improperly handled closed connections in ConnectionTable can lead to 
> {{OutOfMemoryError}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
