Hi,
Thanks for your quick response!
My test environment is without patch 22334, so priv->ssl_ctx is the shared one and there is no need to free it in socket_reset. But with patch 22334 applied it is per-connection, so it is absolutely necessary to free priv->ssl_ctx as well.
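For the patched case, a minimal sketch of the extra cleanup I have in mind (untested; assumes priv->ssl_ctx really is per-connection after patch 22334):

  /* sketch only: release the per-connection context together with the
   * SSL object; this must NOT be done when the context is the shared one */
  if (priv->use_ssl && priv->ssl_ctx) {
    SSL_CTX_free(priv->ssl_ctx); /* an SSL_CTX is freed with SSL_CTX_free() */
    priv->ssl_ctx = NULL;
  }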
The following log was captured on the glusterfsd side during connection establishment. When accept() returns, a new new_trans is allocated and passed to socket_init() as its input parameter. Because this new_trans carries no options yet, socket_init() does not call ssl_setup_connection_params(), so no allocation is done there. As the log below shows, ssl_setup_connection_params() is instead called from socket_event_handler(), and by then ssl_ctx has already been assigned to the shared one.
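Roughly, the check behaves like this (paraphrased from my reading, not the verbatim socket.c code):

  /* paraphrased: ssl_setup_connection_params() returns early when a
   * context already exists, so the accepted socket keeps the shared
   * (listener's) SSL_CTX and never allocates one of its own */
  if (priv->ssl_ctx != NULL) {
    gf_log(this->name, GF_LOG_TRACE, "found old SSL context!");
    return 0;
  }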
This issue is very easy to reproduce in my environment: just run

  while true; do gluster v heal <vol-name> info; done

and watch the memory of the corresponding glusterfsd; the increase is very obvious.

One more thing to confirm: there is no need to free ssl_sbio separately, right? SSL_free() will handle it, so calling SSL_free() alone is enough?
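For reference, my understanding of the BIO ownership (based on the OpenSSL docs; the BIO_new_socket() setup line is my assumption about how socket.c wires it up):

  priv->ssl_sbio = BIO_new_socket(priv->sock, BIO_NOCLOSE);
  /* SSL_set_bio() transfers ownership of the BIO to the SSL object */
  SSL_set_bio(priv->ssl_ssl, priv->ssl_sbio, priv->ssl_sbio);

  /* later, on teardown: */
  SSL_free(priv->ssl_ssl); /* also releases the attached BIO pair */
  priv->ssl_sbio = NULL;   /* dangling now; do NOT BIO_free() it again */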


[2019-04-21 05:02:17.820346] T [socket.c:2776:socket_server_event_handler] 
0-tcp.ccs-server: XXX server:192.168.1.13:53952, client:192.168.1.13:63683
[2019-04-21 05:02:17.820370] T [socket.c:2820:socket_server_event_handler] 
0-tcp.ccs-server: ### use non-blocking IO
[2019-04-21 05:02:17.820410] T [socket.c:2603:socket_event_handler] 
0-tcp.ccs-server: server (sock:126) in:1, out:0, err:0
[2019-04-21 05:02:17.820431] T [socket.c:2610:socket_event_handler] 
0-tcp.ccs-server: server (sock:126) socket is not connected, completing 
connection
[2019-04-21 05:02:17.820442] T [socket.c:3829:ssl_setup_connection_params] 
0-tcp.ccs-server: found old SSL context!   // context is shared between 
listening socket and accepted socket
[2019-04-21 05:02:17.820451] T [socket.c:310:ssl_setup_connection_prefix] 
0-tcp.ccs-server: + ssl_setup_connection_params() done!
[2019-04-21 05:02:17.822559] T [socket.c:2617:socket_event_handler] 
0-tcp.ccs-server: (sock:126) socket_complete_connection() returned 1
[2019-04-21 05:02:17.822585] T [socket.c:2621:socket_event_handler] 
0-tcp.ccs-server: (sock:126) returning to wait on socket
[2019-04-21 05:02:17.828455] T [socket.c:2603:socket_event_handler] 
0-tcp.ccs-server: server (sock:126) in:1, out:0, err:0
[2019-04-21 05:02:17.828483] T [socket.c:2610:socket_event_handler] 
0-tcp.ccs-server: server (sock:126) socket is not connected, completing 
connection
[2019-04-21 05:02:17.829130] D [socket.c:366:ssl_setup_connection_postfix] 
0-tcp.ccs-server: peer CN = example ee certificate
[2019-04-21 05:02:17.829157] D [socket.c:369:ssl_setup_connection_postfix] 
0-tcp.ccs-server: SSL verification succeeded (client: 192.168.1.13:63683) 
(server: 192.168.1.13:53952)
[2019-04-21 05:02:17.829171] T [socket.c:423:ssl_complete_connection] 
0-tcp.ccs-server: ssl_accepted!
[2019-04-21 05:02:17.829183] T [socket.c:2617:socket_event_handler] 
0-tcp.ccs-server: (sock:126) socket_complete_connection() returned 1
[2019-04-21 05:02:17.829192] T [socket.c:2621:socket_event_handler] 
0-tcp.ccs-server: (sock:126) returning to wait on socket
[2019-04-21 05:02:17.829261] T [socket.c:2603:socket_event_handler] 
0-tcp.ccs-server: server (sock:126) in:1, out:0, err:0
[2019-04-21 05:02:17.829282] T [socket.c:2628:socket_event_handler] 
0-tcp.ccs-server: Server socket (126) is already connected
[2019-04-21 05:02:17.829294] T [socket.c:493:__socket_ssl_readv] 
0-tcp.ccs-server: ***** reading over SSL
[2019-04-21 05:02:17.829311] T [socket.c:493:__socket_ssl_readv] 
0-tcp.ccs-server: ***** reading over SSL
[2019-04-21 05:02:17.829337] T [socket.c:493:__socket_ssl_readv] 
0-tcp.ccs-server: ***** reading over SSL


cynthia
From: Milind Changire <mchan...@redhat.com>
Sent: Monday, April 22, 2019 10:21 AM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com>
Cc: Atin Mukherjee <amukh...@redhat.com>; gluster-devel@gluster.org
Subject: Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl

After patch 22334 (https://review.gluster.org/c/glusterfs/+/22334), priv->ssl_ctx is now maintained per socket connection and is no longer shared.
So you might want to free priv->ssl_ctx as well (with SSL_CTX_free(), since it is an SSL_CTX) and set priv->ssl_ctx to NULL.

There might be some strings that are duplicated (gf_strdup()) via the 
socket_init() code path. Please take a look at those as well.
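Something along these lines in fini(), for instance (a sketch only; the field names ssl_own_cert and ssl_ca_list are my assumption about which strings get gf_strdup()'d on that path, and ssl_private_key is already freed there):

  /* sketch: release option strings duplicated during socket_init() /
   * ssl_setup_connection_params(); field names assumed */
  if (priv->ssl_own_cert) {
    GF_FREE(priv->ssl_own_cert);
    priv->ssl_own_cert = NULL;
  }
  if (priv->ssl_ca_list) {
    GF_FREE(priv->ssl_ca_list);
    priv->ssl_ca_list = NULL;
  }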

Sorry about that. I missed it.


On Mon, Apr 22, 2019 at 7:25 AM Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com> wrote:

Hi,
From my code study it seems priv->ssl_ssl is not properly released. I made a patch, and the glusterfsd memory leak is alleviated with it, but something else is still leaking; I have no clue yet about the other leak points.

--- a/rpc/rpc-transport/socket/src/socket.c
+++ b/rpc/rpc-transport/socket/src/socket.c
@@ -1019,7 +1019,16 @@ static void __socket_reset(rpc_transport_t *this) {
   memset(&priv->incoming, 0, sizeof(priv->incoming));
   event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx);
-
+  if (priv->use_ssl && priv->ssl_ssl)
+  {
+    gf_log(this->name, GF_LOG_TRACE,
+           "clear and reset for socket(%d), free ssl",
+           priv->sock);
+    SSL_shutdown(priv->ssl_ssl);
+    SSL_clear(priv->ssl_ssl);
+    SSL_free(priv->ssl_ssl);
+    priv->ssl_ssl = NULL;
+  }
   priv->sock = -1;
   priv->idx = -1;
   priv->connected = -1;
@@ -4238,6 +4250,16 @@ void fini(rpc_transport_t *this) {
     pthread_mutex_destroy(&priv->out_lock);
     pthread_mutex_destroy(&priv->cond_lock);
     pthread_cond_destroy(&priv->cond);
+    if (priv->use_ssl && priv->ssl_ssl)
+    {
+      gf_log(this->name, GF_LOG_TRACE,
+             "clear and reset for socket(%d), free ssl",
+             priv->sock);
+      SSL_shutdown(priv->ssl_ssl);
+      SSL_clear(priv->ssl_ssl);
+      SSL_free(priv->ssl_ssl);
+      priv->ssl_ssl = NULL;
+    }
     if (priv->ssl_private_key) {
       GF_FREE(priv->ssl_private_key);
     }
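(Side note on my own patch: SSL_clear() resets an SSL object for reuse, so calling it right before SSL_free() is probably redundant; SSL_shutdown() followed by SSL_free() should be enough. SSL_shutdown() may also return 0 on the first call because the bidirectional shutdown has not completed, but that should be harmless here since the object is freed immediately afterwards.)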


From: Zhou, Cynthia (NSB - CN/Hangzhou)
Sent: Thursday, April 18, 2019 5:31 PM
To: 'Atin Mukherjee' <amukh...@redhat.com>
Cc: 'Raghavendra Gowdappa' <rgowd...@redhat.com>; 'gluster-devel@gluster.org' <gluster-devel@gluster.org>
Subject: RE: [Gluster-devel] glusterfsd memory leak issue found after enable ssl

We scanned it with a memory-leak tool, which produced the prints below. We suspect some OpenSSL library allocations are not being properly freed by the glusterfs code.
er+0x2af [libglusterfs.so.0.0.1]\n\t\tstart_thread+0xda [libpthread-2.27.so]'
        13580 bytes in 175 allocations from stack
                b'CRYPTO_malloc+0x58 [libcrypto.so.1.0.2p]'
        232904 bytes in 14 allocations from stack
                b'CRYPTO_malloc+0x58 [libcrypto.so.1.0.2p]\n\t\t[unknown]'
[15:41:56] Top 10 stacks with outstanding allocations:
        8792 bytes in 14 allocations from stack
                b'CRYPTO_malloc+0x58 [libcrypto.so.1.0.2p]\n\t\t[unknown]'
        9408 bytes in 42 allocations from stack
                b'CRYPTO_realloc+0x4d [libcrypto.so.1.0.2p]'
        9723 bytes in 14 allocations from stack
                b'CRYPTO_malloc+0x58 [libcrypto.so.1.0.2p]\n\t\t[unknown]'
        10696 bytes in 21 allocations from stack
                b'CRYPTO_malloc+0x58 [libcrypto.so.1.0.2p]\n\t\t[unknown]'
        11319 bytes in 602 allocations from stack
                b'CRYPTO_malloc+0x58 [libcrypto.so.1.0.2p]\n\t\t[unknown]'
        11431 bytes in 518 allocations from stack
                b'CRYPTO_malloc+0x58 [libcrypto.so.1.0.2p]\n\t\t[unknown]'
        11704 bytes in 371 allocations from stack
                b'CRYPTO_malloc+0x58 [libcrypto.so.1.0.2p]\n\t\t[unknown]'


cynthia
From: Zhou, Cynthia (NSB - CN/Hangzhou)
Sent: Thursday, April 18, 2019 5:25 PM
To: 'Atin Mukherjee' <amukh...@redhat.com>
Cc: Raghavendra Gowdappa <rgowd...@redhat.com>; gluster-devel@gluster.org
Subject: RE: [Gluster-devel] glusterfsd memory leak issue found after enable ssl

I've tested glusterfs 3.12.15 and glusterfs 5.5; both have this issue. After enabling the TLS/SSL socket, executing "gluster v heal <vol-name> info" triggers glfsheal to connect to the glusterfsd process and causes the glusterfsd process to leak memory. Could you please try it in your env?

cynthia


From: Atin Mukherjee <amukh...@redhat.com>
Sent: Thursday, April 18, 2019 1:19 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com>
Cc: Raghavendra Gowdappa <rgowd...@redhat.com>; gluster-devel@gluster.org
Subject: Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl



On Wed, 17 Apr 2019 at 10:53, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com> wrote:
Hi,
In my recent test, I found that there is a very severe glusterfsd memory leak when the socket ssl option is enabled.

What gluster version are you testing? Would you be able to continue your 
investigation and share the root cause?

--
- Atin (atinm)


--
Milind
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel
