Looks like using BIO_free() is not right. Here's what the SSL_set_bio() man page says ...
SSL_set_bio() is similar to SSL_set0_rbio() and SSL_set0_wbio() except that it connects both the rbio and the wbio at the same time, and *transfers the ownership of rbio and wbio to ssl* according to the following set of rules: So, I think you were right about SSL_free() doing the job for the bio. However, SSL_free() has no reason to set the priv->ssl_sbio pointer to NULL. I think priv->ssl_sbio should be set to NULL immediately after the call to SSL_set_bio() is successful. And we need to add a comment while setting priv->ssl_sbio to NULL that the ownership of the bio has now been transferred to SSL and SSL will free the related memory appropriately. On Mon, Apr 22, 2019 at 11:37 AM Zhou, Cynthia (NSB - CN/Hangzhou) < cynthia.z...@nokia-sbell.com> wrote: > I tried to print priv->ssl_sbio after SSL_free() find this pointer is not > null, so I add free ssl_sbio with BIO_free, however this cause glusterfd > coredump > > (gdb) bt > > #0 0x00007f3047867f9b in raise () from /lib64/libc.so.6 > > #1 0x00007f3047869351 in abort () from /lib64/libc.so.6 > > #2 0x00007f30478aa8c7 in __libc_message () from /lib64/libc.so.6 > > #3 0x00007f30478b0e6a in malloc_printerr () from /lib64/libc.so.6 > > #4 0x00007f30478b2835 in _int_free () from /lib64/libc.so.6 > > #5 0x00007f3047c5bbbd in CRYPTO_free () from /lib64/libcrypto.so.10 > > #6 0x00007f3047d07582 in BIO_free () from /lib64/libcrypto.so.10 > > #7 0x00007f3043f9ba4b in __socket_reset (this=0x7f303c1ae710) at > socket.c:1032 > > #8 0x00007f3043f9c4aa in socket_event_poll_err (this=0x7f303c1ae710, > gen=1, idx=17) at socket.c:1232 > > #9 0x00007f3043fa1b7d in socket_event_handler (fd=26, idx=17, gen=1, > data=0x7f303c1ae710, poll_in=1, poll_out=0, poll_err=0) at socket.c:2669 > > #10 0x00007f3049307984 in event_dispatch_epoll_handler > (event_pool=0x1035610, event=0x7f3043b14e84) at event-epoll.c:587 > > #11 0x00007f3049307c5b in event_dispatch_epoll_worker (data=0x107e3e0) at > event-epoll.c:663 > > #12 0x00007f30480535da in start_thread () from /lib64/libpthread.so.0 > > #13 0x00007f3047929eaf in clone () from /lib64/libc.so.6 > > > > @@ -1019,7 +1019,20 @@ static void __socket_reset(rpc_transport_t *this) { > > memset(&priv->incoming, 0, sizeof(priv->incoming)); > > event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx); > > - > > + if(priv->use_ssl&& priv->ssl_ssl) > > + { > > + gf_log(this->name, GF_LOG_INFO, > > + "clear and reset for socket(%d), free ssl ", > > + priv->sock); > > + SSL_shutdown(priv->ssl_ssl); > > + SSL_clear(priv->ssl_ssl); > > + SSL_free(priv->ssl_ssl); > > + gf_log(this->name, GF_LOG_INFO,"priv->ssl_sbio of socket(%d)is %p > ",priv->sock,priv->ssl_sbio); > > + if(priv->ssl_sbio != NULL) > > + BIO_free(priv->ssl_sbio); > > + priv->ssl_ssl = NULL; > > + priv->ssl_sbio = NULL; > > + } > > priv->sock = -1; > > priv->idx = -1; > > priv->connected = -1; > > @@ -4238,6 +4251,20 @@ void fini(rpc_transport_t *this) { > > pthread_mutex_destroy(&priv->out_lock); > > pthread_mutex_destroy(&priv->cond_lock); > > pthread_cond_destroy(&priv->cond); > > + if(priv->use_ssl&& priv->ssl_ssl) > > + { > > + gf_log(this->name, GF_LOG_INFO, > > + "clear and reset for socket(%d), free ssl ", > > + priv->sock); > > + SSL_shutdown(priv->ssl_ssl); > > + SSL_clear(priv->ssl_ssl); > > + SSL_free(priv->ssl_ssl); > > + gf_log(this->name, GF_LOG_INFO,"priv->ssl_sbio of socket(%d)is %p > ",priv->sock,priv->ssl_sbio); > > + if(priv->ssl_sbio != NULL) > > + BIO_free(priv->ssl_sbio); > > + priv->ssl_ssl = NULL; > > + priv->ssl_sbio = NULL; > > + } > > if (priv->ssl_private_key) { > > GF_FREE(priv->ssl_private_key); > > > > > > *From:* Milind Changire <mchan...@redhat.com> > *Sent:* Monday, April 22, 2019 1:35 PM > *To:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com> > *Cc:* Atin Mukherjee <amukh...@redhat.com>; gluster-devel@gluster.org > *Subject:* Re: [Gluster-devel] glusterfsd memory leak issue found after > enable ssl > > > > This probably went unnoticed until now. > > > > > > > > On Mon, Apr 22, 2019 at 10:45 AM Zhou, Cynthia (NSB - CN/Hangzhou) < > cynthia.z...@nokia-sbell.com> wrote: > > Why there is no bio_free called in ssl_teardown_connection then? > > > > cynthia > > > > *From:* Milind Changire <mchan...@redhat.com> > *Sent:* Monday, April 22, 2019 10:21 AM > *To:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com> > *Cc:* Atin Mukherjee <amukh...@redhat.com>; gluster-devel@gluster.org > *Subject:* Re: [Gluster-devel] glusterfsd memory leak issue found after > enable ssl > > > > After patch 22334 <https://review.gluster.org/c/glusterfs/+/22334>, the > priv->ssl_ctx is now maintained per socket connection and is no longer > shared. > > So you might want to SSL_free(priv->ssl_ctx) as well and set priv->ssl_ctx > to NULL. > > > > There might be some strings that are duplicated (gf_strdup()) via the > socket_init() code path. Please take a look at those as well. > > > > Sorry about that. I missed it. > > > > > > On Mon, Apr 22, 2019 at 7:25 AM Zhou, Cynthia (NSB - CN/Hangzhou) < > cynthia.z...@nokia-sbell.com> wrote: > > > > Hi, > > From my code study it seems priv->ssl_ssl is not properly released, I made > a patch and the glusterfsd memory leak is alleviated with my patch, but > some otherwhere is still leaking, I have no clue about the other leak > points. > > > > --- a/rpc/rpc-transport/socket/src/socket.c > > +++ b/rpc/rpc-transport/socket/src/socket.c > > @@ -1019,7 +1019,16 @@ static void __socket_reset(rpc_transport_t *this) { > > memset(&priv->incoming, 0, sizeof(priv->incoming)); > > event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx); > > - > > + if(priv->use_ssl&& priv->ssl_ssl) > > + { > > + gf_log(this->name, GF_LOG_TRACE, > > + "clear and reset for socket(%d), free ssl ", > > + priv->sock); > > + SSL_shutdown(priv->ssl_ssl); > > + SSL_clear(priv->ssl_ssl); > > + SSL_free(priv->ssl_ssl); > > + priv->ssl_ssl = NULL; > > + } > > priv->sock = -1; > > priv->idx = -1; > > priv->connected = -1; > > @@ -4238,6 +4250,16 @@ void fini(rpc_transport_t *this) { > > pthread_mutex_destroy(&priv->out_lock); > > pthread_mutex_destroy(&priv->cond_lock); > > pthread_cond_destroy(&priv->cond); > > + if(priv->use_ssl&& priv->ssl_ssl) > > + { > > + gf_log(this->name, GF_LOG_TRACE, > > + "clear and reset for socket(%d), free ssl ", > > + priv->sock); > > + SSL_shutdown(priv->ssl_ssl); > > + SSL_clear(priv->ssl_ssl); > > + SSL_free(priv->ssl_ssl); > > + priv->ssl_ssl = NULL; > > + } > > if (priv->ssl_private_key) { > > GF_FREE(priv->ssl_private_key); > > } > > > > > > *From:* Zhou, Cynthia (NSB - CN/Hangzhou) > *Sent:* Thursday, April 18, 2019 5:31 PM > *To:* 'Atin Mukherjee' <amukh...@redhat.com> > *Cc:* 'Raghavendra Gowdappa' <rgowd...@redhat.com>; ' > gluster-devel@gluster.org' <gluster-devel@gluster.org> > *Subject:* RE: [Gluster-devel] glusterfsd memory leak issue found after > enable ssl > > > > We scan it use memory-leak tool, there are following prints. We doubt some > open ssl lib malloc is is not properly freed by glusterfs code. > > er+0x2af [*libglusterfs.so.0.0.1]\n\t\tstart_thread+0xda* [ > libpthread-2.27.so]' > *13580* bytes in 175 allocations from stack > b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p*]' > *232904* bytes in 14 allocations from stack > b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown* > ]' > [15:41:56] Top 10 stacks with outstanding allocations: > *8792* bytes in 14 allocations from stack > b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown* > ]' > *9408* bytes in 42 allocations from stack > b'CRYPTO_realloc+0x4d [*libcrypto.so.1.0.2p*]' > *9723* bytes in 14 allocations from stack > b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown* > ]' > *10696* bytes in 21 allocations from stack > b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown* > ]' > *11319* bytes in 602 allocations from stack > b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown* > ]' > *11431* bytes in 518 allocations from stack > b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown* > ]' > *11704* bytes in 371 allocations from stack > b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown* > ]' > > > > > > cynthia > > *From:* Zhou, Cynthia (NSB - CN/Hangzhou) > *Sent:* Thursday, April 18, 2019 5:25 PM > *To:* 'Atin Mukherjee' <amukh...@redhat.com> > *Cc:* Raghavendra Gowdappa <rgowd...@redhat.com>; > gluster-devel@gluster.org > *Subject:* RE: [Gluster-devel] glusterfsd memory leak issue found after > enable ssl > > > > I’ve test on glusterfs3.12.15 and glusterfs5.5 all have this issue, after > enable tls ssl socket, when execute gluster v heal <vol-name> info, will > trigger glfshel to connect glusterfsd process, and cause glusterfsd process > memory leak. Could you please try in your env? > > > > cynthia > > > > > > *From:* Atin Mukherjee <amukh...@redhat.com> > *Sent:* Thursday, April 18, 2019 1:19 PM > *To:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com> > *Cc:* Raghavendra Gowdappa <rgowd...@redhat.com>; > gluster-devel@gluster.org > *Subject:* Re: [Gluster-devel] glusterfsd memory leak issue found after > enable ssl > > > > > > > > On Wed, 17 Apr 2019 at 10:53, Zhou, Cynthia (NSB - CN/Hangzhou) < > cynthia.z...@nokia-sbell.com> wrote: > > Hi, > > In my recent test, I found that there are very severe glusterfsd memory > leak when enable socket ssl option > > > > What gluster version are you testing? Would you be able to continue your > investigation and share the root cause? > > > > -- > > - Atin (atinm) > > _______________________________________________ > Gluster-devel mailing list > Gluster-devel@gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > -- > > Milind > > > > -- > > Milind > -- Milind
_______________________________________________ Gluster-devel mailing list Gluster-devel@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-devel