On 12/03/2015 02:06 PM, Jakub Hrozek wrote:
On Mon, Nov 30, 2015 at 06:45:00PM +0100, Pavel Reichl wrote:


On 11/30/2015 06:02 PM, Jakub Hrozek wrote:

Wouldn't it then be better to see if another same object is already in
the hashtable and free it before replacing?

I agree it would be best. I tried that before and failed because I could not 
decipher out the relation of talloc contexts.

I tried that again. Seems that leaks are gone. Segfaults were not happening 
during my testing.

Code got even messier :-(

I think the code would look nice if it was placed in the else branch of
create_negcache_netgr() :-)

Done, thanks for hint.


I also don't think we need the tmp_ctx change..

I dare to disagree here. With prev. versions of patch I was experiencing
segfaults. IIRC step_context is being allocated on context of the netgroup
that is freed. I also think that it's not a good idea to allocate local data
on context that does not hold any reference to that data. But it might be
matter of personal taste.

What kind of segfaults, what was the backtrace? I think it might be good to
add talloc_zfree() instead of talloc_free() or explicitly NULL the pointer,
that should solve the issue without adding a new context...

Since it's per-domain, I wonder if step_ctx->dctx might be better
candidate than step_ctx either way.


Hello Jakub, I amended the patch as you proposed. You can now experience the 
segfault yourself - just query a non-existing netgroup.
I attached the backtrace as well.

I can investiage and use talloc reporting if you wish. But still using tmp_ctx 
seems as way of the least effort...
>From 0160ee92c4d9a9542abfd3e918686526a52e367d Mon Sep 17 00:00:00 2001
From: Pavel Reichl <prei...@redhat.com>
Date: Fri, 27 Nov 2015 07:53:00 -0500
Subject: [PATCH] NSS: Fix memory leak netgroup

If netgroup cannot be found in setnetgrent_retry() function
set_netgroup_entry() is called which steals getent_ctx directly to
nss_ctx->netgroups.
Subsequently function lookup_netgr_step() is called that (in case of
nenexisting group) will call create_negcache_netgr() which creates
a new dummy object to serve as negative cache. While doing so it calls
again set_netgroup_entry() for the same netgroup and it calls
hash_enter.

hash_enter will remove previously hashed entry for netgroup (created in
setnetgrent_retry()) from hash table but it won't be freed and thus it
leaks. This patch marks such netgroup and freed it after the call of
create_negcache_netgr().

Resolves:
https://fedorahosted.org/sssd/ticket/2865
---
 src/responder/nss/nsssrv_netgroup.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/src/responder/nss/nsssrv_netgroup.c b/src/responder/nss/nsssrv_netgroup.c
index 9a78c1119c2f4e06e43ebec29ace775adc997e08..8b1f807615ee352d60740e8852c4b0f0a6c91b73 100644
--- a/src/responder/nss/nsssrv_netgroup.c
+++ b/src/responder/nss/nsssrv_netgroup.c
@@ -434,6 +434,7 @@ static errno_t create_negcache_netgr(struct setent_step_ctx *step_ctx)
 {
     errno_t ret;
     struct getent_ctx *netgr;
+    struct getent_ctx *netgr_to_be_freed = NULL;
 
     netgr = talloc_zero(step_ctx->nctx, struct getent_ctx);
     if (netgr == NULL) {
@@ -441,6 +442,13 @@ static errno_t create_negcache_netgr(struct setent_step_ctx *step_ctx)
         ret = ENOMEM;
         goto done;
     } else {
+        /* Is there already netgroup with such name? */
+        ret = get_netgroup_entry(step_ctx->nctx, step_ctx->name,
+                                 &netgr_to_be_freed);
+        if (ret != EOK) {
+            netgr_to_be_freed = NULL;
+        }
+
         netgr->ready = true;
         netgr->found = false;
         netgr->entries = NULL;
@@ -461,6 +469,11 @@ static errno_t create_negcache_netgr(struct setent_step_ctx *step_ctx)
     }
 
 done:
+    /* Free netgroup after step_ctx is not needed. */
+    if (netgr_to_be_freed) {
+        talloc_zfree(netgr_to_be_freed);
+    }
+
     if (ret != EOK) {
         talloc_free(netgr);
     }
@@ -494,7 +507,8 @@ static errno_t lookup_netgr_step(struct setent_step_ctx *step_ctx)
         /* make sure to update the dctx if we changed domain */
         step_ctx->dctx->domain = dom;
 
-        talloc_free(name);
+        talloc_zfree(name);
+
         name = sss_get_cased_name(step_ctx, step_ctx->name,
                                   dom->case_sensitive);
         if (!name) {
@@ -623,10 +637,11 @@ static errno_t lookup_netgr_step(struct setent_step_ctx *step_ctx)
               "create_negcache_netgr failed with: %d:[%s], ignored.\n",
               ret, sss_strerror(ret));
     }
+
     ret = ENOENT;
 
 done:
-    talloc_free(name);
+    talloc_zfree(name);
     return ret;
 }
 
-- 
2.4.3

Continuing.

Program received signal SIGABRT, Aborted.
0x00007fa8b469e9c8 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:55
#0  0x00007fa8b469e9c8 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:55
        resultvar = 0
        pid = 26698
        selftid = 26698
#1  0x00007fa8b46a065a in __GI_abort () at abort.c:89
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0x7fa8b4e422c8, sa_sigaction 
= 0x7fa8b4e422c8}, sa_mask = {__val = {4398408, 23106640, 140362630798272, 
23133120, 140362632603712, 4397651, 23133216, 4397816, 
              23133120, 23101360, 0, 0, 140362558853104, 140362561768312, 0, 
23087856}}, sa_flags = 23087856, sa_restorer = 0x7fa8b8db2840}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x00007fa8b4e36e9c in talloc_abort (reason=0x7fa8b4e422c8 "Bad talloc magic 
value - access after free") at ../talloc.c:340
No locals.
#3  0x00007fa8b4e36778 in talloc_abort_access_after_free () at ../talloc.c:359
No locals.
#4  talloc_chunk_from_ptr (ptr=0x160ebf0) at ../talloc.c:380
        ptr = 0x160ebf0
        pp = 0x160ebf0 "caa1"
        tc = 0x160eb90
#5  _talloc_free (ptr=0x160ebf0, location=0x431d48 
"../src/responder/nss/nsssrv_netgroup.c:644") at ../talloc.c:1572
No locals.
#6  0x00000000004182f6 in lookup_netgr_step (step_ctx=0x16060a0) at 
../src/responder/nss/nsssrv_netgroup.c:644
        ret = 2
        dom = 0x0
        netgr = 0x7ffffab0d170
        name = 0x160ebf0 "caa1"
        lifetime = 32767
        __FUNCTION__ = "lookup_netgr_step"
#7  0x000000000041843e in lookup_netgr_dp_callback (err_maj=0, err_min=0, 
err_msg=0x160a2c0 "Success", ptr=0x16060a0) at 
../src/responder/nss/nsssrv_netgroup.c:671
        step_ctx = 0x16060a0
        dctx = 0x1612c50
        cmdctx = 0x1604750
        ret = -89076816
        __FUNCTION__ = "lookup_netgr_dp_callback"
#8  0x0000000000409178 in nsssrv_dp_send_acct_req_done (req=0x0) at 
../src/responder/nss/nsssrv_cmd.c:885
        cb_ctx = 0x1610c50
        ret = 0
        err_maj = 0
        err_min = 0
        err_msg = 0x160a2c0 "Success"
        __FUNCTION__ = "nsssrv_dp_send_acct_req_done"
#9  0x0000000000428c4b in sss_dp_internal_get_done (pending=0x16132e0, 
ptr=0x1610790) at ../src/responder/common/responder_dp.c:802
        ret = 0
        req = 0x1610790
        sdp_req = 0x16109a0
        cb = 0x1610bc0
        state = 0x1610920
        cb_state = 0x1613990
        __FUNCTION__ = "sss_dp_internal_get_done"
#10 0x00007fa8b80e5922 in complete_pending_call_and_unlock 
(connection=connection@entry=0x16030d0, pending=0x16132e0, 
message=message@entry=0x160b750) at dbus-connection.c:2331
No locals.
#11 0x00007fa8b80e8dd1 in dbus_connection_dispatch (connection=0x16030d0) at 
dbus-connection.c:4626
        message = 0x160b750
        link = <optimized out>
        reply_serial = <optimized out>
        filter_list_copy = 0x16030d0
        status = <optimized out>
        found_object = 0
        message_link = 0x1609450
        result = DBUS_HANDLER_RESULT_NOT_YET_HANDLED
        pending = <optimized out>
        connection = 0x16030d0
        status = <optimized out>
#12 0x00007fa8b87a121d in sbus_dispatch (ev=0x15f74a0, te=0x1601290, tv=..., 
data=0x1603430) at ../src/sbus/sssd_dbus_connection.c:96
        new_event = 0x15f74a0
        conn = 0x1603430
        dbus_conn = 0x16030d0
        ret = 0
        __FUNCTION__ = "sbus_dispatch"
#13 0x00007fa8b504e1dd in tevent_common_loop_timer_delay 
(ev=ev@entry=0x15f74a0) at ../tevent_timed.c:341
        current_time = {tv_sec = 1449152547, tv_usec = 812745}
        te = 0x1601290
#14 0x00007fa8b504f20a in epoll_event_loop_once (ev=0x15f74a0, 
location=<optimized out>) at ../tevent_epoll.c:911
        epoll_ev = 0x15f76e0
        tval = {tv_sec = 57, tv_usec = 999996}
        panic_triggered = false
#15 0x00007fa8b504d907 in std_event_loop_once (ev=0x15f74a0, 
location=0x7fa8b87ce602 "../src/util/server.c:673") at ../tevent_standard.c:114
        glue_ptr = <optimized out>
        glue = 0x15f7590
        ret = <optimized out>
#16 0x00007fa8b504a11d in _tevent_loop_once (ev=ev@entry=0x15f74a0, 
location=location@entry=0x7fa8b87ce602 "../src/util/server.c:673") at 
../tevent.c:533
        ret = <optimized out>
        nesting_stack_ptr = 0x0
#17 0x00007fa8b504a2bb in tevent_common_loop_wait (ev=0x15f74a0, 
location=0x7fa8b87ce602 "../src/util/server.c:673") at ../tevent.c:637
        ret = <optimized out>
#18 0x00007fa8b504d8a7 in std_event_loop_wait (ev=0x15f74a0, 
location=0x7fa8b87ce602 "../src/util/server.c:673") at ../tevent_standard.c:140
        glue_ptr = <optimized out>
        glue = 0x15f7590
        ret = <optimized out>
#19 0x00007fa8b87af361 in server_loop (main_ctx=0x15f88f0) at 
../src/util/server.c:673
No locals.
#20 0x0000000000406ac3 in main (argc=6, argv=0x7ffffab0d178) at 
../src/responder/nss/nsssrv.c:626
        opt = -1
        pc = 0x15f6060
        main_ctx = 0x15f88f0
        ret = 0
        uid = 0
        gid = 0
        long_options = {{longName = 0x0, shortName = 0 '\000', argInfo = 4, arg 
= 0x63dac0 <poptHelpOptions@@LIBPOPT_0>, val = 0, descrip = 0x42dfaa "Help 
options:", argDescrip = 0x0}, {
            longName = 0x42dfb8 "debug-level", shortName = 100 'd', argInfo = 
2, arg = 0x63db80 <debug_level>, val = 0, descrip = 0x42dfc4 "Debug level", 
argDescrip = 0x0}, {
            longName = 0x42dfd0 "debug-to-files", shortName = 102 'f', argInfo 
= 0, arg = 0x63da90 <debug_to_file>, val = 0, descrip = 0x42dfe0 "Send the 
debug output to files instead of stderr", 
            argDescrip = 0x0}, {longName = 0x42e011 "debug-to-stderr", 
shortName = 0 '\000', argInfo = 1073741824, arg = 0x63dbb0 <debug_to_stderr>, 
val = 0, 
            descrip = 0x42e028 "Send the debug output to stderr directly.", 
argDescrip = 0x0}, {longName = 0x42e052 "debug-timestamps", shortName = 0 
'\000', argInfo = 2, arg = 0x63da80 <debug_timestamps>, 
            val = 0, descrip = 0x42e063 "Add debug timestamps", argDescrip = 
0x0}, {longName = 0x42e078 "debug-microseconds", shortName = 0 '\000', argInfo 
= 2, arg = 0x63dba8 <debug_microseconds>, val = 0, 
            descrip = 0x42e090 "Show timestamps with microseconds", argDescrip 
= 0x0}, {longName = 0x42e0b2 "uid", shortName = 0 '\000', argInfo = 2, arg = 
0x7ffffab0d05c, val = 0, 
            descrip = 0x42e0b8 "The user ID to run the server as", argDescrip = 
0x0}, {longName = 0x42e0d9 "gid", shortName = 0 '\000', argInfo = 2, arg = 
0x7ffffab0d058, val = 0, 
            descrip = 0x42e0e0 "The group ID to run the server as", argDescrip 
= 0x0}, {longName = 0x0, shortName = 0 '\000', argInfo = 0, arg = 0x0, val = 0, 
descrip = 0x0, argDescrip = 0x0}}
        __FUNCTION__ = "main"
Detaching from program: /usr/libexec/sssd/sssd_nss, process 26698
_______________________________________________
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org

Reply via email to