On 08/05/2013 09:30 PM, Simo Sorce wrote: > On Mon, 2013-08-05 at 21:13 +0200, Michal Židek wrote: >> This is very ugly hotfix for ticket: >> https://fedorahosted.org/sssd/ticket/2018 >> >> So far we were not able to find out why the slot or name_ptr >> values were corrupted, but this should at least prevent segfaults. > > NACK, (but close) > > Please do not add all those comments about removing checks, those checks > are good to stay forever, the cache may be corrupted by a bad disk > sector or whatever, so they should never be removed.
Ok. I left just one FIXME (actually two, but related) label in the case where this patch relies on the same offset values of strs in both group and passwd data structure. I do not expect these structures to be changed, but it is something we want to have at least noticed in the code. > > Also please do not just return ENOENT, and pretend nothing happened. > If the cache is corrupted we want to take immediate corrective action. > I suggest we return a new SSSD error and in the topmost nsscache caller > we invalidate the current cache and reinit. Actually I started doing it like this, but handling the problem in the topmost caller is not very good idea here. When I was writing it I found several topmost callers and even forcing the error to bubble up to these callers required some api changes. In addition we would have to make sure we handle the corrupted memory cache case in the future, when we use the affected functions in other topmost callers (it is not so easy to track). So I decided to handle it on the place where we detect the error. > > Optional (may be should be a separate ticket) > We currently reinit by closing the file and creating a new one. We > should not do that, we should rather reset the header and then just zero > all the data and reinitialize all the various areas. The reason is that > if we have some bad bug and we keep creating new files we might run out > of space because clients may have old files open and mmapped, and unless > they perform a new operations, they may not release them. Note we should > *not* use ftruncate() here, just write the appropriate initialization > values everywhere needed. I agree, but sss_mmap_cache_reinit is completely unsuitable for this case and a huge overkill, even if we might change it in the future. I created a new function sss_mmap_cache_reset which is very lightweight. I will not create it as a separate ticket since I do not want to use the reinit function even temporary for this case. New patch is attached. Thanks Michal
>From 81ba383a8cde65e87bb71a67a3f21d9446a4faec Mon Sep 17 00:00:00 2001 From: Michal Zidek <mzi...@redhat.com> Date: Mon, 5 Aug 2013 20:59:33 +0200 Subject: [PATCH] mmap_cache: Check if slot and name_ptr are not invalid. This patch prevents jumping outside of allocated memory in case of corrupted slot or name_ptr values. It is not proper solution, just hotfix until we find out what is the root cause of ticket https://fedorahosted.org/sssd/ticket/2018 --- src/responder/nss/nsssrv_mmap_cache.c | 54 +++++++++++++++++++++++++++++++++-- src/responder/nss/nsssrv_mmap_cache.h | 2 ++ src/sss_client/nss_mc_group.c | 8 ++++++ src/sss_client/nss_mc_passwd.c | 8 ++++++ src/util/mmap_cache.h | 3 ++ 5 files changed, 73 insertions(+), 2 deletions(-) diff --git a/src/responder/nss/nsssrv_mmap_cache.c b/src/responder/nss/nsssrv_mmap_cache.c index 49878fc..cd5a643 100644 --- a/src/responder/nss/nsssrv_mmap_cache.c +++ b/src/responder/nss/nsssrv_mmap_cache.c @@ -373,8 +373,23 @@ static struct sss_mc_rec *sss_mc_find_record(struct sss_mc_ctx *mcc, } while (slot != MC_INVALID_VAL) { + if (slot > MC_SIZE_TO_SLOTS(mcc->dt_size)) { + DEBUG(SSSDBG_FATAL_FAILURE, + ("Corrupted fastcache. Slot number too big.\n")); + sss_mmap_cache_reset(mcc); + return NULL; + } + rec = MC_SLOT_TO_PTR(mcc->data_table, slot, struct sss_mc_rec); name_ptr = *((rel_ptr_t *)rec->data); + /* FIXME: This check relies on fact that offset of member strs + * is the same in structures sss_mc_pwd_data and sss_mc_group_data. */ + if (name_ptr != offsetof(struct sss_mc_pwd_data, strs)) { + DEBUG(SSSDBG_FATAL_FAILURE, + ("Corrupted fastcache. name_ptr value is %u.\n", name_ptr)); + sss_mmap_cache_reset(mcc); + return NULL; + } t_key = (char *)rec->data + name_ptr; if (strcmp(key->str, t_key) == 0) { @@ -608,6 +623,13 @@ errno_t sss_mmap_cache_pw_invalidate_uid(struct sss_mc_ctx *mcc, uid_t uid) } while (slot != MC_INVALID_VAL) { + if (slot > MC_SIZE_TO_SLOTS(mcc->dt_size)) { + DEBUG(SSSDBG_FATAL_FAILURE, ("Corrupted fastcache.\n")); + sss_mmap_cache_reset(mcc); + ret = ENOENT; + goto done; + } + rec = MC_SLOT_TO_PTR(mcc->data_table, slot, struct sss_mc_rec); data = (struct sss_mc_pwd_data *)(&rec->data); @@ -739,6 +761,13 @@ errno_t sss_mmap_cache_gr_invalidate_gid(struct sss_mc_ctx *mcc, gid_t gid) } while (slot != MC_INVALID_VAL) { + if (slot > MC_SIZE_TO_SLOTS(mcc->dt_size)) { + DEBUG(SSSDBG_FATAL_FAILURE, ("Corrupted fastcache.\n")); + sss_mmap_cache_reset(mcc); + ret = ENOENT; + goto done; + } + rec = MC_SLOT_TO_PTR(mcc->data_table, slot, struct sss_mc_rec); data = (struct sss_mc_grp_data *)(&rec->data); @@ -889,8 +918,9 @@ static void sss_mc_header_update(struct sss_mc_ctx *mc_ctx, int status) /* update header using barriers */ h = (struct sss_mc_header *)mc_ctx->mmap_base; MC_RAISE_BARRIER(h); - if (status != SSS_MC_HEADER_RECYCLED) { - /* no reason to update anything else if the file is recycled */ + if (status == SSS_MC_HEADER_ALIVE) { + /* no reason to update anything else if the file is recycled or + * right before reset */ h->hash_table = MC_PTR_DIFF(mc_ctx->hash_table, mc_ctx->mmap_base); h->free_table = MC_PTR_DIFF(mc_ctx->free_table, mc_ctx->mmap_base); h->data_table = MC_PTR_DIFF(mc_ctx->data_table, mc_ctx->mmap_base); @@ -1113,3 +1143,23 @@ done: talloc_free(tmp_ctx); return ret; } + +/* Erase all contents of the mmap cache. This will bring the cache + * to the same state as if it was just initialized. */ +void sss_mmap_cache_reset(struct sss_mc_ctx *mc_ctx) +{ + if (mc_ctx == NULL) { + DEBUG(SSSDBG_TRACE_FUNC, + ("Fastcache not initialized. Nothing to do.\n")); + return; + } + + sss_mc_header_update(mc_ctx, SSS_MC_HEADER_UNINIT); + + /* Reset the mmaped area */ + memset(mc_ctx->data_table, 0xff, mc_ctx->dt_size); + memset(mc_ctx->free_table, 0x00, mc_ctx->ft_size); + memset(mc_ctx->hash_table, 0xff, mc_ctx->ht_size); + + sss_mc_header_update(mc_ctx, SSS_MC_HEADER_ALIVE); +} diff --git a/src/responder/nss/nsssrv_mmap_cache.h b/src/responder/nss/nsssrv_mmap_cache.h index 25cec40..fdeaa09 100644 --- a/src/responder/nss/nsssrv_mmap_cache.h +++ b/src/responder/nss/nsssrv_mmap_cache.h @@ -63,4 +63,6 @@ errno_t sss_mmap_cache_gr_invalidate_gid(struct sss_mc_ctx *mcc, gid_t gid); errno_t sss_mmap_cache_reinit(TALLOC_CTX *mem_ctx, size_t n_elem, time_t timeout, struct sss_mc_ctx **mc_ctx); +void sss_mmap_cache_reset(struct sss_mc_ctx *mc_ctx); + #endif /* _NSSSRV_MMAP_CACHE_H_ */ diff --git a/src/sss_client/nss_mc_group.c b/src/sss_client/nss_mc_group.c index b3e9a8a..2d69be9 100644 --- a/src/sss_client/nss_mc_group.c +++ b/src/sss_client/nss_mc_group.c @@ -116,6 +116,10 @@ errno_t sss_nss_mc_getgrnam(const char *name, size_t name_len, } while (slot != MC_INVALID_VAL) { + if (slot > MC_SIZE_TO_SLOTS(gr_mc_ctx.dt_size)) { + /* This probably means that the memory cache was corrupted. */ + return ENOENT; + } ret = sss_nss_mc_get_record(&gr_mc_ctx, slot, &rec); if (ret) { @@ -180,6 +184,10 @@ errno_t sss_nss_mc_getgrgid(gid_t gid, } while (slot != MC_INVALID_VAL) { + if (slot > MC_SIZE_TO_SLOTS(gr_mc_ctx.dt_size)) { + /* This probably means that the memory cache was corrupted. */ + return ENOENT; + } ret = sss_nss_mc_get_record(&gr_mc_ctx, slot, &rec); if (ret) { diff --git a/src/sss_client/nss_mc_passwd.c b/src/sss_client/nss_mc_passwd.c index 4acc642..fa21bd2 100644 --- a/src/sss_client/nss_mc_passwd.c +++ b/src/sss_client/nss_mc_passwd.c @@ -117,6 +117,10 @@ errno_t sss_nss_mc_getpwnam(const char *name, size_t name_len, } while (slot != MC_INVALID_VAL) { + if (slot > MC_SIZE_TO_SLOTS(pw_mc_ctx.dt_size)) { + /* This probably means that the memory cache was corrupted */ + return ENOENT; + } ret = sss_nss_mc_get_record(&pw_mc_ctx, slot, &rec); if (ret) { @@ -181,6 +185,10 @@ errno_t sss_nss_mc_getpwuid(uid_t uid, } while (slot != MC_INVALID_VAL) { + if (slot > MC_SIZE_TO_SLOTS(pw_mc_ctx.dt_size)) { + /* This probably means that the memory cache was corrupted */ + return ENOENT; + } ret = sss_nss_mc_get_record(&pw_mc_ctx, slot, &rec); if (ret) { diff --git a/src/util/mmap_cache.h b/src/util/mmap_cache.h index 55383c0..6c223df 100644 --- a/src/util/mmap_cache.h +++ b/src/util/mmap_cache.h @@ -78,6 +78,7 @@ typedef uint32_t rel_ptr_t; #define SSS_MC_MAJOR_VNO 0 #define SSS_MC_MINOR_VNO 4 +#define SSS_MC_HEADER_UNINIT 0 /* after ftruncate or before reset */ #define SSS_MC_HEADER_ALIVE 1 /* current and in use */ #define SSS_MC_HEADER_RECYCLED 2 /* file was recycled, reopen asap */ @@ -109,6 +110,8 @@ struct sss_mc_rec { char data[0]; }; +/* FIXME: Function sss_mc_find_record currently relies on fact that + * offset of strs is the same in both sss_mc_pwd_data and sss_mc_grp_data. */ struct sss_mc_pwd_data { rel_ptr_t name; /* ptr to name string, rel. to struct base addr */ uint32_t uid; -- 1.7.11.2
_______________________________________________ sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-devel