On 12/15/2016 10:19 PM, Goffredo Baroncelli wrote:
> Hi All,
> 
> (please put me in CC in case of reply because I am not subscribed to the 
> mailing list)
> 
> I find a memory leak in the find command. In the function 
> free_file_system_list() (file fstype.c) the field me_mntroot of the struct 
> mount_entry is never free-ed. 
> 
> Commit c6148bca89e9465fd6ba3a10d273ec4cb58c2dbe in gnulibs added this field; 
> this field is filled by the function read_file_system_list(). To correct this 
> problem, we should call the free_mount_entry() function instead of doing the 
> free of the fields.
> 
> What seems strange to me is that this patch is quite old (August 2015), but 
> nobody noticed that. On my machine, find required up to 3GB during the 
> "updatedb" !!!
> 
> E.g.
> 
> # Without the patch, this command requires up to 602MB of memory !!!
> $ /usr/bin/time ./find /lib/modules/ -ignore_readdir_race      \( -fstype NFS 
> -o -fstype nfs -o -fstype nfs4 -o -fstype afs -o -fstype binfmt_misc -o 
> -fstype proc -o -fstype smbfs -o -fstype autofs -o -fstype iso9660 -o -fstype 
> ncpfs -o -fstype coda -o -fstype devpts -o -fstype ftpfs -o -fstype devfs -o 
> -fstype mfs -o -fstype shfs -o -fstype sysfs -o -fstype cifs -o -fstype 
> lustre_lite -o -fstype tmpfs -o -fstype usbfs -o -fstype udf -o -fstype ocfs2 
> -o      -type d -false  \) -prune -o -print0  | wc -l
> 31.59user 27.17system 0:58.82elapsed 99%CPU (0avgtext+0avgdata 
> 602476maxresident)k
> 0inputs+0outputs (0major+150129minor)pagefaults 0swaps
> 
> # With the patch, the same command as above requires only 2MB of memory !!!
> $ /usr/bin/time ./find /lib/modules/ -ignore_readdir_race      \( -fstype NFS 
> -o -fstype nfs -o -fstype nfs4 -o -fstype afs -o -fstype binfmt_misc -o 
> -fstype proc -o -fstype smbfs -o -fstype autofs -o -fstype iso9660 -o -fstype 
> ncpfs -o -fstype coda -o -fstype devpts -o -fstype ftpfs -o -fstype devfs -o 
> -fstype mfs -o -fstype shfs -o -fstype sysfs -o -fstype cifs -o -fstype 
> lustre_lite -o -fstype tmpfs -o -fstype usbfs -o -fstype udf -o -fstype ocfs2 
> -o      -type d -false  \) -prune -o -print0  | wc -l
> 30.94user 26.14system 0:57.14elapsed 99%CPU (0avgtext+0avgdata 
> 2764maxresident)k
> 0inputs+0outputs (0major+165minor)pagefaults 0swaps
> 
> 
> # for curiosity my /lib/modules contains
> $ find /lib/modules/ | wc -l
> 68101
> 
> Finally, what about caching the value of read_file_system_list() instead of 
> recomputing it every time ? On the basis of my tests I found an impressive 
> gain in terms of cpu usage. But what are the cons ?
> 
> 
> BR
> G.Baroncelli

I cannot reproduce here - i.e. I do not see such a huge memory leak here ... 
even
if I add some 8 bind-mounts so that /proc/self/mountinfo has 30k.
How does your test system look like? Is there anything special about your 
directory
structure?  What does 'valgrind --leak-check=full --show-leak-kinds=all ./find 
...'
tell you about the leak(s)? How many lost blocks do you have?
However, your patch is fine, of course; I wrapped it into a nice commit:

  [PATCH 1/5] find: fix memory leak in mount list handling

I also added a wrapper around gnulib's read_file_system_list() in order to
always free the list in the next invocation.  By this, we still get the same
semantics.

  [PATCH 2/5] maint: remove unused get_mounted_filesystems
  [PATCH 3/5] maint: inline now last use of must_read_fs_list
  [PATCH 4/5] maint: move is_used_fs_type to fstype.c
  [PATCH 5/5] find: avoid unneccessary reading of the mount list

One could argument that reading the mount list just once would be
enough, yet this would maybe change the result for automounted file systems.
Well, we could still do this further optimization at a later step.

Have a nice day,
Berny

>From 8a2bbee0c94f91842a61ebddf32f7fb6336c7768 Mon Sep 17 00:00:00 2001
From: Goffredo Baroncelli <[email protected]>
Date: Fri, 27 Jan 2017 02:16:08 +0100
Subject: [PATCH 1/5] find: fix memory leak in mount list handling

The following gnulib commit added the structure member me_mntroot
which our free_file_system_list function didn't consider, thus leading
to a memory leak:
http://git.sv.gnu.org/cgit/gnulib.git/commit/?id=c6148bca89e9

* find/fstype.c (free_file_system_list): Use gnulib's free_mount_entry
function to free the mount list instead of free()ing all members here
manually.
* NEWS (Bug Fixes): Mention the fix.

Reported in
http://lists.gnu.org/archive/html/findutils-patches/2016-12/msg00000.html
---
 NEWS          | 3 +++
 find/fstype.c | 9 +--------
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/NEWS b/NEWS
index 6b890c36..ba89f9dc 100644
--- a/NEWS
+++ b/NEWS
@@ -37,6 +37,9 @@ in the use of FTS.
 
 ** Bug Fixes
 
+find no longer leaks memory for a recently added member in gnulib's
+mount list structure.
+
 #48180: find -noop (an internal option not intended to be exposed to the user)
          no longer crashes.  Bug introduced in FINDUTILS-4.3.1.
 
diff --git a/find/fstype.c b/find/fstype.c
index 535f9202..a0ac8bca 100644
--- a/find/fstype.c
+++ b/find/fstype.c
@@ -75,14 +75,7 @@ free_file_system_list (struct mount_entry *p)
   while (p)
     {
       struct mount_entry *pnext = p->me_next;
-
-      free (p->me_devname);
-      free (p->me_mountdir);
-
-      if (p->me_type_malloced)
-	free (p->me_type);
-      p->me_next = NULL;
-      free (p);
+      free_mount_entry (p);
       p = pnext;
     }
 }
-- 
2.11.0

>From 473603b55e41dcc1a41584cfa3cee5c95c0192bc Mon Sep 17 00:00:00 2001
From: Bernhard Voelker <[email protected]>
Date: Tue, 24 Jan 2017 23:35:56 +0100
Subject: [PATCH 2/5] maint: remove unused get_mounted_filesystems

* find/fstype.c (get_mounted_filesystems): Remove unused function ...
* find/defs.h (get_mounted_filesystems): ... and its declaration.
---
 find/defs.h   |  1 -
 find/fstype.c | 38 --------------------------------------
 2 files changed, 39 deletions(-)

diff --git a/find/defs.h b/find/defs.h
index 52e522fb..1761ec50 100644
--- a/find/defs.h
+++ b/find/defs.h
@@ -371,7 +371,6 @@ void cleanup(void);
 
 /* fstype.c */
 char *filesystem_type (const struct stat *statp, const char *path);
-char * get_mounted_filesystems (void);
 dev_t * get_mounted_devices (size_t *);
 
 
diff --git a/find/fstype.c b/find/fstype.c
index a0ac8bca..deaf54a2 100644
--- a/find/fstype.c
+++ b/find/fstype.c
@@ -232,44 +232,6 @@ file_system_type_uncached (const struct stat *statp, const char *path)
 }
 
 
-char *
-get_mounted_filesystems (void)
-{
-  char *result = NULL;
-  size_t alloc_size = 0u;
-  size_t used = 0u;
-  struct mount_entry *entries, *entry;
-  void *p;
-
-  entries = must_read_fs_list (false);
-  for (entry=entries; entry; entry=entry->me_next)
-    {
-      size_t len;
-
-#ifdef MNTTYPE_IGNORE
-      if (!strcmp (entry->me_type, MNTTYPE_IGNORE))
-	continue;
-#endif
-
-      len = strlen (entry->me_mountdir) + 1;
-      p = extendbuf (result, used+len, &alloc_size);
-      if (p)
-	{
-	  result = p;
-	  strcpy (&result[used], entry->me_mountdir);
-	  used += len;		/* len already includes one for the \0 */
-	}
-      else
-	{
-	  break;
-	}
-    }
-
-  free_file_system_list (entries);
-  return result;
-}
-
-
 dev_t *
 get_mounted_devices (size_t *n)
 {
-- 
2.11.0

>From dd3528a3d3dce30c708d6a29351ebc7b2bf8a188 Mon Sep 17 00:00:00 2001
From: Bernhard Voelker <[email protected]>
Date: Tue, 24 Jan 2017 23:45:02 +0100
Subject: [PATCH 3/5] maint: inline now last use of must_read_fs_list

* find/fstype.c (must_read_fs_list): Remove function and inline body ...
(file_system_type_uncached): ... here where its last use was.
(get_mounted_devices): Remove it from a comment.
---
 find/fstype.c | 29 ++++++++++-------------------
 1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/find/fstype.c b/find/fstype.c
index deaf54a2..4f026255 100644
--- a/find/fstype.c
+++ b/find/fstype.c
@@ -159,23 +159,6 @@ set_fstype_devno (struct mount_entry *p)
   return 0;			/* not needed */
 }
 
-static struct mount_entry *
-must_read_fs_list (bool need_fs_type)
-{
-  struct mount_entry *entries = read_file_system_list (need_fs_type);
-  if (NULL == entries)
-    {
-      /* We cannot determine for sure which file we were trying to
-       * use because gnulib has abstracted all that stuff away.
-       * Hence we cannot issue a specific error message here.
-       */
-      error (EXIT_FAILURE, 0, _("Cannot read mounted file system list"));
-    }
-  return entries;
-}
-
-
-
 /* Return a newly allocated string naming the type of file system that the
    file PATH, described by STATP, is on.
    RELPATH is the file name relative to the current directory.
@@ -198,7 +181,15 @@ file_system_type_uncached (const struct stat *statp, const char *path)
 #endif
 
   best = NULL;
-  entries = must_read_fs_list (true);
+  entries = read_file_system_list (true);
+  if (NULL == entries)
+    {
+      /* We cannot determine for sure which file we were trying to
+       * use because gnulib has abstracted all that stuff away.
+       * Hence we cannot issue a specific error message here.
+       */
+      error (EXIT_FAILURE, 0, _("Cannot read mounted file system list"));
+    }
   for (type=NULL, entry=entries; entry; entry=entry->me_next)
     {
 #ifdef MNTTYPE_IGNORE
@@ -240,7 +231,7 @@ get_mounted_devices (size_t *n)
   struct mount_entry *entries, *entry;
   dev_t *result = NULL;
 
-  /* Use read_file_system_list () rather than must_read_fs_list()
+  /* Ignore read_file_system_list () not returning a valid list
    * because on some system this is always called at startup,
    * and find should only exit fatally if it needs to use the
    * result of this operation.   If we can't get the fs list
-- 
2.11.0

>From c996b7ade9b839678eed0e0f9ffdf9e47b35761d Mon Sep 17 00:00:00 2001
From: Bernhard Voelker <[email protected]>
Date: Wed, 25 Jan 2017 00:07:49 +0100
Subject: [PATCH 4/5] maint: move is_used_fs_type to fstype.c

* find/parser.c (is_used_fs_type): Move to ...
* find/fstype.c: ... here.
* find/defs.h (is_used_fs_type): Add declaration.
---
 find/defs.h   |  1 +
 find/fstype.c | 29 +++++++++++++++++++++++++++++
 find/parser.c | 30 ------------------------------
 3 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/find/defs.h b/find/defs.h
index 1761ec50..919c1bdc 100644
--- a/find/defs.h
+++ b/find/defs.h
@@ -371,6 +371,7 @@ void cleanup(void);
 
 /* fstype.c */
 char *filesystem_type (const struct stat *statp, const char *path);
+bool is_used_fs_type(const char *name);
 dev_t * get_mounted_devices (size_t *);
 
 
diff --git a/find/fstype.c b/find/fstype.c
index 4f026255..3b460540 100644
--- a/find/fstype.c
+++ b/find/fstype.c
@@ -138,6 +138,35 @@ filesystem_type (const struct stat *statp, const char *path)
   return current_fstype;
 }
 
+bool
+is_used_fs_type(const char *name)
+{
+  if (0 == strcmp("afs", name))
+    {
+      /* I guess AFS may not appear in /etc/mtab (or equivalent) but still be in use,
+	 so assume we always need to check for AFS.  */
+      return true;
+    }
+  else
+    {
+      const struct mount_entry *entries = read_file_system_list(false);
+      if (entries)
+	{
+	  const struct mount_entry *entry;
+	  for (entry = entries; entry; entry = entry->me_next)
+	    {
+	      if (0 == strcmp(name, entry->me_type))
+		return true;
+	    }
+	}
+      else
+	{
+	  return true;
+	}
+    }
+  return false;
+}
+
 static int
 set_fstype_devno (struct mount_entry *p)
 {
diff --git a/find/parser.c b/find/parser.c
index 0a03731d..0fc94d0a 100644
--- a/find/parser.c
+++ b/find/parser.c
@@ -1079,36 +1079,6 @@ static float estimate_fstype_success_rate (const char *fsname)
 
 
 static bool
-is_used_fs_type(const char *name)
-{
-  if (0 == strcmp("afs", name))
-    {
-      /* I guess AFS may not appear in /etc/mtab (or equivalent) but still be in use,
-	 so assume we always need to check for AFS.  */
-      return true;
-    }
-  else
-    {
-      const struct mount_entry *entries = read_file_system_list(false);
-      if (entries)
-	{
-	  const struct mount_entry *entry;
-	  for (entry = entries; entry; entry = entry->me_next)
-	    {
-	      if (0 == strcmp(name, entry->me_type))
-		return true;
-	    }
-	}
-      else
-	{
-	  return true;
-	}
-    }
-  return false;
-}
-
-
-static bool
 parse_fstype (const struct parser_table* entry, char **argv, int *arg_ptr)
 {
   const char *typename;
-- 
2.11.0

>From 7cb9a6bf6ec1670a33bcae18ae65af1cce4b1ae7 Mon Sep 17 00:00:00 2001
From: Bernhard Voelker <[email protected]>
Date: Wed, 25 Jan 2017 01:26:03 +0100
Subject: [PATCH 5/5] find: avoid unneccessary reading of the mount list

Add a static cache for mount list to fstype.c.

* find/fstype.c (get_file_system_list): Add wrapper around GNULIB's
read_file_system_list function with a static cache to avoid multiple
invocations of that GNULIB function.  Still re-read the list if the
first invocation didn't include the ME_TYPE members.
(is_used_fs_type): Use the above new wrapper.
(file_system_type_uncached): Likewise, and do not free the list here
anymore because this is now done in get_file_system_list() as needed.
---
 find/fstype.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/find/fstype.c b/find/fstype.c
index 3b460540..4fb851eb 100644
--- a/find/fstype.c
+++ b/find/fstype.c
@@ -116,6 +116,31 @@ in_afs (char *path)
 /* Nonzero if the current file system's type is known.  */
 static int fstype_known = 0;
 
+/* Read the mount list into a static cache, and return it.
+   This is a wrapper around gnulib's read_file_system_list ()
+   to avoid unneccessary reading of the mount list.  */
+static struct mount_entry *
+get_file_system_list (bool need_fs_type)
+{
+  /* Local cache for the mount list.  */
+  static struct mount_entry *mount_list = NULL;
+
+  /* Remember if the list contains the ME_TYPE members.  */
+  static bool has_fstype = false;
+
+  if (mount_list && ! has_fstype && need_fs_type)
+    {
+      free_file_system_list (mount_list);
+      mount_list = NULL;
+    }
+  if (! mount_list)
+    {
+      mount_list = read_file_system_list(need_fs_type);
+      has_fstype = need_fs_type;
+    }
+  return mount_list;
+}
+
 /* Return a static string naming the type of file system that the file PATH,
    described by STATP, is on.
    RELPATH is the file name relative to the current directory.
@@ -149,7 +174,7 @@ is_used_fs_type(const char *name)
     }
   else
     {
-      const struct mount_entry *entries = read_file_system_list(false);
+      const struct mount_entry *entries = get_file_system_list(false);
       if (entries)
 	{
 	  const struct mount_entry *entry;
@@ -210,7 +235,7 @@ file_system_type_uncached (const struct stat *statp, const char *path)
 #endif
 
   best = NULL;
-  entries = read_file_system_list (true);
+  entries = get_file_system_list (true);
   if (NULL == entries)
     {
       /* We cannot determine for sure which file we were trying to
@@ -243,7 +268,6 @@ file_system_type_uncached (const struct stat *statp, const char *path)
     {
       type = xstrdup (best->me_type);
     }
-  free_file_system_list (entries);
 
   /* Don't cache unknown values. */
   fstype_known = (type != NULL);
-- 
2.11.0

_______________________________________________
Findutils-patches mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/findutils-patches

Reply via email to