Re: Memory Leaks When Using Statistics (was Re: [PATCH 1/1]: Fix Memory-map and Double-free Errors in Statistics Handling (was Re: Connman-0.67 Crashes and/or Hangs on Start-up))
Hi Grant, Finally, I found some time for debugging. 2) Keeping connman in the stack, but changing the mapping from MAP_PRIVATE back to MAP_SHARED such that stats_file_setup fails with -EINVAL (due to being backed by a JFFS2 file system). Okay, that tells us at least if stats.c is not involved there is no memory leak elsewhere. That's kind of nice for the rest bad for me ;) I have run valgrind for a while and also monitoring the memory map of connman with pmap. I don't know if that's enough or good way trying to find memory leak. ConnMan run through the whole night with a bit of background traffic (mail client polling etc) and I couldn't see any change in the pmap The first entry looks like this: 22403: ./connmand -n -d src/stats.c Address Kbytes RSS Dirty Mode Mapping 0040 532 412 0 r-x-- connmand 00685000 36 36 20 rw--- connmand 01325000 132 120 120 rw---[ anon ] 00310500 124 108 0 r-x-- ld-2.13.so 00310521e000 4 4 4 r ld-2.13.so 00310521f000 4 4 4 rw--- ld-2.13.so 00310522 4 4 4 rw---[ anon ] 003105401604 652 0 r-x-- libc-2.13.so 0031055910002048 0 0 - libc-2.13.so 003105791000 16 16 8 r libc-2.13.so 003105795000 4 4 4 rw--- libc-2.13.so 003105796000 24 24 24 rw---[ anon ] 00310580 8 8 0 r-x-- libdl-2.13.so 0031058020002048 0 0 - libdl-2.13.so 003105a02000 4 4 4 r libdl-2.13.so 003105a03000 4 4 4 rw--- libdl-2.13.so 003105c0 92 56 0 r-x-- libpthread-2.13.so 003105c170002044 0 0 - libpthread-2.13.so 003105e16000 4 4 4 r libpthread-2.13.so 003105e17000 4 4 4 rw--- libpthread-2.13.so 003105e18000 16 4 4 rw---[ anon ] 00310640 28 16 0 r-x-- libxtables.so.5.0.0 0031064070002048 0 0 - libxtables.so.5.0.0 003106607000 4 4 4 rw--- libxtables.so.5.0.0 003106c0 28 20 0 r-x-- librt-2.13.so 003106c070002044 0 0 - librt-2.13.so 003106e06000 4 4 4 r librt-2.13.so 003106e07000 4 4 4 rw--- librt-2.13.so 00310740 92 44 0 r-x-- libresolv-2.13.so 0031074170002044 0 0 - libresolv-2.13.so 003107616000 4 4 4 r libresolv-2.13.so 003107617000 4 4 4 rw--- libresolv-2.13.so 003107618000 8 0 0 rw---[ anon ] 003107801048 424 0 r-x-- libglib-2.0.so.0.2600.0 0031079060002044 0 0 - libglib-2.0.so.0.2600.0 003107b05000 4 4 4 rw--- libglib-2.0.so.0.2600.0 003107b06000 4 4 4 rw---[ anon ] 00310840 268 208 0 r-x-- libdbus-1.so.3.5.2 0031084430002044 0 0 - libdbus-1.so.3.5.2 003108642000 4 4 4 r libdbus-1.so.3.5.2 003108643000 4 4 4 rw--- libdbus-1.so.3.5.2 00310b00 16 12 0 r-x-- libcap-ng.so.0.0.0 00310b0040002044 0 0 - libcap-ng.so.0.0.0 00310b203000 4 4 4 r libcap-ng.so.0.0.0 00310b204000 4 4 4 rw--- libcap-ng.so.0.0.0 7f295bebf000 4 4 0 r-x-- iospm.so 7f295bec2044 0 0 - iospm.so 7f295c0bf000 4 4 4 rw--- iospm.so 7f295c0c 24 24 24 rw---[ anon ] 7f295c0e2000 16 16 12 rw-s- ethernet_002481c57c73_cable.data 7f295c0e6000 28 24 0 r--s- gconv-modules.cache 7f295c0ed000 4 4 4 rw---[ anon ] 7fff6132b000 132 24 24 rw---[ stack ] 7fff613ff000 4 4 0 r-x--[ anon ] ff60 4 0 0 r-x--[ anon ] -- -- -- total kB 248202340 320 and the current (14 hours later) one looks like this: 22403: ./connmand -n -d src/stats.c Address Kbytes RSS Dirty Mode Mapping 0040 532 412 0 r-x-- connmand 00685000 36 36 20 rw--- connmand 01325000 132 120 120 rw---[ anon ] 00310500 124 108 0 r-x-- ld-2.13.so 00310521e000 4 4 4 r ld-2.13.so 00310521f000 4 4 4 rw--- ld-2.13.so 00310522 4 4 4 rw---[ anon ] 003105401604 652 0 r-x--
Re: [PATCH 1/1]: Fix Memory-map and Double-free Errors in Statistics Handling (was Re: Connman-0.67 Crashes and/or Hangs on Start-up)
Good Morning Grant, On 02/17/2011 11:48 PM, Grant Erickson wrote: On 2/17/11 12:13 PM, Daniel Wagner wrote: On 02/17/2011 05:38 PM, Grant Erickson wrote: Thanks for taking the time to submit a comment for the code detailing the motivation for selecting MAP_PRIVATE. I have done some more research to determine why the MAP_SHARED fails on my platform. From what I can see, this is a limitation of JFFS2. For it's mmap method, it specifies: % grep 'mmap' linux/fs/jffs2/*.c linux/fs/jffs2/file.c:.mmap =generic_file_readonly_mmap, So, in short, for JFFS2 users in a top-of-tree kernel, connman statistics will never work with a PROT_WRITE + MAP_SHARED mapping. Thanks for tracking this down. In this case, it might make sense to add some fallback. My idea is to use MAP_PRIVATE and write the whole contents into a file after a certain period. The whole idea behind MAP_SHARED is to make sure the data really hits the disk. But until now, nobody with deep understanding of filesystem under Linux has done a review. So the current approach might be bogus. So if you are fine with having a fallback to MAP_PRIVATE + periodic write a file, then we should add this feature. Opinions? Were I writing this anew, I think I'd have skipped the mmap altogether, maintained the statistics in memory and flushed things out to a file at the appropriate times. If I understand you correctly, then allocating a memory block and use the current append write to this block can easily implemented withouth the use of mmap. Instead of allocating per mmap we just have to use malloc (or the glib wrapper for it g_try_new0()) for this. Working with what's there, I think I'd do away with MAP_SHARED, convert the mapping to MAP_PRIVATE and then call msync within the _stats_update path. This should be fairly simple to achieve. All that is said, however, without a savvy perspective on how statistic are used within connman. I have only second hand knowledge about the garantees we have to provide on the numbers. Marcel told me there exists a requirement for mobile phone manufacturers to grantee that the statistic numbers are like 95% accurate. How this number is computed or derived from is unclear. From this we thought we need to make sure that we really write data to the disk and also do not have too much disk IO. Currently, there is for each update at max two pages to flush to the disk. As I said nobody of us was really sure if this is what is going to work as expected. So if you have a better understanding what is really needed please be my guest ;) I'm glad to learn here something! cheers, daniel ___ connman mailing list connman@connman.net http://lists.connman.net/listinfo/connman
Re: Memory Leaks When Using Statistics (was Re: [PATCH 1/1]: Fix Memory-map and Double-free Errors in Statistics Handling (was Re: Connman-0.67 Crashes and/or Hangs on Start-up))
Hi Grant, On 17 February 2011 20:10, Grant Erickson maratho...@gmail.com wrote: To isolate the leaks, I systematically eliminated processes from the system until I was left with: - syslogd - klogd - wpa_supplicant - dbus-daemon - connmand I have run connmand under valgrind and it reports memory leaks in gsupplicant code. I sent earlier some memoryleak patches in various parts of connman but unfortunately I have lately had no time to find out where the latest supplicant leaks are. Regards, Jukka ___ connman mailing list connman@connman.net http://lists.connman.net/listinfo/connman
Re: [PATCH 1/1]: Fix Memory-map and Double-free Errors in Statistics Handling (was Re: Connman-0.67 Crashes and/or Hangs on Start-up)
On 2/17/11 12:13 PM, Daniel Wagner wrote: On 02/17/2011 05:38 PM, Grant Erickson wrote: Thanks for taking the time to submit a comment for the code detailing the motivation for selecting MAP_PRIVATE. I have done some more research to determine why the MAP_SHARED fails on my platform. From what I can see, this is a limitation of JFFS2. For it's mmap method, it specifies: % grep 'mmap' linux/fs/jffs2/*.c linux/fs/jffs2/file.c:.mmap =generic_file_readonly_mmap, So, in short, for JFFS2 users in a top-of-tree kernel, connman statistics will never work with a PROT_WRITE + MAP_SHARED mapping. Thanks for tracking this down. In this case, it might make sense to add some fallback. My idea is to use MAP_PRIVATE and write the whole contents into a file after a certain period. The whole idea behind MAP_SHARED is to make sure the data really hits the disk. But until now, nobody with deep understanding of filesystem under Linux has done a review. So the current approach might be bogus. So if you are fine with having a fallback to MAP_PRIVATE + periodic write a file, then we should add this feature. Opinions? Were I writing this anew, I think I'd have skipped the mmap altogether, maintained the statistics in memory and flushed things out to a file at the appropriate times. Working with what's there, I think I'd do away with MAP_SHARED, convert the mapping to MAP_PRIVATE and then call msync within the _stats_update path. All that is said, however, without a savvy perspective on how statistic are used within connman. Best, Grant ___ connman mailing list connman@connman.net http://lists.connman.net/listinfo/connman
Re: [PATCH 1/1]: Fix Memory-map and Double-free Errors in Statistics Handling (was Re: Connman-0.67 Crashes and/or Hangs on Start-up)
On 01/28/2011 10:39 PM, Grant Erickson wrote: diff -aruN a/src/stats.c b/src/stats.c --- a/src/stats.c 2011-01-28 13:20:49.248299633 -0800 +++ b/src/stats.c 2011-01-28 13:19:44.824306025 -0800 @@ -223,10 +223,18 @@ TFR(close(file-fd)); file-fd = -1; - if (file-history_name != NULL) + if (file-history_name != NULL) { g_free(file-history_name); - g_free(file-name); - g_free(file); + file-history_name = NULL; + } + + if (file-name != NULL) { + g_free(file-name); + file-name = NULL; + } + + if (file != NULL) + g_free(file); } I think the last check (file != NULL) is not necessary, because we do derefence it several times above. cheers, daniel ___ connman mailing list connman@connman.net http://lists.connman.net/listinfo/connman
Re: [PATCH 1/1]: Fix Memory-map and Double-free Errors in Statistics Handling (was Re: Connman-0.67 Crashes and/or Hangs on Start-up)
On 1/31/11 1:47 AM, Daniel Wagner wrote: On Fri, Jan 28, 2011 at 08:13:28PM -0800, Grant Erickson wrote: As far as I can see, the stats mapping is not published or shared outside of connman. If so, there's no reason to map it SHARED versus PRIVATE. From the mmap man pages: MAP_SHARED Share this mapping. Updates to the mapping are visible to other processes that map this file, and are carried through to the underlying file. The file may not actually be updated until msync(2) or munmap() is called. MAP_PRIVATE Create a private copy-on-write mapping. Updates to the mapping are not visible to other processes mapping the same file, and are not carried through to the underlying file. It is unspecified whether changes made to the file after the mmap() call are visible in the mapped region. The reason for picking MAP_SHARED was to be sure the changes in buffer really hits the file. I read that MAP_PRIVATE does not garantee this. This is reason I was picking MAP_SHARED together with msync. Daniel: Thanks for the clarification on the rationale for choosing MAP_SHARED. I'll have to do some digging in to determine why MAP_SHARED fails on the ARM/Linux 2.6.32 kernel. It might be beneficial to add the above comments to the source file for future reference. Best, Grant ___ connman mailing list connman@connman.net http://lists.connman.net/listinfo/connman
Re: [PATCH 1/1]: Fix Memory-map and Double-free Errors in Statistics Handling (was Re: Connman-0.67 Crashes and/or Hangs on Start-up)
On Jan 28, 2011, at 6:14 PM, Samuel Ortiz sa...@linux.intel.com wrote: On Fri, Jan 28, 2011 at 01:39:09PM -0800, Grant Erickson wrote: On Jan 28, 2011, at 12:10 PM, Grant Erickson wrote: On Jan 28, 2011, at 10:09 AM, Grant Erickson wrote: On Jan 28, 2011, at 9:44 AM, Samuel Ortiz wrote: On Fri, Jan 28, 2011 at 09:06:12AM -0800, Grant Erickson wrote: FYI. I need to check against GIT top-of-tree and dig into this further; however, with: * Wired Ethernet connected, connman-0.67 crashes on start-up * Wireless 802.11 WEXT, connman-0.67 hangs on start-up Backtrace with wired: Please give us a gdb backtrace or run test/backtrace src/connmand log where log is the below trace. The following patch fixes the core dump / fault issue, of which there were/are two core problems: 1) Trying to memory map the statistics file on armv7l-linux-unknown-gnu with MAP_SHARED results in a result value from mmap of MAP_FAILED. Using MAP_PRIVATE works correctly. This primary failure causes the secondary failure that leads to the segment fault. Why does this toolchain fail to build a SHARED mapping? That's the platform: Linux-2.6.32 2) The function stats_free, called indirectly from g_hash_table_remove(stats_hash, service), g_free(file-name) ends up double-freeing file-name. It had already been earlier freed in stats_file_setup, following the failure of status_file_remap. This part of the patch makes sense, I'll apply it. Cheers, Samuel. -- Intel Open Source Technology Centre http://oss.intel.com/ ___ connman mailing list connman@connman.net http://lists.connman.net/listinfo/connman
Re: [PATCH 1/1]: Fix Memory-map and Double-free Errors in Statistics Handling (was Re: Connman-0.67 Crashes and/or Hangs on Start-up)
On 1/28/11 6:14 PM, Samuel Ortiz wrote: On Fri, Jan 28, 2011 at 01:39:09PM -0800, Grant Erickson wrote: On Jan 28, 2011, at 12:10 PM, Grant Erickson wrote: On Jan 28, 2011, at 10:09 AM, Grant Erickson wrote: On Jan 28, 2011, at 9:44 AM, Samuel Ortiz wrote: On Fri, Jan 28, 2011 at 09:06:12AM -0800, Grant Erickson wrote: FYI. I need to check against GIT top-of-tree and dig into this further; however, with: * Wired Ethernet connected, connman-0.67 crashes on start-up * Wireless 802.11 WEXT, connman-0.67 hangs on start-up Backtrace with wired: Please give us a gdb backtrace or run test/backtrace src/connmand log where log is the below trace. The following patch fixes the core dump / fault issue, of which there were/are two core problems: 1) Trying to memory map the statistics file on armv7l-linux-unknown-gnu with MAP_SHARED results in a result value from mmap of MAP_FAILED. Using MAP_PRIVATE works correctly. This primary failure causes the secondary failure that leads to the segment fault. Why does this toolchain fail to build a SHARED mapping ? Samuel: Regrets about the last post, a premature send on the phone. The package builds without issue with the toolchain, Sourcery G++ 2010q1-202. Cited above, armv7l-linux-unknown-gnu is the platform, running Linux 2.6.32. As far as I can see, the stats mapping is not published or shared outside of connman. If so, there's no reason to map it SHARED versus PRIVATE. Best, Grant ___ connman mailing list connman@connman.net http://lists.connman.net/listinfo/connman