Hello,

I tried to trawl a large website using wget and found that, after it had been running for two days, my system had slowed to a crawl. top showed that wget was consuming all available memory and was still growing, so I had to kill it.

I then ran wget on a much smaller website (my own) under valgrind to get a report on the memory leaks. The report is attached. My website is somewhat different from the one I was trawling, but I hope the trace will still be useful.

Regards,
Andrew Marlow

==20196== Memcheck, a memory error detector.
==20196== Copyright (C) 2002-2005, and GNU GPL'd, by Julian Seward et al.
==20196== Using LibVEX rev 1471, a library for dynamic binary translation.
==20196== Copyright (C) 2004-2005, and GNU GPL'd, by OpenWorks LLP.
==20196== Using valgrind-3.1.0, a dynamic binary instrumentation framework.
==20196== Copyright (C) 2000-2005, and GNU GPL'd, by Julian Seward et al.
==20196==
--20196-- Command line
--20196--    wget2
--20196--    -r
--20196--    -nc
--20196--    http://www.andrewpetermarlow.co.uk
--20196-- Startup, with flags:
--20196--    -v
--20196--    --leak-check=full
--20196--    --show-reachable=yes
--20196-- Contents of /proc/version:
--20196--   Linux version 2.6.15-1.2054_FC5 ([EMAIL PROTECTED]) (gcc version 4.1.0 20060304 (Red Hat 4.1.0-3)) #1 Tue Mar 14 15:48:33 EST 2006
--20196-- Arch and subarch: X86, x86-sse0
--20196-- Valgrind library directory: /usr/lib/valgrind
--20196-- Reading syms from /lib/ld-2.4.so (0x523000)
--20196-- Reading syms from /home/amarlow/bin/wget2 (0x8048000)
--20196-- Reading syms from /usr/lib/valgrind/x86-linux/memcheck (0xB0000000)
--20196--    object doesn't have a dynamic symbol table
--20196-- Reading suppressions file: /usr/lib/valgrind/default.supp
--20196-- REDIR: 0x537B90 (index) redirected to 0xB00212A2 (vgPlain_x86_linux_REDIR_FOR_index)
--20196-- Reading syms from /usr/lib/valgrind/x86-linux/vgpreload_core.so (0x4000000)
--20196-- Reading syms from /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so (0x4003000)
--20196-- Reading syms from /lib/libdl-2.4.so (0xDD7000)
--20196-- Reading syms from /lib/librt-2.4.so (0x7BE000)
--20196-- Reading syms from /lib/libssl.so.0.9.8a (0x771000)
--20196--    object doesn't have a symbol table
--20196-- Reading syms from /lib/libcrypto.so.0.9.8a (0x5A2000)
--20196--    object doesn't have a symbol table
--20196-- Reading syms from /lib/libc-2.4.so (0x101000)
--20196-- Reading syms from /lib/libpthread-2.4.so (0x50D1000)
--20196-- Reading syms from /usr/lib/libgssapi_krb5.so.2.2 (0x6D9000)
--20196--    object doesn't have a symbol table
--20196-- Reading syms from /usr/lib/libkrb5.so.3.2 (0x6F4000)
--20196--    object doesn't have a symbol table
--20196-- Reading syms from /lib/libcom_err.so.2.1 (0x561000)
--20196--    object doesn't have a symbol table
--20196-- Reading syms from /usr/lib/libk5crypto.so.3.0 (0x57B000)
--20196--    object doesn't have a symbol table
--20196-- Reading syms from /lib/libresolv-2.4.so (0x566000)
--20196-- Reading syms from /usr/lib/libz.so.1.2.3 (0xDDD000)
--20196--    object doesn't have a symbol table
--20196-- Reading syms from /usr/lib/libkrb5support.so.0.0 (0x76B000)
--20196--    object doesn't have a symbol table
--20196-- REDIR: 0x523820 (_dl_sysinfo_int80) redirected to 0xB002129F (???)
--20196-- REDIR: 0x16C1E0 (memset) redirected to 0x40061C0 (memset)
--20196-- REDIR: 0x16C670 (memcpy) redirected to 0x4006900 (memcpy)
--20196-- REDIR: 0x16B360 (rindex) redirected to 0x4005D60 (rindex)
--20196-- REDIR: 0x16AAB0 (strcmp) redirected to 0x4005FE0 (strcmp)
--20196-- REDIR: 0x16B1C8 (strncmp) redirected to 0x4005F70 (strncmp)
--20196-- REDIR: 0x16A940 (index) redirected to 0x4005E50 (index)
--20196-- REDIR: 0x16CF60 (strchrnul) redirected to 0x4006260 (strchrnul)
--20196-- REDIR: 0x166C08 (malloc) redirected to 0x4005177 (malloc)
--20196-- REDIR: 0x1683A4 (free) redirected to 0x4004DBF (free)
--20196-- REDIR: 0x16C390 (stpcpy) redirected to 0x4006730 (stpcpy)
--20196-- REDIR: 0x16AB10 (strcpy) redirected to 0x40064B0 (strcpy)
--20196-- REDIR: 0x166914 (calloc) redirected to 0x4004557 (calloc)
--20196-- REDIR: 0x168548 (realloc) redirected to 0x400521F (realloc)
--20196-- REDIR: 0x16B090 (strnlen) redirected to 0x4005EE0 (strnlen)
--20196-- REDIR: 0x16BCE0 (memchr) redirected to 0x40060A0 (memchr)
--20196-- REDIR: 0x16CE90 (rawmemchr) redirected to 0x4006290 (rawmemchr)
File `www.andrewpetermarlow.co.uk/index.html' already there; not retrieving.
Loading robots.txt; please ignore errors.
--18:17:30-- http://www.andrewpetermarlow.co.uk/robots.txt
    => `www.andrewpetermarlow.co.uk/robots.txt'
Resolving www.andrewpetermarlow.co.uk...
--20196-- REDIR: 0x16B2B4 (strncpy) redirected to 0x4006380 (strncpy)
--20196-- Reading syms from /lib/libnss_files-2.4.so (0x431B000)
--20196-- Reading syms from /lib/libnss_dns-2.4.so (0x4326000)
213.171.218.177
Connecting to www.andrewpetermarlow.co.uk|213.171.218.177|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
18:17:31 ERROR 404: Not Found.
File `www.andrewpetermarlow.co.uk/images/beige031.jpg' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/me_small.jpg' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/index.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/interests.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/publications.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/sw.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/family.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/bookmarks.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/contact.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/cv.doc' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/valid-xhtml10.gif' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/anybrowser3.jpg' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/linux.logo.tiny.1a.gif' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/run_gnu.png' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/comments.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/logo_rspb.gif' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/bbc.jpg' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/prime.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/prime-drwho.jpg' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/lupus.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/lupusuk.jpg' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/nspcc.gif' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/aa_logo.gif' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/fanderson25.jpg' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/nacc.gif' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/project01.pdf' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/proposal.pdf' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/opensource.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/mico.jpg' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/windoze.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/unix_tools.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/me.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/brenda.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/jonathan.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/simon.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/parents.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/prime.jpg' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/Rev_19p2p3.tar.bz2' already there; not retrieving.
--18:17:32-- http://www.andrewpetermarlow.co.uk/fructose/index.html
    => `www.andrewpetermarlow.co.uk/fructose/index.html'
Connecting to www.andrewpetermarlow.co.uk|213.171.218.177|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
18:17:32 ERROR 404: Not Found.
File `www.andrewpetermarlow.co.uk/ant.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/asn1.html' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/rpm.tar.gz' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/bsdtar-1.2.38.exe' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/pkreader.exe' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/arepl.tgz' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/xmsg.tgz' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/wi.txt' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/emacs.txt' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/snprintf.c' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/meport.jpg' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/brenda.jpg' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/images/jonathan.jpg' already there; not retrieving.
--18:17:32-- http://www.andrewpetermarlow.co.uk/goodies/haunted.ppt
    => `www.andrewpetermarlow.co.uk/goodies/haunted.ppt'
Connecting to www.andrewpetermarlow.co.uk|213.171.218.177|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
18:17:32 ERROR 404: Not Found.
--18:17:32-- http://www.andrewpetermarlow.co.uk/images/simon.jpg
    => `www.andrewpetermarlow.co.uk/images/simon.jpg'
Connecting to www.andrewpetermarlow.co.uk|213.171.218.177|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
18:17:32 ERROR 404: Not Found.
File `www.andrewpetermarlow.co.uk/images/parents.jpg' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/ant-manual.pdf.bz2' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/ant-manual.pdf.gz' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/a.tar.bz2' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/a.tar.gz' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/ant-manual.tex' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/ant-manual.tex.gz' already there; not retrieving.
File `www.andrewpetermarlow.co.uk/goodies/asn1.pdf' already there; not retrieving.
FINISHED --18:17:32--
Downloaded: 0 bytes in 0 files
--20196-- discard syms at 0x431B000-0x4326000 in /lib/libnss_files-2.4.so due to munmap()
--20196-- discard syms at 0x4326000-0x432C000 in /lib/libnss_dns-2.4.so due to munmap()
==20196==
==20196== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 28 from 1)
--20196--
--20196-- supp: 28 Fedora-Core-5-hack2
==20196== malloc/free: in use at exit: 18,295 bytes in 287 blocks.
==20196== malloc/free: 5,647 allocs, 5,360 frees, 191,919 bytes allocated.
==20196==
==20196== searching for pointers to 287 not-freed blocks.
==20196== checked 287,748 bytes.
==20196==
==20196== 18 bytes in 1 blocks are still reachable in loss record 1 of 5
==20196==    at 0x40051F9: malloc (vg_replace_malloc.c:149)
==20196==    by 0x1DE3B6: inet_ntoa (in /lib/libc-2.4.so)
==20196==    by 0x805A235: pretty_print_address (host.c:427)
==20196==    by 0x805A9C7: lookup_host (host.c:838)
==20196==    by 0x804B369: connect_to_host (connect.c:365)
==20196==    by 0x805F740: gethttp (http.c:1445)
==20196==    by 0x8061A49: http_loop (http.c:2191)
==20196==    by 0x806DA87: retrieve_url (retr.c:667)
==20196==    by 0x806CA85: res_retrieve_file (res.c:546)
==20196==    by 0x806BAFB: download_child_p (recur.c:559)
==20196==    by 0x806B4CD: retrieve_tree (recur.c:347)
==20196==    by 0x8068216: main (main.c:941)
==20196==
==20196==
==20196== 32 bytes in 2 blocks are still reachable in loss record 2 of 5
==20196==    at 0x40045EB: calloc (vg_replace_malloc.c:279)
==20196==    by 0x8077FC3: checking_malloc0 (xmalloc.c:125)
==20196==    by 0x8059F55: address_list_from_addrinfo (host.c:206)
==20196==    by 0x805A8E1: lookup_host (host.c:793)
==20196==    by 0x804B369: connect_to_host (connect.c:365)
==20196==    by 0x805F740: gethttp (http.c:1445)
==20196==    by 0x8061A49: http_loop (http.c:2191)
==20196==    by 0x806DA87: retrieve_url (retr.c:667)
==20196==    by 0x806CA85: res_retrieve_file (res.c:546)
==20196==    by 0x806BAFB: download_child_p (recur.c:559)
==20196==    by 0x806B4CD: retrieve_tree (recur.c:347)
==20196==    by 0x8068216: main (main.c:941)
==20196==
==20196==
==20196== 2,048 bytes in 4 blocks are definitely lost in loss record 3 of 5
==20196==    at 0x40051F9: malloc (vg_replace_malloc.c:149)
==20196==    by 0x8077F89: checking_malloc (xmalloc.c:113)
==20196==    by 0x806D3C9: fd_read_hunk (retr.c:386)
==20196==    by 0x805DE55: read_http_response_head (http.c:486)
==20196==    by 0x805FD00: gethttp (http.c:1563)
==20196==    by 0x8061A49: http_loop (http.c:2191)
==20196==    by 0x806DA87: retrieve_url (retr.c:667)
==20196==    by 0x806CA85: res_retrieve_file (res.c:546)
==20196==    by 0x806BAFB: download_child_p (recur.c:559)
==20196==    by 0x806B4CD: retrieve_tree (recur.c:347)
==20196==    by 0x8068216: main (main.c:941)
==20196==
==20196==
==20196== 3,664 bytes in 18 blocks are still reachable in loss record 4 of 5
==20196==    at 0x40051F9: malloc (vg_replace_malloc.c:149)
==20196==    by 0x8077F89: checking_malloc (xmalloc.c:113)
==20196==    by 0x804D861: cookie_jar_new (cookies.c:93)
==20196==    by 0x8061539: http_loop (http.c:2010)
==20196==    by 0x806DA87: retrieve_url (retr.c:667)
==20196==    by 0x806B2C4: retrieve_tree (recur.c:266)
==20196==    by 0x8068216: main (main.c:941)
==20196==
==20196==
==20196== 12,533 bytes in 262 blocks are still reachable in loss record 5 of 5
==20196==    at 0x40051F9: malloc (vg_replace_malloc.c:149)
==20196==    by 0x16AD70: strdup (in /lib/libc-2.4.so)
==20196==    by 0x8078045: checking_strdup (xmalloc.c:160)
==20196==    by 0x804D220: register_download (convert.c:766)
==20196==    by 0x806DD75: retrieve_url (retr.c:770)
==20196==    by 0x806B2C4: retrieve_tree (recur.c:266)
==20196==    by 0x8068216: main (main.c:941)
==20196==
==20196== LEAK SUMMARY:
==20196==    definitely lost: 2,048 bytes in 4 blocks.
==20196==    possibly lost: 0 bytes in 0 blocks.
==20196==    still reachable: 16,247 bytes in 283 blocks.
==20196==    suppressed: 0 bytes in 0 blocks.
--20196-- memcheck: sanity checks: 39 cheap, 2 expensive
--20196-- memcheck: auxmaps: 0 auxmap entries (0k, 0M) in use
--20196-- memcheck: auxmaps: 0 searches, 0 comparisons
--20196-- memcheck: secondaries: 36 issued (2304k, 2M)
--20196-- memcheck: secondaries: 48 accessible and distinguished (3072k, 3M)
--20196-- tt/tc: 16,599 tt lookups requiring 17,360 probes
--20196-- tt/tc: 16,599 fast-cache updates, 4 flushes
--20196-- translate: new 6,886 (143,667 -> 2,411,073; ratio 167:10) [0 scs]
--20196-- translate: dumped 0 (0 -> ??)
--20196-- translate: discarded 227 (4,567 -> ??)
--20196-- scheduler: 1,990,219 jumps (bb entries).
--20196-- scheduler: 39/21,500 major/minor sched events.
--20196-- sanity: 40 cheap, 2 expensive checks.
--20196-- exectx: 30,011 lists, 340 contexts (avg 0 per list)
--20196-- exectx: 11,032 searches, 10,696 full compares (969 per 1000)
--20196-- exectx: 831 cmp2, 72 cmp4, 0 cmpAll
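
A note on reading loss record 3, the only "definitely lost" entry: it reports 4 blocks (2,048 bytes, so 512 bytes each) allocated in fd_read_hunk on behalf of read_http_response_head. The wget output above shows exactly four HTTP requests, all answered with 404, so the trace is consistent with the response-head buffer not being freed on that path. It cannot show whether a successful response would leak as well, and it does not identify the line in wget that drops the pointer. The C sketch below is not wget's code. It only illustrates the general pattern, assuming a reader function that returns a heap buffer owned by the caller; read_hunk, release_hunk, parse_status, fetch_leaky, fetch_fixed and the live_hunks counter are all hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter of buffers handed out and not yet released. */
static int live_hunks = 0;

/* Hypothetical stand-in for a hunk reader: copies src into a fresh
   heap buffer. The caller owns the result and must release it. */
static char *read_hunk(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src) + 1;
    char *hunk = malloc(len);
    if (hunk == NULL)
        return NULL;
    memcpy(hunk, src, len);
    live_hunks++;
    return hunk;
}

/* Frees a buffer obtained from read_hunk and updates the counter. */
static void release_hunk(char *hunk)
{
    if (hunk != NULL) {
        free(hunk);
        live_hunks--;
    }
}

/* Extracts NNN from a status line shaped like "HTTP/1.x NNN reason". */
static int parse_status(const char *head)
{
    const char *sp = strchr(head, ' ');
    return sp != NULL ? atoi(sp + 1) : -1;
}

/* Leaky variant: the early return for non-2xx statuses skips the
   release, so every error response loses one buffer. */
int fetch_leaky(const char *response)
{
    char *head = read_hunk(response);
    if (head == NULL)
        return -1;
    int status = parse_status(head);
    if (status < 200 || status >= 300)
        return status;          /* BUG: head is never released */
    release_hunk(head);
    return status;
}

/* Fixed variant: the buffer is released before any return. */
int fetch_fixed(const char *response)
{
    char *head = read_hunk(response);
    if (head == NULL)
        return -1;
    int status = parse_status(head);
    release_hunk(head);
    return status;
}
```

Calling fetch_leaky four times with a 404 status line leaves four unreleased buffers, which is the shape valgrind reports as "definitely lost" once the last pointer to each buffer goes out of scope. fetch_fixed leaves none. Whether wget's actual defect has this shape would need to be confirmed against the gethttp source at the line numbers in the trace.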
