Bug#334351: netplan: memory errors
tags 334351 + patch pending
thanks

I believe the attached patch solves the issue by dropping the use of
NGROUPS_MAX and instead counting the number of groups before
allocating the memory needed from the heap. Please test it and let me
know if it solves your problem. Unless you tell me it failed to work,
I'll upload a new version with this bug fixed in a few days.

Index: plan-1.9/src/netplan.h
===
RCS file: /home/pere/src/plan/plan-1.9/src/netplan.h,v
retrieving revision 1.1.1.2
diff -u -3 -p -r1.1.1.2 netplan.h
--- plan-1.9/src/netplan.h 9 Jan 2005 11:21:54 - 1.1.1.2
+++ plan-1.9/src/netplan.h 23 Dec 2005 23:13:48 -
@@ -56,7 +56,8 @@ struct client {
     char *descr; /* client-provided self-description */
     char *user; /* user name if authenticated */
     unsigned int uid; /* client-provided user/group ID */
-    int gids[NGROUPS_MAX];/* group access list */
+    int gidcount; /* number of in group entries */
+    int *gids; /* group access list */
     BOOL auth_fail; /* required identd check failed */
     int port; /* accept socket to client */
     struct sockaddr_in addr; /* contains IP addr and port */
Index: plan-1.9/src/netplan.c
===
RCS file: /home/pere/src/cvsroot/src/plan/plan-1.9/src/netplan.c,v
retrieving revision 1.1.1.3
diff -u -3 -p -r1.1.1.3 netplan.c
--- plan-1.9/src/netplan.c 19 Feb 2005 10:39:37 - 1.1.1.3
+++ plan-1.9/src/netplan.c 23 Dec 2005 23:13:48 -
@@ -330,12 +330,12 @@ int main(
         memset(cp, 0, sizeof(struct client));
         memcpy(&cp->addr, &addr, n);
         cp->uid = nobody_uid;
+        cp->gids = allocate(sizeof(*(cp->gids)));
+        cp->gidcount = 1;
         cp->gids[0] = nobody_gid;
         cp->descr = mystrdup(nobody);
         cp->user = mystrdup(nobody);
         cp->port = netplan_port[c];
-        for (i=1; i < NGROUPS_MAX; i++)
-            cp->gids[i] = -1;
         logger("client %d (%.200s) connected from %s\n"
             ,fd, cp->descr, ip_addr(&addr));
@@ -404,19 +404,29 @@ static void setup_gids(
     struct group *gr;
     int ngroups = 0;
     char **p;
+    int groupcount;
+    /* Unspecified if the default gid is included or not, so add
+       one just in case. */
+    groupcount = getgroups(0, NULL) + 1;
+    cp->gids = allocate(groupcount * sizeof(*(cp->gids)));
     cp->gids[ngroups++] = gid;
     setgrent();
     while ((gr = getgrent()))
         for (p=gr->gr_mem; *p; p++)
             if (!strcmp(user, *p)) {
-                if (ngroups < NGROUPS_MAX)
+                if (ngroups < groupcount)
                     cp->gids[ngroups] = gr->gr_gid;
                 ngroups++;
                 break;
             }
-    if (ngroups > NGROUPS_MAX)
-        logger("too many groups (%d, used %d)\n", ngroups,NGROUPS_MAX);
+    if (ngroups > groupcount) {
+        logger("too many groups (%d, used %d)\n", ngroups, groupcount);
+        cp->gidcount = groupcount;
+    }
+    else
+        cp->gidcount = ngroups;
+
 }
@@ -525,6 +535,7 @@ static void close_client(
     for (fid=0; fid < max_files; fid++)
         close_file(fid, fd);
+    release(c->gids);
     release(c->descr);
     release(c->user);
     release(c->in.data);
@@ -650,7 +661,7 @@ static void eval_message(
             "user %.100s, uid %d, gids",
             fd, c->user, c->uid);
         p = buf + strlen(buf);
-        for (i=0; i<NGROUPS_MAX && c->gids[i]>=0; i++){
+        for (i=0; i<c->gidcount && c->gids[i]>=0; i++){
             sprintf(p, " %d", c->gids[i]);
             p += strlen(p);
         }
@@ -684,7 +695,7 @@ static void eval_message(
         sprintf(buf, "client %d is %.200s (uid %d, gids",
             fd, arg, c->uid);
         p = buf + strlen(buf);
-        for (i=0; i < NGROUPS_MAX && c->gids[i] >= 0; i++) {
+        for (i=0; i < c->gidcount && c->gids[i] >= 0; i++) {
             sprintf(p, " %d", c->gids[i]);
             p += strlen(p);
         }
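The "count first, then allocate" idea in the message above can be
sketched on its own. This is only an illustration, not the code from
the patch: struct group_entry and collect_gids() are hypothetical
names, and an in-memory table stands in for the getgrent() loop so the
sketch has no dependency on the system group database. Note that the
patch sizes its buffer with getgroups(0, NULL), which counts the
supplementary groups of the calling process; the sketch instead counts
the named user's memberships directly, which is an assumption about
what the buffer needs to hold.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for one group database entry. */
struct group_entry {
    int gid;
    const char *const *members;   /* NULL-terminated list of user names */
};

/* Returns 1 if user is listed as a member of group g, else 0. */
static int is_member(const struct group_entry *g, const char *user)
{
    const char *const *p;

    for (p = g->members; *p; p++)
        if (!strcmp(user, *p))
            return 1;
    return 0;
}

/*
 * Pass 1 counts the groups the user belongs to, pass 2 fills a heap
 * array of exactly that size. Slot 0 always holds the primary gid.
 * Returns a malloc'd array (caller frees) or NULL if malloc fails.
 */
int *collect_gids(const struct group_entry *groups, size_t ngroups,
                  const char *user, int primary_gid, size_t *count)
{
    size_t i, n = 1;              /* 1 for the primary gid */
    int *gids;

    assert(groups != NULL || ngroups == 0);
    for (i = 0; i < ngroups; i++)
        if (is_member(&groups[i], user))
            n++;

    gids = malloc(n * sizeof(*gids));
    if (!gids)
        return NULL;

    gids[0] = primary_gid;
    n = 1;
    for (i = 0; i < ngroups; i++)
        if (is_member(&groups[i], user))
            gids[n++] = groups[i].gid;
    *count = n;
    return gids;
}
```

Because the array is sized from the same membership test that fills
it, no "too many groups" truncation branch is needed in this variant.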
Bug#334351: netplan: memory errors
[Petter Reinholdtsen]
> Can you try to apply this diff, to get the size of the memory
> allocation printed, and run the program again:

Doh. No need. I see from the backtrace that the number is very large
(268517376). Hm, could this be a signed/unsigned issue? Try to apply
this patch and see if the problem goes away:

--- netplan.c.~1.1.1.3.~ 2005-02-19 12:07:34.0 +0100
+++ netplan.c 2005-11-03 09:51:56.362527361 +0100
@@ -87,9 +87,9 @@
 const char *arg);
 void logger(char *, ...);
-void *allocate(int n)
+void *allocate(size_t n)
 {void *p = malloc(n); if (!p) fatal("no memory"); return(p);}
-void *reallocate(void *o, int n)
+void *reallocate(void *o, size_t n)
 {void *p = o ? realloc(o, n) : malloc(n); if (!p) fatal("no memory"); return(p);}
 void release(void *p)

Malloc takes size_t as an argument, not int.

It would also be great if you could try to apply this patch, and
report the output when running netplan as before.

--- netplan.c.~1.1.1.3.~ 2005-02-19 12:07:34.0 +0100
+++ netplan.c 2005-11-03 09:54:23.963155116 +0100
@@ -266,6 +266,8 @@
         }
     }
     nclients = sizeof(fd_set)*8; /* max # of clients */
+    printf("sizeof(fd_set)=%d, sizeof(struct client)=%d, NGROUPS_MAX=%d\n",
+           sizeof(fd_set),sizeof(struct client), NGROUPS_MAX);
     client_list = allocate(nclients * sizeof(struct client));
     memset(client_list, 0, nclients * sizeof(struct client));

-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]
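The point that malloc() takes a size_t can be illustrated with a
small wrapper that also guards the count * size multiplication. This
is a sketch, not netplan's allocate(): checked_size() and
allocate_array() are hypothetical names, and returning NULL (rather
than calling fatal()) is an assumption made to keep the example
self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Stores count * size in *total and returns 1, or returns 0 if the
 * product would not fit in a size_t.
 */
int checked_size(size_t count, size_t size, size_t *total)
{
    assert(total != NULL);
    if (size != 0 && count > SIZE_MAX / size)
        return 0;
    *total = count * size;
    return 1;
}

/*
 * Allocates room for count objects of size bytes each. The size is
 * carried as size_t all the way to malloc(), so it is never squeezed
 * through an int. Returns NULL on overflow or allocation failure.
 */
void *allocate_array(size_t count, size_t size)
{
    size_t total;

    if (!checked_size(count, size, &total))
        return NULL;
    return malloc(total ? total : 1);
}
```

A caller would use it as client_list = allocate_array(nclients,
sizeof(struct client)) and treat a NULL return as the "no memory"
case.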
Bug#334351: netplan: memory errors
The problem seems to trigger in this code in netplan.c function
main(). '==>' marks line 269.

    fd_set rd, wr, ex; /* returned fd masks from select */
    [...]
    nclients = sizeof(fd_set)*8; /* max # of clients */
==> client_list = allocate(nclients * sizeof(struct client));
    memset(client_list, 0, nclients * sizeof(struct client));

I have no sensible explanation why allocate(8 * sizeof(fd_set) *
sizeof(struct client)) should fail, as it is just a simple wrapper
around malloc(). The client struct does not seem to be very large
(unless NGROUPS_MAX is suddenly a very large number), and fd_set is
not very large either.

Can you try to apply this diff, to get the size of the memory
allocation printed, and run the program again:

--- netplan.c.~1.1.1.3.~ 2005-02-19 12:07:34.0 +0100
+++ netplan.c 2005-11-03 09:44:04.913433546 +0100
@@ -266,6 +266,8 @@
         }
     }
     nclients = sizeof(fd_set)*8; /* max # of clients */
+    printf("Trying to call allocate(%d)\n",
+           nclients * sizeof(struct client));
     client_list = allocate(nclients * sizeof(struct client));
     memset(client_list, 0, nclients * sizeof(struct client));
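A side note on the debugging diff above: the value being printed is
built from sizeof, so its type is size_t, and the matching printf
conversion in C99 is %zu rather than %d. The following is a minimal
sketch of the same message with that conversion; format_alloc_msg()
is a hypothetical helper and is not part of netplan.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Formats the "Trying to call allocate(...)" message into buf using
 * %zu, the C99 conversion for size_t. Returns what snprintf() returns:
 * the number of characters the full message needs, excluding the
 * terminating NUL.
 */
int format_alloc_msg(char *buf, size_t buflen,
                     size_t nclients, size_t client_size)
{
    assert(buf != NULL);
    return snprintf(buf, buflen, "Trying to call allocate(%zu)\n",
                    nclients * client_size);
}
```

On compilers without %zu support, casting the value to unsigned long
and printing it with %lu is the usual fallback.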
Bug#334351: netplan: memory errors
Petter Reinholdtsen <[EMAIL PROTECTED]> writes:

> [Dan Griswold]
>> Will do!
>
> Hm, the problem seems to happen in the forked-off child, and not in
> the parent process. Try running the same using netplan -f, to avoid
> the fork.
>
> To test in gdb, do this:
>
>   gdb src/netplan
>   break exit
>   run -f
>   bt

which yields:

(gdb) break exit
Breakpoint 1 at 0x804a181: file netplan.c, line 384.
(gdb) run -f
Starting program: /usr/src/plan-1.9/src/netplan -f
/usr/src/plan-1.9/src/netplan: fatal error: no memory, exiting.

Breakpoint 1, exit (ret=1) at netplan.c:384
384 if (!exiting) {
(gdb) bt
#0 exit (ret=1) at netplan.c:384
#1 0x0804ccce in fatal (fmt=0x804e9a7 "no memory") at netplan.c:1421
#2 0x08049553 in allocate (n=268517376) at netplan.c:91
#3 0x08049ae7 in main (argc=2, argv=0xb904) at netplan.c:269
(gdb)

-- 
Dan Griswold
Rochester, NY
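The n=268517376 in frame #2 can be taken apart with a little
arithmetic. The sketch below rests on two assumptions that the
backtrace itself does not state: that fd_set is 128 bytes on this
system (so nclients = sizeof(fd_set)*8 = 1024), and that NGROUPS_MAX
is 65536 with 4-byte ints. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Size of one client record implied by a total allocation request. */
size_t per_client_size(size_t total, size_t nclients)
{
    assert(nclients != 0);
    return total / nclients;
}

/* Bytes taken by a fixed gids[] array of ngroups_max ints. */
size_t gids_array_size(size_t ngroups_max, size_t int_size)
{
    return ngroups_max * int_size;
}

/*
 * Under the assumptions above:
 *   per_client_size(268517376, 1024) is 262224 bytes per client
 *   gids_array_size(65536, 4)        is 262144 bytes for gids[]
 * which would leave 80 bytes for every other member of struct client.
 */
```

If those assumptions hold, almost the whole request comes from the
gids[NGROUPS_MAX] member rather than from a signedness problem.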
Bug#334351: netplan: memory errors
[Dan Griswold]
> Will do!

Hm, the problem seems to happen in the forked-off child, and not in
the parent process. Try running the same using netplan -f, to avoid
the fork.

To test in gdb, do this:

  gdb src/netplan
  break exit
  run -f
  bt
Bug#334351: netplan: memory errors
Petter Reinholdtsen <[EMAIL PROTECTED]> writes:

> [Dan Griswold]
>>> Can you try with ltrace?
>>
>> Yields nothing obviously of value.
>
> Well, please send the last 50 lines anyway

Will do!

cantor:/usr/src/plan-1.9(root)# ltrace netplan
__libc_start_main(0x804bfb0, 1, 0xb914, 0x804deb0, 0x804df20 <unfinished ...>
getopt(1, 0xb914, "advfs") = -1
time(NULL) = 1129580481
ctime(0xb880) = "Mon Oct 17 16:21:21 2005\n"
getpwnam("netplan") = 0xb7fcb054
getuid() = 0
getuid() = 0
seteuid(0) = 0
getgid() = 0
getgid() = 0
setegid(0, 0xb8001508, 0xb898, 0x804c0d7, 0) = 0
getuid() = 0
setgid(63434) = 0
setegid(63434, 63434, 63434, 0xb8000cc0, 0) = 0
setuid(63434) = 0
seteuid(63434) = 0
getgid() = 63434
getegid() = 63434
getuid() = 63434
geteuid() = 63434
getegid() = 63434
geteuid() = 63434
getgid() = 63434
getuid() = 63434
access("/var/lib/plan/netplan.dir/.", 7) = 0
umask(077) = 022
getuid() = 63434
setuid(63434) = 0
getgid() = 63434
setgid(63434) = 0
fork(netplan: fatal error: no memory, exiting.
--- SIGCHLD (Child exited) ---
<... fork resumed> ) = 4377
sleep(1) = 0
_exit(0 <unfinished ...>
+++ exited (status 0) +++

> And send the last 50 strace lines as well,

Okay:

cantor:/home/dan(root)# strace netplan
execve("/usr/sbin/netplan", ["netplan"], [/* 40 vars */]) = 0
uname({sys="Linux", node="cantor", ...}) = 0
brk(0) = 0x8051000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=115669, ...}) = 0
old_mmap(NULL, 115669, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7fce000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300O\1"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1266800, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fcd000
old_mmap(NULL, 1276860, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7e95000
old_mmap(0xb7fc7000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x131000) = 0xb7fc7000
old_mmap(0xb7fcb000, 7100, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7fcb000
close(3) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7e94000
mprotect(0xb7fc7000, 4096, PROT_READ) = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7e946c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
munmap(0xb7fce000, 115669) = 0
time(NULL) = 1129580624
brk(0) = 0x8051000
brk(0x8072000) = 0x8072000
open("/etc/localtime", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=1267, ...}) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=1267, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fea000
read(3, "TZif\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0"..., 4096) = 1267
close(3) = 0
munmap(0xb7fea000, 4096) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 3
fcntl64(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(3) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 3
fcntl64(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(3)
Bug#334351: netplan: memory errors
[Dan Griswold]
>> Can you try with ltrace?
>
> Yields nothing obviously of value.

Well, please send the last 50 lines anyway, to let me know what is
going on when it fails. And send the last 50 strace lines as well,
though I do not expect them to tell me much.

> So, I guess I need to gpg --import . Yes?

No need to update your GPG keyring. The build "error" is expected and
non-fatal. The build is complete, and you can continue with the
instructions. :)
Bug#334351: netplan: memory errors
Petter Reinholdtsen <[EMAIL PROTECTED]> writes:

> Can you try to verify this by installing the old binary package and
> see if it gets rid of the problem?
>
> ftp://ftp.debian.org/debian/pool/main/p/plan/netplan_1.9-2_i386.deb

Verified. 1.9-2 runs, and I can access my served calendars from plan.
Good to know that this can be my workaround, if need be.

> Can you try with ltrace?

Yields nothing obviously of value.

>> valgrind reports the following in its summary:
> [...]
>
>> Several of the errors reported above that summary are "Conditional
>> jump or move depends on uninitialised value(s)"
>
> Those are the ones I am interested in. But it would be more useful if
> the binary was compiled with debugging symbols included. I'm not sure
> about your skill profile, so I'll make a stepwise list of instructions.

Good call. I welcome the instructions.

> apt-get source plan
> cd plan-1.9
> debuild

Failed at this point with these errors:

gpg: skipped "Petter Reinholdtsen <[EMAIL PROTECTED]>": secret key not available
gpg: [stdin]: clearsign failed: secret key not available
debsign: gpg error occurred! Aborting

So, I guess I need to gpg --import . Yes?

> Do you have any special netplan configuration?

I don't know if it's special, but I have 12 calendar files in the
calendar directory (/var/lib/plan/netplan.dir/). One of them is 30k,
the others are all less than 5k.

I also use the web interface, but it runs only when accessed (rare
here) and, from what I can grok (via grep), the cgi scripts do not
call netplan directly, but only by means of plan.

Many thanks,

Dan

-- 
Dan Griswold
Rochester, NY
Bug#334351: netplan: memory errors
[Dan Griswold]
> That's what I thought when I looked in the changelog. I upgrade very
> frequently, so my replaced version must be the immediately previous
> one, 1.9-2.

Can you try to verify this by installing the old binary package and
see if it gets rid of the problem?

  ftp://ftp.debian.org/debian/pool/main/p/plan/netplan_1.9-2_i386.deb

> strace doesn't appear to report much of value (although I could be
> mistaken).

Probably right. Can you try with ltrace?

> valgrind reports the following in its summary:
[...]

> Several of the errors reported above that summary are "Conditional
> jump or move depends on uninitialised value(s)"

Those are the ones I am interested in. But it would be more useful if
the binary was compiled with debugging symbols included. I'm not sure
about your skill profile, so I'll make a stepwise list of instructions.

  apt-get source plan
  cd plan-1.9
  debuild
  make -C src clean linux DEBUG=-g
  valgrind src/netplan

If I got this right, you should then have a netplan binary compiled
with debug information, running under valgrind.

> I don't know what I'm doing with gdb, so I'll hold off until
> instructed further.

Use the same binary as above, and run these commands:

  gdb src/netplan
  break exit
  run
  # It should now stop when exit() is called
  bt

The output from the 'bt' command should be the backtrace at the point
where it exits.

> Thank you so much for your help.

Thank you for your patience. I hope we can figure this one out. I had
a look at the code, and have no idea how your problem could have been
introduced by the recent changes. Do you have any special netplan
configuration?
Bug#334351: netplan: memory errors
Petter Reinholdtsen <[EMAIL PROTECTED]> writes:

> [Dan Griswold]
>> With the new upgrade, the netplan daemon ate up all the memory on the
>> system and caused the xserver and its associated tty to fail. This
>> required a reboot.
>
> Hm. I'm not aware of any changes which could give this result. What
> was the version number you used before this problem arrived?

That's what I thought when I looked in the changelog. I upgrade very
frequently, so my replaced version must be the immediately previous
one, 1.9-2.

> Can you try to run the program using strace and valgrind, to see what
> is going on? And perhaps run it under gdb to try to get a backtrace?

strace doesn't appear to report much of value (although I could be
mistaken).

valgrind reports the following in its summary:

==23228== ERROR SUMMARY: 32 errors from 18 contexts (suppressed: 0 from 0)
==23228== malloc/free: in use at exit: 688 bytes in 20 blocks.
==23228== malloc/free: 91 allocs, 70 frees, 11206 bytes allocated.
==23228== For counts of detected errors, rerun with: -v
==23228== searching for pointers to 20 not-freed blocks.
==23228== checked 78356 bytes.
==23228==
==23228== LEAK SUMMARY:
==23228==    definitely lost: 156 bytes in 11 blocks.
==23228==      possibly lost: 0 bytes in 0 blocks.
==23228==    still reachable: 532 bytes in 9 blocks.
==23228==         suppressed: 0 bytes in 0 blocks.
==23228== Use --leak-check=full to see details of leaked memory.
==23049==
==23049== ERROR SUMMARY: 32 errors from 18 contexts (suppressed: 0 from 0)
==23049== malloc/free: in use at exit: 156 bytes in 11 blocks.
==23049== malloc/free: 73 allocs, 62 frees, 7858 bytes allocated.
==23049== For counts of detected errors, rerun with: -v
==23049== searching for pointers to 11 not-freed blocks.
==23049== checked 69632 bytes.
==23049==
==23049== LEAK SUMMARY:
==23049==    definitely lost: 156 bytes in 11 blocks.
==23049==      possibly lost: 0 bytes in 0 blocks.
==23049==    still reachable: 0 bytes in 0 blocks.
==23049==         suppressed: 0 bytes in 0 blocks.
==23049== Use --leak-check=full to see details of leaked memory.

Several of the errors reported above that summary are "Conditional
jump or move depends on uninitialised value(s)"

I don't know what I'm doing with gdb, so I'll hold off until
instructed further.

Thank you so much for your help.

Dan Griswold
Rochester, NY
Bug#334351: netplan: memory errors
[Dan Griswold]
> With the new upgrade, the netplan daemon ate up all the memory on the
> system and caused the xserver and its associated tty to fail. This
> required a reboot.

Hm. I'm not aware of any changes which could give this result. What
was the version number you used before this problem arrived?

> Is there a work-around for this? I've really come to depend on this
> package.

You can perhaps try the version in sarge or etch? I'm not aware of
any workarounds.

Can you try to run the program using strace and valgrind, to see what
is going on? And perhaps run it under gdb to try to get a backtrace?
I have no idea what can have caused this.
Bug#334351: netplan: memory errors
Package: netplan
Version: 1.9-3
Severity: grave

With the new upgrade, the netplan daemon ate up all the memory on the
system and caused the xserver and its associated tty to fail. This
required a reboot.

Going to plan B, I tried apt-get source and compiling my own. It now
yields this error message when I try to start the daemon:

  Starting plan appointment daemon: netplan/usr/sbin/netplan: fatal error: no memory, exiting.

Is there a work-around for this? I've really come to depend on this
package.

-- System Information:
Debian Release: testing/unstable
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.10.050423
Locale: LANG=en_US, LC_CTYPE=en_US (charmap=ISO-8859-1)

Versions of packages netplan depends on:
ii  adduser  3.67.2   Add and remove users and groups
ii  libc6    2.3.5-7  GNU C Library: Shared libraries an

netplan recommends no packages.

-- no debconf information

-- 
Dan Griswold
Rochester, NY