Re: My network is dead because of this program :(
On Tue, May 15, 2001 at 10:44:32PM -0400, Matthew Emmerton wrote:

[...]

> > After going to single user mode, cause I can't kill the offending program once it is running in multiuser mode (even kill -9 won't work ...
>
> Probably because the program is forking and you can't kill its children fast enough.
>
> > Can anyone help me trace what the program does? And how can I prevent the program to DoS my network interface? Even when the program is started by unprivileged user, it works, it DoS my network interface. Is this a bug?

[...]

Unfortunately it looks like the program forks, does its thing, and then each child forks too. There is a call to sleep, probably to introduce a delay so that things don't go completely crazy right away, but processes build up exponentially with each child process forking, so eventually resources get exhausted. Take a look at the attachment for more details.

-brian

--
Brian O'Shea
[EMAIL PROTECTED]

It looks like the program basically does this, in pseudo-code:

    main()
    {
        int pid;

        while (1) {
            pid = fork();
            if (pid == 0) {
                /* child process */
                socketpair();   /* get two AF_LOCAL sockets */
                setsockopt();   /* set receive buffer size on one socket */
                setsockopt();   /* set send buffer size on other socket */
                fcntl();        /* set non-blocking I/O on both sockets */
                fcntl();
                write();        /* write some data from one socket to the other */
                write();
            }
            /* else we're in the parent, or fork() failed */
            sleep();            /* sleep a while */
        }
        return;
    }

(gdb) disassemble main
Dump of assembler code for function main:
0x804865c main:       push   %ebp
0x804865d main+1:     mov    %esp,%ebp
0x804865f main+3:     sub    $0x32018,%esp
0x8048665 main+9:     nop

We're probably in an infinite loop here.

0x8048666 main+10:    movl   $0x0,0xfff4(%ebp)
0x804866d main+17:    lea    0x0(%esi),%esi
0x8048670 main+20:    cmpl   $0x12,0xfff4(%ebp)
0x8048674 main+24:    jle    0x8048678 main+28
0x8048676 main+26:    jmp    0x8048690 main+52

The first thing we do in the loop is call fork().

0x8048678 main+28:    call   0x80484c0 fork

Then we check its return value.

0x804867d main+33:    mov    %eax,%eax
0x804867f main+35:    test   %eax,%eax
0x8048681 main+37:    je     0x8048688 main+44
0x8048683 main+39:    jmp    0x8048690 main+52
0x8048685 main+41:    lea    0x0(%esi),%esi

It looks like the parent calls fork in a loop. The child processes that it creates continue.

0x8048688 main+44:    incl   0xfff4(%ebp)
0x804868b main+47:    jmp    0x8048670 main+20
0x804868d main+49:    lea    0x0(%esi),%esi
0x8048690 main+52:    add    $0xfff4,%esp

Sleep for 5 seconds ...

0x8048693 main+55:    push   $0x5
0x8048695 main+57:    call   0x8048490 sleep
0x804869a main+62:    add    $0x10,%esp
0x804869d main+65:    lea    0x0(%esi),%esi
0x80486a0 main+68:    jmp    0x80486a8 main+76
0x80486a2 main+70:    jmp    0x80487ac main+336
0x80486a7 main+75:    nop

Child calls socketpair with the following arguments:

    int socketpair(int domain, int type, int protocol, int *sv)

    int domain   = 0x1 (AF_LOCAL)
    int type     = 0x1 (SOCK_STREAM)
    int protocol = 0x0 (typically 0 for AF_LOCAL)
    int *sv      = address of an array of two file descriptors

Push arguments to socketpair onto stack and call socketpair:

0x80486a8 main+76:    lea    0xfff8(%ebp),%eax
0x80486ab main+79:    push   %eax
0x80486ac main+80:    push   $0x0
0x80486ae main+82:    push   $0x1
0x80486b0 main+84:    push   $0x1
0x80486b2 main+86:    call   0x80484e0 socketpair
0x80486b7 main+91:    add    $0x10,%esp
0x80486ba main+94:    mov    %eax,%eax

Note: It's strange that the address family is AF_LOCAL. I wouldn't think this would cause the problems that you are seeing with the xl0 device, unless AF_LOCAL sockets consume some of the same resources that this driver also consumes, and thus starve it of those resources. I don't know enough about FreeBSD to tell.

Looks like we're checking the return value of socketpair. The value 0x is -1, which is what socketpair returns if it fails.

0x80486bc main+96:    cmp    $0x,%eax
0x80486bf main+99:    jne    0x80486c8 main+108
0x80486c1 main+101:   jmp    0x80487ac main+336

If it fails, we jump ahead to a call to pause (below, at main+336).

0x80486c6 main+106:   mov    %esi,%esi
0x80486c8 main+108:   movl   $0x32000,0xfff4(%ebp)
0x80486cf main+115:   add    $0xfff4,%esp

Push arguments to setsockopt onto stack and call setsockopt:

0x80486d2 main+118:   push   $0x4
0x80486d4 main+120:   lea    0xfff4(%ebp),%eax
0x80486d7 main+123:   push   %eax
0x80486d8 main+124:   push   $0x1002
0x80486dd main+129:   push   $0x
0x80486e2 main+134:   mov    0xfff8(%ebp),%eax
0x80486e5 main+137:   push   %eax
0x80486e6 main+138
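For reference, the calls decoded from the child can be sketched in C for a single process. This is only an illustration under stated assumptions, not the original program's source: the function name `socketpair_roundtrip` is hypothetical, SOL_SOCKET and SO_RCVBUF are inferred from the pushed constants (0x1002 is SO_RCVBUF on FreeBSD), and there is deliberately no fork() and no loop, so it only moves a few bytes between two AF_LOCAL sockets and then closes them.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create one AF_LOCAL socket pair, request the given
 * buffer size, switch both ends to non-blocking I/O, and pass a short
 * message from one end to the other.
 * Returns the number of bytes read back, or -1 on any failure.
 */
int socketpair_roundtrip(int bufsize)
{
    int sv[2];
    char buf[16];
    ssize_t n = -1;

    assert(bufsize > 0);
    if (socketpair(AF_LOCAL, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* Same shape as the decoded setsockopt() call: a 4-byte int option.
     * Receive buffer on one socket, send buffer on the other. */
    if (setsockopt(sv[0], SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize)) == -1 ||
        setsockopt(sv[1], SOL_SOCKET, SO_SNDBUF, &bufsize, sizeof(bufsize)) == -1)
        goto out;

    /* Non-blocking I/O on both sockets, as in the pseudo-code. */
    if (fcntl(sv[0], F_SETFL, O_NONBLOCK) == -1 ||
        fcntl(sv[1], F_SETFL, O_NONBLOCK) == -1)
        goto out;

    /* Write on the sending end, then read on the receiving end. */
    if (write(sv[1], "hello", 5) != 5)
        goto out;
    n = read(sv[0], buf, sizeof(buf));

out:
    close(sv[0]);
    close(sv[1]);
    return (int)n;
}
```

Calling `socketpair_roundtrip(0x32000)` requests the same buffer size that the disassembly stores at main+108. Data written to an AF_LOCAL socket sits in kernel socket buffers until it is read, which may be relevant to the resource question raised in the note above, though that is a guess rather than something the listing proves.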
Re: new rc.network6 and rc.firewall6
On Wed, Oct 25, 2000 at 06:04:43AM +0700, Alexey Dokuchaev wrote:
> On Tue, 24 Oct 2000, David O'Brien wrote:
> > On Tue, Oct 24, 2000 at 04:23:40PM +0700, Alexey Dokuchaev wrote:
> > > Why can't I simply write kill -1 `cat /var/run/sendmail.pid`?
> >
> > What about daemons that don't understand `kill -HUP'? Sendmail didn't until very recently. ``/etc/rc.d/some-daemon restart'' does the right thing regardless of how involved that might be.
>
> Though I see your point, actually, many UNIX books, including some pretty old ones, refer to sending HUP signal as standard way of restarting/resetting daemons.

Using the `kill -HUP` method, how do you deal with the dependency issues that people have been mentioning in this thread?

-brian

--
Brian O'Shea
[EMAIL PROTECTED]

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
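To make the `kill -1 `cat pidfile`` idiom concrete, here is a rough C equivalent. It is a sketch under assumptions: `parse_pid` and `hup_from_pidfile` are hypothetical names, only the first line of the pidfile is assumed to hold the PID, and nothing here addresses daemons that ignore SIGHUP or the dependency question raised above.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Parse the text of a pidfile line. Returns the PID, or -1 if the text is
 * not a positive decimal number optionally followed by whitespace. */
pid_t parse_pid(const char *text)
{
    char *end;
    long value;

    assert(text != NULL);
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || value <= 0)
        return -1;
    while (*end == ' ' || *end == '\t' || *end == '\n')
        end++;
    return (*end == '\0') ? (pid_t)value : -1;
}

/* Hypothetical helper: read the first line of a pidfile and send SIGHUP
 * to that process. Returns 0 on success, -1 on error. */
int hup_from_pidfile(const char *path)
{
    char line[32];
    FILE *fp = fopen(path, "r");
    pid_t pid;

    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof(line), fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    pid = parse_pid(line);
    if (pid == -1)
        return -1;
    return kill(pid, SIGHUP);   /* same effect as kill -1 <pid> */
}
```

Rejecting anything that is not a positive number matters here because kill() treats 0 and negative values as process groups rather than single processes.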
Re: current hangs when boot
(Yikes, my message turned out to be a bit long, sorry)

I did a little poking around. I'm running -current as of last Saturday:

# uname -a
FreeBSD panic.localdomain 5.0-CURRENT FreeBSD 5.0-CURRENT #0: Sat Oct 21 22:20:11 PDT 2000 [EMAIL PROTECTED]:/usr/obj/usr/local/cvs up/current/src/sys/PANIC i386

On Mon, Oct 23, 2000 at 12:27:25AM +, Bigbear wrote:
> i update my system from 4.1 to current, when system boot, it hangs when:
> start elf ldconfig: /usr/lib /usr/lib/compat /usr/X11R6/lib
> why?

I am also having this problem. If you interrupt it (with ^\ to send SIGQUIT), ldconfig generates a core. Then ldconfig will hang while setting a.out ldconfig path:

^Csetting a.out ldconfig path: /usr/lib/aout /usr/lib/compat/aout

This can be interrupted too, and then it hangs while starting sshd. Interrupting sshd allows the boot to proceed. I got a core from each program during the hang, and here's what I found.

Here's the backtrace from the core obtained from ldconfig (rebuilt with -g) the first time around:

(starting elf ldconfig)
(gdb) bt
#0  0x8054340 in read ()
#1  0x804c966 in mktemp ()
#2  0x804ca33 in arc4random_stir ()
#3  0x804cad9 in arc4random ()
#4  0x804c791 in mktemp ()
#5  0x804c692 in mkstemp ()
#6  0x804886a in write_elf_hints ()
#7  0x8048818 in update_elf_hints ()
#8  0x8048c61 in main ()
#9  0x8048139 in _start ()

And the second time around:

(setting a.out ldconfig path)
(gdb) bt
#0  0x8054340 in read ()
#1  0x804c966 in mktemp ()
#2  0x804ca33 in arc4random_stir ()
#3  0x804cad9 in arc4random ()
#4  0x804c791 in mktemp ()
#5  0x804c692 in mkstemp ()
#6  0x8049590 in buildhints ()
#7  0x8048e39 in main ()
#8  0x8048139 in _start ()

And from sshd:

(gdb) bt
#0  0x28208784 in read () from /usr/lib/libc.so.4
#1  0x282081ce in __sread () from /usr/lib/libc.so.4
#2  0x281f67a6 in __srefill () from /usr/lib/libc.so.4
#3  0x281f23bd in fread () from /usr/lib/libc.so.4
#4  0x281217c1 in RAND_SSLeay () from /usr/lib/libcrypto.so.1
#5  0x28121869 in RAND_SSLeay () from /usr/lib/libcrypto.so.1
#6  0x281212cc in RAND_bytes () from /usr/lib/libcrypto.so.1
#7  0x28146099 in DSA_OpenSSL () from /usr/lib/libcrypto.so.1
#8  0x28146151 in BN_rand () from /usr/lib/libcrypto.so.1
#9  0x280e4561 in BN_is_prime_fasttest () from /usr/lib/libcrypto.so.1
#10 0x280e3e03 in BN_generate_prime () from /usr/lib/libcrypto.so.1
#11 0x280da4a8 in RSA_generate_key () from /usr/lib/libcrypto.so.1
#12 0x8059437 in getsockname ()
#13 0x804c35b in getsockname ()
#14 0x804b76d in getsockname ()

All three backtraces end in read(), and in each case the caller is gathering random data (arc4random_stir for ldconfig, RAND_bytes for sshd).

Running ldconfig manually, 'top' shows ldconfig sleeping on 'rndblk':

  PID USERNAME PRI NICE  SIZE   RES STATE    TIME   WCPU    CPU COMMAND
...
  228 root      46    0  216K  104K rndblk   0:00  0.00%  0.00% ldconfig

More investigation:

# fstat /dev/urandom
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W NAME
root     ldconfig     228    3 /          7973 crw-r--r--  urandom  r /dev/urandom

# ps auxw | grep ldconfig
root 228 0.0 0.4 216 104 d0 I 2:18AM 0:00.00 ldconfig -elf /usr/lib

This commit from Peter Wemm on Oct 18 might shed some light:

peter       2000/10/18 03:39:18 PDT

  Modified files:
    sys/dev/random       randomdev.c
  Log:
  Attempt to fix the random read blocking. The old code slept at priority "0" and without PCATCH, so it was uninterruptable. And even when it did wake up after entropy arrived, it exited after the wakeup without actually reading the freshly arrived entropy.

  I sent this to Mark before but it seems he is in transit. Mark: feel free to replace this if it gets in your way.

  Revision  Changes    Path
  1.16      +14 -15    src/sys/dev/random/randomdev.c

Maybe this is a related problem (except now random read blocking is interruptible?)

--
Brian O'Shea
[EMAIL PROTECTED]
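One way to check from userland whether a read of /dev/urandom would sleep, without getting stuck the way ldconfig and sshd do above, is to open the device non-blocking and look at the result. This is a minimal sketch, not a fix: `probe_urandom` is a hypothetical name, and it assumes the random device honors O_NONBLOCK, which I have not verified for the -current driver discussed here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical probe: try to read len bytes from /dev/urandom without
 * sleeping. Returns the byte count read, -2 if the read would have
 * blocked (EAGAIN), or -1 on any other error. */
ssize_t probe_urandom(void *buf, size_t len)
{
    ssize_t n;
    int fd;

    assert(buf != NULL && len > 0);
    fd = open("/dev/urandom", O_RDONLY | O_NONBLOCK);
    if (fd == -1)
        return -1;

    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);   /* retry if a signal interrupted us */

    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        n = -2;
    close(fd);
    return n;
}
```

A return of -2 would correspond to the situation where a blocking reader ends up asleep on the device, as in the 'rndblk' state shown by top.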
Re: new rc.network6 and rc.firewall6
On Mon, Oct 23, 2000 at 01:05:27AM -0700, David O'Brien wrote:
> On Sun, Oct 22, 2000 at 09:41:51PM -0400, Bill Vermillion wrote:
> > One of the reasons for the numbers in the SysVR4 arena is to set the order of execution so programs which others depend upon are executed first. How does NetBSD solve this problem?
>
> Very coolly. The main rc script runs a script named `rcorder' to generate the proper order. rc.shutdown also uses `rcorder' but reverses the ordering. Two examples are included below to show what `rcorder' uses to generate the list.
>
> These NetBSD rc files also provide "start", "stop", "restart", "status", etc. commands to assist the sysadmin. Again, *very* slick and still quite BSD-like.

Sounds interesting. To add a new rc script to the system, do you have to add an entry to an "rc order list" somewhere (in addition to adding the new script)? How is that handled?

The nice (or clumsy, depending on your point of view) part about the SysV way is that the order in which the rc scripts are executed is implicit in the scripts' names. Of course, they have added a symlink maze (worse, hard links on HP-UX) on top of that, making it tedious to maintain rc scripts by hand (maybe that was by design).

[snip]

--
Brian O'Shea
[EMAIL PROTECTED]
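The idea of deriving the start order from declared dependencies, rather than from file names, can be sketched as a small dependency sort. This is not rcorder's actual algorithm or input format, which I have not looked at; `order_scripts`, `MAX_SCRIPTS`, and the `needs` matrix are all hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SCRIPTS 16

/*
 * needs[i][j] != 0 means script i must start after script j.
 * Fills order[] with script indexes in a valid start order and returns
 * how many were placed. A result smaller than n means the remaining
 * scripts depend on each other in a cycle.
 */
size_t order_scripts(size_t n, int needs[][MAX_SCRIPTS], size_t order[])
{
    int done[MAX_SCRIPTS] = {0};
    size_t emitted = 0;
    int progress = 1;

    assert(n <= MAX_SCRIPTS);
    while (emitted < n && progress) {
        progress = 0;
        for (size_t i = 0; i < n; i++) {
            int ready = 1;

            if (done[i])
                continue;
            /* A script is ready once everything it needs has been placed. */
            for (size_t j = 0; j < n; j++)
                if (needs[i][j] && !done[j])
                    ready = 0;
            if (ready) {
                done[i] = 1;
                order[emitted++] = i;
                progress = 1;
            }
        }
    }
    return emitted;
}
```

With this kind of scheme, adding a script means declaring what it needs rather than picking a number for its name; walking order[] backwards gives a shutdown order, in the spirit of the rc.shutdown behaviour described above.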
Re: new rc.network6 and rc.firewall6
On Mon, Oct 23, 2000 at 05:07:42PM -0400, Brandon D. Valentine wrote:
> On Mon, 23 Oct 2000, Brian O'Shea wrote:
> > Sounds interesting. To add a new rc script to the system, do you have to add an entry to an "rc order list" somewhere (in addition to adding the new script)? How is that handled?
> >
> > The nice (or clumsy, depending on your point of view) part about the SysV way is that the order in which the rc scripts are executed is implicit in the scripts' names. Of course, they have added a symlink maze (worse, hard links on HP-UX) on top of that, making it tedious to maintain rc scripts by hand (maybe that was by design).
>
> Hmm I don't have any NetBSD machines running the later 1.5 revisions yet, so I've not seen the new scripts, but I would say that adding a new script to a list of rc files would be much less hassle than adding an entry in a monolithic /etc/rc to process that new file.

I agree. However, I was comparing it to the SysV rc script format, not to the existing BSD rc scripts.

-brian

--
Brian O'Shea
[EMAIL PROTECTED]
Re: procfs info.
On Fri, Sep 29, 2000 at 08:49:06PM +, [EMAIL PROTECTED] wrote:
> You wrote:
> > I need to know the exact format of the /proc/*/cmdline of FreeBSD. Actually I'm using 4.1 and I have discovered that at the end of cmdline file there are always 2 NULL characters.
>
> I'm not seeing that on my 4.x-stable system from about a month ago:

Hmm, but look at this:

[panic:/root]# uname -a
FreeBSD panic.localdomain 5.0-CURRENT FreeBSD 5.0-CURRENT #0: Sat Sep 16 16:24:39 PDT 2000 [EMAIL PROTECTED]:/usr/obj/usr/local/cvs up/current/src/sys/PANIC i386
[panic:/root]# hd /proc/0/cmdline
00000000  73 77 61 70 70 65 72 00  00 00 00 00 00 00 00 00  |swapper.........|
00000010
[panic:/root]# hd /proc/10/cmdline
00000000  69 64 6c 65 00 65 72 00  00 00 00 00 00 00 00 00  |idle.er.........|
00000010
[panic:/root]# hd /proc/11/cmdline
00000000  73 6f 66 74 69 6e 74 65  72 72 75 70 74 00 00 00  |softinterrupt...|
00000010
[panic:/root]# hd /proc/12/cmdline
00000000  69 72 71 31 34 3a 20 61  74 61 30 00 00 00 00 00  |irq14: ata0.....|
00000010
[panic:/root]# hd /proc/13/cmdline
00000000  69 72 71 31 35 3a 20 61  74 61 31 00 00 00 00 00  |irq15: ata1.....|
00000010
[panic:/root]# hd /proc/14/cmdline
00000000  69 72 71 31 31 3a 20 75  68 63 69 30 2b 00 00 00  |irq11: uhci0+...|
00000010
[panic:/root]# hd /proc/15/cmdline
00000000  69 72 71 36 3a 20 66 64  63 30 00 00 00 00 00 00  |irq6: fdc0......|
00000010
[panic:/root]# hd /proc/16/cmdline
00000000  69 72 71 31 3a 20 61 74  6b 62 64 30 00 00 00 00  |irq1: atkbd0....|
00000010

There seem to be lots of nulls at the end of the names of kernel threads (padding their names to 16 bytes). Not that it matters, but it's strange.

-brian

--
Brian O'Shea
[EMAIL PROTECTED]
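If you are writing a parser for these files, one defensive approach is to treat the contents as a run of NUL-terminated strings and stop at the first empty one, so trailing padding is ignored. The sketch below assumes that layout; `count_fields` is a hypothetical name, and I have not checked what procfs actually guarantees about the format.

```c
#include <assert.h>
#include <stddef.h>

/* Count the NUL-terminated strings at the start of buf, stopping at the
 * first empty string (two NULs in a row) or at the end of the buffer. */
size_t count_fields(const unsigned char *buf, size_t len)
{
    size_t count = 0;
    size_t i = 0;

    assert(buf != NULL);
    while (i < len && buf[i] != '\0') {
        count++;
        while (i < len && buf[i] != '\0')   /* skip the string body */
            i++;
        if (i < len)
            i++;                            /* skip its terminating NUL */
    }
    return count;
}
```

Fed the 16 bytes shown for /proc/0/cmdline, this sees one field ("swapper"). Fed the bytes for /proc/10/cmdline it sees two ("idle" and a stale "er"), so a parser that only wants the name of a kernel thread should probably stop at the first NUL instead.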
Re: strange freeze while starting kde2 :(
On Sun, Aug 13, 2000 at 11:09:25AM +0400, Ilmar S. Habibulin wrote:
> While starting kde2 beta my pc freezes and i have to push power off button. After reboot i had to run fsck, because of "strange inconsistency". Some files (created by kde startup) were broken and contain corrupted data. Kernel doesn't panic, it just freezes. How can i examine this situation more detailed? Can anybody help?

Are you certain that the kernel is hanging and not just the graphics display? Can you try logging in remotely over the network, or possibly on a serial port?

Also, please include:

- A description of the hardware you are using
- When you last updated -current sources

Thanks,
-brian

--
Brian O'Shea
[EMAIL PROTECTED]
Re: Re[2]: Journaling Filesystem ?
On Sun, Jul 23, 2000 at 03:28:07PM -0700, Kris Kennaway wrote:
> On Sun, 23 Jul 2000, Joe McGuckin wrote:
> > The big win with a journaling FS is when you have to reboot the system. With Softupdates, you still have to fsck. On a large FS (say half a terabyte) that can take hours.
>
> No you don't. Your filesystem will be in a consistent state except for blocks which are marked used but are not, so you can fsck in the background at the expense of not having all of your free space available at startup.
>
> Having said that, I don't know that this procedure has been well tested in practise, so you're advised to use caution when testing it :-)

I didn't even know that background fsck was supported at all. I remember hearing Kirk talk about it as a future feature at FreeBSD CON last year, but I haven't heard anything about it since. How do you use it?

Thanks,
-brian

p.s. Forgive me if this is well documented in -CURRENT. At the moment, the latest version of FreeBSD that I have available to me is 4.1-RC (cvsup from July 21) and I can't find any mention of it.

--
Brian O'Shea
[EMAIL PROTECTED]
Panic in boot after flushing buffers
Hello,

I am running -CURRENT from June 27, 2000 (started cvsup around 19:05) on a PII 266 MHz with 32MB RAM and one IDE disk. Initially, I noticed that while syncing disks during a reboot, the system would always give up before finishing. To capture the output, I configured the kernel to use a serial console by setting flags for the serial port in the hints file (hint.sio.0.flags="0xb0"). Now, instead of just failing to sync the disks, the system panics about two out of every three reboots.

The kernel config file (MONSTER) is included as an attachment, as well as the hints file. Below is the panic information and stack trace. Let me know if you would like any more information (this is my first crack at running -CURRENT, so I'm new at this).

Regards,
-brian

System shutdown time has arrived
Shutting down daemon processes: .
Waiting (max 60 seconds) for system process `bufdaemon' to stop...stopped
Waiting (max 60 seconds) for system process `syncer' to stop...stopped

syncing disks...

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xc090b5bd
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc014c638
stack pointer           = 0x10:0xc3b66f0c
frame pointer           = 0x10:0xc3b66f20
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1 (init)
interrupt mask          = none
panic: from debugger
panic: from debugger
Uptime: 11m4s

dumping to dev #ad/0x20001, offset 65536
dump ata0: resetting devices .. done
32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
---
#0  boot (howto=260) at ../../kern/kern_shutdown.c:303
303             dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=260) at ../../kern/kern_shutdown.c:303
#1  0xc014cbd5 in panic (fmt=0xc02656f4 "from debugger") at ../../kern/kern_shutdown.c:553
#2  0xc011f479 in db_panic (addr=-1072380360, have_addr=0, count=1, modif=0xc3b66d78 "") at ../../ddb/db_command.c:433
#3  0xc011f419 in db_command (last_cmdp=0xc0294b78, cmd_table=0xc02949d8, aux_cmd_tablep=0xc02b4880) at ../../ddb/db_command.c:333
#4  0xc011f4de in db_command_loop () at ../../ddb/db_command.c:455
#5  0xc012169b in db_trap (type=12, code=0) at ../../ddb/db_trap.c:71
#6  0xc0244626 in kdb_trap (type=12, code=0, regs=0xc3b66ecc) at ../../i386/i386/db_interface.c:158
#7  0xc0252698 in trap_fatal (frame=0xc3b66ecc, eva=3230709181) at ../../i386/i386/trap.c:922
#8  0xc0252371 in trap_pfault (frame=0xc3b66ecc, usermode=0, eva=3230709181) at ../../i386/i386/trap.c:820
#9  0xc0251f2b in trap (frame={tf_fs = 16, tf_es = 16, tf_ds = 16, tf_edi = -1011454080, tf_esi = 1, tf_ebp = -1011454176, tf_isp = -1011454216, tf_ebx = -1064258240, tf_edx = 160160, tf_ecx = -1070796288, tf_eax = 455, tf_trapno = 12, tf_err = 0, tf_eip = -1072380360, tf_cs = 8, tf_eflags = 66050, tf_esp = -1011479040, tf_ss = 1}) at ../../i386/i386/trap.c:426
#10 0xc014c638 in boot (howto=0) at ../../kern/kern_shutdown.c:234
#11 0xc014c40c in reboot (p=0xc3b60e00, uap=0xc3b66f80) at ../../kern/kern_shutdown.c:146
#12 0xc0252971 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1077936612, tf_esi = -1077936624, tf_ebp = -1077936836, tf_isp = -1011453996, tf_ebx = -1077936732, tf_edx = -1, tf_ecx = 4, tf_eax = 55, tf_trapno = 7, tf_err = 2, tf_eip = 134536452, tf_cs = 31, tf_eflags = 643, tf_esp = -1077937056, tf_ss = 47}) at ../../i386/i386/trap.c:1126
#13 0xc0244f65 in Xint0x80_syscall ()
#14 0x80486ee in ?? ()
#15 0x8048478 in ?? ()
#16 0x8048139 in ?? ()

--
Brian O'Shea
[EMAIL PROTECTED]

#
# MONSTER -- Based on the GENERIC kernel configuration file
#

machine         i386
cpu             I686_CPU
ident           MONSTER
maxusers        32

hints           "MONSTER.hints"         #Default places to look for devices.

makeoptions     DEBUG=-g                #Build kernel with gdb(1) debug symbols

options         MATH_EMULATE            #Support for x87 emulation
options         INET                    #InterNETworking
options         INET6                   #IPv6 communications protocols
options         FFS                     #Berkeley Fast Filesystem
options         FFS_ROOT                #FFS usable as root device [keep this!]
options         SOFTUPDATES             #Enable FFS soft updates support
options         MFS                     #Memory Filesystem
options         MD_ROOT                 #MD is a potential root device
options         NFS                     #Network Filesystem
options         NFS_ROOT                #NFS usable as root device, NFS required
options         MSDOSFS                 #MSDOS Filesystem
options         CD96
Re: roots shell == /bin/sh please
On Thu, Jun 29, 2000 at 01:11:39PM -0700, R Joseph Wright wrote:
> Speaking of csh and tcsh, I noticed that /bin/csh is hard linked to /bin/tcsh, yet when I invoke tcsh, I get a different prompt than when I invoke csh. I find this rather odd.

When invoked as tcsh, the shell behaves like tcsh. This is a common technique (check out ex, nex, nvi, nview, vi, and view, for examples; all are hard links to the same file). The program checks its argv[0] and behaves differently depending on what it is set to.

-brian

--
Brian O'Shea
[EMAIL PROTECTED]
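In case the argv[0] trick is unfamiliar, here is roughly what such a check looks like in C. This is a generic sketch, not the actual tcsh source: `invoked_as` and `wants_tcsh_mode` are hypothetical names, and the real shell may decide differently.

```c
#include <assert.h>
#include <string.h>

/* Return the name a program was invoked under: argv[0] with any leading
 * directory components removed. */
const char *invoked_as(const char *argv0)
{
    const char *slash;

    assert(argv0 != NULL);
    slash = strrchr(argv0, '/');
    return slash ? slash + 1 : argv0;
}

/* Hypothetical personality switch: 1 for tcsh behaviour, 0 for csh. */
int wants_tcsh_mode(const char *argv0)
{
    const char *name = invoked_as(argv0);

    if (*name == '-')        /* login shells are started with a leading '-' */
        name++;
    return strcmp(name, "tcsh") == 0;
}
```

A program built this way would call `wants_tcsh_mode(argv[0])` early in main() and pick its defaults (the prompt, for example) from the result, which is why two hard links to the same file can behave differently.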