kevent and unix dgram socket problem
Hi everyone--

I'm working on an application that uses kqueue to detect data arriving on a unix domain datagram socket, but kevents don't appear to be delivered when a datagram arrives. Using poll() for the same purpose appears to work fine. Also, if I switch the socket to the AF_INET domain, I see the correct behavior with kevent().

I distilled the problem into two files, included below. listen.cc creates a unix socket and blocks for data in a kevent() call. write.cc sends a brief message to the same unix socket.

I've seen the problem on 6-STABLE and 4.5-RELEASE. Anyone have any thoughts or comments?

Thanks,
Jason

// listen.cc

// (header names were lost in the archived copy; these are the headers
// the code below needs)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define LISTENQ 2
#define UN_PATH_LEN sizeof(((struct sockaddr_un*)0)->sun_path)

int main(int argc, char *argv[])
{
    // new socket
    int fd = socket(AF_LOCAL, SOCK_DGRAM, 0);
    assert(fd >= 0);

    // make sure there isn't something in its way
    unlink("usock");

    // create the local address, bind & listen
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_LOCAL;
    strncpy(addr.sun_path, "usock", UN_PATH_LEN - 1);
    assert(bind(fd, (sockaddr*) &addr, sizeof(sockaddr_un)) == 0);
    assert(listen(fd, LISTENQ) == 0);

    char buf[1024];
    int nread;

    // uncomment these lines to prove my socket is set up correctly
    // nread = read(fd, buf, sizeof(buf));
    // printf("read %d bytes\n", nread);

    int kqueueFD;
    kqueueFD = kqueue();
    struct kevent event;
    EV_SET(&event, fd, EVFILT_READ, EV_ADD, 0, 0, 0);
    assert(kevent(kqueueFD, &event, 1, 0, 0, 0) == 0);

    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    int r;
    // uncomment the following two lines to see poll behavior
    // while ((r = poll(&pfd, 1, INFTIM)) >= 0) {
    //     printf("poll returned %d\n", r);

    // uncomment the following two lines to see kqueue behavior
    while ((r = kevent(kqueueFD, 0, 0, &event, 1, 0)) >= 0) {
        printf("kevent returned %d\n", r);

        nread = read(fd, buf, sizeof(buf));
        printf("read %d bytes\n", nread);
    }
}

// write.cc

// (header names were lost in the archived copy; these are the headers
// the code below needs)
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define LISTENQ 2
#define UN_PATH_LEN sizeof(((struct sockaddr_un*)0)->sun_path)

int main(int argc, char *argv[])
{
    int fd = socket(AF_LOCAL, SOCK_DGRAM, 0);
    assert(fd >= 0);

    // create the local address & "connect"
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_LOCAL;
    strncpy(addr.sun_path, "usock", UN_PATH_LEN - 1);
    assert(connect(fd, (sockaddr*) &addr, sizeof(sockaddr_un)) == 0);

    const char *msg = "this is the message\n";
    write(fd, msg, strlen(msg));
}

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
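The poll() observation in the post above can be checked without running two programs. What follows is only a minimal single-process sketch, not the poster's code: the helper name poll_dgram_roundtrip and the socket path passed to it are hypothetical. It binds one unix datagram socket, sends a message to it from a second socket, and waits with poll() using a one-second timeout so it cannot hang. It does not call listen(), on the assumption that a connectionless datagram receiver does not need it.

```cpp
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

#include <cstring>

// Hypothetical helper (not from the original post): binds a unix datagram
// socket at `path`, sends `msg` to it from a second socket, waits for
// readability with poll(), and returns the number of bytes read.
// Returns -1 if any step fails or poll() times out.
long poll_dgram_roundtrip(const char *path, const char *msg)
{
    long result = -1;
    int rfd = socket(AF_LOCAL, SOCK_DGRAM, 0);  // receiver
    int wfd = socket(AF_LOCAL, SOCK_DGRAM, 0);  // sender

    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_LOCAL;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    unlink(path);  // remove a stale socket file, if any

    const ssize_t len = (ssize_t) strlen(msg);
    if (rfd >= 0 && wfd >= 0 &&
        bind(rfd, (struct sockaddr *) &addr, sizeof(addr)) == 0 &&
        sendto(wfd, msg, (size_t) len, 0,
               (struct sockaddr *) &addr, sizeof(addr)) == len) {
        struct pollfd pfd;
        pfd.fd = rfd;
        pfd.events = POLLIN;
        pfd.revents = 0;

        // Wait up to 1000 ms for the datagram to become readable.
        if (poll(&pfd, 1, 1000) == 1 && (pfd.revents & POLLIN)) {
            char buf[1024];
            result = (long) read(rfd, buf, sizeof(buf));
        }
    }

    if (rfd >= 0) close(rfd);
    if (wfd >= 0) close(wfd);
    unlink(path);  // clean up the socket file
    return result;
}
```

Calling poll_dgram_roundtrip("/tmp/usock_check", "this is the message\n") should return the length of the message when poll() reports the datagram. Swapping the poll() call for a kqueue registration and a kevent() wait would give a comparable one-file test of the kqueue path on FreeBSD.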
freebsd-5.4-stable panics
Hi--

I've been working on setting up a dual-CPU, dual-core Opteron 275 with freebsd-5.4-stable, but have been getting panics and reboots fairly consistently. I think the problem I'm seeing might be related to this discussion:

http://groups.google.com/group/lucky.freebsd.current/browse_thread/thread/6abaddffadebfdfe/f251a4874c2be3b1?lnk=st&q=freebsd+kernel+%22trap+9%22+closef&rnum=3&hl=en#f251a4874c2be3b1

but I can't be sure.

I have several applications (on the order of 10) that each receive and send multicast data (each listens to 6-12 multicast streams and broadcasts 1). They also log to disk the data they broadcast. These applications each join all the groups they listen to at startup, and never explicitly leave these groups. These applications process 500-5000 packets per second between them in our environment.

The machine usually panics after these applications have been up and running for 30 min to 6 hours. Several times the panic/reboot seems to have been triggered by an operation independent of these applications (copying a large file off the machine or moving a directory that contained the log files).

After the first few panics, we rebuilt the kernel with trace and debug options and have saved a few core files. There seem to be 2 types of crashes, with pretty different stack traces. What I'll call a type 1 crash, I believe, is often caused by one of the triggers I mention above. A type 2 crash appears to happen spontaneously after the machine has been running for a while.

I poked around using kgdb in a core file from a type 2 crash, and it appeared the system hung closing sockets (specifically cleaning up multicast state, I think) while cleaning up one of our multicast applications (note the trace through sys_exit). There's no reason this application should have been exiting unless it encountered some kind of error.
I'm attaching:
  dmesg.txt
  kernel-conf.txt (kernel config file)
  type1-core.txt (a kgdb bt from a type1/triggered crash)
  type2-core.txt (a kgdb bt from a type2/spontaneous crash)

I'm happy to dig for more information, recompile with different options, apply patches, or do anything else that might help get this problem diagnosed and fixed!

Thanks,
Jason Carroll

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
 The Regents of the University of California. All rights reserved.
FreeBSD 5.4-STABLE #1: Wed Sep 21 16:25:57 EDT 2005
 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/LOCAL-DEBUG
WARNING: WITNESS option enabled, expect reduced performance.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Dual Core AMD Opteron(tm) Processor 275 (2190.66-MHz K8-class CPU)
 Origin = "AuthenticAMD" Id = 0x20f12 Stepping = 2
 Features=0x178bfbff
 Features2=0x1
 AMD Features=0xe2500800,LM,3DNow+,3DNow>
 Hyperthreading: 2 logical CPUs
real memory = 3942580224 (3759 MB)
avail memory = 3805609984 (3629 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-27 on motherboard
ioapic2 irqs 28-31 on motherboard
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0: on acpi0
acpi_throttle0: on cpu0
cpu1: on acpi0
cpu2: on acpi0
cpu3: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pcib1: at device 6.0 on pci0
pci3: on pcib1
ohci0: mem 0xfeafc000-0xfeafcfff irq 19 at device 0.0 on pci3
usb0: OHCI version 1.0, legacy support
usb0: on ohci0
usb0: USB revision 1.0
uhub0: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 3 ports with 3 removable, self powered
ohci1: mem 0xfeafd000-0xfeafdfff irq 19 at device 0.1 on pci3
usb1: OHCI version 1.0, legacy support
usb1: on ohci1
usb1: USB revision 1.0
uhub1: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
pci3: at device 6.0 (no driver attached)
fxp0: port 0xbc00-0xbc3f mem 0xfeaa-0xfeab,0xfeafb000-0xfeafbfff irq 18 at device 8.0 on pci3
miibus0: on fxp0
inphy0: on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:e0:81:31:89:1b
isab0: at device 7.0 on pci0
isa0: on isab0
atapci0: port 0xffa0-0xffaf,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 7.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
pci0: at device 7.2 (no driver attached)
pci0: at device 7.3 (no driver attached)
pcib2: at device 10.0 on pci0
pci2: on pcib2
em0: port 0x8880-0x88bf mem 0xfc90-0xfc93,0xfc9c-0xfc9d irq 26 at device 2.0 on pci2
em0: Ethernet address: 00:04:23:ba