Re: bind error when using SO_REUSEPORT(implies SO_REUSEADDR)
On Tue, Jul 16, 2013 at 11:12:46AM -0400, John Baldwin wrote:
> On Thursday, March 15, 2012 8:07:46 pm Sean Bruno wrote:
> > On Thu, 2012-03-15 at 16:59 -0700, Sean Bruno wrote:
> > > Hey, I just found a bind bug ticket in my queue about bind. I noted
> > > that on stable/6 stable/7 stable/9 head the referenced code fails.
> > >
> > > It seems that this is a problem, but I have no idea if it's a real
> > > problem or not. Our devs think it is.
> > >
> > > Anyway, here is a code snippet to show the failure in bind. On
> > > linux/solaris this does not fail.
> > >
> > > http://people.freebsd.org/~sbruno/bind_test.c
> > >
> > > simple compile with gcc -o test test.c and run as normal user.
> > >
> > > Sean
> >
> > this is bind() not bind ... :-)
>
> Did the recent commit to HEAD fix this btw?

As far as I can see, bind_test.c does not expose any bug in FreeBSD; it
only shows that FreeBSD and Linux behave differently.

On FreeBSD the test output is:

serversock addr is 127.0.0.1:27539
dup bind: Address already in use
This error was expected, tried to bind to used addr/port
BUG: binding duplicate socket to server port succeeded
dup2sock addr is 0.0.0.0:27539
overlapping explicit bind to same port number succeeded without SO_REUSEPORT
listen succeeded after explicitly overlapping port bind
autosock addr is 0.0.0.0:27539
bug triggered, port number conflict on sockets without SO_REUSEPORT
listen succeded after implicitly overlapping port bind

So, the first socket (serversock) is bound to the loopback address, and
then the test tries several combinations of binding a second socket to
the same port but to the wildcard address. When the SO_REUSEADDR socket
option is set, binding to the wildcard address succeeds on FreeBSD (and
fails on Linux).

The test calls this a bug in FreeBSD, but this is well-known and expected
behavior (see e.g. Stevens' TCP/IP Illustrated Vol1).

Or did I miss the test's point?

--
Mikolaj Golub
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
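The scenario described in the message above can be reproduced with a few lines of C. The following is only a minimal sketch, not the referenced bind_test.c: the helper names (bound_socket, local_port, wildcard_overlap_result) are made up for illustration, and setting SO_REUSEADDR on both sockets is an assumption about what the test does. It binds one listening socket to 127.0.0.1 on an ephemeral port and then tries to bind a second socket to the wildcard address on the same port.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a TCP socket, optionally set SO_REUSEADDR, and bind it to
 * addr:port (both given in host byte order).  Returns the descriptor,
 * or -1 with errno set by the failing call.
 */
static int
bound_socket(in_addr_t addr, in_port_t port, int reuseaddr)
{
	struct sockaddr_in sin;
	int s, saved;

	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == -1)
		return (-1);
	if (reuseaddr && setsockopt(s, SOL_SOCKET, SO_REUSEADDR,
	    &reuseaddr, sizeof(reuseaddr)) == -1)
		goto fail;
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(addr);
	sin.sin_port = htons(port);
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		goto fail;
	return (s);
fail:
	saved = errno;
	close(s);
	errno = saved;
	return (-1);
}

/* Local port of a bound socket in host byte order, or 0 on error. */
static in_port_t
local_port(int s)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);

	if (getsockname(s, (struct sockaddr *)&sin, &len) == -1)
		return (0);
	return (ntohs(sin.sin_port));
}

/*
 * Returns 0 if the overlapping wildcard bind succeeded, the errno value
 * if it failed, or -1 if the loopback socket could not be set up.
 */
static int
wildcard_overlap_result(void)
{
	int first, second, result;
	in_port_t port;

	first = bound_socket(INADDR_LOOPBACK, 0, 1);
	if (first == -1)
		return (-1);
	port = local_port(first);
	if (port == 0 || listen(first, 1) == -1) {
		close(first);
		return (-1);
	}
	second = bound_socket(INADDR_ANY, port, 1);
	result = (second == -1) ? errno : 0;
	if (second != -1)
		close(second);
	close(first);
	return (result);
}
```

Going by the message above, one would expect wildcard_overlap_result() to return 0 on FreeBSD and EADDRINUSE on Linux, but the sketch deliberately does not hard-code either outcome.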
Re: kern/179901: [netinet] [patch] Multicast SO_REUSEADDR handled incorrectly
The following reply was made to PR kern/179901; it has been noted by GNATS.

From: Mikolaj Golub <troc...@freebsd.org>
To: bug-follo...@freebsd.org
Cc: Michael Gmelin <free...@grem.de>
Subject: Re: kern/179901: [netinet] [patch] Multicast SO_REUSEADDR handled incorrectly
Date: Sun, 30 Jun 2013 10:17:05 +0300

On Thu, Jun 27, 2013 at 11:00:16PM +0300, Mikolaj Golub wrote:

> I don't insist on maintaining the old behaviour. But as actually we
> have 2 issues here (regression introduced by me in FreeBSD9 and
> historical behavior that looks wrong), with different priority, I
> would like to fix the issues separately. This way it will be easier to
> track the changes, e.g. when after a year it turns out that the second
> change has broken some other case.

Here is a patch for the second issue.

--
Mikolaj Golub

Attachment: pr179901.2.1.patch (text/x-diff)

commit 7cf3a6a95d74ae91c80350fc1ae8e96fe59c3c65
Author: Mikolaj Golub <troc...@freebsd.org>
Date:   Sun Jun 30 00:09:20 2013 +0300

    A complete duplication of binding should be allowed if on both new and
    duplicated sockets a multicast address is bound and either
    SO_REUSEPORT or SO_REUSEADDR is set.

    But actually it works for the following combinations:

      * SO_REUSEPORT is set for the first socket and SO_REUSEPORT for the new;
      * SO_REUSEADDR is set for the first socket and SO_REUSEADDR for the new;
      * SO_REUSEPORT is set for the first socket and SO_REUSEADDR for the new;

    and fails for this:

      * SO_REUSEADDR is set for the first socket and SO_REUSEPORT for the new.

    Fix the last case.

    PR:		179901

diff --git a/sys/netinet/in_pcb.c b/sys/netinet/in_pcb.c
index 3506b74..eb15a38 100644
--- a/sys/netinet/in_pcb.c
+++ b/sys/netinet/in_pcb.c
@@ -554,7 +554,7 @@ in_pcbbind_setup(struct inpcb *inp, struct sockaddr *nam, in_addr_t *laddrp,
 			 * and a multicast address is bound on both
 			 * new and duplicated sockets.
 			 */
-			if (so->so_options & SO_REUSEADDR)
+			if ((so->so_options & (SO_REUSEADDR|SO_REUSEPORT)) != 0)
 				reuseport = SO_REUSEADDR|SO_REUSEPORT;
 		} else if (sin->sin_addr.s_addr != INADDR_ANY) {
 			sin->sin_port = 0;		/* yech... */
diff --git a/sys/netinet6/in6_pcb.c b/sys/netinet6/in6_pcb.c
index a0a6874..fb84279 100644
--- a/sys/netinet6/in6_pcb.c
+++ b/sys/netinet6/in6_pcb.c
@@ -156,7 +156,7 @@ in6_pcbbind(register struct inpcb *inp, struct sockaddr *nam,
 			 * and a multicast address is bound on both
 			 * new and duplicated sockets.
 			 */
-			if (so->so_options & SO_REUSEADDR)
+			if ((so->so_options & (SO_REUSEADDR|SO_REUSEPORT)) != 0)
 				reuseport = SO_REUSEADDR|SO_REUSEPORT;
 		} else if (!IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) {
 			struct ifaddr *ifa;
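The four combinations listed in the commit message above can be checked with a small userland model. This is a sketch under stated assumptions, not kernel code: OPT_REUSEADDR and OPT_REUSEPORT stand in for the real option bits, effective_reuse() and bind_conflicts() are hypothetical names, the initial value of reuseport is assumed to be the new socket's SO_REUSEPORT bit, and the options of the already-bound socket are assumed to be compared the way the inp_so_options() patch elsewhere in this thread does it.

```c
#include <stdbool.h>

/* Illustrative bits; the kernel uses SO_REUSEADDR and SO_REUSEPORT. */
#define OPT_REUSEADDR	0x1
#define OPT_REUSEPORT	0x2

/*
 * Model of how the bind path derives "reuseport" for the new socket.
 * Before the patch only SO_REUSEADDR widened it on a multicast bind;
 * after the patch either option does.
 */
static int
effective_reuse(int new_opts, bool multicast, bool patched)
{
	int reuseport = new_opts & OPT_REUSEPORT;
	int trigger = patched ? (OPT_REUSEADDR | OPT_REUSEPORT) :
	    OPT_REUSEADDR;

	if (multicast && (new_opts & trigger) != 0)
		reuseport = OPT_REUSEADDR | OPT_REUSEPORT;
	return (reuseport);
}

/*
 * Model of the duplicate-binding check: the bind is rejected when the
 * new socket's effective options share no bit with the options cached
 * for the socket that already holds the address.
 */
static bool
bind_conflicts(int old_opts, int new_opts, bool multicast, bool patched)
{
	return ((effective_reuse(new_opts, multicast, patched) &
	    old_opts) == 0);
}
```

With patched set to false, only the combination "SO_REUSEADDR on the first socket, SO_REUSEPORT on the new one" is reported as a conflict in this model; with patched set to true, that case is accepted as well.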
Re: kern/179901: [netinet] [patch] Multicast SO_REUSEADDR handled incorrectly
The following reply was made to PR kern/179901; it has been noted by GNATS.

From: Mikolaj Golub <to.my.troc...@gmail.com>
To: Michael Gmelin <free...@grem.de>
Cc: Mikolaj Golub <troc...@freebsd.org>, bug-follo...@freebsd.org
Subject: Re: kern/179901: [netinet] [patch] Multicast SO_REUSEADDR handled incorrectly
Date: Thu, 27 Jun 2013 23:00:16 +0300

On Wed, Jun 26, 2013 at 03:03:40PM +0200, Michael Gmelin wrote:

> Hi,
>
> I adapted the test code, you can find it at
> http://blog.grem.de/multicast.c
>
> Test output is:
>
> IPv4 Port :
>   Bind using SO_REUSEADDR...OK (expected)
>   Bind using SO_REUSEADDR...OK (expected)
>   Bind using SO_REUSEPORT...FAIL (NOT expected): Address already in use
> IPv4 Port 5556:
>   Bind using SO_REUSEPORT...OK (expected)
>   Bind using SO_REUSEPORT...OK (expected)
>   Bind using SO_REUSEADDR...OK (expected)
>   Bind using SO_REUSEPORT...FAIL (NOT expected): Address already in use
> IPv4 Port 5557:
>   Bind using SO_REUSEADDR x 2...OK (expected)
>   Bind using SO_REUSEADDR x 2...OK (expected)
>   Bind using SO_REUSEPORT...FAIL (NOT expected): Address already in use
>   Bind using SO_REUSEADDR...OK (expected)
>   Bind using SO_REUSEPORT...FAIL (NOT expected): Address already in use
> IPv4 Port 5558:
>   Bind without socketopts...OK (expected)
>   Bind using SO_REUSEADDR...FAIL (expected): Address already in use
>   Bind using SO_REUSEPORT...FAIL (expected): Address already in use
> IPv4 Port 5559:
>   Bind using SO_REUSEADDR...OK (expected)
>   Bind without socketopts...FAIL (expected): Address already in use
> IPv4 Port 5560:
>   Bind using SO_REUSEPORT...OK (expected)
>   Bind using SO_REUSEPORT...OK (expected)
>   Bind without socketopts...FAIL (expected): Address already in use
> IPv6 Port :
>   Bind using SO_REUSEADDR...OK (expected)
>   Bind using SO_REUSEADDR...OK (expected)
>   Bind using SO_REUSEPORT...FAIL (NOT expected): Address already in use
> IPv6 Port 5556:
>   Bind using SO_REUSEPORT...OK (expected)
>   Bind using SO_REUSEPORT...OK (expected)
>   Bind using SO_REUSEADDR...OK (expected)
>   Bind using SO_REUSEPORT...FAIL (NOT expected): Address already in use
> IPv6 Port 5557:
>   Bind using SO_REUSEADDR x 2...OK (expected)
>   Bind using SO_REUSEADDR x 2...OK (expected)
>   Bind using SO_REUSEPORT...FAIL (NOT expected): Address already in use
>   Bind using SO_REUSEADDR...OK (expected)
>   Bind using SO_REUSEPORT...FAIL (NOT expected): Address already in use
> IPv6 Port 5558:
>   Bind without socketopts...OK (expected)
>   Bind using SO_REUSEADDR...FAIL (expected): Address already in use
>   Bind using SO_REUSEPORT...FAIL (expected): Address already in use
> IPv6 Port 5559:
>   Bind using SO_REUSEADDR...OK (expected)
>   Bind without socketopts...FAIL (expected): Address already in use
> IPv6 Port 5560:
>   Bind using SO_REUSEPORT...OK (expected)
>   Bind using SO_REUSEPORT...OK (expected)
>   Bind without socketopts...FAIL (expected): Address already in use

Thank you for testing!

> So you maintained the old PORT/ADDR behavior, which I think is not such
> a great idea. I would suggest to get another opinion on this, just
> because it's broken now doesn't mean we have to perpetuate it - maybe
> we should compare the behavior with other Unix(like) OSes like the
> other BSDs and Linux to see how their implementations work - usually
> ported software is not changed in that respect, so being compatible is
> valuable.

It is difficult to talk about portability in the case of
SO_REUSEPORT. AFAIK, there is no SO_REUSEPORT in Linux and it is
recommended to always use SO_REUSEADDR for multicast in portable code.
It looks like in this case we will always have expected behavior with
the proposed patch.

> Besides my rant the code works as designed and seems to resemble the
> behavior before r227207 correctly (I manually applied the patches to
> 9.1-RELEASE).
>
> Fun fact: The code in ip6_output.c could have never worked in the
> first place, since it used IN_MULTICAST instead of
> IN6_IS_ADDR_MULTICAST:
>
> if (IN_MULTICAST(ntohl(in6p->inp_laddr.s_addr))) ...

I don't insist on maintaining the old behaviour. But as actually we
have 2 issues here (regression introduced by me in FreeBSD9 and
historical behavior that looks wrong), with different priority, I
would like to fix the issues separately. This way it will be easier to
track the changes, e.g. when after a year it turns out that the second
change has broken some other case.

For now I am more concerned about having the SO_REUSEADDR regression
fixed in CURRENT and STABLE9 before 9.2. The patch is under review and
I plan to commit it next week if it is ok. The second issue might
require more discussion before committing the change.

--
Mikolaj Golub
Re: kern/179901: [netinet] [patch] Multicast SO_REUSEADDR handled incorrectly
The following reply was made to PR kern/179901; it has been noted by GNATS.

From: Mikolaj Golub <troc...@freebsd.org>
To: Michael Gmelin <free...@grem.de>
Cc: bug-follo...@freebsd.org
Subject: Re: kern/179901: [netinet] [patch] Multicast SO_REUSEADDR handled incorrectly
Date: Tue, 25 Jun 2013 18:24:55 +0300

On Tue, Jun 25, 2013 at 01:39:38PM +0200, Michael Gmelin wrote:

> Yes, but it seems like your patch is not fixing all places in
> in6_pcb.c, I think you should modify the code at line 246 as well:
>
> 	} else if (t && (reuseport == 0 ||
> 	    (t->inp_flags2 & INP_REUSEPORT) == 0)) {
> 		return (EADDRINUSE);
> 	}
>
> so it says
>
> 	} else if (t && (reuseport & inp_so_options(t)) == 0) {

Good catch! I missed this because I was preparing the patch using
r227207 as a reference, but this had been missed there too (fixed
later in r233272 by glebius).

> Once 1) has been resolved I can test on a machine running 9.1-RELEASE
> later (the patch is small enough to apply it manually). I will run the
> unit test code from multicast.c I sent earlier and add IPv6 test cases
> to it as well.

The updated patch is attached. Thanks.

--
Mikolaj Golub

Attachment: pr179901.2.patch (text/x-diff)

Index: sys/netinet/in_pcb.c
===================================================================
--- sys/netinet/in_pcb.c	(revision 251760)
+++ sys/netinet/in_pcb.c	(working copy)
@@ -467,6 +467,23 @@ in_pcb_lport(struct inpcb *inp, struct in_addr *la
 
 	return (0);
 }
+
+/*
+ * Return cached socket options.
+ */
+int
+inp_so_options(const struct inpcb *inp)
+{
+	int so_options;
+
+	so_options = 0;
+
+	if ((inp->inp_flags2 & INP_REUSEPORT) != 0)
+		so_options |= SO_REUSEPORT;
+	if ((inp->inp_flags2 & INP_REUSEADDR) != 0)
+		so_options |= SO_REUSEADDR;
+	return (so_options);
+}
 #endif /* INET || INET6 */
 
 #ifdef INET
@@ -595,8 +612,7 @@ in_pcbbind_setup(struct inpcb *inp, struct sockadd
 				if (tw == NULL ||
 				    (reuseport & tw->tw_so_options) == 0)
 					return (EADDRINUSE);
-			} else if (t && (reuseport == 0 ||
-			    (t->inp_flags2 & INP_REUSEPORT) == 0)) {
+			} else if (t && (reuseport & inp_so_options(t)) == 0) {
 #ifdef INET6
 				if (ntohl(sin->sin_addr.s_addr) !=
 				    INADDR_ANY ||
Index: sys/netinet/in_pcb.h
===================================================================
--- sys/netinet/in_pcb.h	(revision 251760)
+++ sys/netinet/in_pcb.h	(working copy)
@@ -442,6 +442,7 @@ struct tcpcb *
 	inp_inpcbtotcpcb(struct inpcb *inp);
 void	inp_4tuple_get(struct inpcb *inp, uint32_t *laddr, uint16_t *lp,
 		uint32_t *faddr, uint16_t *fp);
+int	inp_so_options(const struct inpcb *inp);
 
 #endif /* _KERNEL */
 
@@ -543,6 +544,7 @@ void	inp_4tuple_get(struct inpcb *inp, uint32_t *
 #define	INP_PCBGROUPWILD	0x0004 /* in pcbgroup wildcard list */
 #define	INP_REUSEPORT		0x0008 /* SO_REUSEPORT option is set */
 #define	INP_FREED		0x0010 /* inp itself is not valid */
+#define	INP_REUSEADDR		0x0020 /* SO_REUSEADDR option is set */
 
 /*
  * Flags passed to in_pcblookup*() functions.
Index: sys/netinet/ip_output.c
===================================================================
--- sys/netinet/ip_output.c	(revision 251760)
+++ sys/netinet/ip_output.c	(working copy)
@@ -900,13 +900,10 @@ ip_ctloutput(struct socket *so, struct sockopt *so
 			switch (sopt->sopt_name) {
 			case SO_REUSEADDR:
 				INP_WLOCK(inp);
-				if (IN_MULTICAST(ntohl(inp->inp_laddr.s_addr))) {
-					if ((so->so_options &
-					    (SO_REUSEADDR | SO_REUSEPORT)) != 0)
-						inp->inp_flags2 |= INP_REUSEPORT;
-					else
-						inp->inp_flags2 &= ~INP_REUSEPORT;
-				}
+				if ((so->so_options & SO_REUSEADDR) != 0)
+					inp->inp_flags2 |= INP_REUSEADDR;
+				else
+					inp->inp_flags2 &= ~INP_REUSEADDR;
 				INP_WUNLOCK(inp
Re: kern/179901: [netinet] [patch] Multicast SO_REUSEADDR handled incorrectly
The following reply was made to PR kern/179901; it has been noted by GNATS.

From: Mikolaj Golub <troc...@freebsd.org>
To: bug-follo...@freebsd.org, free...@grem.de
Cc:
Subject: Re: kern/179901: [netinet] [patch] Multicast SO_REUSEADDR handled incorrectly
Date: Mon, 24 Jun 2013 23:29:42 +0300

Michael,

Thank you for your analysis and the patch. I have the following notes
on your patch though:

1) INET6 needs fixing too.

2) It looks like after introducing INP_REUSEADDR there is no need to
handle the IN_MULTICAST case in ip_ctloutput().

3) Actually you don't have to use IN_MULTICAST() in
in_pcbbind_setup(): the information is already encoded in the
reuseport variable.

4) The patch not only fixes the regression introduced by r227207, but
also changes the historical behavior before r227207. Although the
change might be correct, it is better to separate these issues.

Feeling guilty for the regression introduced in r227207, I am eager to
fix it ASAP, before the 9.2 release. But I don't have a strong opinion
about changing the historical behavior.

So, could you please look at the attached patch, which is based on
your idea of an INP_REUSEADDR flag? Now the code more closely resembles
the code before r227207, and I am a little more confident that there
is no regression.

I would appreciate any testing. Note, it is against CURRENT; STABLE
will require patching in_pcb.h manually due to the newly introduced
INP_FREED flag.

--
Mikolaj Golub

Attachment: pr179901.1.patch (text/x-diff)

Index: sys/netinet/in_pcb.c
===================================================================
--- sys/netinet/in_pcb.c	(revision 252162)
+++ sys/netinet/in_pcb.c	(working copy)
@@ -467,6 +467,23 @@ in_pcb_lport(struct inpcb *inp, struct in_addr *la
 
 	return (0);
 }
+
+/*
+ * Return cached socket options.
+ */
+int
+inp_so_options(const struct inpcb *inp)
+{
+	int so_options;
+
+	so_options = 0;
+
+	if ((inp->inp_flags2 & INP_REUSEPORT) != 0)
+		so_options |= SO_REUSEPORT;
+	if ((inp->inp_flags2 & INP_REUSEADDR) != 0)
+		so_options |= SO_REUSEADDR;
+	return (so_options);
+}
 #endif /* INET || INET6 */
 
 #ifdef INET
@@ -595,8 +612,8 @@ in_pcbbind_setup(struct inpcb *inp, struct sockadd
 				if (tw == NULL ||
 				    (reuseport & tw->tw_so_options) == 0)
 					return (EADDRINUSE);
-			} else if (t && (reuseport == 0 ||
-			    (t->inp_flags2 & INP_REUSEPORT) == 0)) {
+			} else if (t &&
+			    (reuseport & inp_so_options(t)) == 0) {
 #ifdef INET6
 				if (ntohl(sin->sin_addr.s_addr) !=
 				    INADDR_ANY ||
Index: sys/netinet/in_pcb.h
===================================================================
--- sys/netinet/in_pcb.h	(revision 252162)
+++ sys/netinet/in_pcb.h	(working copy)
@@ -442,6 +442,7 @@ struct tcpcb *
 	inp_inpcbtotcpcb(struct inpcb *inp);
 void	inp_4tuple_get(struct inpcb *inp, uint32_t *laddr, uint16_t *lp,
 		uint32_t *faddr, uint16_t *fp);
+int	inp_so_options(const struct inpcb *inp);
 
 #endif /* _KERNEL */
 
@@ -543,6 +544,7 @@ void	inp_4tuple_get(struct inpcb *inp, uint32_t *
 #define	INP_PCBGROUPWILD	0x0004 /* in pcbgroup wildcard list */
 #define	INP_REUSEPORT		0x0008 /* SO_REUSEPORT option is set */
 #define	INP_FREED		0x0010 /* inp itself is not valid */
+#define	INP_REUSEADDR		0x0020 /* SO_REUSEADDR option is set */
 
 /*
  * Flags passed to in_pcblookup*() functions.
Index: sys/netinet/ip_output.c
===================================================================
--- sys/netinet/ip_output.c	(revision 252162)
+++ sys/netinet/ip_output.c	(working copy)
@@ -900,13 +900,10 @@ ip_ctloutput(struct socket *so, struct sockopt *so
 			switch (sopt->sopt_name) {
 			case SO_REUSEADDR:
 				INP_WLOCK(inp);
-				if (IN_MULTICAST(ntohl(inp->inp_laddr.s_addr))) {
-					if ((so->so_options &
-					    (SO_REUSEADDR | SO_REUSEPORT)) != 0)
-						inp->inp_flags2 |= INP_REUSEPORT;
-					else
-						inp->inp_flags2 &= ~INP_REUSEPORT;
-				}
+				if ((so->so_options & SO_REUSEADDR) != 0)
+					inp->inp_flags2 |= INP_REUSEADDR
Re: kern/167059: [tcp] [panic] System does panic in in_pcbbind() and hangs
The following reply was made to PR kern/167059; it has been noted by GNATS.

From: Mikolaj Golub <troc...@freebsd.org>
To: bug-follo...@freebsd.org, yeho...@gmail.com
Cc:
Subject: Re: kern/167059: [tcp] [panic] System does panic in in_pcbbind() and hangs
Date: Sat, 18 May 2013 22:15:26 +0300

This looks similar to the issue fixed in 9.0 (r227207 + r227449).

There was a discussion on freebsd-net@ titled "Kernel panic on FreeBSD
9.0-beta2":

http://lists.freebsd.org/pipermail/freebsd-net/2011-September/029858.html

Is there any chance that you can check on >=9.0?

--
Mikolaj Golub
lagg with wireless iface: ieee80211_waitfor_parent is called with a non-sleepable lock held
Hi,

On my laptop I have lagg set up in failover mode between the wired and
wireless interfaces, as described in the handbook:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/network-aggregation.html#networking-lagg-wired-and-wireless

On start I have been observing witness warnings like the one below:

taskqueue_drain with the following non-sleepable locks held:
exclusive rw if_lagg rwlock (if_lagg rwlock) r = 0 (0xfe000aa9d408) locked @ /home/golub/freebsd/base/head/sys/modules/if_lagg/../../net/if_lagg.c:1065
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b
kdb_backtrace() at kdb_backtrace+0x39
witness_warn() at witness_warn+0x4b2
taskqueue_drain() at taskqueue_drain+0x3a
ieee80211_waitfor_parent() at ieee80211_waitfor_parent+0x28
ieee80211_ioctl() at ieee80211_ioctl+0x3e9
if_setflag() at if_setflag+0xc0
ifpromisc() at ifpromisc+0x2c
lagg_ioctl() at lagg_ioctl+0x7d5
if_setflag() at if_setflag+0xc0
ifpromisc() at ifpromisc+0x2c
bridge_ioctl_add() at bridge_ioctl_add+0x454
bridge_ioctl() at bridge_ioctl+0x268
in_control() at in_control+0x219
ifioctl() at ifioctl+0x1896
kern_ioctl() at kern_ioctl+0x1b0
sys_ioctl() at sys_ioctl+0x11f
amd64_syscall() at amd64_syscall+0x282
Xfast_syscall() at Xfast_syscall+0xfb
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8011815ca, rsp = 0x7fffd3f8, rbp = 0x7fffd4a0 ---

and eventually the panic "Sleeping thread owns a non-sleepable lock"
in lagg_input, when a packet arrives simultaneously with an ifconfig
run.

The lagg code takes the if_lagg rwlock before going to if_setflag(),
which ends up calling ieee80211_ioctl() and ieee80211_waitfor_parent()
(wait for all deferred parent interface tasks to complete).

Does anybody see a way this could be solved?

--
Mikolaj Golub
Re: Proposal for changes to network device drivers and network stack (RFC)
On Fri, Sep 07, 2012 at 01:28:16AM -0700, Anuranjan Shukla wrote:
> Hi George,
> Thanks for taking a look. Some answers/comments below.
>
> > > Building FreeBSD without the network stack (network stack as a module)
> > > --
> >
> > This would be interesting for many reasons, and I think it would be a
> > good contribution. Does the work you've done in this area handle the
> > VNET stuff that is in the stack as well? That is, how well does the
> > network stack as a module play with the vnet architecture?
>
> I'll follow up on this one separately.

FYI, there is at least this issue with virtualized global variables in
modules:

http://lists.freebsd.org/pipermail/freebsd-virtualization/2011-July/000737.html

On archs that use link_elf.c (i.e. all except amd64, which uses
link_elf_obj.c) virtualized global variables in modules cannot be
accessed from other modules, because on a module load link_elf does
relocation only for VNET variables defined in this module.

As Marko Zec pointed out, the same issue exists with DPCPU.

The latest patch I have (both for VNET and DPCPU):

http://people.freebsd.org/~trociny/link_elf.c.pcpu_vnet.patch

The fix is to make the linker, on a module load, recognize external
VNET/DPCPU variables defined in the previously loaded modules and
relocate them accordingly. For this, set_pcpu_list and set_vnet_list
are used, which store the addresses of the modules' 'set_pcpu' and
'set_vnet' linker sets.

--
Mikolaj Golub
Re: net.inet.tcp.hostcache.list: RTTVAR value
On Mon, 02 Jul 2012 15:04:10 +0200 Andre Oppermann wrote:

 AO> On 01.07.2012 18:30, Mikolaj Golub wrote:
 >> Hi,
 >>
 >> It looks to me that in the calculation of the RTTVAR value for the
 >> net.inet.tcp.hostcache.list sysctl a wrong scale is used:
 >> TCP_RTT_SCALE instead of TCP_RTTVAR_SCALE. See the attached patch.
 >>
 >> I am going to commit it if nobody tells me that I am wrong here.

 AO> Correct. Thanks!

Committed.

--
Mikolaj Golub
netstat(1): negative tcp timer counters
Hi,

I have noticed that `netstat -x' shows negative values for the keep
timer. In my case this is for connections in CLOSE state.

Reviewing the timer code, it looks like there is an issue in the
tcp_timer_* functions, when inp is checked for INP_DROPPED. If the flag
is set the function returns and callout_deactivate() is never called.
Adding some prints I made sure that the observed negative counters in
my case were due to this check.

The attached patch (check for INP_DROPPED after callout_deactivate)
fixes the issue for me.

I would like to commit it if there are no objections.

--
Mikolaj Golub

Index: sys/netinet/tcp_timer.c
===================================================================
--- sys/netinet/tcp_timer.c	(revision 237918)
+++ sys/netinet/tcp_timer.c	(working copy)
@@ -183,13 +183,18 @@ tcp_timer_delack(void *xtp)
 		return;
 	}
 	INP_WLOCK(inp);
-	if ((inp->inp_flags & INP_DROPPED) || callout_pending(&tp->t_timers->tt_delack)
-	    || !callout_active(&tp->t_timers->tt_delack)) {
+	if (callout_pending(&tp->t_timers->tt_delack) ||
+	    !callout_active(&tp->t_timers->tt_delack)) {
 		INP_WUNLOCK(inp);
 		CURVNET_RESTORE();
 		return;
 	}
 	callout_deactivate(&tp->t_timers->tt_delack);
+	if ((inp->inp_flags & INP_DROPPED) != 0) {
+		INP_WUNLOCK(inp);
+		CURVNET_RESTORE();
+		return;
+	}
 
 	tp->t_flags |= TF_ACKNOW;
 	TCPSTAT_INC(tcps_delack);
@@ -229,7 +234,7 @@ tcp_timer_2msl(void *xtp)
 	}
 	INP_WLOCK(inp);
 	tcp_free_sackholes(tp);
-	if ((inp->inp_flags & INP_DROPPED) || callout_pending(&tp->t_timers->tt_2msl) ||
+	if (callout_pending(&tp->t_timers->tt_2msl) ||
 	    !callout_active(&tp->t_timers->tt_2msl)) {
 		INP_WUNLOCK(tp->t_inpcb);
 		INP_INFO_WUNLOCK(&V_tcbinfo);
@@ -237,6 +242,12 @@ tcp_timer_2msl(void *xtp)
 		return;
 	}
 	callout_deactivate(&tp->t_timers->tt_2msl);
+	if ((inp->inp_flags & INP_DROPPED) != 0) {
+		INP_WUNLOCK(inp);
+		INP_INFO_WUNLOCK(&V_tcbinfo);
+		CURVNET_RESTORE();
+		return;
+	}
 	/*
 	 * 2 MSL timeout in shutdown went off.  If we're closed but
 	 * still waiting for peer to close and connection has been idle
@@ -300,14 +311,20 @@ tcp_timer_keep(void *xtp)
 		return;
 	}
 	INP_WLOCK(inp);
-	if ((inp->inp_flags & INP_DROPPED) || callout_pending(&tp->t_timers->tt_keep)
-	    || !callout_active(&tp->t_timers->tt_keep)) {
+	if (callout_pending(&tp->t_timers->tt_keep) ||
+	    !callout_active(&tp->t_timers->tt_keep)) {
 		INP_WUNLOCK(inp);
 		INP_INFO_WUNLOCK(&V_tcbinfo);
 		CURVNET_RESTORE();
 		return;
 	}
 	callout_deactivate(&tp->t_timers->tt_keep);
+	if ((inp->inp_flags & INP_DROPPED) != 0) {
+		INP_WUNLOCK(inp);
+		INP_INFO_WUNLOCK(&V_tcbinfo);
+		CURVNET_RESTORE();
+		return;
+	}
 	/*
 	 * Keep-alive timer went off; send something
 	 * or drop connection if idle for too long.
@@ -397,14 +414,20 @@ tcp_timer_persist(void *xtp)
 		return;
 	}
 	INP_WLOCK(inp);
-	if ((inp->inp_flags & INP_DROPPED) || callout_pending(&tp->t_timers->tt_persist)
-	    || !callout_active(&tp->t_timers->tt_persist)) {
+	if (callout_pending(&tp->t_timers->tt_persist) ||
+	    !callout_active(&tp->t_timers->tt_persist)) {
 		INP_WUNLOCK(inp);
 		INP_INFO_WUNLOCK(&V_tcbinfo);
 		CURVNET_RESTORE();
 		return;
 	}
 	callout_deactivate(&tp->t_timers->tt_persist);
+	if ((inp->inp_flags & INP_DROPPED) != 0) {
+		INP_WUNLOCK(inp);
+		INP_INFO_WUNLOCK(&V_tcbinfo);
+		CURVNET_RESTORE();
+		return;
+	}
 	/*
 	 * Persistance timer into zero window.
 	 * Force a byte to be output, if possible.
@@ -469,14 +492,20 @@ tcp_timer_rexmt(void * xtp)
 		return;
 	}
 	INP_WLOCK(inp);
-	if ((inp->inp_flags & INP_DROPPED) || callout_pending(&tp->t_timers->tt_rexmt)
-	    || !callout_active(&tp->t_timers->tt_rexmt)) {
+	if (callout_pending(&tp->t_timers->tt_rexmt) ||
+	    !callout_active(&tp->t_timers->tt_rexmt)) {
 		INP_WUNLOCK(inp);
 		INP_INFO_RUNLOCK(&V_tcbinfo);
 		CURVNET_RESTORE();
 		return;
 	}
 	callout_deactivate(&tp->t_timers->tt_rexmt);
+	if ((inp->inp_flags & INP_DROPPED) != 0) {
+		INP_WUNLOCK(inp);
+		INP_INFO_RUNLOCK(&V_tcbinfo);
+		CURVNET_RESTORE();
+		return;
+	}
 	tcp_free_sackholes(tp);
 	/*
 	 * Retransmission timer went off.
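The effect of moving the INP_DROPPED check can be illustrated with a tiny userland model. This is only a sketch: struct callout_model and both handler functions are hypothetical stand-ins, and it models nothing of the real callout(9) machinery beyond a "pending" and an "active" flag. The assumption, taken from the message above, is that a timer whose active flag is never cleared is what netstat ends up reporting with a negative remaining time.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two callout state bits involved. */
struct callout_model {
	bool pending;	/* rescheduled while the handler was waiting */
	bool active;	/* armed and not yet deactivated */
};

/*
 * Ordering before the patch: a dropped inp makes the handler return
 * before the callout is deactivated, so "active" stays set.
 * Returns true if the timer action would run.
 */
static bool
timer_handler_old(struct callout_model *c, bool inp_dropped)
{
	if (inp_dropped || c->pending || !c->active)
		return (false);
	c->active = false;		/* models callout_deactivate() */
	return (true);
}

/*
 * Ordering after the patch: deactivate first, then check for a
 * dropped inp, so the stale "active" state cannot survive.
 */
static bool
timer_handler_new(struct callout_model *c, bool inp_dropped)
{
	if (c->pending || !c->active)
		return (false);
	c->active = false;		/* models callout_deactivate() */
	if (inp_dropped)
		return (false);
	return (true);
}
```

In both orderings the timer action is skipped for a dropped inp; the only difference in this model is whether the callout is left marked active afterwards.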
Re: bin/151937: [patch] netstat(1) utility lack support of displaying rtt related counters of tcp sockets
Hi,

Mykola, thank you for the report and the provided patch. Displaying
rtt related counters per connection looks useful to me too.

I am attaching the modified version of the patch to discuss (and
commit if there are no objections or other suggestions).

The differences from your version:

1) The '-T' option is already used. Also, I don't like very much adding
yet another option, so I added the statistics to the '-x' option. Or
it can be added to the '-T' statistics.

2) As counter names I used names that are close to the field names in
the tcpcb structure.

3) To get hz, instead of kern.clockrate, I use the kern.hz sysctl (as
it simplifies the code a little) and for the !live case read it from
the dump.

4) The trick with printing to buf is used to pad the counters on the
right, as it is with other counters.

Also, it might be enough to display only srtt and rttvar statistics?

--
Mikolaj Golub

Index: usr.bin/netstat/inet.c
===================================================================
--- usr.bin/netstat/inet.c	(revision 237835)
+++ usr.bin/netstat/inet.c	(working copy)
@@ -293,6 +293,28 @@ fail:
 #undef KREAD
 }
 
+static const char *
+humanize_rtt(int val, int scale)
+{
+	size_t len;
+	static int hz;
+	static char buf[16];
+
+	if (hz == 0) {
+		hz = 1;
+		if (live) {
+			len = sizeof(hz);
+			if (sysctlbyname("kern.hz", &hz, &len, NULL, 0) == -1)
+				warn("sysctl: kern.hz");
+		} else {
+			kread(hz_addr, &hz, sizeof(hz));
+		}
+	}
+	snprintf(buf, sizeof(buf), "%.3f", (float)val / (scale * hz));
+
+	return (buf);
+}
+
 /*
  * Print a summary of connections related to an Internet
  * protocol.  For TCP, also give state of connection.
@@ -441,6 +463,8 @@ protopr(u_long off, const char *name, int af1, int
 					printf(" %7.7s %7.7s %7.7s %7.7s %7.7s %7.7s",
 					    "rexmt", "persist", "keep",
 					    "2msl", "delack", "rcvtime");
+					printf(" %7.7s %7.7s %7.7s %9.9s",
+					    "srtt", "rttvar", "rttlow", "rttupdate");
 				}
 				putchar('\n');
 				first = 0;
@@ -548,6 +572,14 @@ protopr(u_long off, const char *name, int af1, int
 			    timer->tt_2msl / 1000, (timer->tt_2msl % 1000) / 10,
 			    timer->tt_delack / 1000, (timer->tt_delack % 1000) / 10,
 			    timer->t_rcvtime / 1000, (timer->t_rcvtime % 1000) / 10);
+			if (tp != NULL) {
+				printf(" %7s", humanize_rtt(tp->t_srtt,
+				    TCP_RTT_SCALE));
+				printf(" %7s", humanize_rtt(tp->t_rttvar,
+				    TCP_RTTVAR_SCALE));
+				printf(" %7s", humanize_rtt(tp->t_rttlow, 1));
+				printf(" %9lu ", tp->t_rttupdated);
+			}
 		}
 		if (istcp && !Lflag && !xflag && !Tflag) {
 			if (tp->t_state < 0 || tp->t_state >= TCP_NSTATES)
Index: usr.bin/netstat/main.c
===================================================================
--- usr.bin/netstat/main.c	(revision 237835)
+++ usr.bin/netstat/main.c	(working copy)
@@ -184,6 +184,8 @@ static struct nlist nl[] = {
 	{ .n_name = "_arpstat" },
 #define	N_UNP_SPHEAD	56
 	{ .n_name = "unp_sphead" },
+#define	N_HZ		57
+	{ .n_name = "_hz" },
 	{ .n_name = NULL },
 };
 
@@ -358,6 +360,8 @@ int	unit;		/* unit number for above */
 int	af;		/* address family */
 int	live;		/* true if we are examining a live system */
 
+u_long	hz_addr;	/* address of hz variable in kernel memory */
+
 int
 main(int argc, char *argv[])
 {
@@ -563,6 +567,7 @@ main(int argc, char *argv[])
 	 */
 #endif
 	kread(0, NULL, 0);
+	hz_addr = nl[N_HZ].n_value;
 	if (iflag && !sflag) {
 		intpr(interval, nl[N_IFNET].n_value, NULL);
 		exit(0);
Index: usr.bin/netstat/netstat.h
===================================================================
--- usr.bin/netstat/netstat.h	(revision 237835)
+++ usr.bin/netstat/netstat.h	(working copy)
@@ -59,6 +59,8 @@ extern int	unit;	/* unit number for above */
 extern int	af;	/* address family */
 extern int	live;	/* true if we are examining a live system */
 
+extern u_long	hz_addr; /* address of hz variable in kernel memory */
+
 int	kread(u_long addr, void *buf, size_t size);
 const char *plural(uintmax_t);
 const char *plurales(uintmax_t);
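The conversion done by humanize_rtt() in the patch above is a division of a scaled tick count by (scale * hz). The following is a standalone sketch of just that arithmetic, with hz passed in as a parameter instead of being read from kern.hz or a crash dump; rtt_seconds() and format_rtt() are hypothetical names, and the caller-supplied buffer is a design variation, not what the patch does (the patch uses a static buffer).

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Convert a scaled tick count (e.g. a smoothed RTT kept as
 * ticks * scale) to seconds.  Returns 0.0 for a non-positive
 * scale or hz.
 */
static double
rtt_seconds(int val, int scale, int hz)
{
	if (scale <= 0 || hz <= 0)
		return (0.0);
	return ((double)val / ((double)scale * (double)hz));
}

/*
 * Format the value with three decimals into a caller-supplied buffer,
 * so it can afterwards be printed right-aligned with a "%7s" style
 * conversion, as the patch does for the other counters.
 */
static const char *
format_rtt(char *buf, size_t size, int val, int scale, int hz)
{
	snprintf(buf, size, "%.3f", rtt_seconds(val, scale, hz));
	return (buf);
}
```

For example, assuming a scale of 32 and hz of 1000, a stored value of 3200 corresponds to 100 ticks, i.e. 0.100 seconds. Passing the buffer in avoids the static storage of the original at the cost of one extra argument per call.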
net.inet.tcp.hostcache.list: RTTVAR value
Hi,

It looks to me that in the calculation of the RTTVAR value for the
net.inet.tcp.hostcache.list sysctl a wrong scale is used:
TCP_RTT_SCALE instead of TCP_RTTVAR_SCALE. See the attached patch.

I am going to commit it if nobody tells me that I am wrong here.

--
Mikolaj Golub

Index: sys/netinet/tcp_hostcache.c
===================================================================
--- sys/netinet/tcp_hostcache.c	(revision 237918)
+++ sys/netinet/tcp_hostcache.c	(working copy)
@@ -624,7 +624,7 @@ sysctl_tcp_hc_list(SYSCTL_HANDLER_ARGS)
 		    msec(hc_entry->rmx_rtt *
 			(RTM_RTTUNIT / (hz * TCP_RTT_SCALE))),
 		    msec(hc_entry->rmx_rttvar *
-			(RTM_RTTUNIT / (hz * TCP_RTT_SCALE))),
+			(RTM_RTTUNIT / (hz * TCP_RTTVAR_SCALE))),
 		    hc_entry->rmx_bandwidth * 8,
 		    hc_entry->rmx_cwnd,
 		    hc_entry->rmx_sendpipe,
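To see how much the choice of scale matters, the expression from the patch above can be evaluated in userland. The sketch below is built on assumptions that are not stated in the message: the EX_ constants are illustrative values assumed to mirror RTM_RTTUNIT (1000000), TCP_RTT_SCALE (32) and TCP_RTTVAR_SCALE (16), msec() is assumed to round microseconds to milliseconds, and hc_msec() is a hypothetical helper name.

```c
/* Assumed values, mirroring the kernel names used in the patch. */
#define EX_RTM_RTTUNIT		1000000
#define EX_TCP_RTT_SCALE	32
#define EX_TCP_RTTVAR_SCALE	16

/* Assumed rounding from microseconds to milliseconds. */
static unsigned long
msec(unsigned long usec)
{
	return ((usec + 500) / 1000);
}

/*
 * Evaluate msec(raw * (RTM_RTTUNIT / (hz * scale))) with the same
 * integer arithmetic as the sysctl handler in the patch.
 */
static unsigned long
hc_msec(unsigned long raw, int hz, int scale)
{
	return (msec(raw * (unsigned long)(EX_RTM_RTTUNIT / (hz * scale))));
}
```

Under these assumptions and hz = 1000, a raw value of 160 yields 10 with EX_TCP_RTTVAR_SCALE but only 5 with EX_TCP_RTT_SCALE, so the wrong scale roughly halves the displayed RTTVAR.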
Re: bsnmp and HOST-RESOURCES-MIB
On Thu, 21 Jun 2012 19:23:33 +0700 Eugene Grosbein wrote:

 EG> Hi!

 EG> bsnmpd(1) has /usr/lib/snmp_hostres.so module in base system
 EG> for HOST-RESOURCES-MIB implementation. What should I do to make

 EG> bsnmpwalk -v 2c -s comm@localhost 1.3.6.1.2.1.25.3.3.1.2

 EG> work without complaining:

 EG> bsnmpwalk: Invalid OID - 1.3.6.1.2.1.25.3.3.1.2
 EG> OID parsing error - 1.3.6.1.2.1.25.3.3.1.2

 EG> And without -n flag, please :-)
 EG> I'd like it to resolve OIDs to their names.

I am not very familiar with bsnmptools. Experimenting, I have found
that the following combinations work:

in138:~% bsnmpwalk -v 1 -s public@localhost -i hostres_tree.def 'hrProcessorTable'
hrProcessorFrwID[5] = 0.0
hrProcessorFrwID[10] = 0.0
hrProcessorLoad[5] = 7
hrProcessorLoad[10] = 5

in138:~% bsnmpget -v 1 -s public@localhost -i hostres_tree.def 'hrProcessorLoad.5'
hrProcessorLoad[5] = 8

Note, you should explicitly specify hostres_tree.def (from
/usr/share/snmp/defs) for bsnmptools to be able to resolve names (no
idea why).

Unfortunately, bsnmpwalk does not work for hrProcessorLoad:

in138:~% bsnmpwalk -v 1 -s public@localhost -i hostres_tree.def 'hrProcessorLoad'
bsnmpwalk: Snmp dialog - Operation timed out

Although it works for the numerical format:

in138:~% bsnmpwalk -v 1 -s public@localhost '1.3.6.1.2.1.25.3.3.1.2'
1.3.6.1.2.1.25.3.3.1.2.5 = 10
1.3.6.1.2.1.25.3.3.1.2.10 = 10

in138:~% bsnmpwalk -v 1 -s public@localhost -i hostres_tree.def '1.3.6.1.2.1.25.3.3.1.2'
hrProcessorLoad[5] = 10
hrProcessorLoad[10] = 6

Also, no idea why.

--
Mikolaj Golub
Re: soreceive_stream: mbuf leak if called with mp0 and MSG_WAITALL
On Mon, 12 Mar 2012 22:01:49 +0100 Andre Oppermann wrote:

AO> Yes, doesn't compute this way. I've put in your fix in this revision:

AO> http://svn.freebsd.org/changeset/base/232867

Running your branch, the smbfs tests have passed and no issues have been detected so far.

-- 
Mikolaj Golub
Re: soreceive_stream: mbuf leak if called with mp0 and MSG_WAITALL
Hi,

On Tue, 06 Mar 2012 20:50:34 +0100 Andre Oppermann wrote:

AO> On 05.09.2011 21:58, Mikolaj Golub wrote:
AO> > On Sun, 04 Sep 2011 12:30:53 +0300 Mikolaj Golub wrote:
AO> >
AO> > MG> Apparently soreceive_stream() has an issue if it is called to receive data as a
AO> > MG> mbuf chain (by supplying an non zero mbuf **mp0) and with MSG_WAITALL set.
AO> >
AO> > MG> I ran into this issue with smbfs, which uses soreceive() exactly in this way
AO> > MG> (see netsmb/smb_trantcp.c:nbssn_recv()).
AO> >
AO> > Stressing smbfs a little I also observed the following soreceive_stream()
AO> > related panic:

AO> Hi Mikolaj,

AO> thank you very much for testing, reporting and fixing bugs in soreceive_stream().

AO> I've altered your proposed patches a bit and committed them into my workqueue
AO> with the following revisions:

AO> http://svn.freebsd.org/changeset/base/232617
AO> http://svn.freebsd.org/changeset/base/232618

AO> Would you mind testing them again before they go into HEAD?

With this patch smb mount fails with the error:

smb_iod_recvall: tran return NULL without error

AO> Index: sys/kern/uipc_socket.c
AO> ===================================================================
AO> --- sys/kern/uipc_socket.c (revision 232616)
AO> +++ sys/kern/uipc_socket.c (revision 232617)
AO> @@ -2044,7 +2044,7 @@ deliver:
AO>  	if (mp0 != NULL) {
AO>  		/* Dequeue as many mbufs as possible. */
AO>  		if (!(flags & MSG_PEEK) && len >= sb->sb_mb->m_len) {
AO> -			for (*mp0 = m = sb->sb_mb;
AO> +			for (m = sb->sb_mb;
AO>  			     m != NULL && m->m_len <= len;
AO>  			     m = m->m_next) {
AO>  				len -= m->m_len;
AO> @@ -2052,10 +2052,15 @@ deliver:
AO>  				sbfree(sb, m);
AO>  				n = m;
AO>  			}
AO> +			n->m_next = NULL;
AO>  			sb->sb_mb = m;
AO> +			sb->sb_lastrecord = sb->sb_mb;
AO>  			if (sb->sb_mb == NULL)
AO>  				SB_EMPTY_FIXUP(sb);
AO> -			n->m_next = NULL;
AO> +			if (*mp0 != NULL)
AO> +				m_cat(*mp0, m);
AO> +			else
AO> +				*mp0 = m;
AO>  		}

At that moment m points to the end of the chain. Shouldn't *mp0 be set to sb->sb_mb before the for loop?

AO>  		/* Copy the remainder. */
AO>  		if (len > 0) {
AO> @@ -2066,9 +2071,9 @@ deliver:
AO>  			if (m == NULL)
AO>  				len = 0;	/* Don't flush data from sockbuf. */
AO>  			else
AO> -				uio->uio_resid -= m->m_len;
AO> +				uio->uio_resid -= len;
AO>  			if (*mp0 != NULL)
AO> -				n->m_next = m;
AO> +				m_cat(*mp0, m);
AO>  			else
AO>  				*mp0 = m;
AO>  			if (*mp0 == NULL) {
AO>

-- 
Mikolaj Golub
Re: Kernel panic on FreeBSD 9.0-beta2
On Wed, 12 Oct 2011 09:53:34 +0800 dave jones wrote:

dj> On Fri, Oct 7, 2011 at 9:12 AM, dave jones wrote:
dj> > 2011/10/4 Mikolaj Golub:
dj> >> On Sat, 1 Oct 2011 14:15:45 +0800 dave jones wrote:
dj> >>
dj> >> dj> On Fri, Sep 30, 2011 at 9:41 PM, Robert Watson wrote:
dj> >> dj> > On Wed, 28 Sep 2011, Mikolaj Golub wrote:
dj> >> dj> >> On Mon, 26 Sep 2011 16:12:55 +0200 K. Macy wrote:
dj> >> dj> >>
dj> >> dj> >> KM> Sorry, didn't look at the images (limited bw), I've seen something
dj> >> dj> >> KM> like this before in timewait. This can't happen with UDP so will be
dj> >> dj> >> KM> interested in learning more about the bug.
dj> >> dj> >>
dj> >> dj> >> The panic can be easily triggered by this:
dj> >> dj> >
dj> >> dj> > Hi:
dj> >> dj> >
dj> >> dj> > Just catching up on this thread. I think the analysis here is generally right: in 9.0, you're much more likely to see an inpcb with its inp_socket pointer cleared in the hash list than in prior releases, and in_pcbbind_setup() trips over this.
dj> >> dj> >
dj> >> dj> > However, at least on first glance (and from the perspective of invariants here), I think the bug is actually that in_pcbbind_setup() is asking in_pcblookup_local() for an inpcb and then accesses the returned inpcb's inp_socket pointer without acquiring a lock on the inpcb. Structurally, it can't acquire this lock for lock order reasons -- it already holds the lock on its own inpcb. Therefore, we should only access fields that are safe to follow in an inpcb when you hold a reference via the hash lock and not a lock on the inpcb itself, which appears generally OK (+/-) for all the fields in that clause but the t->inp_socket->so_options dereference.
dj> >> dj> >
dj> >> dj> > A preferred fix would cache the SO_REUSEPORT flag in an inpcb-layer field, such as inp_flags2, giving us access to its value without having to walk into the attached (or not) socket.
dj> >> dj> >
dj> >> dj> > This raises another structural question, which is whether we need a new inp_foo flags field that is protected explicitly by the hash lock, and not by the inpcb lock, which could hold fields relevant to address binding. I don't think we need to solve that problem in this context, as a slight race on SO_REUSEPORT is likely acceptable.
dj> >> dj> >
dj> >> dj> > The suggested fix does perform the desired function of explicitly detaching the inpcb from the hash list before the socket is disconnected from the inpcb. However, it's incomplete in that the invariant that's being broken is also relied on for other protocols (such as raw sockets). The correct invariant is that inp_socket is safe to follow unconditionally if an inpcb is locked and INP_DROPPED isn't set -- the bug is in locked not in INP_DROPPED, which is why I think this is the wrong fix, even though it prevents a panic :-).
dj> >>
dj> >> dj> Hello Robert,
dj> >>
dj> >> dj> Thank you for taking your valuable time to find out the problem.
dj> >> dj> Since I don't have idea about network internals, would you have a patch
dj> >> dj> about this? I'd be glad to test it, thanks again.
dj> >>
dj> >> Here is the patch that implements what Robert suggests. Dave, could you test it?
dj> >
dj> > Sure. Thanks for cooking the patch.
dj>
dj> Machines have been running two days now without panic.

Thank you for testing it.

dj> Is there any plan to commit your fix? Thank you.
dj> I'd upgrade to 9.0-release from beta-2 once it's released.

I have an updated version of the patch, which is under review now. I have been waiting for a response before asking you to test it, but it would be great if you tried it without waiting :-).

As Robert pointed out, the previous version introduced a regression: SO_REUSEPORT was ignored if setsockopt() was called after bind() (the old cached value was still used). The updated version fixes this and also contains several other fixes; the most important of them is that it fixes the panic for the IPv6 bind case too.

-- 
Mikolaj Golub

Index: sys/netinet/in_pcb.c
===================================================================
--- sys/netinet/in_pcb.c (revision 226165)
+++ sys/netinet/in_pcb.c (working copy)
@@ -575,8 +575,7 @@ in_pcbbind_setup(struct inpcb *inp, struct sockadd
 				     ntohl(t->inp_faddr.s_addr) == INADDR_ANY) &&
 				    (ntohl(sin->sin_addr.s_addr) != INADDR_ANY ||
 				     ntohl(t->inp_laddr.s_addr) != INADDR_ANY ||
-				     (t->inp_socket->so_options &
-				     SO_REUSEPORT) == 0) &&
+				     (t->inp_flags2 & INP_REUSEPORT) == 0) &&
 				    (inp->inp_cred->cr_uid !=
 				     t->inp_cred->cr_uid))
 					return (EADDRINUSE);
@@ -590,19 +589,19 @@ in_pcbbind_setup(struct inpcb *inp, struct sockadd
 				 * being in use (for now).  This is better
 				 * than a panic, but not desirable.
 				 */
-				tw = intotw(inp);
+				tw = intotw(t);
 				if (tw == NULL ||
 				    (reuseport & tw->tw_so_options) == 0)
 					return (EADDRINUSE);
-			} else if (t &&
-			    (reuseport & t->inp_socket->so_options) == 0) {
+			} else if (t && (reuseport == 0 ||
+			    (t->inp_flags2 & INP_REUSEPORT) == 0)) {
 #ifdef INET6
 				if (ntohl(sin->sin_addr.s_addr
Re: Kernel panic on FreeBSD 9.0-beta2
On Sat, 1 Oct 2011 14:15:45 +0800 dave jones wrote:

dj> On Fri, Sep 30, 2011 at 9:41 PM, Robert Watson wrote:
dj> > On Wed, 28 Sep 2011, Mikolaj Golub wrote:
dj> >> On Mon, 26 Sep 2011 16:12:55 +0200 K. Macy wrote:
dj> >>
dj> >> KM> Sorry, didn't look at the images (limited bw), I've seen something
dj> >> KM> like this before in timewait. This can't happen with UDP so will be
dj> >> KM> interested in learning more about the bug.
dj> >>
dj> >> The panic can be easily triggered by this:
dj> >
dj> > Hi:
dj> >
dj> > Just catching up on this thread. I think the analysis here is generally right: in 9.0, you're much more likely to see an inpcb with its inp_socket pointer cleared in the hash list than in prior releases, and in_pcbbind_setup() trips over this.
dj> >
dj> > However, at least on first glance (and from the perspective of invariants here), I think the bug is actually that in_pcbbind_setup() is asking in_pcblookup_local() for an inpcb and then accesses the returned inpcb's inp_socket pointer without acquiring a lock on the inpcb. Structurally, it can't acquire this lock for lock order reasons -- it already holds the lock on its own inpcb. Therefore, we should only access fields that are safe to follow in an inpcb when you hold a reference via the hash lock and not a lock on the inpcb itself, which appears generally OK (+/-) for all the fields in that clause but the t->inp_socket->so_options dereference.
dj> >
dj> > A preferred fix would cache the SO_REUSEPORT flag in an inpcb-layer field, such as inp_flags2, giving us access to its value without having to walk into the attached (or not) socket.
dj> >
dj> > This raises another structural question, which is whether we need a new inp_foo flags field that is protected explicitly by the hash lock, and not by the inpcb lock, which could hold fields relevant to address binding. I don't think we need to solve that problem in this context, as a slight race on SO_REUSEPORT is likely acceptable.
dj> >
dj> > The suggested fix does perform the desired function of explicitly detaching the inpcb from the hash list before the socket is disconnected from the inpcb. However, it's incomplete in that the invariant that's being broken is also relied on for other protocols (such as raw sockets). The correct invariant is that inp_socket is safe to follow unconditionally if an inpcb is locked and INP_DROPPED isn't set -- the bug is in locked not in INP_DROPPED, which is why I think this is the wrong fix, even though it prevents a panic :-).
dj> >
dj> > Robert

dj> Hello Robert,

dj> Thank you for taking your valuable time to find out the problem.
dj> Since I don't have idea about network internals, would you have a patch
dj> about this? I'd be glad to test it, thanks again.

Here is the patch that implements what Robert suggests. Dave, could you test it?

dj> Best regards,
dj> Dave.

-- 
Mikolaj Golub

Index: sys/netinet/in_pcb.c
===================================================================
--- sys/netinet/in_pcb.c (revision 225885)
+++ sys/netinet/in_pcb.c (working copy)
@@ -575,8 +575,7 @@ in_pcbbind_setup(struct inpcb *inp, struct sockadd
 				     ntohl(t->inp_faddr.s_addr) == INADDR_ANY) &&
 				    (ntohl(sin->sin_addr.s_addr) != INADDR_ANY ||
 				     ntohl(t->inp_laddr.s_addr) != INADDR_ANY ||
-				     (t->inp_socket->so_options &
-				     SO_REUSEPORT) == 0) &&
+				     (t->inp_flags2 & INP_REUSEPORT) == 0) &&
 				    (inp->inp_cred->cr_uid !=
 				     t->inp_cred->cr_uid))
 					return (EADDRINUSE);
@@ -595,14 +594,15 @@ in_pcbbind_setup(struct inpcb *inp, struct sockadd
 				    (reuseport & tw->tw_so_options) == 0)
 					return (EADDRINUSE);
 			} else if (t &&
-			    (reuseport & t->inp_socket->so_options) == 0) {
+			    (reuseport == 0 ||
+			    (t->inp_flags2 & INP_REUSEPORT) == 0)) {
 #ifdef INET6
 				if (ntohl(sin->sin_addr.s_addr) !=
 				    INADDR_ANY ||
 				    ntohl(t->inp_laddr.s_addr) !=
 				    INADDR_ANY ||
-				    INP_SOCKAF(so) ==
-				    INP_SOCKAF(t->inp_socket))
+				    (inp->inp_vflag & INP_IPV6PROTO) == 0 ||
+				    (t->inp_vflag & INP_IPV6PROTO) == 0)
 #endif
 				return (EADDRINUSE);
 			}
@@ -1867,6 +1867,11 @@ in_pcbinshash_internal(struct inpcb *inp, int do_p
 	KASSERT((inp->inp_flags & INP_INHASHLIST) == 0,
 	    ("in_pcbinshash: INP_INHASHLIST"));
 
+	if ((inp->inp_socket->so_options & SO_REUSEPORT) != 0 ||
+	    (IN_MULTICAST(ntohl(inp->inp_laddr.s_addr)) &&
+	    (inp->inp_socket->so_options & SO_REUSEADDR) != 0))
+		inp->inp_flags2 |= INP_REUSEPORT;
+
 #ifdef INET6
 	if (inp->inp_vflag & INP_IPV6)
 		hashkey_faddr = inp->in6p_faddr.s6_addr32[3] /* XXX */;
Index: sys/netinet/in_pcb.h
===================================================================
--- sys/netinet/in_pcb.h (revision 225885)
+++ sys/netinet/in_pcb.h (working copy)
@@ -540,6 +540,7 @@ void inp_4tuple_get(struct inpcb *inp, uint32_t *
 #define	INP_LLE_VALID		0x0001 /* cached lle is valid */
 #define	INP_RT_VALID		0x0002 /* cached rtentry is valid */
 #define	INP_PCBGROUPWILD	0x0004 /* in pcbgroup wildcard list */
+#define
Re: Kernel panic on FreeBSD 9.0-beta2
On Mon, 26 Sep 2011 16:12:55 +0200 K. Macy wrote:

KM> Sorry, didn't look at the images (limited bw), I've seen something
KM> like this before in timewait. This can't happen with UDP so will be
KM> interested in learning more about the bug.

The panic can be easily triggered by this:

Attachment: test_udp.c (binary data)

The other thread at that moment is in soclose->sofree->udp_detach->in_pcbfree.

It looks to me that we should call in_pcbdrop() in udp_close() to remove the inpcb from the hashed lists, as is done in tcp_close().

With this patch I don't observe the panic.

Index: sys/netinet/udp_usrreq.c
===================================================================
--- sys/netinet/udp_usrreq.c (revision 225816)
+++ sys/netinet/udp_usrreq.c (working copy)
@@ -1486,6 +1486,7 @@ udp_close(struct socket *so)
 	inp = sotoinpcb(so);
 	KASSERT(inp != NULL, ("udp_close: inp == NULL"));
 	INP_WLOCK(inp);
+	in_pcbdrop(inp);
 	if (inp->inp_faddr.s_addr != INADDR_ANY) {
 		INP_HASH_WLOCK(&V_udbinfo);
 		in_pcbdisconnect(inp);

KM> On Mon, Sep 26, 2011 at 4:02 PM, Arnaud Lacombe <lacom...@gmail.com> wrote:
KM> > Hi,
KM> >
KM> > On Mon, Sep 26, 2011 at 5:12 AM, K. Macy <km...@freebsd.org> wrote:
KM> >> On Monday, September 26, 2011, Adrian Chadd <adr...@freebsd.org> wrote:
KM> >>> On 26 September 2011 13:41, Arnaud Lacombe <lacom...@gmail.com> wrote:
KM> >>>> /*
KM> >>>>  * XXX
KM> >>>>  * This entire block sorely needs a rewrite.
KM> >>>>  */
KM> >>>> if (t &&
KM> >>>>     ((t->inp_flags & INP_TIMEWAIT) == 0) &&
KM> >>>>     (so->so_type != SOCK_STREAM ||
KM> >>>>      ntohl(t->inp_faddr.s_addr) == INADDR_ANY) &&
KM> >>>>     (ntohl(sin->sin_addr.s_addr) != INADDR_ANY ||
KM> >>>>      ntohl(t->inp_laddr.s_addr) != INADDR_ANY ||
KM> >>>>      (t->inp_socket->so_options &
KM> >>>>      SO_REUSEPORT) == 0) &&
KM> >>>>     (inp->inp_cred->cr_uid !=
KM> >>>>      t->inp_cred->cr_uid))
KM> >>>>         return (EADDRINUSE);
KM> >>>> }
KM> >>>>
KM> >>>> more specifically, `t->inp_socket' is NULL. The top comment may not be
KM> >>>> relevant, as it's been here for the past 8 years.
KM> >>>
KM> >>> Why would t->inp_socket be NULL at this point?
KM> >>
KM> >> TIME_WAIT ...
KM> >
KM> > on UDP socket ?
KM> >
KM> >  - Arnaud

-- 
Mikolaj Golub
Re: soreceive_stream: mbuf leak if called with mp0 and MSG_WAITALL
On Sun, 04 Sep 2011 12:30:53 +0300 Mikolaj Golub wrote:

MG> Apparently soreceive_stream() has an issue if it is called to receive data as a
MG> mbuf chain (by supplying an non zero mbuf **mp0) and with MSG_WAITALL set.

MG> I ran into this issue with smbfs, which uses soreceive() exactly in this way
MG> (see netsmb/smb_trantcp.c:nbssn_recv()).

Stressing smbfs a little I also observed the following soreceive_stream() related panic:

#9  0x80a28c80 in panic (fmt=0x80f4b4a4 "sbappendstream 1") at /usr/src/sys/kern/kern_shutdown.c:606
#10 0x80a9746b in sbappendstream_locked (sb=0x8bff1874, m=0x8885a600) at /usr/src/sys/kern/uipc_sockbuf.c:527
#11 0x80bcef62 in tcp_do_segment (m=0x8885a600, th=0x8885a674, so=0x8bff1820, tp=0x8bb4f560, drop_hdrlen=52, tlen=51, iptos=0 '\0', ti_locked=1) at /usr/src/sys/netinet/tcp_input.c:2854
#12 0x80bd091d in tcp_input (m=0x8885a600, off0=20) at /usr/src/sys/netinet/tcp_input.c:1382
#13 0x80b5b4fe in ip_input (m=0x8885a600) at /usr/src/sys/netinet/ip_input.c:765
#14 0x80af504b in swi_net (arg=0x81825880) at /usr/src/sys/net/netisr.c:806
#15 0x809fd535 in intr_event_execute_handlers (p=0x86ddc588, ie=0x86d37200) at /usr/src/sys/kern/kern_intr.c:1257
#16 0x809fe419 in ithread_loop (arg=0x86d39bb0) at /usr/src/sys/kern/kern_intr.c:1270
#17 0x809fa7a8 in fork_exit (callout=0x809fe370 <ithread_loop>, arg=0x86d39bb0, frame=0x86926d28) at /usr/src/sys/kern/kern_fork.c:1029
#18 0x80d68914 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:275
(kgdb) fr 10
#10 0x80a9746b in sbappendstream_locked (sb=0x8bff1874, m=0x8885a600) at /usr/src/sys/kern/uipc_sockbuf.c:527
527		KASSERT(sb->sb_mb == sb->sb_lastrecord,("sbappendstream 1"));
(kgdb) l
522	sbappendstream_locked(struct sockbuf *sb, struct mbuf *m)
523	{
524		SOCKBUF_LOCK_ASSERT(sb);
525
526		KASSERT(m->m_nextpkt == NULL,("sbappendstream 0"));
527		KASSERT(sb->sb_mb == sb->sb_lastrecord,("sbappendstream 1"));
528
529		SBLASTMBUFCHK(sb);
530
531		sbcompress(sb, m, sb->sb_mbtail);
(kgdb) p m
$1 = (struct mbuf *) 0x8885a600
(kgdb) p m->m_hdr.mh_next
$2 = (struct mbuf *) 0x0
(kgdb) p sb->sb_mb
$3 = (struct mbuf *) 0x93965e00
(kgdb) p sb->sb_lastrecord
$4 = (struct mbuf *) 0x88cb0200
(kgdb) p sb
$5 = (struct sockbuf *) 0x8bff1874

This sb belonged to smb_iod_thread, which at that time was in soreceive_stream(), notifying the protocol that the buffer had been drained:

#1  0x80d74cb7 in ipi_nmi_handler () at /usr/src/sys/i386/i386/mp_machdep.c:1478
#2  0x80d7f383 in trap (frame=0xdc33ea58) at /usr/src/sys/i386/i386/trap.c:219
#3  0x80d6886c in calltrap () at /usr/src/sys/i386/i386/exception.s:168
#4  0x80a26955 in _rw_wlock_hard (rw=0x8d18fac0, tid=2285360576, file=0x80f68ceb "/usr/src/sys/netinet/tcp_usrreq.c", line=732) at cpufunc.h:294
#5  0x80a274d6 in _rw_wlock (rw=0x8d18fac0, file=0x80f68ceb "/usr/src/sys/netinet/tcp_usrreq.c", line=732) at /usr/src/sys/kern/kern_rwlock.c:240
#6  0x80bdf585 in tcp_usr_rcvd (so=0x8bff1820, flags=64) at /usr/src/sys/netinet/tcp_usrreq.c:732
#7  0x80a9cf63 in soreceive_stream (so=0x8bff1820, psa=0x0, uio=0xdc33ec10, mp0=0xdc33ec44, controlp=0x0, flagsp=0xdc33ec40) at /usr/src/sys/kern/uipc_socket.c:2097
#8  0x80a9a6c9 in soreceive (so=0x8bff1820, psa=0x0, uio=0xdc33ec10, mp0=0xdc33ec44, controlp=0x0, flagsp=0xdc33ec40) at /usr/src/sys/kern/uipc_socket.c:2309
#9  0x91165e14 in nbssn_recv (nbp=0x874a49c0, mpp=0xdc33ec98, lenp=0xdc33ec64, rpcodep=0xdc33ec6b "", td=0x8837d5c0) at /usr/src/sys/modules/smbfs/../../netsmb/smb_trantcp.c:378
#10 0x91165fee in smb_nbst_recv (vcp=0x8961ae00, mpp=0xdc33ec98, td=0x8837d5c0) at /usr/src/sys/modules/smbfs/../../netsmb/smb_trantcp.c:598
#11 0x9116bda1 in smb_iod_recvall (iod=0x88c64980) at /usr/src/sys/modules/smbfs/../../netsmb/smb_iod.c:305
#12 0x9116c82c in smb_iod_thread (arg=0x88c64980) at /usr/src/sys/modules/smbfs/../../netsmb/smb_iod.c:645
#13 0x809fa7a8 in fork_exit (callout=0x9116c600 <smb_iod_thread>, arg=0x88c64980, frame=0xdc33ed28) at /usr/src/sys/kern/kern_fork.c:1029
#14 0x80d68914 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:275
(kgdb) fr 7
#7  0x80a9cf63 in soreceive_stream (so=0x8bff1820, psa=0x0, uio=0xdc33ec10, mp0=0xdc33ec44, controlp=0x0, flagsp=0xdc33ec40) at /usr/src/sys/kern/uipc_socket.c:2097
2097			(*so->so_proto->pr_usrreqs->pru_rcvd)(so, flags);
(kgdb) l
2092		if ((so->so_proto->pr_flags & PR_WANTRCVD) &&
2093		    (((flags & MSG_WAITALL) && uio->uio_resid > 0) ||
2094		     !(flags & MSG_SOCALLBCK))) {
2095			SOCKBUF_UNLOCK(sb);
2096			VNET_SO_ASSERT(so);
2097			(*so->so_proto->pr_usrreqs->pru_rcvd)(so, flags);
2098			SOCKBUF_LOCK(sb);
2099		}
2100	}
2101
(kgdb) p
soreceive_stream: mbuf leak if called with mp0 and MSG_WAITALL
Hi,

Apparently soreceive_stream() has an issue if it is called to receive data as an mbuf chain (by supplying a non-NULL mbuf **mp0) and with MSG_WAITALL set.

I ran into this issue with smbfs, which uses soreceive() exactly in this way (see netsmb/smb_trantcp.c:nbssn_recv()).

If MSG_WAITALL is set and not all data has been received, it loops again, but on the next pass *mp0 is set to sb->sb_mb again, losing all previously received mbufs. It looks like the new mbufs should be appended to the end of the *mp0 chain instead. See the attached patch.

Also, in the "Copy the remainder" block we reduce uio_resid by m->m_len (the length of the last mbuf in the chain), but it looks like in the MSG_PEEK case the remainder may have more than one mbuf in the chain, so we have to reduce it by len (the length of the copied chain).

I don't have a test case to check the MSG_PEEK issue, but the patch fixes the issue with smbfs for me.

-- 
Mikolaj Golub

Index: sys/kern/uipc_socket.c
===================================================================
--- sys/kern/uipc_socket.c (revision 225368)
+++ sys/kern/uipc_socket.c (working copy)
@@ -2030,7 +2030,11 @@ deliver:
 	if (mp0 != NULL) {
 		/* Dequeue as many mbufs as possible. */
 		if (!(flags & MSG_PEEK) && len >= sb->sb_mb->m_len) {
-			for (*mp0 = m = sb->sb_mb;
+			if (*mp0 == NULL)
+				*mp0 = sb->sb_mb;
+			else
+				n->m_next = sb->sb_mb;
+			for (m = sb->sb_mb;
 			     m != NULL && m->m_len <= len;
 			     m = m->m_next) {
 				len -= m->m_len;
@@ -2052,7 +2056,7 @@ deliver:
 			if (m == NULL)
 				len = 0;	/* Don't flush data from sockbuf. */
 			else
-				uio->uio_resid -= m->m_len;
+				uio->uio_resid -= len;
 			if (*mp0 != NULL)
 				n->m_next = m;
 			else
@@ -2061,6 +2065,9 @@ deliver:
 				error = ENOBUFS;
 				goto out;
 			}
+			/* Update n to point to the last mbuf. */
+			for (; m != NULL; m = m->m_next)
+				n = m;
 		}
 	} else {
 		/* NB: Must unlock socket buffer as uiomove may sleep. */
Re: Problem using CARP + HAST ...
On Mon, 8 Aug 2011 16:54:10 +0200 Ferdinand Goldmann wrote:

FG> Hi!

FG> I am trying to create a common resource pool for a certain application using
FG> CARP/HAST as described in [1]. However while testing my setup I ran into a
FG> problem which I don't know how to fix or work around:

FG> If I shut down only the carp interface on the master (ifconfig carp0 down),
FG> the slave will take note of this, make his carp interface the master and
FG> mount the HAST storage using a script called by devd. Everything fine so far. BUT:

FG> If, however, I completely shut down the masters network connection (using shut on
FG> the switchport), the carp interface on the slave will still switch to master.
FG> But the script for making the HAST storage primary will just hang forever:

FG> root 46841 0.0 0.6 3628 1524 ?? S 4:21PM 0:00.08 /bin/sh /opt/bin/carp-hast-switch master
FG> root 47043 0.0 2.6 42228 6580 ?? S 4:22PM 0:00.03 hastd: hast0 (secondary) (hastd)

FG> Seemingly, this is because the hastd daemons on master and slave are unable to
FG> communicate. So the script waits forever for the secondary device to go away... :

FG>     # Wait for any "hastd secondary" processes to stop
FG>     for disk in ${resources}; do
FG>         while $( pgrep -lf "hastd: ${disk} \(secondary\)" > /dev/null 2>&1 ); do
FG>             sleep 1
FG>         done

Which FreeBSD version are you running? I suppose it is a release, because on STABLE this issue should be fixed: the secondary terminates after a timeout.

FG> Im a bit puzzled. Is there a way for hastd to make himself the master in case of a timeout
FG> or such? Because in normal operation, whenever the carp interface fails, the underlying
FG> infrastructure will most likely be down as well.

On a release you can just modify the script not to wait forever for the hastd secondary to stop; it will be terminated when the role is switched to primary. But anyway, my advice is to use STABLE :-).

-- 
Mikolaj Golub
Re: soreceive_stream: issues with O_NONBLOCK
On Thu, 07 Jul 2011 12:47:15 +0200 Andre Oppermann wrote:

AO> Please try this patch:

AO> http://people.freebsd.org/~andre/soreceive_stream.diff-20110707

It works for me. No issues detected so far. Thanks.

-- 
Mikolaj Golub
soreceive_stream: issues with O_NONBLOCK
Hi,

Trying soreceive_stream I found that many applications (like firefox, pidgin, gnus) might hang in soreceive_stream/sbwait. It turned out that the issue was with O_NONBLOCK connections: they blocked in recv() when they should not have. This can be checked with this simple test:

http://people.freebsd.org/~trociny/test_nonblock.c

In soreceive_stream we have the following code that looks wrong:

1968	/* Socket buffer is empty and we shall not block. */
1969	if (sb->sb_cc == 0 &&
1970	    ((sb->sb_flags & SS_NBIO) || (flags & (MSG_DONTWAIT|MSG_NBIO)))) {
1971		error = EAGAIN;
1972		goto out;
1973	}

It should check so->so_state against SS_NBIO, not sb->sb_flags.

But just changing this is not enough. The check is made too early, before checking that the socket state is not SBS_CANTRCVMORE. As a result, if the peer closes the connection, recv() returns EAGAIN instead of 0. See this example:

http://people.freebsd.org/~trociny/test_close.c

So I moved the nonblock check below the SBS_CANTRCVMORE check and ended up with this patch:

http://people.freebsd.org/~trociny/uipc_socket.c.soreceive_stream.patch

It works fine for me.

Also, this part looks wrong:

1958	/* We will never ever get anything unless we are connected. */
1959	if (!(so->so_state & (SS_ISCONNECTED|SS_ISDISCONNECTED))) {
1960		/* When disconnecting there may be still some data left. */
1961		if (sb->sb_cc > 0)
1962			goto deliver;
1963		if (!(so->so_state & SS_ISDISCONNECTED))
1964			error = ENOTCONN;
1965		goto out;
1966	}

Why do we check at 1959 that the state is not SS_ISDISCONNECTED? If that is valid, then the check at 1963 is useless because it will always be true. Shouldn't it be something like below?

	if (!(so->so_state & (SS_ISCONNECTED|SS_ISCONNECTING))) {
		/* When disconnecting there may be still some data left. */
		if (sb->sb_cc > 0)
			goto deliver;
		error = ENOTCONN;
		goto out;
	}

(I don't see why we shouldn't set ENOTCONN if the state is SS_ISDISCONNECTED.)

And the last one :-). Currently, to try soreceive_stream one needs to rebuild the kernel with the TCP_SORECEIVE_STREAM option and then set the net.inet.tcp.soreceive_stream tunable. Why do we need the TCP_SORECEIVE_STREAM option? Wouldn't the tunable be enough? It would make it simpler for users to try soreceive_stream, and we might get more testing and feedback.

-- 
Mikolaj Golub
Re: Scenario to make recv(MSG_WAITALL) stuck
On Sun, 19 Jun 2011 12:44:03 +0300 Kostik Belousov wrote:

KB> On Wed, Jun 15, 2011 at 09:44:33AM +0300, Mikolaj Golub wrote:
KB> > On Tue, 14 Jun 2011 12:23:03 +0300 Kostik Belousov wrote:
KB> >
KB> > KB> I do not understand what then happens for the recvfrom(2) call ?
KB> > KB> Would it get some error, or 0 as return and no data, or something else ?
KB> >
KB> > It will wait for data below in another loop ("Now continue to read any data
KB> > mbufs off of the head...").
KB> >
KB> > Elaborating, I would split soreceive_generic on three logical parts.
KB> >
KB> > In the first (restart) part we block until some data are received and also
KB> > (without the patch) in the case of MSG_WAITALL if the buffer is big enough we
KB> > block until all MSG_WAITALL request is received (actually it will spin in
KB> > "goto restart" loop until some condition becomes invalid).
KB> >
KB> > The second part is some processing of received data and the third part is a
KB> > "while" loop where data is copied to userspace and in the case of MSG_WAITALL
KB> > request if not all data is received to satisfy the request it also waits for
KB> > this data.
KB> >
KB> > My patch removes the condition in the first part in the case of MSG_WAITALL to
KB> > wait for all data if buffer is big enough. We always will wait for the rest of
KB> > data in the third part. It might be not so effective, and this is my first
KB> > concern about the patch (although not big :-).

KB> Now I think that this part of the patch is right.
KB> The loop in the soreceive_generic() would behave as I would expect
KB> it for MSG_WAITALL. It copyout the received data to userspace by
KB> received chunks.

KB> I do not understand your note about effectiveness there.

The old behaviour: if only a part of the request has been received and the buffer is large enough, wait for the rest and then go on to processing.

The new behaviour: if a part of the data has been received, (unconditionally) process it, then wait for the rest (and process that).

The first one looks a little more efficient (but has the issue in the edge case with a nearly full buffer).

-- 
Mikolaj Golub
Re: Scenario to make recv(MSG_WAITALL) stuck
On Tue, 14 Jun 2011 12:23:03 +0300 Kostik Belousov wrote:

KB> I do not understand what then happens for the recvfrom(2) call ?
KB> Would it get some error, or 0 as return and no data, or something else ?

It will wait for data below in another loop ("Now continue to read any data mbufs off of the head...").

Elaborating, I would split soreceive_generic into three logical parts.

In the first (restart) part we block until some data is received. Also (without the patch), in the MSG_WAITALL case, if the buffer is big enough we block until the whole MSG_WAITALL request is received (actually it spins in the "goto restart" loop until some condition becomes invalid).

The second part is some processing of the received data. The third part is a "while" loop where data is copied to userspace; for a MSG_WAITALL request, if not enough data has been received to satisfy the request, it also waits for this data.

My patch removes the condition in the first part that, in the MSG_WAITALL case, waits for all data if the buffer is big enough. We will always wait for the rest of the data in the third part. It might be less efficient, and this is my first concern about the patch (although not a big one :-).

KB> Also, what is the MT_CONTROL chunk about ?

When I removed the condition to skip blocking in the first part, I started to observe a panic on KASSERT(m->m_type == MT_DATA) in the following scenario (produced by HAST):

sender:

    send(4 bytes);   /* send protocol name */
    sendmsg();       /* send descriptor (normal data is empty, descriptor in control data) */

receiver:

    recv(127 bytes, MSG_WAITALL);   /* receive protocol name */
    recvmsg();                      /* receive descriptor */

Although the recv() has MSG_WAITALL, it exits after receiving 4 bytes because the next received data is of a different (MT_CONTROL) type. And it panicked when it got the control data.

It is unclear to me why MT_CONTROL data is not expected in that part. We do have processing of MT_CONTROL above (in the second part) in the code, but I still have a feeling that it is possible to create some scenario that breaks this assert without my patch too, although I have failed to find one so far.

This is my second concern about my patch, and a big one, because for now I am not sure that the patch is correct. Although I have not observed issues with it so far...

Also, I am not sure if it makes sense to bother with soreceive_generic() at all. Maybe it is more promising to spend time on maturing soreceive_stream(). As I see it, it is going to be a replacement for soreceive_generic() for stream sockets.

-- 
Mikolaj Golub
Automatic receive buffer sizing works only for connections in ESTABLISHED state
Hi,

Automatic receive buffer sizing works only for connections in the ESTABLISHED state. In tcp_input() the auto-resizing code is under the

	if (tp->t_state == TCPS_ESTABLISHED ...)

branch.

This is unfortunate for HAST, which uses connections in one direction only and shuts down the other direction, so the receiving socket is in FIN_WAIT_2 and auto resizing does not work there.

Is there some reason why it should work only for connections in the ESTABLISHED state, or should this be considered a bug?

-- 
Mikolaj Golub
Scenario to make recv(MSG_WAITALL) stuck
[...] But the window was closed when the buffer was filled, and to avoid silly window syndrome it opens only when the available space is larger than sb_hiwat/4 or maxseg:

tcp_output():

	/*
	 * Calculate receive window. Don't shrink window,
	 * but avoid silly window syndrome.
	 */
	if (recwin < (long)(so->so_rcv.sb_hiwat / 4) &&
	    recwin < (long)tp->t_maxseg)
		recwin = 0;

so it is stuck and pending data is only sent via TCP window probes.

It looks like the fix could be to remove the condition that blocks if MSG_WAITALL is set and it is possible to do the entire receive operation at once, as in this patch:

http://people.freebsd.org/~trociny/uipc_socket.c.soreceive_generic.MSG_DONTWAIT.patch

This works for me, but I am not sure it is a correct solution.

Note, the issue is not reproduced with soreceive_stream.

-- 
Mikolaj Golub
Re: Spurious ACKs, ICMP unreachable?
On Fri, 13 May 2011 14:38:34 -0700 Chuck Swiger wrote:

CS> On May 13, 2011, at 1:07 PM, Ivan Voras wrote:
>> I'm seeing an unusual problem at a remote machine; this machine is the
>> FreeBSD server, and the client is a probably Windows machine (but I don't
>> know the details yet). Something happens which causes FreeBSD to send ACKs
>> to the client, and the client to send ICMP unreachable messages to the
>> server. It is most likely a configuration error at the remote site but I
>> have no idea how to verify this.

CS> Let's look at just one connection:

CS> 18:56:02.711942 IP server.http > client.4732: Flags [.], ack 2110905191, win 0, length 0
CS> 18:56:02.713155 IP server.http > client.4732: Flags [.], ack 1, win 65535, length 0

CS> The packet is FreeBSD webserver sending ACKs with zero window size;
CS> that's a sign of congestion that the client should not be sending more
CS> data and instead doing periodic window probes until the local box opens
CS> the window again. The next packet on the same connection then ACK's
CS> something outside of the window with a 64K window size. That's wrong;
CS> the other side probably sends an RST and the ICMP error. If you have TSO
CS> enabled, try turning it off.

Might this be the thing that jhb@ was fixing in r221346?

--
Mikolaj Golub
Re: kern/154504: [libc] recv(2): PF_LOCAL stream connection is stuck in sbwait when recv(MSG_WAITALL) is used
Hi,

Does the attached patch fix the problem for you?

--
Mikolaj Golub

Index: sys/kern/uipc_socket.c
===================================================================
--- sys/kern/uipc_socket.c	(revision 220485)
+++ sys/kern/uipc_socket.c	(working copy)
@@ -1845,10 +1845,16 @@ dontblock:
 			}
 			SBLASTRECORDCHK(&so->so_rcv);
 			SBLASTMBUFCHK(&so->so_rcv);
-			error = sbwait(&so->so_rcv);
-			if (error) {
-				SOCKBUF_UNLOCK(&so->so_rcv);
-				goto release;
+			/*
+			 * We could receive some data while was notifying the
+			 * the protocol. Skip blocking in this case.
+			 */
+			if (so->so_rcv.sb_mb == NULL) {
+				error = sbwait(&so->so_rcv);
+				if (error) {
+					SOCKBUF_UNLOCK(&so->so_rcv);
+					goto release;
+				}
 			}
 			m = so->so_rcv.sb_mb;
 			if (m != NULL)
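The idea behind the patch, re-checking the buffer after the lock has been dropped and re-taken and sleeping only if it is still empty, can be shown with a small single-threaded model. Everything here is hypothetical scaffolding for the example: struct rcvbuf_model, drain_step() and the two notify callbacks are not kernel interfaces, and model_wait() only counts how often the code would have gone to sleep in sbwait().

```c
#include <assert.h>
#include <stddef.h>

struct rcvbuf_model {
	size_t	queued;		/* bytes sitting in the receive buffer */
	int	sleeps;		/* how many times we would have slept */
};

typedef void (*notify_fn)(struct rcvbuf_model *);

/* Stand-in for sbwait(): it only records that we would have blocked. */
static void
model_wait(struct rcvbuf_model *rb)
{
	rb->sleeps++;
}

/*
 * One pass of the "notify the protocol, then wait for more data" step.
 * In the kernel the lock is dropped around the notification, so data may
 * arrive in that window; the buffer is therefore re-checked before
 * sleeping, which is the check the patch adds.
 */
size_t
drain_step(struct rcvbuf_model *rb, notify_fn notify)
{
	assert(rb != NULL && notify != NULL);
	notify(rb);			/* lock not held here */
	if (rb->queued == 0)		/* re-check after re-locking */
		model_wait(rb);
	return (rb->queued);
}

/* Notification during which the rest of the data arrives. */
void
notify_with_arrival(struct rcvbuf_model *rb)
{
	rb->queued += 32768;
}

/* Notification during which nothing arrives. */
void
notify_quiet(struct rcvbuf_model *rb)
{
	(void)rb;
}
```

When data arrives during the notification, drain_step() returns without sleeping; without the re-check it would sleep on a buffer that already holds the data it is waiting for.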
recv() with MSG_WAITALL might stuck when receiving more than rcvbuf
Hi,

While testing HAST synchronization with both the primary and the secondary HAST instances running on the same host, I found that the synchronization may be very slow:

  Apr 9 14:04:04 kopusha hastd[3812]: [test] (primary) Synchronization complete. 512MB synchronized in 16m38s (525KB/sec).

hastd synchronizes data in MAXPHYS (131072 bytes) blocks. The sender splits each block into smaller chunks of MAX_SEND_SIZE (32768 bytes), while the receiver reads the whole block by calling recv() with the MSG_WAITALL option. Sometimes recv() gets stuck: in tcpdump I see that the sending side has sent all the chunks and all of them were acked, but the receiving thread is still waiting in recv(). netstat reports a non-empty Recv-Q for the receiving side (with the number of bytes usually equal to the size of the last sent chunk). It looked as if the receiving userspace was not informed by the kernel that all the data had arrived.

I can reproduce the issue with the attached test_MSG_WAITALL.c.

I think the issue is in soreceive_generic(). If MSG_WAITALL is set but the request is larger than the receive buffer, it has to do the receive in sections. So after receiving some data it notifies the protocol (calls pr_usrreqs->pru_rcvd) about the data, releasing the so_rcv lock. On return it blocks in sbwait(), waiting for the rest of the data. I think there is a race: while it was in pr_usrreqs->pru_rcvd without holding the lock, the rest of the data could arrive. Thus it should check for this before calling sbwait(). See the attached uipc_socket.c.soreceive.patch. The patch fixes the issue for me:

  Apr 9 14:16:40 kopusha hastd[2926]: [test] (primary) Synchronization complete. 512MB synchronized in 4s (128MB/sec).

I observed the problem on STABLE but believe the same is true of CURRENT.

BTW, I also tried the optimized version of soreceive(), soreceive_stream(). It does not have this problem. But with it I observed TCP connections getting stuck in soreceive_stream() when starting firefox (with many tabs) or pidgin (with many contacts). The processes could be killed only with -9.
I did not investigate this much though.

--
Mikolaj Golub

test_MSG_WAITALL.c
Description: Binary data

Index: sys/kern/uipc_socket.c
===================================================================
--- sys/kern/uipc_socket.c	(revision 220472)
+++ sys/kern/uipc_socket.c	(working copy)
@@ -1836,28 +1836,34 @@ dontblock:
 			/*
 			 * Notify the protocol that some data has been
 			 * drained before blocking.
 			 */
 			if (pr->pr_flags & PR_WANTRCVD) {
 				SOCKBUF_UNLOCK(&so->so_rcv);
 				VNET_SO_ASSERT(so);
 				(*pr->pr_usrreqs->pru_rcvd)(so, flags);
 				SOCKBUF_LOCK(&so->so_rcv);
 			}
 			SBLASTRECORDCHK(&so->so_rcv);
 			SBLASTMBUFCHK(&so->so_rcv);
-			error = sbwait(&so->so_rcv);
-			if (error) {
-				SOCKBUF_UNLOCK(&so->so_rcv);
-				goto release;
+			/*
+			 * We could receive some data while was notifying the
+			 * the protocol. Skip blocking in this case.
+			 */
+			if (so->so_rcv.sb_mb == NULL) {
+				error = sbwait(&so->so_rcv);
+				if (error) {
+					SOCKBUF_UNLOCK(&so->so_rcv);
+					goto release;
+				}
 			}
 			m = so->so_rcv.sb_mb;
 			if (m != NULL)
 				nextrecord = m->m_nextpkt;
 		}
 	}
 
 	SOCKBUF_LOCK_ASSERT(&so->so_rcv);
 	if (m != NULL && pr->pr_flags & PR_ATOMIC) {
 		flags |= MSG_TRUNC;
 		if ((flags & MSG_PEEK) == 0)
 			(void) sbdroprecord_locked(&so->so_rcv);
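The traffic pattern described in this message, a sender writing a block in smaller chunks while the receiver asks for the whole block with MSG_WAITALL, can be exercised with a short user-space program. This is not the attached test_MSG_WAITALL.c, whose contents are not shown here; it is a minimal sketch that assumes a local stream socketpair instead of a TCP connection, and the names write_all() and waitall_roundtrip() are made up for the example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Write all of buf, retrying on short writes. Returns 0 or -1. */
static int
write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);
		if (n <= 0)
			return (-1);
		buf += n;
		len -= (size_t)n;
	}
	return (0);
}

/*
 * A child process sends `block` bytes in `chunk`-sized writes; the parent
 * receives the whole block with a single recv(MSG_WAITALL).
 * Returns the number of bytes received, or -1 on a setup error.
 */
ssize_t
waitall_roundtrip(size_t block, size_t chunk)
{
	int sv[2];
	pid_t pid;
	char *buf;
	ssize_t got;

	assert(chunk > 0 && block % chunk == 0);
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return (-1);
	buf = calloc(1, block);
	pid = (buf != NULL) ? fork() : -1;
	if (pid < 0) {
		free(buf);
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	if (pid == 0) {
		/* Child: the sending side. */
		close(sv[0]);
		for (size_t off = 0; off < block; off += chunk)
			if (write_all(sv[1], buf + off, chunk) < 0)
				_exit(1);
		_exit(0);
	}
	/* Parent: the receiving side. */
	close(sv[1]);
	got = recv(sv[0], buf, block, MSG_WAITALL);
	close(sv[0]);
	free(buf);
	(void)waitpid(pid, NULL, 0);
	return (got);
}
```

Calling waitall_roundtrip(131072, 32768) mirrors the MAXPHYS and MAX_SEND_SIZE values from the message. On a kernel with the race, a call like this may hang in recv() instead of returning the full block.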
bsnmp/snmpmod.h: #include sys/queue.h is missed
Hi,

bsnmp/snmpmod.h uses SLIST but does not include sys/queue.h. This breaks the net-mgmt/bsnmp-ucd port (ports/153153).

Could somebody look at the attached patch?

--
Mikolaj Golub

Index: contrib/bsnmp/snmpd/snmpmod.h
===================================================================
--- contrib/bsnmp/snmpd/snmpmod.h	(revision 216439)
+++ contrib/bsnmp/snmpd/snmpmod.h	(working copy)
@@ -33,6 +33,7 @@
 #ifndef snmpmod_h_
 #define snmpmod_h_
 
+#include <sys/queue.h>
 #include <sys/types.h>
 #include <sys/socket.h>
 #include <net/if.h>
Re: bsnmp/snmpmod.h: #include sys/queue.h is missed
On Sat, 18 Dec 2010 13:03:58 +0200 Kostik Belousov wrote:

KB> On Sat, Dec 18, 2010 at 12:48:38PM +0200, Mikolaj Golub wrote:
>> Hi,
>>
>> bsnmp/snmpmod.h uses SLIST but does not include sys/queue.h. This breaks
>> the net-mgmt/bsnmp-ucd port (ports/153153).
>>
>> Could somebody look at the attached patch?

KB> sys/types.h, as well as sys/param.h should be included before
KB> other headers.

Thanks. Overlooked this :-).

--
Mikolaj Golub

Index: contrib/bsnmp/snmpd/snmpmod.h
===================================================================
--- contrib/bsnmp/snmpd/snmpmod.h	(revision 216439)
+++ contrib/bsnmp/snmpd/snmpmod.h	(working copy)
@@ -34,6 +34,7 @@
 #define snmpmod_h_
 
 #include <sys/types.h>
+#include <sys/queue.h>
 #include <sys/socket.h>
 #include <net/if.h>
 #include <netinet/in.h>
net/if_epair.c: semicolon missed
Hi,

In net/if_epair.c a semicolon is missing in epair_nh_drainedcpu() (see the patch below). This shows up when compiling with EPAIR_DEBUG.

Also, what was the reason to declare the epair_debug MIB as XINT? Shouldn't it be just INT?

--
Mikolaj Golub

Index: sys/net/if_epair.c
===================================================================
--- sys/net/if_epair.c	(revision 215576)
+++ sys/net/if_epair.c	(working copy)
@@ -305,7 +305,7 @@ epair_nh_drainedcpu(u_int cpuid)
 		if ((ifp->if_drv_flags & IFF_DRV_OACTIVE) != 0) {
 			/* Our hwq overflew again. */
-			epair_dpcpu->epair_drv_flags |= IFF_DRV_OACTIVE
+			epair_dpcpu->epair_drv_flags |= IFF_DRV_OACTIVE;
 			DPRINTF("hw queue length overflow at %u\n",
 			    epair_nh.nh_qlimit);
 			break;
Re: kern/146845: [libc] close(2) returns error 54 (connection reset by peer) wrongly
On Fri, 28 May 2010 04:40:03 GMT Lavrentiev, Anton (NIH/NLM/NCBI) [C] wrote:

> IMHO, it is not, unfortunately, a solution: it seems to clear ECONNRESET
> blindly and w/o distinguishing the situation when the remote end closes the
> connection prematurely (i.e. before acknowledging all data written from the
> local end) -- and that qualifies for the true connection reset by peer
> from close()...

I did some experiments, the results of which I would like to share here. The idea is the following: the client sends, in one write(), more data than the window, while the server closes the connection without reading (sending RST on close). I also played with the LINGER option.

I managed to get ECONNRESET only on write(), and only if the server sends RST before the client calls write(). In all other cases write()/close() returned without error. See the attachment for details.

So I think that with the workaround (ignoring ECONNRESET returned by sodisconnect() in soclose()) we would not make the situation worse, while it fixes the issue with applications unexpectedly getting ECONNRESET after a shutdown()/close() sequence.

--
Mikolaj Golub

test_tcp_close.c
Description: Binary data
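One common way to make a TCP endpoint send RST on close, as the server side does in the experiment above, is to enable SO_LINGER with a zero linger time. The helper below is a sketch of that technique only; the name set_abortive_close() is made up for the example, and nothing here claims to match what the attached test_tcp_close.c does.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>

/*
 * Enable SO_LINGER with a zero timeout on fd. For a connected TCP socket
 * this makes close() abort the connection (RST) instead of performing
 * the normal FIN handshake. Returns 0 on success, -1 with errno set
 * otherwise.
 */
int
set_abortive_close(int fd)
{
	struct linger ling;

	assert(fd >= 0);
	ling.l_onoff = 1;	/* lingering enabled ... */
	ling.l_linger = 0;	/* ... with a zero timeout */
	return (setsockopt(fd, SOL_SOCKET, SO_LINGER, &ling, sizeof(ling)));
}
```

A server would call set_abortive_close(connfd) on the accepted socket and then close it without reading; the client can then observe on which call, if any, ECONNRESET is reported.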
Re: kern/146845: [libc] close(2) returns error 54 (connection reset by peer) wrongly
The following reply was made to PR kern/146845; it has been noted by GNATS.

From: Mikolaj Golub to.my.troc...@gmail.com
To: freebsd-net@FreeBSD.org
Cc: Lavrentiev, Anton (NIH/NLM/NCBI) [C] l...@ncbi.nlm.nih.gov, Robert N. M. Watson rwat...@freebsd.org, bug-follo...@freebsd.org
Subject: Re: kern/146845: [libc] close(2) returns error 54 (connection reset by peer) wrongly
Date: Sun, 30 May 2010 11:05:45 +0300

On Fri, 28 May 2010 04:40:03 GMT Lavrentiev, Anton (NIH/NLM/NCBI) [C] wrote:

> IMHO, it is not, unfortunately, a solution: it seems to clear ECONNRESET
> blindly and w/o distinguishing the situation when the remote end closes the
> connection prematurely (i.e. before acknowledging all data written from the
> local end) -- and that qualifies for the true connection reset by peer
> from close()...

I did some experiments, the results of which I would like to share here. The idea is the following: the client sends, in one write(), more data than the window, while the server closes the connection without reading (sending RST on close). I also played with the LINGER option.

I managed to get ECONNRESET only on write(), and only if the server sends RST before the client calls write(). In all other cases write()/close() returned without error. See the attachment for details.

So I think that with the workaround (ignoring ECONNRESET returned by sodisconnect() in soclose()) we would not make the situation worse, while it fixes the issue with applications unexpectedly getting ECONNRESET after a shutdown()/close() sequence.
--
Mikolaj Golub

test_tcp_close.c
Description: Binary data
Re: kern/146845: [libc] close(2) returns error 54 (connection reset by peer) wrongly
The following reply was made to PR kern/146845; it has been noted by GNATS.

From: Mikolaj Golub to.my.troc...@gmail.com
To: Lavrentiev, Anton (NIH/NLM/NCBI) [C] l...@ncbi.nlm.nih.gov
Cc: Robert N. M. Watson rwat...@freebsd.org, freebsd-net@FreeBSD.org, bug-follo...@freebsd.org
Subject: Re: kern/146845: [libc] close(2) returns error 54 (connection reset by peer) wrongly
Date: Fri, 28 May 2010 12:26:33 +0300

On Fri, 28 May 2010 04:40:03 GMT Lavrentiev, Anton (NIH/NLM/NCBI) [C] wrote:

LA> IMHO, it is not, unfortunately, a solution: it seems to clear ECONNRESET
LA> blindly and w/o distinguishing the situation when the remote end closes the
LA> connection prematurely (i.e. before acknowledging all data written from the
LA> local end) -- and that qualifies for the true connection reset by peer
LA> from close()...

I am not very familiar with the socket/TCP code, but it looks to me as though it might not make any difference. I may be wrong here, but the situation you have described as a true connection reset by peer seems to take the following path in the code:

    soclose() -> sodisconnect() -> tcp_usr_disconnect() -> tcp_disconnect()

But tcp_disconnect() does not return an error, so we will not get ECONNRESET in any case.

Maybe you have a good test suite to reproduce this situation? :-)

--
Mikolaj Golub
Re: kern/146845: [libc] close(2) returns error 54 (connection reset by peer) wrongly
:

1) after shutdown() our output is closed;
2) then we call close(); soclose() checks that we are still in SS_ISCONNECTED and calls sodisconnect();
3) at this time a FIN arrives from the other end, which has called close() too, and the kernel disconnects the socket (INP_DROPPED is set);
4) sodisconnect()/tcp_usr_disconnect() checks for INP_DROPPED and returns ECONNRESET.

I am attaching a patch, which may not be a solution but rather an illustration of what is described above. Running the test with this patch, I observe the following messages in the error logs:

  May 27 23:55:41 zhuzha kernel: ECONNRESET: so-state: 0x2000; file /usr/src/sys/kern/uipc_socket.c; line 664

and the test does not fail.

--
Mikolaj Golub

tcp_close.c
Description: Binary data
uipc_socket.c.econnreset.patch
Description: Binary data
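The user-space half of the sequence in steps 1 and 2 is just shutdown() followed by close(), with the result of each call checked. The helper below is a minimal sketch of that sequence; the name shutdown_then_close() is an assumption for the example, and this is not the attached tcp_close.c. Reproducing the race itself additionally needs a TCP peer that calls close() at about the same time.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Close our sending direction, then close the descriptor.
 * Returns 0 on success, or the errno value of the first call that
 * failed. On an affected kernel the close() step is where a spurious
 * ECONNRESET (error 54 on FreeBSD) would be reported.
 */
int
shutdown_then_close(int fd)
{
	assert(fd >= 0);
	if (shutdown(fd, SHUT_WR) < 0) {
		int error = errno;

		(void)close(fd);
		return (error);
	}
	if (close(fd) < 0)
		return (errno);
	return (0);
}
```

Running this on one end of a TCP connection while the other end simply calls close() corresponds to the scenario in the list above.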
Re: kern/146845: [libc] close(2) returns error 54 (connection reset by peer) wrongly
The following reply was made to PR kern/146845; it has been noted by GNATS.

From: Mikolaj Golub to.my.troc...@gmail.com
To: bug-follo...@freebsd.org
Cc: Anton Lavrentiev l...@ncbi.nlm.nih.gov, Robert Watson rwat...@freebsd.org, freebsd-net@FreeBSD.org
Subject: Re: kern/146845: [libc] close(2) returns error 54 (connection reset by peer) wrongly
Date: Fri, 28 May 2010 00:25:42 +0300

Hi,

We observed the same issue on our FreeBSD 6 and 7 servers. I tried to reproduce the problem by writing a simple test case but failed: the idea of a shutdown()/close() sequence did not occur to me (as it did to Anton). Looking now at the code we had the issue with, I see that a shutdown()/close() sequence was used there too.

It looks like SO_LINGER is not important for reproducing ECONNRESET: shutdown()/close() on one end and close() on the other is enough. Also, slowing down one of the processes (done by Anton using select()) is not important either.

Taking this into consideration, I have written a simplified version of a test to reproduce the bug (maybe it is worth including in tools/regression/sockets?).

I can easily reproduce the error with this test on FreeBSD 7.1 and 8-STABLE.

By adding some prints to the kernel code I localized the place where the error appears, and added a panic() to get a backtrace. So, the backtrace:

(kgdb) bt
#0 doadump () at pcpu.h:246
#1 0xc04ec829 in db_fncall (dummy1=-1064461270, dummy2=0, dummy3=-1, dummy4=0xe85e58b0 "ÄX^è") at /usr/src/sys/ddb/db_command.c:548
#2 0xc04ecc5f in db_command (last_cmdp=0xc0e0af9c, cmd_table=0x0, dopager=0) at /usr/src/sys/ddb/db_command.c:445
#3 0xc04ecd14 in db_command_script (command=0xc0e0bec4 "call doadump") at /usr/src/sys/ddb/db_command.c:516
#4 0xc04f0e50 in db_script_exec (scriptname=0xe85e59bc "kdb.enter.panic", warnifnotfound=Variable "warnifnotfound" is not available.
) at /usr/src/sys/ddb/db_script.c:302 #5 0xc04f0f37 in db_script_kdbenter (eventname=0xc0cc78ea panic) at /usr/src/sys/ddb/db_script.c:324 #6 0xc04eec18 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:228 #7 0xc08d9aa6 in kdb_trap (type=3, code=0, tf=0xe85e5af8) at /usr/src/sys/kern/subr_kdb.c:535 #8 0xc0befecb in trap (frame=0xe85e5af8) at /usr/src/sys/i386/i386/trap.c:690 #9 0xc0bd15eb in calltrap () at /usr/src/sys/i386/i386/exception.s:165 #10 0xc08d9c2a in kdb_enter (why=0xc0cc78ea panic, msg=0xc0cc78ea panic) at cpufunc.h:71 #11 0xc08a95b6 in panic (fmt=0xc0ce6585 ECONNRESET) at /usr/src/sys/kern/kern_shutdown.c:562 #12 0xc0a3d805 in tcp_usr_disconnect (so=0xc715c670) at /usr/src/sys/netinet/tcp_usrreq.c:552 #13 0xc09111bd in sodisconnect (so=0xc715c670) at /usr/src/sys/kern/uipc_socket.c:810 #14 0xc0914144 in soclose (so=0xc715c670) at /usr/src/sys/kern/uipc_socket.c:658 #15 0xc08f6459 in soo_close (fp=0xc743e230, td=0xc7023000) at /usr/src/sys/kern/sys_socket.c:291 #16 0xc086efc3 in _fdrop (fp=0xc743e230, td=0xc7023000) at file.h:293 #17 0xc0870cf0 in closef (fp=0xc743e230, td=0xc7023000) at /usr/src/sys/kern/kern_descrip.c:2117 #18 0xc0871097 in kern_close (td=0xc7023000, fd=4) at /usr/src/sys/kern/kern_descrip.c:1162 #19 0xc087123a in close (td=0xc7023000, uap=0xe85e5cf8) at /usr/src/sys/kern/kern_descrip.c:1114 #20 0xc0bef600 in syscall (frame=0xe85e5d38) at /usr/src/sys/i386/i386/trap.c: #21 0xc0bd1680 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:261 #22 0x0033 in ?? () Previous frame inner to this frame (corrupt stack?) 
(kgdb) fr 12
#12 0xc0a3d805 in tcp_usr_disconnect (so=0xc715c670) at /usr/src/sys/netinet/tcp_usrreq.c:552
552			panic("ECONNRESET");
(kgdb) list
547		inp = sotoinpcb(so);
548		KASSERT(inp != NULL, ("tcp_usr_disconnect: inp == NULL"));
549		INP_WLOCK(inp);
550		if (inp->inp_flags & (INP_TIMEWAIT | INP_DROPPED)) {
551			error = ECONNRESET;
552			panic("ECONNRESET");
553			/* log(LOG_INFO, "ECONNRESET 3: file %s; line %d\n", __FILE__, __LINE__); */
554			goto out;
555		}
556		tp = intotcpcb(inp);
(kgdb) p/x inp->inp_flags
$1 = 0x480

#define INP_DROPPED 0x0400 /* protocol drop flag */

(kgdb) fr 14
#14 0xc0914144 in soclose (so=0xc715c670) at /usr/src/sys/kern/uipc_socket.c:658
658				error = sodisconnect(so);
(kgdb) list
653
654		CURVNET_SET(so->so_vnet);
655		funsetown(&so->so_sigio);
656		if (so->so_state & SS_ISCONNECTED) {
657			if ((so->so_state & SS_ISDISCONNECTING) == 0) {
658				error = sodisconnect(so);
659				if (error) {
660					if (error == ENOTCONN)
661						error = 0
Re: sockstat / netstat output 8.x vs 7.x
On Tue, 11 May 2010 13:24:02 -0700 Julian Elischer wrote: JE On 5/11/10 12:20 PM, Wes Peters wrote: The output header is instructive: USER COMMANDPID FD PROTO LOCAL ADDRESS FOREIGN ADDRESS www httpd 18423 3 tcp4 6 *:80 *:* www httpd 18423 4 tcp4 *:* *:* www httpd 25184 3 tcp4 6 *:80 *:* www httpd 25184 4 tcp4 *:* *:* Same as 7, it's the foreign address. This is normally only useful for connected sockets. On Tue, May 11, 2010 at 11:14 AM, Mike Tancsam...@sentex.net wrote: [trying on freebsd-net since no response on stable] I noticed that apache on RELENG_8 and RELENG_7 shows up with output I cant seem to understand from sockstat -l and netstat -naW On RELENG_7, sockstat -l makes sense to me www httpd 83005 4 tcp4 *:443 *:* www httpd 82217 3 tcp4 *:80 *:* www httpd 82217 4 tcp4 *:443 *:* www httpd 38942 3 tcp4 *:80 *:* www httpd 38942 4 tcp4 *:443 *:* root httpd 1169 3 tcp4 *:80 *:* root httpd 1169 4 tcp4 *:443 *:* various processes listening on all bound IP addresses on ports 80 and 443. On RELENG_8 however, it shows up with an extra entry (at the end) www httpd 29005 4 tcp4 *:* *:* www httpd 29004 3 tcp4 6 *:80 *:* www httpd 29004 4 tcp4 *:* *:* www httpd 29003 3 tcp4 6 *:80 *:* www httpd 29003 4 tcp4 *:* *:* www httpd 66731 3 tcp4 6 *:80 *:* www httpd 66731 4 tcp4 *:* *:* root httpd 72197 3 tcp4 6 *:80 *:* root httpd 72197 4 tcp4 *:* *:* *:80 makes sense to me... process is listening on all IPs for port 80. What does *:* mean then ? JE I believe it has created a socket but not used it for anything JE it may be the 6 socket... otherwise I don't see what a tcp4 6 is JE meant to be. Comparing RELENG_8 and RELENG_7 outputs it might be for https, which looks like is not configured on RELENG_8 host. I think socket() was called but no any other actions with the socket was performed. 
Netstat gives a slightly different version of it

Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address  Foreign Address  (state)
tcp4       0      0 *.1984         *.*              LISTEN
tcp4       0      0 *.*            *.*              CLOSED
tcp46      0      0 *.80           *.*              LISTEN

state closed ?

You can reproduce this with this simple program:

zhuzha:~/src/test_socket% cat test.c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <unistd.h>
#include <err.h>

int
main(int argc, char **argv)
{
	int sockfd;

	if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
		errx(1, "socket error");
	sleep(60);
	return 0;
}
zhuzha:~/src/test_socket% make
cc -g -O0 -Wall test.c -o test
zhuzha:~/src/test_socket% ./test &
[1] 56076
zhuzha:~/src/test_socket% sockstat|grep test
golub    test       56076 3  tcp4   *:*                   *:*
zhuzha:~/src/test_socket% netstat -na |grep CLOSED
tcp4       0      0 *.*            *.*              CLOSED

--
Mikolaj Golub
Re: kern/133902: [tun] Killing tun0 iface ssh tunnel causes Panic String: page fault
This PR is a duplicate of kern/116837, so I think we can close it. The problem is fixed in CURRENT and 8-STABLE, and there is a patch for 7-STABLE (see kern/116837 for details).

--
Mikolaj Golub
Re: kern/133902: [tun] Killing tun0 iface ssh tunnel causes Panic String: page fault
The following reply was made to PR kern/133902; it has been noted by GNATS.

From: Mikolaj Golub to.my.troc...@gmail.com
To: bug-follo...@freebsd.org
Cc: Leonardo Santagostini lsantagost...@gmail.com, Bjoern A. Zeeb b...@freebsd.org, freebsd-net@FreeBSD.org
Subject: Re: kern/133902: [tun] Killing tun0 iface ssh tunnel causes Panic String: page fault
Date: Mon, 03 May 2010 10:41:34 +0300

This PR is a duplicate of kern/116837, so I think we can close it. The problem is fixed in CURRENT and 8-STABLE, and there is a patch for 7-STABLE (see kern/116837 for details).

--
Mikolaj Golub
Re: FreeBSD 8.0-STABLE mpd - system freeze
On Sun, 2 May 2010 12:46:19 +0200 (CEST) Roar Pettersen wrote:

> Upgraded some servers from 7.2-stable to 8.0-stable in early April, and
> since then I have seen stability problems with 8.0 servers which use mpd
> (vpn). I have tried several mpd versions (5.5, 5.3 and 5.1), but the system
> freezes within 6 hours or 3-5 days. Early in April we got a typical watchdog
> timeout error message just before the system froze, but now we don't get
> any error message.

Could you try disabling flowtable to see if it helps?

    sysctl -w net.inet.flowtable.enable=0

> Sometimes we also see that the mpd process goes into a non-killable state,
> and then when I execute a shutdown -r the system hangs with this message:
>
> stopping mpd5
> Waiting for PIDS : 1148
> 30 second watchdog timeout expired. Shutdown terminated.
> Apr 29 21:04:52 init : some process would not die; ps axl advised
> Waiting (max 60 seconds) for system process 'vnlru' to stop...

Could you provide the output of

    procstat -kk <mpdpid>

when this happens again, or even better:

    procstat -akk

--
Mikolaj Golub
Re: Races on alias deletion
I have sent a PR about this issue: kern/146250

On Wed, 21 Apr 2010 08:28:48 +0300 Mikolaj Golub wrote:

MG> Hi,

MG> Accidentally, due to a misconfiguration of our tools, we ran the deletion
MG> of the same interface alias simultaneously and crashed the box (FreeBSD-7.1).

MG> So I did some experiments on my 8-STABLE (I have CURRENT in virtualbox only)
MG> to investigate this, running two scripts concurrently, each adding and
MG> deleting the same address:

MG> while true; do
MG>     ifconfig $IFACE alias $IP
MG>     ifconfig $IFACE -alias $IP
MG> done

MG> The box crashed just after I started the second script. The crash was in
MG> in_control(), on removing ia->ia_ifa from the ifp->if_addrhead list, because
MG> there was no check that the address was still in the list before removing it.

MG> panic: Bad link elm 0xcd2f3b00 prev->next != elm

MG> #0 doadump () at pcpu.h:246
MG> #1 0xc04ec829 in db_fncall (dummy1=-1064461270, dummy2=0, dummy3=-1, dummy4=0xe9a737fc "\0208╖И")
MG>     at /usr/src/sys/ddb/db_command.c:548
MG> #2 0xc04ecc5f in db_command (last_cmdp=0xc0e0ab9c, cmd_table=0x0, dopager=0)
MG>     at /usr/src/sys/ddb/db_command.c:445
MG> #3 0xc04ecd14 in db_command_script (command=0xc0e0bac4 "call doadump") at /usr/src/sys/ddb/db_command.c:516
MG> #4 0xc04f0e50 in db_script_exec (scriptname=0xe9a73908 "kdb.enter.panic", warnifnotfound=Variable "warnifnotfound" is not available.
MG ) MG at /usr/src/sys/ddb/db_script.c:302 MG #5 0xc04f0f37 in db_script_kdbenter (eventname=0xc0cc760a panic) at /usr/src/sys/ddb/db_script.c:324 MG #6 0xc04eec18 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:228 MG #7 0xc08d9aa6 in kdb_trap (type=3, code=0, tf=0xe9a73a44) at /usr/src/sys/kern/subr_kdb.c:535 MG #8 0xc0befbeb in trap (frame=0xe9a73a44) at /usr/src/sys/i386/i386/trap.c:690 MG #9 0xc0bd130b in calltrap () at /usr/src/sys/i386/i386/exception.s:165 MG #10 0xc08d9c2a in kdb_enter (why=0xc0cc760a panic, msg=0xc0cc760a panic) at cpufunc.h:71 MG #11 0xc08a95b6 in panic (fmt=0xc0c61bc0 Bad link elm %p prev-next != elm) MG at /usr/src/sys/kern/kern_shutdown.c:562 MG #12 0xc09ba87f in in_control (so=0xcdbd519c, cmd=2149607705, data=0xcd3db120 fxp0, ifp=0xc5b94c00, MG td=0xc92ddb90) at /usr/src/sys/netinet/in.c:604 MG #13 0xc095d400 in ifioctl (so=0xcdbd519c, cmd=2149607705, data=0xcd3db120 fxp0, td=0xc92ddb90) MG at /usr/src/sys/net/if.c:2516 MG #14 0xc08f69d5 in soo_ioctl (fp=0xcdc90af0, cmd=2149607705, data=0xcd3db120, active_cred=0xc9d78400, MG td=0xc92ddb90) at /usr/src/sys/kern/sys_socket.c:212 MG #15 0xc08f0a2d in kern_ioctl (td=0xc92ddb90, fd=3, com=2149607705, data=0xcd3db120 fxp0) at file.h:262 MG #16 0xc08f0bb4 in ioctl (td=0xc92ddb90, uap=0xe9a73cf8) at /usr/src/sys/kern/sys_generic.c:678 MG #17 0xc0bef320 in syscall (frame=0xe9a73d38) at /usr/src/sys/i386/i386/trap.c: MG #18 0xc0bd13a0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:261 MG #19 0x0033 in ?? () MG Previous frame inner to this frame (corrupt stack?) 
MG> (kgdb) fr 12
MG> #12 0xc09ba87f in in_control (so=0xcdbd519c, cmd=2149607705, data=0xcd3db120 "fxp0", ifp=0xc5b94c00,
MG>     td=0xc92ddb90) at /usr/src/sys/netinet/in.c:604
MG> 604		TAILQ_REMOVE(&ifp->if_addrhead, &ia->ia_ifa, ifa_link);
MG> (kgdb) list
MG> 599		default:
MG> 600			panic("in_control: unsupported ioctl");
MG> 601		}
MG> 602
MG> 603		IF_ADDR_LOCK(ifp);
MG> 604		TAILQ_REMOVE(&ifp->if_addrhead, &ia->ia_ifa, ifa_link);
MG> 605		IF_ADDR_UNLOCK(ifp);
MG> 606		ifa_free(&ia->ia_ifa);	/* if_addrhead */
MG> 607
MG> 608		IN_IFADDR_WLOCK();

MG> The first patch in the attachments fixed this type of crash for me, but the
MG> box then started to crash in in_lltable_prefix_free (now the scripts had to
MG> run for a few seconds before it happened).

MG> (kgdb) bt
MG> #0 doadump () at pcpu.h:246
MG> #1 0xc04ec829 in db_fncall (dummy1=1, dummy2=0, dummy3=-1056922880, dummy4=0xe8636760 )
MG>     at /usr/src/sys/ddb/db_command.c:548
MG> #2 0xc04ecc21 in db_command (last_cmdp=0xc0e0ac1c, cmd_table=0x0, dopager=1)
MG>     at /usr/src/sys/ddb/db_command.c:445
MG> #3 0xc04ecd7a in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
MG> #4 0xc04eec1d in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:229
MG> #5 0xc08d9aa6 in kdb_trap (type=12, code=0, tf=0xe863694c) at /usr/src/sys/kern/subr_kdb.c:535
MG> #6 0xc0beeedf in trap_fatal (frame=0xe863694c, eva=420) at /usr/src/sys/i386/i386/trap.c:929
MG> #7 0xc0bef800 in trap (frame=0xe863694c) at /usr/src/sys/i386/i386/trap.c:328
MG> #8 0xc0bd139b in calltrap () at /usr/src/sys/i386/i386/exception.s:165
MG> #9 0xc08a6a8b in _rw_wlock_hard (rw=0xc79e1508, tid=3334964384, file=0xc0ce01e4 "/usr/src/sys/netinet/in.c",
MG>     line=1370) at /usr/src/sys/kern/kern_rwlock.c:677
MG> #10 0xc08a75d6 in _rw_wlock (rw=0xc79e1508, file=0xc0ce01e4
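The first crash described in this message comes from removing an element that a concurrent caller has already unlinked. A guarded removal with the queue(3) TAILQ macros can be sketched as follows; struct addr, struct addr_list and remove_if_linked() are made-up names for the example, not the kernel's types, and in real code the walk and the removal must happen under the same list lock.

```c
#include <sys/queue.h>
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative list element; stands in for an interface address. */
struct addr {
	int			id;
	TAILQ_ENTRY(addr)	link;
};
TAILQ_HEAD(addr_list, addr);

/*
 * Remove victim only if it is still linked on head.
 * Returns 0 if it was removed, EADDRNOTAVAIL if it was not on the list
 * (for example because another caller removed it first).
 */
int
remove_if_linked(struct addr_list *head, struct addr *victim)
{
	struct addr *it;

	assert(head != NULL && victim != NULL);
	TAILQ_FOREACH(it, head, link) {
		if (it == victim) {
			TAILQ_REMOVE(head, victim, link);
			return (0);
		}
	}
	return (EADDRNOTAVAIL);
}
```

The second of two racing deletions then gets an error instead of corrupting the list; the first one does the actual unlinking.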
Races on alias deletion
=2151704858, data=0xc7841bc0 "fxp0") at file.h:262
#17 0xc08f0bb4 in ioctl (td=0xc818db90, uap=0xe880dcf8) at /usr/src/sys/kern/sys_generic.c:678
#18 0xc0bef430 in syscall (frame=0xe880dd38) at /usr/src/sys/i386/i386/trap.c:
#19 0xc0bd14b0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:261
#20 0x0033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) fr 12
#12 0xc09b8efc in in_ifinit (ifp=0xc5b94c00, ia=0xc876ea00, sin=0xc185fcf6, scrub=0)
    at /usr/src/sys/netinet/in.c:844
844			LIST_REMOVE(ia, ia_hash);
(kgdb) list in_ifinit
832	 * and routing table entry.
833	 */
834	static int
835	in_ifinit(struct ifnet *ifp, struct in_ifaddr *ia, struct sockaddr_in *sin,
836	    int scrub)
837	{
838		register u_long i = ntohl(sin->sin_addr.s_addr);
839		struct sockaddr_in oldaddr;
840		int s = splimp(), flags = RTF_UP, error = 0;
841
(kgdb)
842		oldaddr = ia->ia_addr;
843		if (oldaddr.sin_family == AF_INET)
844			LIST_REMOVE(ia, ia_hash);
845		ia->ia_addr = *sin;
846		if (ia->ia_addr.sin_family == AF_INET) {
847			IN_IFADDR_WLOCK();
848			LIST_INSERT_HEAD(INADDR_HASH(ia->ia_addr.sin_addr.s_addr),
849			    ia, ia_hash);
850			IN_IFADDR_WUNLOCK();
851		}

Applying the fourth patch fixed this.
But it is still possible to crash the box: #0 doadump () at pcpu.h:246 #1 0xc04ec829 in db_fncall (dummy1=1, dummy2=0, dummy3=-1056922624, dummy4=0xe847c890 ) at /usr/src/sys/ddb/db_command.c:548 #2 0xc04ecc21 in db_command (last_cmdp=0xc0e0ad1c, cmd_table=0x0, dopager=1) at /usr/src/sys/ddb/db_command.c:445 #3 0xc04ecd7a in db_command_loop () at /usr/src/sys/ddb/db_command.c:498 #4 0xc04eec1d in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:229 #5 0xc08d9aa6 in kdb_trap (type=12, code=0, tf=0xe847ca7c) at /usr/src/sys/kern/subr_kdb.c:535 #6 0xc0beefbf in trap_fatal (frame=0xe847ca7c, eva=3735929146) at /usr/src/sys/i386/i386/trap.c:929 #7 0xc0bef8e0 in trap (frame=0xe847ca7c) at /usr/src/sys/i386/i386/trap.c:328 #8 0xc0bd147b in calltrap () at /usr/src/sys/i386/i386/exception.s:165 #9 0xc09b9c24 in in_control (so=0xc6e29670, cmd=2149607705, data=0xc6246ba0 fxp0, ifp=0xc5b94c00, td=0xc6a59940) at /usr/src/sys/netinet/in.c:331 #10 0xc095d400 in ifioctl (so=0xc6e29670, cmd=2149607705, data=0xc6246ba0 fxp0, td=0xc6a59940) at /usr/src/sys/net/if.c:2516 #11 0xc08f69d5 in soo_ioctl (fp=0xc6374700, cmd=2149607705, data=0xc6246ba0, active_cred=0xc7131280, td=0xc6a59940) at /usr/src/sys/kern/sys_socket.c:212 #12 0xc08f0a2d in kern_ioctl (td=0xc6a59940, fd=3, com=2149607705, data=0xc6246ba0 fxp0) at file.h:262 #13 0xc08f0bb4 in ioctl (td=0xc6a59940, uap=0xe847ccf8) at /usr/src/sys/kern/sys_generic.c:678 #14 0xc0bef490 in syscall (frame=0xe847cd38) at /usr/src/sys/i386/i386/trap.c: #15 0xc0bd1510 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:261 #16 0x0033 in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) fr 9 #9 0xc09b9c24 in in_control (so=0xc6e29670, cmd=2149607705, data=0xc6246ba0 fxp0, ifp=0xc5b94c00, td=0xc6a59940) at /usr/src/sys/netinet/in.c:331 331 if (iap-ia_ifp == ifp (kgdb) list 326 * first one on the interface, if possible. 
327              */
328             dst = ((struct sockaddr_in *)&ifr->ifr_addr)->sin_addr;
329             IN_IFADDR_RLOCK();
330             LIST_FOREACH(iap, INADDR_HASH(dst.s_addr), ia_hash) {
331                     if (iap->ia_ifp == ifp &&
332                         iap->ia_addr.sin_addr.s_addr == dst.s_addr) {
333                             if (td == NULL || prison_check_ip4(td->td_ucred,
334                                 &dst) == 0)
335                                     ia = iap;
(kgdb) p iap
$1 = (struct in_ifaddr *) 0xdeadc0de

But I don't have a patch for this yet :-).

Also, I have noticed that after running my tests long enough (but not long
enough to crash the box), an error message starts to appear on every attempt
to add the tested alias IP (although the alias is created):

ifconfig: ioctl (SIOCAIFADDR): File exists

This is because the route is not deleted on alias removal (some reference
leak?). After removing the route manually, the error no longer appears.

-- 
Mikolaj Golub

--- sys/netinet/in.c.orig	2010-04-16 15:15:07.0 +0300
+++ sys/netinet/in.c	2010-04-18 17:22:57.0 +0300
@@ -601,8 +601,17 @@ in_control(struct socket *so, u_long cmd
 	}
 
 	IF_ADDR_LOCK(ifp);
-	TAILQ_REMOVE(&ifp->if_addrhead, &ia->ia_ifa, ifa_link);
+	TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) {
+		if (&ia->ia_ifa == ifa) {
+			TAILQ_REMOVE(&ifp->if_addrhead, &ia->ia_ifa, ifa_link);
+			break;
+		}
+	}
 	IF_ADDR_UNLOCK(ifp);
+	if (ifa == NULL) {
+		error = EADDRNOTAVAIL;
+		goto out;
+	}
 	ifa_free(&ia->ia_ifa);				/* if_addrhead
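
The idea behind the first patch (only unlink the address if it is still on the interface list, otherwise fail with EADDRNOTAVAIL) can be tried out in user space. What follows is a minimal sketch, not the kernel code: `struct addr`, `addr_list` and `addr_remove_checked()` are made-up names for illustration, and the caller is assumed to hold whatever lock protects the list, the way IF_ADDR_LOCK does in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for an interface address on a per-interface list. */
struct addr {
	int id;
	TAILQ_ENTRY(addr) link;
};
TAILQ_HEAD(addr_list, addr);

/*
 * Unlink 'victim' only if it is still linked on 'head'.
 * Returns 0 on success, or EADDRNOTAVAIL if a concurrent caller has
 * already removed it.  The list lock is assumed to be held by the caller.
 */
static int
addr_remove_checked(struct addr_list *head, struct addr *victim)
{
	struct addr *a;

	assert(head != NULL && victim != NULL);
	TAILQ_FOREACH(a, head, link) {
		if (a == victim) {
			TAILQ_REMOVE(head, victim, link);
			return (0);
		}
	}
	return (EADDRNOTAVAIL);
}
```

A second removal of the same element then returns EADDRNOTAVAIL instead of corrupting the list. As in the patch, this only closes the double-unlink window; it says nothing about who is allowed to free the element afterwards.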
Re: kmem leakage on tun/tap device removal
On Feb 28, 1:30 pm, to.my.troc...@gmail.com (Mikolaj Golub) wrote: But I have faced with another issue (not related to your patch, as it is observed with unpatched kernel too). When I try to run concurrently two create/destroy scripts with the same interface the system panics: Unread portion of the kernel message buffer: panic: Bad link elm 0xc5f1a800 next-prev != elm cpuid = 2 KDB: enter: panic exclusive sleep mutex if_clone lock (if_clone lock) r = 0 (0xc0da1cf0) locked @ /usr/src/sys/net/if_clone.c:248 exclusive sleep mutex if_clone lock (if_clone lock) r = 0 (0xc0da1cf0) locked @ /usr/src/sys/net/if_clone.c:248 exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xc6cd3560) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148 exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xc6b4dbd0) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148 Physical memory: 2019 MB Dumping 160 MB: 145 129 113 97 81 65 49 33 17 1 #0 doadump () at pcpu.h:246 246 __asm __volatile(movl %%fs:0,%0 : =r (td)); (kgdb) bt #0 doadump () at pcpu.h:246 #1 0xc04e8bb9 in db_fncall (dummy1=-1064515926, dummy2=0, dummy3=-1, dummy4=0xe83f4834 HH?è) at /usr/src/sys/ddb/db_command.c:548 #2 0xc04e8fef in db_command (last_cmdp=0xc0de14dc, cmd_table=0x0, dopager=0) at /usr/src/sys/ddb/db_command.c:445 #3 0xc04e90a4 in db_command_script (command=0xc0de2404 call doadump) at /usr/src/sys/ddb/db_command.c:516 #4 0xc04ed1d0 in db_script_exec (scriptname=0xe83f4940 kdb.enter.panic, warnifnotfound=Variable warnifnotfound is not available. 
) at /usr/src/sys/ddb/db_script.c:302 #5 0xc04ed2b7 in db_script_kdbenter (eventname=0xc0ca1948 panic) at /usr/src/sys/ddb/db_script.c:324 #6 0xc04eaf98 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:228 #7 0xc08cc526 in kdb_trap (type=3, code=0, tf=0xe83f4a7c) at /usr/src/sys/kern/subr_kdb.c:535 #8 0xc0bdd38b in trap (frame=0xe83f4a7c) at /usr/src/sys/i386/i386/trap.c:690 #9 0xc0bbef1b in calltrap () at /usr/src/sys/i386/i386/exception.s:165 #10 0xc08cc6aa in kdb_enter (why=0xc0ca1948 panic, msg=0xc0ca1948 panic) at cpufunc.h:71 #11 0xc089d716 in panic (fmt=0xc0c3c80c Bad link elm %p next-prev != elm) at /usr/src/sys/kern/kern_shutdown.c:562 #12 0xc094e7fb in if_clone_destroyif (ifc=0xc0da1cc0, ifp=0xc5f1a800) at /usr/src/sys/net/if_clone.c:249 #13 0xc094eb52 in if_clone_destroy (name=0xc664ac20 tun0) at /usr/src/sys/net/if_clone.c:227 #14 0xc094c8a6 in ifioctl (so=0xc6e0a9a8, cmd=2149607801, data=0xc664ac20 tun0, td=0xc66c0d80) at /usr/src/sys/net/if.c:2412 #15 0xc08e8b25 in soo_ioctl (fp=0xc6d46af0, cmd=2149607801, data=0xc664ac20, active_cred=0xc5f62280, td=0xc66c0d80) at /usr/src/sys/kern/sys_socket.c:212 #16 0xc08e31bd in kern_ioctl (td=0xc66c0d80, fd=3, com=2149607801, data=0xc664ac20 tun0) at file.h:262 #17 0xc08e3344 in ioctl (td=0xc66c0d80, uap=0xe83f4cf8) at /usr/src/sys/kern/sys_generic.c:678 #18 0xc0bdca33 in syscall (frame=0xe83f4d38) at /usr/src/sys/i386/i386/trap.c:1078 #19 0xc0bbefb0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:261 #20 0x0033 in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) fr 12 #12 0xc094e7fb in if_clone_destroyif (ifc=0xc0da1cc0, ifp=0xc5f1a800) at /usr/src/sys/net/if_clone.c:249 249 IFC_IFLIST_REMOVE(ifc, ifp); (kgdb) list 244 * switch to the vnet context of the target vnet. 
245              */
246             CURVNET_SET_QUIET(ifp->if_vnet);
247
248             IF_CLONE_LOCK(ifc);
249             IFC_IFLIST_REMOVE(ifc, ifp);
250             IF_CLONE_UNLOCK(ifc);
251
252             if_delgroup(ifp, ifc->ifc_name);
253

Actually, this issue has already been reported (kern/116837, see the bottom
of the discussion), and Takahiro Kurosawa provided a patch there [check that
ifp is on ifc->ifc_iflist before calling IFC_IFLIST_REMOVE(ifc, ifp)],
although he mentioned that another race was still possible.

I have tried the patch and yes, it makes the situation much better: the box
did not crash when running two "ifconfig tun0 create/destroy" scripts
concurrently. But when I tried 8 concurrent processes :-) it crashed after a
couple of minutes in another place:

(kgdb) bt
#0  doadump () at pcpu.h:246
#1  0xc04ec379 in db_fncall (dummy1=1, dummy2=0, dummy3=-1056947200, dummy4=0xe86848e4 "")
    at /usr/src/sys/ddb/db_command.c:548
#2  0xc04ec771 in db_command (last_cmdp=0xc0e04d1c, cmd_table=0x0, dopager=1)
    at /usr/src/sys/ddb/db_command.c:445
#3  0xc04ec8ca in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
#4  0xc04ee76d in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:229
#5  0xc08d7d06 in kdb_trap (type=12, code=0, tf=0xe8684ad0) at /usr/src/sys/kern/subr_kdb.c:535
#6  0xc0bea66f in trap_fatal (frame=0xe8684ad0, eva=3735929054) at /usr/src/sys/i386/i386/trap.c:929
#7  0xc0beaf90 in trap (frame=0xe8684ad0) at /usr/src/sys/i386/i386/trap.c:328
#8  0xc0bccd7b in calltrap
Re: kmem leakage on tun/tap device removal
enabled, resume, IOPL = 0 current process = 53523 (ifconfig) trap number = 12 panic: page fault cpuid = 0 Uptime: 1m15s Physical memory: 2019 MB Dumping 109 MB: 94 78 62 46 30 14 137 Thread 100216 (PID=53523: ifconfig) doadump () at pcpu.h:246 #0 doadump () at pcpu.h:246 #1 0xc0881c97 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:416 #2 0xc0881f89 in panic (fmt=Variable fmt is not available. ) at /usr/src/sys/kern/kern_shutdown.c:579 #3 0xc0bb39ec in trap_fatal (frame=0xe84959ec, eva=416) at /usr/src/sys/i386/i386/trap.c:938 #4 0xc0bb3c70 in trap_pfault (frame=0xe84959ec, usermode=0, eva=416) at /usr/src/sys/i386/i386/trap.c:851 #5 0xc0bb4675 in trap (frame=0xe84959ec) at /usr/src/sys/i386/i386/trap.c:533 #6 0xc0b96e0b in calltrap () at /usr/src/sys/i386/i386/exception.s:165 #7 0xc087233f in _mtx_lock_sleep (m=0xc5c4b22c, tid=3330476288, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:369 #8 0xc0926471 in if_detach (ifp=0xc5c4b000) at /usr/src/sys/net/if.c:1188 #9 0xc0930879 in tun_destroy (tp=0xc6860d80) at /usr/src/sys/net/if_tun.c:259 #10 0xc0931927 in tun_clone_destroy (ifp=0xc5c4b000) at /usr/src/sys/net/if_tun.c:277 #11 0xc092a407 in ifc_simple_destroy (ifc=0xc0d496e0, ifp=0xc5c4b000) at /usr/src/sys/net/if_clone.c:595 #12 0xc092a62c in if_clone_destroyif (ifc=0xc0d496e0, ifp=0xc5c4b000) at /usr/src/sys/net/if_clone.c:254 #13 0xc092a9e2 in if_clone_destroy (name=0xc5f201e0 tun0) at /usr/src/sys/net/if_clone.c:227 #14 0xc0928a26 in ifioctl (so=0xc6806000, cmd=2149607801, data=0xc5f201e0 tun0, td=0xc6830900) at /usr/src/sys/net/if.c:2412 #15 0xc08c4c32 in soo_ioctl (fp=0xc6743968, cmd=2149607801, data=0xc5f201e0, active_cred=0xc66fa680, td=0xc6830900) at /usr/src/sys/kern/sys_socket.c:212 #16 0xc08bdb00 in kern_ioctl (td=0xc6830900, fd=3, com=2149607801, data=0xc5f201e0 tun0) at file.h:262 #17 0xc08bdc74 in ioctl (td=0xc6830900, uap=0xe8495cf8) at /usr/src/sys/kern/sys_generic.c:678 #18 0xc0bb3fb5 in syscall (frame=0xe8495d38) at 
/usr/src/sys/i386/i386/trap.c:1078 #19 0xc0b96e70 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:261 #20 0x0033 in ?? () Previous frame inner to this frame (corrupt stack?) 138 Thread 100211 (PID=53526: ifconfig) sched_switch (td=0xc6832480, newtd=0xc5951b40, flags=259) at /usr/src/sys/kern/sched_ule.c:1864 #0 sched_switch (td=0xc6832480, newtd=0xc5951b40, flags=259) at /usr/src/sys/kern/sched_ule.c:1864 #1 0xc088a15a in mi_switch (flags=259, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:449 #2 0xc08bc02b in turnstile_wait (ts=0xc5ec1700, owner=0xc6830900, queue=Variable queue is not available. ) at /usr/src/sys/kern/subr_turnstile.c:745 #3 0xc088073f in _rw_rlock (rw=0xc0db6024, file=0x0, line=0) at /usr/src/sys/kern/kern_rwlock.c:460 #4 0xc0924867 in ifunit_ref (name=0xc685e200 tun0) at /usr/src/sys/net/if.c:2017 #5 0xc0928d10 in ifioctl (so=0xc6820338, cmd=3223349536, data=0xc685e200 tun0, td=0xc6832480) at /usr/src/sys/net/if.c:2420 #6 0xc08c4c32 in soo_ioctl (fp=0xc6808c78, cmd=3223349536, data=0xc685e200, active_cred=0xc6802500, td=0xc6832480) at /usr/src/sys/kern/sys_socket.c:212 #7 0xc08bdb00 in kern_ioctl (td=0xc6832480, fd=3, com=3223349536, data=0xc685e200 tun0) at file.h:262 #8 0xc08bdc74 in ioctl (td=0xc6832480, uap=0xe846ccf8) at /usr/src/sys/kern/sys_generic.c:678 #9 0xc0bb3fb5 in syscall (frame=0xe846cd38) at /usr/src/sys/i386/i386/trap.c:1078 #10 0xc0b96e70 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:261 #11 0x0033 in ?? () Previous frame inner to this frame (corrupt stack?) -- Mikolaj Golub ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: kmem leakage on tun/tap device removal
On Sun, 28 Feb 2010 13:30:59 +0200 Mikolaj Golub wrote:

> I am running i386 8.0-STABLE (but rather old, from Dec 1; I can run tests
> on newer sources if this makes a difference).

I have had the same panic on today's 8.0-STABLE.

-- 
Mikolaj Golub
kmem leakage on tun/tap device removal
)
 			dev->si_flags |= SI_CHEAPCLONE;
-		}
 	}
 	tuncreate(ifc->ifc_name, dev);
@@ -239,10 +237,8 @@ tunclone(void *arg, struct ucred *cred,
 		/* No preexisting struct cdev *, create one */
 		*dev = make_dev(&tun_cdevsw, u,
 		    UID_UUCP, GID_DIALER, 0600, "%s", name);
-		if (*dev != NULL) {
-			dev_ref(*dev);
+		if (*dev != NULL)
 			(*dev)->si_flags |= SI_CHEAPCLONE;
-		}
 	}
 
 	if_clone_create(name, namelen, NULL);

-- 
Mikolaj Golub
Re: mpd has hung
', p_fibnum = 0, p_xstat = 0, p_klist = {kl_list = {
      slh_first = 0x0}, kl_lock = 0x802aaac0 <knlist_mtx_lock>,
    kl_unlock = 0x802aaa90 <knlist_mtx_unlock>,
    kl_assert_locked = 0x802a7de0 <knlist_mtx_assert_locked>,
    kl_assert_unlocked = 0x802a7df0 <knlist_mtx_assert_unlocked>,
    kl_lockarg = 0xff0012c070f8}, p_numthreads = 2, p_md = {md_ldt = 0x0,
    md_ldt_sd = {sd_lolimit = 0, sd_lobase = 0, sd_type = 0, sd_dpl = 0,
      sd_p = 0, sd_hilimit = 0, sd_xx0 = 0, sd_gran = 0, sd_hibase = 0,
      sd_xx1 = 0, sd_mbz = 0, sd_xx2 = 0}}, p_itcallout = {c_links = {sle = {sle_next = 0x0}, tqe = {
        tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0,
    c_lock = 0x0, c_flags = 16, c_cpu = 0}, p_acflag = 17, p_peers = 0x0,
  p_leader = 0xff0012c07000, p_emuldata = 0x0, p_label = 0x0,
  p_sched = 0xff0012c07460, p_ktr = {stqh_first = 0x0,
    stqh_last = 0xff0012c07430}, p_mqnotifier = {lh_first = 0x0},
  p_dtrace = 0x0, p_pwait = {cv_description = 0x804c919f "ppwait",
    cv_waiters = 0}}

Unfortunately, there is no stack trace for flowcleaner. I have asked
Alexander to make the kernel panic on the next reboot and provide a backtrace
of the flowcleaner thread from the crash dump, but I don't know whether he
has managed to do this (this is a production host, which complicates things).

-- 
Mikolaj Golub
Re: NFS mount properties
On Thu, 28 Jan 2010 08:46:33 -0800 Alan Aldrich wrote:

> I am trying to determine how to examine the actual properties of an nfs
> mount in FreeBSD. In CentOS one can 'cat /proc/mounts' to determine all of
> the mount properties. Specifically I am trying to confirm whether the mount
> is using TCP or UDP, as I want it to be TCP. Is there some similar way to
> tell in FreeBSD?
>
> My fstab entry says this:
> 192.168.44.55:/mcp/home /netnfs rw,async,-d,-3,-s,-i,noatime,- T 0 0
>
> It mounts fine, but I want to confirm that it is actually mounting with TCP
> and not UDP, and cannot figure out what tool will tell me this.
>
> 'mount' tells me this:
> 192.168.44.55:/mcp/home on /net (nfs, asynchronous, noatime)
> but not whether it is mounted TCP or UDP.

Whether it is mounted with TCP or UDP you can find out with netstat, by
checking for TCP connections to port 2049. If you see established connections
like the one below

tcp4       0      0 10.0.0.110.895         10.0.100.2.2049        ESTABLISHED

then TCP is used. Of course, if you have several mounts from the same IP,
that complicates the situation :-)

I don't know of any tool that would report this info and would be glad to
hear about one, but if I really needed this info I could use the universal
tool -- kgdb :-)

zhuzha:~% sudo kgdb
(kgdb) set print pretty
(kgdb) p *mountlist.tqh_first.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next
$26 = {
...
  mnt_list = {
    tqe_next = 0xc5e9a78c,
    tqe_prev = 0xc5e9acac
  },
...
  mnt_opt = 0xc61f7760,
...
mnt_stat = { f_version = 537068824, f_type = 4, f_flags = 0, f_bsize = 512, f_iosize = 32768, f_blocks = 284354052, f_bfree = 137997300, f_bavail = 115248976, f_files = 18394110, f_ffree = 16921938, f_syncwrites = 0, f_asyncwrites = 0, f_syncreads = 0, f_asyncreads = 0, f_spare = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, f_namemax = 255, f_owner = 0, f_fsid = { val = {67174148, 4} }, f_charspare = '\0' repeats 79 times, f_fstypename = nfs, '\0' repeats 12 times, f_mntfromname = srv01.ua1:/var/public\000\004(\000\000\000\000\000\214\004\b\003\000\000\000\003\000\000\000P\000\000\000X\002\000\000\200╩\000\000\000пBф\000\000\000\000 \a \aю\a7ф \a7фюx\037ф╟x\037ф\004\000\000\000\v\000\000, f_mntonname = /mnt/0, '\0' repeats 81 times }, ... } (kgdb) p *mountlist.tqh_first.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_opt.tqh_first $27 = { link = { tqe_next = 0xc6370560, tqe_prev = 0xc61f7760 }, name = 0xc61f7770 rw, value = 0xc61f7780, len = 1, pos = 0, seen = 0 } (kgdb) p *mountlist.tqh_first.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_opt.tqh_first.link.tqe_next $28 = { link = { tqe_next = 0xc6370580, tqe_prev = 0xc6370520 }, name = 0xc61f77a0 soft, value = 0xc61f77b0, len = 1, pos = 1, seen = 1 } (kgdb) p *mountlist.tqh_first.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_opt.tqh_first.link.tqe_next.link.tqe_next $29 = { link = { tqe_next = 0xc63705a0, tqe_prev = 0xc6370560 }, name = 0xc61f77c0 intr, value = 0xc61f7810, len = 1, pos = 2, seen = 1 } (kgdb) p *mountlist.tqh_first.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_opt.tqh_first.link.tqe_next.link.tqe_next.link.tqe_next $30 = { link = { tqe_next = 0xc63705e0, tqe_prev = 0xc6370580 }, name = 0xc61f77d0 rsize, value = 0xc61f77e0, len = 6, pos = 3, seen = 1 } 
(kgdb) p *mountlist.tqh_first.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_opt.tqh_first.link.tqe_next.link.tqe_next.link.tqe_next.link.tqe_next
$31 = {
  link = {
    tqe_next = 0xc6370620,
    tqe_prev = 0xc63705a0
  },
  name = 0xc61f77f0 "wsize",
  value = 0xc61f7800,
  len = 6,
  pos = 4,
  seen = 1
}
(kgdb) p *mountlist.tqh_first.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_list.tqe_next.mnt_opt.tqh_first.link.tqe_next.link.tqe_next.link.tqe_next.link.tqe_next.link.tqe_next
$32 = {
  link = {
    tqe_next = 0xc6370640,
    tqe_prev = 0xc63705e0
  },
  name = 0xc61f7820 "tcp",
  value = 0xc61f7830,
  len = 1,
  pos = 5,
  seen = 1
}

If I needed to do this frequently I would write a gdb script, taking as an
example the nice scripts from jhb :-)

http://people.freebsd.org/~jhb/gdb/

-- 
Mikolaj Golub
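
The manual walk above just follows link.tqe_next through the option list until an entry named "tcp" turns up. Here is a small user-space model of that search, in case it helps when writing such a script. It is only a sketch under assumptions: `struct opt`, `optlist` and `optlist_has()` are invented for illustration and merely mimic the link/name layout visible in the kgdb output, they are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical model of one mount option: a tail-queue link plus a name. */
struct opt {
	TAILQ_ENTRY(opt) link;
	const char *name;
};
TAILQ_HEAD(optlist, opt);

/* Return 1 if an option called 'name' is on the list, 0 otherwise. */
static int
optlist_has(struct optlist *opts, const char *name)
{
	struct opt *o;

	assert(opts != NULL && name != NULL);
	TAILQ_FOREACH(o, opts, link) {
		if (strcmp(o->name, name) == 0)
			return (1);
	}
	return (0);
}
```

With a list built from "rw", "soft" and "tcp", optlist_has(&opts, "tcp") returns 1 and optlist_has(&opts, "udp") returns 0, which is the same answer the kgdb session reaches by hand.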
close: Socket is not connected
)
		errx(1, "fork(): %d", errno);

	if (0 != pid) { /* parent */

		if ((listenfd = socket(AF_LOCAL, SOCK_STREAM, 0)) < 0)
			errx(1, "parent: socket error: %d", errno);

		unlink(UNIXSTR_PATH);
		bzero(&servaddr, sizeof(servaddr));
		servaddr.sun_family = AF_LOCAL;
		strcpy(servaddr.sun_path, UNIXSTR_PATH);

		if (bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0)
			errx(1, "parent: bind error: %d", errno);

		if (listen(listenfd, 1024) < 0)
			errx(1, "parent: listen error: %d", errno);

		for ( ; ; ) {
			if ((connfd = accept(listenfd, (struct sockaddr *) NULL, NULL)) < 0)
				errx(1, "parent: accept error: %d", errno);

			if (fcntl(connfd, F_SETFL, O_NONBLOCK) == -1)
				errx(1, "parent: fcntl error: %d", errno);

			Read(connfd, buf, sizeof(buf));
			Write(connfd, buf, sizeof(buf));

			if (close(connfd) < 0)
				errx(1, "parent: close error: %d", errno);
		}

	} else { /* child */

		/* wait some time while parent has created socket */
		sleep(1);

		for ( ; ; ) {
			if ((connfd = socket(AF_LOCAL, SOCK_STREAM, 0)) < 0)
				errx(1, "child: socket error: %d", errno);

			if (fcntl(connfd, F_SETFL, O_NONBLOCK) == -1)
				errx(1, "child: fcntl error: %d", errno);

			bzero(&servaddr, sizeof(servaddr));
			servaddr.sun_family = AF_LOCAL;
			strcpy(servaddr.sun_path, UNIXSTR_PATH);

			if (connect(connfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0)
				errx(1, "child: connect error %d", errno);

			Write(connfd, buf, sizeof(buf));
			Read(connfd, buf, sizeof(buf));

			if (close(connfd) != 0)
				errx(1, "child: close error: %d", errno);

			usleep(USLEEP);
		}
	}

	return 0;
}

-- 
Mikolaj Golub
Re: host(1) coredumps
On Mon, 14 Sep 2009 01:16:43 +0800 Eugene Grosbein wrote:

EG> On Sun, Sep 13, 2009 at 05:41:50PM +0200, vol...@vwsoft.com wrote:

EG> >> % host -l grosbein.pp.ru. ns2.rucable.net.
EG> >> ; Transfer failed.
EG> >> /usr/local/src/lib/bind/isc/../../../contrib/bind9/lib/isc/unix/socket.c:2486: REQUIRE((((sock) != ((void *)0)) && (((const isc__magic_t *)(sock))->magic == ((('I') << 24 | ('O') << 16 | ('i') << 8 | ('o')))))) failed.
EG> >> zsh: abort (core dumped)  host -l grosbein.pp.ru. ns2.rucable.net.
EG> >>
EG> >> Should I send a PR?
EG> >
EG> > Eugene, the attached patch works around the error for me. As this is
EG> > contributed code, it should be fixed upstream (no need to file a PR).
EG> >
EG> > Volker
EG> >
EG> > --- contrib/bind9/bin/dig/dighost.c.orig	2009-09-13 14:24:13.0 +
EG> > +++ contrib/bind9/bin/dig/dighost.c	2009-09-13 14:31:52.0 +

EG> Indeed, the patch helps. Thank you.

BTW, we already have a PR about this problem:

http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/138061

IMO it would be nice to add the patch there.

-- 
Mikolaj Golub
Re: kern/134557: [netgraph] [hang] 7.2 with mpd5.3 hanging up - ng_pptp problem
Unfortunately, the problem was introduced by this commit :-)

--
Author: mav
Date: Sat Jan 31 12:48:09 2009 UTC (4 months, 4 weeks ago)
Log Message:

MFC rev. 187495

Check for infinite recursion possible on some broken PPTP/L2TP/... VPN setups.
Mark packets with mbuf_tag on first interface passage and drop on second.

PR: ports/129625, ports/125303
--

If a packet goes through two or more ng interfaces, the while loop in the tag
checking code can run infinitely. The attached patch should fix this.

-- 
Mikolaj Golub

--- netgraph/ng_iface.c.orig	2009-06-30 21:47:54.0 +0300
+++ netgraph/ng_iface.c	2009-06-30 21:49:29.0 +0300
@@ -365,7 +365,8 @@
 	}
 
 	/* Protect from deadly infinite recursion. */
-	while ((mtag = m_tag_locate(m, MTAG_NGIF, MTAG_NGIF_CALLED, NULL))) {
+	mtag = NULL;
+	while ((mtag = m_tag_locate(m, MTAG_NGIF, MTAG_NGIF_CALLED, mtag))) {
 		if (*(struct ifnet **)(mtag + 1) == ifp) {
 			log(LOG_NOTICE, "Loop detected on %s\n", ifp->if_xname);
 			m_freem(m);
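
To see why the last argument matters: with NULL the search restarts from the first tag on every iteration, so if that first tag belongs to a different interface the loop keeps finding it forever. Passing the previous result makes the search continue after it. Below is a user-space sketch of both variants. `struct tag`, `tag_locate()` and the iteration cap are stand-ins invented for illustration, not the mbuf tag API itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf tag: a singly linked list node. */
struct tag {
	struct tag *next;
	int cookie;	/* plays the role of MTAG_NGIF */
	int ifindex;	/* plays the role of the stored ifnet pointer */
};

/*
 * Simplified model of a "locate" call: search from the head when 'prev'
 * is NULL, otherwise from the tag following 'prev'.
 */
static struct tag *
tag_locate(struct tag *head, int cookie, struct tag *prev)
{
	struct tag *t = (prev == NULL) ? head : prev->next;

	for (; t != NULL; t = t->next)
		if (t->cookie == cookie)
			return (t);
	return (NULL);
}

/* Fixed variant: feed the previous hit back in, so the search advances. */
static int
loop_detected(struct tag *head, int cookie, int ifindex)
{
	struct tag *t = NULL;

	while ((t = tag_locate(head, cookie, t)) != NULL)
		if (t->ifindex == ifindex)
			return (1);
	return (0);
}

/*
 * Buggy variant: always passes NULL, so it sees the same first tag again
 * and again.  'limit' caps the iterations so the demo terminates; the
 * return value is the number of iterations performed.
 */
static int
buggy_iterations(struct tag *head, int cookie, int ifindex, int limit)
{
	struct tag *t;
	int n = 0;

	assert(limit > 0);
	while (n < limit && (t = tag_locate(head, cookie, NULL)) != NULL) {
		n++;
		if (t->ifindex == ifindex)
			break;
	}
	return (n);
}
```

With two tags for interfaces 1 and 2, loop_detected() finds interface 2 on the second step, while buggy_iterations() asked about interface 2 runs until it hits the cap.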
Re: kern/134557: [netgraph] [hang] 7.2 with mpd5.3 hanging up - ng_pptp problem
The following reply was made to PR kern/134557; it has been noted by GNATS.

From: Mikolaj Golub to.my.troc...@gmail.com
To: bug-follo...@freebsd.org
Cc: freebsd-net@FreeBSD.org, Sergei Cherveni sergei.cherv...@gmail.com,
    Alexander Motin m...@freebsd.org
Subject: Re: kern/134557: [netgraph] [hang] 7.2 with mpd5.3 hanging up - ng_pptp problem
Date: Tue, 30 Jun 2009 22:33:12 +0300

 --=-=-=

 Unfortunately, the problem was introduced by this commit :-)

 --
 Author: mav
 Date: Sat Jan 31 12:48:09 2009 UTC (4 months, 4 weeks ago)
 Log Message:

 MFC rev. 187495

 Check for infinite recursion possible on some broken PPTP/L2TP/... VPN setups.
 Mark packets with mbuf_tag on first interface passage and drop on second.

 PR: ports/129625, ports/125303
 --

 If a packet goes through two or more ng interfaces, the while loop in the tag
 checking code can run infinitely. The attached patch should fix this.

 -- 
 Mikolaj Golub

 --=-=-=
 Content-Type: text/x-diff
 Content-Disposition: attachment; filename=ng_iface.c.patch

 --- netgraph/ng_iface.c.orig	2009-06-30 21:47:54.0 +0300
 +++ netgraph/ng_iface.c	2009-06-30 21:49:29.0 +0300
 @@ -365,7 +365,8 @@
  	}
  
  	/* Protect from deadly infinite recursion. */
 -	while ((mtag = m_tag_locate(m, MTAG_NGIF, MTAG_NGIF_CALLED, NULL))) {
 +	mtag = NULL;
 +	while ((mtag = m_tag_locate(m, MTAG_NGIF, MTAG_NGIF_CALLED, mtag))) {
  		if (*(struct ifnet **)(mtag + 1) == ifp) {
  			log(LOG_NOTICE, "Loop detected on %s\n", ifp->if_xname);
  			m_freem(m);

 --=-=-=--
Re: kern/133572: [ppp] [hang] incoming PPTP connection hangs the system
Could you try the patch from kern/134557?

-- 
Mikolaj Golub
Re: kern/133572: [ppp] [hang] incoming PPTP connection hangs the system
The following reply was made to PR kern/133572; it has been noted by GNATS.

From: Mikolaj Golub to.my.troc...@gmail.com
To: Dennis Melentyev dennis.melent...@gmail.com
Cc: bug-follo...@freebsd.org, freebsd-net@FreeBSD.org
Subject: Re: kern/133572: [ppp] [hang] incoming PPTP connection hangs the system
Date: Tue, 30 Jun 2009 23:00:00 +0300

 Could you try the patch from kern/134557?

 -- 
 Mikolaj Golub
Re: panic with ng_ipfw+ng_car and net.inet.ip.fw.one_pass=0
On Fri, 5 Jun 2009 22:56:47 +0400 Oleg Bulyzhin wrote:

> On Fri, Jun 05, 2009 at 04:57:52PM +0300, Mikolaj Golub wrote:
>> It works for me. With the patch I have not managed to crash the system
>> using my test.
>>
>> Some notes:
>>
>> - only the ng_ipfw/ng_car subsystem has been tested (not dummynet).
>>
>> - my -current box is under qemu (I don't have a real server running
>>   -current to test this).
>>
>> If you are interested in some testing of dummynet before committing this
>> to current, let me know. I could try some tests, but only next week.
>
> I did some testing of dummynet though extra testing would not hurt.

I see the patch has been committed to 8-CURRENT :-). Thanks.

I did some dummy tests on the fixed current (simple dummynet configuration +
traffic + ipfw reloaded every second) and did not have any issues. At present
I don't have an old -current without the fix to reproduce the crash, but on
7-STABLE running this test I saw many

ipfw: ouch!, skip past end of rules, denying packet

messages in dmesg, and once the system crashed. So it looks like my test
setup is reasonably good and would have found problems with the fixed current
if any remained.

-- 
Mikolaj Golub
Re: panic with ng_ipfw+ng_car and net.inet.ip.fw.one_pass=0
On Fri, 5 Jun 2009 00:47:20 +0400 Oleg Bulyzhin wrote:

> On Wed, Jun 03, 2009 at 09:03:11PM +0400, Oleg Bulyzhin wrote:
>> On Mon, Jun 01, 2009 at 11:12:45AM +0300, Mikolaj Golub wrote:
>>> It looks the problem has not drawn much attention :-).
>>
>> I was on vacation so did not reply in time.
>> Dummynet like solution is not enough, dummynet is affected by this problem
>> too. I'll send patch to you for testing tomorrow.
>
> Please test attached patch and let me know results.
> Patch made for -current and it changes ABI, so rebuilding ipfw with new
> headers required.

It works for me. With the patch I have not managed to crash the system using
my test.

Some notes:

- only the ng_ipfw/ng_car subsystem has been tested (not dummynet).

- my -current box is under qemu (I don't have a real server running -current
  to test this).

If you are interested in some testing of dummynet before committing this to
current, let me know. I could try some tests, but only next week.

If you are going to commit this to -current, could you please fix the
ng_ipfw(4) man page too?

Index: share/man/man4/ng_ipfw.4
===================================================================
--- share/man/man4/ng_ipfw.4	(revision 193478)
+++ share/man/man4/ng_ipfw.4	(working copy)
@@ -84,11 +84,12 @@
 struct ng_ipfw_tag {
 	struct m_tag	mt;		/* tag header */
 	struct ip_fw	*rule;		/* matching rule */
+	uint32_t	rule_id;	/* matching rule id */
+	uint32_t	chain_id;	/* ruleset id */
 	struct ifnet	*ifp;		/* interface, for ip_output */
 	int		dir;		/* packet direction */
 #define	NG_IPFW_OUT	0
 #define	NG_IPFW_IN	1
-	int		flags;		/* flags, for ip_output() */
 };
 .Ed
 .Pp

-- 
Mikolaj Golub
Re: panic with ng_ipfw+ng_car and net.inet.ip.fw.one_pass=0
On Mon, 25 May 2009 22:29:25 +0300 Mikolaj Golub wrote:

Hi,

Some time ago, panics when using ipfw + ng_ipfw + ng_car were reported on
fido7.ru.unix.bsd:

http://groups.google.com/group/fido7.ru.unix.bsd/browse_thread/thread/5907d1ba4e76675d

For those who haven't learnt Russian yet ;-) here are some details.

Max Irgiznov reported that when the ng_ipfw+ng_car construction was used and
net.inet.ip.fw.one_pass=0 was set, the system reliably panicked on ipfw rules
reload if there was some traffic through ng_car.

The problem is the following. When a packet returns from the ng_car queue to
ipfw_chk and one_pass is turned off, the next rule is tried. But if the rules
were reloaded while the packet was sitting in ng_car, the next-rule pointer
may be dangling and the kernel will panic.

(kgdb) bt
#0  doadump () at pcpu.h:196
#1  0xc07e1f7e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc07e2252 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc0495eb7 in db_panic (addr=Could not find the frame base for "db_panic".
) at /usr/src/sys/ddb/db_command.c:446 #4 0xc04968bc in db_command (last_cmdp=0xc0c97514, cmd_table=0x0, dopager=1) at /usr/src/sys/ddb/db_command.c:413 #5 0xc04969ca in db_command_loop () at /usr/src/sys/ddb/db_command.c:466 #6 0xc04981bd in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:228 #7 0xc080ec76 in kdb_trap (type=12, code=0, tf=0xe6945774) at /usr/src/sys/kern/subr_kdb.c:524 #8 0xc0ad9e4f in trap_fatal (frame=0xe6945774, eva=3735929068) at /usr/src/sys/i386/i386/trap.c:930 #9 0xc0ada790 in trap (frame=0xe6945774) at /usr/src/sys/i386/i386/trap.c:320 #10 0xc0abeaab in calltrap () at /usr/src/sys/i386/i386/exception.s:159 #11 0xc903328c in ipfw_chk (args=0xe6945acc) at /usr/src/sys/modules/ipfw/../../netinet/ip_fw2.c:2516 #12 0xc90373f7 in ipfw_check_in (arg=0x0, m0=0xe6945bd0, ifp=0xc41f9000, dir=1, inp=0x0) at /usr/src/sys/modules/ipfw/../../netinet/ip_fw_pfil.c:125 #13 0xc088d6e8 in pfil_run_hooks (ph=0xc0d1f620, mp=0xe6945c24, ifp=0xc41f9000, dir=1, inp=0x0) at /usr/src/sys/net/pfil.c:78 #14 0xc08c766d in ip_input (m=0xc409ad00) at /usr/src/sys/netinet/ip_input.c:416 #15 0xc9011c39 in ng_ipfw_rcvdata (hook=0xc61a1780, item=0xc8fe5090) at /usr/src/sys/modules/netgraph/ipfw/../../../netgraph/ng_ipfw.c:250 #16 0xc68b80af in ng_apply_item (node=0xc7054c00, item=0xc8fe5090, rw=0) at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:2336 #17 0xc68b939f in ngthread (arg=0x0) at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:3304 #18 0xc07be4c8 in fork_exit (callout=0xc68b91f0 ngthread, arg=0x0, frame=0xe6945d38) at /usr/src/sys/kern/kern_fork.c:810 #19 0xc0abeb20 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:264 (kgdb) frame 11 #11 0xc903328c in ipfw_chk (args=0xe6945acc) at /usr/src/sys/modules/ipfw/../../netinet/ip_fw2.c:2516 warning: Source file is more recent than executable. 
2516            if (set_disable & (1 << f->set) )
(kgdb) list
2511                    ipfw_insn *cmd;
2512                    uint32_t tablearg = 0;
2513                    int l, cmdlen, skip_or; /* skip rest of OR block */
2514
2515    again:
2516                    if (set_disable & (1 << f->set) )
2517                            continue;
2518
2519                    skip_or = 0;
2520                    for (l = f->cmd_len, cmd = f->cmd ; l > 0 ;
(kgdb) p f
$1 = (struct ip_fw *) 0xdeadc0de
(kgdb)

DUMMYNET does not have such problems, as ip_dn_ruledel_ptr(rule) is called
when the rule is removed in reap_rules(). My first thought was to do the same
here, i.e. to broadcast a "rule removed" message to the netgraph nodes, but
glancing through the netgraph man page I haven't figured out how it could be
done, if it is possible at all.

So the other solution is to have a counter that is increased every time any
rule is removed. When a packet is directed by ipfw to the netgraph subsystem,
the current value of the counter is stored in the mtag. When the packet comes
back, the current value of the counter is compared with the one from the
mtag, and if they differ the packet is dropped.

Just to prove the concept I have modified ip_fw2.c for 7.2-STABLE accordingly
and it works for me. The patch is attached.

It looks like the problem has not drawn much attention :-).

Anyway, another version of the patch is attached. This time almost all of the
necessary modifications are done in the ng_ipfw module. Only small changes
have been made in the ip_fw module, and I tried to make them in the same
manner as is done for dummynet.

The main logic is the same as in the previous patch: keep an internal
counter, ng_ipfw_rdcnt, that is increased every time a rule is deleted from
the chain, and compare it with the one stored in ng_ipfw_tag when a packet
passes ng_ipfw_rcvdata(). The patch
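
The counter scheme described above can be modelled in a few lines of user-space C. This is only a sketch of the idea, not the attached patch: `struct chain`, `struct pkt_tag` and the three helpers are made-up names, with `del_gen` playing the role of ng_ipfw_rdcnt and `pkt_tag` standing in for ng_ipfw_tag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical firewall rule; only its identity matters here. */
struct rule {
	int number;
};

/* Hypothetical rule chain carrying a "rules deleted" generation counter. */
struct chain {
	uint32_t del_gen;	/* bumped every time any rule is deleted */
};

/* Hypothetical per-packet tag saved while the packet sits in a queue. */
struct pkt_tag {
	struct rule *rule;	/* rule that diverted the packet */
	uint32_t gen;		/* value of chain->del_gen at that moment */
};

/* Called whenever a rule is removed from the chain. */
static void
chain_rule_deleted(struct chain *c)
{
	c->del_gen++;
}

/* Called when the firewall hands a packet over to the queueing node. */
static struct pkt_tag
tag_packet(const struct chain *c, struct rule *r)
{
	struct pkt_tag t;

	assert(c != NULL);
	t.rule = r;
	t.gen = c->del_gen;
	return (t);
}

/*
 * Called when the packet comes back.  Returns the rule to resume from, or
 * NULL if a deletion happened meanwhile: the saved pointer may be stale,
 * so the packet has to be dropped instead of dereferencing it.
 */
static struct rule *
resume_rule(const struct chain *c, const struct pkt_tag *t)
{
	if (t->gen != c->del_gen)
		return (NULL);
	return (t->rule);
}
```

The trade-off is that every packet in flight is dropped after any rule deletion, even when the rule it was tagged with survived. That is a coarse but cheap way to avoid touching freed memory.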
panic with ng_ipfw+ng_car and net.inet.ip.fw.one_pass=0
Hi,

Some time ago there was a post to fido7.ru.unix.bsd about panics when using ipfw + ng_ipfw + ng_car.

http://groups.google.com/group/fido7.ru.unix.bsd/browse_thread/thread/5907d1ba4e76675d

For those who haven't learnt Russian yet ;-) here are some details.

Max Irgiznov reported that when an ng_ipfw+ng_car construction was used and net.inet.ip.fw.one_pass=0 was set, the system reliably panicked on ipfw rules reload if there was some traffic through ng_car.

The problem is the following. When a packet returns from the ng_car queue to ipfw_chk and one_pass is turned off, the next rule is tried. But if the rules were reloaded while the packet was sitting in ng_car, the next-rule pointer may be dangling and the kernel will panic.

(kgdb) bt
#0  doadump () at pcpu.h:196
#1  0xc07e1f7e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc07e2252 in panic (fmt=Variable fmt is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc0495eb7 in db_panic (addr=Could not find the frame base for db_panic.
) at /usr/src/sys/ddb/db_command.c:446
#4  0xc04968bc in db_command (last_cmdp=0xc0c97514, cmd_table=0x0, dopager=1) at /usr/src/sys/ddb/db_command.c:413
#5  0xc04969ca in db_command_loop () at /usr/src/sys/ddb/db_command.c:466
#6  0xc04981bd in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:228
#7  0xc080ec76 in kdb_trap (type=12, code=0, tf=0xe6945774) at /usr/src/sys/kern/subr_kdb.c:524
#8  0xc0ad9e4f in trap_fatal (frame=0xe6945774, eva=3735929068) at /usr/src/sys/i386/i386/trap.c:930
#9  0xc0ada790 in trap (frame=0xe6945774) at /usr/src/sys/i386/i386/trap.c:320
#10 0xc0abeaab in calltrap () at /usr/src/sys/i386/i386/exception.s:159
#11 0xc903328c in ipfw_chk (args=0xe6945acc) at /usr/src/sys/modules/ipfw/../../netinet/ip_fw2.c:2516
#12 0xc90373f7 in ipfw_check_in (arg=0x0, m0=0xe6945bd0, ifp=0xc41f9000, dir=1, inp=0x0) at /usr/src/sys/modules/ipfw/../../netinet/ip_fw_pfil.c:125
#13 0xc088d6e8 in pfil_run_hooks (ph=0xc0d1f620, mp=0xe6945c24, ifp=0xc41f9000, dir=1, inp=0x0) at /usr/src/sys/net/pfil.c:78
#14 0xc08c766d in ip_input (m=0xc409ad00) at /usr/src/sys/netinet/ip_input.c:416
#15 0xc9011c39 in ng_ipfw_rcvdata (hook=0xc61a1780, item=0xc8fe5090) at /usr/src/sys/modules/netgraph/ipfw/../../../netgraph/ng_ipfw.c:250
#16 0xc68b80af in ng_apply_item (node=0xc7054c00, item=0xc8fe5090, rw=0) at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:2336
#17 0xc68b939f in ngthread (arg=0x0) at /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:3304
#18 0xc07be4c8 in fork_exit (callout=0xc68b91f0 <ngthread>, arg=0x0, frame=0xe6945d38) at /usr/src/sys/kern/kern_fork.c:810
#19 0xc0abeb20 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:264
(kgdb) frame 11
#11 0xc903328c in ipfw_chk (args=0xe6945acc) at /usr/src/sys/modules/ipfw/../../netinet/ip_fw2.c:2516
warning: Source file is more recent than executable.
2516            if (set_disable & (1 << f->set) )
(kgdb) list
2511            ipfw_insn *cmd;
2512            uint32_t tablearg = 0;
2513            int l, cmdlen, skip_or; /* skip rest of OR block */
2514
2515    again:
2516            if (set_disable & (1 << f->set) )
2517                    continue;
2518
2519            skip_or = 0;
2520            for (l = f->cmd_len, cmd = f->cmd ; l > 0 ;
(kgdb) p f
$1 = (struct ip_fw *) 0xdeadc0de
(kgdb)

DUMMYNET does not have this problem because ip_dn_ruledel_ptr(rule) is called when a rule is removed in reap_rules(). My first thought was to do the same here, i.e. to broadcast a "rule removed" message to the netgraph nodes, but glancing through the netgraph man page I haven't figured out how it could be done, if it is possible at all.

So the other solution is to have a counter that is incremented every time any rules are removed. When ipfw directs a packet to the netgraph subsystem, the current value of the counter is stored in an mtag. When the packet comes back, the current value of the counter is compared with the one from the mtag, and if they differ the packet is dropped.

Just to prove the concept I have modified ip_fw2.c for 7.2-STABLE accordingly and it works for me. The patch is attached.

I would like to hear other people's opinions, first of all on whether the proposed idea is good enough or whether there are better solutions to the problem (e.g. whether broadcasting the rule removal is possible). If somebody has any remarks about the patch itself, I would also be happy to see them. E.g. I have added the counter as a plain static variable, but to my mind struct ip_fw_chain could be a better place for it. Also, is there any need to mark the tag with the MTAG_PERSISTENT bit?

-- 
Mikolaj Golub

--- sys/netinet/ip_fw2.c.orig 2009-05-24 14:25:30.0 +0300
+++ sys/netinet/ip_fw2.c 2009-05-25 19:30:33.0 +0300
@@ -111,6 +111,15 @@
 static int fw_verbose;
 static struct callout ipfw_timeout;
 static int
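As a cross-check of the kgdb session above, the faulting address reported in the trap_fatal frame (eva=3735929068) can be compared with the value printed for f (0xdeadc0de, the pattern FreeBSD debugging kernels write over freed memory). A minimal sketch of that arithmetic follows; fault_offset() is a hypothetical helper name, and since the layout of struct ip_fw is not shown in the post, reading the result as the offset of the set member is an inference rather than something the post states.

```c
#include <stdint.h>

/*
 * Byte distance between a faulting address and the pointer that was being
 * dereferenced. A small result suggests a member access through that pointer.
 */
uint32_t
fault_offset(uint32_t eva, uint32_t base)
{
	return (eva - base);
}
```

With the values from the dump, fault_offset(3735929068u, 0xdeadc0deu) is 14: the faulting address is 0xdeadc0ec, a few bytes past the poisoned pointer. That is what one would expect if the fault came from reading f->set on line 2516 through an already freed rule.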
Re: kern/133902: [tun] Killing tun0 iface ssh tunnel causes Panic String: page fault
The following reply was made to PR kern/133902; it has been noted by GNATS.

From: Mikolaj Golub to.my.troc...@gmail.com
To: bug-follo...@freebsd.org
Cc: freebsd-b...@freebsd.org, freebsd-net@FreeBSD.org, lsantagost...@gmail.com
Subject: Re: kern/133902: [tun] Killing tun0 iface ssh tunnel causes Panic String: page fault
Date: Thu, 23 Apr 2009 17:14:02 +0300

I have asked Leonardo to provide more info and backtrace.

So here is backtrace:

cobra4# kgdb /boot/kernel/kernel.symbols /var/crash/vmcore.0
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x65656c7b
fault code            = supervisor write, page not present
instruction pointer   = 0x20:0xc0786e00
stack pointer         = 0x28:0xe958fac4
frame pointer         = 0x28:0xe958fac4
code segment          = base 0x0, limit 0xf, type 0x1b
                      = DPL 0, pres 1, def32 1, gran 1
processor eflags      = interrupt enabled, resume, IOPL = 0
current process       = 66873 (ssh)
trap number           = 12
panic: page fault
cpuid = 1
Uptime: 54d11h21m54s
Physical memory: 2023 MB
Dumping 277 MB: 262 246 230 214 198 182 166 150 134 118 102 86 70 54 38 22 6

#0  doadump () at pcpu.h:195
195 pcpu.h: No such file or directory.
    in pcpu.h
(kgdb) backtrace
#0  doadump () at pcpu.h:195
#1  0xc0754457 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc0754719 in panic (fmt=Variable fmt is not available.
) at /usr/src/sys/kern/kern_shutdown.c:563
#3  0xc0a4905c in trap_fatal (frame=0xe958fa84, eva=1701145723) at /usr/src/sys/i386/i386/trap.c:899
#4  0xc0a492e0 in trap_pfault (frame=0xe958fa84, usermode=0, eva=1701145723) at /usr/src/sys/i386/i386/trap.c:812
#5  0xc0a49c8c in trap (frame=0xe958fa84) at /usr/src/sys/i386/i386/trap.c:490
#6  0xc0a2fc0b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc0786e00 in clear_selinfo_list (td=0xca3fc840) at /usr/src/sys/kern/sys_generic.c:1065
#8  0xc0788efc in kern_select (td=0xca3fc840, nd=8, fd_in=0x284010b8, fd_ou=0x284010bc, fd_ex=0x0, tvp=0x0) at /usr/src/sys/kern/sys_generic.c:794
#9  0xc07890de in select (td=0xca3fc840, uap=0xe958fcfc) at /usr/src/sys/kern/sys_generic.c:663
#10 0xc0a49635 in syscall (frame=0xe958fd38) at /usr/src/sys/i386/i386/trap.c:1035
#11 0xc0a2fc70 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:196
#12 0x0033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

The system panics on ifconfig tun0 destroy.

This issue is related to kern/116837. Leonardo, you can try the patch attached to that pr.

-- 
Mikolaj Golub
Re: kern/132734: panic in net/if_mib.c
The following reply was made to PR kern/132734; it has been noted by GNATS.

From: Mikolaj Golub to.my.troc...@gmail.com
To: Alexey Illarionov littlesav...@orionet.ru
Cc: bug-follo...@freebsd.org, Robert Watson rwat...@freebsd.org
Subject: Re: kern/132734: panic in net/if_mib.c
Date: Thu, 23 Apr 2009 22:29:36 +0300

SVN rev 191435 on 2009-04-23 18:23:08Z by rwatson

  Merge r191434 from stable/7 to releng/7.2:

    In sysctl_ifdata(), query the ifnet pointer using the index only
    once, rather than querying it, validating it, and then re-querying
    it without validating it. This may avoid a NULL pointer dereference
    and resulting kernel page fault if an interface is being deleted
    while bsnmp or other tools are querying data on the interface.

    The full fix, to properly refcount the interface for the duration
    of the sysctl, is in 8.x, but is considered too high-risk for 7.2,
    so instead will appear in 7.3 (if all goes well).

So, Alexey, can you try upgrading to the latest stable/7 or releng/7.2, or apply the attached patch, to see if this tweak at least eliminates the instant panic?

--- if_mib.c (revision 191424)
+++ if_mib.c (working copy)
@@ -82,11 +82,9 @@
 		return EINVAL;
 
 	if (name[0] <= 0 || name[0] > if_index ||
-	    ifnet_byindex(name[0]) == NULL)
+	    (ifp = ifnet_byindex(name[0])) == NULL)
 		return ENOENT;
 
-	ifp = ifnet_byindex(name[0]);
-
 	switch(name[1]) {
 	default:
 		return ENOENT;
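The difference between the old and the patched lookup can be illustrated with a small user-space simulation. This is only a sketch of the pattern, not kernel code: struct iface, lookup_by_index(), reset_sim() and the two get_iface_*() functions are hypothetical names, and lookup_by_index() is rigged so that the interface "disappears" after the first call, as if it were detached concurrently.

```c
#include <stddef.h>

struct iface { int index; };	/* hypothetical stand-in for struct ifnet */

struct iface eth0 = { 1 };
int lookups;			/* number of lookups so far, drives the fake race */

void
reset_sim(void)
{
	lookups = 0;
}

/*
 * Hypothetical stand-in for ifnet_byindex(): succeeds on the first call only,
 * simulating an interface that is deleted right after it was validated.
 */
struct iface *
lookup_by_index(int idx)
{
	if (idx != eth0.index)
		return (NULL);
	return (lookups++ == 0 ? &eth0 : NULL);
}

/* Old pattern: validate one lookup, then use a second, unchecked one. */
struct iface *
get_iface_twice(int idx)
{
	if (lookup_by_index(idx) == NULL)
		return (NULL);
	return (lookup_by_index(idx));	/* may already be NULL */
}

/* Patched pattern: look up once and keep the pointer that was validated. */
struct iface *
get_iface_once(int idx)
{
	struct iface *ifp;

	if ((ifp = lookup_by_index(idx)) == NULL)
		return (NULL);
	return (ifp);
}
```

In this simulation get_iface_twice(1) passes the check and then hands back NULL, which the real code would go on to dereference, while get_iface_once(1) returns the pointer it actually checked. As the commit message notes, this only removes the unchecked second lookup; without a reference held on the interface the pointer can still go stale afterwards.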
Re: kern/131310: [netgraph] [panic] 7.1 panics with mpd netgraph interface changes
The following reply was made to PR kern/131310; it has been noted by GNATS.

From: Mikolaj Golub to.my.troc...@gmail.com
To: bug-follo...@freebsd.org, Vitaly Dodonov dreamer@gmail.com
Cc: Semenchuk Oleg darki...@gmail.com
Subject: Re: kern/131310: [netgraph] [panic] 7.1 panics with mpd netgraph interface changes
Date: Fri, 10 Apr 2009 15:09:38 +0300

This pr is closely related to kern/130977. You can try the patch from it, which adds if_delgroup(ifp, IFG_ALL) to if_detach().

-- 
Mikolaj Golub