kern/207069 update with patch

2016-02-23 Thread David Cross
I finally worked my way through the Forth and now have a working patch.
I have updated the ticket with the patch and with the testing I completed.
It is a two-line patch and should be very straightforward for someone
familiar with that code.

Thanks
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


10.2 - Process stuck in unkillable sleep

2016-02-23 Thread Paul Koch

Occasionally we see a process get stuck in an unkillable state and
the only solution is a hard reboot.

Occasionally == once every two weeks across 60+ servers, which are spread
across the globe in customer sites.  We have no remote access to these boxes.

The process that most often gets stuck (though it is not the only one) is a
large-scale Ping/SNMP poller.  It is a fairly simple C program that just fires
out lots of ping (raw ICMP socket) and SNMP (UDP socket) requests
asynchronously.
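
For readers unfamiliar with the pattern, the "fire out lots of requests
asynchronously" approach can be sketched roughly as below.  This is only an
illustration, not the nm-poller source: it is Python rather than C, it covers
only the UDP half (a raw ICMP socket needs root), and the poll_udp name, its
parameters and the 2048-byte receive size are all made up for the example.

```python
import select
import socket
import time


def poll_udp(targets, payload=b"ping", timeout=1.0):
    """Send one datagram to every target, then gather replies until timeout.

    Hypothetical helper for illustration only; returns {address: reply}.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)
    try:
        # Fire all requests up front without waiting for any answer.
        for addr in targets:
            sock.sendto(payload, addr)

        replies = {}
        pending = set(targets)
        deadline = time.monotonic() + timeout
        while pending:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            # Sleep in select() until a reply arrives or time runs out.
            readable, _, _ = select.select([sock], [], [], remaining)
            if not readable:
                break
            data, addr = sock.recvfrom(2048)
            replies[addr] = data
            pending.discard(addr)
        return replies
    finally:
        sock.close()
```

A process built this way spends nearly all of its time asleep in select(),
which is the state most of the other processes show in the top output further
down.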

We've managed to trap the problem a few times on a test server running in
VirtualBox, but it also occurs at customer sites that run VMware, Hyper-V,
QEMU and bare metal.


We raised this PR
 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204081

but suspect it is a similar/same issue as
 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200992

This is the info we've gathered from the most recent time it has occurred:


# uname -a
FreeBSD shed153.akips.com 10.2-RELEASE-p12 FreeBSD 10.2-RELEASE-p12 #0 r295070:
Sat Jan 30 20:03:44 UTC 2016  
r...@shed21.akips.com:/usr/obj/usr/src/sys/GENERIC amd64


For some reason, nm-poller (PID 1014) shows no state in top:

last pid:  1847;  load averages:  0.62,  1.20,  1.33    up 13+16:06:04  13:36:46
103 processes: 1 running, 102 sleeping
CPU:  1.0% user,  0.0% nice,  4.3% system,  0.0% interrupt, 94.7% idle
Mem: 650M Active, 541M Inact, 2527M Wired, 16M Cache, 417M Buf, 217M Free
ARC: 2087M Total, 102M MFU, 1968M MRU, 18K Anon, 9409K Header, 9088K Other
Swap: 4096M Total, 256M Used, 3840M Free, 6% Inuse

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
 1013 akips         1  20    0 74076K  5544K select  1 195:41   0.59% nm-httpd
 1003 akips         1  20    0   164M 54328K select  0 236:18   0.49% nm-flow-collector
  888 root          1  20    0   101M 14920K select  0 163:56   0.39% nm-joatd
  885 akips         1  20    0 74004K  3092K nanslp  1 116:52   0.29% nm-timed
 1014 akips         1   4    0   851M   104M         0  18.0H   0.00% nm-poller
 1086 akips         1  20    0 21940K  2680K nanslp  0  66:25   0.00% top
 1015 akips         1  20    0   819M   256M nanslp  1  56:45   0.00% nm-poller-db
 1023 akips         1  20    0   114M 44760K select  0  55:00   0.00% nm-flow-meter
 1005 akips         1  20    0   159M  4172K select  0  51:00   0.00% nm-msgd
 1025 akips         1  20    0   114M 45644K select  0  44:22   0.00% nm-flow-meter
 1012 akips         1  20    0 60360K  5132K piperd  1  20:08   0.00% perl
 1027 akips         1  20    0   110M 34564K select  1  18:58   0.00% nm-flow-meter
  997 akips         1  20    0   819M 27600K select  0  12:59   0.00% nm-snmp-trapd
  991 akips         1  20    0 78104K  5384K select  1  10:53   0.00% nm-fifo-tee
  989 akips         1  20    0 78104K  5764K select  1  10:34   0.00% nm-fifo-tee
  990 akips         1  20    0 78104K  5496K select  0  10:31   0.00% nm-fifo-tee
 1047 akips         1  20    0   102M 29108K select  0  10:25   0.00% nm-flow-meter
      akips         1  20    0   102M 36000K select  0   9:18   0.00% nm-flow-meter
 1231 akips         1  20    0   102M 35952K select  1   9:17   0.00% nm-flow-meter
 1239 akips         1  20    0   102M 33132K select  0   8:51   0.00% nm-flow-meter
 1240 akips         1  20    0   102M 33132K select  1   8:51   0.00% nm-flow-meter
 1002 akips         1  20    0 74016K  3480K select  1   8:50   0.00% nm-syslogd
 1234 akips         1  20    0   102M 35920K select  1   8:49   0.00% nm-flow-meter
 1243 akips         1  20    0   102M 33148K select  0   8:46   0.00% nm-flow-meter
 1039 akips         1  20    0   820M 31388K select  0   8:46   0.00% nm-db
 1233 akips         1  20    0   102M 31256K select  0   8:43   0.00% nm-flow-meter
 1237 akips         1  20    0   102M 33168K select  0   8:43   0.00% nm-flow-meter
 1235 akips         1  20    0   102M 29040K select  1   8:41   0.00% nm-flow-meter
 1259 akips         1  20    0   102M 29096K select  0   8:40   0.00% nm-flow-meter
 1255 akips         1  20    0   102M 31756K select  1   8:40   0.00% nm-flow-meter
 1232 akips         1  20    0   102M 31780K select  1   8:39   0.00% nm-flow-meter
 1041 akips         1  20    0   820M 45284K select  0   8:34   0.00% nm-db
 1044 akips         1  20    0   820M 26172K select  1   8:28   0.00% nm-db
 1060 akips         1  20    0 74008K  3380K select  1   8:22   0.00% nm-syslog
 1077 akips         1  20    0   820M 26076K select  1   8:22   0.00% nm-db
 1048 akips         1  20    0   820M 26076K select  1   8:16   0.00% nm-db
 1045 akips         1  20    0   820M 27056K select  1   8:16   0.00% nm-db
 1046 akips         1  20    0   820M 26156K select  1   8:16   0.00% nm-db
22541 akips         1  20    0   820M 26092K select  1   8:16   0.00% nm-db
 1049 root          1  20    0   820M 26076K select  0   8:15   0.00% nm-db
 1043 akips         1  20    0   820M 26076K select  0   8:15   0.00% nm-db
 1006 akips         1  20    0 74004K