Re: Unkillable process in STOP state

2015-11-12 Thread Johan Schuijt-Li
Yeah, the only difference is that for us the status was 'Ds' rather than 
'STOP'. So the status is most likely irrelevant; at least the traces are 
exactly the same! :)

- Johan


> On 12 Nov 2015, at 11:45, Slawa Olhovchenkov <s...@zxy.spb.ru> wrote:
> 
> On Thu, Nov 12, 2015 at 07:05:46AM +0100, Johan Schuijt-Li wrote:
> 
>> This seems like the exact same problem we’ve had; more details can be 
>> found in the following PR:
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200992
> 
> Maybe. At other times I see a "sleep without queue" status.
> 
>> The patch attached there solved all our problems.
> 
> I will try this patch at the next update, thanks.
> 
>> - Johan
>> 
>> 
>>> On 12 Nov 2015, at 01:12, Slawa Olhovchenkov <s...@zxy.spb.ru> wrote:
>>> 
>>> I have amd64, STABLE r288167.
>>> 
>>> root@edge09:/home/admin # procstat -k -k 627
>>>   PID    TID COMM             TDNAME           KSTACK
>>>   627 100167 tcpkali.new      -                mi_switch+0xe1 thread_suspend_switch+0x170 thread_single+0x4e5 exit1+0xbe sys_sys_exit+0xe amd64_syscall+0x357 Xfast_syscall+0xfb
>>>   627 100172 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>>>   627 100173 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>>>   627 100174 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>>>   627 100175 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>>>   627 100178 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>>>   627 100180 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>>>   627 102207 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>>>   627 102208 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>>>   627 102209 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>>>   627 102211 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>>> root@edge09:/home/admin # procstat -t 627
>>>   PID    TID COMM             TDNAME           CPU  PRI STATE   WCHAN
>>>   627 100167 tcpkali.new      -                 10  120 stop    -
>>>   627 100172 tcpkali.new      -                  3  152 stop    -
>>>   627 100173 tcpkali.new      -                  5  152 stop    -
>>>   627 100174 tcpkali.new      -                  3  152 stop    -
>>>   627 100175 tcpkali.new      -                  2  149 stop    -
>>>   627 100178 tcpkali.new      -                  5  152 stop    -
>>>   627 100180 tcpkali.new      -                  2  132 stop    -
>>>   627 102207 tcpkali.new      -                  5  136 stop    -
>>>   627 102208 tcpkali.new      -                  3  152 stop    -
>>>   627 102209 tcpkali.new      -                  5  139 stop    -
>>>   627 102211 tcpkali.new      -                  1  120 stop    -
>>> 
>>> kill -STOP has no effect.
>>> gdb can't be attached.
>>> ___
>>> freebsd-stable@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>> 

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Re: Unkillable process in STOP state

2015-11-11 Thread Johan Schuijt-Li
This seems like the exact same problem we’ve had; more details can be 
found in the following PR:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200992


The patch attached there solved all our problems.

- Johan


> On 12 Nov 2015, at 01:12, Slawa Olhovchenkov <s...@zxy.spb.ru> wrote:
> 
> I have amd64, STABLE r288167.
> 
> root@edge09:/home/admin # procstat -k -k 627
>   PID    TID COMM             TDNAME           KSTACK
>   627 100167 tcpkali.new      -                mi_switch+0xe1 thread_suspend_switch+0x170 thread_single+0x4e5 exit1+0xbe sys_sys_exit+0xe amd64_syscall+0x357 Xfast_syscall+0xfb
>   627 100172 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>   627 100173 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>   627 100174 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>   627 100175 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>   627 100178 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>   627 100180 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>   627 102207 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>   627 102208 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>   627 102209 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
>   627 102211 tcpkali.new      -                mi_switch+0xe1 sleepq_timedwait_sig+0x8b _sleep+0x238 kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x357 Xfast_syscall+0xfb
> root@edge09:/home/admin # procstat -t 627
>   PID    TID COMM             TDNAME           CPU  PRI STATE   WCHAN
>   627 100167 tcpkali.new      -                 10  120 stop    -
>   627 100172 tcpkali.new      -                  3  152 stop    -
>   627 100173 tcpkali.new      -                  5  152 stop    -
>   627 100174 tcpkali.new      -                  3  152 stop    -
>   627 100175 tcpkali.new      -                  2  149 stop    -
>   627 100178 tcpkali.new      -                  5  152 stop    -
>   627 100180 tcpkali.new      -                  2  132 stop    -
>   627 102207 tcpkali.new      -                  5  136 stop    -
>   627 102208 tcpkali.new      -                  3  152 stop    -
>   627 102209 tcpkali.new      -                  5  139 stop    -
>   627 102211 tcpkali.new      -                  1  120 stop    -
> 
> kill -STOP has no effect.
> gdb can't be attached.


Re: libopie problems after upgrade to 10.2

2015-08-15 Thread Johan Schuijt-Li
I actually have the exact same problem, but worked around it by creating a 
symlink. My libopie.so.8 has the exact same properties as yours (file length 
and date).
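
A minimal sketch of that kind of symlink workaround follows. The helper name
link_missing_soname is made up for illustration, and the sketch assumes that
libopie.so.8 is an acceptable stand-in for libopie.so.7, which makes it a
stopgap rather than a fix:

```shell
#!/bin/sh
# Hypothetical helper: make a missing soname resolve to a library that is
# present by creating a relative symlink next to it.
link_missing_soname() {
    dir=$1       # library directory, e.g. /usr/lib
    existing=$2  # library file that is present, e.g. libopie.so.8
    missing=$3   # soname the runtime linker asks for, e.g. libopie.so.7

    if [ ! -e "$dir/$existing" ]; then
        echo "no such file: $dir/$existing" >&2
        return 1
    fi
    # Refuse to touch anything that already exists under the missing name.
    if [ -e "$dir/$missing" ] || [ -L "$dir/$missing" ]; then
        echo "already present, leaving it alone: $dir/$missing" >&2
        return 1
    fi
    ln -s "$existing" "$dir/$missing"
}

# Example (as root), using the file names from this thread:
# link_missing_soname /usr/lib libopie.so.8 libopie.so.7
```

The helper refuses to overwrite an existing file or link, so running it on a
system that already has libopie.so.7 changes nothing.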

I also did a small diff on the checksums in /usr/lib compared to the release 
build:

$ diff base.txt jail.txt
12,13c12,13
< SHA256 (dtrace) = 7a9b63f277e8ecc59969fd76c0ed77981b1cd58f6934e5893208820df6fd379d
< SHA256 (engines) = c99cccdfe0bbbd06d049309d67012809e53337f4b8d534cffbc7f28c4347f4da
---
> SHA256 (dtrace) = e55210c96547241d3747b755578f7e2c9a9561759b6f90ba3b802090184531a3
> SHA256 (engines) = d58cb5ccc79382bef2d1b2637c914120502032f8ab933d28f703ffc2ba0a9b0b
15,16c15,16
< SHA256 (i18n) = d7d88d2462ff9e50b9b5666bf93def022d9f10c23c038acc07f429c8ace86095
< SHA256 (include) = 1a9401ab8473379478012cf6af354f3448dffd6ba22af138c940824e2695f6e6
---
> SHA256 (i18n) = f699bb9a283924ade54398dc45c67c511789435ff76abd08060f1016a09d88bd
> SHA256 (include) = 5036f146c7e63b8414f62cac8c13abb92cb1c16dd8845748b8eca320a1f69d42
330,332c330,331
< SHA256 (libopie.so) = c8ce7de5c31ddfd588cf3dd5b68dd97b0f76180d09a7011a9e9b351f6c5cece0
< SHA256 (libopie.so.7) = c8ce7de5c31ddfd588cf3dd5b68dd97b0f76180d09a7011a9e9b351f6c5cece0
< SHA256 (libopie.so.8) = c8ce7de5c31ddfd588cf3dd5b68dd97b0f76180d09a7011a9e9b351f6c5cece0
---
> SHA256 (libopie.so) = 3a0c6bab3535b4a731a19e6df04b9d95ad1e3d0d6e44852c10f0ffbd7cde6ad1
> SHA256 (libopie.so.7) = 3a0c6bab3535b4a731a19e6df04b9d95ad1e3d0d6e44852c10f0ffbd7cde6ad1
558c557
< SHA256 (private) = 11bd2e8f469c06a1f88baf0a43591cb83161399dd03a644f615bf04627cc5e99
---
> SHA256 (private) = 9888acf06a8c0cbb025309581c4ed52c17d91fcc6a75340fae05807f8d0c3915


I’m comparing my installed system against a clean 10.2-RELEASE poudriere jail.
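
A minimal sketch of how checksum lists such as base.txt and jail.txt could be
produced. The exact commands behind the lists above are not shown in this
message, and checksum_list is a hypothetical helper. It assumes FreeBSD's
sha256(1), falling back to GNU "sha256sum --tag"; both print the
"SHA256 (name) = digest" format seen in the diff. It hashes regular files
only, so directory entries like dtrace or include would not appear in its
output:

```shell
#!/bin/sh
# Hypothetical helper: print one "SHA256 (name) = digest" line for every
# regular file (or symlink to one) directly inside the given directory.
checksum_list() {
    (
        cd "$1" || exit 1
        for f in *; do
            [ -f "$f" ] || continue   # skip directories and anything else
            if command -v sha256 >/dev/null 2>&1; then
                sha256 "$f"           # FreeBSD base system
            else
                sha256sum --tag "$f"  # GNU coreutils, BSD-style output
            fi
        done
    )
}

# Example: run once on the installed system and once inside the clean jail,
# then compare the two lists.
# checksum_list /usr/lib > base.txt
# checksum_list /usr/lib > jail.txt
# diff base.txt jail.txt
```

Because the shell expands "*" in sorted order, both lists come out in the
same order as long as both runs use the same locale, which keeps the diff
limited to files that really differ or are missing on one side.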

- Johan

> On 15 Aug 2015, at 20:47, Chris Anderson <c...@pobox.com> wrote:
>
> Just upgraded from 10.1-RELEASE-p16 to 10.2-RELEASE using freebsd-update.
>
> After the upgrade, I began getting errors because pam_opie.so.5 has an
> unsatisfied link to libopie.so.7 (my system only has libopie.so.8).
>
> I notice a fresh install of 10.2-RELEASE does indeed contain libopie.so.7,
> so I'm curious how I managed to get into this state in the first place and
> whether it is anything I should worry about. This machine has only been
> upgraded using freebsd-update and I'm pretty sure it started from
> 10.0-RELEASE.
>
> I have temporarily worked around it with an entry in libmap.
>
> Here are the files involved:
>
> # ls -l /usr/lib/pam_opie*
> lrwxr-xr-x  1 root  wheel    13 Sep 27  2013 /usr/lib/pam_opie.so -> pam_opie.so.5
> -r--r--r--  1 root  wheel  7000 Aug 14 11:56 /usr/lib/pam_opie.so.5
> lrwxr-xr-x  1 root  wheel    19 Sep 27  2013 /usr/lib/pam_opieaccess.so -> pam_opieaccess.so.5
> -r--r--r--  1 root  wheel  5568 Aug 14 11:56 /usr/lib/pam_opieaccess.so.5
>
> # ls -l /usr/lib/libopie*
> -r--r--r--  1 root  wheel  84582 Aug 14 11:57 /usr/lib/libopie.a
> lrwxr-xr-x  1 root  wheel     12 Sep 29  2014 /usr/lib/libopie.so -> libopie.so.8
> -r--r--r--  1 root  wheel  38280 Oct  5  2014 /usr/lib/libopie.so.8
> -r--r--r--  1 root  wheel  88048 Aug 14 11:57 /usr/lib/libopie_p.a


Re: panic: pmap active 0xfffff8001b7154b8

2015-05-11 Thread Johan Schuijt-Li
Small update for archiving purposes:

I’ve been in contact with bdrewery and kib outside of the mailing list, which 
resulted in the following patch:
https://svnweb.freebsd.org/base?view=revision&revision=282679

We’re currently in the process of testing this patch and we’re cautiously 
positive on the results thus far. We’ll be rolling out this patch further and 
in roughly one week we should be confident enough that this problem is fully 
resolved.

- Johan


> On 07 May 2015, at 17:06, Bryan Drewery <bdrew...@freebsd.org> wrote:
>
> On 5/7/2015 7:08 AM, Johan Schuijt-Li wrote:
>> Hi,
>>
>> We’ve been seeing (seemingly) random reboots on 10.1-RELEASE virtual 
>> machines (KVM virtualisation) on our production servers. In an attempt to 
>> determine what was causing this, we’ve switched to running a kernel with 
>> INVARIANTS enabled. This resulted in the following panic:
>>
>> Unread portion of the kernel message buffer:
>> panic: pmap active 0xfffff8001b7154b8
>> cpuid = 3
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe03dd1493a0
>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe03dd149450
>> vpanic() at vpanic+0x126/frame 0xfe03dd149490
>> kassert_panic() at kassert_panic+0x139/frame 0xfe03dd149500
>> pmap_remove_pages() at pmap_remove_pages+0x8c/frame 0xfe03dd1495f0
>> exec_new_vmspace() at exec_new_vmspace+0x16a/frame 0xfe03dd149650
>> exec_elf64_imgact() at exec_elf64_imgact+0x658/frame 0xfe03dd149720
>> kern_execve() at kern_execve+0x5e4/frame 0xfe03dd149a80
>> sys_execve() at sys_execve+0x37/frame 0xfe03dd149ae0
>> amd64_syscall() at amd64_syscall+0x25a/frame 0xfe03dd149bf0
>> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe03dd149bf0
>> --- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x80158af1a, rsp = 0x7fffac38, rbp = 0x7fffad40 ---
>>
>>
>> I’ve only come across one other report here (without a result, unfortunately):
>> https://lists.freebsd.org/pipermail/freebsd-current/2014-June/050827.html
>>
>
> I looked around for the conclusion of that thread but could not find it.
> I was reproducing it so often that I'm sure this case was fixed. I may have
> privately contacted one of the VM maintainers to fix it. However, lacking
> evidence, I think it just stopped happening for me and I never reported
> anything useful.
>
>> Are other people aware of this issue or working on this?
>>
>> I can provide access to a VM with a kernel dump and the kernel build for 
>> extra information if needed.
>>
>
> What we really need is a full core dump (minidump) and backtrace. This
> will let us inspect the pmap state.
>
> https://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html
> https://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html
>
>
> -- 
> Regards,
> Bryan Drewery


panic: pmap active 0xfffff8001b7154b8

2015-05-07 Thread Johan Schuijt-Li
Hi,

We’ve been seeing (seemingly) random reboots on 10.1-RELEASE virtual machines 
(KVM virtualisation) on our production servers. In an attempt to determine what 
was causing this, we’ve switched to running a kernel with INVARIANTS enabled. 
This resulted in the following panic:

Unread portion of the kernel message buffer:
panic: pmap active 0xfffff8001b7154b8
cpuid = 3
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe03dd1493a0
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfe03dd149450
vpanic() at vpanic+0x126/frame 0xfe03dd149490
kassert_panic() at kassert_panic+0x139/frame 0xfe03dd149500
pmap_remove_pages() at pmap_remove_pages+0x8c/frame 0xfe03dd1495f0
exec_new_vmspace() at exec_new_vmspace+0x16a/frame 0xfe03dd149650
exec_elf64_imgact() at exec_elf64_imgact+0x658/frame 0xfe03dd149720
kern_execve() at kern_execve+0x5e4/frame 0xfe03dd149a80
sys_execve() at sys_execve+0x37/frame 0xfe03dd149ae0
amd64_syscall() at amd64_syscall+0x25a/frame 0xfe03dd149bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe03dd149bf0
--- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x80158af1a, rsp = 
0x7fffac38, rbp = 0x7fffad40 ---


I’ve only come across one other report here (without a result, unfortunately):
https://lists.freebsd.org/pipermail/freebsd-current/2014-June/050827.html

Are other people aware of this issue or working on this?

I can provide access to a VM with a kernel dump and the kernel build for extra 
information if needed.

- Johan

Re: panic: pmap active 0xfffff8001b7154b8

2015-05-07 Thread Johan Schuijt-Li
 
> What we really need is a full core dump (minidump) and backtrace. This
> will let us inspect the pmap state.
>
> https://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html
> https://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html

Sorry if my words were a bit unclear, but this is what I have. If you could 
e-mail me a preferred username with a public key, I can give you access to it.

- Johan