And if it is of any use (from a sleeping KVM):
> ::status
debugging PID 20583 (64-bit)
file: /smartdc/bin/qemu-system-x86_64
threading model: native threads
status: stopped by debugger, sleeping in pollsys system call
> ::stack
libc.so.1`__pollsys+0xa()
libc.so.1`pselect+0x1cb(18, fffffd7fffdfd4a0, fffffd7fffdfb4a0,
fffffd7fffdf94a0, fffffd7fffdf9410, 0)
libc.so.1`select+0x5a(18, fffffd7fffdfd4a0, fffffd7fffdfb4a0, fffffd7fffdf94a0,
fffffd7fffdf9480)
main_loop_wait+0x299(0)
kvm_main_loop+0x119()
main_loop+0x17()
main+0x2da4(30, fffffd7fffdff898, fffffd7fffdffa20)
_start+0x6c()
> ::stacks
THREAD           STATE    SOBJ                COUNT
3                UNPARKED <NONE>                  2
                 kvm_cpu_exec+0x1c
                 kvm_main_loop_cpu+0x69
                 ap_main_loop+0xe9
                 libc.so.1`_thrp_setup+0x8a
                 libc.so.1`_lwp_start

2                UNPARKED <NONE>                  2
                 libc.so.1`sigwaitinfo+0x17
                 sigwait_compat+0x59
                 libc.so.1`_thrp_setup+0x8a
                 libc.so.1`_lwp_start

1                UNPARKED <NONE>                  1
                 libc.so.1`pselect+0x1cb
                 libc.so.1`select+0x5a
                 main_loop_wait+0x299
                 kvm_main_loop+0x119
                 main_loop+0x17
                 main+0x2da4
                 _start+0x6c
> ::siginfo
signal 1 (HUP)
code 0 (user generated via kill)
errno 0 (Error 0)
signal sent from PID -1056859968 (uid -2124672)
but I’m not really sure how to debug further. Exiting mdb via ::quit woke
the KVM back up again. The VNC session that I had established beforehand (which
showed no GUI and accepted no input) came back to life without reconnecting.
On 31 Aug 2014, at 4:22 pm, David Finster via smartos-discuss wrote:
I doubt it’s an IO issue, as IO is very quiet at the moment (weekend):
[root@00-25-90-85-46-2c ~]# zpool iostat 1
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zones       3.05T  5.64T     36    286   328K  2.53M
zones       3.05T  5.64T      0     16      0   299K
zones       3.05T  5.64T      0  1.19K      0  7.64M
zones       3.05T  5.64T      0     10  7.98K   132K
zones       3.05T  5.64T      0      8      0   132K
zones       3.05T  5.64T      0      8      0   152K
zones       3.05T  5.64T      0     54      0   599K
zones       3.05T  5.64T      0  1.07K      0  8.02M
zones       3.05T  5.64T      0      2      0  79.8K
zones       3.05T  5.64T      0     12      0   195K
zones       3.05T  5.64T      0      2      0  83.8K
zones       3.05T  5.64T      0     24      0   323K
zones       3.05T  5.64T      0    939      0  6.14M
zones       3.05T  5.64T      1     45  16.0K   439K
zones       3.05T  5.64T      0     17      0   227K
zones       3.05T  5.64T      0     15      0   263K
zones       3.05T  5.64T      0     12      0   228K
zones       3.05T  5.64T      0  1.03K      0  7.61M
zones       3.05T  5.64T      0     21      0   259K
zones       3.05T  5.64T      0      9      0   148K
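To put a rough number on "very quiet", samples like the ones above could be summarised with a short script. This is only a sketch under stated assumptions: the function names (to_number, summarise) are made up for illustration, it expects the seven-column layout shown above, and it skips the first row by default because the first report from zpool iostat is the average since boot rather than a one-second sample.

```python
import re

# Multipliers for the unit suffixes zpool iostat prints (powers of 1024).
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_number(field):
    """Convert a zpool iostat field such as '1.19K' or '36' to a float."""
    match = re.fullmatch(r"([0-9.]+)([KMGT]?)", field)
    if match is None:
        raise ValueError(f"unrecognised field: {field!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def summarise(lines, pool="zones", skip_first=True):
    """Return mean and peak write ops and write bandwidth for one pool.

    Rows are expected as: pool alloc free read-ops write-ops read-bw write-bw.
    Header and separator rows are ignored because their first field is not
    the pool name.
    """
    rows = [f for f in (line.split() for line in lines)
            if len(f) == 7 and f[0] == pool]
    if skip_first:
        rows = rows[1:]  # first report is the since-boot average
    if not rows:
        raise ValueError("no samples found for pool")
    ops = [to_number(r[4]) for r in rows]
    bw = [to_number(r[6]) for r in rows]
    return {
        "samples": len(rows),
        "mean_write_ops": sum(ops) / len(ops),
        "peak_write_ops": max(ops),
        "mean_write_bytes": sum(bw) / len(bw),
        "peak_write_bytes": max(bw),
    }


# Three rows copied from the output above.
sample = """\
zones 3.05T 5.64T 36 286 328K 2.53M
zones 3.05T 5.64T 0 16 0 299K
zones 3.05T 5.64T 0 1.19K 0 7.64M
""".splitlines()

print(summarise(sample))
```

The periodic spikes to roughly 1K write operations look like regular batched writes rather than sustained load, but that is a reading of the numbers, not something the script can confirm.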
iostat yielded the following: http://pastebin.com/su5g8M0h - short but that is
pretty representative of how it looked over a 2 minute run.
Further, I am able to VNC into the KVM machine, but the machine is totally
non-responsive (it won’t show any GUI at all). I’m willing to bet that if I run
any of the commands I’ve mentioned previously, it’ll wake up again.
(Sorry for the double email again Micky - brain melt…)
On 31 Aug 2014, at 3:06 pm, David Finster wrote:
I’ll see how I go, but it is difficult to know when the issue would have
otherwise occurred as it is fairly sporadic/random.
If it were a Windows issue, how would running vmadm info/pstack/plockstat cause
the machine to begin responding again?
I’ve just noticed that one of the machines has gone to sleep again and since it
isn’t in production I can keep it down for the time being. If anyone has any
debugging suggestions it would be appreciated.
On 31 Aug 2014, at 1:39 pm, Micky wrote:
Does it happen if you keep an RDP session open? Try running ping -t or something
similar to keep the network sockets open, ideally on a fresh install. It seems
like a Windows issue.
On Sat, Aug 30, 2014 at 12:28 PM, David Finster via smartos-discuss wrote:
As an update, the issue occurred again on two different hosts for KVM machines.
Running pstack across the qemu process for the first KVM caused it to wake up,
but the following was captured:
http://pastebin.com/Js20eZPH
On the second machine, running plockstat also woke the KVM up, but provided
this:
http://pastebin.com/pbEt2bnj
On 29 Aug 2014, at 1:11 pm, David Finster via smartos-discuss wrote:
I should also be more specific in saying that it appears that the VM locks up
during this period of sleep. Once woken up, there are no event logs for the
period that the machine was unreachable. Network activity isn’t the only thing
affected.
Thanks,
Dave
On 29 Aug 2014, at 1:07 pm, David Finster via smartos-discuss wrote:
Hi Everyone
I’m seeing a weird issue with 3 particular KVM VMs whereby the virtual machine
appears to simply stop responding to network traffic (unpingable). The weird
thing is that as soon as I jump into the hypervisor and run ‘vmadm info
<uuid>’, the machine immediately starts responding to pings.
I obtained the stacks of the qemu process, but accidentally ran vmadm info
before I went further.
Stacks can be found here: http://pastebin.com/8FBxrcJJ
Does anyone have any suggestions/further debugging steps? Next time it happens
I’ll see if I can obtain lock info and anything else I can think of.
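Since the hang is sporadic and the first command run against the VM seems to wake it, one option is to have a small watchdog take the capture automatically the moment the guest stops answering. The following is only a sketch: capture_on_hang, run and their parameters are hypothetical names, the ping and capture commands are supplied by the caller rather than assumed, and only the first capture command is likely to see the hung state.

```python
import subprocess
import time
from datetime import datetime, timezone


def run(cmd):
    """Run a command and return (exit status, combined stdout and stderr)."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    return proc.returncode, proc.stdout


def capture_on_hang(ping_cmd, capture_cmds, log_path, interval=5,
                    max_checks=None, runner=run, sleep=time.sleep):
    """Poll ping_cmd; on the first failure run capture_cmds in order.

    The output of every capture command is appended to log_path.
    Returns True if a hang was captured, or False if max_checks polls
    all succeeded. With max_checks=None it polls until a hang is seen.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        status, _ = runner(ping_cmd)
        if status != 0:
            stamp = datetime.now(timezone.utc).isoformat()
            with open(log_path, "a") as log:
                log.write(f"=== guest unreachable at {stamp} ===\n")
                # Order matters: put the most valuable capture first,
                # because it may wake the VM before the others run.
                for cmd in capture_cmds:
                    _, output = runner(cmd)
                    log.write(f"--- {' '.join(cmd)} ---\n{output}\n")
            return True
        checks += 1
        sleep(interval)
    return False
```

As an illustration only, it might be called with a ping command for the guest's address, a capture list such as [["pstack", "<qemu pid>"]] and a log path of your choosing. The exact ping arguments and exit status behaviour should be checked on the host first.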
The KVM machines are running Server 2012 R2 64-bit (there are other machines
also running this OS that aren’t having issues) and the issue is occurring on
two separate hosts.
Build version is 20140717T041004Z
Thanks,
Dave
smartos-discuss | Archives <https://www.listbox.com/member/archive/184463/=now> |
RSS <https://www.listbox.com/member/archive/rss/184463/25738179-216c4b5f> |
Modify Your Subscription <https://www.listbox.com/member/?&> |
<http://www.listbox.com/>
-------------------------------------------
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
RSS Feed: https://www.listbox.com/member/archive/rss/184463/25769125-55cfbc00
Modify Your Subscription:
https://www.listbox.com/member/?member_id=25769125&id_secret=25769125-7688e9fb
Powered by Listbox: http://www.listbox.com