You were right, it was frozen at the virtual machine level.
The panic kernel parameter worked, so the server resumed with a reboot.
But no panic was displayed on the VNC console even though I was logged in.
The main problem is that a combination of MON and OSD silent failure at
once will cause much longer res
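For reference, the automatic-reboot-on-panic behavior mentioned above is usually configured like this (a sketch; the exact values used on this server are an assumption):

```shell
# Assumed settings: reboot automatically after a kernel panic,
# so a panicked server comes back on its own instead of sitting frozen.
sysctl -w kernel.panic=10            # reboot 10 seconds after a panic
sysctl -w kernel.hung_task_panic=1   # optionally panic on detected hung tasks
# Equivalently, pass panic=10 on the kernel command line.
```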
I am using librbd.
rbd map was only my test to see whether it was librbd related. Both -
librbd and rbd map - gave the same frozen result.
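For context, the two access paths being compared can be exercised like this (pool and image names are placeholders, not taken from the thread):

```shell
# krbd path: the kernel client maps the image as a block device
rbd map rbd/vm-disk          # hypothetical pool/image; creates e.g. /dev/rbd0
rbd showmapped               # list current kernel mappings
rbd unmap /dev/rbd0

# librbd path: QEMU opens the same image in userspace instead, e.g.
#   -drive format=raw,file=rbd:rbd/vm-disk
```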
The node running the virtuals has the 4.9.0-3-amd64 kernel.
Of the two tested virtuals, the first has the
4.9.0-3-amd64 kernel and the second the
4.10.17-2-pve kernel.
JP
On 7.11.2017 10:42, Wido
I migrated virtual to my second node which is running
qemu-kvm version 1:2.1+dfsg-12+deb8u6 (from debian oldstable)
the same situation - frozen after approx. 30-40 seconds, when
"libceph: osd6 down" appeared in syslog (not before).
Also, my other virtual on the first node froze at the same time.
Bot
If you are seeing this w/ librbd and krbd, I would suggest trying a
different version of QEMU and/or different host OS since loss of a disk
shouldn't hang it -- only potentially the guest OS.
On Tue, Nov 7, 2017 at 5:17 AM, Jan Pekař - Imatic
wrote:
I'm calling kill -STOP to simulate the behavior that occurred when one
ceph node ran out of memory. The processes were not killed, but were
somehow suspended/unresponsive (they couldn't create new threads etc.),
and that caused all virtuals (on other nodes) to hang.
I decided to simulate it with kill -STOP.
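The kill -STOP simulation can be reproduced in a few lines; this sketch stops a stand-in process (a plain sleep, not a real ceph-osd) and checks that the kernel reports it as stopped:

```python
import os
import signal
import subprocess
import time

# Spawn a stand-in for a ceph daemon (any long-running process behaves
# the same for this demonstration).
proc = subprocess.Popen(["sleep", "60"])

# SIGSTOP freezes the process without killing it: it keeps its sockets
# open but stops answering, which is what a memory-starved OSD looked like.
os.kill(proc.pid, signal.SIGSTOP)
time.sleep(0.2)

# The third field of /proc/<pid>/stat is the process state (Linux only).
with open(f"/proc/{proc.pid}/stat") as f:
    state = f.read().split()[2]
print(state)  # 'T' = stopped

# SIGCONT ends the simulated silent failure; then clean up.
os.kill(proc.pid, signal.SIGCONT)
proc.terminate()
proc.wait()
```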
On 17-11-07 12:02 AM, Jan Pekař - Imatic wrote:
Hi,
I'm using Debian stretch with ceph 12.2.1-1~bpo80+1 and qemu
1:2.8+dfsg-6+deb9u3.
I'm running 3 nodes with 3 monitors and 8 osds, all on IPv6.
When I tested the cluster, I detected a strange and severe problem.
On the first node I'm run
> Op 7 november 2017 om 10:14 schreef Jan Pekař - Imatic :
>
>
> Additional info - it is not librbd related, I mapped disk through
> rbd map and it was the same - virtuals were stuck/frozen.
> It happened exactly when this appeared in my log:
>
Why aren't you using librbd? Is there a specific reason?
Additional info - it is not librbd related, I mapped disk through
rbd map and it was the same - virtuals were stuck/frozen.
It happened exactly when this appeared in my log:
Nov 7 10:01:27 imatic-hydra01 kernel: [2266883.493688] libceph: osd6 down
I can attach strace to the qemu process and I can get
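Attaching strace might look like this (the process name pattern is an assumption; adjust it to the actual QEMU binary):

```shell
# Trace the frozen QEMU process and all of its threads.
strace -f -p "$(pgrep -o qemu-system-x86)"
# In a hang, threads typically sit in futex()/ppoll() with nothing
# completing, which by itself doesn't show *where* librbd is stuck.
```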
If you could install the debug packages and get a gdb backtrace from all
threads, it would be helpful. librbd doesn't utilize any QEMU threads, so
even if librbd were deadlocked, the worst case that I would expect would be
your guest OS complaining about hung kernel tasks related to disk IO (since
the
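One way to capture the requested all-threads backtrace (package names are Debian-style -dbgsym assumptions; the pgrep pattern is a placeholder):

```shell
# Install debug symbols first, e.g.:
#   apt install qemu-system-x86-dbgsym librbd1-dbgsym librados2-dbgsym
# Then dump a backtrace of every thread without killing the process:
gdb -p "$(pgrep -o qemu-system-x86)" -batch \
    -ex "thread apply all bt" > qemu-backtrace.txt
```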