On 08.10.2015 at 16:34, Bob Peterson wrote:
> ----- Original Message -----
>>
>> On 08.10.2015 at 16:15, Digimer wrote:
>>> On 08/10/15 07:50 AM, J. Echter wrote:
>>>> Hi,
>>>>
>>>> I have a strange issue on CentOS 6.5.
>>>>
>>>> If I install a new VM on node1, it works well.
>>>>
>>>> If I install a new VM on node2, it gets stuck.
>>>>
>>>> The same happens if I do a dd if=/dev/zero of=/dev/DATEN/vm-test (on node2).
>>>>
>>>> On node1 it works:
>>>>
>>>> dd if=/dev/zero of=vm-test
>>>> dd: writing to 'vm-test': No space left on device
>>>> 83886081+0 records in
>>>> 83886080+0 records out
>>>> 42949672960 bytes (43 GB) copied, 2338.15 s, 18.4 MB/s
>>>>
>>>>
>>>> dmesg shows the following (while dd'ing on node2):
>>>>
>>>> INFO: task flush-253:18:9820 blocked for more than 120 seconds.
>>>> Not tainted 2.6.32-573.7.1.el6.x86_64 #1
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> <snip>
>>>> Any hint on fixing that?
>>> Every time I've seen this, it was because DLM was blocked. The most
>>> common cause of DLM blocking is a failed fence call. Do you have fencing
>>> configured *and* tested?
>>>
>>> If I were to guess, given the rather limited information you shared
>>> about your setup, the live migration consumed the network bandwidth,
>>> choking out corosync traffic, which caused the peer to be declared lost.
>>> That triggered a fence call, which failed and left locking hung (which
>>> is by design; better to hang than risk corruption).
>>>
>> Hi,
>>
>> fencing is configured and works.
>>
>> I re-checked it by typing
>>
>> echo c > /proc/sysrq-trigger
>>
>> into the node2 console.
>>
>> The machine is fenced and comes back up. But the problem persists.
> Hi,
>
> Can you send any more information about the crash? What makes you think
> it's gfs2 and not some other kernel component? Do you get any messages
> on the console? If not, perhaps you can temporarily disable or delay
> fencing long enough to get console messages.
> Regards,
>
> Bob Peterson
> Red Hat File Systems

Hi,
I just realized that gfs2 is probably the wrong candidate. I use clustered LVM (on DRBD), and I see this on a plain LVM volume that is not formatted with any filesystem. What logs would you need to identify the cause?
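For the first round of evidence (hung-task traces, DLM lockspace state, quorum, device-mapper stacking), something like the sketch below could be run on node2 while the dd is hung. This is only a sketch: the tool names dlm_tool, corosync-quorumtool, and dmsetup are assumptions about the installed cluster stack and may differ on a given CentOS 6 setup; each section is skipped with a note if the tool is missing.

```shell
#!/bin/sh
# Sketch only: collect first-round diagnostics for a hung cLVM/DLM volume.
# Availability of dlm_tool / corosync-quorumtool / dmsetup is an assumption;
# each section falls back to a placeholder line if the tool is not installed.
collect_cluster_diag() {
    out=${1:-cluster-diag.txt}
    {
        echo "== hung-task traces =="
        dmesg | grep -B1 -A8 "blocked for more than" || echo "(none found)"

        echo "== DLM lockspaces =="
        command -v dlm_tool >/dev/null 2>&1 && dlm_tool ls \
            || echo "(dlm_tool not installed)"

        echo "== corosync quorum state =="
        command -v corosync-quorumtool >/dev/null 2>&1 && corosync-quorumtool -s \
            || echo "(corosync-quorumtool not installed)"

        echo "== device-mapper tables (DRBD/LVM stacking) =="
        command -v dmsetup >/dev/null 2>&1 && dmsetup info -c \
            || echo "(dmsetup not installed)"
    } > "$out" 2>&1
    echo "wrote $out"
}
```

Running `collect_cluster_diag` on both nodes and diffing the results would also show whether only node2's view of the lockspaces or quorum differs.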