On 26/02/2019 07:32, Jan Friesse wrote:
> Edwin
>> Török wrote:
>>> Setup: 16 CentOS 7.6 VMs, 4 vCPUs, 4GiB RAM running on XenServer 7.6
>>> (Xen 4.7.6)
>>
>> 2 vCPUs makes this a lot easier to reproduce the lost network
>> connectivity/fencing.
kernel-4.4.52-4.0.12.x86_64.rpm (XenServer Lima)
kernel-4.19.19-5.0.1.x86_64.rpm (XenServer master)
The updated repro steps are:
On 19/02/2019 16:26, Edwin Török wrote:
> On 18/02/2019 18:27, Edwin Török wrote:
> Setup: 16 CentOS 7.6 VMs, 4 vCPUs, 4GiB RAM running on XenServer 7.6
> (Xen 4.7.6)
On 20/02/2019 23:47, Jan Pokorný wrote:
> On 20/02/19 21:16 +0100, Klaus Wenninger wrote:
>> On 02/20/2019 08:51 PM, Jan Pokorný wrote:
>>> On 20/02/19 17:37 +0000, Edwin Török wrote:
>>>> strace for the situation described below (corosync 95%, 1
>>>>
On 20/02/2019 13:08, Jan Friesse wrote:
> Edwin Török wrote:
>> On 20/02/2019 07:57, Jan Friesse wrote:
>>> Edwin,
>>>>
>>>>
>>>> On 19/02/2019 17:02, Klaus Wenninger wrote:
>>>>> On 02/19/2019 05:41 PM, Edwin Török wrote:
On 20/02/2019 12:44, Jan Pokorný wrote:
> On 19/02/19 16:41 +0000, Edwin Török wrote:
>> Also noticed this: [ 5390.361861] crmd[12620]: segfault at 0 ip
>> 7f221c5e03b1 sp 7ffcf9cf9d88 error 4 in
>> libc-2.17.so[7f221c554000+1c2000] [ 5390.361918] Code: b8 00 00
On 20/02/2019 07:57, Jan Friesse wrote:
> Edwin,
>>
>>
>> On 19/02/2019 17:02, Klaus Wenninger wrote:
>>> On 02/19/2019 05:41 PM, Edwin Török wrote:
>>>> On 19/02/2019 16:26, Edwin Török wrote:
>>>>> On 18/02/2019 18:27, Edwin Török wrote:
On 19/02/2019 17:02, Klaus Wenninger wrote:
> On 02/19/2019 05:41 PM, Edwin Török wrote:
>> On 19/02/2019 16:26, Edwin Török wrote:
>>> On 18/02/2019 18:27, Edwin Török wrote:
>>>> Did a test today with CentOS 7.6 with upstream kernel and with
>>>>
On 19/02/2019 16:26, Edwin Török wrote:
> On 18/02/2019 18:27, Edwin Török wrote:
>> Did a test today with CentOS 7.6 with upstream kernel and with
>> 4.20.10-1.el7.elrepo.x86_64 (tested both with upstream SBD, and our
>> patched [1] SBD) and was not able to reproduce
On 18/02/2019 18:27, Edwin Török wrote:
> Did a test today with CentOS 7.6 with upstream kernel and with
> 4.20.10-1.el7.elrepo.x86_64 (tested both with upstream SBD, and our
> patched [1] SBD) and was not able to reproduce the issue yet.
I was able to finally reproduce this using only
On 18/02/2019 15:49, Klaus Wenninger wrote:
> On 02/18/2019 04:15 PM, Christine Caulfield wrote:
>> On 15/02/2019 16:58, Edwin Török wrote:
>>> On 15/02/2019 16:08, Christine Caulfield wrote:
>>>> On 15/02/2019 13:06, Edwin Török wrote:
>>>>> I tried
On 15/02/2019 16:08, Christine Caulfield wrote:
> On 15/02/2019 13:06, Edwin Török wrote:
>> I tried again with 'debug: trace', lots of process pause here:
>> https://clbin.com/ZUHpd
>>
>> And here is an strace running realtime prio 99, a LOT of epoll_wait and
On 15/02/2019 11:12, Christine Caulfield wrote:
> On 15/02/2019 10:56, Edwin Török wrote:
>> On 15/02/2019 09:31, Christine Caulfield wrote:
>>> On 14/02/2019 17:33, Edwin Török wrote:
>>>> Hello,
>>>>
>>>> We were testing corosync 2.4.3
On 15/02/2019 09:31, Christine Caulfield wrote:
> On 14/02/2019 17:33, Edwin Török wrote:
>> Hello,
>>
>> We were testing corosync 2.4.3/libqb 1.0.1-6/sbd 1.3.1/gfs2 on 4.19 and
>> noticed a fundamental problem with realtime priorities:
>> - corosync runs on CPU3
Hello,
We were testing corosync 2.4.3/libqb 1.0.1-6/sbd 1.3.1/gfs2 on 4.19 and
noticed a fundamental problem with realtime priorities:
- corosync runs on CPU3, and interrupts for the NIC used by corosync are
also routed to CPU3
- corosync runs with SCHED_RR, ksoftirqd does not (should it?), but
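One way to keep the NIC's interrupts off the CPU that corosync is pinned to is to compute an SMP affinity mask that excludes that CPU and write it to the IRQ's smp_affinity file. A minimal sketch, assuming the 4-vCPU setup above with corosync pinned to CPU 3 (the IRQ number 24 is hypothetical; the real one comes from /proc/interrupts):

```shell
# 4 vCPUs (0-3); corosync is pinned to CPU 3, so we exclude it.
NCPUS=4
COROSYNC_CPU=3

ALL_MASK=$(( (1 << NCPUS) - 1 ))                  # 0xf: all four CPUs
IRQ_MASK=$(( ALL_MASK & ~(1 << COROSYNC_CPU) ))   # 0x7: CPUs 0-2 only

printf '%x\n' "$IRQ_MASK"                         # prints: 7

# The resulting mask would then be written for each of the NIC's IRQs,
# e.g. (24 is a hypothetical IRQ number, run as root):
#   echo 7 > /proc/irq/24/smp_affinity
```

This only redirects hardware interrupt delivery; softirq processing (ksoftirqd) still runs on whichever CPU received the interrupt, which is why moving the IRQs away from corosync's CPU matters in the first place.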
Hello,
We've seen an issue in production where DLM 4.0.7 gets "stuck" and
unable to join more lockspaces. Other nodes in the cluster were able to
join new lockspaces, but not the one that node 1 was stuck on.
GFS2 was unaffected (the "stuck" lockspace was for a userspace control
daemon, but that's
On 30/07/18 08:24, Ulrich Windl wrote:
> Hi!
>
> We have a strange problem on one cluster node running Xen PV VMs (SLES11
> SP4): After updating the kernel and adding new SBD devices (to replace an old
> storage system), the system just seems to freeze.
Hi,
Which version of Xen are you using?
[Sorry for the long delay in replying, I was on vacation]
On 14/08/17 15:30, Klaus Wenninger wrote:
> If you have a disk you could use as shared-disk for sbd you could
> achieve a quorum-disk-like-behavior. (your package-versions
> look as if you are using RHEL-7.4)
Thanks for the suggestion,
On 14/08/17 13:46, Klaus Wenninger wrote:
> How does your /etc/sysconfig/sbd look like?
> With just that pcs-command you get some default-config with
> watchdog-only-support.
It currently looks like this:
SBD_DELAY_START=no
SBD_OPTS="-n cluster1"
SBD_PACEMAKER=yes
SBD_STARTMODE=always
Hi,
When setting up a cluster with just 1 node with auto-tie-breaker and
DLM, and incrementally adding more I got some unexpected fencing if the
2nd node doesn't join the cluster soon enough.
What I also found surprising is that if the cluster has ever seen 2
nodes, then turning off the
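For the single-node-growing-cluster scenario described above, the relevant corosync quorum settings live in corosync.conf. A minimal sketch (option values are illustrative; see votequorum(5) for the full semantics):

```
quorum {
    provider: corosync_votequorum

    # On an even split, the partition containing the node with the
    # lowest nodeid (by default) retains quorum.
    auto_tie_breaker: 1

    # Require all configured nodes to have been seen at least once
    # before quorum is first granted, which avoids a freshly started
    # single node claiming quorum (and fencing) before the 2nd node
    # has had a chance to join.
    wait_for_all: 1
}
```

With auto_tie_breaker alone, a lone node that considers itself the tie-breaker can become quorate immediately, which matches the unexpected fencing reported here; wait_for_all is the usual way to delay that until full membership has been observed once.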
19 matches