Re: [DRBD-user] drbdmanage quorum control

2017-10-04 Thread Jason Fitzpatrick
Hi,

While I understand the risks associated with forcing a single node
online by using the re-elect options, I am currently documenting a
pre-prod cluster and have to document destructive testing and recovery
procedures.

The situation I am trying to validate is one where we have a 3-node
cluster which spans datacenters, and one datacenter has gone off the
air, taking out 2 of the 3 nodes. The remaining node has, for whatever
reason, crashed / restarted, and we now need to get this node online
(post-cleanup tasks will be captured as part of the docs).

I know that I can simply copy the contents of /var/lib/drbd.d/ to
/etc/drbd.d/, do a quick rename, and then use drbdadm to bring the
resources online. But since I am provisioning all my DRBD resources
via drbdmanage, I would like to be able to force this service online.
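
For the documentation, the manual fallback described above could be
sketched roughly as follows. This is only an illustration under
assumptions: it assumes the files of interest end in .res, and the
"quick rename" is not spelled out in this thread, so it is passed in as
a hypothetical callable (rename) that defaults to keeping the name
unchanged. The function name stage_resource_files and the dry_run
switch are made up for this sketch.

```python
import shutil
from pathlib import Path


def stage_resource_files(src_dir, dst_dir, rename=lambda name: name, dry_run=True):
    """Plan (and optionally perform) copying *.res files from src_dir to dst_dir.

    `rename` is a hypothetical hook: the thread only says "a quick rename",
    so the actual naming scheme is left to the caller.
    Returns a list of (source_path, target_path) pairs.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    plan = []
    for res_file in sorted(src.glob("*.res")):
        target = dst / rename(res_file.name)
        plan.append((res_file, target))
        if not dry_run:
            # copy2 keeps timestamps and permissions where possible
            shutil.copy2(res_file, target)
    return plan


if __name__ == "__main__":
    # Dry run only: print what would be copied, change nothing.
    for source, target in stage_resource_files("/var/lib/drbd.d", "/etc/drbd.d"):
        print(f"would copy {source} -> {target}")
    # After a real copy, each resource would still have to be brought up
    # by hand, e.g. with `drbdadm up <resource>`.
```

Running it with dry_run=True first gives a list that can be pasted into
the recovery procedure before anything on the node is touched.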

I have tried the drbdmanage reelect (and --force-win) options but am
still unable to connect to the drbdmanaged process. (That said, I am
able to see all DRBD resources using drbdadm status.)

[root@node1 ~]# drbdmanage reelect --force-win
Operation completed successfully
unknown
[root@node1 ~]# drbdadm status
  .drbdctrl role:Primary
 volume:0 disk:UpToDate
 volume:1 disk:UpToDate
 node2.domain.name connection:Connecting
 node3.domain.name connection:Connecting

 resource-sda role:Secondary
 disk:UpToDate
 node2.domain.name connection:Connecting

[root@lpisscl0001 ~]# drbdmanage ping
pong
[root@node1 ~]# drbdmanage v
ERROR:dbus.proxies:Introspect error on :1.30:/interface:
dbus.exceptions.DBusException: org.freedesktop.DBus.Error.NoReply: Did
not receive a reply. Possible causes include: the remote application
did not send a reply, the message bus security policy blocked the
reply, the reply timeout expired, or the network connection was
broken.
Waiting for server: ...
Error: Startup not successful (no quorum? not *both* nodes up in a 2
node cluster?)
No resources defined
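
As a rough aid for the test write-up, the drbdadm status output above
can be summarised per resource. The sketch below only handles the
layout as pasted in this mail: resources separated by a blank line, and
peers reported with a connection: field. The function name
unconnected_peers is an assumption made for this example, not part of
any DRBD tooling.

```python
def unconnected_peers(status_text):
    """Map resource name -> {peer name: connection state}.

    Assumes `drbdadm status`-style text where each resource block starts
    with the resource name and blocks are separated by blank lines. Only
    peer lines carrying a `connection:` field are collected.
    """
    peers = {}
    resource = None
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            # A blank line ends the current resource block.
            resource = None
            continue
        if resource is None:
            # First non-blank line of a block names the resource.
            resource = fields[0]
            peers[resource] = {}
            continue
        for field in fields[1:]:
            if field.startswith("connection:"):
                peers[resource][fields[0]] = field.split(":", 1)[1]
    return peers


if __name__ == "__main__":
    sample = """\
  .drbdctrl role:Primary
 volume:0 disk:UpToDate
 volume:1 disk:UpToDate
 node2.domain.name connection:Connecting
 node3.domain.name connection:Connecting

 resource-sda role:Secondary
 disk:UpToDate
 node2.domain.name connection:Connecting
"""
    for name, states in unconnected_peers(sample).items():
        print(name, states)
```

For the output above this reports both peers of .drbdctrl and the one
peer of resource-sda as Connecting, i.e. node1 sees no other node.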


So, if I am correct, the (completely unsupported / do so at your own
risk) process to force access to the drbdmanaged resources in the
event of loss of quorum for the drbdmanaged process would be:

surviving node: node1
[root@node1 ~]# drbdmanage reelect --force-win
recover node2 / node3 from DR procedure / backups

post reintroduction of additional nodes:
restart drbdmanaged process on node1 / reboot node1

Thanks

Jay

-- 

"The only difference between saints and sinners is that every saint
has a past while every sinner has a future. "
— Oscar Wilde
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] drbdmanage quorum control

2017-10-04 Thread Jason Fitzpatrick
Thanks I will try that now

On 3 Oct 2017 12:05, "Yannis Milios"  wrote:

> I think you have to use 'drbdmanage reelect' command to reelect a new
> leader first.
>
> man drbdmanage-reelect
>
> Yannis


Re: [DRBD-user] drbdmanage quorum control

2017-10-03 Thread Yannis Milios
Thanks for clarifying this ...

Regards,
Yannis



Re: [DRBD-user] drbdmanage quorum control

2017-10-03 Thread Roland Kammerer
On Tue, Oct 03, 2017 at 12:05:50PM +0100, Yannis Milios wrote:
> I think you have to use 'drbdmanage reelect' command to reelect a new
> leader first.
> 
> man drbdmanage-reelect

In general that is a bad idea, and I regret that I exposed it as a
subcommand and did not hide it behind a
"--no-you-dont-want-that-unless-you-are-rck" where it then still asks you
to prove the Riemann hypothesis before continuing...

> On Mon, Oct 2, 2017 at 2:12 PM, Jason Fitzpatrick 
> wrote:
> 
> > Hi all
> >
> > I am trying to get my head around the quorum-control features within
> > drbdmanage.
> >
> > I have deliberately crashed my cluster, and spun up one node, and as
> > expected I am unable to get drbdmanage to start due to the lack of
> > quorum.
> >
> > I was under the impression that I should have been able to override
> > the quorum state and get the drbdmanaged process online using DBUS /
> > manually calling the service, but am drawing a blank..
> >
> > for the sake of this example it is a 2 node cluster node1 is online
> > and node2 is still powered off,
> >
> > [root@node1]# drbdmanage quorum-control --override ignore node2
> > Modifying quorum state of node 'node2':
> > Waiting for server: ...
> > Error: Startup not successful (no quorum? not *both* nodes up in a 2
> > node cluster?)
> > Error: Startup not successful (no quorum? not *both* nodes up in a 2
> > node cluster?)
> >
> > Any advice?

Bring back the second node. In two-node clusters that is the only clean
way to bring back the cluster. If you want quorum, get >=3 nodes.
Period. In two-node clusters both have to be up. "reelect" is a last
resort command for the absolute worst case to bring up a 2-node cluster
where only one node survived and the other one is gone beyond repair.
"reelect" with a forced win alters internal state to make that possible.
It does not revert that internal state if, for whatever reason, the
second node then shows up again. You would have to restart the "reelect"
node to get it back into a sane internal state.

tl;dr: If you want quorum: >=3 nodes. Don't use "reelect" to force wins.

Regards, rck
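
The node counts in this reply can be illustrated with a tiny
calculation. This is a sketch under an assumption: it uses a plain
strict-majority rule, which matches the statements above (both nodes
needed in a 2-node cluster, quorum possible from 3 nodes up) but is not
confirmed here as drbdmanage's actual internal logic. The name
has_quorum is made up for the example.

```python
def has_quorum(total_nodes, nodes_up):
    """Strict-majority rule (assumed for illustration): more than half must be up."""
    return nodes_up > total_nodes // 2


if __name__ == "__main__":
    for total, up in [(2, 1), (2, 2), (3, 1), (3, 2)]:
        print(f"{up} of {total} nodes up -> quorum: {has_quorum(total, up)}")
```

Under that rule a single surviving node never has quorum, neither in a
2-node nor in a 3-node cluster, while 2 of 3 nodes do.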


Re: [DRBD-user] drbdmanage quorum control

2017-10-03 Thread Yannis Milios
I think you have to use 'drbdmanage reelect' command to reelect a new
leader first.

man drbdmanage-reelect

Yannis



On Mon, Oct 2, 2017 at 2:12 PM, Jason Fitzpatrick 
wrote:

> Hi all
>
> I am trying to get my head around the quorum-control features within
> drbdmanage.
>
> I have deliberately crashed my cluster, and spun up one node, and as
> expected I am unable to get drbdmanage to start due to the lack of
> quorum.
>
> I was under the impression that I should have been able to override
> the quorum state and get the drbdmanaged process online using DBUS /
> manually calling the service, but am drawing a blank..
>
> for the sake of this example it is a 2 node cluster node1 is online
> and node2 is still powered off,
>
> [root@node1]# drbdmanage quorum-control --override ignore node2
> Modifying quorum state of node 'node2':
> Waiting for server: ...
> Error: Startup not successful (no quorum? not *both* nodes up in a 2
> node cluster?)
> Error: Startup not successful (no quorum? not *both* nodes up in a 2
> node cluster?)
>
> Any advice?
>
> Thanks
>
> Jay