Re: [DRBD-user] split brain detected when switching back to the 2node cluster from the DR node

2009-08-05 Thread Pierre LEBRECH
Hello,

Thank you for your answer.

Yes, I can use these options to handle split-brains.

But what I don't understand is why I get a split-brain in this scenario at all.

I don't think it is normal to get a split-brain when I manually switch the HA
services from node3 back to the 2node cluster.

At the end of step 2 (below), we can see that every node is connected and
UpToDate (from node1's point of view).

It is only when I start heartbeat on node1 and node2 (the normal procedure
after the switchover) that I get this split-brain.

Something is wrong somewhere, but what?




guohuai li wrote:
> Hi,
>
> There are several options, such as the ones below, in /etc/drbd.conf.
> You may want to study them.
>
> My DRBD is 8.3.0.
> It works well.
>
> edward
>
> #after-sb-0pri disconnect;
> after-sb-0pri "discard-older-primary";
>
> #after-sb-1pri disconnect;
> after-sb-1pri discard-secondary;
>
>> Date: Tue, 4 Aug 2009 18:58:15 +0200
>> From: pierre.lebr...@laposte.net
>> To: drbd-user@lists.linbit.com
>> Subject: [DRBD-user] split brain detected when switching back to the
>> 2node cluster from the DR node
>>
>> Hello,
>>
>> I always get a split brain when I switch the HA services back to the
>> 2node cluster from my DR node.
>>

STEP 1 :

>> Here are the steps I follow :
>>
>> - HA services are on the DR node
>> - I stop these HA services
>> - I umount the data
>>
>> The state of DRBD on node3 is as follows:
>>
>> --
>> version: 8.3.2 (api:88/proto:86-90)
>> GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by r...@hcns1, 2009-08-04 09:41:09
>>
>> 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r
>> ns:0 nr:34083396 dw:34084032 dr:68168163 al:12 bm:2094 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:208
>> --
>>

STEP 2 :

>> Then, on node1 :
>>
>> - drbdadm primary r0
>> - I start the HA IP
>> - drbdadm --stacked up r0-U
>>
>> At this point, everything is OK. Here is the output of cat /proc/drbd:
>>
>> --
>> version: 8.3.2 (api:88/proto:86-90)
>> GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by r...@hans1, 2009-08-04 09:43:39
>> 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r
>> ns:13706 nr:174 dw:15096 dr:102401147 al:30 bm:2170 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>> 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r
>> ns:0 nr:244 dw:244 dr:416 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
>> --
>>
>> Then I set everything back (reset):
>>

STEP 3 :

>> on node1 :
>>
>> - drbdadm --stacked down r0-U
>> - drbdadm secondary r0
>> - I stop the HA IP
>>
>> The state on node1 is as follows:
>>
>> --
>> version: 8.3.2 (api:88/proto:86-90)
>> GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by r...@hans1, 2009-08-04 09:43:39
>> 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r
>> ns:13707 nr:174 dw:15097 dr:102401147 al:30 bm:2186 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>> 1: cs:Unconfigured
>> --
>>
>> On node3, I run this command to reset the state:
>>
>> - drbdadm secondary r0-U
>>

STEP 4 :

>> Then, on node1 and node2, I start heartbeat normally.
>>
>>
>>
>> Well, each time I follow these steps, node3 gets a split-brain.
>>
>> Where is the problem?
>>
>>
>>
>>
>> Context: 3-node cluster, every node connected, HA services on node1,
>> DRBD version 8.3.2 on Linux 2.6.30.
>>
>> ___
>> drbd-user mailing list
>> drbd-user@lists.linbit.com
>> http://lists.linbit.com/mailman/listinfo/drbd-user



Re: [DRBD-user] split brain detected when switching back to the 2node cluster from the DR node

2009-08-04 Thread guohuai li

Hi,
 
There are several options, such as the ones below, in /etc/drbd.conf.
You may want to study them.
 
My DRBD is 8.3.0.
It works well.
 
edward
 
#after-sb-0pri disconnect;
after-sb-0pri "discard-older-primary";
 
#after-sb-1pri disconnect;
after-sb-1pri discard-secondary; 
 
> Date: Tue, 4 Aug 2009 18:58:15 +0200
> From: pierre.lebr...@laposte.net
> To: drbd-user@lists.linbit.com
> Subject: [DRBD-user] split brain detected when switching back to the 2node 
> cluster from the DR node
> 
> Hello,
> 
> I always get a split brain when I switch the HA services back to the 2node 
> cluster from my DR node.
> 
> Here are the steps I follow :
> 
> - HA services are on the DR node
> - I stop these HA services
> - I umount the data
> 
> The state of DRBD on node3 is as follows:
> 
> --
> version: 8.3.2 (api:88/proto:86-90)
> GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by r...@hcns1, 2009-08-04 09:41:09
> 
> 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r
> ns:0 nr:34083396 dw:34084032 dr:68168163 al:12 bm:2094 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:208
> --
> 
> Then, on node1 :
> 
> - drbdadm primary r0
> - I start the HA IP
> - drbdadm --stacked up r0-U
> 
> At this point, everything is OK. Here is the output of cat /proc/drbd:
> 
> --
> version: 8.3.2 (api:88/proto:86-90)
> GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by r...@hans1, 2009-08-04 09:43:39
> 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r
> ns:13706 nr:174 dw:15096 dr:102401147 al:30 bm:2170 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
> 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r
> ns:0 nr:244 dw:244 dr:416 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
> --
> 
> Then I set everything back (reset):
> 
> on node1 :
> 
> - drbdadm --stacked down r0-U
> - drbdadm secondary r0
> - I stop the HA IP
> 
> The state on node1 is as follows:
> 
> --
> version: 8.3.2 (api:88/proto:86-90)
> GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by r...@hans1, 2009-08-04 09:43:39
> 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r
> ns:13707 nr:174 dw:15097 dr:102401147 al:30 bm:2186 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
> 1: cs:Unconfigured
> --
> 
> On node3, I run this command to reset the state:
> 
> - drbdadm secondary r0-U
> 
> Then, on node1 and node2, I start heartbeat normally.
> 
> 
> 
> Well, each time I follow these steps, node3 gets a split-brain.
> 
> Where is the problem?
> 
> 
> 
> 
> Context: 3-node cluster, every node connected, HA services on node1, DRBD 
> version 8.3.2 on Linux 2.6.30.
> 


[DRBD-user] split brain detected when switching back to the 2node cluster from the DR node

2009-08-04 Thread Pierre LEBRECH
Hello,

I always get a split brain when I switch the HA services back to the 2node 
cluster from my DR node.

Here are the steps I follow :

- HA services are on the DR node
- I stop these HA services
- I umount the data

The state of DRBD on node3 is as follows:

--
version: 8.3.2 (api:88/proto:86-90)
GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by r...@hcns1, 2009-08-04 09:41:09

 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r
ns:0 nr:34083396 dw:34084032 dr:68168163 al:12 bm:2094 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:208
--
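
For readers who want to inspect the counters line in the node3 output above from a script, here is a minimal sketch in Python. It assumes the key:value layout shown in this message (DRBD 8.3.x); parse_counters is a hypothetical helper name, not part of DRBD. As far as I understand the DRBD documentation, oos reports the amount of storage currently out of sync, so a non-zero value on a node waiting for its peer may be worth noting before switching over.

```python
def parse_counters(line):
    """Turn 'ns:0 nr:34083396 ... oos:208' into a dict.

    Numeric values become ints; anything else (e.g. wo:f) stays a string.
    """
    counters = {}
    for token in line.split():
        key, sep, value = token.partition(":")
        if not sep:
            continue  # skip tokens that are not key:value pairs
        counters[key] = int(value) if value.isdigit() else value
    return counters


# Counters line copied from the node3 output in this message.
LINE = ("ns:0 nr:34083396 dw:34084032 dr:68168163 al:12 bm:2094 "
        "lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:208")

if __name__ == "__main__":
    counters = parse_counters(LINE)
    print("out of sync:", counters["oos"])
```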

Then, on node1 :

- drbdadm primary r0
- I start the HA IP
- drbdadm --stacked up r0-U

At this point, everything is OK. Here is the output of cat /proc/drbd:

--
version: 8.3.2 (api:88/proto:86-90)
GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by r...@hans1, 2009-08-04 09:43:39
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r
ns:13706 nr:174 dw:15096 dr:102401147 al:30 bm:2170 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r
ns:0 nr:244 dw:244 dr:416 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
--
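
The "OK" reading of the output above, i.e. each device Connected with both disks UpToDate, can be checked mechanically. The following is only a sketch in Python, assuming the /proc/drbd device-line layout shown in this message (DRBD 8.3.x); parse_proc_drbd and is_healthy are hypothetical helper names, not part of DRBD.

```python
import re

# Matches a device line such as
# " 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r".
# The ro:/ds: part is optional so "1: cs:Unconfigured" also matches.
DEVICE_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)"
    r"(?:\s+ro:(?P<ro_local>[^/\s]+)/(?P<ro_peer>\S+)"
    r"\s+ds:(?P<ds_local>[^/\s]+)/(?P<ds_peer>\S+))?"
)


def parse_proc_drbd(text):
    """Return {minor: fields} for each device line found in the text."""
    devices = {}
    for line in text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            fields = match.groupdict()
            devices[int(fields.pop("minor"))] = fields
    return devices


def is_healthy(dev):
    """True when the device is connected and both disks are UpToDate."""
    return (dev["cs"] == "Connected"
            and dev["ds_local"] == "UpToDate"
            and dev["ds_peer"] == "UpToDate")


# Device lines copied from the node1 output in this message.
SAMPLE = """\
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r
 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r
"""

if __name__ == "__main__":
    for minor, dev in sorted(parse_proc_drbd(SAMPLE).items()):
        print(minor, dev["cs"], dev["ro_local"], dev["ro_peer"], is_healthy(dev))
```

On a live node the text would come from reading /proc/drbd instead of SAMPLE. Note that this check says nothing about roles: device 1 above is healthy by this definition while the peer is still Primary.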

Then I set everything back (reset):

on node1 :

- drbdadm --stacked down r0-U
- drbdadm secondary r0
- I stop the HA IP

The state on node1 is as follows:

--
version: 8.3.2 (api:88/proto:86-90)
GIT-hash: dd7985327f146f33b86d4bff5ca8c94234ce840e build by r...@hans1, 2009-08-04 09:43:39
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r
ns:13707 nr:174 dw:15097 dr:102401147 al:30 bm:2186 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Unconfigured
--

On node3, I run this command to reset the state:

- drbdadm secondary r0-U

Then, on node1 and node2, I start heartbeat normally.



Well, each time I follow these steps, node3 gets a split-brain.

Where is the problem?




Context: 3-node cluster, every node connected, HA services on node1, DRBD 
version 8.3.2 on Linux 2.6.30.
