Re: [DRBD-user] oracle stop timeout while drbd resync
The environment has been recovered. I changed the Pacemaker stop-failure action to "echo c >/proc/sysrq-trigger" so that the node crashes, generates a vmcore and reboots whenever a resource stop fails. I am sure the cause is that the oracle stop action stalls during DRBD resync. All the devices use the same replication link.

Here is the "foreach bt" output from the vmcore analysis:

PID: 6870  TASK: 8802c89b84c0  CPU: 14  COMMAND: "oracle"
 #0 [880281bd79c8] schedule at 8145a489
 #1 [880281bd7b10] do_get_write_access at a02ae72d [jbd]
 #2 [880281bd7bd0] journal_get_write_access at a02ae899 [jbd]
 #3 [880281bd7bf0] __ext3_journal_get_write_access at a0327aec [ext3]
 #4 [880281bd7c20] ext3_reserve_inode_write at a0317ef3 [ext3]
 #5 [880281bd7c50] ext3_mark_inode_dirty at a0318f71 [ext3]
 #6 [880281bd7c90] ext3_dirty_inode at a03190f7 [ext3]
 #7 [880281bd7cb0] __mark_inode_dirty at 8117e7e0
 #8 [880281bd7cf0] update_time at 81170c96
 #9 [880281bd7d20] touch_atime at 81170efb
#10 [880281bd7d60] generic_file_aio_read at 810f9e22
#11 [880281bd7e20] aio_rw_vect_retry at 81199bb4
#12 [880281bd7e50] aio_run_iocb at 8119b6c2
#13 [880281bd7e80] io_submit_one at 8119c1f0
#14 [880281bd7ec0] do_io_submit at 8119c3d8
#15 [880281bd7f80] system_call_fastpath at 81464592
    RIP: 7f38ad4c36f7  RSP: 7fffc9ee77f0  RFLAGS: 00010206
    RAX: 00d1  RBX: 81464592  RCX: 000152012960
    RDX: 7fffc9ee77c0  RSI: 0001  RDI: 7f38af06
    RBP: 000152012960  R8: 7fffc9ee77b0  R9: 7fffc9ee7750
    R10: 7fffc9ee70d0  R11: 0206  R12: 0001553e0f80
    R13: 7f38ac571c60  R14: 7fffc9ee77c0  R15: 7fffc9ee77e0
    ORIG_RAX: 00d1  CS: 0033  SS: 002b
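In the backtrace above the oracle task is asleep in schedule(), called from do_get_write_access in the jbd module, so it appears to be waiting on the ext3 journal rather than in Oracle code. When a vmcore contains many tasks, a small script can pick out the ones that look like this. The following is only a sketch: it assumes the "foreach bt" output has been saved as text, and the function names parse_frames and blocked_in_journal are made up for illustration.

```python
import re

# A frame line in crash "bt" output looks like:
#   #1 [880281bd7b10] do_get_write_access at a02ae72d [jbd]
# The trailing [module] part is absent for functions built into the kernel.
FRAME_RE = re.compile(
    r"#(?P<idx>\d+)\s+\[(?P<sp>[0-9a-f]+)\]\s+(?P<func>\S+)\s+at\s+"
    r"(?P<addr>[0-9a-f]+)(?:\s+\[(?P<module>[^\]]+)\])?"
)


def parse_frames(bt_text):
    """Return a list of (index, function, module-or-None) tuples."""
    return [
        (int(m.group("idx")), m.group("func"), m.group("module"))
        for m in FRAME_RE.finditer(bt_text)
    ]


def blocked_in_journal(frames):
    """True if the task sleeps in schedule() called from a jbd/jbd2 function."""
    if len(frames) < 2:
        return False
    top, caller = frames[0], frames[1]
    return top[1] == "schedule" and caller[2] in ("jbd", "jbd2")
```

Splitting a saved "foreach bt" dump at each "PID:" line and running blocked_in_journal(parse_frames(chunk)) over the pieces would list the tasks stuck behind the journal.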
Re: [DRBD-user] oracle stop timeout while drbd resync
On Thu, Sep 1, 2016 at 9:02 AM, Igor Cicimov wrote:
> What OS is this on? Can you please paste the output of "crm status" (or
> pcs if you are on RHEL 7) and "crm_mon -Qrf1"?

Another thing I forgot: I find it odd that the sync for only one of the devices is stalled. Are they all using the same replication link? Can you see any networking issues or network card errors?

___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user
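Spotting the one stalled device among several is easier with the /proc/drbd text reduced to a few fields per minor. The sketch below assumes the DRBD 8.3 layout quoted in this thread (a "N: cs:... ro:... ds:..." line per device, followed by a "speed: X (Y) K/sec" line while syncing) and takes the first speed figure as the current rate. The function names are illustrative, not part of any DRBD tool.

```python
import re

# "0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent B r-"
DEVICE_RE = re.compile(r"^\s*(\d+):\s+cs:(\S+)\s+ro:(\S+)\s+ds:(\S+)")
# "finish: 10:44:09 speed: 208 (204) K/sec (stalled)"
SPEED_RE = re.compile(
    r"speed:\s*([\d,]+)\s*\(([\d,]+)\)\s*K/sec(\s*\(stalled\))?"
)


def parse_proc_drbd(text):
    """Map each device minor to its states, sync speed and stalled flag."""
    devices = {}
    current = None
    for line in text.splitlines():
        m = DEVICE_RE.match(line)
        if m:
            current = int(m.group(1))
            devices[current] = {
                "cs": m.group(2),
                "ro": m.group(3),
                "ds": m.group(4),
                "speed_kb": None,   # stays None when no resync is running
                "stalled": False,
            }
            continue
        m = SPEED_RE.search(line)
        if m and current is not None:
            devices[current]["speed_kb"] = int(m.group(1).replace(",", ""))
            devices[current]["stalled"] = m.group(3) is not None
    return devices


def stalled_devices(devices):
    """Return the sorted minors whose resync is reported as stalled."""
    return sorted(minor for minor, d in devices.items() if d["stalled"])
```

On a live node the input would be the contents of /proc/drbd; for the dump posted in this thread, only minor 3 carries the "(stalled)" marker.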
Re: [DRBD-user] oracle stop timeout while drbd resync
On 1 Sep 2016 9:02 am, "Igor Cicimov" wrote:
> What OS is this on? Can you please paste the output of "crm status" (or
> pcs if you are on RHEL 7) and "crm_mon -Qrf1"?

Also look for errors from crm in syslog, and check the Oracle log for errors too.
Re: [DRBD-user] oracle stop timeout while drbd resync
On 1 Sep 2016 1:16 am, "Mia Lueng" wrote:
> Yes, Oracle and DRBD are running under Pacemaker, in plain
> primary/secondary mode. I stopped the oracle resource while DRBD was
> resyncing, and oracle hung.

What OS is this on? Can you please paste the output of "crm status" (or pcs if you are on RHEL 7) and "crm_mon -Qrf1"?
Re: [DRBD-user] oracle stop timeout while drbd resync
Yes, Oracle and DRBD are running under Pacemaker, in plain primary/secondary mode. I stopped the oracle resource while DRBD was resyncing, and oracle hung.

2016-08-31 14:38 GMT+08:00 Igor Cicimov:
> Could you provide more details about the "cluster" you are using? Do you
> have DRBD under the control of a CRM such as Pacemaker? If so, are you
> perhaps running DRBD in dual-primary mode? And when does this state occur,
> and under what conditions (e.g. after restarting a node)?
Re: [DRBD-user] DRBD9: full-mesh and managed resources
On 31/08/2016 15:06, Lars Ellenberg wrote:
> Instead of bridging,
> explicit routes could be an other option.
> ip route add .../.. dev ...
>
> Lars

Already tried, and it didn't work for me. I guess that if there are two interfaces with the same IP, the DRBD processes will listen on only one of them. Maybe I'm wrong.

rob
Re: [DRBD-user] DRBD9: full-mesh and managed resources
On Mon, Aug 22, 2016 at 10:43:18AM +0200, Roberto Resoli wrote:
> On 18/08/2016 14:03, Veit Wahlich wrote:
> > On Thursday, 18.08.2016, 12:33 +0200, Roberto Resoli wrote:
> >> On 18/08/2016 10:09, Adam Goryachev wrote:
> >>> I can't comment on the DRBD related portions, but can't you add both
> >>> interfaces on each machine to a single bridge, and then configure the IP
> >>> address on the bridge. Hence each machine will only have one IP address,
> >>> and the other machines will use their dedicated network to connect to
> >>> it. I would assume the overhead of the bridge inside the kernel would be
> >>> minimal, but possibly not, so it might be a good idea to test it out.
> >>
> >> Very clever suggestion!
> >>
> >> Many thanks, will try and report.
> >
> > If you try this, take care to enable STP on the bridges, or this will
> > create loops.
>
> Yes, this worked immediately as expected.
>
> > Also STP will give you redundancy in case a link breaks and will try to
> > determine the shortest path between nodes.
>
> I confirm. With three nodes and three links, STP of course blocks one of
> the three links, with the root bridge forwarding traffic between the
> other two.
>
> It is possible to control which bridge becomes root using the parameter
> "bridgeprio" of brctl.
>
> > But the shortest link is not guaranteed. Especially after recovery from
> > a network link failure.
> > You might want to monitor each node for the shortest path.
>
> Using STP of course has the side effect of not using one of the three
> links (it is the price to pay for failover).
>
> I tried disabling STP while at the same time blocking (with a simple
> ebtables rule) forwarding through the bridge, in order to avoid
> loops/broadcast storms. In the resulting topology every link carries
> only the traffic of the two nodes it connects (at the expense of having
> no failover).
>
> It is very handy to monitor that everything is working correctly using:
>
> watch brctl showstp
>
> and
>
> watch brctl showmacs
>
> I post here the configuration I ended up using, for reference
> (I put it in a "drbd-interfaces" file, referenced in
> "/etc/network/interfaces" using the "source" directive):
>
> ===
> auto drbdbr
> iface drbdbr inet static
>     address
>     netmask 255.255.255.0
>     bridge_ports eth2 eth3
>     bridge_stp off
>     bridge_ageing 30
>     bridge_fd 5
>     # Only with stp on
>     # node1 and node2 are preferred
>     #bridge_bridgeprio 1000
>     # Only with stp off
>     pre-up ifconfig eth2 mtu 9000 && ifconfig eth3 mtu 9000
>     up ebtables -I FORWARD --logical-in drbdbr -j DROP
>     down ebtables -D FORWARD --logical-in drbdbr -j DROP
> ==

Instead of bridging, explicit routes could be another option.

    ip route add .../.. dev ...

Lars
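To make the explicit-routes idea concrete, here is a small sketch that prints one host route per directly attached peer for a three-node full mesh. The node names, the 10.0.0.x addresses and the mapping of eth2/eth3 to peers are all hypothetical, chosen only for illustration. The elided arguments in the command above are not known, and whether DRBD then listens and connects as needed is not settled in this thread.

```python
# Hypothetical replication address of each node (one address per node).
ADDRESSES = {
    "node1": "10.0.0.1",
    "node2": "10.0.0.2",
    "node3": "10.0.0.3",
}

# Hypothetical wiring: which local interface is cabled to which peer.
LINKS = {
    "node1": {"node2": "eth2", "node3": "eth3"},
    "node2": {"node1": "eth2", "node3": "eth3"},
    "node3": {"node1": "eth2", "node2": "eth3"},
}


def route_commands(node):
    """Return the 'ip route add' commands to run on the given node."""
    return [
        f"ip route add {ADDRESSES[peer]}/32 dev {iface}"
        for peer, iface in sorted(LINKS[node].items())
    ]


if __name__ == "__main__":
    for name in sorted(LINKS):
        print(f"# on {name}")
        for command in route_commands(name):
            print(command)
```

Each /32 route pins the traffic for one peer to the cable that reaches it directly, which is the same per-link separation the ebtables rule achieves in the bridged setup, again without failover.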
Re: [DRBD-user] oracle stop timeout while drbd resync
On Wed, Aug 31, 2016 at 3:49 PM, Mia Lueng wrote:
> Hi:
> I have a cluster with four DRBD devices. I found that stopping oracle
> timed out while DRBD was in resync state.
> The oracle processes are blocked as follows:
>
> oracle 6869 6844 0.0 0.0 71424 12616 ? S 16:28 00:00:00 pipe_wait /oracle/app/oracle/dbhome_1/bin/sqlplus @/tmp/ora_ommbb_shutdown.sql
> oracle 6870 6869 0.0 0.1 4431856 26096 ? Ds 16:28 00:00:00 get_write_access oracleommbb (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
>
> drbd state
>
> 2016-08-30 16:33:32 Dump [/proc/drbd] ...
> =
> version: 8.3.16 (api:88/proto:86-97)
> GIT-hash: bbf851ee755a878a495cfd93e1a76bf90dc79442 Makefile.in build by drbd@build 2012-06-07 16:03:04
>  0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent B r-
>     ns:2777568 nr:0 dw:492604 dr:3305833 al:4761 bm:439 lo:31 pe:613 ua:0 ap:31 ep:1 wo:d oos:4144796
>     [==>.] sync'ed: 35.7% (4044/6280)M
>     finish: 0:10:19 speed: 6,680 (3,664) K/sec
>  1: cs:SyncSource ro:Secondary/Secondary ds:UpToDate/Inconsistent B r-
>     ns:3709600 nr:0 dw:854764 dr:7632085 al:7689 bm:3401 lo:38 pe:3299 ua:38 ap:0 ep:1 wo:d oos:6204676
>     [===>] sync'ed: 41.5% (6056/10340)M
>     finish: 0:22:14 speed: 4,640 (10,016) K/sec
>  2: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent B r-
>     ns:3968883 nr:0 dw:127937 dr:5179641 al:190 bm:304 lo:1 pe:139 ua:0 ap:7 ep:1 wo:d oos:2124792
>     [>...] sync'ed: 66.3% (2072/6144)M
>     finish: 0:06:12 speed: 5,692 (6,668) K/sec
>  3: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent B r-
>     ns:89737 nr:0 dw:439073 dr:2235186 al:724 bm:35 lo:0 pe:45 ua:0 ap:7 ep:1 wo:d oos:8131104
>     [>] sync'ed: 1.6% (7940/8064)M
>     finish: 10:44:09 speed: 208 (204) K/sec (stalled)
>
> Is this a known bug, and is it fixed in a later version?

Could you provide more details about the "cluster" you are using? Do you have DRBD under the control of a CRM such as Pacemaker? If so, are you perhaps running DRBD in dual-primary mode? And when does this state occur, and under what conditions (e.g. after restarting a node)?
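The process listing in the original post shows the shadow process in state "D" (uninterruptible sleep) with get_write_access as its wait channel. A listing of that kind can be reproduced without ps by reading /proc directly. The sketch below assumes a Linux /proc layout; the function names are made up for illustration, and on some kernels /proc/<pid>/wchan is unreadable or reports only 0.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The command name sits in parentheses and may itself contain spaces
    or parentheses, so it is delimited by the first "(" and last ")".
    """
    lpar = stat_text.index("(")
    rpar = stat_text.rindex(")")
    pid = int(stat_text[:lpar])
    comm = stat_text[lpar + 1:rpar]
    state = stat_text[rpar + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return (pid, comm, wchan) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state != "D":
            continue
        try:
            with open(os.path.join(base, "wchan")) as fh:
                wchan = fh.read().strip()
        except OSError:
            wchan = "?"
        found.append((pid, comm, wchan))
    return found


if __name__ == "__main__":
    for pid, comm, wchan in uninterruptible_tasks():
        print(pid, comm, wchan)
```

Running this in a loop while the stop operation is hanging would show whether the same process stays in state D the whole time or only passes through it briefly.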