Re: [Veritas-ha] LLT heartbeat redundancy
LLT does support Layer 2 link aggregation and can work over trunks/bonds/aggregated links as long as a single device is presented to it:

  Linux - Bonding
  Solaris - Sun Trunking (now link aggregation/dladm in Solaris 10)
  HP-UX - Auto Port Aggregation
  AIX - EtherChannel

IPMP is at Layer 3, while Bonding and the others are at Layer 2 - hence we can't compare the two. It would be better to compare Sun Trunking and Linux Bonding.

-----Original Message-----
From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Imri Zvik
Sent: Wednesday, May 13, 2009 2:38 AM
To: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

On Wednesday 13 May 2009 06:04:38 John Cronin wrote:

Getting slightly off topic, but still somewhat relevant. Linux has many flavors of Ethernet bonding. To be sure, link aggregation resulting in increased bandwidth is generally supported only on a single switch. However, Linux does have an active-passive bonding mode that is specifically intended for HA solutions. AIX has a similar configuration with the unfortunate name of EtherChannel Network Backup Interface - it does NOT rely on Cisco EtherChannel to work. Both of these create a virtual NIC that hides the complexity, making the interface group appear to be a single NIC. You don't need a bunch of switch link aggregation magic (802.3ad or EtherChannel) to implement active-passive NIC failover in this manner.

In my experience, both Linux and AIX Ethernet bonding are easier to use than Sun IPMP, and they are also far more reliable. I have a lot of experience with all three of these, and in my opinion IPMP is the worst - I have experienced many false failures with IPMP, and I have had to do a bunch of silliness with static routes to make it work in certain environments (prior to the new link-based IPMP - but that has issues of its own too).
I wish Sun would add an active-passive capability to their new link aggregation feature (dladm) that works across switches. If they did, they would have the same capabilities as Linux and AIX network bonding, with similar ease of use. It should be fairly trivial to implement. The one advantage that IPMP has in active-active mode (i.e. NOT link-based) is that it can detect IP connectivity issues (via ping - not just Ethernet link detection) on all NICs in an interface group. However, it is usually issues with the IP connectivity checking that cause all my problems with IPMP, and I would gladly trade it for a simple link-based virtual solution that looks like a single link to me.

The Linux bonding driver can do an ARP test in order to detect uplink failures. This is not a Layer 3 check, but in most solutions and configurations it is a suitable replacement.

I have never used Linux Ethernet bonding or AIX EtherChannel Network Backup Interface for VCS heartbeats, but I am pretty certain they would both work fine. That said, I am not sure they would provide any significant benefit over a traditional VCS heartbeat network configuration using the same number of real NICs.

The benefit is that with such a bonding method, you can survive the failure scenario I've described in my first email :) It is a fact that Symantec understands that, as they are trying to solve it internally in LLT :)

-- John Cronin

___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
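For readers unfamiliar with the active-passive mode and ARP test mentioned above, here is a minimal sketch of a Linux active-backup bond with ARP monitoring. The interface names (eth2/eth3), bond name, and ARP target address are illustrative assumptions, not values from this thread; this is 2.6-era ifenslave syntax, not a tested config:

```shell
# Sketch only - eth2/eth3 and 192.168.10.1 are placeholder values.
# mode=active-backup: one slave carries traffic, the other is a hot
# standby; no switch-side 802.3ad configuration is required.
# arp_interval/arp_ip_target: probe an uplink address by ARP every
# 1000 ms, so upstream failures are caught, not just local link loss.
modprobe bonding mode=active-backup arp_interval=1000 arp_ip_target=192.168.10.1
ifconfig bond0 up
ifenslave bond0 eth2 eth3
```

The resulting bond0 device then looks like a single NIC to anything layered on top of it, which is the property that lets LLT (or a service IP) use it without knowing about the underlying redundancy.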
Re: [Veritas-ha] LLT heartbeat redundancy
Yes, the jeopardy detection etc. has not changed. We've just added support for this new type of topology in which the various LLT links are interconnected (crosslinks).

-----Original Message-----
From: John Cronin [mailto:jsc3...@gmail.com]
Sent: Friday, May 08, 2009 6:58 AM
To: Sandeep Agarwal (MTV)
Cc: Jim Senicka; Imri Zvik; veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

This was certainly news to me. In this full-mesh heartbeat network, do we still go into jeopardy if the network links are lost relatively slowly (e.g. if one link on a node is down for more than 16 seconds by default)?

On Sun, May 3, 2009 at 1:07 PM, Sandeep Agarwal (MTV) sandeep_agarw...@symantec.com wrote:

From 5.0MP3 onwards we do support cross-links. In your example, if you had a cable connecting sw1 and sw2, then the failure that you described would be handled and LLT would still have 1 valid link between node 1 and node 4.

-----Original Message-----
From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Jim Senicka
Sent: Sunday, May 03, 2009 9:23 AM
To: Imri Zvik
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

LLT is designed to use jeopardy to detect the difference between a single-link failure and a dual-link failure in most situations. Having a single mesh may remove this capability. Let me check on this with engineering and see if we have any more up-to-date recommendations.

-----Original Message-----
From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Imri Zvik
Sent: Sunday, May 03, 2009 12:18 PM
To: Jim Senicka
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

On Sunday 03 May 2009 19:03:16 Jim Senicka wrote:

This is not a limitation, as you had two independent failures. Bonding would remove the ability to discriminate between a link and a node failure.
I didn't understand this one - with bonding I can maintain a full-mesh topology. No matter which one of the links fails, if a node still has at least one active link, LLT will still be able to see all the other nodes. This achieves greater HA than without the bonding.

My feeling is in the scenario you describe, VCS is operating properly, and it is not a limitation.

Of course it is operating properly - that's how it was designed to work :) I'm just saying that the cluster could be more redundant if it wasn't designed that way :)

If you have issues with port or cable failures, add a low-pri connection on a third network.

___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
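The low-pri connection suggested above corresponds to LLT's link-lowpri directive in /etc/llttab. A hedged sketch (node name, device names, and MAC addresses are placeholders, not taken from this thread):

```shell
# /etc/llttab sketch - two high-priority heartbeat links plus one
# low-priority link on a third (typically public) network.
set-node node1
set-cluster 0
link eth2 eth-00:21:5e:1f:0b:b0 - ether - -
link eth3 eth-00:21:5e:1f:0b:b1 - ether - -
link-lowpri eth0 eth-00:21:5e:1f:0b:b2 - ether - -
```

A link-lowpri link carries heartbeats at a reduced rate and no cluster status traffic under normal conditions, but it keeps nodes visible to each other (avoiding jeopardy or split-brain decisions) if both high-priority links are lost.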
Re: [Veritas-ha] LLT heartbeat redundancy
The lost hb messages are due to problems in Layer 2 connectivity. In the cross-linked case, as long as 1 link is up on each node, LLT should work fine (after a 2s glitch). The llttab file is fine.

-----Original Message-----
From: Imri Zvik [mailto:im...@inter.net.il]
Sent: Wednesday, May 06, 2009 1:44 AM
To: Sandeep Agarwal (MTV)
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

On Wednesday 06 May 2009 00:55:23 you wrote:

Nothing to add to /etc/llttab. If you have 5.0MP3 then it should work.

So considering the following llttab file:

  set-node rac-node1
  set-cluster 0
  link eth2 eth-00:21:5e:1f:0b:b0 - ether - -
  link eth3 eth-00:21:5e:1f:0b:b1 - ether - -

and given that eth2 and eth3 are on the same Layer 2 subnet (i.e. they see each other), I shouldn't get the "lost hb" messages?

___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
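As an editorial aside (not from the thread): when diagnosing "lost hb" warnings like the above, the per-link, per-peer state that LLT maintains can be inspected on each node. A hedged sketch, assuming a standard VCS installation with the LLT utilities in the default path:

```shell
# Verbose link status: for each peer node, every configured link
# (eth2, eth3 in the llttab above) should report UP. A link stuck in
# DOWN on one peer only usually points at a Layer 2 path problem
# rather than a bad llttab.
lltstat -nvv
```

Comparing the output on both endpoints of a flapping link narrows the fault to a NIC, cable, or switch port before touching the LLT configuration.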
Re: [Veritas-ha] LLT heartbeat redundancy
Unfortunately we don't have any more documentation for this feature. Basically, LLT works the same way, except that now it can handle the failure you described, and now we can connect the two switches that the individual LLT links are configured on.

-----Original Message-----
From: Imri Zvik [mailto:im...@inter.net.il]
Sent: Sunday, May 03, 2009 11:26 PM
To: Sandeep Agarwal (MTV)
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

On Sunday 03 May 2009 20:07:29 Sandeep Agarwal (MTV) wrote:

From 5.0MP3 onwards we do support cross-links. In your example, if you had a cable connecting sw1 and sw2, then the failure that you described would be handled and LLT would still have 1 valid link between node 1 and node 4.

Could you please point me to some more documentation regarding this feature? Thanks!

___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] LLT heartbeat redundancy
From 5.0MP3 onwards we do support cross-links. In your example, if you had a cable connecting sw1 and sw2, then the failure that you described would be handled and LLT would still have 1 valid link between node 1 and node 4.

-----Original Message-----
From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Jim Senicka
Sent: Sunday, May 03, 2009 9:23 AM
To: Imri Zvik
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

LLT is designed to use jeopardy to detect the difference between a single-link failure and a dual-link failure in most situations. Having a single mesh may remove this capability. Let me check on this with engineering and see if we have any more up-to-date recommendations.

-----Original Message-----
From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Imri Zvik
Sent: Sunday, May 03, 2009 12:18 PM
To: Jim Senicka
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

On Sunday 03 May 2009 19:03:16 Jim Senicka wrote:

This is not a limitation, as you had two independent failures. Bonding would remove the ability to discriminate between a link and a node failure.

I didn't understand this one - with bonding I can maintain a full-mesh topology. No matter which one of the links fails, if a node still has at least one active link, LLT will still be able to see all the other nodes. This achieves greater HA than without the bonding.

My feeling is in the scenario you describe, VCS is operating properly, and it is not a limitation.

Of course it is operating properly - that's how it was designed to work :) I'm just saying that the cluster could be more redundant if it wasn't designed that way :)

If you have issues with port or cable failures, add a low-pri connection on a third network.
___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] LLT heartbeat redundancy
Sun Cluster does not support this feature.

-----Original Message-----
From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Sandeep Agarwal (MTV)
Sent: Sunday, May 03, 2009 10:07 AM
To: Jim Senicka; Imri Zvik
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

From 5.0MP3 onwards we do support cross-links. In your example, if you had a cable connecting sw1 and sw2, then the failure that you described would be handled and LLT would still have 1 valid link between node 1 and node 4.

-----Original Message-----
From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Jim Senicka
Sent: Sunday, May 03, 2009 9:23 AM
To: Imri Zvik
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

LLT is designed to use jeopardy to detect the difference between a single-link failure and a dual-link failure in most situations. Having a single mesh may remove this capability. Let me check on this with engineering and see if we have any more up-to-date recommendations.

-----Original Message-----
From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Imri Zvik
Sent: Sunday, May 03, 2009 12:18 PM
To: Jim Senicka
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

On Sunday 03 May 2009 19:03:16 Jim Senicka wrote:

This is not a limitation, as you had two independent failures. Bonding would remove the ability to discriminate between a link and a node failure.

I didn't understand this one - with bonding I can maintain a full-mesh topology. No matter which one of the links fails, if a node still has at least one active link, LLT will still be able to see all the other nodes. This achieves greater HA than without the bonding.

My feeling is in the scenario you describe, VCS is operating properly, and it is not a limitation.
Of course it is operating properly - that's how it was designed to work :) I'm just saying that the cluster could be more redundant if it wasn't designed that way :)

If you have issues with port or cable failures, add a low-pri connection on a third network.

___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
Re: [Veritas-ha] IPMultiNICB, mpathd and network outages
If you only have 1 router in your subnet (which is very common), then the router can be a SPOF for IPMP. Here is how you can set up more targets for IPMP to probe so that the router is not a SPOF.

How to Manually Specify Target Systems for Probe-Based Failure Detection:

1. Log in to the system where you are configuring probe-based failure detection.

2. Add a route to a particular host to be used as a target in probe-based failure detection:

   $ route add -host destination-IP gateway-IP -static

   Replace destination-IP and gateway-IP with the IPv4 address of the host to be used as a target. For example, you would type the following to specify the target system 192.168.85.137, which is on the same subnet as the interfaces in IPMP group testgroup1:

   $ route add -host 192.168.85.137 192.168.85.137 -static

3. Add routes to additional hosts on the network to be used as target systems.

Taken from: http://docs.sun.com/app/docs/doc/816-4554/etmkd?l=ena=view

This should probably be a best practice. However, one can argue that if the router is down then the cluster is useless anyway.

Sandeep

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jim Senicka
Sent: Monday, October 20, 2008 9:53 AM
To: DeMontier, Frank; Paul Robertson; veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] IPMultiNICB, mpathd and network outages

I would be more concerned about future failures being handled properly. If you were able to take out all networks from all nodes at the same time, you have a SPOF. If this was a one-time maintenance upgrade to your network gear and not a normal event, setting VCS to not respond to network events means that future cable or port issues will not be handled.
If it is a common occurrence for all networks to be lost, perhaps you need to address the network issues :-)

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of DeMontier, Frank
Sent: Monday, October 20, 2008 11:10 AM
To: Paul Robertson; veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] IPMultiNICB, mpathd and network outages

FaultPropagation=0 should do it.

Buddy DeMontier
State Street Global Advisors
Infrastructure Technical Services
Boston, MA 02111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Robertson
Sent: Monday, October 20, 2008 10:37 AM
To: veritas-ha@mailman.eng.auburn.edu
Subject: [Veritas-ha] IPMultiNICB, mpathd and network outages

We recently experienced a Cisco network issue which prevented all nodes in that subnet from accessing the default gateway for about a minute. The Solaris nodes which run probe-based IPMP reported that all interfaces had failed because they were unable to ping the default gateway; however, they came back within seconds once the network issue was resolved. Fine.

Unfortunately, our VCS nodes initiated an offline of the service group after the IPMultiNICB resources detected the IPMP fault. Since the service group offline/online takes several minutes, the outage on these nodes was more painful. Furthermore, since the peer cluster nodes in the same subnet were also experiencing the same mpathd fault, there would have been little advantage to failing over the service group to another node.

We would like to find a way to configure VCS so that the service group does not offline (and any dependent resources within the service group are not offlined) in the event of an mpathd (i.e. IPMultiNICB) failure.
In looking through the documentation, it seems that the closest we can come is to increase the IPMultiNICB ToleranceLimit from 1 to a huge value:

  # hatype -modify IPMultiNICB ToleranceLimit

This should achieve our desired goal, but I can't help thinking that it's an ugly hack, and that there must be a better way. Any suggestions are appreciated.

Cheers, Paul

P.S. A snippet of the main.cf file is listed below:

  group multinicbsg (
      SystemList = { app04 = 1, app05 = 2 }
      Parallel = 1
      )

  MultiNICB multinicb (
      UseMpathd = 1
      MpathdCommand = "/usr/lib/inet/in.mpathd -a"
      Device = { ce0 = 0, ce4 = 2 }
      DefaultRouter = "192.168.9.1"
      )

  Phantom phantomb (
      )

  phantomb requires multinicb

  group app_grp (
      SystemList = { app04 = 0, app05 = 0 }
      )

  IPMultiNICB app_ip (
      BaseResName = multinicb
      Address = "192.168.9.34"
      NetMask = "255.255.255.0"
      )

  Proxy appmnic_proxy (
      TargetResName = multinicb
      )

  (various other resources, including some that depend on app_ip, excluded for brevity)

  app_ip requires appmnic_proxy

___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha
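For completeness, the ToleranceLimit change discussed above would be applied while the cluster configuration is writable. A hedged sketch; the value 100 is purely illustrative (the original post left it unspecified), and the right number depends on your MonitorInterval:

```shell
# Open the configuration for writing, raise the attribute, then save
# and close it again. ToleranceLimit is the number of consecutive
# monitor cycles a resource may report faulted before VCS acts on it.
haconf -makerw
hatype -modify IPMultiNICB ToleranceLimit 100   # 100 is illustrative only
haconf -dump -makero
```

Note this raises the limit for every IPMultiNICB resource in the cluster, which is part of why the poster calls it an ugly hack: it also delays reaction to genuine, isolated NIC faults.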
Re: [Veritas-ha] Sfrac libskgxp library
Only the file named libskgxp10.so.

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Monday, September 15, 2008 8:14 AM
To: veritas-ha@mailman.eng.auburn.edu
Subject: [Veritas-ha] Sfrac libskgxp library

Hi,

Can anyone answer this? If I copy the /opt/VRTSvcs/rac/lib/libskgxp102_64.so library into the /opt/oracle/product/10.2.0.3/lib directory as libskgxp102_64.so instead of libskgxp10.so, does that library get used (as it's the most recent and matches the version), or does it only use the file named libskgxp10.so?

Kgh

** The information contained in this message, including attachments, may contain privileged or confidential information that is intended to be delivered only to the person identified above. If you are not the intended recipient, or the person responsible for delivering this message to the intended recipient, Alltel requests that you immediately notify the sender and asks that you do not read the message or its attachments, and that you delete them without copying or sending them to anyone else.

___ Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha