Re: [Veritas-ha] LLT heartbeat redundancy

2009-05-13 Thread Sandeep Agarwal (MTV)
LLT does support Layer 2 link aggregation and can work over
trunks/bonds/aggregated links as long as a single device is presented to it:
Linux - Bonding
Solaris - Sun Trunking (now link aggregation/dladm in Solaris 10)
HPUX - Auto Port Aggregation
AIX - Etherchannel
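
For example, on Linux the bonded device can be handed to LLT directly in
/etc/llttab (a sketch modeled on the llttab shown later in this thread;
"bond0" is a placeholder, and exact device naming varies by platform):

set-node node1
set-cluster 0
link link1 bond0 - ether - -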

IPMP is at Layer 3, while Bonding and the others are at Layer 2 - hence we
can't compare the two directly. It would be better to compare Sun Trunking
and Linux Bonding.

-Original Message-
From: veritas-ha-boun...@mailman.eng.auburn.edu 
[mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Imri Zvik
Sent: Wednesday, May 13, 2009 2:38 AM
To: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

On Wednesday 13 May 2009 06:04:38 John Cronin wrote:
 Getting slightly off topic, but still somewhat relevant.

 Linux has many flavors of Ethernet bonding.  To be sure, link 
 aggregation resulting in increased bandwidth is generally supported on a 
 single switch.
 However, Linux does have an active-passive bonding that is 
 specifically intended for HA solutions.  AIX has a similar 
 configuration with the unfortunate name of EtherChannel Network Backup 
 Interface - it does NOT rely on Cisco EtherChannel to work.  Both of these 
 create a virtual NIC
 that hides the complexity, making the interface group appear to be a 
 single NIC. You don't need a bunch of switch link aggregation magic 
 (802.3ad or
 EtherChannel) to implement active-passive NIC failover in this manner.
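
(For example, a minimal active-backup bond on a 2009-era Linux system - a
sketch; the device names are illustrative, not from the original post:

/etc/modprobe.conf:
alias bond0 bonding
options bond0 mode=active-backup miimon=100

/etc/sysconfig/network-scripts/ifcfg-eth2:
DEVICE=eth2
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none

An identical stanza with DEVICE=eth3 enslaves the second NIC, and bond0
then appears to the system as a single interface.)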

 In my experience, both Linux and AIX Ethernet bonding are easier to 
 use than Sun IPMP, and they also are far more reliable.  I have a lot 
 of experience with all three of these, and in my opinion IPMP is the 
 worst - I have experienced many false failures with IPMP, and I have 
 had to do a bunch of silliness with static routes to make it work in 
 certain environments (prior to the new link based IPMP - but it has 
 issues of its own too).  I wish Sun would add an active-passive 
 mode to their new link aggregation feature (dladm) that works 
 across switches.  If they did that, they would have the same 
 capabilities as Linux and AIX network bonding, with similar ease of 
 use.  It should be fairly trivial to implement.
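
(For reference, the Solaris 10 aggregation being referred to is created
with dladm - a sketch; device names are placeholders:

# dladm create-aggr -d e1000g0 -d e1000g1 1

The trailing 1 is the aggregation key. The point above is that this
construct has no cross-switch active-passive mode.)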

 The one advantage that IPMP has in active-active mode (i.e. NOT link 
 based) is that it can detect IP connectivity issues (via ping - not 
 just Ethernet link detection) on all NICs in an interface group.  
 However, it is usually issues with the IP connectivity checking that 
 cause all my problems with IPMP, and I would gladly trade it for a 
 simple link based virtual solution that looks like a single link to me.
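
(For comparison, probe-based IPMP puts a test address on each NIC - a
sketch; the group name and addresses are placeholders:

/etc/hostname.ce0:
192.168.9.10 netmask + broadcast + group prodgrp up \
addif 192.168.9.11 deprecated -failover netmask + broadcast + up

in.mpathd then pings probe targets through each test address - the IP
connectivity checking described above.)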

The Linux bonding driver can do ARP monitoring in order to detect uplink
failures. This is not a Layer 3 check, but in most solutions and
configurations it is a suitable replacement.
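
For example - a sketch; the bond name and target IP are placeholders:

options bond0 mode=active-backup arp_interval=1000 arp_ip_target=192.168.9.1

arp_interval is in milliseconds; arp_ip_target accepts up to 16
comma-separated addresses to probe.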


 I have never used Linux Ethernet bonding or AIX Etherchannel Network 
 Backup Interface for VCS heartbeats, but I am pretty certain they 
 would both work fine.  That said, I am not sure they would provide any 
 significant benefit over a traditional VCS heartbeat network 
 configuration, using the same number of real NICs.

The benefit is that with such a bonding method, you can survive the failure 
scenario I've described in my first email :)

Symantec clearly understands that, as they are trying to solve it 
internally in LLT :)


 --
 John Cronin
___
Veritas-ha maillist  -  Veritas-ha@mailman.eng.auburn.edu 
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha


Re: [Veritas-ha] LLT heartbeat redundancy

2009-05-08 Thread Sandeep Agarwal (MTV)
Yes, the jeopardy detection etc. has not changed.
 
We've just added support for this new type of topology in which the
various LLT links are interconnected (crosslinks).

-Original Message-
From: John Cronin [mailto:jsc3...@gmail.com] 
Sent: Friday, May 08, 2009 6:58 AM
To: Sandeep Agarwal (MTV)
Cc: Jim Senicka; Imri Zvik; veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy


This was certainly news to me.

In this full-mesh heartbeat network, do we still go into
jeopardy if the network links are lost relatively slowly (e.g. if one
link on a node is down for more than 16 seconds by default)?
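
(For reference, the 16-second figure is LLT's peerinact timer, expressed
in hundredths of a second and tunable in /etc/llttab - a sketch, assuming
the default:

set-timer peerinact:1600
)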


On Sun, May 3, 2009 at 1:07 PM, Sandeep Agarwal (MTV)
sandeep_agarw...@symantec.com wrote:


From 5.0MP3 onwards we do support cross-links. In your example, if you
had a cable connecting sw1 and sw2, then the failure that you described
would be handled and LLT would still have 1 valid link between node 1
and node 4.


-Original Message-
From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Jim Senicka
Sent: Sunday, May 03, 2009 9:23 AM
To: Imri Zvik
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

LLT is designed to use jeopardy to detect the difference between a
single-link failure and a dual-link failure in most situations. Having a
single mesh may remove this capability.

Let me check on this with engineering and see if we have any more
up-to-date recommendations.


-Original Message-
From: veritas-ha-boun...@mailman.eng.auburn.edu [mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Imri Zvik
Sent: Sunday, May 03, 2009 12:18 PM
To: Jim Senicka
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

On Sunday 03 May 2009 19:03:16 Jim Senicka wrote:
 This is not a limitation, as you had two independent failures. Bonding
 would remove the ability to discriminate between a link and a node
 failure.

I didn't understand this one - with bonding I can maintain a full-mesh
topology: no matter which one of the links fails, if a node still has
at least one active link, LLT will still be able to see all the other
nodes.
This achieves greater HA than without the bonding.

 My feeling is that in the scenario you describe, VCS is operating
 properly, and it is not a limitation.

Of course it is operating properly - that's how it was designed to work :)
I'm just saying that the cluster could be more redundant if it wasn't
designed that way :)

 If you have issues with port or cable failures, add a low-pri
 connection on a third network.



___
Veritas-ha maillist  -  Veritas-ha@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha



___
Veritas-ha maillist  -  Veritas-ha@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha


Re: [Veritas-ha] LLT heartbeat redundancy

2009-05-06 Thread Sandeep Agarwal (MTV)
The lost hb messages are due to problems in Layer 2 connectivity.

In the cross-linked case, as long as 1 link is up on each node, LLT should
work fine (after a 2s glitch).

The llttab file is fine.

-Original Message-
From: Imri Zvik [mailto:im...@inter.net.il] 
Sent: Wednesday, May 06, 2009 1:44 AM
To: Sandeep Agarwal (MTV)
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy


On Wednesday 06 May 2009 00:55:23 you wrote:
 Nothing to add to /etc/llttab

 If you have 5.0MP3 then it should work.

So considering the following llttab file:

set-node rac-node1
set-cluster 0
link eth2 eth-00:21:5e:1f:0b:b0 - ether - -
link eth3 eth-00:21:5e:1f:0b:b1 - ether - -

and that eth2 and eth3 are on the same Layer 2 subnet (i.e. they can see
each other),
I shouldn't get the lost hb messages?

___
Veritas-ha maillist  -  Veritas-ha@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha


Re: [Veritas-ha] LLT heartbeat redundancy

2009-05-04 Thread Sandeep Agarwal (MTV)
Unfortunately we don't have any more documentation for this feature.

Basically, LLT works in the same way, except that it can now handle the
failure you described, and the two switches that the individual LLT links
are configured on can now be connected.

-Original Message-
From: Imri Zvik [mailto:im...@inter.net.il] 
Sent: Sunday, May 03, 2009 11:26 PM
To: Sandeep Agarwal (MTV)
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy


On Sunday 03 May 2009 20:07:29 Sandeep Agarwal (MTV) wrote:
 From 5.0MP3 onwards we do support cross-links. In your example if you 
 had a cable connecting sw1 and sw2 then the failure that you described

 would be handled and LLT would still have 1 valid link between node 1 
 and node 4.

Could you please point me to some more documentation regarding this
feature?

Thanks!
 
___
Veritas-ha maillist  -  Veritas-ha@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha


Re: [Veritas-ha] LLT heartbeat redundancy

2009-05-03 Thread Sandeep Agarwal (MTV)
From 5.0MP3 onwards we do support cross-links. In your example, if you
had a cable connecting sw1 and sw2, then the failure that you described
would be handled and LLT would still have 1 valid link between node 1
and node 4. 

-Original Message-
From: veritas-ha-boun...@mailman.eng.auburn.edu
[mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Jim
Senicka
Sent: Sunday, May 03, 2009 9:23 AM
To: Imri Zvik
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

LLT is designed to use jeopardy to detect the difference between a
single-link failure and a dual-link failure in most situations. Having a
single mesh may remove this capability.

Let me check on this with engineering and see if we have any more
up-to-date recommendations.


-Original Message-
From: veritas-ha-boun...@mailman.eng.auburn.edu
[mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Imri
Zvik
Sent: Sunday, May 03, 2009 12:18 PM
To: Jim Senicka
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

On Sunday 03 May 2009 19:03:16 Jim Senicka wrote:
 This is not a limitation, as you had two independent failures. Bonding
 would remove the ability to discriminate between a link and a node 
 failure.

I didn't understand this one - with bonding I can maintain a full-mesh
topology: no matter which one of the links fails, if a node still has
at least one active link, LLT will still be able to see all the other
nodes. 
This achieves greater HA than without the bonding.

 My feeling is that in the scenario you describe, VCS is operating
 properly, and it is not a limitation.

Of course it is operating properly - that's how it was designed to work :)
I'm just saying that the cluster could be more redundant if it wasn't
designed that way :)

 If you have issues with port or cable failures, add a low-pri
 connection on a third network.



___
Veritas-ha maillist  -  Veritas-ha@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha


Re: [Veritas-ha] LLT heartbeat redundancy

2009-05-03 Thread Sandeep Agarwal (MTV)
Sun Cluster does not support this feature. 

-Original Message-
From: veritas-ha-boun...@mailman.eng.auburn.edu
[mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Sandeep
Agarwal (MTV)
Sent: Sunday, May 03, 2009 10:07 AM
To: Jim Senicka; Imri Zvik
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

From 5.0MP3 onwards we do support cross-links. In your example, if you
had a cable connecting sw1 and sw2, then the failure that you described
would be handled and LLT would still have 1 valid link between node 1
and node 4. 

-Original Message-
From: veritas-ha-boun...@mailman.eng.auburn.edu
[mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Jim
Senicka
Sent: Sunday, May 03, 2009 9:23 AM
To: Imri Zvik
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

LLT is designed to use jeopardy to detect the difference between a
single-link failure and a dual-link failure in most situations. Having a
single mesh may remove this capability.

Let me check on this with engineering and see if we have any more
up-to-date recommendations.


-Original Message-
From: veritas-ha-boun...@mailman.eng.auburn.edu
[mailto:veritas-ha-boun...@mailman.eng.auburn.edu] On Behalf Of Imri
Zvik
Sent: Sunday, May 03, 2009 12:18 PM
To: Jim Senicka
Cc: veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] LLT heartbeat redundancy

On Sunday 03 May 2009 19:03:16 Jim Senicka wrote:
 This is not a limitation, as you had two independent failures. Bonding
 would remove the ability to discriminate between a link and a node 
 failure.

I didn't understand this one - with bonding I can maintain a full-mesh
topology: no matter which one of the links fails, if a node still has
at least one active link, LLT will still be able to see all the other
nodes. 
This achieves greater HA than without the bonding.

 My feeling is that in the scenario you describe, VCS is operating
 properly, and it is not a limitation.

Of course it is operating properly - that's how it was designed to work :)
I'm just saying that the cluster could be more redundant if it wasn't
designed that way :)

 If you have issues with port or cable failures, add a low-pri
 connection on a third network.



___
Veritas-ha maillist  -  Veritas-ha@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha


Re: [Veritas-ha] IPMultiNICB, mpathd and network outages

2008-10-20 Thread Sandeep Agarwal (MTV)
If you only have 1 router (which is very common) in your subnet, then the
router can possibly be a SPOF for IPMP. Here's how you can set up more
targets for IPMP to probe so that the router is not a SPOF.

How to Manually Specify Target Systems for Probe-Based Failure Detection

   1. Log in with your user account to the system where you are
      configuring probe-based failure detection.

   2. Add a route to a particular host to be used as a target in
      probe-based failure detection:

      $ route add -host destination-IP gateway-IP -static

      Replace the values of destination-IP and gateway-IP with the IPv4
      address of the host to be used as a target. For example, you would
      type the following to specify the target system 192.168.85.137,
      which is on the same subnet as the interfaces in IPMP group
      testgroup1:

      $ route add -host 192.168.85.137 192.168.85.137 -static

   3. Add routes to additional hosts on the network to be used as target
      systems.

Taken from:
http://docs.sun.com/app/docs/doc/816-4554/etmkd?l=en&a=view

This should probably be a best practice. However, one can argue that if
the router is down then the cluster is useless anyway.

Sandeep

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jim
Senicka
Sent: Monday, October 20, 2008 9:53 AM
To: DeMontier, Frank; Paul Robertson; veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] IPMultiNICB, mpathd and network outages


I would be more concerned about future failures being handled properly.
If you were able to take out all networks from all nodes at the same
time, you have a SPOF. If this was a one-time maintenance upgrade to
your network gear and not a normal event, setting VCS to not respond to
network events means that future cable or port issues will not be
handled. If it is a common occurrence for all networks to be lost,
perhaps you need to address the network issues :-)



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
DeMontier, Frank
Sent: Monday, October 20, 2008 11:10 AM
To: Paul Robertson; veritas-ha@mailman.eng.auburn.edu
Subject: Re: [Veritas-ha] IPMultiNICB, mpathd and network outages

FaultPropagation=0 should do it.
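
For example - a sketch, assuming the service group name app_grp from the
main.cf below:

 # haconf -makerw
 # hagrp -modify app_grp FaultPropagation 0
 # haconf -dump -makero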

Buddy DeMontier
State Street Global Advisors
Infrastructure Technical Services
Boston Ma 02111

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Paul
Robertson
Sent: Monday, October 20, 2008 10:37 AM
To: veritas-ha@mailman.eng.auburn.edu
Subject: [Veritas-ha] IPMultiNICB, mpathd and network outages

We recently experienced a Cisco network issue which prevented all nodes
in that subnet from accessing the default gateway for about a minute.

The Solaris nodes which run probe-based IPMP reported that all
interfaces had failed because they were unable to ping the default
gateway; however, they came back within seconds once the network issue
was resolved. Fine.

Unfortunately, our VCS nodes initiated an offline of the service group
after the IPMultiNICB resources detected the IPMP fault. Since the
service group offline/online takes several minutes, the outage on these
nodes was more painful. Furthermore, since the peer cluster nodes in the
same subnet were also experiencing the same mpathd fault, there would
have been little advantage to failing over the service group to another
node.

We would like to find a way to configure VCS so that the service group
does not offline (and any dependent resources within the service group
are not offlined) in the event of an mpathd (i.e. IPMultiNICB) failure.
In looking through the documentation, it seems that the closest we can
come is to increase the IPMultiNICB ToleranceLimit from 1 to a huge
value:

 # hatype -modify IPMultiNICB ToleranceLimit <huge-value>

This should achieve our desired goal, but I can't help thinking that
it's an ugly hack, and that there must be a better way. Any suggestions
are appreciated.

Cheers,

Paul

P.S. A snippet of the main.cf file is listed below:


 group multinicbsg (
   SystemList = { app04 = 1, app05 = 2 }
   Parallel = 1
   )

   MultiNICB multinicb (
   UseMpathd = 1
   MpathdCommand = /usr/lib/inet/in.mpathd -a
   Device = { ce0 = 0, ce4 = 2 }
   DefaultRouter = 192.168.9.1
   )

   Phantom phantomb (
   )

   phantomb requires multinicb

 group app_grp (
   SystemList = { app04 = 0, app05 = 0 }
   )

   IPMultiNICB app_ip (
   BaseResName = multinicb
   Address = 192.168.9.34
   NetMask = 255.255.255.0
   )

   Proxy appmnic_proxy (
   TargetResName = multinicb
   )

   (various other resources, including some that depend on app_ip
   excluded for brevity)

   app_ip requires appmnic_proxy
___
Veritas-ha maillist  -  Veritas-ha@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha

___
Veritas-ha maillist  -  Veritas-ha@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha

Re: [Veritas-ha] Sfrac libskgxp library

2008-09-15 Thread Sandeep Agarwal (MTV)
Only the file named libskgxp10.so is used.
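
For example - a sketch, using the paths from the question below:

 # cp /opt/VRTSvcs/rac/lib/libskgxp102_64.so \
      /opt/oracle/product/10.2.0.3/lib/libskgxp10.so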



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: Monday, September 15, 2008 8:14 AM
To: veritas-ha@mailman.eng.auburn.edu
Subject: [Veritas-ha] Sfrac libskgxp library


Hi,
 
Can anyone answer this?  If I copy the
/opt/VRTSvcs/rac/lib/libskgxp102_64.so library into the
/opt/oracle/product/10.2.0.3/lib directory as libskgxp102_64.so instead
of libskgxp10.so, does that library get used (as it's the most recent and
matches the version), or does it only use the file named libskgxp10.so?
 
Kgh
 
 


___
Veritas-ha maillist  -  Veritas-ha@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha