On 4/21/06, Peter Memishian <[EMAIL PROTECTED]> wrote:
> > So, what happens now if probe-based detection is used with VLANs? For
> > example:
> >
> > # native vlan - these interfaces need to work with dhcp for
> > # network based installation/recovery
> > bge0: 0.0.0.0 deprecated,nofailover,(!up) group 10.0.0.0
> > bge1: 0.0.0.0 deprecated,nofailover,(!up) group 10.0.0.0
> > bge0:1: 10.0.0.1 up group 10.0.0.0
> > # vlan 1
> > bge1000: 0.0.0.0 deprecated,nofailover,(!up) group 10.0.1.0
> > bge1001: 0.0.0.0 deprecated,nofailover,(!up) group 10.0.1.0
> > bge1000:1 10.0.1.1 up group 10.0.1.0
> > # vlan 2
> > bge2000: 0.0.0.0 deprecated,nofailover,(!up) group 10.0.2.0
> > bge2001: 0.0.0.0 deprecated,nofailover,(!up) group 10.0.2.0
> > bge2001:1 10.0.2.1 group 10.0.2.0
> >
> > I this case, if a link failure is found on bge0, I fully expect that
> > 10.0.0.1 will fail over to bge1. Will 10.0.1.1 recognize the link
> > failure as well and fail over to bge1001? This is what makes sense to
> > me but have not yet had a chance to test. FWIW, my initial focus on
> > this would be with the current architecture(s) found in S10 1/06 and
> > possibly S9 9/05.
>
> First, I'm a bit confused about your use of `!up' above -- all test
> addresses should be IFF_UP. Second, you asking about probe-based failure
> detection, but then propose that a link failure is detected. Those are
> two different cases:

Sorry about probe-based vs. link-based. I really meant link-based. The
'!up' was meant to show that the interface is not up. I did that because
in.mpathd complains that the test addresses (0.0.0.0) are not unique when
I configure for probe-based detection and leave the interfaces up.

> * In the case where the link remains up but probes are no longer
>   answered, in.mpathd will detect this failure independently on all
>   three IP interfaces using the link (bge0, bge1000, and bge2000),
>   and trigger failovers to bge1, bge1001, and bge2001 (though
>   nothing will happen when bge2000 fails since it doesn't host any
>   data addresses in your example above).

Again, talking link-based, not probe-based.... sorry!

> * In the case where the link goes down, the bge driver will call
>   mac_link_update() on the bge0 mac, which will cause GLDv3's
>   str_notify() function to be called on each stream that has
>   attached to a PPA that uses the bge0 mac. Since bge0, bge1000,
>   and bge2000 all use the bge0 mac (see dld_str_attach()), all
>   three IP interfaces will receive a DL_NOTE_LINK_DOWN DLPI
>   notification.

I have confirmed that this seems to work just as you say. One thing of
interest is that if_mpadm(1M) can be used to disable individual VLANs.
That is, I can simulate a failure of bge0 independently of bge1000 and
bge2000.

However, now that I have been testing a bit more, I have found some
failover/failback problems.

My setup looks like:

bge0       0.0.0.0   RUNNING,NOFAILOVER group native
bge0:1     10.0.0.1  UP,RUNNING
bge1       0.0.0.0   RUNNING,NOFAILOVER group native
bge1000    0.0.0.0   RUNNING,NOFAILOVER group vlan1
bge1000:1  10.0.1.1  UP,RUNNING zone dev1
bge1001    0.0.0.0   RUNNING,NOFAILOVER group vlan1
bge1001:1  10.0.1.2  UP,RUNNING zone dev2
bge2000    0.0.0.0   RUNNING,NOFAILOVER group vlan2
bge2000:1  10.0.2.1  UP,RUNNING zone prod1
bge2001    0.0.0.0   RUNNING,NOFAILOVER group vlan2
bge2001:1  10.0.2.2  UP,RUNNING zone prod2

This is accomplished with /etc/hostname.* entries that say:

/etc/hostname.bge0
    group native -failover addif 10.0.0.1 netmask + broadcast + up
/etc/hostname.bge1
    group native -failover
/etc/hostname.bge1000
    group vlan1 -failover
...

The addresses on the vlan interfaces all belong to zones and are plumbed
via "zoneadm boot".

Now, if I pull bge0 everything fails over to bge1 as expected. However,
when I reconnect the cable to bge0, 10.0.0.1 is on bge0 (not bge0:1),
10.0.1.1 is on bge1000 (not bge1000:1), and 10.0.2.1 is on bge2000 (not
bge2000:1).

Then, if I pull bge1 I notice that the addresses on bge1:1, bge1001:1,
and bge2001:1 do not fail over properly. The interfaces show that they
are FAILED and the RUNNING flag is cleared, but the addresses stay where
they started. If I plug the cable for bge1 in, the FAILED flag goes away
and RUNNING is set. If I then use "if_mpadm -d bge2001" it says that it
cannot fail the address over because there are no more interfaces in the
group.

In earlier tests with just a single "floater" IP in each group failover
and failback worked as expected.

Does this sound like a configuration error or bug?

Mike

--
Mike Gerdts
http://mgerdts.blogspot.com/
_______________________________________________
networking-discuss mailing list
[email protected]
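
For readers following the interface names in this thread, here is a minimal sketch of why one pulled cable affects three IP interfaces at once. It assumes the Solaris VLAN naming convention in which the PPA is VLAN ID * 1000 + device instance, so bge2001 is VLAN 2 on the bge1 link. The function names split_ppa and interfaces_by_link are hypothetical helpers written for this illustration; they are not part of any Solaris tool, and the sketch only models the grouping, not what in.mpathd actually does.

```python
import re
from collections import defaultdict


def split_ppa(ifname):
    """Split a name such as 'bge2001:1' into (driver, vlan_id, instance).

    Assumption: VLAN PPA = VLAN ID * 1000 + device instance.
    """
    base = ifname.split(":", 1)[0]  # drop a logical-interface suffix like ':1'
    match = re.fullmatch(r"(.*\D)(\d+)", base)
    if match is None:
        raise ValueError(f"unrecognized interface name: {ifname!r}")
    driver, ppa = match.group(1), int(match.group(2))
    vlan_id, instance = divmod(ppa, 1000)
    return driver, vlan_id, instance


def interfaces_by_link(ifnames):
    """Group IP interfaces by the physical link (driver + instance) they share."""
    groups = defaultdict(list)
    for name in ifnames:
        driver, _vlan_id, instance = split_ppa(name)
        groups[f"{driver}{instance}"].append(name)
    return dict(groups)


if __name__ == "__main__":
    names = ["bge0", "bge1", "bge1000", "bge1001", "bge2000", "bge2001"]
    for link, members in interfaces_by_link(names).items():
        # Every member listed for a link would see that link's state change.
        print(link, "->", ", ".join(members))
```

With the six interfaces from the setup above, this groups bge0, bge1000, and bge2000 under the bge0 link and bge1, bge1001, and bge2001 under the bge1 link, which matches the set of interfaces said to receive DL_NOTE_LINK_DOWN when the bge0 cable is pulled.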
