I am having the same issue on a T5240 with S10U5 + LDoms 1.0.3 + EIS
Baseline 05/08.

My configuration is this: two nxge devices are configured in a probe-based
IPMP failover group, and the same two devices also back two virtual
switches so that the guest LDoms can do IPMP as well (a sketch of that
setup follows the listing below).
bash-3.00# ldm list-bindings
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active     -n-cv   SP      8     4G       5.6%  56m

MAC
    00:14:4f:46:a8:8e

VCPU
    VID    PID    UTIL STRAND
    0      0       12%   100%
    1      1      5.2%   100%
    2      2      1.1%   100%
    3      3      1.0%   100%
    4      4       15%   100%
    5      5      9.4%   100%
    6      6      0.5%   100%
    7      7      0.2%   100%

MAU
    ID     CPUSET
    0      (0, 1, 2, 3, 4, 5, 6, 7)

MEMORY
    RA               PA               SIZE
    0xe000000        0xe000000        4G

VARIABLES
    auto-boot?=true
    boot-device=rootdisk rootmirror disk net
    diag-switch?=false
    keyboard-layout=US-English
    local-mac-address?=true
    nvramrc=devalias rootdisk /pci@400/pci@0/pci@8/scsi@0/disk@0,0:a
            devalias rootmirror /pci@400/pci@0/pci@8/scsi@0/disk@1,0:a
            ." ChassisSerialNumber BEL0819IID " cr
    security-mode=none
    security-password=
    use-nvramrc?=true

IO
    DEVICE           PSEUDONYM        OPTIONS
    pci@400          pci_0
    pci@500          pci_1

VCC
    NAME             PORT-RANGE
    primary-vcc0     5000-5100
        CLIENT                           PORT
        soeivsx001c@primary-vcc0         5000
        xvmavsx001c@primary-vcc0         5004
        xvmpvsx001c@primary-vcc0         5005
        ni2avsx001c@primary-vcc0         5001
        spsavsx001c@primary-vcc0         5003
        smcavsx001c@primary-vcc0         5002

VSW
    NAME             MAC                NET-DEV   DEVICE     MODE
    primary-vsw0     00:14:4f:f9:d1:57  nxge4     switch@0
        PEER                             MAC
        vnet1@soeivsx001c                00:14:4f:fa:67:42
        vnet2@xvmavsx001c                00:14:4f:fb:c7:cc
        vnet3@xvmpvsx001c                00:14:4f:fa:c3:39
        vnet4@ni2avsx001c                00:14:4f:fa:fe:5f
        vnet5@spsavsx001c                00:14:4f:fb:53:c3
        vnet6@smcavsx001c                00:14:4f:fb:0f:88

    NAME             MAC                NET-DEV   DEVICE     MODE
    primary-vsw1     00:14:4f:f8:9f:e2  nxge6     switch@1
        PEER                             MAC
        vnet7@soeivsx001c                00:14:4f:f9:b9:b5
        vnet8@xvmavsx001c                00:14:4f:fb:a4:c9
        vnet9@xvmpvsx001c                00:14:4f:fa:b2:94
        vnet10@ni2avsx001c               00:14:4f:f8:eb:ce
        vnet11@spsavsx001c               00:14:4f:fb:b1:07
        vnet12@smcavsx001c               00:14:4f:f8:01:e2

VDS
    NAME             VOLUME    OPTIONS  DEVICE
    primary-vds0     vol1               /dev/dsk/c1t4d0s2
                     vol2               /dev/dsk/c1t5d0s2
                     svol1              /dev/dsk/c4t60060480000190100471533033433531d0s2
                     vol7               /dev/dsk/c1t10d0s2
                     vol8               /dev/dsk/c1t11d0s2
                     vol11              /dev/dsk/c1t14d0s2
                     vol12              /dev/dsk/c1t15d0s2
                     vol9               /dev/dsk/c1t12d0s2
                     vol10              /dev/dsk/c1t13d0s2
                     vol3               /dev/dsk/c1t6d0s2
                     vol4               /dev/dsk/c1t7d0s2
                     svol2              /dev/dsk/c4t60060480000190100471533033433731d0s2
                     vol5               /dev/dsk/c1t8d0s2
                     vol6               /dev/dsk/c1t9d0s2
        CLIENT                           VOLUME
        vdisk1@soeivsx001c               vol1
        vdisk2@soeivsx001c               vol2
        vdisk3@soeivsx001c               svol1
        vdisk1@xvmavsx001c               vol3
        vdisk2@xvmavsx001c               vol4
        vdisk3@xvmavsx001c               svol2
        vdisk1@xvmpvsx001c               vol5
        vdisk2@xvmpvsx001c               vol6
        vdisk1@ni2avsx001c               vol7
        vdisk2@ni2avsx001c               vol8
        vdisk1@spsavsx001c               vol9
        vdisk2@spsavsx001c               vol10
        vdisk1@smcavsx001c               vol11
        vdisk2@smcavsx001c               vol12

VCONS
    NAME             SERVICE            PORT
    SP
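
For reference, here is a rough sketch of how the two vsws and the probe-based
IPMP group over nxge4/nxge6 are set up in the primary. The ldm commands are
reconstructed from the listing above (LDoms 1.0.x syntax); the IP addresses in
the hostname files are only placeholders, not the real ones from this box:

# Virtual switches over the two physical NICs shown in the
# ldm list-bindings output above.
ldm add-vsw net-dev=nxge4 primary-vsw0 primary
ldm add-vsw net-dev=nxge6 primary-vsw1 primary

# Guest vnets hang off these switches, one per vsw per guest, e.g.:
# ldm add-vnet vnet1 primary-vsw0 soeivsx001c
# ldm add-vnet vnet7 primary-vsw1 soeivsx001c

# Probe-based IPMP group over the same two NICs in the primary.
# Each interface gets a data address plus a non-failover test address.
# All addresses below are placeholders.
cat > /etc/hostname.nxge4 <<'EOF'
192.168.10.10 netmask + broadcast + group ipmp0 up \
addif 192.168.10.11 deprecated -failover netmask + broadcast + up
EOF

cat > /etc/hostname.nxge6 <<'EOF'
192.168.10.12 netmask + broadcast + group ipmp0 up \
addif 192.168.10.13 deprecated -failover netmask + broadcast + up
EOF

Each guest then has one vnet on each vsw (e.g. vnet1 and vnet7 for
soeivsx001c) and would build its own IPMP group over those two vnet
interfaces in the same way.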
On Thu, Jun 19, 2008 at 3:43 PM, <Sriharsha.Basavapatna@sun.com> wrote:
> Scott Adair wrote:
>
> Yes, all of these vnets are on the same subnet.
>
> within a guest, are all 3 vnets in the same subnet? from your previous
> emails I thought all 3 vnets were plumbed and in use in each guest, but
> it looks like you are saying only 1 is plumbed?
>
> The issue occurs from just general use, but seems to be related to
> high network utilization (we use NFS for all the users' home
> directories).
>
> 6603974 - Interesting, although we only have 3 vnets (only 1 plumbed)
> on each domain, and only two domains per vsw. Also, we are not using
> DHCP. We lose connectivity to all external systems, both on the same
> subnet and across our router.
>
> if you are losing connection to hosts on the same subnet, then it may not
> be that problem.
>
> Would there be any harm in setting ip_ire_min_bucket_cnt and
> ip_ire_max_bucket_cnt? Would that need to be set in each Domain or
> just the Primary?
>
> there shouldn't be any harm; set it in /etc/system on the domain (not
> the primary) and remove it after verifying.
> -Harsha
>
> Scott
>
> On 19-Jun-08, at 3:53 PM, Sriharsha.Basavapatna@Sun.COM wrote:
>
> Scott Adair wrote:
>
> So we have some mixed results. This seems to have reduced the
> issue, but it has not solved it. Actually, I think it has masked it
> a bit since it seems to have just increased a timeout in the vsw
> code (although I'm not a programmer, so I don't know for sure).
>
> That workaround (6675887) is needed only if you are using an aggr
> device for the vsw; I'm not sure how that can change the behavior you
> are seeing.
>
> Something else that I've noticed. Let's say that we have LD1 and
> LD2 on VSW0. If LD1 has the network problem I can still ping LD2.
> So I'm starting to think that the problem is not related to the vsw
> but maybe to something in the vnet driver inside the domain? This
> would explain why I lose all network connectivity, across all the
> vsws the ldom is connected to, at the same time.
>
> are these vnets in the same subnet on each guest? if yes, and if the
> problem shows up only when you ping off-link destinations (going
> through the default router), then you may be running into 6603974.
>
> -Harsha
>
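
In case anyone else wants to try Harsha's /etc/system suggestion above, here
is a minimal sketch, assuming the usual module:variable syntax for ip
tunables. The values are placeholders only, since nothing in this thread
says what to set them to:

# On the guest domain (not the primary); remove these lines from
# /etc/system again after verifying whether they make a difference.
# The values below are placeholders, not recommendations from this thread.
cat >> /etc/system <<'EOF'
set ip:ip_ire_min_bucket_cnt=32
set ip:ip_ire_max_bucket_cnt=128
EOF

# /etc/system changes only take effect after the guest is rebooted.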
--
Matt Walburn
http://mattwalburn.com