I am having the same issue on a T5240 with S10U5 + LDoms 1.0.3 + EIS
Baseline 05/08.

My configuration is this: two nxge devices are configured in a probe-based
IPMP failover group, and the same two devices also back two virtual
switches so that the guest LDoms can do IPMP as well (a sketch of that
setup follows the listing below).
bash-3.00# ldm list-bindings
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active     -n-cv   SP      8     4G       5.6%  56m

MAC
    00:14:4f:46:a8:8e

VCPU
    VID    PID    UTIL STRAND
    0      0       12%   100%
    1      1      5.2%   100%
    2      2      1.1%   100%
    3      3      1.0%   100%
    4      4       15%   100%
    5      5      9.4%   100%
    6      6      0.5%   100%
    7      7      0.2%   100%

MAU
    ID     CPUSET
    0      (0, 1, 2, 3, 4, 5, 6, 7)

MEMORY
    RA               PA               SIZE
    0xe000000        0xe000000        4G

VARIABLES
    auto-boot?=true
    boot-device=rootdisk rootmirror disk net
    diag-switch?=false
    keyboard-layout=US-English
    local-mac-address?=true
    nvramrc=devalias rootdisk /pci@400/pci@0/pci@8/scsi@0/disk@0,0:a
            devalias rootmirror /pci@400/pci@0/pci@8/scsi@0/disk@1,0:a
            ." ChassisSerialNumber BEL0819IID " cr
    security-mode=none
    security-password=
    use-nvramrc?=true

IO
    DEVICE           PSEUDONYM        OPTIONS
    pci@400          pci_0
    pci@500          pci_1

VCC
    NAME             PORT-RANGE
    primary-vcc0     5000-5100
        CLIENT                           PORT
        soeivsx001c@primary-vcc0         5000
        xvmavsx001c@primary-vcc0         5004
        xvmpvsx001c@primary-vcc0         5005
        ni2avsx001c@primary-vcc0         5001
        spsavsx001c@primary-vcc0         5003
        smcavsx001c@primary-vcc0         5002

VSW
    NAME             MAC                NET-DEV   DEVICE     MODE
    primary-vsw0     00:14:4f:f9:d1:57  nxge4     switch@0
        PEER                             MAC
        vnet1@soeivsx001c                00:14:4f:fa:67:42
        vnet2@xvmavsx001c                00:14:4f:fb:c7:cc
        vnet3@xvmpvsx001c                00:14:4f:fa:c3:39
        vnet4@ni2avsx001c                00:14:4f:fa:fe:5f
        vnet5@spsavsx001c                00:14:4f:fb:53:c3
        vnet6@smcavsx001c                00:14:4f:fb:0f:88

    NAME             MAC                NET-DEV   DEVICE     MODE
    primary-vsw1     00:14:4f:f8:9f:e2  nxge6     switch@1
        PEER                             MAC
        vnet7@soeivsx001c                00:14:4f:f9:b9:b5
        vnet8@xvmavsx001c                00:14:4f:fb:a4:c9
        vnet9@xvmpvsx001c                00:14:4f:fa:b2:94
        vnet10@ni2avsx001c               00:14:4f:f8:eb:ce
        vnet11@spsavsx001c               00:14:4f:fb:b1:07
        vnet12@smcavsx001c               00:14:4f:f8:01:e2

VDS
    NAME             VOLUME    OPTIONS  DEVICE
    primary-vds0     vol1               /dev/dsk/c1t4d0s2
                     vol2               /dev/dsk/c1t5d0s2
                     svol1              /dev/dsk/c4t60060480000190100471533033433531d0s2
                     vol7               /dev/dsk/c1t10d0s2
                     vol8               /dev/dsk/c1t11d0s2
                     vol11              /dev/dsk/c1t14d0s2
                     vol12              /dev/dsk/c1t15d0s2
                     vol9               /dev/dsk/c1t12d0s2
                     vol10              /dev/dsk/c1t13d0s2
                     vol3               /dev/dsk/c1t6d0s2
                     vol4               /dev/dsk/c1t7d0s2
                     svol2              /dev/dsk/c4t60060480000190100471533033433731d0s2
                     vol5               /dev/dsk/c1t8d0s2
                     vol6               /dev/dsk/c1t9d0s2
        CLIENT                           VOLUME
        vdisk1@soeivsx001c               vol1
        vdisk2@soeivsx001c               vol2
        vdisk3@soeivsx001c               svol1
        vdisk1@xvmavsx001c               vol3
        vdisk2@xvmavsx001c               vol4
        vdisk3@xvmavsx001c               svol2
        vdisk1@xvmpvsx001c               vol5
        vdisk2@xvmpvsx001c               vol6
        vdisk1@ni2avsx001c               vol7
        vdisk2@ni2avsx001c               vol8
        vdisk1@spsavsx001c               vol9
        vdisk2@spsavsx001c               vol10
        vdisk1@smcavsx001c               vol11
        vdisk2@smcavsx001c               vol12

VCONS
    NAME             SERVICE            PORT
    SP
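
For reference, here is a rough sketch of how the two vsws and the probe-based
IPMP group over nxge4/nxge6 are set up in the primary. The ldm commands are
reconstructed from the listing above (LDoms 1.0.x syntax); the IP addresses in
the hostname files are only placeholders, not the real ones from this box:

# Virtual switches over the two physical NICs shown in the
# ldm list-bindings output above.
ldm add-vsw net-dev=nxge4 primary-vsw0 primary
ldm add-vsw net-dev=nxge6 primary-vsw1 primary

# Guest vnets hang off these switches, one per vsw per guest, e.g.:
# ldm add-vnet vnet1 primary-vsw0 soeivsx001c
# ldm add-vnet vnet7 primary-vsw1 soeivsx001c

# Probe-based IPMP group over the same two NICs in the primary.
# Each interface gets a data address plus a non-failover test address.
# All addresses below are placeholders.
cat > /etc/hostname.nxge4 <<'EOF'
192.168.10.10 netmask + broadcast + group ipmp0 up \
addif 192.168.10.11 deprecated -failover netmask + broadcast + up
EOF

cat > /etc/hostname.nxge6 <<'EOF'
192.168.10.12 netmask + broadcast + group ipmp0 up \
addif 192.168.10.13 deprecated -failover netmask + broadcast + up
EOF

Each guest then has one vnet on each vsw (e.g. vnet1 and vnet7 for
soeivsx001c) and would build its own IPMP group over those two vnet
interfaces in the same way.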
On Thu, Jun 19, 2008 at 3:43 PM, <Sriharsha.Basavapatna@sun.com> wrote:
> Scott Adair wrote:
>
> Yes, all of these vnets are on the same subnet.
>
> within a guest, are all 3 vnets in the same subnet? from your previous
> emails I thought all 3 vnets were plumbed and in use in each guest, but
> it looks like you are saying only 1 is plumbed?
>
> The issue occurs from just general use, but seems to be related to
> high network utilization (we use NFS for all the users' home
> directories).
>
> 6603974 - Interesting, although we only have 3 vnets (only 1 plumbed)
> on each domain, and only two domains per vsw. Also, we are not using
> DHCP. We lose connectivity to all external systems, both on the same
> subnet and across our router.
>
> if you are losing connection to hosts on the same subnet, then it may not
> be that problem.
>
> Would there be any harm in setting ip_ire_min_bucket_cnt and
> ip_ire_max_bucket_cnt? Would that need to be set in each Domain or
> just the Primary?
>
> there shouldn't be any harm; set it in /etc/system on the domain (not
> the primary) and remove it after verifying.
> -Harsha
>
> Scott
>
> On 19-Jun-08, at 3:53 PM, Sriharsha.Basavapatna@Sun.COM wrote:
>
> Scott Adair wrote:
>
> So we have some mixed results. This seems to have reduced the
> issue, but it has not solved it. Actually, I think it has masked it
> a bit since it seems to have just increased a timeout in the vsw
> code (although I'm not a programmer, so I don't know for sure).
>
> That workaround (6675887) is needed only if you are using an aggr
> device for the vsw; I'm not sure how that can change the behavior you
> are seeing.
>
> Something else that I've noticed. Let's say that we have LD1 and
> LD2 on VSW0. If LD1 has the network problem I can still ping LD2.
> So I'm starting to think that the problem is not related to the vsw
> but maybe to something in the vnet driver inside the domain? This
> would explain why I lose all network connectivity, across all the
> vsws the ldom is connected to, at the same time.
>
> are these vnets in the same subnet on each guest? if yes, and if the
> problem shows up only when you ping off-link destinations (going
> through the default router), then you may be running into 6603974.
>
> -Harsha
>
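
In case anyone else wants to try Harsha's /etc/system suggestion above, here
is a minimal sketch, assuming the usual module:variable syntax for ip
tunables. The values are placeholders only, since nothing in this thread
says what to set them to:

# On the guest domain (not the primary); remove these lines from
# /etc/system again after verifying whether they make a difference.
# The values below are placeholders, not recommendations from this thread.
cat >> /etc/system <<'EOF'
set ip:ip_ire_min_bucket_cnt=32
set ip:ip_ire_max_bucket_cnt=128
EOF

# /etc/system changes only take effect after the guest is rebooted.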
--
Matt Walburn
http://mattwalburn.com