Re: [OpenIndiana-discuss] VMware (OpenIndiana-discuss Digest, Vol 37, Issue 15) (OpenIndiana-discuss Digest, Vol 37, Issue 20)

2013-08-15 Thread Ong Yu-Phing

Hmm, sounds like an issue with the iSCSI adapter on the VMware side?

This article isn't directly applicable, but its configuration discussion 
might give some pointers on what you can check.


http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007829

On 15/08/2013 03:42, James Relph wrote:

>>> the same servers as iSCSI targets has no iSCSI errors at the same time as 
>>> VMware is freaking out
>> 
>> Is VMware using iSCSI as well or NFS?
> 
> Tried it with both (iSCSI originally), and oddly it's basically the exact same 
> issue (frequent disconnects) between NFS and iSCSI.  You would be convinced 
> it's network related, but nothing shows up obviously wrong in the switch logs 
> and obviously the OpenIndiana iSCSI initiators (two of which are guest OSs on 
> the VMware cluster!) aren't affected at all.  You get a bizarre situation where 
> VMware is complaining about iSCSI going up and down, yet the VMs themselves 
> don't register any problems whatsoever.
> 
> Thanks,
> 
> James.

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


Re: [OpenIndiana-discuss] VMware (OpenIndiana-discuss Digest, Vol 37, Issue 15)

2013-08-14 Thread James Relph

>> the same servers as iSCSI targets has no iSCSI errors at the same time as 
>> VMware is freaking out
> 
> Is VMware using iSCSI as well or NFS?


Tried it with both (iSCSI originally), and oddly it's basically the exact same 
issue (frequent disconnects) between NFS and iSCSI.  You would be convinced 
it's network related, but nothing shows up obviously wrong in the switch logs 
and obviously the OpenIndiana iSCSI initiators (two of which are guest OSs on 
the VMware cluster!) aren't affected at all.  You get a bizarre situation where 
VMware is complaining about iSCSI going up and down, yet the VMs themselves 
don't register any problems whatsoever.

Thanks,

James.



Re: [OpenIndiana-discuss] VMware (OpenIndiana-discuss Digest, Vol 37, Issue 15)

2013-08-14 Thread Gary Driggs
On Aug 14, 2013, at 11:38 AM, James Relph wrote:

> the same servers as iSCSI targets has no iSCSI errors at the same time as 
> VMware is freaking out

Is VMware using iSCSI as well or NFS?



Re: [OpenIndiana-discuss] VMware (OpenIndiana-discuss Digest, Vol 37, Issue 15)

2013-08-14 Thread James Relph
I've looked at subsystem performance and had things like zpool iostat running 
when the issue was occurring, and there's just nothing stressing the systems 
enough.  Plus the OpenIndiana servers using the same servers as iSCSI targets 
have no iSCSI errors at the same time as VMware is freaking out.  I would have 
expected the Oi initiators to at least log a few re-writes or iSCSI errors if 
it was a general "the iSCSI target is misbehaving" problem.

Thanks,

James

Principal Consultant

Website: www.themacplace.co.uk

On 14 Aug 2013, at 08:33, Ong Yu-Phing wrote:

> so far we've been discussing network.  How about the disk subsystem side?  
> I've had a situation where a rebuild (RAID10 equivalent with 3x RAID1 vdevs, 
> had to replace a faulty disk), together with an overnight snapshot and 
> replication to another server, was "enough" to cause iscsi timeouts.
> 
> On 13/08/2013 21:18, Doug Hughes wrote:
>> We have lacp working between force10, hp, and cisco switches in all possible 
>> combinations with no difficulties. We do monitor and alert on excessive 
>> errors and drops for interfaces, but lacp isn't a culprit. If anything, it's 
>> an underlying interface when we find them. Also, it beats the heck out of 
>> spanning tree and is 2 orders of magnitude simpler than ospf, and 1 order 
>> simpler and more portable than ecmp. I am quite surprised by your 
>> observations.
>> 
>> Sent from my android device.
>> 
>> -----Original Message-----
>> From: "Edward Ned Harvey (openindiana)" 
>> To: Discussion list for OpenIndiana 
>> Sent: Tue, 13 Aug 2013 7:22 AM
>> Subject: Re: [OpenIndiana-discuss] VMware
>> 
>>> From: James Relph [mailto:ja...@themacplace.co.uk]
>>> Sent: Monday, August 12, 2013 4:47 PM
>>> 
>>> No, we're not getting any ping loss, that's the thing.  The network looks
>>> entirely faultless.  We've run pings for 24 hours with no ping loss.
>> Yeah, I swore you said you had ping loss before - but if not - I don't think 
>> ping alone is sufficient.  You have to find the error counters on the LACP 
>> interfaces.  Everybody everywhere seems to blindly assume LACP works 
>> reliably, but to me, simply saying the term "LACP" is a red flag.  It's 
>> extremely temperamental, and the resultant behavior is exactly as you've 
>> described.
> 
> Disclaimer: use of our emails are governed by terms at 
> http://360-jambo.com/emd



Re: [OpenIndiana-discuss] VMware (OpenIndiana-discuss Digest, Vol 37, Issue 15)

2013-08-14 Thread Ong Yu-Phing
so far we've been discussing network.  How about the disk subsystem 
side?  I've had a situation where a rebuild (RAID10 equivalent with 3x 
RAID1 vdevs, had to replace a faulty disk), together with an overnight 
snapshot and replication to another server, was "enough" to cause iscsi 
timeouts.


On 13/08/2013 21:18, Doug Hughes wrote:

> We have lacp working between force10, hp, and cisco switches in all possible 
> combinations with no difficulties. We do monitor and alert on excessive errors 
> and drops for interfaces, but lacp isn't a culprit. If anything, it's an 
> underlying interface when we find them. Also, it beats the heck out of spanning 
> tree and is 2 orders of magnitude simpler than ospf, and 1 order simpler and 
> more portable than ecmp. I am quite surprised by your observations.
> 
> Sent from my android device.
> 
> -----Original Message-----
> From: "Edward Ned Harvey (openindiana)" 
> To: Discussion list for OpenIndiana 
> Sent: Tue, 13 Aug 2013 7:22 AM
> Subject: Re: [OpenIndiana-discuss] VMware
> 
>> From: James Relph [mailto:ja...@themacplace.co.uk]
>> Sent: Monday, August 12, 2013 4:47 PM
>> 
>> No, we're not getting any ping loss, that's the thing.  The network looks
>> entirely faultless.  We've run pings for 24 hours with no ping loss.
> 
> Yeah, I swore you said you had ping loss before - but if not - I don't think ping alone 
> is sufficient.  You have to find the error counters on the LACP interfaces.  Everybody 
> everywhere seems to blindly assume LACP works reliably, but to me, simply saying the term 
> "LACP" is a red flag.  It's extremely temperamental, and the resultant behavior 
> is exactly as you've described.
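
On the point about error counters rather than ping: one way to check is to
snapshot the per-link counters on each aggregation member before and after a
VMware disconnect and see which ones moved. Below is a minimal Python sketch of
that comparison. It assumes you have already collected the counters into
dictionaries (on OpenIndiana, something like `dladm show-link -s` or kstat could
be the source, and the switch has its own per-port counters). The link names,
counter names, and numbers are all made up for illustration, and the function
name is hypothetical.

```python
from typing import Dict

# Hypothetical counter names; use whatever your data source actually reports.
WATCHED = ("ierrors", "oerrors", "idrops", "odrops")

Snapshot = Dict[str, Dict[str, int]]  # link name -> counter name -> value


def counter_deltas(before: Snapshot, after: Snapshot,
                   watched=WATCHED) -> Dict[str, Dict[str, int]]:
    """Return only the links whose watched counters increased.

    Counters missing from a snapshot are treated as 0. Negative deltas
    (for example after a counter reset) are ignored, not reported.
    """
    increased: Dict[str, Dict[str, int]] = {}
    for link, now in after.items():
        then = before.get(link, {})
        changed = {}
        for name in watched:
            delta = now.get(name, 0) - then.get(name, 0)
            if delta > 0:
                changed[name] = delta
        if changed:
            increased[link] = changed
    return increased


if __name__ == "__main__":
    # Made-up sample data: two members of one aggregation.
    before = {
        "nic0": {"ierrors": 0, "oerrors": 0, "idrops": 2},
        "nic1": {"ierrors": 5, "oerrors": 0, "idrops": 0},
    }
    after = {
        "nic0": {"ierrors": 0, "oerrors": 0, "idrops": 2},
        "nic1": {"ierrors": 9, "oerrors": 1, "idrops": 0},
    }
    for link, changed in counter_deltas(before, after).items():
        print(link, changed)
```

If only one member of the aggregation shows movement, that points at an
underlying interface or cable rather than at LACP itself, which would fit
Doug's experience above.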


Disclaimer: use of our emails are governed by terms at http://360-jambo.com/emd
