Re: [OpenIndiana-discuss] VMware (OpenIndiana-discuss Digest, Vol 37, Issue 15) (OpenIndiana-discuss Digest, Vol 37, Issue 20)

2013-08-15 Thread Ong Yu-Phing

Hmm, sounds like some issue with the iSCSI adapter on VMware?

This article isn't directly applicable, but the configuration discussion 
might give some pointers on what you can check?


http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007829

On 15/08/2013 03:42, James Relph wrote:

the same servers as iSCSI targets have no iSCSI errors at the same time as 
VMware is freaking out

Is VMware using iSCSI as well or NFS?


Tried it with both (iSCSI originally), and oddly it's basically the exact same 
issue (frequent disconnects) with both NFS and iSCSI.  You would be convinced 
it's network related, but nothing obviously wrong shows up in the switch logs, 
and the OpenIndiana iSCSI initiators (two of which are guest OSs on the VMware 
cluster!) aren't affected at all.  You get a bizarre situation where VMware is 
complaining about iSCSI going up and down, yet the VMs themselves don't 
register any problems whatsoever.

Thanks,

James.








Re: [OpenIndiana-discuss] VMware (OpenIndiana-discuss Digest, Vol 37, Issue 15)

2013-08-14 Thread James Relph

>> the same servers as iSCSI targets have no iSCSI errors at the same time as 
>> VMware is freaking out
> 
> Is VMware using iSCSI as well or NFS?


Tried it with both (iSCSI originally), and oddly it's basically the exact same 
issue (frequent disconnects) with both NFS and iSCSI.  You would be convinced 
it's network related, but nothing obviously wrong shows up in the switch logs, 
and the OpenIndiana iSCSI initiators (two of which are guest OSs on the VMware 
cluster!) aren't affected at all.  You get a bizarre situation where VMware is 
complaining about iSCSI going up and down, yet the VMs themselves don't 
register any problems whatsoever.

Thanks,

James.



Re: [OpenIndiana-discuss] VMware (OpenIndiana-discuss Digest, Vol 37, Issue 15)

2013-08-14 Thread Gary Driggs
On Aug 14, 2013, at 11:38 AM, James Relph wrote:

> the same servers as iSCSI targets have no iSCSI errors at the same time as 
> VMware is freaking out

Is VMware using iSCSI as well or NFS?



Re: [OpenIndiana-discuss] VMware (OpenIndiana-discuss Digest, Vol 37, Issue 15)

2013-08-14 Thread James Relph
I've looked at subsystem performance and had things like zpool iostat running 
when the issue was occurring, and there's just nothing stressing the systems 
enough.  Plus, the OpenIndiana servers using the same servers as iSCSI targets 
have no iSCSI errors at the same time as VMware is freaking out.  I would have 
expected the OI initiators to at least log a few re-writes or iSCSI errors if 
it were a general "the iSCSI target is misbehaving" problem.
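
A minimal sketch of that kind of target-side check on OpenIndiana, assuming a 
pool named tank and COMSTAR serving the iSCSI targets (the pool name and the 
sampling intervals are placeholders):

# Per-vdev load on the pool while the disconnects are happening
zpool iostat -v tank 5

# Disk service times; consistently high %b or asvc_t would point at the disks
iostat -xnz 5

# Target-side view of the COMSTAR logical units and iSCSI targets
stmfadm list-lu -v
itadm list-target -v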

Thanks,

James

Principal Consultant

Website: www.themacplace.co.uk

On 14 Aug 2013, at 08:33, Ong Yu-Phing  wrote:

> So far we've been discussing the network.  How about the disk subsystem side?  
> I've had a situation where a rebuild (RAID10 equivalent with 3x RAID1 vdevs, 
> had to replace a faulty disk), together with an overnight snapshot and 
> replication to another server, was "enough" to cause iSCSI timeouts.
> 
> On 13/08/2013 21:18, Doug Hughes wrote:
>> We have LACP working between Force10, HP, and Cisco switches in all possible 
>> combinations with no difficulties. We do monitor and alert on excessive 
>> errors and drops for interfaces, but LACP isn't the culprit; if anything, it's 
>> an underlying interface when we find them. Also, it beats the heck out of 
>> spanning tree, is two orders of magnitude simpler than OSPF, and one order 
>> simpler and more portable than ECMP. I am quite surprised by your 
>> observations.
>> 
>> Sent from my android device.
>> 
>> -Original Message-
>> From: "Edward Ned Harvey (openindiana)" 
>> To: Discussion list for OpenIndiana 
>> Sent: Tue, 13 Aug 2013 7:22 AM
>> Subject: Re: [OpenIndiana-discuss] VMware
>> 
>>> From: James Relph [mailto:ja...@themacplace.co.uk]
>>> Sent: Monday, August 12, 2013 4:47 PM
>>> 
>>> No, we're not getting any ping loss, that's the thing.  The network looks
>>> entirely faultless.  We've run pings for 24 hours with no ping loss.
>> Yeah, I swore you said you had ping loss before - but if not - I don't think 
>> ping alone is sufficient.  You have to find the error counters on the LACP 
>> interfaces.  Everybody everywhere seems to blindly assume LACP works 
>> reliably, but to me, simply saying the term "LACP" is a red flag.  It's 
>> extremely temperamental, and the resultant behavior is exactly as you've 
>> described.
>> 



Re: [OpenIndiana-discuss] VMware (OpenIndiana-discuss Digest, Vol 37, Issue 15)

2013-08-14 Thread Ong Yu-Phing
So far we've been discussing the network.  How about the disk subsystem 
side?  I've had a situation where a rebuild (RAID10 equivalent with 3x 
RAID1 vdevs, had to replace a faulty disk), together with an overnight 
snapshot and replication to another server, was "enough" to cause iSCSI 
timeouts.
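
A quick sketch of how to confirm whether a resilver or scrub overlaps with the 
iSCSI timeouts, assuming a pool named tank (pool name and interval are 
placeholders):

# Shows whether a resilver/scrub is running and how far along it is
zpool status -v tank

# Per-vdev throughput while the rebuild and replication are running
zpool iostat -v tank 10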


On 13/08/2013 21:18, Doug Hughes wrote:

We have LACP working between Force10, HP, and Cisco switches in all possible 
combinations with no difficulties. We do monitor and alert on excessive errors 
and drops for interfaces, but LACP isn't the culprit; if anything, it's an 
underlying interface when we find them. Also, it beats the heck out of spanning 
tree, is two orders of magnitude simpler than OSPF, and one order simpler and 
more portable than ECMP. I am quite surprised by your observations.

Sent from my android device.

-Original Message-
From: "Edward Ned Harvey (openindiana)" 
To: Discussion list for OpenIndiana 
Sent: Tue, 13 Aug 2013 7:22 AM
Subject: Re: [OpenIndiana-discuss] VMware


From: James Relph [mailto:ja...@themacplace.co.uk]
Sent: Monday, August 12, 2013 4:47 PM

No, we're not getting any ping loss, that's the thing.  The network looks
entirely faultless.  We've run pings for 24 hours with no ping loss.

Yeah, I swore you said you had ping loss before - but if not - I don't think ping alone 
is sufficient.  You have to find the error counters on the LACP interfaces.  Everybody 
everywhere seems to blindly assume LACP works reliably, but to me, simply saying the term 
"LACP" is a red flag.  It's extremely temperamental, and the resultant behavior 
is exactly as you've described.



Re: [OpenIndiana-discuss] VMware

2013-08-13 Thread Doug Hughes
We have LACP working between Force10, HP, and Cisco switches in all possible 
combinations with no difficulties. We do monitor and alert on excessive errors 
and drops for interfaces, but LACP isn't the culprit; if anything, it's an 
underlying interface when we find them. Also, it beats the heck out of spanning 
tree, is two orders of magnitude simpler than OSPF, and one order simpler and 
more portable than ECMP. I am quite surprised by your observations.

Sent from my android device.

-Original Message-
From: "Edward Ned Harvey (openindiana)" 
To: Discussion list for OpenIndiana 
Sent: Tue, 13 Aug 2013 7:22 AM
Subject: Re: [OpenIndiana-discuss] VMware

> From: James Relph [mailto:ja...@themacplace.co.uk]
> Sent: Monday, August 12, 2013 4:47 PM
> 
> No, we're not getting any ping loss, that's the thing.  The network looks
> entirely faultless.  We've run pings for 24 hours with no ping loss.

Yeah, I swore you said you had ping loss before - but if not - I don't think 
ping alone is sufficient.  You have to find the error counters on the LACP 
interfaces.  Everybody everywhere seems to blindly assume LACP works reliably, 
but to me, simply saying the term "LACP" is a red flag.  It's extremely 
temperamental, and the resultant behavior is exactly as you've described.



Re: [OpenIndiana-discuss] VMware

2013-08-13 Thread Edward Ned Harvey (openindiana)
> From: James Relph [mailto:ja...@themacplace.co.uk]
> Sent: Monday, August 12, 2013 4:47 PM
> 
> No, we're not getting any ping loss, that's the thing.  The network looks
> entirely faultless.  We've run pings for 24 hours with no ping loss.

Yeah, I swore you said you had ping loss before - but if not - I don't think 
ping alone is sufficient.  You have to find the error counters on the LACP 
interfaces.  Everybody everywhere seems to blindly assume LACP works reliably, 
but to me, simply saying the term "LACP" is a red flag.  It's extremely 
temperamental, and the resultant behavior is exactly as you've described.



Re: [OpenIndiana-discuss] VMware

2013-08-12 Thread James Relph

> I think we found your smoking gun.  You're getting ping loss on a local 
> network, and you're using 4x 10Gb LACP bonded network.  And for some reason 
> you say "should be pretty solid."  What you've described is basically the 
> definition of unstable, if you ask me.

No, we're not getting any ping loss, that's the thing.  The network looks 
entirely faultless.  We've run pings for 24 hours with no ping loss.


> Before anything else, know this: in LACP, each data stream uses only one 
> network interface.  So if you have a server with LACP, each client can go up 
> to 10Gb, and with 4 clients running simultaneously each of them can still 
> reach 10Gb, but you cannot push 40Gb to a single client.

Each storage server has 5 clients.

> Also, your hard disks each deliver roughly 1Gbit/s, so every 10 disks you 
> have in the server add up to about one 10Gb network interface.  It is 
> absolutely pointless to use LACP in this situation unless you have a huge 
> honking server (meaning >40 disks).

They've got 38 disks.

> In my experience, LACP is usually unstable, unless you buy a really expensive 
> switch

The switches are pretty expensive; we've got Arista switches and SolarFlare 
NICs in the servers (well, the bond is across a SolarFlare NIC and an Intel 
NIC).

> and QA test the hell out of your configuration before using it.  I hear lots 
> of people say their LACP is stable and reliable where they are - but it's 
> only because they have never tested it and haven't noticed the problems.  The 
> problems are specifically as you've described.  Occasional packet loss, which 
> people tend to think is ok, but in reality, the only acceptable level of 
> packet loss is 0%.

Yep, 0% packet loss, sorry if I've mis-worded something somewhere, but 
definitely no dropped packets.

> 
> Figure out how to observe & clear the error counters on all the network 
> interfaces.  Login to the switch to measure them there ...  Login to the 
> server to measure them there ...  Login to each client to measure them there. 
>  Reset them all to 0.  And then start hammering the shit out of the whole 
> system.  Get all the clients to drive the network hard, both transmit and 
> receive.  If you see error counters increasing, you have a problem.


I'll double-check, but I'm pretty sure we reset the counters and witnessed no 
CRC errors over the test periods, even when hammering the system.

James.



Re: [OpenIndiana-discuss] VMware

2013-08-12 Thread Edward Ned Harvey (openindiana)
> -Original Message-
> From: James Relph [mailto:ja...@themacplace.co.uk]
> Sent: Sunday, August 11, 2013 10:59 AM
> 
> although would we lose pings
> with that (had pings running to test for a network issue and never had packet
> loss)?  It's a bit of a puzzler!

Hold on now ...


> From: James Relph [mailto:ja...@themacplace.co.uk]
> Sent: Sunday, August 11, 2013 12:59 PM
> 
>dedicated physical 10Gb network for iSCSI/NFS traffic, with 4x 10Gb
> links (in an LACP bond) per device.  Should be pretty solid really.

I think we found your smoking gun.  You're getting ping loss on a local 
network, and you're using 4x 10Gb LACP bonded network.  And for some reason you 
say "should be pretty solid."  What you've described is basically the 
definition of unstable, if you ask me.

Before anything else, know this: in LACP, each data stream uses only one 
network interface.  So if you have a server with LACP, each client can go up to 
10Gb, and with 4 clients running simultaneously each of them can still reach 
10Gb, but you cannot push 40Gb to a single client.

Also, your hard disks each deliver roughly 1Gbit/s, so every 10 disks you have 
in the server add up to about one 10Gb network interface.  It is absolutely 
pointless to use LACP in this situation unless you have a huge honking server 
(meaning >40 disks).

In my experience, LACP is usually unstable, unless you buy a really expensive 
switch and QA test the hell out of your configuration before using it.  I hear 
lots of people say their LACP is stable and reliable where they are - but it's 
only because they have never tested it and haven't noticed the problems.  The 
problems are specifically as you've described.  Occasional packet loss, which 
people tend to think is ok, but in reality, the only acceptable level of packet 
loss is 0%.

Here's what you need to do:

Figure out how to observe & clear the error counters on all the network 
interfaces.  Login to the switch to measure them there ...  Login to the server 
to measure them there ...  Login to each client to measure them there.  Reset 
them all to 0.  And then start hammering the shit out of the whole system.  Get 
all the clients to drive the network hard, both transmit and receive.  If you 
see error counters increasing, you have a problem.
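
On the OpenIndiana side, a rough sketch of collecting that baseline with 
dladm/kstat (the link name net0 and aggregation name aggr1 are placeholders; 
kstat counters can't simply be zeroed, so record a baseline and compare again 
after the load test, and use the switch vendor's own error counters on the 
switch side):

# Per-link packet and error counters, sampled every 10 seconds
dladm show-link -s -i 10 net0

# Per-port statistics for the aggregation
dladm show-aggr -s aggr1

# Raw kstat snapshot of error counters for a before/after comparison
kstat -p -c net | grep -i err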

Based on what you've said so far, I guarantee you're going to see error 
counters increasing.  Unless you ignore my advice and don't do these tests ... 
because these tests are difficult to do, and like I said, several times I've 
seen sysadmins swear their system was reliable, only to be proven wrong when 
*actually* put to the test.

I also encounter this a lot: on mailing lists exactly like this one, I say 
something just like the above, and other people come back and argue about it, 
*insisting* that it's OK to have occasional packet loss on a LAN or WAN.  I 
swear to you, as an IT consultant, this provides a lot of my sustenance - I get 
called into places with either storage problems or internet problems, and where 
there is packet loss >0%, that is ultimately the root cause of their problem.  
I've never seen an exception.

Because this advice invariably leads to argument, I won't respond to any of 
it; I've simply grown tired of arguing about it with other people elsewhere.  
It's definitely a recurring pattern.  The way I see it, I provide you free 
advice on a mailing list; if you don't take it, so be it.  I continue to get 
paid.



Re: [OpenIndiana-discuss] VMware

2013-08-11 Thread Bentley, Dain
Been using it as a datastore for both my VMware and Hyper-V clusters.  No 
complaints, great performance and reliability.  I'm using iSCSI. 

Sent from my iPhone

On Aug 10, 2013, at 6:12 AM, "James Relph"  wrote:

> 
> Hi all,
> 
> Is anybody using Oi as a data store for VMware using NFS or iSCSI?
> 
> Thanks,
> 
> James. 
> 
> Sent from my iPhone
> 
> 


Re: [OpenIndiana-discuss] VMware

2013-08-11 Thread Jim Klimov

On 2013-08-11 19:57, James Relph wrote:

If I recall correctly, you can set LACP parameters that determine how
fast the switch-over occurs between ports, the interval at which the
interfaces send LACP packets, and more.



I'll have to have a look, but the thing is that we were seeing these datastore 
drops while at the same time we were running pings showing no dropped packets 
and no significant network latency.  If it was an LACP issue (ports dropping 
etc.) causing iSCSI issues, wouldn't we see dropped packets at the same time?


I think it may depend on the hashing type you use in the LACP trunk.
Basically, in LACP, every logical connection uses one of the offered
links and maxes out at its link speed. When you have many connections
they can, on average, cover all links more or less equally, and sum
up to a larger bandwidth than one link. Selection of a link for a
particular connection can depend on several factors - for example,
L2-hashing like "sum up MAC addresses of DST and SRC, divide by the
number of links, use the remainder as the link number to use". Other
algorithms may bring IP addresses and port numbers into the mix, so
that, in particular, if there are only two hosts communicating over
a direct link, they still have chances to utilize all links.

So it is possible that in your case ICMP went over a working link
and data failed over a flaky link, for example. My gut-feeling in
this area would be that one of the links does not perform well, but
is not kicked out of the team (at least for a while), so connections
scheduled onto it are lost or at least lag. One idea is to verify
that LACP does not cause more trouble than benefit (perhaps by
leaving only one physical link active and trying to reproduce the
problem); a recent discussion on this subject suggested that maybe
independent (non-trunked) links and application-layer MPxIO to the
targets might be better than generic network-level aggregation.
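
As a toy illustration of that link-selection point (purely made-up numbers; 
real switches and the OS each use their own hash policy), the link a flow 
lands on is just a deterministic function of the hashed header fields:

#!/bin/sh
# Toy model of L2 hash-based link selection in a 4-link trunk.
# A given SRC/DST pair always maps to the same link, so ICMP between
# two hosts can keep riding a healthy link while another flow stays
# pinned to a flaky one.
SRC_MAC_SUM=26   # placeholder value derived from the source MAC
DST_MAC_SUM=43   # placeholder value derived from the destination MAC
NUM_LINKS=4
LINK=$(( (SRC_MAC_SUM + DST_MAC_SUM) % NUM_LINKS ))
echo "This address pair always hashes onto link $LINK of the trunk"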

HTH,
//Jim



Re: [OpenIndiana-discuss] VMware

2013-08-11 Thread Gary
On Sun, Aug 11, 2013 at 10:57 AM, James Relph  wrote:

> If it was an LACP issue (ports dropping etc.) causing iSCSI issues, wouldn't 
> we see dropped packets at the same time?

The protocol calls for strict ordering of packets so one would think
ICMP would be useful in troubleshooting this. I remember using ping to
test switch-over and didn't put anything live on it until I could see
no noticeable outage after tuning my parameters. So if you're certain
layer two isn't at fault, have you covered the other four? For
example, is your hypervisor fully patched?

-Gary



Re: [OpenIndiana-discuss] VMware

2013-08-11 Thread James Relph
> If I recall correctly, you can set LACP parameters that determine how
> fast the switch-over occurs between ports, the interval at which the
> interfaces send LACP packets, and more. These can be set on either the
> OS or switch side depending on the vendor. So if you've determined
> that there is nothing wrong at either the physical layer or network
> and above, then the link layer is your most likely culprit. Applying
> the process of elimination or some other methodology is most advisable
> for these types of troubleshooting situations.

I'll have to have a look, but the thing is that we were seeing these datastore 
drops while at the same time we were running pings showing no dropped packets 
and no significant network latency.  If it was an LACP issue (ports dropping 
etc.) causing iSCSI issues, wouldn't we see dropped packets at the same time?

Thanks,

James.



Re: [OpenIndiana-discuss] VMware

2013-08-11 Thread Gary Driggs
On Aug 11, 2013, at 9:59 AM, James Relph  wrote:

> Nope, dedicated physical 10Gb network for iSCSI/NFS traffic, with 4x 10Gb 
> links (in an LACP bond) per device.  Should be pretty solid really.

If I recall correctly, you can set LACP parameters that determine how
fast the switch-over occurs between ports, the interval at which the
interfaces send LACP packets, and more. These can be set on either the
OS or switch side depending on the vendor. So if you've determined
that there is nothing wrong at either the physical layer or network
and above, then the link layer is your most likely culprit. Applying
the process of elimination or some other methodology is most advisable
for these types of troubleshooting situations.
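
On the OpenIndiana side these knobs live in dladm; a sketch, assuming the 
aggregation is called aggr1 (check the switch vendor's equivalents as well):

# Show the current LACP mode, timer and per-port state for the trunk
dladm show-aggr -L aggr1

# Example: active LACP with the short (1-second) timer
dladm modify-aggr -L active -T short aggr1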

-Gary



Re: [OpenIndiana-discuss] VMware

2013-08-11 Thread James Relph
> Also, does your host use ipfilter to filter and/or NAT access to the
> iSCSI and NFS services? 

Nope, dedicated physical 10Gb network for iSCSI/NFS traffic, with 4x 10Gb links 
(in an LACP bond) per device.  Should be pretty solid really.

Thanks,

James.



Re: [OpenIndiana-discuss] VMware

2013-08-11 Thread Jim Klimov

On 2013-08-11 16:59, James Relph wrote:


I'll pass that on to someone actually, thanks, although would we lose pings 
with that (had pings running to test for a network issue and never had packet 
loss)?  It's a bit of a puzzler!


Also, does your host use ipfilter to filter and/or NAT access to the
iSCSI and NFS services? It might be that you run out of "buckets"
needed to track sessions. I am not sure what the defaults are now,
but I remember needing to bump them a lot on an OpenSolaris SXCE 129
firewall.

There was this patch to /lib/svc/method/ipfilter :

configure_firewall()
{
create_global_rules || exit $SMF_EXIT_ERR_CONFIG
create_global_ovr_rules || exit $SMF_EXIT_ERR_CONFIG
create_services_rules || exit $SMF_EXIT_ERR_CONFIG

[ ! -f ${IPFILCONF} -a ! -f ${IPNATCONF} ] && exit 0

### Enforce and display state-table sizing
### Jim Klimov, 2009-2010
ipf -D -T fr_statemax=72901,fr_statesize=104147,fr_statemax,fr_statesize -E -T fr_statemax,fr_statesize

# ipf -E

load_ippool || exit $SMF_EXIT_ERR_CONFIG
load_ipf || exit $SMF_EXIT_ERR_CONFIG
load_ipnat || exit $SMF_EXIT_ERR_CONFIG
}


Again, I have no idea if any of this (the fr_* line) is needed on today's
systems; the defaults in SXCE were pretty much too low, as contemporary
blogs and forums helpfully pointed out...
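
For a quick check of whether the state table is actually the limit on a 
current system, a sketch using the same tunables as the patch above (not 
verified against every ipfilter version):

# Query the current state-table tunables
ipf -T fr_statemax,fr_statesize

# Live state-table statistics; watch the maximum/loss counters
ipfstat -s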

HTH,
//Jim Klimov




Re: [OpenIndiana-discuss] VMware

2013-08-11 Thread James Relph

I'll pass that on to someone actually, thanks, although would we lose pings 
with that (had pings running to test for a network issue and never had packet 
loss)?  It's a bit of a puzzler!

James. 

Sent from my iPhone

> On 11 Aug 2013, at 10:43, "Jim Klimov"  wrote:
> 
>> On 2013-08-11 11:13, James Relph wrote:
>> Hi Ed, Chip,
>> 
>> Thanks for the responses, it was basically to see whether people had been 
>> having any compatibility issues with Oi as backend storage.  We've seen 
>> datastore disconnects in the ESXi hosts over both iSCSI and NFS, and it 
>> seemed odd that there'd be the same problems across both protocols.  Didn't 
>> really show up in testing and I've seen other people  running this kind of 
>> setup without issue, so it was really a question to see if there were any 
>> other people seeing the same thing.  At the same time as the hosts were 
>> seeing disconnects we had other machines using the same iSCSI targets 
>> without any errors at all, so it is all a bit odd.
> 
> Maybe something with networking? Like trunked connections and some
> links going down (temporarily) and hash-routed packets to them are
> not delivered properly (until the failure is detected or the link comes
> back up)? Possibly, if a master (first) interface on an aggregation
> becomes lost, there may also be fun with MAC address changes...
> 
> Wild shots in the dark, though not completely without practical basis ;)
> 
> HTH,
> //Jim
> 


Re: [OpenIndiana-discuss] VMware

2013-08-11 Thread Jim Klimov

On 2013-08-11 11:13, James Relph wrote:

Hi Ed, Chip,

Thanks for the responses, it was basically to see whether people had been 
having any compatibility issues with Oi as backend storage.  We've seen 
datastore disconnects in the ESXi hosts over both iSCSI and NFS, and it seemed 
odd that there'd be the same problems across both protocols.  Didn't really 
show up in testing and I've seen other people  running this kind of setup 
without issue, so it was really a question to see if there were any other 
people seeing the same thing.  At the same time as the hosts were seeing 
disconnects we had other machines using the same iSCSI targets without any 
errors at all, so it is all a bit odd.


Maybe something with networking? Like trunked connections and some
links going down (temporarily) and hash-routed packets to them are
not delivered properly (until the failure is detected or the link comes
back up)? Possibly, if a master (first) interface on an aggregation
becomes lost, there may also be fun with MAC address changes...

Wild shots in the dark, though not completely without practical basis ;)

HTH,
//Jim



Re: [OpenIndiana-discuss] VMware

2013-08-11 Thread James Relph
Hi Ed, Chip,

Thanks for the responses, it was basically to see whether people had been 
having any compatibility issues with Oi as backend storage.  We've seen 
datastore disconnects in the ESXi hosts over both iSCSI and NFS, and it seemed 
odd that there'd be the same problems across both protocols.  Didn't really 
show up in testing and I've seen other people  running this kind of setup 
without issue, so it was really a question to see if there were any other 
people seeing the same thing.  At the same time as the hosts were seeing 
disconnects we had other machines using the same iSCSI targets without any 
errors at all, so it is all a bit odd.

Thanks,

James


On 10 Aug 2013, at 14:32, Edward Ned Harvey (openindiana) 
 wrote:

>> From: James Relph [mailto:ja...@themacplace.co.uk]
>> Sent: Saturday, August 10, 2013 6:12 AM
>> 
>> Is anybody using Oi as a data store for VMware using NFS or iSCSI?
> 
> I have done both.  What do you want to know?
> 
> I couldn't measure any performance difference nfs vs iscsi.  Theoretically, 
> iscsi should be more reliable, by default setting the refreservation and 
> supposedly guaranteeing there will always be disk space available for writes, 
> but I haven't found that to be reality.  I have bumped into full disk 
> problems with iscsi just as much as nfs, so it's important to simply monitor 
> and manage intelligently.  And the comstar stuff seems to be kind of 
> unreliable, not to mention confusing.  NFS seems to be considerably easier to 
> manage.  So I would recommend NFS rather than iscsi.
> 


Re: [OpenIndiana-discuss] VMware

2013-08-10 Thread Edward Ned Harvey (openindiana)
> From: James Relph [mailto:ja...@themacplace.co.uk]
> Sent: Saturday, August 10, 2013 6:12 AM
> 
> Is anybody using Oi as a data store for VMware using NFS or iSCSI?

I have done both.  What do you want to know?

I couldn't measure any performance difference nfs vs iscsi.  Theoretically, 
iscsi should be more reliable, by default setting the refreservation and 
supposedly guaranteeing there will always be disk space available for writes, 
but I haven't found that to be reality.  I have bumped into full disk problems 
with iscsi just as much as nfs, so it's important to simply monitor and manage 
intelligently.  And the comstar stuff seems to be kind of unreliable, not to 
mention confusing.  NFS seems to be considerably easier to manage.  So I would 
recommend NFS rather than iscsi.
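
To make the refreservation point concrete, a minimal sketch of the two styles 
of backing store (pool and dataset names are placeholders, and the COMSTAR/NFS 
sharing steps are abbreviated):

# iSCSI-style backing: a zvol gets a refreservation equal to its size by
# default, which is what is supposed to guarantee space for writes
zfs create -V 500G tank/esx-lun0
zfs get volsize,refreservation tank/esx-lun0
stmfadm create-lu /dev/zvol/rdsk/tank/esx-lun0

# NFS-style backing: a plain filesystem shared out to the ESXi hosts
zfs create -o sharenfs=on tank/esx-nfs0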



Re: [OpenIndiana-discuss] VMware

2013-08-10 Thread Schweiss, Chip
I use it to back 250 VMs on 10 ESXi hosts via NFS: 40 NL-SAS spindles,
1.4TB L2ARC, 2 ZIL SSDs, 72GB RAM.  Couldn't ask for a better platform for
VM storage.

-Chip

On Sat, Aug 10, 2013 at 5:11 AM, James Relph wrote:

>
> Hi all,
>
> Is anybody using Oi as a data store for VMware using NFS or iSCSI?
>
> Thanks,
>
> James.
>
> Sent from my iPhone
>
>


Re: [OpenIndiana-discuss] VMWare ESX experiences

2010-10-08 Thread russell

 Hi Markus,

I have been using VMware v4.0.0 at work and had stability issues with some 
virtual appliances until I applied all the latest patches. It might be 
worth applying them if you have not already.



