Hi Michael,
On 18.02.16 at 08:17, Michael Talbott wrote:
While I don't have a setup like the one you've described, I'm going to take a
wild guess and say check the ARP tables on your switches (and servers). Perhaps
the switch isn't updating the entry for your VIP address with the other
server's MAC address fast enough. Maybe, as part of the failover script, send a
command to your switch to update the ARP entry or clear its ARP table. Another,
perhaps simpler, solution or diagnostic would be to record the output of a ping
from the server to your router via the VIP interface and address right after
the failover, to try to tickle the switch into updating its MAC table. It's
also possible that the clients need an ARP flush too.
If this is the case, another possibility is to have both servers spoof the
same MAC address, with only one of them ever up at a time, controlled by the
failover script (or bad things will happen).
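The ARP refresh suggested above could also be done from the server side by
announcing the VIP with a gratuitous ARP right after failover. The following is
only a rough sketch under assumptions: the function names, the example MAC and
the interface name are hypothetical, and the send helper relies on Linux
AF_PACKET sockets and root privileges, so it would need adapting on OmniOS.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for `ip`."""
    hw = bytes.fromhex(mac.replace(":", ""))
    if len(hw) != 6:
        raise ValueError("MAC address must be 6 bytes")
    addr = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP).
    eth = b"\xff" * 6 + hw + struct.pack("!H", 0x0806)

    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)

    # Sender and target IP are both the VIP; that is what makes it gratuitous.
    arp += hw + addr + b"\x00" * 6 + addr
    return eth + arp


def send_frame(frame: bytes, ifname: str) -> None:
    """Send a raw frame. Assumption: Linux AF_PACKET, run as root."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as s:
        s.bind((ifname, 0))
        s.send(frame)
```

A failover script might call something like
send_frame(build_gratuitous_arp(vip_mac, vip_ip), "eth0") once the VIP is up,
where vip_mac, vip_ip and the interface name are placeholders for the real
values.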
Just a thought.
Michael
On Feb 17, 2016, at 10:13 PM, Stephan Budach <stephan.bud...@jvm.de> wrote:
Hi,
I have been test driving RSF-1 for the last week to accomplish the following:
- cluster a zpool that is made up of 8 mirrored vdevs, which are based on 8
x 2 SSD mirrors served via iSCSI from another OmniOS box
- export an NFS share from the above zpool via a VIP
- have RSF-1 provide the failover and VIP moving
- use the NFS share as a repository for my Oracle VM guests and vdisks
The setup seems to work fine, but I do have one issue that I can't seem to solve.
Whenever I fail over the zpool, any in-flight NFS data is stalled for an
unpredictable time. Sometimes it takes not much longer than the "move" time of
the resources, but sometimes it takes up to 5 minutes until the NFS client on
my VM server comes alive again.
So, when I issue a simple ls -l on the folder of the vdisks while the
switchover is happening, the command sometimes concludes in 18 to 20 seconds,
but sometimes ls will just sit there for minutes.
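To make the ls -l observation repeatable, the stall could be logged with a
small timing loop run on the VM server during a failover. This is a minimal
sketch, not a tested tool: the function names are hypothetical, and it assumes
that a listing on the NFS mount simply blocks until the server answers, as ls
did here.

```python
import os
import time


def time_listing(path: str):
    """List `path` and stat every entry, roughly what `ls -l` does.

    Returns (number_of_entries, elapsed_seconds).
    """
    start = time.monotonic()
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            entry.stat(follow_symlinks=False)  # forces a per-entry attribute lookup
            count += 1
    return count, time.monotonic() - start


def watch(path: str, interval: float = 1.0, rounds: int = 300) -> None:
    """Print one timing line per round; start it shortly before the failover."""
    for _ in range(rounds):
        count, elapsed = time_listing(path)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} {count} entries in {elapsed:.2f}s")
        time.sleep(interval)
```

Calling watch() on the vdisk folder and comparing the timestamps with the
RSF-1 log should show how long the client stalls beyond the actual move time.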
I wonder if there's anything I could do about that. I have already played
with several timeouts, NFS-wise and TCP-wise, but nothing seems to have any
effect on this issue. Does anyone know some tricks to speed up the in-flight
data?
Thanks,
Stephan
I don't think that the switches are the problem: when I ping the VIP from
the VM host (OL6-based), the ping only ceases for the time it takes RSF-1
to move the services, and afterwards the pings continue normally. The only
thing I wonder is whether it's more of an NFS or a TCP-in-general thing.
Maybe I should also test some other IP protocol to see whether that one
stalls for that long as well.
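One way to run that comparison would be to probe a TCP port on the VIP in a
loop during the failover and measure the longest gap, next to the ICMP ping.
The sketch below uses hypothetical function names and assumes NFS over TCP on
port 2049. Note that it opens a fresh connection for every probe, so it tests
reachability of the port rather than the state of the long-lived NFS
connection.

```python
import socket
import time


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a new TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def collect(host: str, port: int = 2049, rounds: int = 300, interval: float = 1.0):
    """Probe repeatedly and return a list of (timestamp, ok) samples."""
    samples = []
    for _ in range(rounds):
        samples.append((time.monotonic(), tcp_probe(host, port)))
        time.sleep(interval)
    return samples


def longest_outage(samples) -> float:
    """Longest span from a first failed probe to the next successful one.

    An outage that is still ongoing at the end of the samples is not counted.
    """
    longest = 0.0
    down_since = None
    for ts, ok in samples:
        if not ok and down_since is None:
            down_since = ts
        elif ok and down_since is not None:
            longest = max(longest, ts - down_since)
            down_since = None
    return longest
```

If longest_outage(collect(vip)) stays close to the RSF-1 move time while the
NFS client still hangs for minutes, that would point towards the existing NFS
connection rather than TCP in general; vip is a placeholder for the real
address.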
Cheers,
Stephan
_______________________________________________
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss