On Jul 21, 2014, at 8:57 AM, Ulrich Windl wrote:

> Charles Taylor <chas...@ufl.edu> wrote on 17.07.2014 at 17:24 in message
> <761ce39a-57d8-47d2-860d-2af1936cc...@ufl.edu>:
>> I feel like this is something that must have been covered extensively
>> already, but I've done a lot of googling and looked at a lot of cluster
>> configs and have not found the solution.
>>
>> I have an HA NFS cluster (corosync+pacemaker). The relevant rpms are listed
>> below, but I'm not sure they are that important to the question, which is
>> this...
>>
>> When performing managed failovers of the NFS-exported file system resource
>> from one node to the other (crm resource move), any active NFS clients
>> experience an I/O error when the file system is unexported. In other words,
>> you must unexport it to unmount it. As soon as it is unexported, clients
>> are no longer able to write to it and experience an I/O error (rather than
>> just blocking).
>
> Do you hard-mount or soft-mount NFS? Do you use NFSv3 or NFSv4?

Hard mounts. We support both NFSv3 and NFSv4 mounts. I tested both and the
behavior was the same. There seemed to be no way to avoid an I/O error on the
clients when unmounting the file system as part of a managed (crm resource
move) failover. I'm wondering if this is expected or if there is some way
around it that I'm simply missing. We'd like to be able to "move" resources
back and forth among the servers for maintenance without disrupting client
I/O.

Just to summarize: the Filesystem agent must unmount the volume to migrate
it. To unmount successfully, the volume must first be unexported. As soon as
the exportfs agent runs its "stop" operation, any clients actively doing I/O
are interrupted and error out rather than blocking as they would if the
server went down. So far, I've been unable to find a way around this.

As I write this, I'm thinking that perhaps the way to achieve this is to
change the order of the services so that the VIP is started last and stopped
first when starting/stopping the resource group. That should make it appear
to the client that the server just "went away", as would happen in a failure
scenario. The client should then not know that the file system has been
unexported, since it can't talk to the server.

Perhaps I just made a rookie mistake in the ordering of the services within
the resource group. I'll try that and report back.

Regards,

Charlie
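
A minimal sketch of the reordering proposed above, in crm shell syntax. The
resource names (p_fs, p_exportfs, p_vip, g_nfs) are hypothetical and do not
come from the original configuration. Pacemaker starts the members of a group
in the order listed and stops them in reverse, so listing the VIP last means
it is started last and stopped first. Whether this avoids the client I/O
errors is still to be tested.

```
# Hypothetical resource names; the three primitives are assumed to be
# defined already (Filesystem, exportfs and IP address resources).
# Group members start left to right and stop right to left,
# so p_vip is started last and stopped first.
group g_nfs p_fs p_exportfs p_vip
```

With this ordering, a "crm resource move" of the group would take the VIP
down before the exportfs stop operation runs.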