Thanks for the fix. We got it via IDR30 for opensolaris SU7. Unfortunately, it does not help: because it retries the mount only once, there is still a high chance that the retry fails too, and the fix does not cover that case. I wrote a workaround myself as an SGE prolog script, and there I observed that we have to retry up to five times before the mount succeeds on a moderately busy server. The statistics now show about 5% failures when running SGE jobs on 15 clients with ~120 processes in total (with the NFS parameters described above). Since we are about to activate another 48 clients, this will get worse soon.
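To illustrate the kind of bounded retry described above, here is a minimal sketch in Python. It is not the actual prolog script. The function name, the default of five attempts, the half-second delay, and the use of os.stat() to trigger the mount are all assumptions made for the example.

```python
import errno
import os
import time


def wait_for_mount(path, attempts=5, delay=0.5, probe=os.stat, sleep=time.sleep):
    """Access `path` to trigger the mount, retrying while it fails with EBUSY.

    Returns the number of attempts used. Any error other than EBUSY is
    raised immediately; EBUSY is re-raised once all attempts are used up.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            probe(path)          # touching the path is what triggers the mount
            return attempt
        except OSError as exc:
            # Give up on unrelated errors, or when the retry budget is spent.
            if exc.errno != errno.EBUSY or attempt == attempts:
                raise
            sleep(delay)         # short pause before the next try
```

A prolog could call something like wait_for_mount("/some/nfs/path") before the job starts and let the job fail only if EBUSY persists through every attempt. This only hides the symptom in userland, as the workaround does; it does not address whatever triggers EBUSY.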
It seems there should be a better fix that hides the EBUSY condition from userland, maybe a 'while' loop with a short sleep? Or a fix for the problem that triggers the EBUSY condition in the first place?
-- This message posted from opensolaris.org