-- Warning - threadjack in progress. But I think it might be related. --

Interesting. Could this be similar to the case that I'm seeing with "no space left on device"? Here's my uneducated assumption:
- at this point I believe some form of I/O error or interrupt causes ocfs2 to error out
- the "remount read only" effect silently kicks in (no log message though)
- now file operations return "no space left on device", but my device is showing 2% use
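
One way to check the second and third points from userspace is to ask the kernel for the mount's flags and usage directly. This is only a rough sketch, not an ocfs2-specific tool: fs_status is a made-up helper name, and it assumes Python 3 on Linux, where os.statvfs exposes the read-only flag plus block and inode counts.

```python
import os

def fs_status(path):
    """Summarize read-only state and block/inode usage for the filesystem holding path.

    fs_status is a hypothetical helper for illustration, not part of any ocfs2 tooling.
    """
    st = os.statvfs(path)

    def used_pct(total, free):
        # Some filesystems report a total of 0; avoid dividing by zero.
        if total == 0:
            return None
        return round(100.0 * (total - free) / total, 1)

    return {
        # ST_RDONLY is set when the filesystem is currently mounted read-only.
        "read_only": bool(st.f_flag & os.ST_RDONLY),
        "block_use_pct": used_pct(st.f_blocks, st.f_bfree),
        "inode_use_pct": used_pct(st.f_files, st.f_ffree),
    }
```

Calling fs_status on the ocfs2 mount point when the errors start would show whether the volume has gone read-only, and whether the "no space" is about blocks or inodes rather than what df reports. Whether ocfs2 reflects its internal state in these numbers is an assumption on my part.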

The reasons I imagine a correlation between these two are:
Tao says:
> The ERESTARTSYS may happen when we get interrupted from ocfs2_cluster_lock.
> I met with it when I rm -rf a very large dir and use "ctrl+c" to stop it
> when I tested bug 1162.

There are also a fair number of posts over in the kernel lists talking about qlogic driver issues (qla2xxx) relating to PCI MSIs causing hangs under moderate I/O load.
e.g.: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/268242

The rm -rf on a very large dir would presumably do this. I also hit this when I do a local rsync or untar of very large directory trees. EMC also recommends the qlogic HBA be set to interrupt after every I/O completion. Could this cause a race condition? All of these have interrupts in common. Think setting nointr as a mount option would help here?
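
Before testing that, it helps to confirm which options a volume is actually mounted with. Here is a minimal sketch that parses /proc/mounts-style text; mount_options and has_nointr are hypothetical helper names, and it is an assumption that ocfs2 lists nointr in that file once the option is set.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text, or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field order: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last match, since a later mount shadows an earlier one.
            found = fields[3].split(",")
    return found

def has_nointr(mounts_text, mount_point):
    """True if mount_point is listed and its options include 'nointr'."""
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and "nointr" in opts
```

On a live system the text would come from reading /proc/mounts, e.g. has_nointr(open("/proc/mounts").read(), "/mnt/ocfs2"), where /mnt/ocfs2 is just a placeholder mount point.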

Best,
James

Joel Becker wrote:
On Mon, Aug 31, 2009 at 12:39:02PM -0700, Joel Becker wrote:
On Mon, Aug 31, 2009 at 12:16:36PM -0700, Joel Becker wrote:
5441  open("t/t6015-rev-list-show-all-parents.sh", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0777) = ? ERESTARTSYS (To be restarted)

        The workaround is to mount with the 'nointr' option.

Joel

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
