I have a number of systems with an iscsi root filesystem. These systems connect to a redundant pair of iscsi servers running tgtd. I use heartbeat to fail over the iscsi target. I'm using open-iscsi 869. I tried both the iscsi transport from 869 and the default CentOS one, 724. The iscsid used was always 869.
I've set the replacement timeout high, so the iscsi root system should be able to recover from the short outage when the iscsi target fails over to the other server:

    node.session.timeo.replacement_timeout = 86400

Unfortunately, this doesn't always work. Sometimes the OS reports filesystem errors and remounts the fs read-only. A short time later the iscsi targets are reconnected, but the filesystem is already read-only by then.

The logs show (the default iscsi transport 724 was used for this test):

Apr 21 11:35:25 front003 kernel: end_request: I/O error, dev sda, sector 1336006
Apr 21 11:35:25 front003 kernel: end_request: I/O error, dev sda, sector 1336006
Apr 21 11:35:25 front003 kernel: Buffer I/O error on device sda1, logical block 166993
Apr 21 11:35:25 front003 kernel: Buffer I/O error on device sda1, logical block 166993
<more disk errors>
Apr 21 11:35:26 front003 kernel: ext3_abort called.
Apr 21 11:35:26 front003 kernel: ext3_abort called.
Apr 21 11:35:26 front003 kernel: EXT3-fs error (device sda1): ext3_journal_start_sb: Detected aborted journal
Apr 21 11:35:26 front003 kernel: EXT3-fs error (device sda1): ext3_journal_start_sb: Detected aborted journal
Apr 21 11:35:26 front003 kernel: Remounting filesystem read-only
Apr 21 11:35:26 front003 kernel: Remounting filesystem read-only
Apr 21 11:35:36 front003 kernel: connection1:0: iscsi: detected conn error (1011)
Apr 21 11:35:36 front003 kernel: connection1:0: iscsi: detected conn error (1011)
Apr 21 11:35:36 front003 iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
Apr 21 11:35:40 front003 kernel: connection5:0: iscsi: detected conn error (1011)
Apr 21 11:35:40 front003 kernel: connection5:0: iscsi: detected conn error (1011)
Apr 21 11:35:40 front003 kernel: connection8:0: iscsi: detected conn error (1011)
Apr 21 11:35:40 front003 kernel: connection8:0: iscsi: detected conn error (1011)
Apr 21 11:35:40 front003 iscsid: received iferror -38
Apr 21 11:35:40 front003 iscsid: received iferror -38
Apr 21 11:35:40 front003 iscsid: received iferror -38
Apr 21 11:35:40 front003 iscsid: received iferror -38
Apr 21 11:35:40 front003 iscsid: received iferror -38
Apr 21 11:35:40 front003 iscsid: connection1:0 is operational after recovery (1 attempts)

Is there any way to prevent this, so that an iscsi root system can recover gracefully from a short outage?

Niels
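
To make the timing in the log excerpt explicit, here is a minimal Python sketch. It is purely illustrative: the function names are made up, and the year 2000 is a placeholder, since syslog timestamps carry no year. It takes four lines copied from the log above and reports each event's offset from the first I/O error.

```python
from datetime import datetime

# Four lines copied verbatim from the log excerpt in this post.
LOG = """\
Apr 21 11:35:25 front003 kernel: end_request: I/O error, dev sda, sector 1336006
Apr 21 11:35:26 front003 kernel: Remounting filesystem read-only
Apr 21 11:35:36 front003 kernel: connection1:0: iscsi: detected conn error (1011)
Apr 21 11:35:40 front003 iscsid: connection1:0 is operational after recovery (1 attempts)
"""

# Event label -> substring that identifies the event in a log line.
MARKERS = {
    "first I/O error": "I/O error",
    "remounted read-only": "Remounting filesystem read-only",
    "conn error detected": "detected conn error",
    "session recovered": "is operational after recovery",
}


def parse_time(line, year=2000):
    """Parse the 'Mon DD HH:MM:SS' prefix; the year is an arbitrary placeholder."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def first_seen(log, needle):
    """Return the timestamp of the first line containing needle, or None."""
    for line in log.splitlines():
        if needle in line:
            return parse_time(line)
    return None


def offsets(log):
    """Seconds from the first I/O error to the first occurrence of each event."""
    start = first_seen(log, MARKERS["first I/O error"])
    result = {}
    for label, needle in MARKERS.items():
        seen = first_seen(log, needle)
        if start is not None and seen is not None:
            result[label] = (seen - start).total_seconds()
    return result


if __name__ == "__main__":
    for label, seconds in offsets(LOG).items():
        print(f"{label:22s} +{seconds:.0f}s")
```

On this excerpt the read-only remount comes 1 second after the first I/O error, while the first connection error is only detected 11 seconds after it, so the filesystem had already been remounted before the initiator reported any connection problem.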
