Hi!
Got a problem here a couple of days ago when I ran a snapshot stream over fibre channel from my home/business/devel server to the clone backup server.

Systems: OmniOS 5.11 omnios-r151018-95eaa7e on both systems, initiator on one and target on the other. Same hardware as well: Dell Precision workstations with dual six-core Xeons, 96 GB registered RAM, an Intel quad-port GbE NIC, and QLogic QLE2462 HBAs. One LUN is provisioned from the target/backup system to the initiator system as a backup LUN, and that backup LUN is configured as a zpool on the initiator system. I should also say that I run this FC connection point-to-point, no switch involved, over a single fibre pair, 10 m.

I did a zfs send/recv of a snapshot, around 67 GB, and I thought it took a long time. Then the initiator system crashed and dumped. It rebooted, and I got into it again without any trouble. What I immediately saw was that the zpool "backpool" that was backed by the FC LUN was no longer present. I ran a zpool import, and it was back again. I did another test with a much smaller snapshot, around 450 MB, and that worked fine, although I thought it also took a lot of time. I then tried the bigger snapshot once more, and the same thing happened: the system crashed and dumped. I have the two dump files, but I don't know whether this might be a problem on the target side or the initiator side. I can provide access to the dump files.
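For reference, the sequence I ran was essentially the following (the source pool/dataset names here are just placeholders, "backpool" is the real pool name):

```shell
# After the reboot the FC-backed pool was gone; this brought it back:
zpool import backpool

# The transfer that triggered the crash, roughly (source names are made up):
zfs send tank/data@snap-67g | zfs recv -F backpool/data
```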
Here is some information from the two systems that I find interesting.

The initiator system, omni:

  root@omni:/var/log# dmesg | grep qlc
  Oct 2 18:33:08 omni qlc: [ID 439991 kern.info] NOTICE: Qlogic qlc(0,0): Loop OFFLINE

  root@omni:/# dmesg | grep scsi
  Oct 2 18:34:58 omni scsi: [ID 243001 kern.info] /pci@19,0/pci8086,3410@9/pci1077,138@0/fp@0,0 (fcp0):
  Oct 2 18:34:58 omni genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f0c648ae73000057ef6d370001 (sd5) offline
  Oct 2 18:34:58 omni genunix: [ID 483743 kern.info] /scsi_vhci/disk@g600144f0c648ae73000057ef6d370001 (sd5) multipath status: failed: path 4 fp0/disk@w2101001b32a19a92,0 is offline

  root@omni:/# dmesg | grep multipath
  Oct 2 18:34:58 omni genunix: [ID 483743 kern.info] /scsi_vhci/disk@g600144f0c648ae73000057ef6d370001 (sd5) multipath status: failed: path 4 fp0/disk@w2101001b32a19a92,0 is offline

As you can see, the loop is marked offline at the time of the crash. But notably strange: there is also a message about a failed multipath path. Why is that? There is no multipathing configured here.

The target system, omni2:

  root@omni2:/root# grep stmf /var/adm/messages
  Oct 2 09:56:37 omni2 pseudo: [ID 129642 kern.info] pseudo-device: stmf_sbd0
  Oct 2 09:56:37 omni2 genunix: [ID 936769 kern.info] stmf_sbd0 is /pseudo/stmf_sbd@0
  Oct 2 09:56:46 omni2 pseudo: [ID 129642 kern.info] pseudo-device: stmf0
  Oct 2 09:56:46 omni2 genunix: [ID 936769 kern.info] stmf0 is /pseudo/stmf@0
  Oct 2 09:57:31 omni2 pseudo: [ID 129642 kern.info] pseudo-device: stmf0
  Oct 2 09:57:31 omni2 genunix: [ID 936769 kern.info] stmf0 is /pseudo/stmf@0
  Oct 2 09:57:31 omni2 stmf_sbd: [ID 690249 kern.warning] WARNING: ioctl(DKIOCINFO) failed 25

There is this warning, ioctl(DKIOCINFO) failed 25, that I have tried to find out more about, but without success. Perhaps it is simply that the FC connection isn't good enough. The cable shouldn't be a problem, since it is brand new, but it could of course be something with the HBAs.
I could get another cable for doing multipath, and see how that works, but let's start with this first.

Best regards,
Johan Kragsterman
Capvert

_______________________________________________
OmniOS-discuss mailing list
[email protected]
http://lists.omniti.com/mailman/listinfo/omnios-discuss
