After a while I was able to track down the problem.
During the boot process the service filesystem/local gets enabled long before 
iscsi/initiator. The start method of filesystem/local mounts ufs, swap and 
everything else from /etc/vfstab.
Some recent patch added a "zfs mount -a" to the filesystem/local start method.
Of course "zfs mount -a" will not find our iSCSI zpools and should do nothing 
at all. But in our case it finds and imports a zpool "data" located on a 
locally attached disk.
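
If you want to check whether your copy of the start method got this addition, 
something along these lines should show it (I'm assuming the stock Solaris 
location /lib/svc/method/fs-local here; adjust if your release puts the method 
script elsewhere):

  grep "zfs mount" /lib/svc/method/fs-local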

I suspect that the zpool management information stored within zpool "data" 
contains a pointer to the (at that time) inaccessible zpool "iscsi1", which 
gets marked as "FAULTED".
Accordingly, "zpool list" shows that missing zpool as "FAULTED", since all 
devices of iscsi1 are inaccessible.
Some time later in the boot process iscsi/initiator gets enabled. After all 
iSCSI targets come online, one would expect the zpool to change state, because 
all of its devices are accessible now. But that does not happen.

As a workaround I've written a service manifest and start method. The service 
gets fired up after iscsi/initiator. The start method scans the output of 
"zpool list" for faulted pools with a size of "-". For each faulted zpool it 
does a "zpool export pool-name" followed by a re-import with "zpool import 
pool-name" (a sketch follows below).
After that the previously faulted iSCSI zpool will be online again (unless you 
have other problems).

But be aware, there is a race condition: you have to wait some time (at least 
10 seconds) between the export and re-import of a zpool. Without a wait in 
between, the fault condition will not get cleared.
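
Putting that together, here is a rough sketch of what such a start method can 
look like. The column parsing via "zpool list -H -o name,size,health" and the 
10 second sleep are my assumptions / what worked here; it is not meant as a 
polished SMF method script:

  #!/bin/sh
  # Sketch only: re-import iSCSI zpools that came up FAULTED because
  # filesystem/local imported a local pool before iscsi/initiator was online.

  WAIT=10   # seconds between export and re-import; shorter waits did not
            # clear the fault condition for us

  # Scripted output: one pool per line as "name<TAB>size<TAB>health"
  zpool list -H -o name,size,health 2>/dev/null |
  while read name size health; do
      if [ "$size" = "-" ] && [ "$health" = "FAULTED" ]; then
          echo "re-importing faulted pool $name"
          zpool export "$name" && sleep "$WAIT" && zpool import "$name"
      fi
  done

  exit 0

The corresponding manifest only needs to declare a dependency on 
iscsi/initiator so that SMF starts the script after the initiator is online.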

Again: you may encounter this case only if...
... you're running some recent kernel patch level (in our case 142909-17)
... *and* you have placed zpools on both iscsi and non-iscsi devices.

- Andreas