Ok, after doing a lot more testing of this I've found it's not the Supermicro 
controller causing problems.  It's purely ZFS, and it causes some major 
problems!  I've even found one scenario that appears to cause huge data loss 
without any warning from ZFS - up to 30,000 files and 100MB of data missing 
after a reboot, with ZFS reporting that the pool is OK.

***********************************************************************
1. Solaris handles USB and SATA hot plug fine

If disks are not in use by ZFS, you can unplug USB or SATA devices and cfgadm 
will recognise the disconnection.  USB devices are recognised automatically as 
you reconnect them; SATA devices need reconfiguring.  Cfgadm even recognises 
the SATA device as an empty bay:

# cfgadm
Ap_Id                          Type         Receptacle   Occupant     Condition
sata1/7                        sata-port    empty        unconfigured ok
usb1/3                         unknown      empty        unconfigured ok

-- insert devices --

# cfgadm
Ap_Id                          Type         Receptacle   Occupant     Condition
sata1/7                        disk         connected    unconfigured unknown
usb1/3                         usb-storage  connected    configured   ok

To bring the SATA drive online it's just a case of running:
# cfgadm -c configure sata1/7
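The decision above (port connected but unconfigured, so run configure) can be 
scripted.  This is a minimal sketch, not tested on Solaris: it parses 
cfgadm-style output with awk, assuming the five-column layout shown above, and 
uses a canned sample listing in place of a live cfgadm call.

```shell
#!/bin/sh
# Sketch: decide from cfgadm-style output whether a SATA port needs
# "cfgadm -c configure".  The sample output mirrors the listing above;
# on a live system you would capture:  cfgadm sata1/7
sample_output='Ap_Id                          Type         Receptacle   Occupant     Condition
sata1/7                        disk         connected    unconfigured unknown'

port_state() {
    # $1 = Ap_Id; prints "<receptacle> <occupant>" for that port
    printf '%s\n' "$sample_output" |
        awk -v port="$1" '$1 == port { print $3, $4 }'
}

state=$(port_state sata1/7)
echo "sata1/7 state: $state"
if [ "$state" = "connected unconfigured" ]; then
    # On a live Solaris box this would be:  cfgadm -c configure sata1/7
    echo "would run: cfgadm -c configure sata1/7"
fi
```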

***********************************************************************
2. If ZFS is using a hot plug device, disconnecting it will hang all ZFS status 
tools.

While pools remain accessible, any attempt to run "zpool status" will hang.  I 
don't know if there is any way to recover these tools once this happens.  While 
this is a pretty big problem in itself, it also makes me worry if other types 
of error could have the same effect.  I see potential for this leaving a server 
in a state whereby you know there are errors in a pool, but have no way of 
finding out what those errors might be without rebooting the server.
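Given that "zpool status" can block forever, any monitoring script that calls 
it should guard the call with a timeout.  Solaris of this era had no 
timeout(1), so this sketch rolls its own with a background watchdog; it's an 
illustration (demonstrated here against a stand-in "sleep", not a real pool).

```shell
#!/bin/sh
# Sketch: run a command with a watchdog so a hung "zpool status"
# can't block a monitoring script forever.
run_with_timeout() {
    # $1 = seconds, rest = command; returns 124 on timeout
    secs=$1; shift
    "$@" &
    cmd=$!
    ( sleep "$secs"; kill "$cmd" 2>/dev/null ) &
    wd=$!
    wait "$cmd"
    rc=$?
    kill "$wd" 2>/dev/null
    [ "$rc" -ge 128 ] && rc=124   # command was killed by the watchdog
    return $rc
}

# Demo with a stand-in for a hung command; on a live system:
#   run_with_timeout 10 zpool status tank
if run_with_timeout 2 sleep 60; then
    echo "pool status returned"
else
    echo "pool status hung (gave up after 2s)"
fi
```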

***********************************************************************
3. Once the ZFS status tools are hung, the computer will not shut down.

The only way I've found to recover from this is to physically power down the 
server.  The Solaris shutdown process simply hangs.

***********************************************************************
4. While reading an offline disk causes errors, writing does not!  
    *** CAUSES DATA LOSS ***

This is a big one:  ZFS can continue writing to an unavailable pool.  It 
doesn't always generate errors (I've seen it copy over 100MB before erroring), 
and if not spotted, this *will* cause data loss after you reboot.

I discovered this while testing how ZFS coped with the removal of a hot plug 
SATA drive.  I knew that the ZFS admin tools were hanging, but that redundant 
pools remained available.  I wanted to see whether it was just the ZFS admin 
tools that were failing, or whether ZFS was also failing to send appropriate 
error messages back to the OS.

These are the tests I carried out:

Zpool:  Single drive zpool, consisting of one 250GB SATA drive in a hot plug 
bay.
Test data:  A folder tree containing 19,160 items.  71.1MB in total.

TEST1:  Opened File Browser, copied the test data to the pool.  Half way 
through the copy I pulled the drive.  THE COPY COMPLETED WITHOUT ERROR.  
"zpool list" reported the pool as online, however "zpool status" hung as 
expected.

Not quite believing the results, I rebooted and tried again.

TEST2:  Opened File Browser, copied the data to the pool.  Pulled the drive 
half way through.  The copy again finished without error.  Checking the 
properties shows 19,160 files in the copy.  "zfs list" again shows the 
filesystem as ONLINE.

Now I decided to see how many files I could copy before it errored.  I started 
the copy again.  File Browser managed a further 9,171 files before it stopped.  
That's nearly 30,000 files before any error was detected.  Again, despite the 
copy having finally errored, zpool list shows the pool as online, even though 
zpool status hangs.

I rebooted the server, and found that after the reboot my first copy contained 
just 10,952 items, and my second copy was completely missing.  That's a loss of 
almost 20,000 files, yet "zpool status" reported NO ERRORS.

For the third test I decided to see if these files are actually accessible 
before the reboot:

TEST3:  This time I pulled the drive *before* starting the copy.  The copy 
started much slower this time and only got to 2,939 files before reporting an 
error.  At this point I copied all the files that had been copied to another 
pool, and then rebooted.

After the reboot, the folder in the test pool had disappeared completely, but 
the copy I took before rebooting was fine and contains 2,938 items, 
approximately 12MB of data.  Again, zpool status reports no errors.

Further tests revealed that reading the pool results in an error almost 
immediately.  Writing to the pool appears very inconsistent.

This is a huge problem.  Data can be written without error, and is still served 
to users.  It is only later that the server begins to issue errors, and by that 
point the ZFS admin tools are useless.  The only possible recovery is a server 
reboot, but that loses any recent data written to the pool, and does so without 
any warning at all from ZFS.
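Until this is fixed, the only defence I can see is to read the data back and 
verify it before trusting a copy.  A sketch in portable shell (the /tmp paths 
and directory names are hypothetical, and it assumes filenames without spaces): 
compare per-file checksums between source and destination, so a silently 
dropped write shows up before a reboot discards it.

```shell
#!/bin/sh
# Sketch: read-back verification after a large copy, so a silently
# failed write (as in the tests above) is caught before a reboot.
# Assumes filenames without spaces; paths are throwaway examples.
verify_copy() {
    src=$1; dst=$2
    # Checksum every file on each side; any missing or corrupt file
    # in the destination makes the two listings differ.
    ( cd "$src" && find . -type f | sort | xargs cksum ) > /tmp/src.sum
    ( cd "$dst" && find . -type f | sort | xargs cksum ) > /tmp/dst.sum
    diff /tmp/src.sum /tmp/dst.sum >/dev/null
}

# Demo against two small throwaway trees:
mkdir -p /tmp/srcdir /tmp/dstdir
echo hello > /tmp/srcdir/a.txt
cp /tmp/srcdir/a.txt /tmp/dstdir/a.txt
if verify_copy /tmp/srcdir /tmp/dstdir; then
    echo "copy verified"
else
    echo "COPY INCOMPLETE - do not trust this pool"
fi
```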

Needless to say, I have a lot less faith in ZFS's error checking after having 
seen it lose 30,000 files without error.

***********************************************************************
5. If you are using CIFS and pull a drive from the volume, the whole server 
hangs!

This appears to be the original problem I found.  While ZFS doesn't handle 
drive removal well, the combination of ZFS and CIFS is worse.  If you pull a 
drive from a ZFS pool (redundant or not), which is serving CIFS data, the 
entire server freezes until you re-insert the drive.

Note that ZFS itself does not recover after the drive is inserted;  admin tools 
will still hang.  However the re-insertion of the drive is enough to unfreeze 
the server.

Of course, you still need a physical reboot to get your ZFS admin tools back, 
but in the meantime data is accessible again.
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
