Posted in greater detail at Server Fault - http://serverfault.com/q/277966/13325
I have an HP ProLiant DL380 G7 system running NexentaStor. The server has 36GB
RAM, 2 LSI 9211-8i SAS controllers (no SAS expanders), 2 SAS system drives, 12
SAS data drives, a hot-spare disk, an Intel X25-M L2ARC cache and a DDRdrive
PCI ZIL accelerator. This system serves NFS to multiple VMware hosts. I also
have about 90-100GB of deduplicated data on the array.
I've had two incidents where performance tanked suddenly, leaving the VM guests
and Nexenta SSH/Web consoles inaccessible and requiring a full reboot of the
array to restore functionality. In both cases, it was the Intel X25-M L2ARC SSD
that failed or was "offlined". NexentaStor did not alert me to the cache
failure, although the general ZFS FMA alert was visible on the (unresponsive)
console screen.
The "zpool status" output showed:
        cache
          c6t5001517959467B45d0  FAULTED      2   542     0  too many errors
This did not trigger any alerts from within Nexenta.
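Since the appliance did not raise an alert, one stopgap would be an external check that scans "zpool status" text for unhealthy devices, including those under the cache section. What follows is only a rough sketch under stated assumptions: the function name find_unhealthy, the set of state names treated as unhealthy, and the idea of feeding it captured "zpool status" text are illustrative choices, not anything NexentaStor provides. The sample text is the cache entry quoted above.

```python
import re

# Device states treated as unhealthy in this sketch (an assumption; adjust as needed).
UNHEALTHY = {"FAULTED", "DEGRADED", "OFFLINE", "UNAVAIL", "REMOVED"}

# Section labels that appear alone on a line in the device listing.
SECTIONS = {"cache", "logs", "spares"}

# A device row: name, state, read/write/checksum error counters, optional note.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\S+)\s+(\S+)\s+(\S+)(?:\s+(.*))?$")

# The cache entry from the "zpool status" output quoted in this post.
SAMPLE = """\
        cache
          c6t5001517959467B45d0  FAULTED      2   542     0  too many errors
"""


def find_unhealthy(status_text):
    """Return (section, device, state, note) for each unhealthy device row."""
    problems = []
    section = "data"  # rows seen before any section label belong to the data vdevs
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped in SECTIONS:
            section = stripped
            continue
        match = ROW.match(line)
        if match and match.group(2) in UNHEALTHY:
            note = (match.group(6) or "").strip()
            problems.append((section, match.group(1), match.group(2), note))
    return problems


if __name__ == "__main__":
    for section, device, state, note in find_unhealthy(SAMPLE):
        print(f"{section} device {device} is {state}: {note}")
```

A script like this could be run from cron on a separate monitoring host against captured status text and wired to email, so that a faulted cache device is reported even when the appliance's own alerting stays silent.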
I was under the impression that an L2ARC failure would not impact the system.
But in this case, it was the culprit. I've never seen any recommendations to
RAID L2ARC for resiliency. Removing the bad SSD entirely from the server got me
back running, but I'm concerned about the impact of the device failure and the
lack of notification from NexentaStor.
What's the best SSD choice for L2ARC cache applications these days? It
seems as though the Intel units are no longer well-regarded.
--
Edmund White
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss