Thanks so much for the info :)

Actually, I got the same problem again this morning, and I'm starting to think that this is not an aac driver problem. I noticed that those fmd errors are not logged during the freeze of the zfs file system, but right after reboot. Probably this message is logged on every boot, as some kind of "hey, your controller does not support this feature". I'll paste the three logs from arcconf at the end anyway.

I chose not to use WTB caching, because I have no battery on this controller, and configured zfs with zfs:zfs_nocacheflush=1. This proved to be the fastest combination without cache.

Thinking about the problem again, I also noticed that the zpool looks safe during the freeze. A zpool status does not show any problem. Only the zfs file system inside the pool gets locked on any command once it is in that state. Also, savecore never returns after working up to 100%, so I have no /var/crash state.

Another thing I noticed: it's not good to have VMware do the iscsi stuff and give the disk to the PDC. This way, the vmdk resides in a VMware-formatted partition, but the PDC has no knowledge that this is an iscsi resource, and doesn't consider it a "volatile" disk. So it stops responding when the iscsi is not responding.

What I will do, probably this evening, is create a new iscsi partition on the other controller (areca), mount it as a new iscsi directly from the PDC, move the content of the VMware-controlled iscsi to the new disk, switch letters and remove the old disk from the PDC. I will keep the CIFS repository on the adaptec for now.
This way, I can verify:
- if iscsi never dies but CIFS dies, we probably have a problem on the adaptec pool
- if iscsi dies but CIFS does not, we probably have a problem with iscsi zfs freezing
- if iscsi dies, check how the NFS shares on the areca controller react
- if everything goes without problems, maybe the choice of using iscsi from VMware was the problem

A last note: I noticed that the kernel update I did in August (to solve a CIFS problem) left my zpools needing an upgrade. I never did this zpool upgrade. Do you think this may cause problems like this?
I never did the upgrade, because I don't know:
- how long it will take
- if the pool remains operative during the upgrade
- what the risks are
- how much it will affect storage responsiveness during the upgrade

Thanks so much for all the suggestions.
Gabriele.

sonicle@xstorage:~# arcconf getlogs 1 DEVICE
Controllers found: 1
Command completed successfully.
sonicle@xstorage:~# arcconf getlogs 1 EVENT
Controllers found: 1
Command completed successfully.
sonicle@xstorage:~# arcconf getlogs 1 DEAD
Controllers found: 1

----------------------------------------------------------------------------------
From: Joerg Goltermann
Cc: Gabriele Bulfon, Udo Grabowski (IMK)
Date: 27 November 2012 12:21:45 CET
Subject: Re: [discuss] Again: illumos based ZFS storage failure

Hi,

we have/had some systems with aac (Adaptec 5405/5805) running illumos for a long time without any problems.

Maybe your assumption about the cache is not correct, or this depends on the controller settings. I've never seen such a message, but we have WTB caching enabled.

Have you checked the controller logs for errors/warnings?

$ arcconf getlogs 1 [device|dead|event]

You can talk to Dan, he is working on an update. I don't know if it's ready for use, but you can contact him or try the patch:
http://kebe.com/~danmcd/webrevs/aac/

- Joerg

--
OSN Online Service Nuernberg GmbH, Bucher Str. 78, 90408 Nuernberg
Tel: +49 911 39905-0 - Fax: +49 911 39905-55 - http://www.osn.de
HRB 15022 Nuernberg, USt-Id: DE189301263, GF: Joerg Goltermann