Hello Armin,

Thursday, October 23, 2008, 10:13:23 AM, you wrote:

AO> Good morning,

AO> I am experiencing file corruption on a ZFS file system in a
AO> two-node cluster. The file system holds the data file of a
AO> VirtualBox Windows guest instance. It is placed in one resource
AO> group together with the GDS scripts that manage the
AO> virtual machine's startup and probe:

AO> clresourcegroup create vb1 

AO> clresource create -t SUNW.HAStoragePlus \
AO> -g vb1 \
AO> -p Zpools=vb1 \
AO> -p AffinityOn=True vb1-storage 

AO> clresource create -g vb1 -t SUNW.gds \
AO>  [..]
AO> -p stop_signal=9 -p Failover_enabled=true \
AO> -p Resource_dependencies=vb1-storage vb1-vms

AO> After some days of operation (and many failovers), the virtual
AO> disk data file is corrupted and the ZFS file system no longer mounts:

AO> Oct 23 09:56:08 siegfried EVENT-TIME: Thu Oct 23 09:56:08 CEST 2008
AO> Oct 23 09:56:08 siegfried PLATFORM: PowerEdge 1850, CSN: 9Z7MV1J, HOSTNAME: siegfried
AO> Oct 23 09:56:08 siegfried SOURCE: zfs-diagnosis, REV: 1.0
AO> Oct 23 09:56:08 siegfried EVENT-ID: 3e0a4051-cd05-cce8-b0bb-c4c165cc4fcc
AO> Oct 23 09:56:08 siegfried DESC: The number of checksum errors associated with a ZFS device
AO> Oct 23 09:56:08 siegfried exceeded acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-GH for more information.
AO> Oct 23 09:56:08 siegfried AUTO-RESPONSE: The device has been marked as degraded.  An attempt
AO> Oct 23 09:56:08 siegfried will be made to activate a hot spare if available.
AO> Oct 23 09:56:08 siegfried IMPACT: Fault tolerance of the pool may be compromised.
AO> Oct 23 09:56:08 siegfried REC-ACTION: Run 'zpool status -x' and replace the bad device.

AO> # zpool status -xv
AO>   pool: vb1
AO>  state: ONLINE
AO> status: One or more devices has experienced an error resulting in data
AO>         corruption.  Applications may be affected.
AO> action: Restore the file in question if possible.  Otherwise restore the
AO>         entire pool from backup.
AO>    see: http://www.sun.com/msg/ZFS-8000-8A
AO>  scrub: none requested
AO> config:
AO>         NAME                                     STATE     READ WRITE CKSUM
AO>         vb1                                      ONLINE       0   0     0
AO>           c4t600D0230000000000088824BC4228807d0  ONLINE       0   0     0
AO> errors: Permanent errors have been detected in the following files:
AO>         /vb1/vb1/vhd/vb1_vhd1.vdi


AO> SunOS Version: 5.11 snv_97 i86pc i386 i86pc
AO> ClusterExpress Version: 08/20/2008 (build from source)
AO> Storage: SAN Luns via scsi_vhci

AO> Any suggestions?

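If you can, try to get some kind of redundancy provided by ZFS itself
(a mirror, for example). With a single LUN in the pool, ZFS can detect
checksum errors in file data but has no second copy to repair it from.
It looks like your controller, array, or something else in the I/O
path corrupted some data.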
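
A minimal sketch of what adding a ZFS-level mirror to this pool could
look like. The pool and existing device names are taken from the
'zpool status' output quoted above; NEW_DEV is a hypothetical
placeholder for a second LUN (ideally from a different array), and the
run/mirror_plan helpers are invented for illustration. By default the
script only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: plan for turning the single-LUN pool into a two-way mirror.
POOL=vb1
EXISTING_DEV=c4t600D0230000000000088824BC4228807d0  # from 'zpool status' above
NEW_DEV=${NEW_DEV:-cXtYdZ}   # hypothetical placeholder, replace with a real LUN
DRY_RUN=${DRY_RUN:-1}        # 1 = print the commands, anything else = execute

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

mirror_plan() {
    # Attach a second device to the existing one, forming a mirror.
    run zpool attach "$POOL" "$EXISTING_DEV" "$NEW_DEV"
    # Read and verify every block in the pool against its checksum.
    run zpool scrub "$POOL"
    # Show per-device error counters and any files with permanent errors.
    run zpool status -v "$POOL"
}

mirror_plan
```

When running this for real, wait for the resilver started by
'zpool attach' to finish before starting the scrub. A mirror only
protects against future corruption: the already damaged
/vb1/vb1/vhd/vb1_vhd1.vdi would still have to be restored from backup.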

-- 
Best regards,
 Robert                            mailto:[EMAIL PROTECTED]
                                       http://milek.blogspot.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
