Re: [zfs-discuss] ZFS : unable to mount a pool

2009-09-30 Thread Nicolas Szalay
On Wednesday, 30 September 2009 at 11:43 +0200, Nicolas Szalay wrote:
> Hello all,
> 
> I have a critical ZFS problem, quick history
[snip]

A small addition: zdb -l /dev/rdsk/c7t0d0 sees the metadata.
Isn't it "just" the phys_path that is wrong?
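If so, one thing I plan to try (just a sketch; the exact behaviour
depends on the zpool version on the live CD) is to let the import scan
a device directory instead of trusting the paths stored in the labels,
or to import by the pool GUID:

# zpool import -d /dev/dsk
# zpool import -d /dev/dsk -f applis
# zpool import -d /dev/dsk -f 5524311410139446438

The first command only scans /dev/dsk and reports which pools it can
see; the last one uses the pool_guid from the labels below instead of
the pool name.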


LABEL 0

version=10
name='applis'
state=0
txg=3183748
pool_guid=5524311410139446438
hostid=566707831
hostname='solarisfiler2'
top_guid=254793396820920770
guid=5339011664685738178
vdev_tree
type='raidz'
id=0
guid=254793396820920770
nparity=1
metaslab_array=15
metaslab_shift=34
ashift=9
asize=2000377872384
is_log=0
children[0]
type='disk'
id=0
guid=6486634062425618987
path='/dev/dsk/c4t0d0s0'
devid='id1,s...@x0004d927f800/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@0,0:a'
whole_disk=1
DTL=80
children[1]
type='disk'
id=1
guid=5339011664685738178
path='/dev/dsk/c4t1d0s0'
devid='id1,s...@x0004d927f810/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@1,0:a'
whole_disk=1
DTL=79
children[2]
type='disk'
id=2
guid=2839810383588280229
path='/dev/dsk/c5t2d0s0'
devid='id1,s...@x0004d927f820/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@2/pci17d3,1...@e/s...@2,0:a'
whole_disk=1
DTL=78
children[3]
type='disk'
id=3
guid=2925754536128244731
path='/dev/dsk/c5t3d0s0'
devid='id1,s...@x0004d927f830/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@2/pci17d3,1...@e/s...@3,0:a'
whole_disk=1
DTL=77

LABEL 1

version=10
name='applis'
state=0
txg=3183748
pool_guid=5524311410139446438
hostid=566707831
hostname='solarisfiler2'
top_guid=254793396820920770
guid=5339011664685738178
vdev_tree
type='raidz'
id=0
guid=254793396820920770
nparity=1
metaslab_array=15
metaslab_shift=34
ashift=9
asize=2000377872384
is_log=0
children[0]
type='disk'
id=0
guid=6486634062425618987
path='/dev/dsk/c4t0d0s0'
devid='id1,s...@x0004d927f800/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@0,0:a'
whole_disk=1
DTL=80
children[1]
type='disk'
id=1
guid=5339011664685738178
path='/dev/dsk/c4t1d0s0'
devid='id1,s...@x0004d927f810/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@1/pci17d3,1...@e/s...@1,0:a'
whole_disk=1
DTL=79
children[2]
type='disk'
id=2
guid=2839810383588280229
path='/dev/dsk/c5t2d0s0'
devid='id1,s...@x0004d927f820/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@2/pci17d3,1...@e/s...@2,0:a'
whole_disk=1
DTL=78
children[3]
type='disk'
id=3
guid=2925754536128244731
path='/dev/dsk/c5t3d0s0'
devid='id1,s...@x0004d927f830/a'

phys_path='/p...@0,0/pci8086,2...@2/pci8086,3...@0,3/pci8086,3...@2/pci17d3,1...@e/s...@3,0:a'
whole_disk=1
DTL=77

LABEL 2

version=10
name='applis'
state=0
txg=3183748
pool_guid=5524311410139446438
hostid=566707831
hostname='solarisfiler2'
top_guid=254793396820920770
guid=5339011664685738178
vdev_tree
type='raidz'
id=0
guid=254793396820920770
nparity=1
metaslab_array=15
metaslab_shift=3

[zfs-discuss] ZFS : unable to mount a pool

2009-09-30 Thread Nicolas Szalay
Hello all,

I have a critical ZFS problem; quick history:

I have a production machine whose backplane has burnt (literally). It
had 2 pools, "applis" and "storage", both RAID-Z1 plus one spare.
We then switched to the backup machine; so far so good.

The backup machine is an exact replica of the production one; files are
rsynced every night. I needed to reboot it to rename it, so I did, but
the reboot failed with a ZFS panic at boot. Following Sun's
documentation, I moved /etc/zfs/zpool.cache out of the way and
rebooted. All fine. The pool "storage" mounted without problem, but I
am unable to mount the pool "applis".

Someone suggested the disks were re-labelled. I'm open to any advice;
if I can't get that data back, everything will be lost...

Thanks for reading, and maybe helping.

Diagnostic outputs (made from the OpenSolaris live CD, to have a recent
ZFS):

--
# zpool import
  pool: applis
id: 5524311410139446438
 state: UNAVAIL
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        applis      UNAVAIL  insufficient replicas
          raidz1    FAULTED  corrupted data
            c7t0d0  FAULTED  corrupted data
            c7t1d0  FAULTED  corrupted data
            c8t2d0  FAULTED  corrupted data
            c8t3d0  FAULTED  corrupted data
        spares
          c5t4d0

--
# prtvtoc /dev/rdsk/c7t0d0
* /dev/rdsk/c7t0d0 partition map
*
* Dimensions:
* 512 bytes/sector
* 976773168 sectors
* 976773101 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
* Unallocated space:
*       First     Sector    Last
*       Sector     Count    Sector
*           34       222       255
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector   Mount Directory
       0      4    00          256  976756495  976756750
       8     11    00    976756751      16384  976773134

--
# fstyp /dev/rdsk/c7t0d0
unknown_fstyp (no match)

--
# zpool import -f applis
cannot import 'applis': one or more device is currently unavailable




Re: [zfs-discuss] Simple monitoring of ZFS pools, email alerts?

2008-04-02 Thread Nicolas Szalay

On Wednesday, 2 April 2008 at 16:23 -0500, [EMAIL PROTECTED] wrote:
> Been goggling around on this to no avail...
>  
> We're hoping to soon put into production an x4500 with a big ZFS pool,
> replacing a (piece of junk) NAS head which replaced our old trusty
> NetApp.
>  
> In each of those older boxes, we configured them to send out an email
> when there was a component failure.
>  
> I'm trying to find the simplest way to do this with our ZFS box. We've
> got a rudimentary log parser, but I don't want to rely on it to pick
> up items in the messages file. Surely there is some way to get an
> email alert when a disk pukes? I don't want to re-invent the wheel
> (but am so far pretty surprised I've not turned up any such so far).

I have been in the same situation, so I wrote a Nagios plugin that
collects events from FMD via SNMP. Its scope is a little wider than
what you need (it collects all FMD events), but if it can help, see
http://solaris-fr.org/home/docs/serveur/smpfmd (in French, sorry).
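If you just want a quick-and-dirty poll rather than the full plugin,
something along these lines works as a Nagios check (a sketch written
from memory; the MIB and object names are whatever the fmd SNMP module
ships, usually SUN-FM-MIB, so verify them on your box):

#!/bin/sh
# check_fmd_snmp -- minimal Nagios-style check (sketch only)
# Usage: check_fmd_snmp <host> [community]
HOST=$1
COMMUNITY=${2:-public}

# Walk the fmd problem table exposed over SNMP (MIB name assumed)
OUTPUT=`snmpwalk -v2c -c "$COMMUNITY" "$HOST" SUN-FM-MIB::sunFmProblemTable 2>/dev/null`
if [ $? -ne 0 ]; then
    echo "UNKNOWN: could not query the fmd SNMP module on $HOST"
    exit 3
fi

# Count real varbinds, ignoring "no such object" style answers
ENTRIES=`echo "$OUTPUT" | grep ' = ' | egrep -v 'No Such|No more' | wc -l | tr -d ' '`
if [ "$ENTRIES" -gt 0 ]; then
    echo "CRITICAL: fmd reports open problem(s) on $HOST ($ENTRIES table entries)"
    exit 2
fi
echo "OK: no open fmd problems on $HOST"
exit 0

Wire that into Nagios as an ordinary check command and it will page you
whenever fmd has an open problem, failed disks included.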

If you need explanations because of the language, post here :)

Hope it helps,

-- 
Nicolas Szalay

Systems & network administrator

-- _
ASCII ribbon campaign ( )
 - against HTML email  X
 & vCards / \




Re: [zfs-discuss] Cause for data corruption?

2008-02-27 Thread Nicolas Szalay

On Tuesday, 26 February 2008 at 05:59 -0800, Sandro wrote:
> Hey
> 
> Thanks for your answers guys.
> 
> I'll run VTS to stresstest cpu and memory.
> 
> And I just checked the block diagram of my motherboard (Gigabyte M61P-S3).
> It doesn't even have 64bit pci slots.. just standard old 33mhz 32bit pci .. 
> and a couple of newer pci-e.
> But my two controllers are both the same vendor / version and are both 
> connected to the same pci bus.
 
Looks like the 32-bit bus and ZFS combination definitely hurts :D

-- 
Nicolas Szalay

Systems & network administrator

-- _
ASCII ribbon campaign ( )
 - against HTML email  X
 & vCards / \




Re: [zfs-discuss] Cause for data corruption?

2008-02-26 Thread Nicolas Szalay
On Monday, 25 February 2008 at 11:05 -0800, Sandro wrote:
> hi folks

Hi,

> I've been running my fileserver at home with linux for a couple of years and 
> last week I finally reinstalled it with solaris 10 u4.
> 
> I borrowed a bunch of disks from a friend, copied over all the files, 
> reinstalled my fileserver and copied the data back.
> 
> Everything went fine, but after a few days now, quite a lot of files got 
> corrupted.
> here's the output:
> 
>  # zpool status data
>   pool: data
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
> corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
> entire pool from backup.
>see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub completed with 422 errors on Mon Feb 25 00:32:18 2008
> config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         data        ONLINE       0     0 5.52K
>           raidz1    ONLINE       0     0 5.52K
>             c0t0d0  ONLINE       0     0 10.72
>             c0t1d0  ONLINE       0     0 4.59K
>             c0t2d0  ONLINE       0     0 5.18K
>             c0t3d0  ONLINE       0     0 9.10K
>             c1t0d0  ONLINE       0     0 7.64K
>             c1t1d0  ONLINE       0     0 3.75K
>             c1t2d0  ONLINE       0     0 4.39K
>             c1t3d0  ONLINE       0     0 6.04K
> 
> errors: 388 data errors, use '-v' for a list
> 
> Last night I found out about this, it told me there were errors in like 50 
> files.
> So I scrubbed the whole pool and it found a lot more corrupted files.
> 
> The temporary system which I used to hold the data while I'm installing 
> solaris on my fileserver is running nv build 80 and no errors on there.
> 
> What could be the cause of these errors??
> I don't see any hw errors on my disks..
> 
>  # iostat -En | grep -i error
> c3d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> c4d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> c0t0d0   Soft Errors: 574 Hard Errors: 0 Transport Errors: 0
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> c1t0d0   Soft Errors: 549 Hard Errors: 0 Transport Errors: 0
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> c0t1d0   Soft Errors: 14 Hard Errors: 0 Transport Errors: 0
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> c0t2d0   Soft Errors: 549 Hard Errors: 0 Transport Errors: 0
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> c0t3d0   Soft Errors: 549 Hard Errors: 0 Transport Errors: 0
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> c1t1d0   Soft Errors: 548 Hard Errors: 0 Transport Errors: 0
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> c1t2d0   Soft Errors: 14 Hard Errors: 0 Transport Errors: 0
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> c1t3d0   Soft Errors: 548 Hard Errors: 0 Transport Errors: 0
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> 
> although a lot of soft errors.
> Linux said that one disk had gone bad, but I figured the sata cable was 
> somehow broken, so I replaced that before installing solaris. And solaris 
> didn't and doesn't see any actual hw errors on the disks, does it?

I had the same symptoms recently. I also thought the disks were dying,
but I was wrong. I suspected the RAM; no. In the end it was because I
had mixed RAID cards across different PCI buses: two 64-bit buses (no
problem with those) and one 32-bit PCI bus, which caused *all* the
checksum errors.

Kicked out the card on the 32-bit PCI bus and everything worked fine.
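If you want to check whether your errors follow one controller before
swapping hardware, something like this makes the pattern obvious (a
sketch; the two device names are just the first disk behind each of
your controllers):

# zpool status -v data
    (do the CKSUM counters cluster on the disks behind one controller?)
# ls -l /dev/dsk/c0t0d0s0 /dev/dsk/c1t0d0s0
    (the symlink targets show the physical PCI path each controller
    hangs off, and therefore which bus it sits on)
# fmdump -eV | grep vdev_path
    (the ZFS checksum ereports name the device each error was seen on)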

Hope it helps,

-- 
Nicolas Szalay

Systems & network administrator

-- _
ASCII ribbon campaign ( )
 - against HTML email  X
 & vCards / \




[zfs-discuss] add/replace : strange zfs pool behaviour

2008-02-13 Thread Nicolas Szalay
Hi everybody,

I'm experiencing something weird with one of my zpools. One of my hard
drives failed (c3t3d0). The hot spare (c4t3d0) did its job; I
(physically) replaced the failed drive and rebooted.

I have acknowledged the failure with fmadm too.
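(In practice that was just the usual sequence, from memory; the UUID
comes out of the first command:

# fmadm faulty
# fmadm repair <uuid reported by faulty>

so the fault is marked as repaired on the FMA side.)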

I now have this zpool config:

$ zpool status storage
  pool: storage
 state: ONLINE
 scrub: resilver completed with 0 errors on Wed Feb 13 16:45:30 2008
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c3t2d0  ONLINE       0     0     0
            c4t3d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0


I see the disk with the "format" command:

$ sudo format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
   0. c1d0 
  /[EMAIL PROTECTED],0/[EMAIL PROTECTED],2/[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
   1. c2d0 
  /[EMAIL PROTECTED],0/[EMAIL PROTECTED],2/[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
   2. c3t0d0 
  /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL 
PROTECTED],3/pci8086,[EMAIL PROTECTED]/pci17d3,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
   3. c3t1d0 
  /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL 
PROTECTED],3/pci8086,[EMAIL PROTECTED]/pci17d3,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
   4. c3t2d0 
  /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL 
PROTECTED],3/pci8086,[EMAIL PROTECTED]/pci17d3,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
   5. c3t3d0 
  /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL 
PROTECTED],3/pci8086,[EMAIL PROTECTED]/pci17d3,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
   6. c4t2d0 
  /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL 
PROTECTED],3/pci8086,[EMAIL PROTECTED]/pci17d3,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
   7. c4t3d0 
  /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL 
PROTECTED],3/pci8086,[EMAIL PROTECTED]/pci17d3,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0
   8. c7t0d0 
  /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL 
PROTECTED]/pci17d3,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
   9. c7t1d0 
  /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL 
PROTECTED]/pci17d3,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
  10. c7t2d0 
  /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL 
PROTECTED]/pci17d3,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
  11. c7t3d0 
  /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci8086,[EMAIL 
PROTECTED]/pci17d3,[EMAIL PROTECTED]/[EMAIL PROTECTED],0


But I can't insert it into the pool:

$ sudo zpool replace storage c4t3d0 c3t3d0
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c7t3d0s0 is part of active ZFS pool storage. Please see
zpool(1M).

$ sudo zpool add storage c4t3d0 c3t3d0
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c7t3d0s0 is part of active ZFS pool storage. Please see
zpool(1M).

Does anyone have a clue?

thanks.

-- 
Nicolas Szalay

Systems & network administrator

-- _
ASCII ribbon campaign ( )
 - against HTML email  X
 & vCards / \




Re: [zfs-discuss] 3ware support

2008-02-12 Thread Nicolas Szalay
On Tuesday, 12 February 2008 at 07:22 +0100, Johan Kooijman wrote:
> Goodmorning all,

Hi,

> can anyone confirm that 3ware raid controllers are indeed not working
> under Solaris/OpenSolaris? I can't seem to find it in the HCL.

I can confirm they don't work.

> We're now using a 3Ware 9550SX as a S-ATA RAID controller. The
> original plan was to disable all it's RAID functions and use justs the
> S-ATA controller functionality for ZFS deployment.
> 
> If indeed 3Ware isn't support, I have to buy a new controller. Any
> specific controller/brand you can recommend for Solaris?

I use Areca cards, with the driver supplied by Areca (it is certified
in the HCL).
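Once the Areca package is installed you can check that the driver is
loaded and bound to the card; the driver name is arcmsr, if memory
serves:

# modinfo | grep -i arcmsr
# prtconf -D | grep -i arcmsr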

Have a nice day,

-- 
Nicolas Szalay

Systems & network administrator

-- _
ASCII ribbon campaign ( )
 - against HTML email  X
 & vCards / \

