Re: [zfs-discuss] new labelfix needed

2010-09-01 Thread Benjamin Brumaire
Your point is only rhetorical. Systems break regardless of the resources you put 
into building them: bad hardware, typos, human mistakes, bugs. This 
mailing list is full of examples. Having tools like zdb, mdb, zfs import 
-fFX and labelfix for analysis and repair is always a good thing. BTW, a zfsck 
would be a great improvement to ZFS.

bbr
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] stmf corruption and dealing with dynamic lun mapping

2010-09-01 Thread Geoff Nordli
I am running Nexenta NCP 3.0 (134f).

My stmf configuration was corrupted.  I was getting errors like the following
in /var/adm/messages:

Sep  1 10:32:04 llift-zfs1 svc-stmf[378]: [ID 130283 user.error] get property view_entry-0/all_hosts failed - entity not found
Sep  1 10:32:04 llift-zfs1 svc.startd[9]: [ID 652011 daemon.warning] svc:/system/stmf:default: Method "/lib/svc/method/svc-stmf start" failed with exit status 1

In /var/adm/system-stmf:default.log:

[ Sep  1 10:32:05 Executing start method ("/lib/svc/method/svc-stmf start"). ]
svc-stmf: Unable to load the configuration. See /var/adm/messages for details
svc-stmf: For information on reverting the stmf:default instance to a previously running configuration see the man page for svccfg(1M)
svc-stmf: After reverting the instance you must clear the service maintenance state. See the man page for svcadm(1M)


I fixed it by using svccfg to revert to the previous "running" snapshot.
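For anyone hitting the same error, the recovery went roughly like this (a sketch based on the svccfg(1M)/svcadm(1M) man pages the error message points to; snapshot names may differ on your system):

```shell
# List the configuration snapshots kept for the stmf service instance
svccfg -s svc:/system/stmf:default listsnap

# Revert to the last known-good "running" snapshot
svccfg -s svc:/system/stmf:default revert running

# Re-read the repository, then clear the maintenance state
svcadm refresh svc:/system/stmf:default
svcadm clear svc:/system/stmf:default
```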

We have a lab management system which continuously creates and deletes LUNs
as virtual machines are built and destroyed.  When I recovered to the
previous running state we had a mismatch between what the LUNs should be and
what they were.  

Is there a backup "configuration" somewhere, or a way to "re-read" the LUN
configuration? 

If not: I do set the LUN for each volume in custom ZFS properties, so I may
just need to build a "sanitizer" script that rebuilds the LUN mappings in the
event of a catastrophic failure.
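A minimal sketch of what such a script might look like, assuming a hypothetical locally-set property (here "lab:lun") holds the desired LUN for each zvol; the stmfadm calls follow the standard COMSTAR workflow, but the view/host-group details would need to match your setup:

```shell
#!/bin/sh
# Sketch: rebuild COMSTAR logical units from a custom ZFS property.
# "lab:lun" is a hypothetical property name; "tank" is a placeholder pool.
zfs get -r -t volume -s local -H -o name,value lab:lun tank |
while read vol lun; do
    # Recreate the logical unit backed by the zvol; create-lu prints
    # the new GUID, which the view entry below needs
    guid=$(stmfadm create-lu "/dev/zvol/rdsk/$vol" | awk '{print $NF}')
    # Re-add the view at the recorded LUN (host/target groups omitted)
    stmfadm add-view -n "$lun" "$guid"
done
```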

BTW, I am running this system inside a VMware Server VM, which has caused
some instability, but I guess it is good to be prepared. 

Thanks,

Geoff  









Re: [zfs-discuss] System hangs during zfs send

2010-09-01 Thread Bryan Leaman
Just to update everyone: this turned out to be OpenSolaris bug 6884007, 
"zfs_send() can leave temporary holds around". It was fixed in b142, and I was 
able to apply the patch to NCP3; the issue is resolved on my system.

Apparently the system was stuck in dnode_special_close() due to some snapshot 
holds that were not cleaned up properly by zfs send.
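For anyone stuck on an older build, leftover send holds can usually be inspected and released by hand with the standard zfs(1M) hold commands (dataset, snapshot, and tag names below are placeholders):

```shell
# Show user holds on a snapshot and its descendants
zfs holds -r tank/fs@backup

# Release a leftover temporary hold by its tag so the snapshot can be
# destroyed and the stuck send cleaned up
zfs release .send-1234-1 tank/fs@backup
```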


Re: [zfs-discuss] pool died during scrub

2010-09-01 Thread Carsten John
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Jeff Bacon wrote:
> I have a bunch of sol10U8 boxes with ZFS pools, most all raidz2 8-disk
> stripe. They're all supermicro-based with retail LSI cards.
> 
> I've noticed a tendency for things to go a little bonkers during the
> weekly scrub (they all scrub over the weekend), and that's when I'll
> lose a disk here and there. OK, fine, that's sort of the point, and
> they're SATA drives so things happen. 
> 
> I've never lost a pool though, until now. This is Not Fun. 
> 
>> ::status
> debugging crash dump vmcore.0 (64-bit) from ny-fs4
> operating system: 5.10 Generic_142901-10 (i86pc)
> panic message:
> BAD TRAP: type=e (#pf Page fault) rp=fe80007cb850 addr=28 occurred
> in module "zfs" due to a NULL pointer dereference
> dump content: kernel pages only
>> $C
> fe80007cb960 vdev_is_dead+2()
> fe80007cb9a0 vdev_mirror_child_select+0x65()
> fe80007cba00 vdev_mirror_io_start+0x44()
> fe80007cba30 zio_vdev_io_start+0x159()
> fe80007cba60 zio_execute+0x6f()
> fe80007cba90 zio_wait+0x2d()
> fe80007cbb40 arc_read_nolock+0x668()
> fe80007cbbd0 dmu_objset_open_impl+0xcf()
> fe80007cbc20 dsl_pool_open+0x4e()
> fe80007cbcc0 spa_load+0x307()
> fe80007cbd00 spa_open_common+0xf7()
> fe80007cbd10 spa_open+0xb()
> fe80007cbd30 pool_status_check+0x19()
> fe80007cbd80 zfsdev_ioctl+0x1b1()
> fe80007cbd90 cdev_ioctl+0x1d()
> fe80007cbdb0 spec_ioctl+0x50()
> fe80007cbde0 fop_ioctl+0x25()
> fe80007cbec0 ioctl+0xac()
> fe80007cbf10 _sys_sysenter_post_swapgs+0x14b()
> 
>   pool: srv
> id: 9515618289022845993
>  state: UNAVAIL
> status: One or more devices are missing from the system.
> action: The pool cannot be imported. Attach the missing
> devices and try again.
>see: http://www.sun.com/msg/ZFS-8000-6X
> config:
> 
> srvUNAVAIL  missing device
>   raidz2   ONLINE
> c2t5000C5001F2CCE1Fd0  ONLINE
> c2t5000C5001F34F5FAd0  ONLINE
> c2t5000C5001F48D399d0  ONLINE
> c2t5000C5001F485EC3d0  ONLINE
> c2t5000C5001F492E42d0  ONLINE
> c2t5000C5001F48549Bd0  ONLINE
> c2t5000C5001F370919d0  ONLINE
> c2t5000C5001F484245d0  ONLINE
>   raidz2   ONLINE
> c2t5F000B5C8187d0  ONLINE
> c2t5F000B5C8157d0  ONLINE
> c2t5F000B5C9101d0  ONLINE
> c2t5F000B5C8167d0  ONLINE
> c2t5F000B5C9120d0  ONLINE
> c2t5F000B5C9151d0  ONLINE
> c2t5F000B5C9170d0  ONLINE
> c2t5F000B5C9180d0  ONLINE
>   raidz2   ONLINE
> c2t5000C50010A88E76d0  ONLINE
> c2t5000C5000DCD308Cd0  ONLINE
> c2t5000C5001F1F456Dd0  ONLINE
> c2t5000C50010920E06d0  ONLINE
> c2t5000C5001F20C81Fd0  ONLINE
> c2t5000C5001F3C7735d0  ONLINE
> c2t5000C500113BC008d0  ONLINE
> c2t5000C50014CD416Ad0  ONLINE
> 
> Additional devices are known to be part of this pool, though
> their
> exact configuration cannot be determined.
> 
> 
> All of this would be ok... except THOSE ARE THE ONLY DEVICES THAT WERE
> PART OF THE POOL. How can it be missing a device that didn't exist? 
> 
> A "zpool import -fF" results in the above kernel panic. This also
> creates /etc/zfs/zpool.cache.tmp, which then results in the pool being
> imported, which leads to a continuous reboot/panic cycle. 
> 
> I can't obviously use b134 to import the pool without logs, since that
> would imply upgrading the pool first, which is hard to do if it's not
> imported. 
> 
> My zdb skills are lacking - zdb -l gets you about so far and that's it.
> (where the heck are the other options to zdb even written down, besides
> in the code?)
> 
> OK, so this isn't the end of the world, but it's 15TB of data I'd really
> rather not have to re-copy across a 100Mbit line. It really more
> concerns me that ZFS would do this in the first place - it's not
> supposed to corrupt itself!!


Hi Jeff,

this looks similar to a crash we had at our site a few months ago. Same
symptoms, no actual solution; we had to recover from an rsync backup server.

We had the logs on a mirrored SSD pair and an additional SSD as cache.

The machine (Sun 4270 with Sun J4400 JBODs and Sun SAS disks) crashed in
the same manner (core dumping while trying to import the pool). After
booting into single-user mode we found the log mirror corrupted
(one disk unavailable). Even after replacing the disk and resilvering
the log mirror we were not able to import the pool.

I suspect it may have been related to memory (perhaps a shortage of memory).

all the best
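PS: on the zdb question — beyond zdb -l, a few invocations that are often useful on a pool that won't import (a sketch; the extra flags are mostly only documented in the zdb source, and the device/pool names below are placeholders):

```shell
# Dump the labels of a single device (what zdb -l already gives you)
zdb -l /dev/rdsk/c2t5000C5001F2CCE1Fd0s0

# Work on the unimported pool directly with -e:
zdb -e -C srv     # show the assembled pool configuration
zdb -e -u srv     # show the active uberblock
zdb -e -d srv     # list datasets (may trip the same panic on a bad pool)
```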

[zfs-discuss] Disk on Module (DOM) for NAS boot drive?

2010-09-01 Thread valrh...@gmail.com
I have a file server that I've basically maxed out the drive bays for. At the 
moment, I'm running Nexenta on an SSD that is sort of resting on something else 
in the case. I was wondering if, instead, I could install Nexenta on a SATA 
Disk on Module (DOM), say something like 4 GB, dual channel, SLC:

http://www.kingspec.com/solid-state-disk-products/series-domsata.htm

I did try with a USB memory stick, but it was slow. And my previous 
installation of EON on a memory stick got corrupted and I lost everything (not 
the data, but the configuration). Has anyone gotten this to work before (for 
Nexenta, EON, etc.)? Any suggestions or advice? And how much space does a 
plain-vanilla installation of Nexenta actually require?

Thanks!


Re: [zfs-discuss] 4k block alignment question (X-25E)

2010-09-01 Thread Yuri Vorobyev

On 31.08.2010 21:23, Ray Van Dolson wrote:


Here's an article with some benchmarks:

   http://wikis.sun.com/pages/viewpage.action?pageId=186241353

Seems to really impact IOPS.


This is really interesting reading. Can someone do the same tests with an
Intel X25-E?
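(For checking what alignment an existing pool is using, one common trick is to read the ashift out of the pool configuration — a sketch, pool name is a placeholder:)

```shell
# ashift is log2 of the sector size ZFS chose when the vdev was created:
# 9 = 512-byte sectors, 12 = 4 KiB alignment
zdb -C tank | grep ashift
```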

