Re: [zfs-discuss] SXCE build 90 vs S10U6?

2008-06-12 Thread Mike Gerdts
On Thu, Jun 12, 2008 at 10:12 PM, Tim [EMAIL PROTECTED] wrote:
 I guess I find the difference between b90 and opensolaris trivial
 given we're supposed to be getting constant updates following the sxce
 builds.

But the supported version of OpenSolaris will not be on the same
schedule as sxce.  Opensolaris 2008.05 is based on snv_86.  The
supported version will only have bug fixes until 2008.11.  That is, it
follows much the same type of schedule that sxde did.

Additionally, OpenSolaris has completely redone the installation and
packaging bits.  When you are running a bunch of servers with
aggregate storage capacity of over 100 TB you are probably doing
something that is rather important to the company that shelled out
well over $100,000 for the hardware.  In most (not all) environments
that I have worked in this says that you don't want to be relying too
heavily on 1.0 software[1] or external web services[2] that the
maintainer has not shown a track record[3] of maintaining in a way
that meets typical enterprise-level requirements.


1. The non-live CD installer has not even made it into the unstable
Mercurial repository.  The pkg and beadm commands and associated
libraries have less than a month of existence in anything that any
vendor is claiming to support.
2. AFAIK, pkg.sun.com does not serve packages yet.
pkg.opensolaris.org serves up packages from snv_90 by default even
though snv_86 is the variant that is supposedly supported.
3. There were numerous complaints of repeated timeouts when the snv_90
packages were released resulting in having to restart the upgrade from
the start.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs promote and ENOSPC (+panic with dtrace)

2008-06-11 Thread Mike Gerdts
On Wed, Jun 11, 2008 at 12:58 AM, Robin Guo [EMAIL PROTECTED] wrote:
 Hi, Mike,

  It's like 6452872, it need enough space for 'zfs promote'

Not really -  in 6452872 a file system is at its quota before the
promote is issued. I expect that a promote may cause several KB of
metadata changes that require some space and as such would require
more space than the quota.

In my case, quotas are not in use.  I had over 1.8 GB free before I
issued the zfs promote and fully expected to have roughly the same
amount of space free after the promote.  It seems as though the wrong
comparison is being made about the amount of free space required.

I have been able to reproduce - but then when I started poking at it
with dtrace (no destructive actions) I got a panic.

# mdb *.0
Loading modules: [ unix genunix specfs dtrace cpu.generic uppc
scsi_vhci zfs random ip hook neti sctp arp usba fctl md lofs sppp
crypto ptm ipc fcp fcip cpc logindmux sv nsctl sdbc ufs rdc ii nsmb ]
 ::status
debugging crash dump vmcore.0 (32-bit) from indy2
operating system: 5.11 snv_86 (i86pc)
panic message:
BAD TRAP: type=e (#pf Page fault) rp=e0620d38 addr=200 occurred in module unknown due to a NULL pointer dereference
dump content: kernel pages only
 ::stack
0x200(eb1ea000)
zfs_ioc_promote+0x3b()
zfsdev_ioctl+0xd8(2d8, 5a23, 8045e40, 13, e8b3a020, e0620f78)
cdev_ioctl+0x2e(2d8, 5a23, 8045e40, 13, e8b3a020, e0620f78)
spec_ioctl+0x65(ddfb6c00, 5a23, 8045e40, 13, e8b3a020, e0620f78)
fop_ioctl+0x49(ddfb6c00, 5a23, 8045e40, 13, e8b3a020, e0620f78)
ioctl+0x155()
sys_call+0x10c()


The dtrace command that I was running was:

dtrace -n 'fbt:zfs:dsl_dataset_promote:return { trace(arg0); stack() }'

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs promote and ENOSPC

2008-06-09 Thread Mike Gerdts
I needed to free up some space to be able to create and populate a new
upgrade.  I was caught off guard by the amount of free space required
by zfs promote.

bash-3.2# uname -a
SunOS indy2 5.11 snv_86 i86pc i386 i86pc

bash-3.2# zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
rpool 5.49G  1.83G55K  /rpool
[EMAIL PROTECTED] 46.5K  -  49.5K  -
rpool/ROOT5.39G  1.83G18K  none
rpool/ROOT/2008.052.68G  1.83G  3.38G  legacy
rpool/ROOT/2008.05/opt 814M  1.83G  22.3M  legacy
rpool/ROOT/2008.05/[EMAIL PROTECTED]43K  -  22.3M  -
rpool/ROOT/2008.05/opt/SUNWspro739M  1.83G   739M  legacy
rpool/ROOT/2008.05/opt/netbeans   52.9M  1.83G  52.9M  legacy
rpool/ROOT/preview2   2.71G  1.83G  2.71G  /mnt
rpool/ROOT/[EMAIL PROTECTED] 6.13M  -  2.71G  -
rpool/ROOT/preview2/opt 27K  1.83G  22.3M  legacy
rpool/export  89.8M  1.83G19K  /export
rpool/export/home 89.8M  1.83G  89.8M  /export/home

bash-3.2# zfs promote rpool/ROOT/2008.05
cannot promote 'rpool/ROOT/2008.05': out of space

Notice that I have 1.83 GB of free space and the snapshot from which
the clone was created (rpool/ROOT/[EMAIL PROTECTED]) is 2.71 GB.  It
was not until I had more than 2.71 GB of free space that I could
promote rpool/ROOT/2008.05.
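
The pattern above suggests a rough rule of thumb: don't expect a promote
to succeed unless free space exceeds the REFER of the snapshot the clone
was created from.  Something like the following should show the numbers
to compare (snapshot name elided as above):

# zfs get origin rpool/ROOT/2008.05
# zfs get referenced rpool/ROOT/preview2@<snap>
# zfs get available rpool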

This behavior does not seem to be documented.  Is it a bug in the
documentation or zfs?

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Filesystem for each home dir - 10,000 users?

2008-06-06 Thread Mike Mackovitch
On Fri, Jun 06, 2008 at 06:27:01PM -0400, Brian Hechinger wrote:
 On Fri, Jun 06, 2008 at 02:58:09PM -0700, eric kustarz wrote:
  
   clients do not.  Without per-filesystem mounts, 'df' on the client
   will not report correct data though.
  
   I expect that mirror mounts will be coming Linux's way too.
  
  They should already have them:
  http://blogs.sun.com/erickustarz/en_US/entry/linux_support_for_mirror_mounts
 
 Where does that leave those of us who need to deal with OSX clients?  Does 
 apple
 have any plans to get in on this?

Apple plans on supporting NFSv4... including mirror mounts (barring any
unseen, insurmountable hurdles).

HTH
--macko
Not speaking officially for Apple, but just as an engineer who works
on this stuff.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Filesystem for each home dir - 10,000 users?

2008-06-06 Thread Mike Mackovitch
On Fri, Jun 06, 2008 at 03:43:29PM -0700, eric kustarz wrote:
 
 On Jun 6, 2008, at 3:27 PM, Brian Hechinger wrote:
 
  On Fri, Jun 06, 2008 at 02:58:09PM -0700, eric kustarz wrote:
 
  clients do not.  Without per-filesystem mounts, 'df' on the client
  will not report correct data though.
 
  I expect that mirror mounts will be coming Linux's way too.
 
  They should already have them:
  http://blogs.sun.com/erickustarz/en_US/entry/linux_support_for_mirror_mounts
 
  Where does that leave those of us who need to deal with OSX  
  clients?  Does apple
  have any plans to get in on this?
 
 They need to implement NFSv4 in general first :)

Technically, Mac OS X 10.5 Leopard has some basic NFSv4.0 support in it.
But just enough to make it look like it passes all the Connectathon tests.
Not enough to warrant use by anyone but the terminally curious (or masochistic).
This is mentioned briefly in the mount_nfs(8) man page.

It would be reasonable to expect that future MacOSX releases will include
increasing levels of functionality and that NFSv4 will eventually be made
the default NFS version.

 But you'd have to  
 ask them on their lists what the status of that is... i know i would  
 like it...

Or get lucky and happen to have one of their engineers catch the question
on this list and reply...  ;-)

--macko
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] LiveUpgrade Bug? -- ZFS root finally here in SNV90

2008-06-05 Thread Ellis, Mike
If you jumpstart a system, and have it default to a shared / and /var,
you can do the following:

 lucreate -n lu1
 lustatus

Due to a bug you now have to edit: /bootpool/boot/menu.lst

Once that's done, you're in good shape and both environments are
bootable (review boot -L)
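
For reference, the hand edit amounts to adding a title/bootfs pair for
the new BE to that file; if I recall the format correctly it looks
something like this (dataset name is just an example):

title lu1
bootfs bootpool/ROOT/lu1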

--

If we do the same thing with a separate / and /var
using jumpstart profile entry:  
bootenv installbe bename zfsboot dataset /var  

Things appear to be working well.
lucreate appears to do all the snapshots/clones, and sets some special
parameters for /var.

But when you try to boot from that BE, things die pretty early on in the
boot process, likely related to the fact that it didn't actually mount
/var

Is this a known bug? (is this the proper way to validate, and
potentially file bugs for ZFS boot?)

Thanks,

 -- MikeE


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Thursday, June 05, 2008 1:56 PM
To: Ellis, Mike
Cc: ZFS discuss
Subject: Re: [zfs-discuss] ZFS root finally here in SNV90

Mike,

As we discussed, you can't currently break out other datasets besides
/var. I'll add this issue to the FAQ.

Thanks,

Cindy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS root finally here in SNV90

2008-06-05 Thread Mike Gerdts
On Wed, Jun 4, 2008 at 11:18 PM, Rich Teer [EMAIL PROTECTED] wrote:
 Why would one do that?  Just keep an eye on the root pool and all is good.

The only good argument I have for separating out some of /var is for
boot environment management.  I grew tired of repeating my arguments
and suggestions and wrote a blog entry.

http://mgerdts.blogspot.com/2008/03/future-of-opensolaris-boot-environment.html

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Get your SXCE on ZFS here!

2008-06-04 Thread Ellis, Mike
The FAQ document (
http://opensolaris.org/os/community/zfs/boot/zfsbootFAQ/ ) has a
jumpstart profile example:

install_type initial_install
pool newpool auto auto auto mirror c0t0d0 c0t1d0
bootenv installbe bename sxce_xx 

The B90 jumpstart check program (SPARC) flags that the disks should 
be specified as: c0t0d0s0 c0t1d0s0 (slices)

Can someone confirm the FAQ is indeed incorrect and perhaps make the
adjustment to the FAQ if so warranted?
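
In other words, I'm assuming the profile that check wants would read:

install_type initial_install
pool newpool auto auto auto mirror c0t0d0s0 c0t1d0s0
bootenv installbe bename sxce_xx

i.e. the FAQ example with slices substituted for the whole-disk names.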

Thanks,

 -- MikeE



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Cindy
Swearingen
Sent: Wednesday, June 04, 2008 6:50 PM
To: Tim
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Get your SXCE on ZFS here!

Tim,

Start at the zfs boot page, here:

http://www.opensolaris.org/os/community/zfs/boot/

Review the information and follow the links to the docs.

Cindy

- Original Message -
From: Tim [EMAIL PROTECTED]
Date: Wednesday, June 4, 2008 4:29 pm
Subject: Re: [zfs-discuss] Get your SXCE on ZFS here!
To: Kyle McDonald [EMAIL PROTECTED]
Cc: zfs-discuss@opensolaris.org, andrew [EMAIL PROTECTED]

 On Wed, Jun 4, 2008 at 5:01 PM, Kyle McDonald [EMAIL PROTECTED] 
 wrote:
 
  andrew wrote:
   With the release of the Nevada build 90 binaries, it is now 
 possible to
  install SXCE directly onto a ZFS root filesystem, and also put ZFS 
 swap onto
  a ZFS filesystem without worrying about having it deadlock. ZFS now 
 also
  supports crash dumps!
  
   To install SXCE to a ZFS root, simply use the text-based 
 installer, after
  choosing Solaris Express from the boot menu on the DVD.
  
   DVD download link:
  
   http://www.opensolaris.org/os/downloads/sol_ex_dvd_1/
  
  
  This release also (I believe) supports installing on ZFS through
JumpStart.
 
  Does anyone have a pointer for Docs on what the syntax is for a
  JumpStart profile to configure ZFS root?
 
   -Kyle
 
  
   This message posted from opensolaris.org
   ___
   zfs-discuss mailing list
   zfs-discuss@opensolaris.org
   http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  
 
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 
 
 Does this mean zfs boot/root on sparc is working as well?  If so...
FINALLY
 :)
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS root finally here in SNV90

2008-06-04 Thread Ellis, Mike
In addition to the standard "containing the carnage" arguments used to
justify splitting /var/tmp, /var/mail, /var/adm (process accounting
etc.), is there an interesting use-case where one would split out /var
for compression reasons?  As in, turning on compression for /var so that
process accounting, network flow, and other such fun logs can be kept on
a compressed filesystem, while keeping / (and thereby /usr etc.)
uncompressed?

The ZFSBOOT-FAQ document doesn't really show how to break out multiple
filesystems with jumpstart profiles... An example there might be
helpful...  (as it's clear this is a frequently asked question :-)

Also, a way to turn compression on, turn ditto data bits (copies) on,
or perhaps a generic place to insert zpool/zfs parameters as part of the
jumpstart profile could also be useful...
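
In the meantime I assume the only option is to set those by hand once
the install finishes, with something like the following (the BE dataset
name is just an example):

# zfs set compression=on rpool/ROOT/sxce_90/var
# zfs set copies=2 rpool/ROOT/sxce_90/var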

If SSD is coming fast and furious, being able to use compression, shared
free-space (quotas etc) to keep the boot-images small enough so they'll
fit and accommodate live-upgrade patching, will become increasingly
important.

http://www.networkworld.com/news/2008/060308-sun-flash-storage.html?page
=1

Rock on guys,

 -- MikeE




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Rich Teer
Sent: Thursday, June 05, 2008 12:19 AM
To: Bob Friesenhahn
Cc: ZFS discuss
Subject: Re: [zfs-discuss] ZFS root finally here in SNV90

On Wed, 4 Jun 2008, Bob Friesenhahn wrote:

 Did you actually choose to keep / and /var combined?  Is there any 

That's what I'd do...

 reason to do that with a ZFS root since both are sharing the same pool

 and so there is no longer any disk space advantage?  If / and /var are

 not combined can they have different assigned quotas without one 
 inheriting limits from the other?

Why would one do that?  Just keep an eye on the root pool and all is
good.

-- 
Rich Teer, SCSA, SCNA, SCSECA

CEO,
My Online Home Inventory

URLs: http://www.rite-group.com/rich
  http://www.linkedin.com/in/richteer
  http://www.myonlinehomeinventory.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] /var/sadm on zfs?

2008-06-01 Thread Mike Gerdts
On Sun, Jun 1, 2008 at 3:53 AM, Enda O'Connor [EMAIL PROTECTED] wrote:
 Jim Litchfield at Sun wrote:

 I think you'll find that any attempt to make zones (certainly whole root
 ones) will fail after this.


 right, zoneadm install actually copies in the global zones undo.z into the
 local zone, so that patchrm of an existing patch will work.

 haven't tried out what happens when the undo is missing,

My guess is it works just fine - based upon the fact that patchadd -d
does not create the undo.z file.  Admittedly, it is sloppy to just get
rid of the undo.z file - the existence of the other related
directories (save/patchid) may trip something up.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] panic: avl_find() succeeded inside avl_add()

2008-06-01 Thread Mike Gerdts
On Sat, May 31, 2008 at 9:38 PM, Mike Gerdts [EMAIL PROTECTED] wrote:
 $ find /ws/mount/onnv-gate/usr/src/uts/sun4u/serengeti/unix
 /ws/mount/onnv-gate/usr/src/uts/sun4u/serengeti/unix
 /ws/mount/onnv-gate/usr/src/uts/sun4u/serengeti/unix/.make.state.lock
 /ws/mount/onnv-gate/usr/src/uts/sun4u/serengeti/unix/debug64
 panic

The stack from this one is...

 ::stack
vpanic(128d918, 300093c3778, 2a1010c7418, 0, 300093c39a8, 1229000)
avl_add+0x38(300091da548, 300093c3778, 649e740, 30005f1a180,
800271d6, 128d800)
mzap_open+0x18c(cf, 300091da538, 300091df998, 30005f1a180, 300091da520,
300091da508)
zap_lockdir+0x54(30003ac6b88, 26b32, 0, 0, 1, 2a1010c78f8)
zap_cursor_retrieve+0x40(2a1010c78f0, 2a1010c77d8, 0, 1, 2a1010c78f0, 2)
zfs_readdir+0x224(3, 2a1010c7aa0, 30009173308, 2, 2000, 2a1010c77f0)
fop_readdir+0x44(300091fe940, 2a1010c7aa0, 30005f403b0, 2a1010c7a9c, 2000,
111dd48)
getdents64+0x90(4, 2a1010c7ad0, 2000, 0, 30008245dd0, 0)
syscall_trap32+0xcc(4, ff1a, 2000, 0, 0, 0)

It tripped up on:

 300091fe940::print vnode_t v_path
v_path = 0x300082608c0 
/ws/mount/onnv-gate/usr/src/uts/sun4u/serengeti/unix/debug64

Which is a subdirectory of where it tripped up before.

I am able to do "find /ws/mount -name serengeti -prune" without
problems.  To make it so that I can hopefully proceed with the build I
have moved the directory out of the way, then did an "hg update" so
that I can get the build I was trying to do to complete.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs equivalent of ufsdump and ufsrestore

2008-05-31 Thread Mike Gerdts
On Sat, May 31, 2008 at 9:18 AM, David Magda [EMAIL PROTECTED] wrote:

 On May 31, 2008, at 06:03, Joerg Schilling wrote:

 The other method works as root if you use -atime (see man page) and is
 available since 13 years.

 Would it be possible to assign an RBAC role to a regular user to
 accomplish this? If so, would you know which one?

You can use ppriv -D -e star ... to figure out which privileges you
lack to be able to reset the atime. I suspect that in order to perform
backups (and reset atime), you would need to have file_dac_read and
file_dac_write.  A backup program that has those privileges has
everything it needs to gain full root access.
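
If you decide to go that route anyway, my guess is that the simplest
RBAC-ish way to hand those privileges to a dedicated account would be
something like this ("backup" is a made-up user):

# usermod -K defaultpriv=basic,file_dac_read,file_dac_write backup

...with the understanding that such an account can then read and write
any file on the system.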

I wish that there was a flag to open(2) to say not to update the atime
and that there was a privilege that could be granted to allow this
flag without granting file_dac_write.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] /var/sadm on zfs?

2008-05-31 Thread Mike Gerdts
On Sat, May 31, 2008 at 5:16 PM, Bob Friesenhahn
[EMAIL PROTECTED] wrote:
 On my heavily-patched Solaris 10U4 system, the size of /var (on UFS)
 has gotten way out of hand due to the remarkably large growth of
 /var/sadm.  Can this directory tree be safely moved to a zfs
 filesystem?  How much of /var can be moved to a zfs filesystem without
 causing boot or runtime issues?

/var/sadm is not used during boot.

If you have been patching regularly, you probably have a bunch of
undo.Z files that are used only in the event that you want to back
out.  If you don't think you will be backing out any patches that were
installed 90 or more days ago the following commands may be helpful:

To understand how much space would be freed up by whacking the old undo files:

# find /var/sadm/pkg -mtime +90 -name undo.Z | xargs du -k \
| nawk '{t+= $1; print $0} END {printf("Total: %d MB\n", t / 1024)}'

Copy the old backout files somewhere else:

# cd /var/sadm
# find pkg -mtime +90 -name undo.Z \
| cpio -pdv /somewhere/else

Remove the old (90+ days) undo files

# find /var/sadm/pkg -mtime +90 -name undo.Z | xargs rm -f

Oops, I needed those files to back out 123456-01

# cd /somewhere/else
# find pkg -name undo.Z | grep 123456-01 \
| cpio -pdv /var/sadm
# patchrm 123456-01

Before you do this, test it and convince yourself that it works.  I
have not seen Sun documentation (either docs.sun.com or
sunsolve.sun.com) that says that this is a good idea - but I haven't
seen any better method for getting rid of the cruft that builds up in
/var/sadm either.

I suspect that further discussion on this topic would be best directed
to [EMAIL PROTECTED] or sun-managers mailing list (see
http://www.sunmanagers.org/).

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] /var/sadm on zfs?

2008-05-31 Thread Mike Gerdts
On Sat, May 31, 2008 at 7:37 PM, Jim Litchfield at Sun
[EMAIL PROTECTED] wrote:
 I think you'll find that any attempt to make zones (certainly whole root
 ones) will fail after this.

In no way am I speaking authoritatively on this (I really wish all the
install, lu, patch code was open...), but are you perhaps confusing
the patch backout data with the CAS and modifiable (e, v) files
that have pristine values stashed under
/var/sadm/pkg/<pkgname>/save/pspool/<pkgname>?

My understanding was that files of type f are installed into
non-global zones by copying the file from the installed location in
the global zone.  So as to not pick up changes that occurred to
editable (e - e.g. /etc/passwd) and volatile (v - e.g.
/var/adm/messages) files in the global zone, the pristine copy is picked up
from the pspool directories I mention above.

To try things out, let's play with alternate root installation of
studio 11 at /opt/studio11 - on ZFS.  Playing is best done on
clones...

zfs snapshot pool0/opt/[EMAIL PROTECTED]
zfs clone pool0/opt/[EMAIL PROTECTED] pool0/junk

I have patch 121015-06 installed - which patches SPROcc.  It was
installed with "patchadd -R /opt/studio11 -M `pwd` *" with several
patches in the cwd while running the command.

 The contents of /pool0/junk/var/sadm/pkg/SPROcc look like:

var/sadm/pkg/SPROcc
var/sadm/pkg/SPROcc/install
var/sadm/pkg/SPROcc/install/copyright
var/sadm/pkg/SPROcc/install/depend
var/sadm/pkg/SPROcc/pkginfo
var/sadm/pkg/SPROcc/save
var/sadm/pkg/SPROcc/save/121015-06
var/sadm/pkg/SPROcc/save/121015-06/undo.Z
var/sadm/pkg/SPROcc/save/pspool
var/sadm/pkg/SPROcc/save/pspool/SPROcc
var/sadm/pkg/SPROcc/save/pspool/SPROcc/install
var/sadm/pkg/SPROcc/save/pspool/SPROcc/install/copyright
var/sadm/pkg/SPROcc/save/pspool/SPROcc/install/depend
var/sadm/pkg/SPROcc/save/pspool/SPROcc/pkginfo
var/sadm/pkg/SPROcc/save/pspool/SPROcc/pkgmap
var/sadm/pkg/SPROcc/save/pspool/SPROcc/save
var/sadm/pkg/SPROcc/save/pspool/SPROcc/save/121015-06
var/sadm/pkg/SPROcc/save/pspool/SPROcc/save/121015-06/undo.Z

I back out the patch (patchrm -R /pool0/junk 121015-06), then
reinstall it without backout info (patchadd -R /pool0/junk -d
121015-06).  Things have changed to...

var/sadm/pkg/SPROcc
var/sadm/pkg/SPROcc/install
var/sadm/pkg/SPROcc/install/checkinstall
var/sadm/pkg/SPROcc/install/copyright
var/sadm/pkg/SPROcc/install/depend
var/sadm/pkg/SPROcc/install/patch_checkinstall
var/sadm/pkg/SPROcc/install/patch_postinstall
var/sadm/pkg/SPROcc/pkginfo
var/sadm/pkg/SPROcc/save
var/sadm/pkg/SPROcc/save/pspool
var/sadm/pkg/SPROcc/save/pspool/SPROcc
var/sadm/pkg/SPROcc/save/pspool/SPROcc/install
var/sadm/pkg/SPROcc/save/pspool/SPROcc/install/copyright
var/sadm/pkg/SPROcc/save/pspool/SPROcc/install/depend
var/sadm/pkg/SPROcc/save/pspool/SPROcc/pkginfo
var/sadm/pkg/SPROcc/save/pspool/SPROcc/pkgmap

Notice the lack of undo.Z files (and associated patch directories),
but the rest looks the same.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] panic: avl_find() succeeded inside avl_add()

2008-05-31 Thread Mike Gerdts
I just experienced a zfs-related crash.  I have filed a bug (don't
know number - grumble). I have a crash dump but little free space.  If
someone would like some more info from the core, please let me know in
the next few days.

 ::status
debugging crash dump /pool0/vmcore.0 (64-bit) from sun
operating system: 5.11 snv_76 (sun4u)
panic message: avl_find() succeeded inside avl_add()
dump content: kernel pages only
 ::stack
vpanic(128d918, 3000c1daab0, 2a101673418, 0, 3000b6a3770, 1229000)
avl_add+0x38(300106ee398, 3000c1daab0, 649e740, 3001b377980,
800271d6, 128d800)
mzap_open+0x18c(cf, 300106ee388, 3000c94caa0, 3001b377980, 300106ee370,
300106ee358)
zap_lockdir+0x54(300039bce68, 26b32, 0, 0, 1, 2a1016738f8)
zap_cursor_retrieve+0x40(2a1016738f0, 2a1016737d8, 0, 1, 2a1016738f0, 2)
zfs_readdir+0x224(3, 2a101673aa0, 3000dfc7980, 2, 2000, 2a1016737f0)
fop_readdir+0x44(3000df541c0, 2a101673aa0, 3000cb58dc8, 2a101673a9c, 2000,
111dd48)
getdents64+0x90(8, 2a101673ad0, 2000, 2004, 3001e54cac8, ff0b)
syscall_trap32+0xcc(8, ff0f4000, 2000, 2004, 0, ff0b)

# zpool status pool0
  pool: pool0
 state: ONLINE
 scrub: none requested
config:

NAME  STATE READ WRITE CKSUM
pool0 ONLINE   0 0 0
  mirror  ONLINE   0 0 0
c0t1d0s7  ONLINE   0 0 0
c0t0d0s7  ONLINE   0 0 0

errors: No known data errors

# zpool get all pool0
NAME   PROPERTY VALUE   SOURCE
pool0  size 27.2G   -
pool0  used 24.9G   -
pool0  available2.38G   -
pool0  capacity 91% -
pool0  altroot  -   default
pool0  health   ONLINE  -
pool0  guid 8395455814253440113  -
pool0  version  8   default
pool0  bootfs   -   default
pool0  delegation   on  default
pool0  autoreplace  off default
pool0  temporaryoff default

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] panic: avl_find() succeeded inside avl_add()

2008-05-31 Thread Mike Gerdts
On Sat, May 31, 2008 at 8:48 PM, Mike Gerdts [EMAIL PROTECTED] wrote:
 I just experienced a zfs-related crash.  I have filed a bug (don't
 know number - grumble). I have a crash dump but little free space.  If
 someone would like some more info from the core, please let me know in
 the next few days.

And I am able to reproduce...

From a fresh crash:

 ::status
debugging crash dump vmcore.6 (64-bit) from sun
operating system: 5.11 snv_76 (sun4u)
panic message: avl_find() succeeded inside avl_add()
dump content: kernel pages only
 ::stack
vpanic(128d918, 30011ba6638, 2a101594fb8, 0, 30011ba6868, 1229000)
avl_add+0x38(30011bad320, 30011ba6638, 649e740, 3000d2d8180,
800271d6, 128d800)
mzap_open+0x18c(cf, 30011bad310, 30011b2b480, 3000d2d8180, 30011bad2f8,
30011bad2e0)
zap_lockdir+0x54(30004910c08, 26b32, 0, 0, 1, 2a1015951e8)
zap_lookup+0x18(30004910c08, 26b32, 2a101595680, 8, 1, 2a1015952a8)
zfs_dirent_lock+0x2f8(2a101595370, 3000b859518, 2a101595680, 2a101595378, 6, 4)
zfs_dirlook+0x19c(3000b859518, 2a101595680, 2a101595678, 2a101595680, 0, 0)
zfs_lookup+0x188(3000b855d00, 2a101595680, 2a101595678, 2a101595940, 0,
30004c32440)
fop_lookup+0x4c(3000b855d00, 2a101595680, 2a101595678, 2a101595940, 0,
3000101fa40)
lookuppnvp+0x324(2a101595940, 0, 0, 3000b855d00, 30008c61b70, 3000101fa40)
lookuppnat+0x10c(3000c864600, 0, 0, 0, 2a101595ad8, 0)
lookupnameat+0x5c(c461c, 0, 0, 0, 2a101595ad8, 0)
cstatat_getvp+0x16c(18bd000, c461c, 1, 0, 2a101595ad8, 0)
cstatat64_32+0x58(ffd19553, c461c, 1, ffbfbcc0, 1000, 0)
syscall_trap32+0xcc(c461c, ffbfbcc0, c462c, 0, ff00, 80808080)

 3000c864600::print vnode_t
{
...
v_path = 0x3000c837458
/ws/mount/onnv-gate/usr/src/uts/sun4u/serengeti/unix

$ ls /ws/mount/onnv-gate/usr/src/uts/sun4u/serengeti/unix
Makefile   debug64/

$ find /ws/mount/onnv-gate/usr/src/uts/sun4u/serengeti/unix
/ws/mount/onnv-gate/usr/src/uts/sun4u/serengeti/unix
/ws/mount/onnv-gate/usr/src/uts/sun4u/serengeti/unix/.make.state.lock
/ws/mount/onnv-gate/usr/src/uts/sun4u/serengeti/unix/debug64
panic

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Create ZFS now, add mirror later

2008-05-28 Thread E. Mike Durbin
Is there a way to create a zfs file system
(e.g. zpool create boot /dev/dsk/c0t0d0s1)

Then, (after vacating the old boot disk) add another
device and make the zpool a mirror?

(as in: zpool create boot mirror /dev/dsk/c0t0d0s1 /dev/dsk/c1t0d0s1)
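
(I'm guessing the command, if it exists, would be something along the
lines of "zpool attach boot /dev/dsk/c0t0d0s1 /dev/dsk/c1t0d0s1", but I
haven't tried it, hence the question.)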

Thanks!

emike
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS in S10U6 vs openSolaris 05/08

2008-05-27 Thread Mike Gerdts
On Tue, May 27, 2008 at 12:44 PM, Rob Logan [EMAIL PROTECTED] wrote:
   There is something more to consider with SSDs uses as a cache device.
 why use SATA as the interface? perhaps
 http://www.tgdaily.com/content/view/34065/135/
 would be better? (no experience)

 cards will start at 80 GB and will scale to 320 and 640 GB next year.
 By the end of 2008, Fusion io also hopes to roll out a 1.2 TB card.
 160 parallel pipelines that can read data at 800 megabytes per second
 and write at 600 MB/sec 4K blocks and then streaming eight
 simultaneous 1 GB reads and writes.  In that test, the ioDrive
 clocked in at 100,000 operations per second...  beat $30 dollars a GB,

These could be rather interesting as swap devices.  On the face of it,
$30/GB is pretty close to the list price of taking a T5240 from 32 GB
to 64 GB.  However, it is *a lot* less than feeding system-board DIMM
slots to workloads that use a lot of RAM but are fairly inactive.  As
such, a $10k PCIe card may be able to allow a $42k 64 GB T5240 to handle
5+ times the number of not-too-busy J2EE instances.
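
For what it's worth, the mechanics of trying it look trivial - assuming
the card shows up as an ordinary block device, adding it as swap should
just be the usual (device name invented):

# swap -a /dev/dsk/c2t0d0s1

The interesting part is modelling how those mostly-idle instances behave
once their pages live out there.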

If anyone's done any modelling or testing of such an idea, I'd love to
hear about it.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS: A general question

2008-05-24 Thread Ellis, Mike
I like the link you sent along... They did a nice job with that. 
(but it does show that mixing and matching vastly different drive-sizes
is not exactly optimal...)

http://www.drobo.com/drobolator/index.html

Doing something like this for ZFS, allowing people to create pools by
mixing/matching plain drives, raid1, and raidz/z2 vdevs in a zpool, would
make for a pretty cool page. If one of the statistical gurus can add MTBF
and MTTDataLoss etc. to that as a calculator at the bottom, that would be
even better. (someone did some static graphs for different thumper
configurations for this in the past... This would just make that more
general purpose/GUI driven... Sounds like a cool project)

--

No mention anywhere of removing drives thereby reducing capacity
though... Raid-re-striping isn't all that much fun, especially with
larger drives... (and even ZFS lacks some features in this area for now)


See the answer to your other question below. (from their FAQ)

-- MikeE



What file systems does drobo support?
 
RESOLUTION:

Drobo is a usb external disk array that is formatted by the host
operating system (Windows or OS X). We currently support NTFS, HFS+, and
FAT32 file systems with firmware revision 1.0.2.

Drobo is not a ZFS file system.

STATUS:

Current specification 1.0.2

Applies to:
Drobo DRO4D-U  




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Steve Hull
Sent: Saturday, May 24, 2008 7:00 PM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS: A general question

OK so in my (admittedly basic) understanding of raidz and raidz2, these
technologies are very similar to raid5 and raid6.  BUT if you set up one
disk as a raidz vdev, you (obviously) can't maintain data after a disk
failure, but you are protected against data corruption that is NOT a
result of disk failure.  Right?

So is there a resource somewhere that I could look at that clearly
spells out how many disks I could have vs. how much resulting space I
would have that would still protect me against disk failure (a la the
Drobolator http://www.drobo.com/drobolator/index.html)?  I mean, if I
have a raidz vdev with one disk, then I add a disk, am I protected from
disk failure?  Is it the case that I need to have disks in groups of 4
to maintain protection against single disk failure with raidz and in
groups of 5 for raidz2?  It gets even more confusing if I wanted to add
disks of varying sizes...  

And you said I could add a disk (or disks) to a mirror -- can I force
add a disk (or disks) to a raidz or raidz2?  Without destroying and
rebuilding as I read would be required somewhere else?

And if I create a zpool and add various single disks to it (without
creating raidz/mirror/etc), is it the case that the zpool is essentially
functioning like spanning raid?  Ie, no protection at all??

Please either point me to an existing resource that spells this out a
little clearer or give me a little more explanation around it.

And...  do you think that the Drobo (www.drobo.com) product is
essentially just a box with OpenSolaris and ZFS on it?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs iostat

2008-05-18 Thread Mike Gerdts
On Sun, May 18, 2008 at 7:34 AM, Karsten L. [EMAIL PROTECTED] wrote:
 Hi guys,

 is there a way to find out the current i/o-stats for a zfs-partition? I know 
 of zpool iostat, but it only lists the i/o-stats of the whole pool. I need 
 something like zfs iostat, or how can I get the stats with general 
 systemtools of a particular directory?

 any idea would be appreciated
 karsten

Have you tried fsstat?  I think it will do what you are looking for
whether it is zfs, ufs, tmpfs, etc.
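
For example, something like this gives per-filesystem-type or
per-mountpoint numbers at 5 second intervals:

$ fsstat zfs 5
$ fsstat /export/home 5

For a single directory rather than a whole filesystem you would probably
have to resort to dtrace.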


-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs! mirror and break

2008-05-08 Thread Mike DeMarco
I currently have a zpool with two 8Gbyte disks in it. I need to replace them 
with a single 56Gbyte disk.

With Veritas I would just add the disk in as a mirror and break off the other
plex, then destroy it.

I see no way of being able to do this with zfs.

Being able to migrate data without having to unmount and remount filesystems is 
very 
important to me.

Can anyone say when such functionality will be implemented?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How many ZFS pools is it sensible to use on a single server?

2008-04-15 Thread Mike Gerdts
On Tue, Apr 15, 2008 at 9:22 AM, David Collier-Brown [EMAIL PROTECTED] wrote:
   We've discussed this in considerable detail, but the original
  question remains unanswered:  if an organization *must* use
  multiple pools, is there an upper bound to avoid or a rate
  of degradation to be considered?

I have a keen interest in this as well.  I would really like zones to
be able to independently fail over between hosts in a zone farm.  The
work coming out of the Indiana, IPS, Caiman, etc. projects imply that
zones will have to be on zfs.  In order to fail zones over between
systems independently either I need to have a zpool per zone or I need
to have per-dataset replication.  Considering that with some workloads
20+ zones on a T2000 is quite feasible, a T5240 could be pushing 80+
zones and as such a relatively large number of zpools.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZVOL access permissions?

2008-04-12 Thread Ellis, Mike
Could someone kindly provide some details on using a zvol in sparse-mode?

Wouldn't the COW nature of zfs (assuming COW still applies on ZVOLS) quickly 
erode the sparse nature of the zvol?

Would sparse data-presentation only work by delegating a part of a zpool to a 
zone, but that's at the file-level, not raw?

Thanks 

  -- mikee


- Original Message -
From: [EMAIL PROTECTED] [EMAIL PROTECTED]
To: zfs-discuss@opensolaris.org zfs-discuss@opensolaris.org
Sent: Sat Apr 12 10:02:18 2008
Subject: [zfs-discuss] ZVOL access permissions?

How can I set up a ZVOL that's accessible by non-root users, too? The intent is 
to use sparse ZVOLs as raw disks in virtualization (reducing overhead compared 
to file-based virtual volumes).

Thanks,
-mg
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [storage-discuss] Preventing zpool imports on boot

2008-02-15 Thread Mike Gerdts
On Thu, Feb 14, 2008 at 11:17 PM, Dave [EMAIL PROTECTED] wrote:
  I don't want Solaris to import any pools at bootup, even when there were
  pools imported at shutdown/at crash time. The process to prevent
  importing pools should be automatic and not require any human
  intervention. I want to *always* import the pools manually.

  Hrm... what if I deleted zpool.cache after importing/exporting any pool?
  Are these the only times zpool.cache is created?

  I wish zpools had a property of 'atboot' or similar, so that you could
  mark a zpool to be imported at boot or not.


Like this?

 temporary

 By default, all pools are persistent and  are  automati-
 cally  opened  when the system is rebooted. Setting this
 boolean property to on causes the pool to  exist  only
 while  the  system is up. If the system is rebooted, the
 pool has to be manually imported  by  using  the  zpool
 import  command.  Setting this property is often useful
 when using pools on removable media, where  the  devices
 may  not  be  present when the system reboots. This pro-
 perty can also be referred to by  its  shortened  column
 name,  temp.
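
On builds that have that property I assume it is set like any other pool
property, e.g.:

# zpool set temporary=on tank

...with "tank" standing in for your pool.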


  (I am trying to move this thread over to zfs-discuss, since I originally
  posted to the wrong alias)

storage-discuss trimmed in my reply.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZIL controls in Solaris 10 U4?

2008-02-02 Thread Mike Gerdts
On Jan 30, 2008 2:27 PM, Jonathan Loran [EMAIL PROTECTED] wrote:
 Before ranting any more, I'll do the test of disabling the ZIL.  We may
 have to build out these systems with Open Solaris, but that will be hard
 as they are in production.  I would have to install the new OS on test
 systems and swap out the drives during scheduled down time.  Ouch.

Live upgrade can be very helpful here, either for upgrading or
applying a flash archive.  Once you are comfortable that Nevada
performs like you want, you could prep the new OS on alternate slices
or broken mirrors.  Activating the updated OS should take only a few
seconds longer than a standard init 6.  Failback is similarly easy.
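
A rough sketch of what that looks like (device and image paths are
placeholders):

# lucreate -n nevada -m /:/dev/dsk/c0t0d0s4:ufs
# luupgrade -u -n nevada -s /net/installserver/export/nv_xx
# luactivate nevada
# init 6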

I can't remember the last time I swapped physical drives to minimize
the outage during an upgrade.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Resizing a mirror

2008-01-29 Thread Mike Gerdts
On Jan 29, 2008 5:55 PM, Andrew Gabriel [EMAIL PROTECTED] wrote:
 Having attached new bigger disks to a mirror, and detached all the older
 smaller disks, how to I tell ZFS to expand the size of the mirror to
 match that of the bigger disks? I had a look through the system admin
 guide, but couldn't find this anywhere.

 In SVM, you just say metattach mirror with no devices listed to
 achieve this, but the equivalent in zpool gives a syntax error.

I thought I saw something on the list lately saying that there is a
bug that requires you to export the zpool and then import it to get
the additional space to be seen.
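
If that is the bug in question, the workaround is presumably nothing
worse than:

# zpool export tank
# zpool import tank

(with "tank" standing in for your pool, and the pool obviously being
unavailable in between).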

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Moving zfs to an iscsci equallogic LUN

2008-01-15 Thread Ellis, Mike
Use zpool replace to swap one side of the mirror with the iscsi lun.
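
Roughly (device names are placeholders, the second being the iSCSI LUN):

# zpool replace mypool c1t1d0 c4t0d0

ZFS resilvers onto the new device and detaches the old one when the
resilver completes.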

-- mikee


- Original Message -
From: [EMAIL PROTECTED] [EMAIL PROTECTED]
To: zfs-discuss@opensolaris.org zfs-discuss@opensolaris.org
Sent: Tue Jan 15 08:46:40 2008
Subject: Re: [zfs-discuss] Moving zfs to an iscsci equallogic LUN

What would be the commands for the three way mirror or an example of what your 
describing. I thought the 200gb would have to be the same size to attach to the 
existing mirror and you would have to attach two LUN disks vs one LUN.  Once it 
attaches it automatically reslivers or syncs the disk then if I wanted to I 
could remove the two 73 GB disks or still keep them in the pool and expand the 
pool later if I want?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware for zfs home storage

2008-01-14 Thread mike
On 1/14/08, eric kustarz [EMAIL PROTECTED] wrote:

 On Jan 14, 2008, at 11:08 AM, Tim Cook wrote:

  www.mozy.com appears to have unlimited backups for 4.95 a month.
  Hard to beat that.  And they're owned by EMC now so you know they
  aren't going anywhere anytime soon.

mozy's been okay, but only for windows/OS X.

uploading can be slow sometimes...

i do like rsync.net since it is a totally standards based solution,
not proprietary.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hardware for zfs home storage

2008-01-14 Thread mike
except in my experience it is piss poor slow... but yes it is another
option that is -basically- built on standards (i say that only because
it's not really a traditional filesystem concept)

On 1/14/08, David Magda [EMAIL PROTECTED] wrote:

 On Jan 14, 2008, at 17:15, mike wrote:

  On 1/14/08, eric kustarz [EMAIL PROTECTED] wrote:
 
  On Jan 14, 2008, at 11:08 AM, Tim Cook wrote:
 
  www.mozy.com appears to have unlimited backups for 4.95 a month.
  Hard to beat that.  And they're owned by EMC now so you know they
  aren't going anywhere anytime soon.
 
  mozy's been okay, but only for windows/OS X.
 
  uploading can be slow sometimes...
 
  i do like rsync.net since it is a totally standards based solution,
  not proprietary.

 There's also Amazon's S3. Published APIs so you can use already
 available utilities / libraries into whatever scripted solution you
 can think of.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Can't access my data

2008-01-04 Thread Mike Gerdts
On Jan 4, 2008 2:42 PM, George Shepherd - Sun Microsystems Home system
[EMAIL PROTECTED] wrote:
 Hi Folks..

 I have/had a zpool containing one filesystem.

 I had to change my hostid and needed to import my pool, (I've done his
 OK in the past).
 After the import the mount of my filesystem failed.

 # zpool import homespool
 cannot mount 'homespool/homes': mountpoint or dataset is busy
  ^^

 # zfs list
 NAME  USED  AVAIL  REFER  MOUNTPOINT
 homespool9.91G   124G18K  /homespool
 homespool/homes  9.91G   124G  9.91G  /homes
^^
Is something else already mounted at /homes?

# df -k /homes

Did you or someone else cd /homes before trying this, thus
causing the mount point to be busy?

# fuser /homes

If you still can't resolve it

# zfs set mountpoint=/somewhere_else homespool/homes
# zfs mount -a (not sure this is needed)
# cd /somewhere_else

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help needed ZFS vs Veritas Comparison

2007-12-28 Thread Mike Gerdts
On Dec 28, 2007 8:40 AM, Sengor [EMAIL PROTECTED] wrote:
 Real comparison of features should include scenarios such as:

 - how ZFS/VxVM compare in BCV like environments (eg. when volumes are
 presented back to the same host)
 - how they all cope with various multipathing solutions out there
 - Filesystem vs Volume snapshots
 - Portability within cluster-like environments (SCSI reserves & LUN
 visibility to multiple synchronous hosts)
 - Disaster recovery scenarios
 - Ease/Difficulty with data migrations across physical arrays
 - Boot volumes
 - Online vs Offline attribute/parameter changes

Very good list!

 I can't think of more right now, it's way past midnight here ;)

How about these?

- Integration with backup system
- Active-active cluster (parallel file system) capabilities
- Integration with OS maintenance activities (install, upgrade, patching, etc.)
- Relative performance on anticipated workload
- Staffing issues (what do people know, how many hours to train, how
long before proficiency)
- Supportability on multiple platforms at the site (e.g. Solaris,
Linux, HP-UX, AIX, ...)
- Impact of failure modes (missing license keys, especially during major
system changes; on-disk corruption)
- Opportunities to do things previously not possible

ZFS doesn't win on many of those, but with the improvements that I
have seen throughout the storage stack it is somewhat likely that the
required improvements are already on the roadmap.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] fclose failing at 2G on a ZFS filesystem

2007-12-25 Thread Mike Gerdts
On Dec 25, 2007 1:33 PM, K [EMAIL PROTECTED] wrote:

   if (fclose (file)) {
   fprintf (stderr, "fatal: unable to close temp file: %s\n",
strerror (errno));
   exit (1);

 I don't understand why the above piece of code is failing...

What command line is used to compile the code?  I would guess that you
don't have large file support.  A variant of the following would
probably be good:

cc -c $CFLAGS `getconf LFS_CFLAGS` myprog.c
cc -o myprog myprog.o $LDFLAGS `getconf LFS_LDFLAGS`
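
As a sanity check after rebuilding, I believe a largefile-aware 32-bit
binary will reference the 64-bit offset interfaces, which you can spot
with something like:

$ nm myprog | egrep 'open64|stat64'

If nothing shows up, the flags probably didn't take effect.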

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Does Oracle support ZFS as a file system with Oracle RAC?

2007-12-18 Thread Mike Gerdts
On Dec 18, 2007 11:01 AM, David Runyon [EMAIL PROTECTED] wrote:
 Does anyone know this?

There are multiple file system usages involved in Oracle RAC:

1) Oracle Home - This is where the oracle software lives.  This can be
   on a file system shared among all nodes or a per-host file system.
   ZFS should work fine in the per-host configuration, but I don't
   know about an official support statement.  This is likely not very
   important because of...
2) Database files - I'll lump redo logs, etc. in with this.  In Oracle
   RAC these must live on a shared-rw (e.g. clustered VxFS, NFS) file
   system.  ZFS does not do this.

If you drink the Oracle kool-aid and are using 10g or later the
database files will go into ASM, which seems to share a number of
characteristics with (but is largely complementary to) ZFS.  That is,
it spreads writes among all allocated disks, provides redundancy
without an underlying volume manager or hardware RAID, is transaction
safe, etc.  I am pretty sure that ASM also supports per-block
checksums, space efficient snapshots, block level incremental backups,
etc.  Although ASM is a relatively new technology, I think it has many
more hours of runtime and likely more space in production use than
ZFS.

I think that ZFS holds a lot of promise for shared-nothing database
clusters, such as is being done by Greenplum with their extended
variant of Postgres.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] /usr/bin and /usr/xpg4/bin differences

2007-12-16 Thread Mike Gerdts
On Dec 16, 2007 1:16 AM, Sasidhar Kasturi [EMAIL PROTECTED] wrote:
 Yes .. but have a look at the bug i am working on ..
 Bug id:6493125

 http://bugs.opensolaris.org/view_bug.do?bug_id=6493125

 Thank you,
  Sasidhar.

I'm not sure what question you are asking...

1) Why are there two variants of df?

If this is the question, it is likely because Solaris already had df
that had some behavior that people had grown to depend on.  A
standards committee came along and said We need the -P and -v option
and the formatting of these headers needs to be ... and these columns
needs to be like   Sun needed to not disrupt existing customers
and needed XPG4 compliance to satisfy a potentially different set of
customers.  Thus, the XPG4 variant got some small changes that (at the
time?) weren't appropriate for the traditional version.

There is some likelihood that the bug you are working on is not the
first time that the differences between the XPG4 variant and the
/usr/bin variant have decreased.

2) How could the same source code produce different output?

It looks as though enabling the -P option is pretty straightforward -
modification of the getopts() string at line 593 and removal of the
#ifdef at line 605 and the corresponding #endif at line 608 should be
sufficient.

590 #ifdef XPG4
591 while ((arg = getopt(argc, argv, "F:o:abehkVtgnlPZ")) != EOF) {
592 #else
593 while ((arg = getopt(argc, argv, "F:o:abehkVtgnlvZ")) != EOF) {
594 #endif
595 if (arg == 'F') {
596 if (F_option)
597 errmsg(ERR_FATAL + ERR_USAGE,
598 "more than one FSType specified");
599 F_option = 1;
600 FSType = optarg;
601 } else if (arg == 'V' && ! V_option) {
602 V_option = TRUE;
603 } else if (arg == 'v' && ! v_option) {
604 v_option = TRUE;
605 #ifdef XPG4
606 } else if (arg == 'P' && ! P_option) {
607 SET_OPTION(P);
608 #endif

Of course, updating the usage error, man page, etc. would be
appropriate too.  You can see a few other #ifdef XPG4 blocks that
show the quite small differences between the two variants.

Also... since there is nothing zfs-specific here, opensolaris-code may
be a more appropriate forum.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] /usr/bin and /usr/xpg4/bin differences

2007-12-15 Thread Mike Gerdts
On Dec 15, 2007 11:31 PM, KASTURI VENKATA SESHA SASIDHAR
[EMAIL PROTECTED] wrote:
 Hello,
 I am working on open solaris bugs .. and need to change the code of 
 df in the above two folders..

 I would like to know why there are two df's with diff options in the 
 respective folders..
 /usr/bin/df is different is from /usr/xpg4/bin/df!!

The code for both variants of df come from the same source
(usr/src/cmd/fs.d/df.c).  The xpg4 variant is compiled with -DXPG4.
After a build in usr/src/cmd/fs.d is complete you will see the
following:

$ ls df*
df  df.odf.po.xpg4  df.xpg4
df.cdf.po   df.xcl  df.xpg4.o

It looks to me as though df becomes /usr/bin/df and df.xpg4 becomes
/usr/xpg4/bin/df.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Error in zpool man page?

2007-12-07 Thread Mike Dotson

On Fri, 2007-12-07 at 08:02 -0800, jonathan soons wrote:
 The man page gives this form:
  zpool create [-fn] [-R root] [-m mountpoint] pool vdev ...
 however, lower down, there is this command:
 # zpool create mirror c0t0d0 c0t1d0 mirror c1t0d0 c1t1d0
 Isn't the pool element missing in the command?

In the command you pasted above yes, however, looking at the man pages I
have, I see the correct command line.  What OS and rev was this from?

  
 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
-- 
Mike Dotson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Error in zpool man page?

2007-12-07 Thread Mike Dotson

On Fri, 2007-12-07 at 08:24 -0800, jonathan soons wrote:
 SunOS 5.10  Last change: 25 Apr 2006 
 
 Yes, I see that my other server is more up to date.
 
 SunOS 5.10  Last change: 13 Feb 2007   
 This one was recently installed.

What OS rev?  (more /etc/release)  

I don't have any systems later than update 3 patched to January 2007 and
have the correct man page.

Looks like perhaps bug 6419899, which was fixed in patch 119246-16.
119246-21 was released on 11-DEC-2006 and was included in Solaris 10 11/06
(update 3).  Latest is rev 27 of patch 119246.

 
 Is there a patch that was not included with 10_Recommended?
  
 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
-- 
Thanks...


Mike Dotson
Area System Support Engineer - ACS West
Phone: (503) 343-5157
[EMAIL PROTECTED]


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] x4500 w/ small random encrypted text files

2007-11-29 Thread Mike Gerdts
On Nov 29, 2007 11:41 AM, Richard Elling [EMAIL PROTECTED] wrote:
 It depends on the read pattern.  If you will be reading these small
 files randomly, then there may be a justification to tune recordsize.
 In general, backup/restore workloads are not random reads, so you
 may be ok with the defaults.  Try it and see if it meets your
 performance requirements.
  -- richard

It seems as though backup/restore of small files would be a random
pattern, unless you are using zfs send/receive.  Since no enterprise
backup solution that I am aware of uses zfs send/receive, most people
doing backups of zfs are using something that does something along the
lines of

while readdir ; do
open file
read from file
write to backup stream
close file
done

Since files are unlikely to be on disk in a contiguous manner, this
looks like a random read operation to me.

Am I wrong?

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Kernel panic receiving incremental snapshots

2007-11-23 Thread Mike Gerdts
On Aug 25, 2007 8:36 PM, Stuart Anderson [EMAIL PROTECTED] wrote:
 Before I open a new case with Sun, I am wondering if anyone has seen this
 kernel panic before? It happened on an X4500 running Sol10U3 while it was
 receiving incremental snapshot updates.

 Thanks.


 Aug 25 17:01:50 ldasdata6 ^Mpanic[cpu0]/thread=fe857d53f7a0:
 Aug 25 17:01:50 ldasdata6 genunix: [ID 895785 kern.notice] dangling dbufs 
 (dn=fe82a3532d10, dbuf=fe8b4e338b90)

I saw dangling dbufs panics beginning with S10U4 beta and the then
current (May '07) nevada builds.  If you are running a kernel newer
than the x86 equivalent of 125100-10, you may be seeing the same
thing.  The panics I saw were not triggered by zfs receive, so you may
be seeing something different.  An IDR was produced for me.  If you
have Sun support search for my name, you can likely get the same IDR
(errr, an IDR with the same fix - mine was SPARC) to see if it
addresses your problem.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Home Motherboard

2007-11-22 Thread mike
I actually have a related motherboard, chassis, dual power-supplies
and 12x400 gig drives already up on ebay too. If I recall Areca cards
are supported in OpenSolaris...

http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItemitem=300172982498


On 11/22/07, Jason P. Warr [EMAIL PROTECTED] wrote:
 If you want a board that is a steal look at this one:

 http://www.ascendtech.us/itemdesc.asp?ic=MBTAS2882G3NR

 Tyan S2882, Dual Socket 940 Opteron, 8 DDR slots, 2  PCI-X 133 busses with 2 
 slots each, Dual Core support.

 $80.

 Pair it with a couple of Opteron 270's from ebay for $195:

 http://cgi.ebay.com/MATCH-PAIR-AMD-Opteron-270-64Bit-DualCore-940pin-2Ghz_W0QQitemZ290182379420QQihZ019QQcategoryZ80142QQssPageNameZWDVWQQrdZ1QQcmdZViewItem

 Granted, you need an E-ATX case and an EPS12V power supply, but those are not
 that expensive.  For less than $600 you can have a hell of a base
 server-grade system with 4 cores and 2-4G of ram.


 - Original Message -
 From: Rob Logan [EMAIL PROTECTED]
 To: zfs-discuss@opensolaris.org
 Sent: Wednesday, November 21, 2007 11:17:19 PM (GMT-0600) America/Chicago
 Subject: [zfs-discuss] Home Motherboard

 grew tired of the recycled 32bit cpus in
 http://www.opensolaris.org/jive/thread.jspa?messageID=127555

 and bought this to put the two marvell88sx cards in:
 $255 http://www.supermicro.com/products/motherboard/Xeon3000/3210/X7SBE.cfm
  http://www.supermicro.com/manuals/motherboard/3210/MNL-0970.pdf
 $195 1333FSB 2.6GHz Xeon 3075 (basically an E6750)
  Any Core 2 Quad/Duo in LGA775 will work, including 45nm dies:
  http://rob.com/sun/x7sbe/45nm-pricing.jpg
 $270 Four 1G PC2-6400 DDRII 800MHz 240-pin ECC Unbuffered SDRAM
 $ 55 LOM (IPMI and Serial over LAN)
  http://www.supermicro.com/manuals/other/AOC-SIMSOLC-HTC.pdf

 # /usr/X11/bin/scanpci
 pci bus 0x cardnum 0x00 function 0x00: vendor 0x8086 device 0x29f0
  Intel Corporation Server DRAM Controller

 pci bus 0x cardnum 0x01 function 0x00: vendor 0x8086 device 0x29f1
  Intel Corporation Server Host-Primary PCI Express Bridge

 pci bus 0x cardnum 0x1a function 0x00: vendor 0x8086 device 0x2937
  Intel Corporation USB UHCI Controller #4

 pci bus 0x cardnum 0x1a function 0x01: vendor 0x8086 device 0x2938
  Intel Corporation USB UHCI Controller #5

 pci bus 0x cardnum 0x1a function 0x02: vendor 0x8086 device 0x2939
  Intel Corporation USB UHCI Controller #6

 pci bus 0x cardnum 0x1a function 0x07: vendor 0x8086 device 0x293c
  Intel Corporation USB2 EHCI Controller #2

 pci bus 0x cardnum 0x1c function 0x00: vendor 0x8086 device 0x2940
  Intel Corporation PCI Express Port 1

 pci bus 0x cardnum 0x1c function 0x04: vendor 0x8086 device 0x2948
  Intel Corporation PCI Express Port 5

 pci bus 0x cardnum 0x1c function 0x05: vendor 0x8086 device 0x294a
  Intel Corporation PCI Express Port 6

 pci bus 0x cardnum 0x1d function 0x00: vendor 0x8086 device 0x2934
  Intel Corporation USB UHCI Controller #1

 pci bus 0x cardnum 0x1d function 0x01: vendor 0x8086 device 0x2935
  Intel Corporation USB UHCI Controller #2

 pci bus 0x cardnum 0x1d function 0x02: vendor 0x8086 device 0x2936
  Intel Corporation USB UHCI Controller #3

 pci bus 0x cardnum 0x1d function 0x07: vendor 0x8086 device 0x293a
  Intel Corporation USB2 EHCI Controller #1

 pci bus 0x cardnum 0x1e function 0x00: vendor 0x8086 device 0x244e
  Intel Corporation 82801 PCI Bridge

 pci bus 0x cardnum 0x1f function 0x00: vendor 0x8086 device 0x2916
  Intel Corporation  Device unknown

 pci bus 0x cardnum 0x1f function 0x02: vendor 0x8086 device 0x2922
  Intel Corporation 6 port SATA AHCI Controller

 pci bus 0x cardnum 0x1f function 0x03: vendor 0x8086 device 0x2930
  Intel Corporation SMBus Controller

 pci bus 0x cardnum 0x1f function 0x06: vendor 0x8086 device 0x2932
  Intel Corporation Thermal Subsystem

 pci bus 0x0001 cardnum 0x00 function 0x00: vendor 0x8086 device 0x0329
  Intel Corporation 6700PXH PCI Express-to-PCI Bridge A

 pci bus 0x0001 cardnum 0x00 function 0x01: vendor 0x8086 device 0x0326
  Intel Corporation 6700/6702PXH I/OxAPIC Interrupt Controller A

 pci bus 0x0001 cardnum 0x00 function 0x02: vendor 0x8086 device 0x032a
  Intel Corporation 6700PXH PCI Express-to-PCI Bridge B

 pci bus 0x0001 cardnum 0x00 function 0x03: vendor 0x8086 device 0x0327
  Intel Corporation 6700PXH I/OxAPIC Interrupt Controller B

 pci bus 0x0003 cardnum 0x02 function 0x00: vendor 0x11ab device 0x6081
  Marvell Technology Group Ltd. MV88SX6081 8-port SATA II PCI-X
 Controller

 pci bus 0x000d cardnum 0x00 function 0x00: vendor 0x8086 device 0x108c
  Intel Corporation 82573E Gigabit Ethernet Controller (Copper)

 pci bus 0x000f cardnum 0x00 function 0x00: vendor 0x8086 device 0x109a
  Intel Corporation 82573L Gigabit Ethernet Controller

 pci bus 0x0011 cardnum 0x04 function 0x00: vendor 0x1002 device 0x515e
  ATI Technologies Inc ES1000

 # cfgadm -a
 Ap_Id  Type 

Re: [zfs-discuss] How to create ZFS pool ?

2007-11-15 Thread Mike Dotson
On Thu, 2007-11-15 at 05:25 -0800, Boris Derzhavets wrote:
 Thank you very much Mike for your feedback.
 Just one more question.
 I noticed five device under /dev/rdsk:-
 c1t0d0p0
 c1t0d0p1
 c1t0d0p2
 c1t0d0p3
 c1t0d0p4
 been created by system immediately after installation completed.
 I believe it's an x86 limitation (no more than 4 primary partitions)
 If I've got your point right, in case when Other OS partition gets number 3.
 I am supposed to run:-
 # zpool create pool  c1t0d0p3

Yes.  Just make sure it's the correct partition, i.e. partition 3 is
actually where you want the zpool; otherwise you'll corrupt/lose whatever
data is on that partition.  You also need to make sure that partition 3
is defined and that you can see it in fdisk, as Solaris creates these p?
devices whether the partitions exist or not.

So if I read your previous email correctly, you'll need to run format,
select your first disk then run fdisk again.  Empty/unused space doesn't
mean a partition has been created.

From there, you'll want to create a new partition and if you're not
familiar with Solaris fdisk, it's a PITA until you get really used to
it.  You'll want to start one (1) cylinder past the end of your last
partition so there's no overlap, then calculate the size of the
partition.  I usually use cylinders for this.

So on one of my systems:

 Total disk size is 17849 cylinders
 Cylinder size is 16065 (512 byte) blocks

   Cylinders
  Partition   StatusType  Start   End   Length%
  =   ==  =   ===   ==   ===
  1   ActiveSolaris2  1  52245224 29



SELECT ONE OF THE FOLLOWING:
   1. Create a partition
   2. Specify the active partition
   3. Delete a partition
   4. Change between Solaris and Solaris2 Partition IDs
   5. Exit (update disk configuration and exit)
   6. Cancel (exit without updating disk configuration)
Enter Selection: 

So the last cylinder used is 5224, so we'll start at 5225.  To use the rest
of the disk, you'll want to take the total cylinder count (17849 from the top
line) and subtract 5225, which gives you 12624.

Select 1 to create a new partition:
Select the partition type to create:
   1=SOLARIS2  2=UNIX3=PCIXOS 4=Other
   5=DOS12 6=DOS16   7=DOSEXT 8=DOSBIG
   9=DOS16LBA  A=x86 BootB=Diagnostic C=FAT32
   D=FAT32LBA  E=DOSEXTLBA   F=EFI0=Exit? 

Select 4 for Other OS
Specify the percentage of disk to use for this partition
(or type c to specify the size in cylinders). 

Now select c for cylinders (I've never been much one for trusting
percentages;)

Enter starting cylinder number:  5225
Enter partition size in cylinders: 12624
(It'll ask you about making it the active partition - say no here)


 Total disk size is 17849 cylinders
 Cylinder size is 16065 (512 byte) blocks

   Cylinders
  Partition   StatusType  Start   End   Length%
  =   ==  =   ===   ==   ===
  1   ActiveSolaris2  1  52245224 29
  2 Other OS   5225  1784812624 71




SELECT ONE OF THE FOLLOWING:
   1. Create a partition
   2. Specify the active partition
   3. Delete a partition
   4. Change between Solaris and Solaris2 Partition IDs
   5. Exit (update disk configuration and exit)
   6. Cancel (exit without updating disk configuration)

Double check you're not overlapping any of the partitions and select 5
to save the partition.

In this case, the pool would be c1t0d0p2.  Not the most technically
accurate but think of p0 as the entire disk and your first partition
starts with p1 and so forth.

Hope that helps.  If you want a second set of eyes, post your fdisk
partition table.

 Boris.
  
 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
-- 
Mike Dotson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to create ZFS pool ?

2007-11-14 Thread Mike Dotson
On Wed, 2007-11-14 at 21:23 +, A Darren Dunham wrote:
 On Wed, Nov 14, 2007 at 09:40:59AM -0800, Boris Derzhavets wrote:
  I was able to create second Solaris partition by running 
  
  #fdisk /dev/rdsk/c1t0d0p0
 
 I'm afraid that won't do you much good.
 
 Solaris only works with one Solaris partition at a time (on any one
 disk).  If you have free space that you want to play with, it should be
 within the existing partition (or be on another disk).
 
  Is it posible to create zfs pool with third partition ?
 
 I doubt it, but I think it more of a general Solaris limitation than
 anything to do with ZFS specifically.

You can't use another Solaris partition but you could use a different
partition ID:

 Total disk size is 9729 cylinders
 Cylinder size is 16065 (512 byte) blocks

   Cylinders
  Partition   StatusType  Start   End   Length%
  =   ==  =   ===   ==   ===
  1 IFS: NTFS 0  10431044 11
  2 Linux native   1044  23481305 13
  3   ActiveSolaris2   2349  49592611 27
  4 Other OS   4960  97284769 49


SELECT ONE OF THE FOLLOWING:
   1. Create a partition
   2. Specify the active partition
   3. Delete a partition
   4. Change between Solaris and Solaris2 Partition IDs
   5. Exit (update disk configuration and exit)
   6. Cancel (exit without updating disk configuration)

Notice partition 4 is Other OS which is where I have my zfs pool:

helios(2): zpool status
  pool: lpool
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
lpool   ONLINE   0 0 0
  c0d0p4ONLINE   0 0 0

errors: No known data errors


So to create the pool in my case would be: zpool create lpool c0d0p4



-- 
Mike Dotson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Hierarchal zfs mounts

2007-10-22 Thread Mike DeMarco
Looking for a way to mount a zfs filesystem on top of another zfs filesystem 
without resorting to legacy mode.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hierarchal zfs mounts

2007-10-22 Thread Mike DeMarco
 Mike DeMarco wrote:
  Looking for a way to mount a zfs filesystem ontop
 of another zfs
  filesystem without resorting to legacy mode.
 
 doesn't simply 'zfs set mountpoint=...' work for you?
 
 -- 
 Michael Schuster
 Recursion, n.: see 'Recursion'
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discu
 ss

Well, if you create, let's say, a local/apps and a local/apps-bin, then
zfs set mountpoint=/apps local/apps
zfs set mountpoint=/apps/bin local/apps-bin

now if you reboot the system there is no mechanism to tell zfs to mount /apps 
first and /apps/bin second, so you could get /apps/bin mounted first and then 
/apps will either mount over the top of it or won't mount at all.
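
For reference, the legacy-mode workaround being avoided here would look
roughly like this (untested sketch, using the same dataset names):

zfs set mountpoint=legacy local/apps
zfs set mountpoint=legacy local/apps-bin

and then in /etc/vfstab, listed in the order they must be mounted:

local/apps      -  /apps      zfs  -  yes  -
local/apps-bin  -  /apps/bin  zfs  -  yes  -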
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mounting ZFS Pool to a different server

2007-10-19 Thread Mike Gerdts
In the ideal situation, it would go like:

host1# zpool export pool
host2# zpool import pool

If you know (really know) that it is offline on the other server (e.g. you
can verify the host is dead), you can use:

# zpool import -f pool

Mike


On 10/19/07, Mertol Ozyoney [EMAIL PROTECTED] wrote:

  Hi;



 One of my customers is using ZFS on IBM DS4800 LUNs. They use one LUN for
 each ZFS pool, if it matters.

 They want to take the pool offline from one server and take it online from
 another server.



 In summary, they want to take control of a ZFS pool if the primary
 server fails for some reason. I know we can do it with Sun Cluster, however
 this is pretty complex and expensive.



 How can this be achieved?



 Regards

 Mertol





 http://www.sun.com/

 Mertol Ozyoney
 Storage Practice - Sales Manager

 Sun Microsystems, TR
 Istanbul TR
 Phone +902123352200
 Mobile +905339310752
 Fax +90212335
 Email [EMAIL PROTECTED] [EMAIL PROTECTED]





 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] meet import error after reinstall the OS

2007-10-18 Thread Shuai Mike Cheng
Hi,
I just reinstalled my machine with ONNV build 75. When I try to import the zfs 
pool, I get an error. The pool was created with a slice (c1t1d0s7) and a whole 
disk (c1t2d0s0). How do I fix the error? Below is the output from zpool and zdb.

# zpool import
  pool: tank
id: 8219303556773256880
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

tankUNAVAIL  missing device
  c1t2d0ONLINE

Additional devices are known to be part of this pool, though their
exact configuration cannot be determined.
volams:/ 112 # zdb -l /dev/dsk/c1t1d0s7

LABEL 0

failed to unpack label 0

LABEL 1


LABEL 2

version=3
name='tank'
state=0
txg=44
pool_guid=8219303556773256880
top_guid=1191595199136351517
guid=1191595199136351517
vdev_tree
type='disk'
id=1
guid=1191595199136351517
path='/dev/dsk/c1t1d0s7'
devid='id1,[EMAIL PROTECTED]/h'
whole_disk=0
metaslab_array=18
metaslab_shift=29
ashift=9
asize=58707935232

LABEL 3

version=3
name='tank'
state=0
txg=44
pool_guid=8219303556773256880
top_guid=1191595199136351517
guid=1191595199136351517
vdev_tree
type='disk'
id=1
guid=1191595199136351517
path='/dev/dsk/c1t1d0s7'
devid='id1,[EMAIL PROTECTED]/h'
whole_disk=0
metaslab_array=18
metaslab_shift=29
ashift=9
asize=58707935232
volams:/ 113 # zdb -l /dev/dsk/c1t2d0s2
cannot open '/dev/dsk/c1t2d0s2': I/O error
volams:/ 114 # zdb -l /dev/dsk/c1t2d0s0

LABEL 0

version=3
name='tank'
state=0
txg=43
pool_guid=8219303556773256880
top_guid=4844356610838567439
guid=4844356610838567439
vdev_tree
type='disk'
id=0
guid=4844356610838567439
path='/dev/dsk/c1t2d0s0'
devid='id1,[EMAIL PROTECTED]/a'
whole_disk=1
metaslab_array=14
metaslab_shift=29
ashift=9
asize=73394552832

LABEL 1

version=3
name='tank'
state=0
txg=43
pool_guid=8219303556773256880
top_guid=4844356610838567439
guid=4844356610838567439
vdev_tree
type='disk'
id=0
guid=4844356610838567439
path='/dev/dsk/c1t2d0s0'
devid='id1,[EMAIL PROTECTED]/a'
whole_disk=1
metaslab_array=14
metaslab_shift=29
ashift=9
asize=73394552832

LABEL 2

version=3
name='tank'
state=0
txg=43
pool_guid=8219303556773256880
top_guid=4844356610838567439
guid=4844356610838567439
vdev_tree
type='disk'
id=0
guid=4844356610838567439
path='/dev/dsk/c1t2d0s0'
devid='id1,[EMAIL PROTECTED]/a'
whole_disk=1
metaslab_array=14
metaslab_shift=29
ashift=9
asize=73394552832

LABEL 3

version=3
name='tank'
state=0
txg=43
pool_guid=8219303556773256880
top_guid=4844356610838567439
guid=4844356610838567439
vdev_tree
type='disk'
id=0
guid=4844356610838567439
path='/dev/dsk/c1t2d0s0'
devid='id1,[EMAIL PROTECTED]/a'
whole_disk=1
metaslab_array=14
metaslab_shift=29
ashift=9
asize=73394552832


thanks
Mike
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HAMMER

2007-10-18 Thread Mike Gerdts
On 10/18/07, Darren J Moffat [EMAIL PROTECTED] wrote:
 zfs send | ssh -C | zfs recv

I was going to suggest this, but I think (I could be wrong...) that
ssh would then use zlib for compression and that ssh is still a
single-threaded process.  This has two effects:

1) gzip compression instead of compress - may or may not be right for
the application
2) encryption + compression happen in the same thread.  While this may be
fine for systems that can do both at wire or file system speed, it is
not ideal if transfer rates are already constrained by CPU speed.

The Niagara 2 CPU likely changes the importance of point 2 a bit.
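
For what it's worth, one way to split the two apart is to do the
compression outside of ssh, something like (rough sketch; host and
dataset names are made up):

zfs send tank/fs@snap | gzip -1 | \
    ssh -o Compression=no otherhost 'gunzip -c | zfs recv -d backup'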

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HAMMER

2007-10-18 Thread Mike Gerdts
On 10/18/07, Darren J Moffat [EMAIL PROTECTED] wrote:
 Unfortunately it doesn't yet because ssh can't yet use the N2 crypto -
 because it uses OpenSSL's libcrypto without using the ENGINE API.

Marketing needs to get in line with the technology.  The word I
received was that any application that linked against the included
version of OpenSSL automatically gets to take advantage of the N2
crypto engine, so long as it is using one of the algorithms supported
by the N2 engine.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] HAMMER

2007-10-18 Thread Mike Gerdts
On 10/18/07, Darren J Moffat [EMAIL PROTECTED] wrote:
 Which marketing documentation (not person) says that ?

It was a person giving a technology brief in the past 6 weeks or so.
It kinda went like "so long as they link against the bundled openssl
and not a private copy of openssl, they will automatically take
advantage of the offload engine."

 It isn't actually false but it has a caveat that the application must be
 using the OpenSSL ENGINE API, which Apache mod_ssl does and it must use
 the EVP_ interfaces in OpenSSL's libcrypto (not the lower level direct
 software algorithm ones).

 Remember marketing info is very high level; the devil, as always, is in
 the code.

Yeah, I know.  It's often times difficult to find the right code when
you know what you are looking for.  When you don't know that you
should be fact-checking, the code rarely finds its way in front of
you.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] UC Davis Cyrus Incident September 2007

2007-10-18 Thread Mike Gerdts
On 10/18/07, Bill Sommerfeld [EMAIL PROTECTED] wrote:
 that sounds like a somewhat mangled description of the cross-calls done
 to invalidate the TLB on other processors when a page is unmapped.
 (it certainly doesn't happen on *every* update to a mapped file).

I've seen systems running Veritas Cluster and Oracle Cluster Ready
Services idle at about 10% sys due to the huge number of monitoring
scripts that kept firing.  This was on a 12 - 16 CPU 25k domain.  A
quite similar configuration on T2000's had negligible overhead.
Lesson learned: cross-calls (and thread migrations, and ...) are much
cheaper on systems with lower latency between CPUs.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] UC Davis Cyrus Incident September 2007

2007-10-18 Thread Mike Gerdts
On 10/18/07, Gary Mills [EMAIL PROTECTED] wrote:
 What's the command to show cross calls?

mpstat will show it on a system basis.

xcallsbypid.d from the DTraceToolkit (ask google) will tell you which
PID is responsible.
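
If you don't have the toolkit handy, something close to it as a
one-liner (from memory, so treat it as a sketch) is:

# dtrace -n 'sysinfo:::xcal { @[pid, execname] = count(); }'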

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] df command in ZFS?

2007-10-17 Thread Mike Gerdts
On 10/17/07, David Runyon [EMAIL PROTECTED] wrote:
 I was presenting to a customer at the EBC yesterday, and one of the
 people at the meeting said using df in ZFS really drives him crazy (no,
 that's all the detail I have).  Any ideas/suggestions?

I suspect that this is related to the notion that file systems are
cheap, so the traditional notion of per-user quotas is replaced by one
file system per user.  This means that a system with 1000 users that
previously had a small number of file systems now has over 1000 file
systems.  What used to be relatively simple output from df now turns
into 40+ screens[1] on a default-sized terminal window.

1.  If you are in this situation, there is a good chance that the
formatting of df causes line folding or wrapping that doubles the
number of lines, giving 80+ screens of df output.
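
One partial workaround is to stop asking df and ask zfs instead,
e.g. (pool and file system names are made up):

zfs list -r -o name,used,available,mountpoint pool/home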

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-24 Thread Mike Gerdts
On 9/24/07, Paul B. Henson [EMAIL PROTECTED] wrote:
 but checking the actual release notes shows no ZFS mention. 3.0.26 to
 3.2.0? That seems an odd version bump...

3.0.x and before are GPLv2.  3.2.0 and later are GPLv3.

http://news.samba.org/announcements/samba_gplv3/

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Mike Gerdts
On 9/20/07, Paul B. Henson [EMAIL PROTECTED] wrote:
 Again though, that would imply two different storage locations visible to
 the clients? I'd really rather avoid that. For example, with our current
 Samba implementation, a user can just connect to
 '\\files.csupomona.edu\username' to access their home directory or
 '\\files.csupomona.edu\groupname' to access a shared group directory.
 They don't need to worry on which physical server it resides or determine
 what server name to connect to.

MS-DFS could be helpful here.  You could have a virtual samba instance
that generates MS-DFS redirects to the appropriate spot.  At one point
in the past I wrote a script (long since lost - at a different job)
that would automatically convert automounter maps into the
appropriately formatted symbolic links used by the Samba MS-DFS
implementation.  It worked quite well for giving one place to
administer the location mapping while providing transparency to the
end-users.
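
Recreating it would not be hard.  A rough sketch (untested; it assumes a
flat-file auto_home map, that each exported directory is also a Samba
share named after its basename, that smb.conf has "host msdfs = yes",
and that the share holding the links has "msdfs root = yes"):

#!/bin/sh
# Turn automounter map entries like
#   alice    fileserv1:/export/home/alice
# into Samba MS-DFS links under a DFS root share.
DFSROOT=/export/dfsroot            # path behind the msdfs root share
while read key location ; do
        case "$key" in
        \#*|''|\**) continue ;;    # skip comments, blanks, wildcard entries
        esac
        server=${location%%:*}
        share=`basename "${location#*:}"`
        ln -s "msdfs:${server}\\${share}" "${DFSROOT}/${key}"
done < /etc/auto_home              # assumes the map lives in a local file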

Mike

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zoneadm clone doesn't support ZFS snapshots in

2007-09-21 Thread Mike Gerdts
On 9/20/07, Matthew Flanagan [EMAIL PROTECTED] wrote:
 Mike,

 I followed your procedure for cloning zones and it worked
 well up until yesterday when I tried applying the S10U4
 kernel patch 12001-14 and it wouldn't apply because I had
 my zones on zfs :(

Thanks for sharing.  That sucks.

 I'm still figuring out how to fix this other than moving all of my zones onto 
 UFS.

How about a dtrace script that changes the fstyp in statvfs() returns
to say that it is ufs?  :)

I bet someone comes along and says that isn't supported either...

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zoneadm clone doesn't support ZFS snapshots in

2007-09-21 Thread Mike Gerdts
On 9/21/07, Christine Tran [EMAIL PROTECTED] wrote:
 patch and install tools can't figure out pools yet.  If you have a 1GB
 pool and 10 filesystems on it, du reports each having 1GB, do you have
 10GB capacity?  The tools can't tell.  Please check the archives, this
 subject has been extensively discussed.

Two responses come immediately to mind...

1) Thanks for protecting stupid/careless people from doing bad things.
2) UNIX has a longstanding tradition of adding a -f flag for cases
when the sysadmin realizes there is additional risk but feels that
appropriate precautions have been taken.

I would really like to ask Sun for a roadmap as to when this is going
to be supported.  Since this is the zfs list (not zones or install
list) and it is OpenSolaris (not Solaris) I guess I should probably
find a more appropriate forum.

So, for now I will use OpenSolaris where I can and wait patiently for
the new installer + snap upgrade basket and wait for it to find its
way into Solaris in about a year or two.  In the meantime, I'll
probably end up putting most zones on a particular competitor's NAS
devices and looking into how well their file system cloning
capabilities play in coordination with iSCSI.

<irony>
Oh, wait!  What if the NAS device runs out of space while I'm
patching?  Better rule out the thin provisioning capabilities of the
HDS storage that Sun sells as well.
</irony>

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Would a device list output be a reasonable feature for zpool(1)?

2007-09-17 Thread Ellis, Mike
Yup...

With Leadville/MPXIO targets in the 32-digit range, identifying the new
storage/LUNs is not a trivial operation.

 -- MikeE 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Russ
Petruzzelli
Sent: Monday, September 17, 2007 1:51 PM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Would a device list output be a reasonable
feature for zpool(1)?

Seconded!

MC wrote:
 With the arrival of ZFS, the format command is well on its way to
deprecation station.  But how else do you list the devices that zpool
can create pools out of?

 Would it be reasonable to enhance zpool to list the vdevs that are
available to it?  Perhaps as part of the help output to zpool create?
  
   
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and Live Upgrade

2007-09-15 Thread Mike Gerdts
On 9/15/07, Coy Hile [EMAIL PROTECTED] wrote:
 Is  there any update/work-around/patch/etc as of the S10u4 WOS for the bugs 
 that existed with respect to LU, Zones, and ZFS?  More specifically, the 
 following:

 6359924 live upgrade needs to include support for zfs

I bet that Live Upgrade never does, but Snap Upgrade does.

http://opensolaris.org/os/project/caiman/Snap_Upgrade/

It is likely worth considering more of the roadmap when reading that page.

http://opensolaris.org/os/project/caiman/Roadmap/

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] space allocation vs. thin provisioning

2007-09-14 Thread Mike Gerdts
Short question:

I'm curious as to how ZFS manages space (free and used) and how
its usage interacts with thin provisioning provided by HDS
arrays.  Is there any effort to minimize the number of provisioned
disk blocks that get writes so as to not negate any space
benefits that thin provisioning may give?


Background  more detailed questions:

In Jeff Bonwick's blog[1], he talks about free space management
and metaslabs.  Of particular interest is the statement: ZFS
divides the space on each virtual device into a few hundred
regions called metaslabs.

1. http://blogs.sun.com/bonwick/entry/space_maps

In Hu Yoshida's (CTO, Hitachi Data Systems) blog[2] there is a
discussion of thin provisioning at the enterprise array level.
Of particular interest is the statement: Dynamic Provisioning is
not a panacea for all our storage woes. There are applications
that do a hard format or write across the volume when they do an
allocation and that would negate the value of thin provisioning.
In another entry[3] he goes on to say: Capacity is allocated to
'thin' volumes from this pool in units of 42 MB pages

2. http://blogs.hds.com/hu/2007/05/dynamic_or_thin_provisioning.html
3. http://blogs.hds.com/hu/2007/05/thin_and_wide_.html

This says that any time that a 42 MB region gets one sector
written, 42 MB of storage is permanently[4] allocated to the
virtual LUN.

4. Until the LUN is destroyed, that is.

I know that ZFS does not do a write across all of the disk as
part of formatting.  Does it, however, drop some sort of metaslab
data structures on each of those few hundred regions?

When space is allocated, does it make an attempt to spread the
allocations across all of the metaslabs, or does it more or less
fill up one metaslab before moving to the next?

As data is deleted, do the freed blocks get reused before never
used blocks?

Is there any collaboration between the storage vendors and ZFS
developers to allow the file system to tell the storage array
this range of blocks is unused so that the array can reclaim
the space?  I could see this as useful when doing re-writes of
data (e.g. crypto rekey) to concentrate data that had become
scattered into contiguous space.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] space allocation vs. thin provisioning

2007-09-14 Thread Mike Gerdts
On 9/14/07, Moore, Joe [EMAIL PROTECTED] wrote:
 I was trying to compose an email asking almost the exact same question,
 but in the context of array-based replication.  They're similar in the
 sense that you're asking about using already-written space, rather than
 to go off into virgin sectors of the disks (in my case, in the hope that
 the previous write is still waiting to be replicated and thus can be
 replaced by the current data)

At one point, I thought this was how data replication should happen
too.  However, unless you have two consecutive writes to the same
space, coalescing the writes could make it so that the data
(generically, including fs metadata) on the replication target may be
corrupt.  Generally speaking, you need to have in-order writes to
ensure that you maintain crash-consistent data integrity in the
event of various failure modes.

Of course, I can see how writes could be batched, coalesced, and applied
in a journaled manner such that each batch fully applies or is rolled
back on the target.  I haven't heard of this being done.

Mike

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How do I get my pool back?

2007-09-13 Thread Mike Lee

have you tried zpool clear?

Peter Tribble wrote:

On 9/13/07, Solaris [EMAIL PROTECTED] wrote:
  

Try exporting the pool then import it.  I have seen this after moving disks
between systems, and on a couple of occasions just rebooting.



Doesn't work. (How can you export something that isn't imported
anyway?)

  


--
http://www.sun.com/solaris  Michael Lee
Area System Support Engineer

Sun Microsystems, Inc.
Phone x40782 / 866 877 8350
Email [EMAIL PROTECTED]
http://www.sun.com/solaris

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] compression=on and zpool attach

2007-09-12 Thread Mike DeMarco
 On 11/09/2007, Mike DeMarco [EMAIL PROTECTED]
 wrote:
   I've got 12Gb or so of db+web in a zone on a ZFS
   filesystem on a mirrored zpool.
   Noticed during some performance testing today
 that
   its i/o bound but
   using hardly
   any CPU, so I thought turning on compression
 would be
   a quick win.
 
  If it is io bound won't compression make it worse?
 
 Well, the CPUs are sat twiddling their thumbs.
 I thought reducing the amount of data going to disk
 might help I/O -
 is that unlikely?

IO bottlenecks are usually caused by a slow disk or one that has heavy 
workloads reading many small files. Two factors that need to be considered are 
head seek latency and spin latency. Head seek latency is the amount of time it 
takes for the head to move to the track that is to be written; this is an 
eternity for the system (usually around 4 or 5 milliseconds). Spin latency is 
the amount of time it takes for the spindle to spin the track to be read or 
written over the head. Ideally you only want to pay the latency penalty once. 
If you have large reads and writes going to the disk then compression may help 
a little, but if you have many small reads or writes it will do nothing more 
than burden your CPU with work for no gain, since you are going to be paying 
Mr. Latency for each read or write.

Striping several disks together with a stripe width that is tuned for your data 
model is how you could get your performance up. Striping has been left out of 
the ZFS model for some reason. While it is true that RAIDZ will stripe the data 
across a given drive set, it does not give you the option to tune the stripe 
width. Due to the write performance problems of RAIDZ you may not get a 
performance boost from its striping if your write to read ratio is too high, 
since the driver has to calculate parity for each write.

 
   benefit of compression
   on the blocks
   that are copied by the mirror being resilvered?
 
  No! Since you are doing a block for block mirror of
 the data, this would not could not compress the data.
 
 No problem, another job for rsync then :)
 
 
 -- 
 Rasputin :: Jack of All Trades - Master of Nuns
 http://number9.hellooperator.net/
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discu
 ss
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] compression=on and zpool attach

2007-09-12 Thread Mike DeMarco
 On 9/12/07, Mike DeMarco [EMAIL PROTECTED] wrote:
 
  Striping several disks together with a stripe width
 that is tuned for your data
  model is how you could get your performance up.
 Stripping has been left out
  of the ZFS model for some reason. Where it is true
 that RAIDZ will stripe
  the data across a given drive set it does not give
 you the option to tune the
  stripe width. Do to the write performance problems
 of RAIDZ you may not
  get a performance boost from it stripping if your
 write to read ratio is too
  high since the driver has to calculate parity for
 each write.
 
 I am not sure why you think striping has been left
 out of the ZFS
 model. If you create a ZFS pool without the raidz
 or mirror
 keywords, the pool will be striped. Also, the
 recordsize tunable can
 be useful for matching up application I/O to physical
 I/O.
 
 Thanks,
 - Ryan
 -- 
 UNIX Administrator
 http://prefetch.net
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discu
 ss

Oh... How right you are. I dug into the PDFs and read up on Dynamic striping. 
My bad.
ZFS rocks.
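
For anyone else following along, what Ryan describes boils down to
something like this (device names and the 8k figure are only for
illustration):

zpool create tank c1t1d0 c2t1d0 c3t1d0   # dynamic striping across the vdevs
zfs create tank/db
zfs set recordsize=8k tank/db            # match an application doing 8k I/O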
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] compression=on and zpool attach

2007-09-11 Thread Mike DeMarco
 I've got 12Gb or so of db+web in a zone on a ZFS
 filesystem on a mirrored zpool.
 Noticed during some performance testing today that
 its i/o bound but
 using hardly
 any CPU, so I thought turning on compression would be
 a quick win.

If it is io bound won't compression make it worse? 

 
 I know I'll have to copy files for existing data to
 be compressed, so
 I was going to
 make a new filesystem, enable compression and rysnc
 everything in, then drop the
 old filesystem and mount the new one (with compressed
 blocks) in its place.
 
 But I'm going to be hooking in faster LUNs later this
 week. The plan
 was to remove
 half of the mirror, attach a new disk, remove the
 last old disk and
 attach the second
 half of the mirror (again on a faster disk).
 
 Will this do the same job? i.e. will I see the
 benefit of compression
 on the blocks
 that are copied by the mirror being resilvered?

No! Since you are doing a block-for-block mirror of the data, this would 
not (could not) compress the data.

 
 
 -- 
 Rasputin :: Jack of All Trades - Master of Nuns
 http://number9.hellooperator.net/
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discu
 ss
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-08 Thread Mike Gerdts
On 9/8/07, Richard Elling [EMAIL PROTECTED] wrote:
 Changing the topic slightly, the strategic question is:
 why are you providing disk space to students?

For most programming and productivity work (e.g. word processing),
people will likely be better served by having network access for their
personal equipment with local storage.

For cases when specialized expensive tools ($10k + per seat) are used,
it is not practical to install them on hundreds or thousands of
personal devices for a semester or two of work.  The typical computing
lab that provides such tools is not well equipped to deal with
removable media such as flash drives.  Further, such tools will often
times be used to do designs that require simulations to run as batch
jobs that run under grid computing tools such as Grid Engine, Condor,
LSF, etc.

Then, of course, there are files that need to be shared, have reliable
backups, etc.  Pushing that out to desktop or laptop machines is not
really a good idea.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread Mike Gerdts
On 9/7/07, Alec Muffett [EMAIL PROTECTED] wrote:
  The main bugbear is what the ZFS development team laughably call
  quotas. They aren't quotas, they are merely filesystem size
  restraints. To get around this the developers use the let them eat
  cake mantra, creating filesystems is easy so create a new
  filesystem for each user, with a quota on it. This is the ZFS way.

Having worked in academia and multiple Fortune 100's, I find the problem
seems to be most prevalent in academia, although it is possibly a minor
inconvenience in some engineering departments in industry.
.edu where I used to manage the UNIX environment, I would have a tough
time weighing the complexities of quotas he mentions vs. the other
niceties.  My guess is that unless I had something that was really
broken, I would stay with UFS or VxFS waiting for a fix.

It appears as though the author has not yet tried out snapshots.  The
fact that space used by a snapshot for the sysadmin's convenience
counts against the user's quota is the real killer.  This would force
me into a disk to disk (rsync, because zfs send | zfs recv would
require snapshots to stay around for incrementals) backup + snapshot
scenario to be able to keep snapshots while minimizing their impact on
users.  That means double the disk space.  Doubling the quota is not
an option because without soft quotas there is no way to keep people
from using all of their space.  Frankly, that would be so much trouble
I would be better off using tape for restores, just like with UFS or
VxFS.
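
The disk-to-disk scenario I have in mind is roughly the following
(sketch only; it assumes a second pool with backup/home mounted at
/backup/home):

rsync -a --delete /export/home/ /backup/home/ \
    && zfs snapshot backup/home@`date +%Y%m%d`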

  Now, with each user having a separate filesystem this breaks. The
  automounter will mount the parent filesystem as before but all you
  will see are the stub directories ready for the ZFS daughter
  filesystems to mount onto and there's no way of consolidating the
  ZFS filesystem tree into one NFS share or rules in automount map
  files to be able to do sub-directory mounting.

While NFS4 holds some promise here, it is not a solution today.  It
won't be until all OS's that came out before 2008 are gone.  That will
be a while.

Use of macros (e.g. * server:/home/&) can go a long way.  If that
doesn't do it, an executable map that does the appropriate munging may
be in order.
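
An executable map is just a script that autofs runs with the key as its
argument, using whatever the script prints as the map entry, so the
munging can be arbitrary.  A hypothetical example (server and map names
invented):

#!/bin/ksh
# /etc/auto_home_exec, mode 0755: called as auto_home_exec <key>
key=$1
case "$key" in
eeyore_data) echo "newserver:/export/data" ;;        # legacy name, host long gone
*)           echo "fileserver:/export/home/$key" ;;
esac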

  The problem here is one of legacy code, which you'll find
  throughout the academic, and probably commercial world. Basically,
  there's a lot of user generated code which has hard coded paths so
  any new system has to replicate what has gone before. (The current
  system here has automount map entries which map new disks to the
  names of old disks on machines long gone, e.g. /home/eeyore_data/ )

Put such entries before the *  entry and things should be OK.

For me, quotas are likely to be a pain point that prevents me from
making good use of snapshots.  Getting changes in application teams'
understanding and behavior is just too much trouble.  Others are:

1. There seems to be no integration with backup tools that are
time+space+I/O efficient.  If my storage is on Netapp, I can use NDMP
to do incrementals between snapshots.  No such thing exists with ZFS.

2. Use of clones is out because I can't do a space-efficient restore.

3. ARC messes up my knowledge of how much RAM my machine is making
good use of.  After the first backup, vmstat says that I am just at
the brink of not having enough RAM and that paging (file system and pager)
will begin soon.  This may be fine on a file server, but it really
messes with me if it is a J2EE server and I'm trying to figure out how
many more app servers I can add.

I have a lot of hopes for ZFS and have used it with success (and
failures) in limited scope.  I'm sure that with time the improvements
will come that make that scope increase dramatically, but for now it
is confined to the lab.  :(

Mike

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread mike
On 9/7/07, Mike Gerdts [EMAIL PROTECTED] wrote:
 For me, quotas are likely to be a pain point that prevents me from
 making good use of snapshots.  Getting changes in application teams'
 understanding and behavior is just too much trouble.  Others are:

not to mention there are smaller-scale users that want the data
protection, checksumming and scalability that ZFS offers (although the
whole zdev/zpool/etc. thing might wind up causing me to have to buy
more disks to add more space, if i were to use it)

it would be nice to have a ZFS lite(tm) for those of us that just want
easily expandable filesystems (as in, add a new disk/device and not
have to think of some larger geometry) with inline
checksumming/COW/metadata/ditto blocks/etc/etc goodness. basically
like a home edition. i don't care about LUNs, send/receive, quotas,
snapshots (for the most part), setting up different zpools to gain
specific performance benefits, etc. i just want raid-z/raid-z2 with a
easy way to add disks.

i have not actually used ZFS yet because i've been waiting for
opensolaris/solaris (or even freebsd possibly) to support eSATA
hardware or something related. the hardware support front for SOHO
users has also been slow. that's not a shortcoming of ZFS though...
but does make me wish i had the basic protection features of ZFS with
hardware support like linux.

- my two cents
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] An Academic Sysadmin's Lament for ZFS ?

2007-09-07 Thread Mike Gerdts
On 9/7/07, Stephen Usher [EMAIL PROTECTED] wrote:

 Brian H. Nelson:

 I'm sure it would be interesting for those on the list if you could
 outline the gotchas so that the rest of us don't have to re-invent the
 wheel... or at least not fall down the pitfalls.

The UFS on zvols option sounds intriguing to me, but I would guess
that the following could be problems:

1) Double buffering:  Will ZFS store data in the ARC while UFS uses
traditional file system buffers?

2) Boot order dependencies.  How does the startup of zfs compare to
processing of /etc/vfstab?  I would guess that this is OK due to the
legacy mount type supported by zfs.  If this is OK, then dfstab
processing is probably OK.

I say intriguing because it could give you the improved data
integrity checks and a bit more flexibility in how you do things like
backups and restores.  Snapshots of the zvols could be mounted as
other UFS file systems that could allow for self-service restores.
Perhaps this would make it so that you can write data to tape a bit
less frequently.
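
To make that concrete, the sort of thing I have in mind (sketch only;
whether it all behaves well in practice is exactly the question):

zfs create -V 20g pool/homevol
newfs /dev/zvol/rdsk/pool/homevol
mount /dev/zvol/dsk/pool/homevol /export/home

# later: snapshot the zvol, clone it, and mount the clone read-only so
# users can pull back their own files
zfs snapshot pool/homevol@tuesday
zfs clone pool/homevol@tuesday pool/homevol_tuesday
mount -o ro /dev/zvol/dsk/pool/homevol_tuesday /restore/tuesday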

If deduplication comes into zfs, you may be able to get to a point
where course project instructions that say cp ~course/hugefile ~
become not so expensive - you would be charging quota to each user but
only storing one copy.  Depending on the balance of CPU power vs. I/O
bandwidth, compressed zvols could be a real win, more than paying back
the space required to have a few snapshots around.

Mike
-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS/WAFL lawsuit

2007-09-06 Thread mike
On 9/6/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
  This is my personal opinion and all,  but even knowing that Sun
 encourages open conversations on these mailing lists and blogs it seems to
 falter common sense for people from @sun.com to be commenting on this
 topic. It seems like something users should be aware of,  but if I were
 working at Sun I would feel a very strong urge to clear any public
 conversation about the topic with management.  As always, I do appreciate
 the frank insight given from the sun folks -- I am just worried that you
 may be doing yourself a disservice talking about it.

i completely disagree. i work for a fortune 50 company and we have a
hell of a time with the legal department or other people who refuse to
think it's okay to speak frankly about things in their company.

obviously trade secrets and other things aside, i think it is
ultimately beneficial and helps a company feel more accountable when
it allows direct public exchange with employees and not through
spin-educated marketeers or public relation folk.

i don't expect anyone from sun on the zfs list would tell us anything
other than their personal opinion. i appreciate it too. from reading
forums and mailing lists, to having sun volunteer 6? people to help
memcached continue to flourish, i think sun is a role model for a
company who continues to profit but has figured out that certain
things can be free and ultimately they are helping make more mature
products and encourage innovation. they also would get the bonus of
having things like memcached run better on their platforms then too :)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (politics) Sharks in the waters

2007-09-05 Thread mike
On 9/5/07, Joerg Schilling [EMAIL PROTECTED] wrote:
 As I wrote before, my wofs (designed and implemented 1989-1990 for SunOS 4.0,
 published May 23th 1991) is copy on write based, does not need fsck and always
 offers a stable view on the media because it is COW.

Side question:

If COW is such an old concept, why haven't there been many filesystems
that have become popular that use it? ZFS, BTRFS (I think) and maybe
WAFL? At least that I know of. It seems like an excellent guarantee of
disk commitment, yet we're all still fussing with journalled
filesystems, filesystems that fragment, buffer lags (or whatever you
might call it) etc.

Just stirring the pot, seems like a reasonable question (perhaps one
to take somewhere else or start a new thread...)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] The Dangling DBuf Strikes Back

2007-09-03 Thread Mike Gerdts
On 9/3/07, Dale Ghent [EMAIL PROTECTED] wrote:

 I saw a putback this past week from M. Maybee regarding this, but I
 thought I'd post here that I saw what is apparently an incarnation of
 6569719 on a production box running  s10u3 x86 w/ latest (on
 sunsolve) patches. I have 3 other servers configured the same way WRT
 work load, zfs pools and hardware resources, so if this occurs again
 I'll see about logging a case and getting a relief patch. Anyhow,
 perhaps a backport to s10 may be in order

[note: the patches I mention are s10 sparc specific.  Translation to
x86 required.]

As of a few weeks ago s10u3 with latest patches did not have this
problem for me, but s10u4 beta and snv69 did.  My situation was on
sun4v, not i386.  More specifically:

S10 118833-36, 118833-07, 118833-10:

# zpool import
  pool: zfs
id: 679728171331086542
 state: FAULTED
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-5E
config:

zfs FAULTED   corrupted data
  c0d1s3FAULTED   corrupted data

snv_69, s10u4beta:

Boot device: /[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]:dhcp
File and args: -s
SunOS Release 5.11 Version snv_69 64-bit
Copyright 1983-2007 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
Booting to milestone milestone/single-user:default.
Configuring /dev
Using DHCP for network configuration information.
Requesting System Maintenance Mode
SINGLE USER MODE
# zpool import

panic[cpu0]/thread=300028943a0: dangling dbufs (dn=3000392dbe0,
dbuf=3000392be08)

02a10076f270 zfs:dnode_evict_dbufs+188 (3000392dbe0, 0, 1, 1,
2a10076f320, 7b729000)
  %l0-3: 03000392ddf0   03000392ddf8
  %l4-7: 02a10076f320 0001 03000392bf20 0003
02a10076f3e0 zfs:dmu_objset_evict_dbufs+100 (2, 0, 0, 7b722800, 0,
3516900)
  %l0-3: 7b72ac00 7b724510 7b724400 03516a70
  %l4-7: 03000392dbe0 03516968 7b7228c1 0001
...


Sun offered me an IDR against 125100-07, but since I could not
reproduce the problem on that kernel, I never tested it.  This does
imply that they think there is a dangling dbufs problem in 125100-07
and that they have a fix for it for support-paying customers.
Perhaps this is the problem and related solution that you would be
interested in.

The interesting thing with my case is that the backing store for this
device is a file on a ZFS file system, served up as a virtual disk in
an LDOM.  From the primary LDOM, there is no corruption.  An
unexpected reset (panic, I believe) of the primary LDOM seems to have
caused the corruption in the guest LDOM.  What was that about having
the redundancy as close to the consumer as possible?  :)

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] do zfs filesystems isolate corruption?

2007-08-11 Thread Mike Gerdts
On 8/11/07, Stan Seibert [EMAIL PROTECTED] wrote:
 I'm not sure if that answers the question you were asking, but generally I 
 found that damage to a zpool was very well confined.

But you can't count on it.  I currently have an open case where a
zpool became corrupt and put the system into a panic loop.  As this
case has progressed, I found that the panic loop part of it is not
present in any released version of S10 tested (S10U3 + 118833-36,
125100-07, 125100-10) but does exist in snv69.

The test mechanism is whether zpool import (no pool name) causes the
system to panic or not.  If it does, I'm going on the assumption that
having the appropriate zpool.cache in place will cause it to panic
during every boot.

Oddly enough, I know I can't blame the storage subsystem on this - it
is ZFS as well.  :)

It goes like this:

HDS 99xx
T2000 primary ldom
S10u3 with a file on zfs presented as a block device for an ldom
T2000 guest ldom
zpool on slice 3 of block device mentioned above

Depending on the OS running on the guest LDOM, zpool import gives
different results:

S10U3 118833-36 - 125100-10:
  zpool is corrupt; restore from backups
S10u4 Beta, snv69 and I think snv59:
   panic - S10u4 backtrace is very different from snv*

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] It is that time again... ZFS + Firewire/USB - and a specific enclosure

2007-08-06 Thread mike
I am looking into getting something like this:
http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&rd=1&item=120069356632&ssPageName=STRK:MEWA:IT&ih=002

For a home storage server. I would like to run ZFS, preferably on
FreeBSD (if basic functionality is completely bug-free and I won't
lose any data :)) as I am more comfortable with it.

However, I have been subscribed to the ZFS lists for some time now and
have done Google checkups for topics relating to this, and the jury
still seems to be out on whether or not this would be a good idea.

I need it mainly for DVD storage (for easy playback) and personal
archiving. So maybe 3-4 read streams and 1-2 write streams - probably
at peak time.

Is this a viable option? I want capacity more than speed; but I want
at least single parity redundancy with RAID-Z. Would it be an option
to add 8 drive units as needed, making them each a single RAID-Z
device (so 7 drives usable?) and add them into the same zpool to
increase available storage when I would get low?

I might be missing some concepts here... but I keep querying this list
hoping someone has done some of the footwork already (I don't have as
many funds available to me as in the past to mess around with
experimental/untested ideas) not to mention I don't want to be
experimenting with my data... or maybe someone at least has some
advice. I have seen a couple comments about Firewire, not many about
USB. I don't care about the hotplugging, I can power down to replace
any drives or do maintenance. It's mainly for cheap, quiet enclosures
that can export JBOD...

Thanks,
mike
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Will de-duplication be added?

2007-07-29 Thread Mike Gerdts
On 7/29/07, Lance Brown [EMAIL PROTECTED] wrote:
 Does anybody know if native block level replication or block level 
 de-duplication as NetApp calls it will be added?

This has been discussed a bit in recent threads.

http://www.google.com/search?hl=en&q=%22zfs-discuss%22+site%3Amail.opensolaris.org+%28dedup+OR+%22de-duplication%22+OR+deduplication%29&btnG=Google+Search

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Any fix for zpool import kernel panic (reboot loop)?

2007-07-25 Thread Mike Gerdts
On 7/25/07, Andre Wenas [EMAIL PROTECTED] wrote:
 Hi Rodney,

 I have been using zfs root/boot for few months without any problem. I
 can also import the pool from other environment.

 Do you have a problem importing the zfs boot pool only? Or can't you use
 the zfs boot at all?

I have a zfs pool (on a T2000) for which zpool import (with or without
a pool name) will cause a dangling dbufs panic on S10u3, S10u4beta,
and Nevada b61ish.
disabling zpool.cache, the machine was stuck in a panic loop.  A
support call is open and it is a known problem that (I'm told) is
being worked on.

I only mention this to say that this type of problem is not restricted
to zfs boot.

Mike

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] import a group

2007-07-17 Thread Mike Salehi
One last question: when it comes to patching these zones, is it better to patch 
them normally, or to destroy all the local zones, patch only the global zone, 
and use a shell script to recreate all the zones?
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] import a group

2007-07-16 Thread Mike Salehi
Greetings,

  Given zfs pools, how does one import these pools to another node in 
the cluster?
Mike
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] import a group

2007-07-16 Thread Mike Salehi
Sorry, my question is not clear enough. These pools contain a zone each.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Pseudo file system access to snapshots?

2007-07-13 Thread Mike Gerdts
On 7/11/07, Matthew Ahrens [EMAIL PROTECTED] wrote:
  This restore problem is my key worry in deploying ZFS in the area
  where I see it as most beneficial.  Another solution that would deal
  with the same problem is block-level deduplication.  So far my queries
  in this area have been met with silence.

 I must have missed your messages on deduplication.

That's OK...  I think that I probably spill my guts adequately on the topic at:

http://mail.opensolaris.org/pipermail/zfs-discuss/2006-August/034065.html
http://mail.opensolaris.org/pipermail/zfs-discuss/2007-April/040034.html
http://mail.opensolaris.org/pipermail/storage-discuss/2007-May/002711.html

The only previous thread that seemed to have much discussion was

http://mail.opensolaris.org/pipermail/zfs-discuss/2007-January/

 But did you see this
 thread on it?  zfs space efficiency, 6/24 - 7/7?

I stopped reading the zfs space efficiency thread just before it got
interesting.  :)

The dedup methods suggested there are quite consistent with those that
I had previously suggested.


 We've been thinking about ZFS dedup for some time, and want to do it but have
 other priorities at the moment.

This is very good to hear.

  Hmmm... I just ran into another snag with this.  I had been assuming
  that clones and snapshots were more closely related.  But when I tried
  to send the differences between the source of a clone and a snapshot
  within that clone I got this message:

 I'm not sure what you mean by more closely related.  The only reason we
 don't support that is because we haven't gotten around to adding the special
 cases and error checking for it (and I think you're the first person to notice
 its omission).  But it's actually in the works now so stay tuned for an update
 in a few months.

I was trying to say (but didn't) that I thought that there should be
nearly (very important word) no differences between finding the
differences from a clone's origin to a snapshot of the clone and
finding the differences between snapshots of the same file system.

I'm also glad to see this is in the works.  Most of my use cases for
ZFS involve use of clones.  Lack of space-efficient backups and
especially restores makes me wait to use ZFS outside of the lab.

Mike

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Pseudo file system access to snapshots?

2007-07-11 Thread Mike Gerdts
On 7/11/07, Darren J Moffat [EMAIL PROTECTED] wrote:
 Mike Gerdts wrote:
  Perhaps a better approach is to create a pseudo file system that looks like:
 
  mntpt/pool
       /@@
       /@today
       /@yesterday
       /fs
          /@@
          /@2007-06-01
       /otherfs
          /@@

 How is this different from cd mntpt/.zfs/snapshot/   ?


mntpt/.zfs/snapshot provides file-level access to the contents of
the snapshot.  If you back those up, then restore every snapshot, you
will potentially be using way more disk space.

What I am proposing is that "cat mntpt/pool/@snap1" delivers a data
stream corresponding to the output of "zfs send", and that "cat
mntpt/pool/@snap1@snap2" delivers a data stream corresponding to "zfs
send -i snap1 snap2".

This would allow existing backup tools to perform block level
incremental backups.  Assuming that writing to the various files is
the equivalent of the corresponding zfs receive commands, it
provides for block level restores that preserve space efficiency as
well.
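
To make the proposed semantics concrete, here is a minimal sketch of
what a backup tool (or plain cat) would do against such a pseudo file
system.  None of this exists today; the paths follow the hypothetical
layout quoted above and the equivalences are the ones proposed in this
thread.

    # full backup of a snapshot: same bytes as "zfs send pool@snap1"
    cat /mntpt/pool/@snap1 > /backup/pool.snap1.zfs

    # incremental backup: same bytes as "zfs send -i snap1 pool@snap2"
    cat /mntpt/pool/@snap1@snap2 > /backup/pool.snap1-snap2.zfs

    # restore by writing back, behaving like the matching "zfs receive"
    cat /backup/pool.snap1.zfs > /mntpt/pool/@snap1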

Why?

Suppose I have a server with 50 full root zones on it.  Each zone has
a zonepath at /zones/zonename that is about 8 GB.  This implies that
I need 400 GB just for zone paths.  Using ZFS clones, I can likely
trim that down to far less than 100 GB and probably less than 20 GB.  I
can't trim it down that far if I don't have a way to restore the
system.
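
A minimal sketch of the clone layout being described, with hypothetical
dataset and zone names (one "golden" zonepath that every zone is cloned
from):

    zfs create pool/zones
    zfs set mountpoint=/zones pool/zones
    zfs create pool/zones/golden
    # install a template zone with zonepath /zones/golden, then:
    zfs snapshot pool/zones/golden@install
    zfs clone pool/zones/golden@install pool/zones/zone01
    zfs clone pool/zones/golden@install pool/zones/zone02
    # each clone initially consumes only the space of its own changes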

This restore problem is my key worry in deploying ZFS in the area
where I see it as most beneficial.  Another solution that would deal
with the same problem is block-level deduplication.  So far my queries
in this area have been met with silence.

Hmmm... I just ran into another snag with this.  I had been assuming
that clones and snapshots were more closely related.  But when I tried
to send the differences between the source of a clone and a snapshot
within that clone I got this message:

incremental source must be in same filesystem
usage:
send [-i snapshot] snapshot
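
For reference, the sequence that runs into that message looks roughly
like this (dataset and snapshot names are hypothetical):

    zfs snapshot pool/fs@snap1
    zfs clone pool/fs@snap1 pool/clone
    zfs snapshot pool/clone@snap2
    zfs send -i pool/fs@snap1 pool/clone@snap2
    # fails, at the time of writing, with the error quoted above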

Mike

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Pseudo file system access to snapshots?

2007-07-10 Thread Mike Gerdts
As I have been grappling with how I will manage ZFS backups using
existing enterprise backup tools (e.g. Netbackup), it seems as though
two approaches continue to dominate:

1) Just use the POSIX interface as is used for UFS,
   VxFS, etc.  This has the key disadvantage that it
   is not efficient (space, time, performance, etc.)
   during backup, and restore could be impossible
   due to the space inefficiency.  Other disadvantages
   exist as well.
2) Use zfs send (but not receive) to do disk-to-disk
   backups, then back up the zfs send images to tape
   (see the sketch after this list).  This is also
   inefficient due to extra space, time, etc., but
   ACLs, snapshots, clones, etc. seem as though they
   will be preserved on restore.  The interface to
   the backup software will require some scripting,
   much like anything else that requires a quiesce
   before backup.
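
A minimal sketch of approach 2, assuming a hypothetical pool/fs and a
staging area that the backup software already picks up; the snapshot
names and paths are illustrative only:

    SNAP=`date +%Y%m%d`
    zfs snapshot pool/fs@$SNAP
    zfs send pool/fs@$SNAP > /backup/staging/pool_fs.$SNAP.zfs
    # an incremental against an earlier snapshot would instead be:
    #   zfs send -i pool/fs@$PREV pool/fs@$SNAP > ...
    # Netbackup (or any file-level tool) then backs up /backup/staging.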

For a while I was thinking that zfs send data streams would be nice
to work with NDMP.  However, this solution will only play well with
the commercial products that have been going after the storage
appliance market for quite some time.  I'm not aware of free tools
that speak NDMP.

Perhaps a better approach is to create a pseudo file system that looks like:

mntpt/pool
     /@@
     /@today
     /@yesterday
     /fs
        /@@
        /@2007-06-01
     /otherfs
        /@@

As you might imagine, reading from pool/@today would be equivalent to
"zfs send pool@today".  Some sort of notation (pool/@yesterday@today
?) would be needed to represent "zfs send -i pool@yesterday
pool@today".  Reading from pool/fs/@@ would be equivalent to "zfs
snapshot pool/fs@timestamp; zfs send pool/fs@timestamp".

Writing to a particular path would have the same effect as zfs receive.

Is this something that is maybe worth spending a few more cycles on,
or is it likely broken from the beginning?

Mike

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Take Three: PSARC 2007/171 ZFS Separate Intent Log

2007-07-07 Thread Mike Gerdts
On 7/7/07, Cyril Plisko [EMAIL PROTECTED] wrote:
 Hello,

 This is a third request to open the materials of the PSARC case
 2007/171 ZFS Separate Intent Log
 I am not sure why two previous requests were completely ignored
 (even when seconded by another community member).
 In any case that is absolutely unacceptable practice.

The past week of inactivity is likely related to most of Sun in the US
being on mandatory vacation.  Sun typically shuts down for the week
that contains July 4 and (I think) the week between Christmas and Jan
1.

Mike
-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS usb keys

2007-06-27 Thread Mike Lee
I had a similar situation between x86 and SPARC involving the pool 
version number. When I created the pool on the LOWER rev machine, it 
was seen by the HIGHER rev machine. This was a USB HDD, not a stick. 
I can now move the drive between boxes.


HTH,
Mike

Dick Davies wrote:

Thanks to everyone for the sanity check - I think
it's a platform issue, but not an endian one.

The stick was originally DOS-formatted, and the zpool was built on the
first fdisk partition. So Sparcs aren't seeing it, but the x86/x64
boxes are.




--
http://www.sun.com/solaris  * Michael Lee *
Area System Support Engineer

*Sun Microsystems, Inc.*
Phone x40782 / 866 877 8350
Email [EMAIL PROTECTED]
http://www.sun.com/solaris

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] ZFS - DB2 Performance

2007-06-26 Thread Ellis, Mike
At what Solaris10 level (patch/update) was the single-threaded
compression situation resolved? 
Could you be hitting that one?

 -- MikeE 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Roch - PAE
Sent: Tuesday, June 26, 2007 12:26 PM
To: Roshan Perera
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS - DB2 Performance


Possibly the storage is flushing the write caches when it should not.
Until we get a fix, cache flushing could be disabled in the storage
(ask the vendor for the magic incantation). If that's not forthcoming
and all pools are attached to NVRAM-protected devices, then these
/etc/system evil tunables might help:

In older solaris releases we have

set zfs:zil_noflush = 1

On newer releases

set zfs:zfs_nocacheflush = 1


If you implement this, do place a comment that this is a temporary
workaround waiting for bug 6462690 to be fixed.

About compression, I don't have the numbers but a reasonable
guess would be that it consumes roughly 1 GHz of CPU to
compress 100 MB/sec. This will of course depend on the type
of data being compressed.
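
As a rough worked example of that guess (the only inputs are the 1 GHz
per 100 MB/sec figure above and the 1.8 GHz cores mentioned in the
quoted message below, both of which are assumptions): sustaining 180
MB/sec of compressed writes would cost about 1.8 GHz, i.e. roughly one
full core, and 720 MB/sec would cost on the order of four cores.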

-r

Roshan Perera writes:
  Hi all,
  
  I am after some help/feedback to the subject issue explained below.
  
  We are in the process of migrating a big DB2 database from a 
  6900 (24 x 200MHz CPUs, Veritas FS, 8TB of storage, Solaris 8) to a 
  25K (12 dual-core 1800MHz CPUs, ZFS, 8TB of SAN storage, 
  compressed RaidZ, Solaris 10).
  
  Unfortunately, we are having massive performance problems with the new
  solution. It all points towards IO and ZFS. 
  
  Couple of questions relating to ZFS.
  1. What is the impact of using ZFS compression? What percentage of
  system resources is required, and how much of an overhead is this as
  opposed to non-compression? In our case DB2 does a similar amount of
  reads and writes. 
  2. Unfortunately we are using RAID twice (SAN-level RAID and RaidZ) to
  overcome the panic problem from my previous post (for which I had a
  good response). 
  3. Any way of monitoring ZFS performance other than iostat?
  4. Any help on ZFS tuning in this kind of environment, like caching
  etc.?
  
  Would appreciate any feedback/help on where to go next. 
  If this cannot be resolved we may have to go back to VXFS which would
be a shame.
  
  
  Thanks in advance.
  
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread mike

On 6/20/07, Constantin Gonzalez [EMAIL PROTECTED] wrote:


 One disk can be one vdev.
 A 1+1 mirror can be a vdev, too.
 A n+1 or n+2 RAID-Z (RAID-Z2) set can be a vdev too.

- Then you concatenate vdevs to create a pool. Pools can be extended by
 adding more vdevs.

- Then you create ZFS file systems that draw their block usage from the
 resources supplied by the pool. Very flexible.


This actually brings up something I was wondering about last night:

If I were to plan for a 16-disk ZFS-based system, you would probably
suggest I configure it as something like 5+1, 4+1, 4+1, all raid-z
(I don't need the double-parity concept).

I would prefer something like 15+1 :) I want ZFS to be able to detect
and correct errors, but I do not need to squeeze all the performance
out of it (I'll be using it as a home storage server for my DVDs and
other audio/video stuff. So only a few clients at the most streaming
off of it)

I would be interested in hearing if there are any other configuration
options to squeeze the most space out of the drives. I have no issue
with powering down to replace a bad drive, and I expect that I'll only
have at most one fail at a time. If I really do need room for two
to fail then I suppose I can look for a setup with 14 drives of usable
space and use raidz-2.
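
For concreteness, the layouts being weighed would look like this at the
command line (c0d0 through c0d15 are placeholder device names):

    # three raid-z vdevs (5+1, 4+1, 4+1), as suggested
    zpool create tank \
        raidz c0d0 c0d1 c0d2 c0d3 c0d4 c0d5 \
        raidz c0d6 c0d7 c0d8 c0d9 c0d10 \
        raidz c0d11 c0d12 c0d13 c0d14 c0d15

    # one wide raid-z (15+1): most usable space, one drive of parity
    zpool create tank \
        raidz c0d0 c0d1 c0d2 c0d3 c0d4 c0d5 c0d6 c0d7 \
              c0d8 c0d9 c0d10 c0d11 c0d12 c0d13 c0d14 c0d15

    # one raidz2 (14+2): two drives of parity
    zpool create tank \
        raidz2 c0d0 c0d1 c0d2 c0d3 c0d4 c0d5 c0d6 c0d7 \
               c0d8 c0d9 c0d10 c0d11 c0d12 c0d13 c0d14 c0d15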

Thanks,
mike
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Scalability/performance

2007-06-20 Thread mike

On 6/20/07, Paul Fisher [EMAIL PROTECTED] wrote:

I would not risk raidz on that many disks.  A nice compromise may be 14+2 
raidz2, which should perform nicely for your workload and be pretty reliable 
when the disks start to fail.


Would anyone on the list not recommend this setup? I could live with 2
drives being used for parity (or the parity concept)

I would be able to reap the benefits of ZFS - self-healing, corrupted
file reconstruction (since it has some parity to read from) and should
have decent performance (obviously not smokin' since I am not
configuring this to try for the fastest possible)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Karma Re: Re: Best use of 4 drives?

2007-06-15 Thread mike

On 6/15/07, Brian Hechinger [EMAIL PROTECTED] wrote:


Hmmm, that's an interesting point.  I remember the old days of having to
stagger startup for large drives (physically large, not capacity large).

Can that be done with SATA?


I had to link 2 600w power supplies together to be able to power on 12 drives...

I believe it is up to the controller (and possibly the drives) to
support staggering. But it is allowed in SATA if the controller/drives
support it.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Btrfs, COW for Linux [somewhat OT]

2007-06-14 Thread mike

it's about time. this hopefully won't spark another license debate,
etc... ZFS may never get into linux officially, but there's no reason
a lot of the same features and ideologies can't make it into a
linux-approved-with-no-arguments filesystem...

as a more SOHO user i like ZFS mainly for its COW and integrity, and
being able to add more storage later on. the latter is nothing new
though. but telling the world "who needs hardware raid? software can
do it much better" is a concept that excites me; it would be great for
linux to have something like that as well that could be merged into
the kernel without any debate.

On 6/14/07, David Magda [EMAIL PROTECTED] wrote:

Hello,

Somewhat off topic, but it seems that someone released a COW file
system for Linux (currently in 'alpha'):

   * Extent based file storage (2^64 max file size)
   * Space efficient packing of small files
   * Space efficient indexed directories
   * Dynamic inode allocation
   * Writable snapshots
   * Subvolumes (separate internal filesystem roots)
   - Object level mirroring and striping
   * Checksums on data and metadata (multiple algorithms available)
   - Strong integration with device mapper for multiple device support
   - Online filesystem check
   * Very fast offline filesystem check
   - Efficient incremental backup and FS mirroring

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Btrfs, COW for Linux [somewhat OT]

2007-06-14 Thread mike

because i don't want bitrot to destroy the thousands of pictures and
memories i keep? i keep important personal documents, etc. filesystem
corruption is not a feature to me. perhaps i spoke incorrectly but i
consider COW to be one of the reasons a filesystem can keep itself in
check: the disk write will be transactional and guaranteed if it says
it was successful, correct?

i still plan on using offsite storage services to maintain the
physical level of redundancy (house burns down, equipment is stolen,
HDs will always die at some point, etc.) but as a user who has had
corruption happen many times (FAT32, NTFS, XFS, JFS) it is encouraging
to see more options that put emphasis on integrity...


On 6/14/07, Frank Cusack [EMAIL PROTECTED] wrote:

On June 14, 2007 3:57:55 PM -0700 mike [EMAIL PROTECTED] wrote:
 as a more SOHO user i like ZFS mainly for it's COW and integrity, and

huh.  As a SOHO user, why do you care about COW?

-frank


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Btrfs, COW for Linux [somewhat OT]

2007-06-14 Thread mike

On 6/14/07, Frank Cusack [EMAIL PROTECTED] wrote:

Yes, but there are many ways to get transactions, e.g. journalling.


ext3 is journaled. it doesn't seem to always be able to recover data.
it also takes forever to fsck. i thought COW might alleviate some of
the fsck needs... it just seems like a more efficient (or guaranteed?)
method of disk commitment. but i am speaking purely from the
sidelines. i don't know all the internals of filesystems, just the
ones that have bitten me in the past.


ps. top posting especially sucks when you ask multiple questions.


yes, sir!
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Holding disks for home servers

2007-06-07 Thread mike

looks like you used 3 for a total of 15 disks, right?

I have a CM stacker too - I used the CM 4-disks-in-3-5.25-slots
though. I am currently trying to sell it too, as it is bulky and I
would prefer using eSATA/maybe Firewire/USB enclosures and a small
controller machine (like a Shuttle) so it is much easier to move
around, and much easier to expand. You'll hit a ceiling real quick
with those big chassis (I already did, it only holds 12 disks in
current fashion and I have a 16 port Areca card) and I don't want to
get stuck once again running out of space, ZFS or not. Not to mention
I had to custom bind two 600w power supplies together to give it
enough juice to run... I want something not as insane. I just want
storage :)

On 6/7/07, Rob Logan [EMAIL PROTECTED] wrote:


On the third upgrade of the home NAS, I chose
http://www.addonics.com/products/raid_system/ae4rcs35nsa.asp to hold the
disks. Each holds 5 disks in the space of three slots, and 4 fit into a
http://www.google.com/search?q=stacker+810 case for a total of 20
disks.

But if given a chance to go back in time, the
http://www.supermicro.com/products/accessories/mobilerack/CSE-M35TQ.cfm
has LEDs next to the drive, and doesn't vibrate as much.

photos in http://rob.com/sun/zfs/

Rob
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] current state of play with ZFS boot and install?

2007-05-31 Thread Mike Dotson
I've been using the zfsbootkit to modify my jumpstart images.  As far as
I know, the kit is the current process for zfs boot until further
notice.

http://www.opensolaris.org/os/community/install/files/zfsboot-kit-20060418.i386.tar.bz2

See readme in the package.

On Thu, 2007-05-31 at 02:06 -0700, Marko Milisavljevic wrote:
 I second that... I am trying to figure out what is missing so that I
 can use ZFS exclusively... right now as far as I know two major
 obstacles are no support from installer and issues with live update.
 Are both of those expected to be resolved this year?
 
 On 5/30/07, Carl Brewer [EMAIL PROTECTED] wrote:
  Out of curiosity, I'm wondering if Lori, or anyone else who actually writes 
  the stuff, has any sort of a 'current state of play' page that describes 
  the latest OS ON release and how it does ZFS boot and installs? There's 
  blogs all over the place, of course, which have a lot of stale information, 
  but is there a 'the current release supports this, and this is how you 
  install it' page anywhere, or somewhere in particular to watch?
 
  I've been playing with ZFS boot since around b34 or whenever it was that it 
  first started to be able to be used as a boot partition with the temporary 
  ufs partition hack, but I understand it's moved beyond that.
 
  I've been downloading and playing with the ON builds every now and then, 
 but haven't found (haven't looked in the right places?) anywhere where each 
 build has "this is what this build does differently, this is what works and 
 how" documented.
 
  can someone belt me with a cluestick please?
 
 
  This message posted from opensolaris.org
  ___
  zfs-discuss mailing list
  zfs-discuss@opensolaris.org
  http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
-- 
Mike Dotson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] Re: ZFS - Use h/w raid or not?Thoughts. Considerations.

2007-05-29 Thread Ellis, Mike
Also the unmirrored memory for the rest of the system has ECC and
ChipKill, which provides at least SOME protection against random
bit-flips.

--

Question: It appears that CF and friends would make a decent live-boot
(but don't run on me like I'm a disk) type of boot media, given the
limited write/re-write tolerance of flash media (at least the
non-exotic type of flash media).

Would something like future zfs-booting on a pair of CF-devices
reduce/lift that limitation? (does the COW nature of ZFS automatically
spread WRITES across the entire CF device?) [[ is tmp-fs/swap going to
remain a problem till zfs-swap adds some COW leveling to the swap-area?
]]

Thanks,

 -- MikeE

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Carson Gaspar
Sent: Tuesday, May 29, 2007 8:05 PM
To: Richard Elling
Cc: zfs-discuss@opensolaris.org; Anton B. Rang
Subject: Re: [zfs-discuss] Re: ZFS - Use h/w raid or not?Thoughts.
Considerations.


Richard Elling wrote:

 But I am curious as to why you believe 2x CF are necessary?
 I presume this is so that you can mirror.  But the remaining memory
 in such systems is not mirrored.  Comments and experiences are
welcome.

CF == bit-rot-prone disk, not RAM. You need to mirror it for all the 
same reasons you need to mirror hard disks, and then some.

-- 
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] Re: ZFS - Use h/w raid or not?Thoughts.Considerations.

2007-05-29 Thread Ellis, Mike
Hey Richard, thanks for sparking the conversation... This is a very
interesting topic (especially if you take it out of the HPC we need
1000 servers to have this minimal boot image space into general
purpose/enterprise computing)

--

Based on your earlier note, it appears you're not planning to use cheapo
free after rebate CF cards :-) (The cheap-ones would probably be
perfect for ZFS a-la cheap-o-JBOD).

Having boot disks mirrored across controllers has had sys-admins sleep
better over the years (especially in FC-loop-cases with both drives on
the same loop... Sigh). If the USB-bus one might hang these fancy
FC-cards on is robust enough then perhaps a single battle hardened
CF-card will suffice... (although zfs ditto-blocks or some form of
protection might still be considered a good thing?)

Having 2 cards would certainly make the unlikely replacement of a card
a LOT more straight-forward than a single-card failure... Much of this
would depend on the quality of these CF-cards and how they put up under
load/stress/time 

--

If we're going down this CF-boot path, many of us are going to have to
re-think our boot-environment quite a bit. We've been spoiled with 36+
GB mirrored-boot drives for some time now  (if you do a lot of
PATCHING, you'll find that even those can get tight... But that's a
discussion for a different day)

I don't think most enterprise boot disk layouts are going to fit (even
unmirrored) onto a single 4GB CF-card. So we'll have to play some games
where we start splitting off /opt, /var, (which is fairly read-write
intensive when you have process-accounting etc. running) onto some
other non-CF filesystem (likely a SAN of some variety). At some
point the hackery a 4GB CF-card is going to force us to do, is going to
become more complex than just biting the bullet and doing a full
multipath-ed SAN-boot and calling it a day. (or perhaps some future
iSCSI/NFS boot for the SAN-averse)

Seriously though... If (say in some HPC/grid space?) you can stick your
ENTIRE boot environment onto a 4GB CF-card, why not just do the SAN,
NFS/iSCSI boot thing instead? (what ever happened to:
http://blogs.sun.com/dweibel/entry/sprint_snw_2006#comments  )

--

But lets explore the CF thing some more... There is something there,
although I think Sun might have to provide some
best-practices/suggestions as to how customers that don't run a
minimum-config-no-local-apps, pacct, monitoring, etc. solaris
environment are best to use something like this. Use it as a pivot boot
onto the real root-image? That would relegate the CF-card to little more
than a rescue/utility image... Kinda cool, but not earth-shattering I
would think (especially for those already utilizing wanboot for such
purposes)

--

Splitting off /var and friends from the boot environment (and still
packing the boot env say on a ditto-block 4GB FC card) is still going to
leave a pretty tight boot env. Obviously you want to be able to do some
fancy live-upgrade stuff in this space too, and all of a sudden a single
4GB flash-card don't look so big anymore 

2 of them, with some ZFS (and compression?) or even SDS mirroring
between them would possibly go a long way to make replacement easier,
give you redundancy (zfs/sds mirrors), some wiggle-room for live-upgrade
scenarios, and who knows what else. Still tight though

--

If it's a choice between 1-CF or NONE, we'll take 1-CF I guess Fear
of the unknown (and field data showing how these guys hold up over time)
would really determine uptake I guess. (( as you said, real data
regarding these specialized CF-cards will be required... Is it going to
vary greatly from vendor to vendor? Usecase to usecase? I'm not looking
forward to blazing the trail here Something doesn't seem right,
especially without the safety net of a mirrored environment... But maybe
that's just old-school sys-admin superstition... Let's get some data,
set me straight...))

--

Right now we can stick 4x 4GB memory sticks into a x4200 (creating a
cactus looking device :-) A single built-in CF is obviously
cleaner/safer, but also somewhat limiting in terms of redundancy or even
just capacity. 

Has anyone considered taking say 2x 4G CF cards, and sticking them
inside one of the little sas-drive-enclosures? Customers could purchase
upto 4 of those for certain servers, (t2000/x4200 etc.) and treat these
as if they were really fast, lower-power/heat, (never fails no need to
mirror?) ~9GB drives. In the long-run, is that easier and more
flexible?

--

It would be really interesting to hear how others out there might try to
use a CF-boot-option in their environment.

Good thread, lets bat this around some more.

 -- MikeE





-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, May 29, 2007 9:48 PM
To: Ellis, Mike
Cc: Carson Gaspar; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Re: ZFS - Use h/w raid or
not?Thoughts.Considerations.


Ellis, Mike wrote


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Mike Dotson
On Fri, 2007-05-25 at 14:29 -0600, Lori Alt wrote:
 Bill Sommerfeld wrote:
  IMHO, there should be no need to put any ZFS filesystems in /etc/vfstab,
  but (this is something of a digression based on discussion kicked up by
  PSARC 2007/297) it's become clear to me that ZFS filesystems *should* be
  mounted by mountall and mount -a rather than via a special-case
  invocation of zfs mount at the end of the fs-local method script.
 
  in other words: teach mount how to find the list of filesystems in
  attached pools and mix them in to the dependency graph it builds to
  mount filesystems in the right order, rather than mounting
  everything-but-zfs first and then zfs later.
 
 

 I agree with this.  This seems like a necessary response to
 both PSARC/2007/297 and also necessary for eliminating
 legacy mounts for zfs root file systems.  The problem of
 the interaction between legacy and non-legacy mounts will just
 get worse once we are using non-legacy mounts for the
 file systems in the BE.
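
For readers following along, a minimal illustration of the two mount
styles under discussion, using a hypothetical dataset tank/export:

    # native ZFS mount: ZFS itself mounts the dataset at boot
    zfs set mountpoint=/export tank/export

    # legacy mount: handled by /etc/vfstab and mountall, like UFS
    zfs set mountpoint=legacy tank/export
    # /etc/vfstab entry:
    #   tank/export  -  /export  zfs  -  yes  -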

Could we also look into why system-console insists on waiting for ALL
the zfs mounts to be available?  Shouldn't the main file system food
groups be mounted and then allow console-login (much like single user or
safe-mode)?

Would help in many cases where an admin needs to work on a system but
doesn't need, say 20k users home directories mounted, to do this work.


 
 Lori
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
-- 
Mike Dotson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Mike Dotson
On Fri, 2007-05-25 at 15:50 -0600, Lori Alt wrote:
 Mike Dotson wrote:
  On Fri, 2007-05-25 at 14:29 -0600, Lori Alt wrote:
   
  Would help in many cases where an admin needs to work on a system but
  doesn't need, say 20k users home directories mounted, to do this work.

 So single-user mode is not sufficient for this?
 

Not all work needs to be done in single user:) And I wouldn't consider a
4+ hour boot time just for mounting file systems a good use of cpu time
when an admin could be doing other things - preparation for next
patching, configuring changes to webserver, etc.  Or just monitoring the
status of the file system mounts to give an update to management on how
many file systems are mounted and how many are left.

Point is, why is console-login dependent on *all* the file systems being
mounted in *multiboot*.  Does it really need to depend on *all* the file
systems being mounted?  

 
 Lori
 
 
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
-- 
Thanks...


Mike Dotson
Area System Support Engineer - ACS West
Phone: (503) 343-5157
[EMAIL PROTECTED]


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Mike Dotson
On Fri, 2007-05-25 at 15:19 -0700, Eric Schrock wrote:
 This has been discussed many times in smf-discuss, for all types of
 login.  Basically, there is no way to say console login for root
 only.  As long as any user can log in, we need to have all the
 filesystems mounted because we don't know what dependencies there may
 be.  Simply changing the definition of console-login isn't a
 solution because it breaks existing assumptions and software.

<devils_advocate>
So how are you guaranteeing the NFS server and automount with autofs are up,
running and working for the user at console-login?
</devils_advocate>

I don't buy this argument, and you don't have to say console-login for
root only; you just have to have console-login where the services
available are minimal and may not include *all* services, much like when
an NFS server is down, etc. 

If the software depends on a file system or all the file systems to be
mounted, it adds that as a dependency (filesystem/local).  console-login
does not require this - only non-root users.  (I remember a smf config
bug with apache not requiring filesystem/local and failing to start)

What software is dependent on console-login?
helios(3): svcs -D console-login
STATE  STIMEFMRI

In fact console-login depends on filesystem/minimal, which to me
means minimal file systems, not all file systems, and there is no
software dependent on console-login - so where's the disconnect?

From what I see, the problem is that auditd is dependent on
filesystem/local, which is where we possibly have the hangup.

 
 A much better option is the 'trigger mount' RFE that would allow ZFS to
 quickly 'mount' a filesystem but not pull all the necessary data off
 disk until it's first accessed.

Agreed but there's still the issue with console-login being dependent on
all file systems instead of minimal file systems.
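
For anyone who wants to trace this on their own box, the dependency
graph can be walked with the stock SMF commands (output omitted here,
since it varies by system and build):

    # services that console-login depends on (filesystem/minimal, etc.)
    svcs -d svc:/system/console-login:default

    # detailed view, including dependency groupings
    svcs -l svc:/system/console-login:default

    # what auditd depends on, to check the filesystem/local link
    svcs -d svc:/system/auditd:default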

 
 - Eric
 
 
 --
 Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
-- 
Mike Dotson

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs root: legacy mount or not?

2007-05-25 Thread Mike Dotson
On Fri, 2007-05-25 at 15:46 -0700, Eric Schrock wrote:
 On Fri, May 25, 2007 at 03:39:11PM -0700, Mike Dotson wrote:
  
  In fact the console-login depends on filesystem/minimal which to me
  means minimal file systems not all file systems and there is no software
  dependent on console-login - where's the disconnect?
  
 
 You're correct - I thought console-login depended in filesystem/local,
 not
 filesystem/minimal.  ZFS filesystems are not mounted as part of
 filesystem/minimal, so remind me what the promlem is?

Create 20k zfs file systems and reboot.  Console login waits for all the
zfs file systems to be mounted (fully loaded 880, you're looking at
about 4 hours so have some coffee ready).

The *only* place I can see the filesystem/local dependency is in
svc:/system/auditd:default, however, on my systems it's disabled.

Haven't had a chance to really prune out the dependency tree to really
find the disconnect but once /, /var, /tmp and /usr are mounted, the
conditions for console-login should be met.

As you mentioned, best solution for this number of filesystems in zfs
land is the *automount* fs option where it mounts the filesystems as
needed to reduce the *boot time*.



 
 - Eric
 
 --
 Eric Schrock, Solaris Kernel Development   http://blogs.sun.com/eschrock
-- 
Thanks...


Mike Dotson
Area System Support Engineer - ACS West
Phone: (503) 343-5157
[EMAIL PROTECTED]


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] DBMS on zpool

2007-05-18 Thread Ellis, Mike

This is probably a good place to start.

http://blogs.sun.com/realneel/entry/zfs_and_databases

Please post back to the group with your results, I'm sure many of us are
interested.
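
One commonly cited starting point from that sort of guidance, as a
hedged sketch (the pool/dataset names and the 8K figure are just
placeholders -- match the recordsize to the actual DB2/Oracle page
size before loading data):

    zfs create tank/db
    zfs set recordsize=8k tank/db
    zfs set atime=off tank/db
    # keep redo/transaction logs on their own dataset, left at the
    # default 128K recordsize
    zfs create tank/dblogs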

Thanks,

 -- MikeE

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of homerun
Sent: Friday, May 18, 2007 8:42 AM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] DBMS on zpool


Hi

Just playing around with zfs, trying to place DBMS data files on a
zpool.
The DBMSs I mean here are Oracle and Informix.
I've currently noticed that read performance is excellent, but all
write operations are not, and write performance also varies
a lot.
My guess for the not-so-good write performance and the write performance
variation is double buffering: DBMS buffers and zfs caching together.
Has anyone seen or tested best practices for how a DBMS setup should be
implemented using a zpool; zfs or zvol?

Thanks
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

