[zfs-discuss] zfs fragmentation

2009-08-07 Thread Hua
1. Due to the COW nature of zfs, files on zfs are more prone to fragmentation 
compared to traditional file systems. Is this statement correct?

2. If so, the common understanding is that fragmentation causes performance 
degradation. Is zfs affected by fragmentation, and to what extent?

3. Being a relatively new file system, has zfs seen much adoption in large 
implementations?

4. Googling zfs fragmentation doesn't return many results. That could be because 
either there isn't much major adoption of zfs, or fragmentation isn't really a 
problem for zfs.

Any information is appreciated.


Re: [zfs-discuss] Pool iscsi /zfs performance in opensolaris 0906

2009-08-07 Thread erik.ableson

On 7 août 09, at 02:03, Stephen Green wrote:

I used a 2GB ram disk (the machine has 12GB of RAM) and this jumped  
the backup up to somewhere between 18-40MB/s, which means that I'm  
only a couple of hours away from finishing my backup.  This is, as  
far as I can tell, magic (since I started this message nearly 10GB  
of data have been transferred, when it took from 6am this morning to  
get to 20GB.)


The transfer speed drops like crazy when the write to disk happens,  
but it jumps right back up afterwards.


If you want to perhaps reuse the slog later (ram disks are not  
preserved across reboots), write the slog volume out to disk and dump  
it back in after restarting:

dd if=/dev/ramdisk/slog of=/root/slog.dd
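
For later readers, a minimal sketch of that save-and-restore cycle around a planned 
reboot. It assumes a 2 GB ramdisk named slog and a pool named tank (both names are 
illustrative), and that the pool is quiesced/exported before saving:

zpool export tank
dd if=/dev/ramdisk/slog of=/root/slog.dd bs=1M     # save the slog contents to stable storage
# ...reboot...
ramdiskadm -a slog 2g                              # recreate a ramdisk of the same name and size
dd if=/root/slog.dd of=/dev/ramdisk/slog bs=1M     # restore the saved contents
zpool import tank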


Now my only question is:  what do I do when it's done?  If I reboot  
and the ram disk disappears, will my tank be dead? Or will it just  
continue without the slog?  I realize that I'm probably totally  
boned if the system crashes, so I'm copying off the stuff that I  
really care about to another pool (the Mac's already been backed up  
to a USB drive.)


Have I meddled in the affairs of wizards?  Is ZFS subtle and quick  
to anger?


You have a number of options to preserve the current state of affairs  
and be able to reboot the OpenSolaris server if required.


The absolute safest bet would be the following, but the resilvering  
will take a while before you'll be able to shut down:


create a file of the same size as the ramdisk on the rpool volume
replace the ramdisk slog with the 2G file (zpool replace poolname /dev/ramdisk/slog /root/slogtemp)
wait for the resilver/replacement operation to run its course
reboot
create a new ramdisk (same size, as always)
replace the file slog with the newly created ramdisk
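
For later readers, the sequence above might look roughly like this at the command 
line. It is a sketch only, assuming a pool named tank, a 2 GB ramdisk named slog, 
and a temporary file at /root/slogtemp (all illustrative):

mkfile 2g /root/slogtemp                               # a file the same size as the ramdisk
zpool replace tank /dev/ramdisk/slog /root/slogtemp    # move the slog onto the file
zpool status tank                                      # wait for the replace/resilver to finish
init 6                                                 # reboot
ramdiskadm -a slog 2g                                  # recreate the ramdisk
zpool replace tank /root/slogtemp /dev/ramdisk/slog    # move the slog back onto the ramdisk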

If your machine reboots unexpectedly things are a little dicier, but  
you should still be able to get things back online.  If you did a dump  
of the ramdisk via dd to a file, it should contain the correct  
signature and be recognized by ZFS.  There are no guarantees about the  
state of the data, though: if anything was actively in use on the  
ramdisk when it stopped, you'll lose data, and I'm not sure how the  
pool will deal with that.  But in a pinch, you should be able to  
either replace the missing ramdisk device with the dd file copy of the  
ramdisk (make a copy first, just in case), or mount a new ramdisk,  
dd the contents of the file back to the device, and then import the pool.


Cheers,

Erik


Re: [zfs-discuss] limiting the ARC cache during early boot, without /etc/system

2009-08-07 Thread Matt Ingenthron
 Besides /etc/system, you could also export all the pools, use mdb to
 set the same variable that /etc/system sets, and then import the pools
 again. I don't know of any other mechanism to limit ZFS's memory footprint.
 
 If you don't do ZFS boot, manually import the pools after the
 application starts, so you get your pages first.

Sounds good... except this is the OpenSolaris distro we're talking about, so I have 
ZFS root with no other option.  It'll always have at least the rpool.

Good thought though!
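
For the archives, a minimal sketch of the two mechanisms discussed above. The 1 GB 
cap (0x40000000) is purely illustrative, and the mdb change does not survive a reboot:

set zfs:zfs_arc_max = 0x40000000              # persistent cap in /etc/system (needs a reboot)
echo "zfs_arc_max/Z 0x40000000" | mdb -kw     # live change per the suggestion quoted above; whether it
                                              # takes effect without a reboot depends on the build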

- Matt


Re: [zfs-discuss] Can I setting 'zil_disable' to increase ZFS/iscsi performance ?

2009-08-07 Thread roland
Yes, but to see if a separate ZIL will make a difference, the OP should 
try his iSCSI workload first with the ZIL, then temporarily disable the ZIL and 
re-try his workload.

Or you may use the zilstat utility.
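
For the archives: the usual way to run that A/B test on builds of that era was the 
zil_disable tunable, e.g. via mdb. This is a sketch for testing only -- disabling 
the ZIL risks losing recent synchronous writes on a crash, and the change only 
applies to filesystems mounted after it is made:

echo zil_disable/W0t1 | mdb -kw    # disable the ZIL for the test
echo zil_disable/W0t0 | mdb -kw    # re-enable it afterwards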


[zfs-discuss] changing SATA ports

2009-08-07 Thread Dick Hoogendijk
I've a new MB (the same as before, but this one works..) and I want to 
change the way my SATA drives are connected. I had a ZFS boot mirror 
connected to SATA3 and 4, and I want those drives to be on SATA1 and 2 now.


Question: will ZFS see this and boot the system OK or will I have to 
take some precautions beforehand?




Re: [zfs-discuss] Pool iscsi /zfs performance in opensolaris 0906

2009-08-07 Thread Stephen Green

erik.ableson wrote:

On 7 août 09, at 02:03, Stephen Green wrote:


Man, that looks so nice I think I'll change my mail client to do dates 
in French :-)


Now my only question is:  what do I do when it's done?  If I reboot 
and the ram disk disappears, will my tank be dead? Or will it just 
continue without the slog?  I realize that I'm probably totally boned 
if the system crashes, so I'm copying off the stuff that I really care 
about to another pool (the Mac's already been backed up to a USB drive.)


You have a number of options to preserve the current state of affairs 
and be able to reboot the OpenSolaris server if required.


The absolute safest bet would be the following, but the resilvering will 
take a while before you'll be able to shut down:

create a file of the same size as the ramdisk on the rpool volume
replace the ramdisk slog with the 2G file (zpool replace poolname 
/dev/ramdisk/slog /root/slogtemp)

wait for the resilver/replacement operation to run its course
reboot
create a new ramdisk (same size, as always)
replace the file slog with the newly created ramdisk


Would having an slog as a file on a different pool provide anywhere near 
the same improvement that I saw by adding a ram disk? Would it affect 
the typical performance (i.e., reading and writing files in my editor) 
adversely?


That is, could I move the slog to a file and then just leave it there so 
that I don't have trouble across reboots?  I could then just use the 
ramdisk when big things happened on the MacBook.


If your machine reboots unexpectedly things are a little dicier, but you 
should still be able to get things back online.  If you did a dump of 
the ramdisk via dd to a file it should contain the correct signature and 
be recognized by ZFS.  Now there will be no guarantees to the state of 
the data since if there was anything actively used on the ramdisk when 
it stopped you'll lose data and I'm not sure how the pool will deal with 
this.  But in a pinch, you should be able to either replace the missing 
ramdisk device with the dd file copy of the ramdisk (make a copy first, 
just in case) or mount a new ramdisk, and dd the contents of the file 
back to the device and then import the pool.


So, I take it if I just do a shutdown, the slog will be emptied 
appropriately to the pool, but then at startup the slog device will be 
missing and the system won't be able to import that pool.


If I dd the ramdisk to a file, I suppose that I should use a file on my 
rpool, right?


Thanks for the advice, I think it might be time to convince the wife 
that I need to buy an SSD.  Anyone have recommendations for a reasonably 
priced SSD for a home box?


Steve





[zfs-discuss] MISTAKE in Evil_Tuning_Guide - FLUSH

2009-08-07 Thread Michael Marburger
Who do we contact to fix misinformation in the Evil Tuning Guide?

at: 
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#How_to_Tune_Cache_Sync_Handling_Per_Storage_Device

Item 2 indicates that SPARC uses the file ssd.conf and x64 uses sd.conf to insert an 
sd-config-list line.

After doing this for Hitachi SAN LUNs, we still had performance issues. Sun 
support Incident 71249590 indicated that for ssd.conf, the token must be 
ssd-config-list, not sd-config-list (beginning with ssd rather than sd).  This solved 
our problem and we are no longer sending the cache flush to the Hitachi SAN.
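
For later readers, the kind of entry being discussed looks roughly like this. It is 
illustrative only -- the vendor/product string, its padding, and the property value 
must come from the guide and your array's documentation; the point of the correction 
is simply the ssd- prefix on SPARC:

ssd-config-list = "HITACHI OPEN-V", "cache-nonvolatile:true";    # SPARC: /kernel/drv/ssd.conf
sd-config-list  = "HITACHI OPEN-V", "cache-nonvolatile:true";    # x64:   /kernel/drv/sd.conf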

Anyone know who to contact to get the documentation fixed?


Re: [zfs-discuss] Pool iscsi /zfs performance in opensolaris 0906

2009-08-07 Thread Stephen Green

Stephen Green wrote:
Thanks for the advice, I think it might be time to convince the wife 
that I need to buy an SSD.  Anyone have recommendations for a reasonably 
priced SSD for a home box?


For example, does anyone know if something like:

http://www.newegg.com/Product/Product.aspx?Item=N82E16820227436

manufacturers homepage:

http://www.ocztechnology.com/products/solid_state_drives/ocz_minipci_express_ssd-sata_

would work in OpenSolaris?  It (apparently) just looks like a SATA disk 
on the PCIe bus, and the package that they ship it in doesn't look big 
enough to have a driver disk in it (and the manufacturer doesn't provide 
drivers on their Web site.)


Compatibility aside, would a 16GB SSD on a SATA port be a good solution 
to my problem? My box is a bit shy on SATA ports, but I've got lots of 
PCI ports.  Should I get two?


It's only $60, so not such a troublesome sell to my wife.

Steve




Re: [zfs-discuss] changing SATA ports

2009-08-07 Thread Tim Cook
On Fri, Aug 7, 2009 at 8:49 AM, Dick Hoogendijk d...@nagual.nl wrote:

 I've a new MB (the same as before, but this one works..) and I want to
 change the way my SATA drives are connected. I had a ZFS boot mirror
 connected to SATA3 and 4, and I want those drives to be on SATA1 and 2 now.

 Question: will ZFS see this and boot the system OK or will I have to take
 some precautions beforehand?



You need to update GRUB if you're going to change the ports the boot drives
are plugged into.
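
For the archives, a commonly suggested precaution with a ZFS boot mirror is to make 
sure the GRUB boot blocks are present on both halves and to check what device names 
the pool sees after the move. A sketch only, with hypothetical device names:

zpool status rpool                                                    # note the new c#t#d# names
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t0d0s0    # repeat for each mirror half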


Re: [zfs-discuss] zfs fragmentation

2009-08-07 Thread Bob Friesenhahn

On Thu, 6 Aug 2009, Hua wrote:

1. Due to the COW nature of zfs, files on zfs are more prone to 
fragmentation compared to traditional file systems. Is this statement 
correct?


Yes and no.  Fragmentation is a complex issue.

ZFS uses 128K data blocks by default whereas other filesystems 
typically use 4K or 8K blocks.  This naturally reduces the potential 
for fragmentation by 32X over 4k blocks.
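
For readers who want to see or change this, the block size Bob is describing is the 
per-dataset recordsize property; the dataset names here are just examples:

zfs get recordsize tank/home       # 128K by default
zfs set recordsize=8K tank/db      # often lowered to match a database's I/O size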


ZFS storage pools are typically comprised of multiple vdevs and 
writes are distributed over these vdevs.  This means that the first 
128K of a file may go to the first vdev and the second 128K may go to 
the second vdev.  It could be argued that this is a type of 
fragmentation but since all of the vdevs can be read at once (if zfs 
prefetch chooses to do so) the seek time for single-user contiguous 
access is essentially zero since the seeks occur while the application 
is already busy processing other data.  When mirror vdevs are used, 
any device in the mirror may be used to read the data.


ZFS uses a slab-style allocator: it allocates large contiguous chunks 
from the vdev storage and then carves the 128K blocks out of those 
large chunks.  This dramatically increases the probability that 
related data will be very close together on the same disk.


ZFS delays ordinary writes until the very last minute according to these 
rules (my understanding): 7/8ths of total memory is consumed, 5 seconds' worth of 
100% write I/O has been collected, or 30 seconds have elapsed.  Since quite a 
lot of data is written at once, zfs is able to write that data in the 
best possible order.


ZFS uses a copy-on-write model.  Copy-on-write tends to cause 
fragmentation if portions of existing files are updated.  If a large 
portion of a file is overwritten in a short period of time, the result 
should be reasonably fragment-free, but if parts of the file are 
updated over a long period of time (like a database) then the file is 
certain to be fragmented.  This is not as big a problem as it 
appears, since such files are typically accessed randomly anyway.


ZFS absolutely observes synchronous write requests (e.g. by NFS or a 
database).  The synchronous write requests do not benefit from the 
long write aggregation delay so the result may not be written as 
ideally as ordinary write requests.  Recently zfs has added support 
for using an SSD as a synchronous write log, and this allows zfs to 
turn synchronous writes into more ordinary writes which can be written 
more intelligently while returning to the user with minimal latency.
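
A dedicated log device is added with zpool add; the pool and device names below are 
illustrative:

zpool add tank log c4t0d0                   # single dedicated ZIL device (e.g. an SSD)
zpool add tank log mirror c4t0d0 c4t1d0     # or, alternatively, a mirrored pair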


Perhaps the most significant fragmentation concern for zfs is if the 
pool is allowed to become close to 100% full.  Similar to other 
filesystems, the quality of the storage allocations goes downhill fast 
when the pool is almost 100% full, so even files written contiguously 
may be written in fragments.


3. Being a relatively new file system, has zfs seen much adoption in 
large implementations?


There are indeed some sites which heavily use zfs.  One very large 
site using zfs is archive.org.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/


Re: [zfs-discuss] Pool Layout Advice Needed

2009-08-07 Thread Adam Sherman

On 6-Aug-09, at 11:32 , Thomas Burgess wrote:
I've seen some people use USB sticks, and in practice it works on  
SOME machines.  The biggest difference is that the BIOS has to allow  
for USB booting.  Most of today's computers do.  Personally I like  
compact flash because it is fairly easy to use as a cheap  
alternative to a hard drive.  I mirror the CF drives exactly like  
they are hard drives, so if one fails I just replace it.  USB is a  
little harder to do that with because they are just not as  
consistent as compact flash.  But honestly it should work and many  
people do this.



I've ended up purchasing two 8GB CF cards and the required CF-SATA  
adapters.


How, once I install OpenSolaris on the system using the two CF cards  
as a mirrored ZFS root pool, can I leverage any of the free space for  
some kind of ZFS specific performance improvement? slog? etc?


Thanks for everyone's input!

A.

--
Adam Sherman
CTO, Versature Corp.
Tel: +1.877.498.3772 x113





Re: [zfs-discuss] MISTAKE in Evil_Tuning_Guide - FLUSH

2009-08-07 Thread Michael Marburger
Sweet! Thanks! You rock!


Re: [zfs-discuss] Pool iscsi /zfs performance in opensolaris 0906

2009-08-07 Thread Scott Meilicke
Note - this has a mini PCIe interface, not PCIe.

I had the 64GB version in a Dell Mini 9. While it was great for its small 
size, low power and low heat (no fan on the Mini 9!), it was 
only faster than the striped SATA drives in my Mac Pro when it came to random 
reads. Everything else was slower, sometimes by a lot, as measured by XBench. 
Unfortunately I no longer have the numbers to share. I see the sustained writes 
listed as up to 25 MB/s, and bursts up to 51 MB/s.

That said, I have read of people having good luck with fast CF cards (no ref, 
sorry). So maybe this will be just fine :) 

-Scott


Re: [zfs-discuss] zfs fragmentation

2009-08-07 Thread Scott Meilicke
 ZFS absolutely observes synchronous write requests (e.g. by NFS or a 
 database). The synchronous write requests do not benefit from the 
 long write aggregation delay so the result may not be written as 
 ideally as ordinary write requests. Recently zfs has added support 
 for using a SSD as a synchronous write log, and this allows zfs to 
 turn synchronous writes into more ordinary writes which can be written 
 more intelligently while returning to the user with minimal latency.

Bob, since the ZIL is always used, whether it is on a separate device or not, won't 
writes to a system without a separate ZIL device also be written as intelligently as 
with one?

Thanks,
Scott


[zfs-discuss] ZFS log root pool?

2009-08-07 Thread Gregg Ferguson [Systems Engineer, Sun West Coast Aerospace]

Hi,

Is the ability to add a log device to a root pool on the roadmap for ZFS?

Thanks,
  Gregg  gregg dot ferguson at sun dot com




Re: [zfs-discuss] zfs fragmentation

2009-08-07 Thread Neil Perrin



On 08/07/09 10:54, Scott Meilicke wrote:
ZFS absolutely observes synchronous write requests (e.g. by NFS or a 
database). The synchronous write requests do not benefit from the 
long write aggregation delay so the result may not be written as 
ideally as ordinary write requests. Recently zfs has added support 
for using a SSD as a synchronous write log, and this allows zfs to 
turn synchronous writes into more ordinary writes which can be written 
more intelligently while returning to the user with minimal latency.


Bob, since the ZIL is used always, whether a separate device or not,
won't writes to a system without a separate ZIL also be written as
intelligently as with a separate ZIL?


- Yes. ZFS uses the same code path (intelligence?) to write out the data
from NFS - regardless of whether there's a separate log (slog) or not.



Thanks,
Scott



Re: [zfs-discuss] zfs fragmentation

2009-08-07 Thread Bob Friesenhahn

On Fri, 7 Aug 2009, Scott Meilicke wrote:


Bob, since the ZIL is used always, whether a separate device or not, 
won't writes to a system without a separate ZIL also be written as 
intelligently as with a separate ZIL?


I don't know the answer to that.  Perhaps there is no current 
advantage.  The longer the final writes can be deferred, the more 
opportunity there is to write the data with a better layout, or to 
avoid writing some data at all.


One thing I forgot to mention in my summary is that zfs is commonly 
used in multi-user environments where there may be many simultaneous 
writers.  Simultaneous writers tend to naturally fragment a filesystem 
unless the filesystem is willing to spread the data out in advance and 
take a seek hit (from one file to another) for each file write.  ZFS's 
deferral of writes allows the data to be written more 
intelligently in these multi-user environments.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/


[zfs-discuss] add-view for the zfs snapshot

2009-08-07 Thread liujun
I first created a LUN with stmfadm create-lu and stmfadm add-view, so the initiator 
can see the created LUN.

Now I have used zfs snapshot to create a snapshot of that LUN.

What do I need to do so that the snapshot can be accessed by the initiator? Thanks.
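
For later readers, one common approach is to clone the snapshot and export the clone 
as a new logical unit; this is a sketch only, with hypothetical dataset names (the 
GUID comes from the create-lu output):

zfs snapshot tank/vol01@snap1
zfs clone tank/vol01@snap1 tank/vol01-snap1
stmfadm create-lu /dev/zvol/rdsk/tank/vol01-snap1
stmfadm add-view 600144f0...        # the GUID printed by create-lu, plus any host/target groups you use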


Re: [zfs-discuss] Pool iscsi /zfs performance in opensolaris 0906

2009-08-07 Thread Stephen Green

Stephen Green wrote:
Oh, and for those following along at home, the re-silvering of the slog 
to a file is proceeding well.  72% done in 25 minutes.


And, for the purposes of the archives, the re-silver finished in 34 
minutes and I successfully removed the RAM disk.  Thanks, Erik, for the 
eminently followable instructions.


Also, I got my wife to agree to a new SSD, so I presume that I can 
simply do the re-silver with the new drive when it arrives.


Can I replace a log with a larger one?  Can I partition the SSD (looks 
like I'll be getting a 32GB one) and use half for cache and half for 
log?  Even if I can, should I?
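
For the archives, that split is done by slicing the SSD and handing one slice to the 
log and one to the cache; a sketch with hypothetical device/slice names. (Note that 
on builds of that era a log device, once added, could not simply be removed again, 
so it is worth sizing the log slice conservatively.)

zpool add tank log c3t0d0s0      # a few GB is plenty for the ZIL
zpool add tank cache c3t0d0s1    # the rest as L2ARC; cache devices can be removed freely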


Steve



Re: [zfs-discuss] add-view for the zfs snapshot

2009-08-07 Thread Charles Baker
 I first created a LUN with stmfadm create-lu and stmfadm add-view, so
 the initiator can see the created LUN.
 
 Now I have used zfs snapshot to create a snapshot of that LUN.
 
 What do I need to do so that the snapshot can be accessed by the
 initiator? Thanks.

Hi,

This is a good question and something that I have not tried.
Please see Chapter 7 of the zfs manual linked below.

http://dlc.sun.com/pdf/819-5461/819-5461.pdf

Cross-posting to the zfs-discuss list.

regards
Chuck


Re: [zfs-discuss] Supported Motherboard SATA controller chipsets?

2009-08-07 Thread Volker A. Brandt
Hello Kyle!


Sorry for the late answer.

  Be careful with nVidia if you want to use Samsung SATA disks.
  There is a problem with the disk freezing up.   This bit me with
  our X2100M2 and X2200M2 systems.

 I don't know if it's related to your issue, but I have also seen
 comments around about the nv-sata Windows drivers hanging up when
 formatting drives larger than 1024GB. But that's been fixed in the latest
 nVidia Windows drivers.

 Does that sound related, or like something different?

Something different.  The problem with the X2100M2 and X2200M2 will
only occur with specific Samsung disk models, in my case the HD103UJ
1 TB disk.  The system will work fine, until suddenly the disk freezes
up.  The disk is then no longer recognized at all.  It will not respond
to any command whatsoever.

After a power cycle, the disk is fine -- until the next freeze.

I think the same happened to some people on the 'net with the 750GB
variant of the same disk, but I have only seen it with the 1 TB type.


Regards -- Volker
-- 

Volker A. Brandt  Consulting and Support for Sun Solaris
Brandt & Brandt Computer GmbH   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim Email: v...@bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513  Schuhgröße: 45
Geschäftsführer: Rainer J. H. Brandt und Volker A. Brandt


Re: [zfs-discuss] Shrinking a zpool?

2009-08-07 Thread Cindy . Swearingen

Hey Richard,

I believe 6844090 would be a candidate for an s10 backport.

The behavior of 6844090 worked nicely when I replaced a disk of the same
physical size even though the disks were not identical.

Another flexible storage feature is George's autoexpand property (Nevada
build 117), where you can attach or replace a disk in a pool with a LUN
that is larger than the existing size of the pool, but keep the LUN size
constrained with autoexpand set to off.

Then, if you decide that you want to use the expanded LUN, you can set
autoexpand to on, or you can just detach it to use in another pool where 
you need the expanded size.


(The autoexpand feature description is in the ZFS Admin Guide on the
opensolaris/...zfs/docs site.)
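
A quick sketch of the property in use, with an illustrative pool name (the property
only exists on builds that have the feature):

zpool get autoexpand tank
zpool set autoexpand=off tank    # keep the pool at its current size while the larger LUN resilvers
zpool set autoexpand=on tank     # later, let the pool grow to the full LUN size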

Contrasting the autoexpand behavior with current Solaris 10 releases, I
noticed recently that you can use zpool attach/detach to attach a larger 
disk for eventual replacement purposes, and the pool size is expanded
automatically, even on a live root pool, without the autoexpand feature
and with no import/export/reboot needed. (Well, I always reboot to see if
the new disk will boot before detaching the existing disk.)

I did this recently to expand a 16-GB root pool to 68-GB root pool.
See the example below.

Cindy

# zpool list
NAME    SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
rpool  16.8G  5.61G  11.1G   33%  ONLINE  -
# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        rpool        ONLINE       0     0     0
          c1t18d0s0  ONLINE       0     0     0

errors: No known data errors
# zpool attach rpool c1t18d0s0 c1t1d0s0
# zpool status rpool
  pool: rpool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h3m, 51.35% done, 0h3m to go
config:

        NAME           STATE     READ WRITE CKSUM
        rpool          ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            c1t18d0s0  ONLINE       0     0     0
            c1t1d0s0   ONLINE       0     0     0
# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c1t1d0s0
(boot from the new disk to make sure the replacement disk boots)
# init 0
# zpool list
NAME    SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
rpool  16.8G  5.62G  11.1G   33%  ONLINE  -
# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        rpool          ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            c1t18d0s0  ONLINE       0     0     0
            c1t1d0s0   ONLINE       0     0     0

errors: No known data errors
# zpool detach rpool c1t18d0s0
# zpool list
NAME    SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
rpool  68.2G  5.62G  62.6G   8%  ONLINE  -
# cat /etc/release
Solaris 10 5/09 s10s_u7wos_08 SPARC
Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
 Use is subject to license terms.
  Assembled 30 March 2009



On 08/05/09 17:20, Richard Elling wrote:

On Aug 5, 2009, at 4:06 PM, cindy.swearin...@sun.com wrote:


Brian,

CR 4852783 was updated again this week so you might add yourself or
your customer to continue to be updated.

In the meantime, a reminder is that a mirrored ZFS configuration
is flexible in that devices can be detached (as long as the redundancy
is not compromised) or replaced as long as the replacement disk is  an 
equivalent size or larger. So, you can move storage around if you  
need to in a mirrored ZFS config and until 4852783 integrates.



Thanks Cindy,
This is another way to skin the cat. It works for simple volumes, too.
But there are some restrictions, which could impact the operation when a
large change in vdev size is needed. Is this planned to be backported
to Solaris 10?

CR 6844090 has more details.
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
  -- richard






Re: [zfs-discuss] zfs fragmentation

2009-08-07 Thread Ed Spencer
Let me give a real life example of what I believe is a fragmented zfs pool.

Currently the pool is 2 terabytes in size (55% used) and is made of 4 SAN LUNs 
(512GB each).
The pool has never gotten close to being full. We increase the size of the pool 
by adding 2 512GB LUNs about once a year or so.

The pool has been divided into 7 filesystems.

The pool is used for IMAP email data. The email system (Cyrus) has 
approximately 80,000 accounts, all located within the pool and evenly distributed 
between the filesystems.

Each account has a directory associated with it. This directory is the user's 
inbox. Additional mail folders are subdirectories. Mail is stored as individual 
files.

We receive mail at a rate of 0-20MB/Second, every minute of every  hour of 
every day of every week, etc etc.

Users receive mail constantly over time. They read it and then either delete it 
or store it in a subdirectory/folder.

I imagine that my mail (located in a single subdirectory structure) is spread 
over the entire pool because it has been received over time. I believe the data 
is highly fragmented (from a file and directory perspective).

The result of this is that backup throughput of a single filesystem in this pool 
is about 8GB/hour.
We use EMC networker for backups.
  
This is a problem. There are no utilities available to evaluate this type of 
fragmentation.  
There are no utilities to fix it.

ZFS, from the mail system's perspective, works great. 
Writes and random reads perform well.

Backup is a problem, and not just because of small files, but because of small files 
scattered over the entire pool. 

Adding another pool and copying all/some data over to it would only be a short-term 
solution.

I believe zfs needs a feature that operates in the background and defrags the 
pool to optimize sequential reads of the file and directory structure.

Ed


Re: [zfs-discuss] Pool iscsi /zfs performance in opensolaris 0906

2009-08-07 Thread Stephen Green

Stephen Green wrote:
Also, I got my wife to agree to a new SSD, so I presume that I can 
simply do the re-silver with the new drive when it arrives.


And the last thing for today, I ended up getting:

http://www.newegg.com/Product/Product.aspx?Item=N82E16820609330

which is 16GB and should be sufficient for my needs.  I'll let you know 
how it works out.  Suggestions for pre/post-installation I/O tests are welcome.


Steve


Re: [zfs-discuss] zfs fragmentation

2009-08-07 Thread Richard Elling

On Aug 7, 2009, at 2:29 PM, Ed Spencer wrote:

Let me give a real life example of what I believe is a fragmented  
zfs pool.


Currently the pool is 2 terabytes in size  (55% used) and is made of  
4 san luns (512gb each).
The pool has never gotten close to being full. We increase the size  
of the pool by adding 2 512gb luns about once a year or so.


The pool has been divided into 7 filesystems.

The pool is used for imap email data. The email system (cyrus) has  
approximately 80,000 accounts all located within the pool, evenly  
distributed between the filesystems.


Each account has a directory associated with it. This directory is  
the users inbox. Additional mail folders are subdirectories. Mail is  
stored as individual files.


We receive mail at a rate of 0-20MB/Second, every minute of every   
hour of every day of every week, etc etc.


Users receive mail constantly over time. They read it and then  
either delete it or store it in a subdirectory/folder.


I imagine that my mail (located in a single subdirectory structure)  
is spread over the entire pool because it has been received over  
time. I believe the data is highly fragmented (from a file and  
directory perspective).


The result of this is that backup throughput of a single filesystem  
in this pool is about 8GB/hour.

We use EMC networker for backups.


This is very unlikely to be a fragmentation problem. It is a scalability 
problem, and there may be something you can do about it in the short term.

However, though I don't usually like to tease, in this case I need to tease: I
recently completed a white paper on this exact workload and how we
designed it to scale. I hope to publish that paper RSN.  When the paper
hits the web, I'll start a new thread on using ZFS for large-scale email
systems.



This is a problem. There are no utilities available to evaluate this  
type of fragmentation.

There are no utilities to fix it.

ZFS, from the mail system perspective works great.
Writes and random reads operate well.

Backup is a problem and not just because of small files, but small  
files scatterred over the entire pool.


Adding another pool and copying all/some data over to it would only be  
a short-term solution.


I'll have to disagree.

I believe zfs needs a feature that operates in the background and  
defrags the pool to optimize sequential reads of the file and  
directory structure.


This will not solve your problem, but there are other methods that can.
 -- richard



Ed


Re: [zfs-discuss] Pool Layout Advice Needed

2009-08-07 Thread Ian Collins

Adam Sherman wrote:

On 6-Aug-09, at 15:16 , Ian Collins wrote:
This ended up being a costly mistake; the environment I ended up with 
didn't play well with Live Upgrade.  So I suggest, whatever you do, 
make sure you can create a new BE and boot into it before committing.


I assume this was old-style LU rather than the new-style ZFS-based boot 
environments?


No, the original BE was build 101, ZFS boot.  An lucreate from that BE 
took a day (!) and the new BE wasn't bootable.


--
Ian.
