Re: [zfs-discuss] Server Cloning With ZFS?

2009-06-18 Thread Fajar A. Nugraha
On Thu, Jun 18, 2009 at 10:56 AM, Dave Ringkor <no-re...@opensolaris.org> wrote:
 But what if I used zfs send to save a recursive snapshot of my root pool on 
 the old server, booted my new server (with the same architecture) from the 
 DVD in single user mode and created a ZFS pool on its local disks, and did 
 zfs receive to install the boot environments there?  The filesystems don't 
 care about the underlying disks.  The pool hides the disk specifics.  There's 
 no vfstab to edit.

http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide#ZFS_Root_Pool_Recovery
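
For reference, the flow in that guide maps directly onto the steps you
describe. A minimal sketch, assuming a SPARC target and using placeholder
device, snapshot, and boot-environment names:

  # on the old server
  zfs snapshot -r rpool@migrate
  zfs send -R rpool@migrate > /backup/rpool.migrate

  # on the new server, booted from DVD in single-user mode
  zpool create -f -o failmode=continue -R /a -m legacy \
      -o cachefile=/etc/zfs/zpool.cache rpool c0t0d0s0
  zfs receive -Fd rpool < /backup/rpool.migrate
  zpool set bootfs=rpool/ROOT/s10be rpool
  installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk \
      /dev/rdsk/c0t0d0s0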

-- 
Fajar
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] compression at zfs filesystem creation

2009-06-18 Thread Haudy Kazemi

Bob Friesenhahn wrote:

On Wed, 17 Jun 2009, Haudy Kazemi wrote:

usable with very little CPU consumed.
If the system is dedicated to serving files rather than also being 
used interactively, it should not matter much what the CPU usage is.  
CPU cycles can't be stored for later use.  Ultimately, it (mostly*) 
does not matter if


Clearly you have not heard of the software flywheel:

  http://www.simplesystems.org/users/bfriesen/software_flywheel.html
I had not heard of such a device; however, from the description it 
appears to be made from virtual unobtanium :)


My line of reasoning is that unused CPU cycles are to some extent a 
wasted resource, paralleling the idea that having system RAM sitting 
empty/unused is also a waste and that it should be used for caching until the 
system needs that RAM for other purposes (how the ZFS cache is supposed 
to work).  This isn't a perfect parallel, as CPU power consumption and 
heat output vary by load much more than they do for RAM.  I'm sure someone 
could come up with a formula for the optimal CPU loading to maximize 
energy efficiency.  There has been work on this; see the paper 'Dynamic Data 
Compression in Multi-hop Wireless Networks' at 
http://enl.usc.edu/~abhishek/sigmpf03-sharma.pdf .


If I understand the blog entry correctly, for text data the task took 
up to 3.5X longer to complete, and for media data, the task took about 
2.2X longer to complete with a maximum storage compression ratio of 
2.52X.


For my backup drive using lzjb compression I see a compression ratio 
of only 1.53x.


I linked to several blog posts.  It sounds like you are referring to ' 
http://blogs.sun.com/dap/entry/zfs_compression#comments '?
This blog's test results show that on their quad-core platform (the Sun 7410 
has quad-core 2.3 GHz AMD Opteron CPUs*):
* 
http://sunsolve.sun.com/handbook_pub/validateUser.do?target=Systems/7410/spec


For text data, LZJB compression had negligible performance impact 
(task times were unchanged or marginally better) and less storage space 
was consumed (1.47:1).
For media data, LZJB compression had negligible performance impact 
(task times were unchanged or marginally worse) and storage space 
consumed was unchanged (1:1).
Take-away message: as currently configured, their system has nothing to 
lose from enabling LZJB.


For text data, GZIP compression at any setting had a significant 
negative impact on write times (CPU bound), no performance impact on 
read times, and a significant positive improvement in compression ratio.
For media data, GZIP compression at any setting had a significant 
negative impact on write times (CPU bound), no performance impact on 
read times, and a marginal improvement in compression ratio.
Take-away message: with GZIP, as their system is currently configured, 
write performance would suffer in exchange for a higher compression 
ratio.  This may be acceptable if the system fulfills a role that has a 
read-heavy usage profile of compressible content.  (An archive.org 
backend would be such an example.)  This is similar to the tradeoff made 
when comparing RAID1 or RAID10 vs RAID5.


Automatic benchmarks could be used to detect and select the optimal 
compression settings for best performance, with the basic case assuming 
the system is a dedicated file server and more advanced cases accounting 
for the CPU needs of other processes run on the same platform.  Another 
way would be to ask the administrator what the usage profile for the 
machine will be and preconfigure compression settings suitable for that 
use case.
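
In shell terms, the manual version of such a benchmark could be as crude as
the loop below (the dataset name and sample data path are just examples):

  # try each compression setting on a scratch dataset, then compare
  # elapsed time and the resulting compression ratio
  for c in off lzjb gzip-2 gzip-9; do
      zfs create -o compression=$c tank/benchtest
      ptime cp -rp /var/tmp/sample/. /tank/benchtest/
      sync
      zfs get compressratio tank/benchtest
      zfs destroy tank/benchtest
  done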


Single and dual core systems are more likely to become CPU bound from 
enabling compression than a quad core.


All systems have bottlenecks in them somewhere by virtue of design 
decisions.  One or more of these bottlenecks will be the rate limiting 
factor for any given workload, such that even if you speed up the rest 
of the system the process will still take the same amount of time to 
complete.  The LZJB compression benchmarks on the quad core above 
demonstrate that LZJB is not the rate limiter for either writes or 
reads.  The GZIP benchmarks show that it is a rate limiter, but only 
during writes.  On a more powerful platform (6x faster CPU), GZIP writes 
may no longer be the bottleneck (assuming that the network bandwidth and 
drive I/O bandwidth remain unchanged).


System component balancing also plays a role.  If the server is 
connected via a 100 Mbps CAT5e link, and all I/O activity is from client 
computers on that link, does it make any difference if the server is 
actually capable of GZIP writes at 200 Mbps, 500 Mbps, or 1500 Mbps?  If 
the network link is later upgraded to Gigabit ethernet, now only the 
system capable of GZIPing at 1500 Mbps can keep up.  The rate limiting 
factor changes as different components are upgraded.


In many systems for many workloads, hard drive I/O bandwidth is the rate 
limiting factor that has the most significant performance impact, such 
that a 20% boost 

Re: [zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?

2009-06-18 Thread Bogdan M. Maryniuk
2009/6/18 Timh Bergström <timh.bergst...@diino.net>:
 USB-sticks has proven a bad idea with zfs mirrors
I think USB sticks are a bad idea for mirrors in general... :-)

 ZFS on iSCSI *is* flaky
OK, so what is the status of your bug report about this? Was it ignored or
just rejected?..

 Flaming people on ./
Nobody is flaming people, neither in the current directory (./) nor on /.
(slash-dot). All that was asked for is practical steps or bug reports.

P.S. Additionally, everyone can spend their true anger on a Solaris installed
somewhere on spare hardware and kill that sucker with
stress tests. Effect: you're relaxed and the Sun folks have a job. :-)

-- 
Kind regards, BM

Things, that are stupid at the beginning, rarely ends up wisely.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs on 32 bit?

2009-06-18 Thread Casper . Dik


yeah.  many of those ARM systems will be low-power
builtin-crypto-accel builtin-gigabit-MAC based on Orion and similar,
NAS (NSLU2-ish) things begging for ZFS.

So what's the boot environment they use?

cd It's true for most of the Intel Atom family (Zxxx and Nxxx but
cd not the 230 and 330 as those are 64 bit) Those are new
cd systems.

the 64-bit atom are desktop, and the 32-bit are laptop.  They are both
current chips right now---the 64-bit are not newer than 32-bit.


I know; I'm not sure about the recent Pineview boxes, though.

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Using single SSD for l2arc on multiple pools?

2009-06-18 Thread Mertol Ozyoney
Hi Joseph ;

You can't share SSDs between pools (at least for today) unless you slice them.
Also, it's better to use two SSDs for L2ARC, as depending on your system
there can be slight limitations when using a single SSD. 
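
If you do go the slicing route, a rough sketch (the device names here are
just examples): carve the SSD into two slices with format(1M), then hand
one slice to each pool as a cache device:

  zpool add pool1 cache c2t0d0s0
  zpool add pool2 cache c2t0d0s1
  zpool status pool1 pool2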

Best regards
Mertol 



Mertol Ozyoney 
Storage Practice - Sales Manager

Sun Microsystems, TR
Istanbul TR
Phone +902123352200
Mobile +905339310752
Fax +90212335
Email mertol.ozyo...@sun.com



-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Joseph Mocker
Sent: Tuesday, June 16, 2009 10:28 PM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] Using single SSD for l2arc on multiple pools?

Hello,

I'm curious if it is possible to use a single SSD for the l2arc for 
multiple pools?

I'm guessing that I can break the SSD into multiple slices and assign a 
slice as a cache device in each pool. That doesn't seem very flexible 
though, so I was wondering if there is another way to do this?

Thanks...

  --joe
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



Re: [zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?

2009-06-18 Thread Timh Bergström
On 18 June 2009 09:42, Bogdan M. Maryniuk <bogdan.maryn...@gmail.com> wrote:
 ZFS on iSCSI *is* flaky
 OK, so what is the status of your bugreport about this? Was ignored or
 just rejected?..

No bug report, because I don't think it's the file system's fault, and
why bother when disappearing vdevs (even though the pool is fully
redundant (raidz) and has enough vdevs to theoretically keep working)
cause the machine to panic and crash, when there are other
solutions/file systems that are more robust (for me) when using
iscsi/fc. If my data is gone (or inaccessible), I have other things to
worry about than filing bug reports and/or getting on the list and getting
flamed for not having proper backups. :-]

How to reproduce? Create a raidz2 pool (with Solaris 10u3) over two
iscsi enclosures, shut down one of the enclosures, and observe the results.
It would probably work better if I upgraded Solaris/ZFS, but as I said
- at the time I had other things to worry about.

No flaming/blaming/hating; I simply don't use the combination
zfs+iscsi/fc for critical data anymore, and that's OK with me.

--
Best Regards,
Timh
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS: Re-Propragate inheritable ACL permissions

2009-06-18 Thread Andreas Kuechler
Hi Cindy and Christo,

this is a good example of how useless ZFS ACLs are. Nobody understands how to 
use them!

Please note in Cindy's examples above:

You cannot use file_inherit on files. Inheritance can only be set on 
directories. Depending on the zfs aclinherit mode, the result may not be what 
you want. When you have set ACL inheritance on a directory and then use chmod in 
the old way, e.g. chmod g-w dir1, the ACL inheritance of dir1 is modified!

Be extremely careful with chmod A=... since this replaces any ACL set on a 
file/dir, including the trivial ACLs for owner@, group@ and everyone@.
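
A small illustration of both points (the user and directory names are just 
examples):

  # grant an inheritable allow entry to user alice on dir1, then inspect it
  chmod A+user:alice:read_data/write_data/execute:file_inherit/dir_inherit:allow dir1
  ls -dv dir1
  # a traditional mode change may also rewrite the ACL entries
  # (depending on the aclmode property), so re-check them afterwards
  chmod g-w dir1
  ls -dv dir1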

My experience: Avoid ACLs wherever you can. They are simply not manageable.

Andreas
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] compression at zfs filesystem creation

2009-06-18 Thread Bob Friesenhahn

On Thu, 18 Jun 2009, Haudy Kazemi wrote:


For text data, LZJB compression had negligible performance impact (task 
times were unchanged or marginally better) and less storage space was 
consumed (1.47:1).
For media data, LZJB compression had negligible performance impact (task 
times were unchanged or marginally worse) and storage space consumed was 
unchanged (1:1).
Take-away message: as currently configured, their system has nothing to lose 
from enabling LZJB.


My understanding is that these tests were done with NFS and one client 
over gigabit ethernet (a file server scenario).  So in this case, the 
system is able to keep up with NFS over gigabit ethernet when LZJB is 
used.


In a stand-alone power-user desktop scenario, the situation may be 
quite different.  In this case application CPU usage may be competing 
with storage CPU usage.  Since ZFS often defers writes, it may be that 
the compression is performed at the same time as application compute 
cycles.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] 7110 questions

2009-06-18 Thread Dan Pritts
Hi all,

(down to the wire here on EDU grant pricing :)

I'm looking at buying a pair of 7110s in the EDU grant sale.
The price is sure right.  I'd use them in a mirrored, cold-failover
config.

I'd primarily be using them to serve a vmware cluster; the current config
is two standalone ESX servers with local storage, 450G of SAS RAID10 each.

The 7110 price point is great, and I think I have a reasonable
understanding of how this stuff ought to work.

I'm curious about a couple things that would be unsupported.

Specifically, whether they are merely unsupported, or whether they have
been deliberately disabled in the software.

1) SSD's 

I can imagine buying an Intel SSD, slotting it into the 7110, and using
it as a ZFS L2ARC (i.e. the equivalent of Readzilla).

2) expandability

I can imagine buying a SAS card and a JBOD and hooking it up to
the 7110; it has plenty of PCI slots.

Finally, one question - I presume that I need to devote a pair of disks
to the OS, so I really only get 14 disks for data.  Correct?

thanks!

danno
--
Dan Pritts, Sr. Systems Engineer
Internet2
office: +1-734-352-4953 | mobile: +1-734-834-7224

ESCC/Internet2 Joint Techs
July 19-23, 2009 - Indianapolis, Indiana
http://jointtechs.es.net/indiana2009/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] problems with l2arc in 2009.06

2009-06-18 Thread Rob Logan

 correct ratio of arc to l2arc?

from http://blogs.sun.com/brendan/entry/l2arc_screenshots

It costs some DRAM to reference the L2ARC, at a rate proportional to record size.
For example, it currently takes about 15 Gbytes of DRAM to reference 600 Gbytes of
L2ARC - at an 8 Kbyte ZFS record size. If you use a 16 Kbyte record size, that cost
would be halved - 7.5 Gbytes. This means you shouldn't, for example, configure a
system with only 8 Gbytes of DRAM, 600 Gbytes of L2ARC, and an 8 Kbyte record size -
if you did, the L2ARC would never fully populate.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Lots of metadata overhead on filesystems with 100M files

2009-06-18 Thread Louis Romero

hi Dirk,

How might we explain a find run on a Linux client against an NFS-mounted 
file system under the 7000 taking significantly longer (i.e. performance 
behaving as though the command had been run from Solaris)?  I'm not sure find 
would have the intelligence to differentiate between file system types 
and run different sections of code based upon what it finds.


louis

On 06/17/09 11:38, Dirk Nitschke wrote:

Hi Louis!

Solaris /usr/bin/find and Linux (GNU) find work differently! I 
experienced dramatic runtime differences some time ago. The reason is 
that Solaris find and GNU find use different algorithms.


GNU find uses the st_nlink (number of links) field of the stat 
structure to optimize its work. Solaris find does not use this kind 
of optimization because the meaning of the number of links is not well 
defined and is file system dependent.
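
If that optimization is the suspect on the Linux clients, GNU find can be 
told to skip it (the path below is just an example):

  # -noleaf disables GNU find's assumption that a directory's link count
  # reveals how many subdirectories it contains
  find /mnt/share -noleaf -name '*.log'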


If you are interested, take a look at, say,

CR 4907267 link count problem in hsfs
CR 4462534 RFE: pcfs should emulate link counts for directories

Dirk

Am 17.06.2009 um 18:08 schrieb Louis Romero:


Jose,

I believe the problem is endemic to Solaris.  I have run into similar 
problems doing a simple find(1) in /etc.  On Linux, a find operation 
in /etc is almost instantaneous.  On Solaris, it has a tendency to  
spin for a long time.  I don't know what their use of find might be, 
but running updatedb on the Linux clients (with the NFS file system 
mounted, of course) and using locate(1) will give you a work-around on 
the Linux clients.
Caveat emptor: there is a staleness factor associated with this 
solution, as any new files dropped in after updatedb runs will not 
show up until the next updatedb run.


HTH

louis

On 06/16/09 11:55, Jose Martins wrote:


Hello experts,

IHAC that wants to put more than 250 million files on a single
mountpoint (in a directory tree with no more than 100 files in each
directory).

He wants to share such a filesystem by NFS and mount it on
many Linux Debian clients.

We are proposing a 7410 OpenStorage appliance...

He is claiming that certain operations like find, even when run from
the Linux clients on such an NFS mountpoint, take significantly more
time than if the NFS share were provided by other NAS vendors
like NetApp...

Can someone confirm if this is really a problem for ZFS filesystems?...

Is there any way to tune it?...

We appreciate any input.

Best regards

Jose









___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Lots of metadata overhead on filesystems with 100M files

2009-06-18 Thread Cor Beumer - Storage Solution Architect




Hi Jose,

Well it depends on the total size of your Zpool and how often these
files are changed.

I was at a customer, a huge internet provider, who had 40 X4500s
with standard Solaris, using ZFS.
All the machines were equipped with 48x 1TB disks. The machines were
used to provide the email platform, so all
the user email accounts were on the system. This also meant millions
of files in one zpool.

What they noticed on the X4500 systems was that when the zpool became
filled to about 50-60%, the performance of the system
dropped enormously. 
They claim this has to do with fragmentation of the ZFS
filesystem. So we tried putting in an S7410 system with
about the same disk config, 44x 1TB SATA but with 4x 18GB WriteZilla (in
a stripe); we were able to get much, much more I/O from that system
than the comparable X4500. However, they put it into production for a
couple of weeks, and as soon as the ZFS filesystem came into the
range of about 50-60% full they saw the same problem: 
the performance dropped enormously. NetApp has the same problem
with their WAFL filesystem (they also tested this); however, they do
provide a defragmentation tool for it. That is also NOT a nice
solution, because you have to run it, manually or scheduled, and it
takes a lot of system resources, but it helps. 

I did hear that Sun denies we have this problem in ZFS, and that therefore
we don't need a kind of defragmentation mechanism;
however, our customers' experiences are different.

Maybe it would be good for the ZFS group to look at this (potential) problem.

The customer I am talking about is willing to share their experiences
with Sun engineering.

greetings,

Cor Beumer


Jose Martins wrote:

Hello experts,

IHAC that wants to put more than 250 million files on a single
mountpoint (in a directory tree with no more than 100 files in each
directory).

He wants to share such a filesystem by NFS and mount it on
many Linux Debian clients.

We are proposing a 7410 OpenStorage appliance...

He is claiming that certain operations like find, even when run from
the Linux clients on such an NFS mountpoint, take significantly more
time than if the NFS share were provided by other NAS vendors
like NetApp...

Can someone confirm if this is really a problem for ZFS filesystems?...

Is there any way to tune it?...

We appreciate any input.

Best regards

Jose


-- 
Cor Beumer
Data Management & Storage

Sun Microsystems Nederland BV
Saturnus 1
3824 ME Amersfoort, The Netherlands
Phone +31 33 451 5172
Mobile +31 6 51 603 142
Email cor.beu...@sun.com


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?

2009-06-18 Thread Miles Nordin
 bmm == Bogdan M Maryniuk bogdan.maryn...@gmail.com writes:
 tt == Toby Thain t...@telegraphics.com.au writes:

   bmm That's why I think that speaking My $foo crashes therefore it
   bmm is all crap is bad idea: either help to fix it or just don't
   bmm use it,

First, people are allowed to speak and share information, and yes,
even complain, without helping to fix things.  You do not get to
silence people who lack the talent, time, and interest to fix
problems.  Everyone's allowed to talk here.

Second, I do use ZFS.  But I keep a backup pool.  And although my
primary pool is iSCSI-based, the backup pools are direct-attached.

Thanks to the open discussion on the list, I know that using iSCSI
puts me at higher risk of pool loss.  I know I need to budget for the
backup pool equipment if I want to switch from $oldfilesystem to ZFS
and not take a step down in reliability.  I know that, while there is
no time-consuming fsck to draw out downtime, pretty much every
corruption event results in ``restore the pool from backup'' which
takes a while, so I need to expect that by, for example, being
prepared to run critical things directly off the backup pools.

Finally, I know that ZFS pool corruption almost always results in loss
of the whole pool, while other filesystem corruption tends to do
crazier things which happen to be less catastrophic to my particular
dataset: some files but not all are lost after fsck, some files remain
but lose their names, or more usefully retain their names but lose the
name of one of their parent directories, the insides of some files are
silently corrupted.

There's actionable information in here.  Technical discussion is worth
more than sucks/rules armwrestling.

   bmm The same way, if you have a mirror of USB hard drives, then
   bmm swap cables and reboot — your mirror gone. But that's not
   bmm because of ZFS, if you will look more closely...

actually I think you are the one not looking closely enough.  You say
no one is losing pools, and then 10min later reply to a post about
running zdb on a lost pool.  You shouldn't need me to tell you
something's wrong.

When you limit your thesis to ``ZFS rules'' and then actively mislead
people, we all lose.

tt /. is no person...

right, so I use a word like ad hominem, and you stray from the main
point to say ``Erm ayctually your use of rhetorical terminology is
incorrect.''  maybe, maybe not, whatever, but 

  again [x2], the posts in the slashdot thread complaining about
  corruption were just pointers to original posts on this list, so
  attacking the forum where you saw the pointer instead of the content
  of its destination really is clearly _ad hominem_.

*brrk* *brr* ``no!  no it's not ad hominem!  it's a different word!
ah, ha ah thought' you'd slip one past me there eh?''  QUIT BEING SO
DAMNED ADD.  We can get nowhere.

As for the posts being rubbish, you and I both know it's plausible
speculation that Apple delayed unleashing ZFS on their consumers
because of the lost pool problems.  ZFS doesn't suck, I do use it, I
hope and predict it will get better---so just back off and calm down
with the rotten fruit.  But neither who's saying it nor your not
wanting to hear it makes it less plausible.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] problems with l2arc in 2009.06

2009-06-18 Thread Ethan Erchinger
 
   correct ratio of arc to l2arc?
 
 from http://blogs.sun.com/brendan/entry/l2arc_screenshots
 
Thanks Rob. Hmm...that ratio isn't awesome.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 7110 questions

2009-06-18 Thread Adam Leventhal
On Thu, Jun 18, 2009 at 11:51:44AM -0400, Dan Pritts wrote:
 I'm curious about a couple things that would be unsupported.
 
 Specifically, whether they are not supported if they have specifically
 been crippled in the software.

We have not crippled the software in any way, but we have designed an
appliance with some specific uses. Doing things from the Solaris shell
by hand may damage your system and void your support contract.

 1) SSD's 
 
 I can imagine buying an intel SSD, slotting it into the 7110, and using
 it as a ZFS L2ARC (? i mean the equivalent of readzilla)

That's not supported, it won't work easily, and if you get it working you'll
be out of luck if you have a problem.

 2) expandability
 
 I can imagine buying a SAS card and a JBOD and hooking it up to
 the 7110; it has plenty of PCI slots.

Ditto.

 finally, one question - I presume that I need to devote a pair of disks
 to the OS, so I really only get 14 disks for data.  Correct?

That's right. We market the 7110 as either 2TB = 146GB x 14 or 4.2TB =
300GB x 14 raw capacity.

Adam

-- 
Adam Leventhal, Fishworks http://blogs.sun.com/ahl
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Lots of metadata overhead on filesystems with 100M files

2009-06-18 Thread Richard Elling

Cor Beumer - Storage Solution Architect wrote:

Hi Jose,

Well it depends on the total size of your Zpool and how often these 
files are changed.


...and the average size of the files.  For small files, it is likely 
that the default recordsize will not be optimal, for several reasons.
Are these small files?
-- richard



I was at a customer, a huge internet provider, who had 40 X4500s 
with standard Solaris, using ZFS.
All the machines were equipped with 48x 1TB disks. The machines were 
used to provide the email platform, so all
the user email accounts were on the system. This also meant 
millions of files in one zpool.

What they noticed on the X4500 systems was that when the zpool became 
filled to about 50-60%, the performance of the system
dropped enormously.
They claim this has to do with fragmentation of the ZFS 
filesystem. So we tried putting in an S7410 system with 
about the same disk config, 44x 1TB SATA but with 4x 18GB WriteZilla 
(in a stripe); we were able to get much, much more I/O from that 
system than the comparable X4500. However, they put it into production 
for a couple of weeks, and as soon as the ZFS filesystem came into 
the range of about 50-60% full they saw the same problem:
the performance dropped enormously. NetApp has the same problem 
with their WAFL filesystem (they also tested this); however, they do 
provide a defragmentation tool for it. That is also NOT a nice 
solution, because you have to run it, manually or scheduled, and it 
takes a lot of system resources, but it helps.

I did hear that Sun denies we have this problem in ZFS, and 
that therefore we don't need a kind of defragmentation mechanism;
however, our customers' experiences are different.

Maybe it would be good for the ZFS group to look at this (potential) problem.

The customer I am talking about is willing to share their experiences 
with Sun engineering.


greetings,

Cor Beumer


Jose Martins wrote:


Hello experts,

IHAC that wants to put more than 250 million files on a single
mountpoint (in a directory tree with no more than 100 files in each
directory).

He wants to share such a filesystem by NFS and mount it on
many Linux Debian clients.

We are proposing a 7410 OpenStorage appliance...

He is claiming that certain operations like find, even when run from
the Linux clients on such an NFS mountpoint, take significantly more
time than if the NFS share were provided by other NAS vendors
like NetApp...

Can someone confirm if this is really a problem for ZFS filesystems?...

Is there any way to tune it?...

We appreciate any input.

Best regards

Jose





--
Cor Beumer
Data Management & Storage

Sun Microsystems Nederland BV
Saturnus 1
3824 ME Amersfoort, The Netherlands
Phone +31 33 451 5172
Mobile +31 6 51 603 142
Email cor.beu...@sun.com



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
  



[zfs-discuss] ZFS metadata and cloning filesystem layout across machines

2009-06-18 Thread Nikhil
Hey ZFS experts,

Where is the ZFS metadata stored? Can it be viewed through some commands?

Here is my requirement: I have a machine with lots of ZFS filesystems on it 
under a couple of zpools, and there is another new machine with empty disks. 
What I want now is a similar layout of the pools and filesystems, with quota, 
reservation and other properties intact, along with the naming conventions, 
created on the new machine.

How do I create a schema, or reverse engineer the zfs/zpool commands 
required to create a similar layout of the ZFS filesystems on the new machine?
Is there an import facility or something like that?

Thanks.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] problems with l2arc in 2009.06

2009-06-18 Thread Richard Elling

Ethan Erchinger wrote:

  correct ratio of arc to l2arc?

from http://blogs.sun.com/brendan/entry/l2arc_screenshots



Thanks Rob. Hmm...that ratio isn't awesome.
  


TANSTAAFL

A good SWAG is about 200 bytes for L2ARC directory in the ARC for
each record in the L2ARC.

So if your recordsize is 512 bytes (pathologically worst case), you'll need
200/512 * size of L2ARC for a minimum ARC size, so ARC needs to be
about 40% of the size of L2ARC.  For 8 kByte recordsize it will be about
200/8192 or 2.5%.  Neel liked using 16kByte recordsize for InnoDB, so
figure about 1.2%.

In this case, if you have about 150 GBytes of L2ARC disk, and are using
8 kByte recordsize, you'll need at least 3.75 GBytes for the ARC, instead
of 2 GBytes.  Since this space competes with the regular ARC caches,
you'll want even more headroom, so maybe 5 GBytes would be a
reasonable minimum ARC cap?
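
A quick back-of-the-envelope check of that estimate (the ~200 bytes per
record figure is a rule of thumb and varies by release):

  # 150 GBytes of L2ARC at an 8 kByte recordsize
  echo 'scale=2; (150 * 2^30 / 8192) * 200 / 2^30' | bc -l
  # => ~3.66, i.e. roughly 3.7 GBytes of ARC just for the L2ARC directory
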
-- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS metadata and cloning filesystem layout across machines

2009-06-18 Thread Henrik Hjort

Hi Nikhil,

take a look at the output from 'zpool history'. You should get/see all
the information you need to be able to recreate your configuration.

http://docs.sun.com/app/docs/doc/819-5461/gdswe?a=view
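
Something along these lines should capture most of it (the pool name is
just an example):

  # record the command history plus the layout and locally-set properties
  zpool history -l tank > /var/tmp/tank-history.txt
  zfs list -r -o name,quota,reservation,mountpoint tank
  zfs get -r -s local all tank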

Cheers,
 Henrik

On Jun 18, 2009, at 8:47 PM, Nikhil wrote:


Hey ZFS experts,

Where is the ZFS metadata stored? Can it be viewed through some commands?

Here is my requirement: I have a machine with lots of ZFS filesystems 
on it under a couple of zpools, and there is another new machine with 
empty disks. What I want now is a similar layout of the pools and 
filesystems, with quota, reservation and other properties intact, along 
with the naming conventions, created on the new machine.

How do I create a schema, or reverse engineer the zfs/zpool commands 
required to create a similar layout of the ZFS filesystems on the new 
machine?
Is there an import facility or something like that?

Thanks.
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




Re: [zfs-discuss] 7110 questions

2009-06-18 Thread lawrence ho
We have a 7110 on try and buy program. 

We tried using the 7110 with XenServer 5 over iSCSI and NFS. Nothing seems to 
solve the slow write problem. Within the VM, we observed around 8MB/s on 
writes. Read performance is fantastic. Some troubleshooting was done with a local 
Sun rep. The conclusion is that the 7110 does not have a write cache in the form 
of an SSD or controller DRAM write cache. The solution from Sun is to buy 
StorageTek or a 7000-series model with an SSD write cache.

Adam, please advise if there are any fixes for the 7110. I am still shopping for a 
SAN and would rather buy a 7110 than a StorageTek or something else.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Lots of metadata overhead on filesystems with 100M files

2009-06-18 Thread Gary Mills
On Thu, Jun 18, 2009 at 12:12:16PM +0200, Cor Beumer - Storage Solution 
Architect wrote:
 
 What they noticed on the X4500 systems was that when the zpool became 
 filled to about 50-60%, the performance of the system
 dropped enormously.
 They claim this has to do with fragmentation of the ZFS 
 filesystem. So we tried putting in an S7410 system with 
 about the same disk config, 44x 1TB SATA but with 4x 18GB WriteZilla (in 
 a stripe); we were able to get much, much more I/O from that system 
 than the comparable X4500. However, they put it into production for a 
 couple of weeks, and as soon as the ZFS filesystem came into the range 
 of about 50-60% full they saw the same problem.

We had a similar problem with a T2000 and 2 TB of ZFS storage.  Once
the usage reached 1 TB, the write performance dropped considerably and
the CPU consumption increased.  Our problem was indirectly a result of
fragmentation, but it was solved by a ZFS patch.  I understand that
this patch, which fixes a whole bunch of ZFS bugs, should be released
soon.  I wonder if this was your problem.

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Server Cloning With ZFS?

2009-06-18 Thread Cindy . Swearingen

Hi Dave,

Until the ZFS/flash support integrates into an upcoming Solaris 10
release, I don't think we have an easy way to clone a root pool/dataset
from one system to another system, because system-specific info is still
maintained.

Your manual solution sounds plausible but probably won't work because of
the system-specific info.

Here are some options:

1. Wait for the ZFS/flash support in an upcoming Solaris 10 release.
You can track CR 6690473 for this support.

2. Review interim solutions that involve UFS to ZFS migration but might
give you some ideas:

http://blogs.sun.com/scottdickson/entry/flashless_system_cloning_with_zfs
http://blogs.sun.com/scottdickson/entry/a_much_better_way_to

3. Do an initial installation of your new server with a two-disk 
mirrored root pool. Set up a separate pool for data/applications. 
Snapshot data from the E450 and send/receive over to the data/app
pool on the new server.
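
For step 3, the data migration itself can be a simple send/receive over
ssh; a sketch with placeholder pool, snapshot, and host names:

  zfs snapshot -r datapool@migrate
  zfs send -R datapool@migrate | ssh newserver zfs receive -Fd datapool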

Cindy

Dave Ringkor wrote:

So I had an E450 running Solaris 8 with VxVM encapsulated root disk.  I 
upgraded it to Solaris 10 ZFS root using this method:

- Unencapsulate the root disk
- Remove VxVM components from the second disk
- Live Upgrade from 8 to 10 on the now-unused second disk
- Boot to the new Solaris 10 install
- Create a ZFS pool on the now-unused first disk
- Use Live Upgrade to migrate root filesystems to the ZFS pool
- Add the now-unused second disk to the ZFS pool as a mirror

Now my E450 is running Solaris 10 5/09 with ZFS root, and all the same users, 
software, and configuration that it had previously.  That is pretty slick in 
itself.  But the server itself is dog slow and more than half the disks are 
failing, and maybe I want to clone the server on new(er) hardware.

With ZFS, this should be a lot simpler than it used to be, right?  A new server has new hardware, new disks with different names and different sizes.  But that doesn't matter anymore.  There's a procedure in the ZFS manual to recover a corrupted server by using zfs receive to reinstall a copy of the boot environment into a newly created pool on the same server.  But what if I used zfs send to save a recursive snapshot of my root pool on the old server, booted my new server (with the same architecture) from the DVD in single user mode and created a ZFS pool on its local disks, and did zfs receive to install the boot environments there?  The filesystems don't care about the underlying disks.  The pool hides the disk specifics.  There's no vfstab to edit.  


Off the top of my head, all I can think of that would have to change is the network 
interfaces.  And that change is as simple as cd /etc ; mv hostname.hme0 hostname.qfe0 or 
whatever.  Is there anything else I'm not thinking of?


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?

2009-06-18 Thread Toby Thain


On 18-Jun-09, at 12:14 PM, Miles Nordin wrote:


bmm == Bogdan M Maryniuk bogdan.maryn...@gmail.com writes:
tt == Toby Thain t...@telegraphics.com.au writes:

...
tt /. is no person...



... you and I both know it's plausible
speculation that Apple delayed unleashing ZFS on their consumers
because of the lost pool problems.  ZFS doesn't suck, I do use it, I
hope and predict it will get better---so just back off and calm down
with the rotten fruit.  But neither who's saying it nor your not
wanting to hear it makes it less plausible.


In my opinion, a more plausible explanation is: Apple has not made  
ZFS integration a high priority [for 10.6].


There is no doubt Apple has the engineering resources to make it  
perfectly reliable as a component of Mac OS X, if that were a high  
priority goal.


I run OS X but I am not at all tempted to play with ZFS on it there;  
life is too short for betas. If I want ZFS I install Solaris 10.


--Toby



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




Re: [zfs-discuss] 7110 questions

2009-06-18 Thread Scott Meilicke
Both iSCSI and NFS are slow? I would expect NFS to be slow, but in my iSCSI 
testing with OpenSolaris 2008.11, performance was reasonable, about 2x NFS. 

Setup: Dell 2950 with a SAS HBA and SATA 3x5 raidz (15 disks, no separate ZIL), 
iSCSI using vmware ESXi 3.5 software initiator.

Scott
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 7110 questions

2009-06-18 Thread Erik Ableson
There's a configuration issue in there somewhere. I have a ZFS-based  
system serving some ESX servers that is working great, with a few  
exceptions.


First off, perf was awful, but there was some confusion on how to  
optimize network traffic on ESX, so I installed a fresh one using only  
the defaults, no jumbo frames, no EtherChannel, and I was able to push  
the ZFS server to wire-speed reads and writes over iSCSI. I still have  
the write problem over NFS though. I should be back in the datacenter  
tomorrow to see if it's specific to the ESX NFS client.


So my advice is to start looking at all of the tweaks that have been  
applied to the networking setup on the Xen side first.


Regards,

Erik Ableson

+33.6.80.83.58.28
Sent from my iPhone

On 18 June 2009, at 21:06, lawrence ho <no-re...@opensolaris.org> wrote:


We have a 7110 on try and buy program.

We tried using the 7110 with XEN Server 5 over iSCSI and NFS.  
Nothing seems to solve the slow write problem. Within the VM, we  
observed around 8MB/s on writes. Read performance is fantastic. Some  
troubleshooting was done with local SUN rep. The conclusion is that  
7110 does not have write cache in forms of SSD or controller DRAM  
write cache. The solution from SUN is to buy StorageTek or 7000  
series model with SSD write cache.


Adam, please advise if there any fixes for 7110. I am still shopping  
for SAN and would rather buy a 7100 than a StorageTek or something  
else.

--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 7110 questions

2009-06-18 Thread Nicholas Lee
With XenServer 4 and NFS you had to grow the disks (modified manually from
thin to fat) in order to get decent performance.


On Fri, Jun 19, 2009 at 7:06 AM, lawrence ho <no-re...@opensolaris.org> wrote:

 We have a 7110 on try and buy program.

 We tried using the 7110 with XEN Server 5 over iSCSI and NFS. Nothing seems
 to solve the slow write problem. Within the VM, we observed around 8MB/s on
 writes. Read performance is fantastic. Some troubleshooting was done with
 local SUN rep. The conclusion is that 7110 does not have write cache in
 forms of SSD or controller DRAM write cache. The solution from SUN is to buy
 StorageTek or 7000 series model with SSD write cache.

 Adam, please advise if there any fixes for 7110. I am still shopping for
 SAN and would rather buy a 7100 than a StorageTek or something else.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] 7110 questions

2009-06-18 Thread Adam Leventhal
Hey Lawrence,

Make sure you're running the latest software update. Note that this forum
is not the appropriate place to discuss support issues. Please contact your
official Sun support channel.

Adam

On Thu, Jun 18, 2009 at 12:06:02PM -0700, lawrence ho wrote:
 We have a 7110 on try and buy program. 
 
 We tried using the 7110 with XEN Server 5 over iSCSI and NFS. Nothing seems 
 to solve the slow write problem. Within the VM, we observed around 8MB/s on 
 writes. Read performance is fantastic. Some troubleshooting was done with 
 local SUN rep. The conclusion is that 7110 does not have write cache in forms 
 of SSD or controller DRAM write cache. The solution from SUN is to buy 
 StorageTek or 7000 series model with SSD write cache.
 
 Adam, please advise if there any fixes for 7110. I am still shopping for SAN 
 and would rather buy a 7100 than a StorageTek or something else.
 -- 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
Adam Leventhal, Fishworks http://blogs.sun.com/ahl
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Lots of metadata overhead on filesystems with 100M files

2009-06-18 Thread Richard Elling

Gary Mills wrote:

On Thu, Jun 18, 2009 at 12:12:16PM +0200, Cor Beumer - Storage Solution 
Architect wrote:
  
What they noticed on the X4500 systems was that when the zpool became 
filled to about 50-60%, the performance of the system
dropped enormously.
They claim this has to do with fragmentation of the ZFS 
filesystem. So we tried putting in an S7410 system with 
about the same disk config, 44x 1TB SATA but with 4x 18GB WriteZilla (in 
a stripe); we were able to get much, much more I/O from that system 
than the comparable X4500. However, they put it into production for a 
couple of weeks, and as soon as the ZFS filesystem came into the range 
of about 50-60% full they saw the same problem.



We had a similar problem with a T2000 and 2 TB of ZFS storage.  Once
the usage reached 1 TB, the write performance dropped considerably and
the CPU consumption increased.  Our problem was indirectly a result of
fragmentation, but it was solved by a ZFS patch.  I understand that
this patch, which fixes a whole bunch of ZFS bugs, should be released
soon.  I wonder if this was your problem.
  


George would probably have the latest info, but there were a number of
things which circled around the notorious 'Stop looking and start ganging'
bug report,
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6596237
-- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs on 32 bit?

2009-06-18 Thread Miles Nordin
 cd == Casper Dik casper@sun.com writes:

  yeah.  many of those ARM systems will be low-power
  builtin-crypto-accel builtin-gigabit-MAC based on Orion and
  similar, NAS (NSLU2-ish) things begging for ZFS.

cd So what's the boot environment they use?

i think it is called U-Boot:

 http://forum.openwrt.org/viewtopic.php?pid=60387


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?

2009-06-18 Thread Sean Sprague

Toby,



On 17-Jun-09, at 7:37 AM, Orvar Korvar wrote:

Ok, so you mean the comments are mostly FUD and bull shit? Because 
there are no bug reports from the whiners? Could this be the case? It 
is mostly FUD? Hmmm...?




Having read the thread, I would say without a doubt.

Slashdot was never the place to go for accurate information about ZFS. 


Many would even say:

Slashdot was never the place to go for accurate information.
Slashdot was never the place to go for information.
Slashdot was never the place to go.
Slashdot? Never.

Take your pick ;-)

Regards... Sean.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs on 32 bit?

2009-06-18 Thread Fajar A. Nugraha
On Thu, Jun 18, 2009 at 4:28 AM, Miles Nordin <car...@ivy.net> wrote:
   djm http://opensolaris.org/os/project/osarm/

 yeah.  many of those ARM systems will be low-power
 builtin-crypto-accel builtin-gigabit-MAC based on Orion and similar,
 NAS (NSLU2-ish) things begging for ZFS.

Are they feasible targets for zfs?

The N610N that I have (BCM3302, 300MHz, 64MB) isn't even powerful
enough to saturate either the gigabit wired or 802.11n wireless. It
only goes about 25Mbps.

Last time I tested on an Eee PC 2G's Celeron, ZFS was slow to the point of
being unusable. Will it be usable enough on most ARMs?

-- 
Fajar
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs on 32 bit?

2009-06-18 Thread Erik Trimble

Fajar A. Nugraha wrote:

On Thu, Jun 18, 2009 at 4:28 AM, Miles Nordin <car...@ivy.net> wrote:

  djm http://opensolaris.org/os/project/osarm/

yeah.  many of those ARM systems will be low-power
builtin-crypto-accel builtin-gigabit-MAC based on Orion and similar,
NAS (NSLU2-ish) things begging for ZFS.


Are they feasible targets for zfs?

The N610N that I have (BCM3302, 300MHz, 64MB) isn't even powerful
enough to saturate either the gigabit wired or 802.11n wireless. It
only goes about 25Mbps.

Last time I tested on an Eee PC 2G's Celeron, ZFS was slow to the point of
being unusable. Will it be usable enough on most ARMs?

Well, given that ARM processors use a completely different ISA (i.e. 
they're not x86-compatible), OpenSolaris won't run on them currently.


If you'd like to do the port

wink

I can't say as to the entire Atom line of stuff, but I've found the 
Atoms are OK for desktop use, and not anywhere powerful enough for even 
a basic NAS server.  The demands of wire-speed Gigabit, ZFS, and 
encryption/compression are hard on the little Atom guys. Plus, it seems 
to be hard to find an Atom motherboard which supports more than 2GB of 
RAM, which is a serious problem.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs on 32 bit?

2009-06-18 Thread Bogdan M. Maryniuk
On Fri, Jun 19, 2009 at 11:16 AM, Erik Trimble <erik.trim...@sun.com> wrote:
 I can't say as to the entire Atom line of stuff, but I've found the Atoms
 are OK for desktop use, and not anywhere powerful enough for even a basic
 NAS server.  The demands of wire-speed Gigabit, ZFS, and
 encryption/compression are hard on the little Atom guys.

+1. I wanted to skip it, but will reply.

I have two Asus EeePC Box 202 / 2GB machines. These are running numerous zones
(snv_111b) for me with various services on them, and they are still very
usable and fast enough. Additionally, I overclocked each up to
1.75GHz, made some corrections to Solaris's TCP/IP stack, removed some
unnecessary services, and they are just fine.

 Plus, it seems to be hard to find an Atom motherboard which supports
 more than 2GB of RAM, which is a serious problem.

Well, let's not forget that the Atom is also the smallest low-power
processor and is designed for cheap and small nettops/netbooks that
don't ever need 4GB of RAM. Despite that:
http://www.mini-itx.com/store/?c=53

-- 
Kind regards, BM

Things, that are stupid at the beginning, rarely ends up wisely.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss