Re: [zfs-discuss] SPARC SATA, please.

2009-06-26 Thread Volker A. Brandt
> The MCP55 is the chipset currently in use in the Sun X2200 M2 series of
> servers.

... which has big problems with certain Samsung SATA disks. :-(

So if you get such a board, be sure to avoid Samsung 750GB and
1TB disks.  Samsung never acknowledged the bug, nor have they released
a firmware update.  And nVidia never said anything about it either.
Of course I only found out about it after buying lots of Samsung disks
for our X2200s.

Sigh...


Regards -- Volker
--

Volker A. Brandt  Consulting and Support for Sun Solaris
Brandt & Brandt Computer GmbH   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim Email: v...@bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513  Schuhgröße: 45
Geschäftsführer: Rainer J. H. Brandt und Volker A. Brandt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SPARC SATA, please.

2009-06-26 Thread Erik Trimble

Volker A. Brandt wrote:

The MCP55 is the chipset currently in use in the Sun X2200 M2 series of
servers.



... which has big problems with certain Samsung SATA disks. :-(

So if you get such a board, be sure to avoid Samsung 750GB and
1TB disks.  Samsung never acknowledged the bug, nor have they released
a firmware update.  And nVidia never said anything about it either.
Of course I only found out about it after buying lots of Samsung disks
for our X2200s.

Sigh...


Regards -- Volker
--

Volker A. Brandt  Consulting and Support for Sun Solaris
Brandt & Brandt Computer GmbH   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim Email: v...@bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513  Schuhgröße: 45
Geschäftsführer: Rainer J. H. Brandt und Volker A. Brandt
  



That is true, and it slipped my mind. Thanks for reminding me, Volker.

I'm a Hitachi disk user myself, and they work swell. The Seagates I have 
in my X2200 M2 seem to work fine, as well.



I've not tried any SSDs yet with the MCP55 - since they're heavily 
Samsung under the hood (regardless of whose name is on the outside), I 
_hope_ it was just a HD-specific firmware bug.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SPARC SATA, please.

2009-06-26 Thread Volker A. Brandt
> > So if you get such a board, be sure to avoid Samsung 750GB and
> > 1TB disks.  Samsung never acknowledged the bug, nor have they released
> > a firmware update.  And nVidia never said anything about it either.

[...]

> I'm a Hitachi disk user myself, and they work swell. The Seagates I have
> in my X2200 M2 seem to work fine, as well.

Yes, all HGST disks I've tried so far work just fine.

> I've not tried any SSDs yet with the MCP55 - since they're heavily
> Samsung under the hood (regardless of whose name is on the outside), I
> _hope_ it was just a HD-specific firmware bug.

I think it is quite HD-specific.  I have another, slightly older,
160GB Samsung disk that worked fine as the root disk in the X2200 M2.
If you do try an SSD please let us know. :-)


Regards -- Volker
-- 

Volker A. Brandt  Consulting and Support for Sun Solaris
Brandt & Brandt Computer GmbH   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim Email: v...@bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513  Schuhgröße: 45
Geschäftsführer: Rainer J. H. Brandt und Volker A. Brandt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Backing up OS drive?

2009-06-26 Thread Tertius Lydgate
I have one drive that I'm running OpenSolaris on and a 6-drive RAIDZ. 
Unfortunately I don't have another drive to mirror the OS drive, so I was 
wondering what the best way to back up that drive is. Can I mirror it onto a 
file on the RAIDZ, or will this cause problems before the array is loaded when 
booting? What about zfs send and recv to the RAIDZ?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Backing up OS drive?

2009-06-26 Thread Cindy . Swearingen

Hi Tertius,

I think you are saying that you have an OpenSolaris system with a 
one-disk root pool and a 6-way RAIDZ non-root pool.


You could create root pool snapshots and send them over to the non-root
pool or to a pool on another system. Then, consider purchasing another 
disk for a mirrored root pool configuration.
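
For example, a minimal sketch of the snapshot-and-send approach (pool, dataset,
and snapshot names are placeholders, and the receive options may need adjusting
for your setup):

zfs create tank/rpool-backup
zfs snapshot -r rpool@backup-20090626

# replicate the root pool into a backup filesystem on the RAIDZ pool
# (-u avoids mounting the received filesystems over the live ones)
zfs send -R rpool@backup-20090626 | zfs receive -Fdu tank/rpool-backup

# or stream it to a pool on another system
zfs send -R rpool@backup-20090626 | ssh otherhost zfs receive -Fdu backuppool/rpool-backup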


The root pool snapshot recovery process is described here:

http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide#ZFS_Root_Pool_Recovery

Cindy

Tertius Lydgate wrote:

I have one drive that I'm running OpenSolaris on and a 6-drive RAIDZ. 
Unfortunately I don't have another drive to mirror the OS drive, so I was 
wondering what the best way to back up that drive is. Can I mirror it onto a 
file on the RAIDZ, or will this cause problems before the array is loaded when 
booting? What about zfs send and recv to the RAIDZ?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is the PROPERTY compression will increase the ZFS I/O throughput?

2009-06-26 Thread Chookiex
>Do you mean that it would be faster to read compressed data than uncompressed 
>data, or it would be faster to read compressed data than to write it?

Yes, because reading needs much less CPU time, and the I/O is the same as
writing.

Did you test it in other environments? For example, with more server memory, or
with more clients?
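
(For anyone who wants to try this themselves, here is a minimal sketch of the
kind of comparison being discussed. Pool and path names are placeholders, and
the working set must be larger than the ARC for the timings to mean anything:)

zfs create -o compression=off  tank/plain
zfs create -o compression=lzjb tank/lzjb

# copy the same (compressible) working set into both datasets
cp -r /path/to/testdata/* /tank/plain/
cp -r /path/to/testdata/* /tank/lzjb/
zfs get compressratio tank/lzjb

# export/import to drop the pool's cached data, then time streaming reads
zpool export tank && zpool import tank
ptime sh -c 'find /tank/plain -type f | xargs cat > /dev/null'
ptime sh -c 'find /tank/lzjb  -type f | xargs cat > /dev/null'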




From: David Pacheco 
To: Chookiex 
Cc: zfs-discuss@opensolaris.org
Sent: Friday, June 26, 2009 1:00:36 AM
Subject: Re: [zfs-discuss] Is the PROPERTY compression will increase the ZFS 
I/O throughput?

Chookiex wrote:
> thank you ;)
> I mean: would it be faster to read compressed data IF writing with
> compression is faster than writing uncompressed? Just like lzjb.


Do you mean that it would be faster to read compressed data than uncompressed 
data, or it would be faster to read compressed data than to write it?


> But I can't understand why read performance is generally unaffected by
> compression. Decompression (lzjb, gzip) is algorithmically faster than
> compression, so I think reading compressed data should need much less
> CPU time.
> 
> So I don't agree with the blog's conclusion that "read performance is
> generally unaffected by compression". Unless the ARC cached the data in the
> read test and there was no random-read test?


My comment was just an empirical observation: in my experiments, read time was 
basically unaffected. I don't believe this was a result of ARC caching because 
I constructed the experiments to avoid that altogether by using working sets 
larger than the ARC and streaming through the data.

In my case the system's read bandwidth wasn't a performance limiter. We know 
this because the write bandwidth was much higher (see the graphs), and we were 
writing twice as much data as we were reading (because we were mirroring). So 
even if compression was decreasing the amount of I/O that was done on the read 
side, other factors (possibly the number of clients) limited the bandwidth we 
could achieve before we got to a point where compression would have made any 
difference.

-- Dave


> My data is text data set, about 320,000 text files or emails. The compression 
> ratio is:
> lzjb 1.55x
> gzip-1 2.54x
> gzip-2 2.58x
> gzip 2.72x
> gzip-9 2.73x
> 
> for your curiosity :)
> 
> 
> 
> *From:* David Pacheco 
> *To:* Chookiex 
> *Cc:* zfs-discuss@opensolaris.org
> *Sent:* Thursday, June 25, 2009 2:00:49 AM
> *Subject:* Re: [zfs-discuss] Is the PROPERTY compression will increase the 
> ZFS I/O throughput?
> 
> Chookiex wrote:
>  > Thank you for your reply.
>  > I had read the blog. The most interesting thing is WHY there is no
>  > performance improvement when any compression is set.
> 
> There are many potential reasons, so I'd first try to identify what your 
> current bandwidth limiter is. If you're running out of CPU on your current 
> workload, for example, adding compression is not going to help performance. 
> If this is over a network, you could be saturating the link. Or you might not 
> have enough threads to drive the system to bandwidth.
> 
> Compression will only help performance if you've got plenty of CPU and other 
> resources but you're out of disk bandwidth. But even if that's the case, it's 
> possible that compression doesn't save enough space that you actually 
> decrease the number of disk I/Os that need to be done.
> 
>  > Compressed reads do less I/O than uncompressed reads, and decompression is
>  > faster than compression.
> 
> Out of curiosity, what's the compression ratio?
> 
> -- Dave
> 
>  > So if lzjb writes are better than uncompressed writes, would lzjb reads be
>  > better than writes?
>  >  Do the ARC or L2ARC do any tricks?
>  >  Thanks
>  >
>  > 
>  > *From:* David Pacheco
>  > *To:* Chookiex <hexcoo...@yahoo.com>
>  > *Cc:* zfs-discuss@opensolaris.org
>  > *Sent:* Wednesday, June 24, 2009 4:53:37 AM
>  > *Subject:* Re: [zfs-discuss] Is the PROPERTY compression will increase the
>  > ZFS I/O throughput?
>  >
>  > Chookiex wrote:
>  >  > Hi all.
>  >  >
>  >  > Because the compression property decreases file size, the file I/O will
>  >  > be decreased as well.
>  >  > So, would compression increase ZFS I/O throughput?
>  >  >
>  >  > For example:
>  >  > I turn on gzip-9 on a server with 2 x quad-core Xeon and 8GB RAM.
>  >  > It compresses my files with a compressratio of 2.5x+. Could that help?
>  >  > Or I turn on lzjb, about 1.5x with the same files.
>  >
>  > It's possible, but it depends on a lot of factors, including what your
>  > bottleneck is to begin with, how compressible your data is, and how hard
>  > you want the system to work compressing it. With gzip-9, I'd be shocked if
>  > you saw bandwidth improved. It seems more common with lzjb:
>  >
>  > h

Re: [zfs-discuss] zfs on 32 bit?

2009-06-26 Thread Scott Laird
It's actually worse than that--it's not just "recent CPUs" without VT
support.  Very few of Intel's current low-price processors, including
the Q8xxx quad-core desktop chips, have VT support.

On Wed, Jun 24, 2009 at 12:09 PM, roland wrote:
>>Dennis is correct in that there are significant areas where 32-bit
>>systems will remain the norm for some time to come.
>
> Think of the hundreds of thousands of VMware ESX/Workstation/Player/Server
> installations on non-VT-capable CPUs. Even if the CPU has 64-bit capability,
> a VM cannot run in 64-bit mode if the CPU is missing VT support. And VT hasn't
> been available for very long, and there are still even recent CPUs which don't
> have VT support.
> --
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] BugID formally known as 6746456

2009-06-26 Thread Rob Healey
This appears to be the fix related to ACLs, under which they seem to lump all of
the ASSERT panics in zfs_fuid.c, even the ones that have nothing to do with
ACLs; my case being one of those.

Thanks for the pointer though!

-Rob
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS for iSCSI based SAN

2009-06-26 Thread Scott Meilicke
I ran the RealLife iometer profile on NFS based storage (vs. SW iSCSI), and got 
nearly identical results to having the disks on iSCSI:

iSCSI
IOPS: 1003.8
MB/s: 7.8
Avg Latency (s): 27.9

NFS
IOPS: 1005.9
MB/s: 7.9
Avg Latency (s): 29.7

Interesting!

Here is how the pool was behaving during the testing. Again this is NFS backed 
storage:

data01       122G  20.3T    166     63  2.80M  4.49M
data01       122G  20.3T    145     59  2.28M  3.35M
data01       122G  20.3T    168     58  2.89M  4.38M
data01       122G  20.3T    169     59  2.79M  3.69M
data01       122G  20.3T     54    935   856K  18.1M
data01       122G  20.3T      9  7.96K   183K   134M
data01       122G  20.3T     49  3.82K   900K  61.8M
data01       122G  20.3T    160     61  2.73M  4.23M
data01       122G  20.3T    166     63  2.62M  4.01M
data01       122G  20.3T    162     64  2.55M  4.24M
data01       122G  20.3T    163     61  2.63M  4.14M
data01       122G  20.3T    145     54  2.37M  3.89M
data01       122G  20.3T    163     63  2.69M  4.35M
data01       122G  20.3T    171     64  2.80M  3.97M
data01       122G  20.3T    153     67  2.68M  4.65M
data01       122G  20.3T    164     66  2.63M  4.10M
data01       122G  20.3T    171     66  2.75M  4.51M
data01       122G  20.3T    175     53  3.02M  3.83M
data01       122G  20.3T    157     59  2.64M  3.80M
data01       122G  20.3T    172     59  2.85M  4.11M
data01       122G  20.3T    173     68  2.99M  4.11M
data01       122G  20.3T     97     35  1.66M  2.61M
data01       122G  20.3T    170     58  2.87M  3.62M
data01       122G  20.3T    160     64  2.72M  4.17M
data01       122G  20.3T    163     63  2.68M  3.77M
data01       122G  20.3T    160     60  2.67M  4.29M
data01       122G  20.3T    165     65  2.66M  4.05M
data01       122G  20.3T    191     59  3.25M  3.97M
data01       122G  20.3T    159     65  2.76M  4.18M
data01       122G  20.3T    154     52  2.64M  3.50M
data01       122G  20.3T    164     61  2.76M  4.38M
data01       122G  20.3T    154     62  2.66M  4.08M
data01       122G  20.3T    160     58  2.71M  3.95M
data01       122G  20.3T     84     34  1.48M  2.37M
data01       122G  20.3T      9  7.27K   156K   125M
data01       122G  20.3T     25  5.20K   422K  84.3M
data01       122G  20.3T    170     60  2.77M  3.64M
data01       122G  20.3T    170     63  2.85M  3.85M
 
So it appears NFS is doing syncs, while iSCSI is not (see my earlier zpool
iostat data for iSCSI). Isn't this what we would expect, since NFS does syncs
while iSCSI (presumably) does not?

-Scott
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] slow ls or slow zfs

2009-06-26 Thread NightBird
Hello,
We have a server with a couple of raidz2 pools, each with 23x1TB disks. This
gives us 19TB of usable space on each pool. The server has 2 x quad-core CPUs
and 16GB RAM, and is running b117. Average load is 4 and we use a lot of CIFS.

We notice ZFS is slow. Even a simple 'ls -al' can take 20sec.  After trying 
again, it's cached and therefore quick. We also noticed 'ls' is relatively 
quick (~3secs).

How can I improve the response time? 
How do I determine how much memory I need for ZFS caching? 

Here are some stats:
> ::arc
hits  =  44025797
misses=   8452650
[..]
p = 10646 MB
c = 11712 MB
c_min =  1918 MB
c_max = 15350 MB
size  = 11712 MB
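
For what it's worth, the same counters can also be sampled over time with kstat
(a rough sketch; the statistic names come from the stock arcstats kstat module
and the interval is arbitrary):

# ARC size, target size, and hit/miss counters every 5 seconds
kstat -p zfs:0:arcstats:size zfs:0:arcstats:c zfs:0:arcstats:hits zfs:0:arcstats:misses 5

# or dump the whole arcstats module once
kstat -m zfs -n arcstats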
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread Eric D. Mudama

On Fri, Jun 26 at 15:18, NightBird wrote:

Hello,

We have a server with a couple of raidz2 pools, each with 23x1TB
disks. This gives us 19TB of usable space on each pool. The server
has 2 x quad-core CPUs and 16GB RAM, and is running b117. Average load
is 4 and we use a lot of CIFS.

We notice ZFS is slow. Even a simple 'ls -al' can take 20sec.  After
trying again, it's cached and therefore quick. We also noticed 'ls'
is relatively quick (~3secs).


As I understand it, each vdev gets roughly the performance of a single
raw disk, with slight performance penalties once you exceed a certain
size of 6-8 disks in a single vdev.

--eric

--
Eric D. Mudama
edmud...@mail.bounceswoosh.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread Scott Meilicke
Hi,

When you have a lot of random read/writes, raidz/raidz2 can be fairly slow.
http://blogs.sun.com/roch/entry/when_to_and_not_to

The recommendation is to break the disks into smaller raidz/raidz2 stripes,
thereby improving I/O.

From the ZFS Best Practices Guide:
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#RAID-Z_Configuration_Requirements_and_Recommendations

"The recommended number of disks per group is between 3 and 9. If you have more 
disks, use multiple groups."
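
As a concrete (purely illustrative) sketch, the same 23 disks could be laid out
as three 7-disk raidz2 vdevs plus two hot spares instead of one wide vdev,
giving three independent vdevs worth of random I/O (device names are
placeholders):

zpool create tank \
  raidz2 c9t8d0  c9t9d0  c9t10d0 c9t11d0 c9t12d0 c9t13d0 c9t14d0 \
  raidz2 c9t15d0 c9t16d0 c9t17d0 c9t18d0 c9t19d0 c9t20d0 c9t21d0 \
  raidz2 c9t22d0 c9t23d0 c9t24d0 c9t25d0 c9t26d0 c9t27d0 c9t28d0 \
  spare  c9t29d0 c9t30d0

The tradeoff is capacity: three raidz2 vdevs spend six disks on parity instead
of two.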

-Scott
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread NightBird
Hi Scott,

Why do you assume there is an I/O problem?
I know my setup is unusual because of the large pool size. However, I have not
seen any evidence that this is a problem for my workload.
prstat does not show any I/O wait.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread Ian Collins

NightBird wrote:


[please keep enough context so your post makes sense to the mailing list]


Hi Scott,

Why do you assume there is an I/O problem?
I know my setup is unusual because of the large pool size. However, I have not
seen any evidence that this is a problem for my workload.
prstat does not show any I/O wait.
  
The pool size isn't the issue, it's the large number of disks in each 
vdev.  Read the suggested best practice links.


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread NightBird
Thanks Ian.
I read the best practices and understand the I/O limitation I have created for
this vdev. My system is built to maximize capacity using large stripes, not
performance.
All the tools I have used show no I/O problems.
I think the problem is memory, but I am unsure how to troubleshoot it.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread NightBird
[Adding context]

>> Hi Scott,
>>
>> Why do you assume there is an I/O problem?
>> I know my setup is unusual because of the large pool size. However, I have
>> not seen any evidence that this is a problem for my workload.
>> prstat does not show any I/O wait.
>
> The pool size isn't the issue, it's the large number of disks in each vdev.
> Read the suggested best practice links.


Thanks Ian.
I read the best practices and understand the I/O limitation I have created for
this vdev. My system is built to maximize capacity using large stripes, not
performance.
All the tools I have used show no I/O problems.
I think the problem is memory, but I am unsure how to troubleshoot it.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread Ian Collins

NightBird wrote:

Hello,
We have a server with a couple of raidz2 pools, each with 23x1TB disks. This
gives us 19TB of usable space on each pool. The server has 2 x quad-core CPUs
and 16GB RAM, and is running b117. Average load is 4 and we use a lot of CIFS.

We notice ZFS is slow. Even a simple 'ls -al' can take 20sec.  After trying 
again, it's cached and therefore quick. We also noticed 'ls' is relatively 
quick (~3secs).

  

Going back to your original question

What is the data?  Is it owned by one or many users?  If the latter, the 
problem could be the time taken to fetch all the name service data.


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread NightBird
>NightBird wrote:
>> Hello,
>> We have a server with a couple of raidz2 pools, each with 23x1TB disks. This
>> gives us 19TB of usable space on each pool. The server has 2 x quad-core
>> CPUs and 16GB RAM, and is running b117. Average load is 4 and we use a lot
>> of CIFS.
>>
>> We notice ZFS is slow. Even a simple 'ls -al' can take 20sec.  After trying
>> again, it's cached and therefore quick. We also noticed 'ls' is relatively
>> quick (~3secs).
>>
> Going back to your original question
>
> What is the data?  Is it owned by one or many users?  If the latter, the
> problem could be the time taken to fetch all the name service data.
>
>--
>Ian.

The data is compressed files between 300KB and 2,000KB.
This is in an AD environment with 20+ servers, all running under a single AD
account, so the OpenSolaris server sees one owner. We also have 3 or 4 idmap
group mappings.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread William D. Hathaway
As others have mentioned, it would be easier to take a stab at this if there is 
some more data to look at.

Have you done any ZFS tuning?  If so, please provide the /etc/system, adb, zfs 
etc info.

Can you provide zpool status output?

As far as checking ls performance, just to remove name service lookups from the
possibilities, let's use the '-n' option instead of '-l'. I know you mentioned
it was unlikely to be a problem, but the fewer variables the better.


Can you characterize what your 'ls -an' output looks like?  Is it 100 files or
100,000?

How about some sample output like:
for run in  1 2 3 4
do
  echo run $run
  truss -c ls -an | wc -l
  echo ""
  echo
done
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS for iSCSI based SAN

2009-06-26 Thread Bob Friesenhahn

On Fri, 26 Jun 2009, Scott Meilicke wrote:

I ran the RealLife iometer profile on NFS based storage (vs. SW 
iSCSI), and got nearly identical results to having the disks on 
iSCSI:


Both of them are using TCP to access the server.

So it appears NFS is doing syncs, while iSCSI is not (See my earlier 
zpool iostat data for iSCSI). Isn't this what we expect, because NFS 
does syncs, while iSCSI does not (assumed)?


If iSCSI does not do syncs (presumably it should when a cache flush is 
requested) then NFS is safer in case the server crashes and reboots.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread Richard Elling

NightBird wrote:

Thanks Ian.
I read the best practices and undestand the IO limitation I have created for 
this vdev. My system is a built for maximize capacity using large stripes, not 
performance.
All the tools I have used show no IO problems. 
I think the problem is memory but I am unsure on how to troubleshoot it.
  


Look for latency, not bandwidth.  iostat will show latency at the
device level.

Other things that affect ls -la are name services and locale. Name services
because the user ids are numbers and are converted to user names via
the name service (these are cached in the name services cache daemon,
so you can look at the nscd hit rates with "nscd -g").  The locale matters
because the output is sorted, which is slower for locales which use unicode.
This implies that the more entries in the directory, and the longer the names
are with more common prefixes, the longer it takes to sort.  I expect case
insensitive sorts (common for CIFS environments) also take longer to
sort.  You could sort by a number instead, try "ls -c" or "ls -S"

ls looks at metadata, which is compressed and typically takes little space.
But it is also cached, which you can see by looking at the total name
lookups in "vmstat -s"
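
A rough sketch of how to gather those data points (standard Solaris tools,
nothing pool-specific assumed):

# per-device latency: watch the asvc_t column for one disk that stands out
iostat -xn 5

# name-service cache statistics, including hit rates
nscd -g

# DNLC / name-lookup hit rate
vmstat -s | grep 'name lookups'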

As others have pointed out, I think you will find that a 23-wide raidz,
raidz2, raid-5, or raid-6 configuration is not a recipe for performance.
-- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread Bob Friesenhahn

On Fri, 26 Jun 2009, NightBird wrote:


Thanks Ian.
I read the best practices and understand the I/O limitation I have created for
this vdev. My system is built to maximize capacity using large stripes, not
performance.
All the tools I have used show no I/O problems.
I think the problem is memory, but I am unsure how to troubleshoot it.


Perhaps someone else has answered your question.  The problem is not a 
shortage of I/O.  The problem is that raidz and raidz2 do synchronized 
reads and writes (in a stripe) and so all of the disks in the stripe 
need to respond before the read or write can return.  If one disk is a 
bit slower than the rest, then everything will be slower.  With so 
many disks, the deck is stacked against you.  Raidz cannot use an 
infinite stripe size, so the number of disks used for any given I/O is 
not all of the disks in the vdev, which may make finding the slow disk 
a bit more difficult.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread NightBird
> As others have mentioned, it would be easier to take a stab at this if there
> is some more data to look at.
>
> Have you done any ZFS tuning?  If so, please provide the /etc/system, adb, zfs
> etc info.
>
> Can you provide zpool status output?
>
> As far as checking ls performance, just to remove name service lookups from
> the possibilities, let's use the '-n' option instead of '-l'. I know you
> mentioned it was unlikely to be a problem, but the fewer variables the better.
>
> Can you characterize what your 'ls -an' output looks like?  Is it 100 files
> or 100,000?
>
> How about some sample output like:
> for run in  1 2 3 4
> do
>   echo run $run
>   truss -c ls -an | wc -l
>   echo ""
>   echo
> done
>
>

zfs tuning: /etc/system
set swapfs_minfree=0x2
set zfs:zfs_txg_synctime=1

Here is the output  (200 files in that folder):
truss -c ls -an | wc -l
 203

syscall               seconds   calls  errors
_exit                    .000       1
read                     .000       1
write                    .000       3
open                     .000       8       3
close                    .000       6
time                     .004     610
brk                      .000      12
getpid                   .000       1
sysi86                   .000       1
ioctl                    .000       2       2
execve                   .000       1
fcntl                    .000       1
openat                   .000       1
getcontext               .000       1
setustack                .000       1
pathconf                 .003     203
mmap                     .000       7
mmapobj                  .000       4
getrlimit                .000       1
memcntl                  .000       6
sysconfig                .000       2
lwp_private              .000       1
acl                      .006     406
resolvepath              .000       6
getdents64               .315       2
stat64                   .003     209       1
lstat64                  .375     203
fstat64                  .000       4
                     --------   -----  ------
sys totals:              .710    1704       6
usr time:                .008
elapsed:               32.420

# zpool status
  pool: pool001
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
pool will no longer be accessible on older software versions.
 scrub: none requested
config:

NAME STATE READ WRITE CKSUM
pool001  ONLINE   0 0 0
  raidz2 ONLINE   0 0 0
c9t19d0  ONLINE   0 0 0
c9t18d0  ONLINE   0 0 0
c9t17d0  ONLINE   0 0 0
c9t13d0  ONLINE   0 0 0
c9t15d0  ONLINE   0 0 0
c9t16d0  ONLINE   0 0 0
c9t11d0  ONLINE   0 0 0
c9t12d0  ONLINE   0 0 0
c9t14d0  ONLINE   0 0 0
c9t9d0   ONLINE   0 0 0
c9t8d0   ONLINE   0 0 0
c9t10d0  ONLINE   0 0 0
c9t30d0  ONLINE   0 0 0
c9t29d0  ONLINE   0 0 0
c9t28d0  ONLINE   0 0 0
c9t24d0  ONLINE   0 0 0
c9t26d0  ONLINE   0 0 0
c9t27d0  ONLINE   0 0 0
c9t22d0  ONLINE   0 0 0
c9t23d0  ONLINE   0 0 0
c9t25d0  ONLINE   0 0 0
c9t20d0  ONLINE   0 0 0
c9t21d0  ONLINE   0 0 0
spares
  c8t3d0 AVAIL
  c8t2d0 AVAIL

errors: No known data errors

[...]

We are running the zpool version that came with b111b for now and have not 
decided if we want to upgrade to the version that comes with b117.

## vmstat -s
0 swap ins
0 swap outs
0 pages swapped in
0 pages swapped out
   729488 total address trans. faults taken
2 page ins
0 page outs
2 pages paged in
0 pages paged out
   264171 total reclaims
   264171 reclaims from free list
0 micro (hat) faults
   729488 minor (as) faults
2 major faults
   147603 copy-on-write faults
   219098 zero fill page faults
   589820 pages examined by the clock daemon
0 revolutions of the clock hand
0 pages freed by the clock daemon
 1765 forks
  703 vforks
 2478 execs
2267309301 cpu context switches
671340139 device interrupts
  1036384 traps
 28907635 system calls
  2524120 total name lookups (cache hits 92%)
 8836 user   cpu
 11632644 system cpu
 15777851 idle   cpu
0 wait   cpu
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Re: [zfs-discuss] ZFS for iSCSI based SAN

2009-06-26 Thread Brent Jones
On Fri, Jun 26, 2009 at 6:04 PM, Bob Friesenhahn wrote:
> On Fri, 26 Jun 2009, Scott Meilicke wrote:
>
>> I ran the RealLife iometer profile on NFS based storage (vs. SW iSCSI),
>> and got nearly identical results to having the disks on iSCSI:
>
> Both of them are using TCP to access the server.
>
>> So it appears NFS is doing syncs, while iSCSI is not (See my earlier zpool
>> iostat data for iSCSI). Isn't this what we expect, because NFS does syncs,
>> while iSCSI does not (assumed)?
>
> If iSCSI does not do syncs (presumably it should when a cache flush is
> requested) then NFS is safer in case the server crashes and reboots.
>
> Bob
> --
> Bob Friesenhahn
> bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>

I'll chime in here as I've had experience with this subject as well
(ZFS NFS/iSCSI).
It depends on your NFS client!

I was using the FreeBSD NFSv3 client, which by default does an fsync()
for every NFS block (8KB afaik).
However, I changed the source and recompiled so it would only fsync()
on file close or, I believe, after 5MB. I went from 3MB/sec to over
100MB/sec after my change.
I detailed my struggle here:

http://www.brentrjones.com/?p=29

As for iSCSI, I am currently benchmarking the COMSTAR iSCSI target. I
previously used the old iscsitgtd framework with ZFS, with which I
would get about 35-40MB/sec.
My initial testing with the new COMSTAR iSCSI target is not revealing
any substantial performance increase at all.
I've tried zvol-based LUs and file-based LUs with no perceived
performance difference at all.
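
(For anyone following along, a minimal sketch of how a zvol-backed COMSTAR LU
gets exported; the volume size and names are placeholders, and the stmf and
iscsi/target services must already be enabled:)

zfs create -V 100G tank/iscsivol01
sbdadm create-lu /dev/zvol/rdsk/tank/iscsivol01
stmfadm add-view <GUID printed by sbdadm>
itadm create-target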

The iSCSI target is an X4540, 64GB RAM, and 48x 1TB disks configured
with 8 vdevs with 5-6 disks each. No SSD, ZIL enabled.

My NFS performance is now over 100MB/sec, and I can get over 100MB/sec
with CIFS as well. However, my iSCSI performance is still rather low
for the hardware.

It is a standard GigE network; jumbo frames are currently disabled. When
I get some time I may make a VLAN with jumbo frames enabled and
see if that changes anything at all (not likely).

I am CC'ing the storage-discuss group as well for coverage, as this
covers both ZFS and storage.

If anyone has some thoughts, code, or tests, I can run them on my
X4540's and see how it goes.

Thanks


-- 
Brent Jones
br...@servuhome.net
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] slow ls or slow zfs

2009-06-26 Thread Bob Friesenhahn

On Fri, 26 Jun 2009, Richard Elling wrote:

All the tools I have used show no IO problems. I think the problem is 
memory but I am unsure on how to troubleshoot it.


Look for latency, not bandwidth.  iostat will show latency at the
device level.


Unfortunately, the effect may not be all that obvious since the disks 
will only be driven as hard as the slowest disk and so the slowest 
disk may not seem much slower.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss