Re: [zfs-discuss] lots of zil_clean threads

2009-09-22 Thread Nils Goroll



I should add that I have quite a lot of datasets:


and maybe I should also add that I'm still running an old zpool version in order 
to keep the ability to boot snv_98:


aggis:~$ zpool upgrade
This system is currently running ZFS pool version 14.

The following pools are out of date, and can be upgraded.  After being
upgraded, these pools will no longer be accessible by older software versions.

VER  POOL
---  -----
13   rpool
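
(Not part of the original message - for reference, once booting snv_98 is no
longer needed, the pool could be brought up to the running version with
something like the following; after that it is no longer readable by the
older bits:)

# upgrade one pool to the version supported by the running software
zpool upgrade rpool
# or upgrade every imported pool at once
zpool upgrade -a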


Re: [zfs-discuss] zfs bug

2009-09-22 Thread Trevor Pretty





Of course I meant 2009.06   :-)

Trevor Pretty wrote:

  
  BTW
  
Reading your bug. 
  
I assumed you meant:
  
  zfs set mountpoint=/home/pool tank
  
ln -s /dev/null /home/pool
  
  I then tried on OpenSolaris 2008.11
  
r...@norton:~# zfs set mountpoint=
r...@norton:~# zfs set mountpoint=/home/pool tank
r...@norton:~# zpool export tank
r...@norton:~# rm -r /home/pool
rm: cannot remove `/home/pool': No such file or directory
r...@norton:~# ln -s /dev/null /home/pool
r...@norton:~# zpool import -f tank
cannot mount 'tank': Not a directory
r...@norton:~# 
  
  So looks fixed to me.
  
  
Trevor Pretty wrote:
  

Jeremy 

You sure?

http://bugs.opensolaris.org/view_bug.do%3Bjsessionid=32d28f683e21e4b5c35832c2e707?bug_id=6883885

BTW:  I only found this by hunting for one of my bugs  6428437
and changing the URL!  

I think the searching is broken - but using bugster has always been a
black art even when I worked at Sun :-)

Trevor


Jeremy Kister wrote:

  I entered CR 6883885 at bugs.opensolaris.org.

Someone closed it - not reproducible.

Where do I find more information, like which planet's gravitational
properties affect the ZFS source code??


Re: [zfs-discuss] zfs bug

2009-09-22 Thread Jeremy Kister

On 9/22/2009 11:17 PM, Trevor Pretty wrote:

zfs set mountpoint=/home/pool tank

ln -s /dev/null /home/pool



Ahha, I dumbed down the process too much (trying to make it simple to
reproduce).

The key is in the /Auto/pool snippet that I put in the CR, but I switched to
/dev/null in the reproduce section.

So, I have the automounter working and in NIS.  Inside auto_home is:

pool    server:/home/pool

(where server is the host I'm importing the ZFS pool on)

zfs set mountpoint=/home/pool tank
zfs set sharenfs=rw,anon=0 tank
zpool export tank
rm -r /home/pool
ln -s /Auto/pool /home/pool
zpool import -f tank

That is what is causing the breakage, not necessarily the softlink itself.

How do I amend the CR?  Should I just make a new one?

Thanks for your follow-up, Trevor.
--

Jeremy Kister
http://jeremy.kister.net./


Re: [zfs-discuss] zfs bug

2009-09-22 Thread Trevor Pretty




BTW

Reading your bug. 

I assumed you meant:

zfs set mountpoint=/home/pool tank

ln -s /dev/null /home/pool

I then tried on OpenSolaris 2008.11

r...@norton:~# zfs set mountpoint=
r...@norton:~# zfs set mountpoint=/home/pool tank
r...@norton:~# zpool export tank
r...@norton:~# rm -r /home/pool
rm: cannot remove `/home/pool': No such file or directory
r...@norton:~# ln -s /dev/null /home/pool
r...@norton:~# zpool import -f tank
cannot mount 'tank': Not a directory
r...@norton:~# 

So looks fixed to me.


Trevor Pretty wrote:

  
Jeremy 
  
You sure?
  
  http://bugs.opensolaris.org/view_bug.do%3Bjsessionid=32d28f683e21e4b5c35832c2e707?bug_id=6883885
  
BTW:  I only found this by hunting for one of my bugs  6428437
  and changing the URL!  
  
I think the searching is broken - but using bugster has always been a
black art even when I worked at Sun :-)
  
Trevor
  
  
Jeremy Kister wrote:
  
I entered CR 6883885 at bugs.opensolaris.org.

Someone closed it - not reproducible.

Where do I find more information, like which planet's gravitational
properties affect the ZFS source code??



Re: [zfs-discuss] zfs bug

2009-09-22 Thread Trevor Pretty




Jeremy 

You sure?

http://bugs.opensolaris.org/view_bug.do%3Bjsessionid=32d28f683e21e4b5c35832c2e707?bug_id=6883885

BTW:  I only found this by hunting for one of my bugs  6428437
and changing the URL!  

I think the searching is broken - but using bugster has always been a
black art even when I worked at Sun :-)

Trevor


Jeremy Kister wrote:

  I entered CR 6883885 at bugs.opensolaris.org.

Someone closed it - not reproducible.

Where do I find more information, like which planet's gravitational
properties affect the ZFS source code??


[zfs-discuss] zfs bug

2009-09-22 Thread Jeremy Kister

I entered CR 6883885 at bugs.opensolaris.org.

Someone closed it - not reproducible.

Where do I find more information, like which planet's gravitational
properties affect the ZFS source code??



--

Jeremy Kister
http://jeremy.kister.net./


Re: [zfs-discuss] What does 128-bit mean

2009-09-22 Thread Trevor Pretty





http://blogs.sun.com/bonwick/entry/128_bit_storage_are_you
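
(A rough back-of-the-envelope illustration, not from the original reply: the
64-bit zp_size field bounds a single file, while the "128-bit" label refers
to the theoretical limits of the storage as a whole, as the blog entry above
discusses. bc shows the scale of the two numbers:)

echo '2^64' | bc     # per-file limit implied by the 64-bit zp_size, in bytes
18446744073709551616
echo '2^128' | bc    # the theoretical 128-bit capacity figure, in bytes
340282366920938463463374607431768211456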

Trevor Pretty wrote:

  
  
  http://en.wikipedia.org/wiki/ZFS
  
Shu Wu wrote:
Hi pals, I'm now looking into the ZFS source and have been
puzzled about 128-bit. It's announced that ZFS is a 128-bit file
system. But what does 128-bit mean? Does that mean the addressing
capability is 2^128? But in the source, 'zp_size' (in 'struct
znode_phys'), the file size in bytes, is defined as uint64_t. So I
guess 128-bit may be the bit width of the zpool pointer, but where is
it defined?

Regards,

Wu Shu
  
  


Re: [zfs-discuss] What does 128-bit mean

2009-09-22 Thread Trevor Pretty





http://en.wikipedia.org/wiki/ZFS

Shu Wu wrote:
Hi pals, I'm now looking into the ZFS source and have been
puzzled about 128-bit. It's announced that ZFS is a 128-bit file
system. But what does 128-bit mean? Does that mean the addressing
capability is 2^128? But in the source, 'zp_size' (in 'struct
znode_phys'), the file size in bytes, is defined as uint64_t. So I
guess 128-bit may be the bit width of the zpool pointer, but where is
it defined?
  
Regards,
  
Wu Shu




[zfs-discuss] What does 128-bit mean

2009-09-22 Thread Shu Wu
Hi pals, I'm now looking into the ZFS source and have been puzzled about
128-bit. It's announced that ZFS is a 128-bit file system. But what does
128-bit mean? Does that mean the addressing capability is 2^128? But in the
source, 'zp_size' (in 'struct znode_phys'), the file size in bytes, is
defined as uint64_t. So I guess 128-bit may be the bit width of the zpool
pointer, but where is it defined?

Regards,

Wu Shu


Re: [zfs-discuss] ZFS file disk usage

2009-09-22 Thread Andrew Deason
On Tue, 22 Sep 2009 13:26:59 -0400
Richard Elling  wrote:

> > That seems to differ quite a bit from what I've seen; perhaps I am
> > misunderstanding... is the "+ 1 block" of a different size than the
> > recordsize? With recordsize=1k:
> >
> > $ ls -ls foo
> > 2261 -rw-r--r--   1 root root 1048576 Sep 22 10:59 foo
> 
> Well, there it is.  I suggest suitable guard bands.

So, you would say it's reasonable to assume the overhead will always be
less than about 100k or 10%?

And to be sure... if we're to be rounding up to the next recordsize
boundary, are we guaranteed to be able to get the recordsize from the blocksize
reported by statvfs?
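
(Not part of the original mail - a quick way to put the two values side by
side on a given dataset; the dataset name is hypothetical:)

# recordsize as ZFS reports it
zfs get -H -o value recordsize tank/fs
# block size as statvfs(2) reports it for the mountpoint
df -g /tank/fs | grep 'block size'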

-- 
Andrew Deason
adea...@sinenomine.net


Re: [zfs-discuss] Persistent errors - do I believe?

2009-09-22 Thread Chris Murray
I've had an interesting time with this over the past few days ...

After the resilver completed, I had the message "no known data errors" in a 
zpool status.

I guess the title of my post should have been "how permanent are permanent 
errors?". Now, I don't know whether the action of completing the resilver was 
the thing that fixed the one remaining error (in the snapshot of the 'meerkat' 
zvol), or whether my looped zpool clear commands have done it. Anyhow, for 
space/noise reasons, I set the machine back up with the original cables 
(eSATA), in its original tucked-away position, installed SXCE 119 to get me 
remotely up to date, and imported the pool.

So far so good. I then powered up a load of my virtual machines. None of them 
report errors when running a chkdsk, and SQL Server 'DBCC CHECKDB' hasn't 
reported any problems yet. Things are looking promising on the corruption front 
- feels like the errors that were reported while the resilvers were in progress 
have finally been fixed by the final (successful) resilver! Microsoft Exchange 
2003 did complain of corruption of mailbox stores, however I have seen this a 
few times as a result of unclean shutdowns, and don't think it's related to the 
errors that ZFS was reporting on the pool during resilver.

Then, 'disk is gone' again - I think I can definitely put my original troubles 
down to cabling, which I'll sort out for good in the next few days. Now, I'm 
back on the same SATA cables which saw me through the resilvering operation.

One of the drives is showing read errors when I run dmesg. I'm having one 
problem after another with this pool!! I think the disk I/O during the resilver 
has tipped this disk over the edge. I'll replace it ASAP, and then I'll test 
the drive in a separate rig and RMA it.

Anyhow, there is one last thing that I'm struggling with - getting the pool to 
expand to use the size of the new disk. Before my original replace, I had 3x1TB 
and 1x750GB disk. I replaced the 750 with another 1TB, which by my reckoning 
should give me around 4TB as a total size even after checksums and metadata. No:

# zpool list
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
rpool    74G  8.81G  65.2G    11%  ONLINE  -
zp     2.73T  2.36T   379G    86%  ONLINE  -

2.73T? I'm convinced I've expanded a pool in this way before. What am I missing?
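
(Not from the original post - on builds that have the autoexpand pool
property, roughly snv_117 and later, the usual way to pick up the extra
capacity after the last small disk has been replaced is something like the
following; the device name is a placeholder:)

zpool set autoexpand=on zp
zpool online -e zp c1t0d0    # -e asks ZFS to grow into the new device size
# on older bits, an export followed by an import has typically been needed
# before the pool re-reads the device sizes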

Chris


Re: [zfs-discuss] If you have ZFS in production, willing to share some details (with me)?

2009-09-22 Thread Jeremy Kister

On 9/22/2009 1:55 PM, Jeremy Kister wrote:

(b) 2 of them have 268GB raw
 26 HP 300GB SCA disks with mirroring + 2 hot spares


28 * 300G = 8.2T.  Not 268G.

"Math class is tough!"


--

Jeremy Kister
http://jeremy.kister.net./


Re: [zfs-discuss] If you have ZFS in production, willing to share some details (with me)?

2009-09-22 Thread Jeremy Kister

On 9/18/2009 1:51 PM, Steffen Weiberle wrote:

# of systems


6, not including dozens of ZFS roots.


amount of storage


(a) 2 of them have 96TB raw,
46 WD SATA 2TB disks in two raidz2 pools + 2 hot spares
each raidz2 pool is on its own shelf on its own PCIx controller

(b) 2 of them have 268GB raw
26 HP 300GB SCA disks with mirroring + 2 hot spares
+ soon to be 3-way mirrored
each shelf of 14 disks is connected to its own U320 PCIx card

(c) 2 of them have 14TB raw
14 Dell SATA 1TB disks in two raidz2 pools + 1 hot spare


application profile(s)


(a) and (c) are file servers via nfs
(b) are postgres database servers

type of workload (low, high; random, sequential; read-only, read-write, 
write-only)


(a) are 70/30 read/write @ average of 40MB/s
30 clients
(b) are 50/50 read/write @ average of 180MB/s
local read/write only
(c) are 70/30 read/write @ average of 28MB/s
10 clients


storage type(s)


(a) and (c) are sata
(b) are u320 scsi


industry


call analytics


whether it is private or I can share in a summary


not private.


anything else that might be of interest


35. “Because” makes any explanation rational. In a line to Kinko’s copy 
machine a researcher asked to jump the line by presenting a reason “Can I 
jump the line, because I am in a rush?” 94% of people complied. Good 
reason, right? Okay, let’s change the reason. “Can I jump the line because 
I need to make copies?” Excuse me? That’s why everybody is in the line to 
begin with. Yet 93% of people complied. A request without “because” in it 
(“Can I jump the line, please?”) generated 24% compliance.



--

Jeremy Kister
http://jeremy.kister.net./


Re: [zfs-discuss] URGENT: very high busy and average service time with ZFS and USP1100

2009-09-22 Thread Richard Elling

comment below...

On Sep 22, 2009, at 9:57 AM, Jim Mauro wrote:



Cross-posting to zfs-discuss. This does not need to be on the
confidential alias. It's a performance query - there's nothing
confidential in here. Other folks post performance queries to
zfs-discuss

Forget %b - it's useless.

It's not the bandwidth that's hurting you, it's the IOPS.
One of the hot devices did 1515.8 reads-per-second,
the other did over 500.

Is this Oracle?

You never actually tell us what the huge performance problem is -
what's the workload, what's the delivered level of performance?

IO service times in the 22-32 millisecond range are not great,
but not the worst I've seen. Do you have any data that connects
the delivered perfomance of the workload to an IO latency
issue, or did the customer just run "iostat", saw "100% b",
and assumed this was the problem?

I need to see zpool stats.

Is each of these c3txx devices actually a raid 7+1 (which means
7 data disks and 1 parity disk)??

There's nothing here that tells us there's something that needs to be
done on the ZFS side. Not enough data.

It looks like a very lopsided IO load distribution problem.
You have 8 LUN (c3tXX) devices, 2 of which are getting
slammed with IOPS; the other 6 are relatively idle.

Thanks,
/jim


Javier Conde wrote:


Hello,

IHAC with a huge performance problem in a newly installed M8000  
configured with a USP1100 and ZFS.


From what we can see, 2 disks used in different zpools are 100% busy,
and the average service time is also quite high (between 5 and 30 ms).

    r/s    w/s    kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   11.4     0.0  224.1  0.0  0.2    0.0   20.7   0   5 c3t5000C5000F94A607d0
    0.0   11.8     0.0  224.1  0.0  0.3    0.0   24.2   0   6 c3t5000C5000F94E38Fd0
    0.2    0.0    25.6    0.0  0.0  0.0    0.0    7.9   0   0 c3t60060E8015321F01321F0032d0
    0.0    3.6     0.0   20.8  0.0  0.0    0.0    0.5   0   0 c3t60060E8015321F01321F0020d0
    0.2   24.0    25.6  488.0  0.0  0.0    0.0    0.6   0   1 c3t60060E8015321F01321F001Cd0
   11.4    0.8    92.8    8.0  0.0  0.0    0.0    3.9   0   4 c3t60060E8015321F01321F0019d0
  573.4    0.0 73395.5    0.0  0.0 20.6    0.0   36.0   0 100 c3t60060E8015321F01321F000Bd0

avg read size ~128kBytes... which is good

    0.8    0.8   102.4    8.0  0.0  0.0    0.0   22.8   0   4 c3t60060E8015321F01321F0008d0
 1515.8   10.2 30420.9  148.0  0.0 34.9    0.0   22.9   1 100 c3t60060E8015321F01321F0006d0

avg read size ~20 kBytes... not so good
These look like single-LUN pools.  What is the workload?

    0.4    0.4    51.2    1.6  0.0  0.0    0.0    5.1   0   0 c3t60060E8015321F01321F0055d0


The USP1100 is configured with a raid 7+1, which is the default  
recommendation.


Check the starting sector for the partition.  For older OpenSolaris  
and Solaris 10
installations, the default starting sector is 34, which has the  
unfortunate affect of
misaligning with most hardware RAID arrays. For newer installations,  
the default
starting sector is 256, which has a better chance of aligning with  
hardware RAID

arrays. This will be more pronounced when using RAID-5.

To check, look at the partition table in format(1m) or prtvtoc(1m)
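
(For example - the device path below is illustrative, not from the original
mail; the First Sector column shows where each slice starts:)

# a first sector of 34 is the old unaligned default; 256 lines up with
# power-of-two array stripe units
prtvtoc /dev/rdsk/c3t60060E8015321F01321F0006d0s2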

BTW, the customer is surely not expecting super database performance from
RAID-5, are they?



The data transferred is not very high, between 50 and 150 MB/sec.

Is it normal to see the disks busy at 100% all the time and the
average time always greater than 30 ms?

Is there something we can do from the ZFS side?

We have followed the recommendations regarding the block size for
the database file systems; we use 4 different zpools for the DB,
indexes, redo logs and archive logs, and vdev_cache_bshift is set to
13 (8k blocks)...


hmmm... what OS release?  The vdev cache should only read
metadata, unless you are running on an old OS. In other words, the
solution which suggests changing vdev_cache_bshift has been
superseded by later OS releases.  You can check this via the kstats
for vdev cache.
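
(For instance - a hedged example, not part of the original reply:)

# vdev cache activity; on current builds it should only be seeing metadata
kstat -p zfs:0:vdev_cache_stats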

The big knob for databases is recordsize. Clearly, the recordsize is  
set as default

on the LUN with 128 kByte average reads.
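
(Illustrative only - the dataset name is a placeholder, and recordsize only
affects files written after the change:)

zfs set recordsize=8k dbpool/oradata    # match the database block size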
 -- richard



Can someone help me to troubleshoot this issue?

Thanks in advance and best regards,

Javier



Re: [zfs-discuss] ZFS file disk usage

2009-09-22 Thread Richard Elling

On Sep 22, 2009, at 8:07 AM, Andrew Deason wrote:


On Mon, 21 Sep 2009 18:20:53 -0400
Richard Elling  wrote:


On Sep 21, 2009, at 2:43 PM, Andrew Deason wrote:


On Mon, 21 Sep 2009 17:13:26 -0400
Richard Elling  wrote:


You don't know the max overhead for the file before it is
allocated. You could guess at a max of 3x size + at least three
blocks.  Since you can't control this, it seems like the worst
case is when copies=3.


Is that max with copies=3? Assume copies=1; what is it then?


1x size + 1 block.


That seems to differ quite a bit from what I've seen; perhaps I am
misunderstanding... is the "+ 1 block" of a different size than the
recordsize? With recordsize=1k:

$ ls -ls foo
2261 -rw-r--r--   1 root root 1048576 Sep 22 10:59 foo


Well, there it is.  I suggest suitable guard bands.
 -- richard



1024k vs 1130k

--
Andrew Deason
adea...@sinenomine.net


Re: [zfs-discuss] ZFS Recv slow with high CPU

2009-09-22 Thread Matthew Ahrens

Tristan Ball wrote:

OK, Thanks for that.

From reading the RFE, it sounds like having a faster machine on the
receive side will be enough to alleviate the problem in the short term?


That's correct.

--matt


[zfs-discuss] rpool import when another rpool already mounted ?

2009-09-22 Thread andy
Hi

I've a situation that I can't find any answers to after searching docs etc.

I'm testing a DR process of installing Solaris onto a ZFS mirror using rpool.
Then I am breaking the rpool mirror, recreating the non-live half as newrpool
and restoring my backup to the non-live mirror disk via /mnt, /mnt/opt etc...

I then need to boot the server from the non-live (newrpool) disk and make it
become rpool.

All the docs mention installing the bootblk and booting off an alternate
mirror, but not the situation I have.
I've thought of booting 'cdrom -s' and doing a 'zpool import rpool newrpool' -
would that work?
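
(Not part of the original post - for reference, zpool import takes an
optional new name, so from a 'boot cdrom -s' environment something like the
following imports the pool currently named newrpool under the name rpool;
-R keeps it from mounting over the running root:)

zpool import -f -R /a newrpool rpool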

Cheers
Andy


Re: [zfs-discuss] URGENT: very high busy and average service time with ZFS and USP1100

2009-09-22 Thread Jim Mauro


Cross-posting to zfs-discuss. This does not need to be on the
confidential alias. It's a performance query - there's nothing
confidential in here. Other folks post performance queries to
zfs-discuss

Forget %b - it's useless.

It's not the bandwidth that's hurting you, it's the IOPS.
One of the hot devices did 1515.8 reads-per-second,
the other did over 500.

Is this Oracle?

You never actually tell us what the huge performance problem is -
what's the workload, what's the delivered level of performance?

IO service times in the 22-32 millisecond range are not great,
but not the worst I've seen. Do you have any data that connects
the delivered perfomance of the workload to an IO latency
issue, or did the customer just run "iostat", saw "100% b",
and assumed this was the problem?

I need to see zpool stats.

Is each of these c3txx devices actually a raid 7+1 (which means
7 data disks and 1 parity disk)??

There's nothing here that tells us there's something that needs to be
done on the ZFS side. Not enough data.

It looks like a very lopsided IO load distribution problem.
You have 8 LUN (c3tXX) devices, 2 of which are getting
slammed with IOPS; the other 6 are relatively idle.

Thanks,
/jim


Javier Conde wrote:


Hello,

IHAC with a huge performance problem in a newly installed M8000 
configured with a USP1100 and ZFS.


From what we can see, 2 disks used in different zpools are 100% busy,
and the average service time is also quite high (between 5 and 30 ms).

    r/s    w/s    kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   11.4     0.0  224.1  0.0  0.2    0.0   20.7   0   5 c3t5000C5000F94A607d0
    0.0   11.8     0.0  224.1  0.0  0.3    0.0   24.2   0   6 c3t5000C5000F94E38Fd0
    0.2    0.0    25.6    0.0  0.0  0.0    0.0    7.9   0   0 c3t60060E8015321F01321F0032d0
    0.0    3.6     0.0   20.8  0.0  0.0    0.0    0.5   0   0 c3t60060E8015321F01321F0020d0
    0.2   24.0    25.6  488.0  0.0  0.0    0.0    0.6   0   1 c3t60060E8015321F01321F001Cd0
   11.4    0.8    92.8    8.0  0.0  0.0    0.0    3.9   0   4 c3t60060E8015321F01321F0019d0
  573.4    0.0 73395.5    0.0  0.0 20.6    0.0   36.0   0 100 c3t60060E8015321F01321F000Bd0
    0.8    0.8   102.4    8.0  0.0  0.0    0.0   22.8   0   4 c3t60060E8015321F01321F0008d0
 1515.8   10.2 30420.9  148.0  0.0 34.9    0.0   22.9   1 100 c3t60060E8015321F01321F0006d0
    0.4    0.4    51.2    1.6  0.0  0.0    0.0    5.1   0   0 c3t60060E8015321F01321F0055d0


The USP1100 is configured with a raid 7+1, which is the default 
recommendation.


The data transferred is not very high, between 50 and 150 MB/sec.

Is it normal to see the disks busy at 100% all the time and the
average time always greater than 30 ms?

Is there something we can do from the ZFS side?

We have followed the recommendations regarding the block size for the
database file systems; we use 4 different zpools for the DB, indexes,
redo logs and archive logs, and vdev_cache_bshift is set to 13 (8k
blocks)...

Can someone help me to troubleshoot this issue?

Thanks in advance and best regards,

Javier



Re: [zfs-discuss] If you have ZFS in production, willing to share some details (with me)?

2009-09-22 Thread Steffen Weiberle

On 09/18/09 14:34, Jeremy Kister wrote:

On 9/18/2009 1:51 PM, Steffen Weiberle wrote:

I am trying to compile some deployment scenarios of ZFS.

# of systems


do zfs root count?  or only big pools?


Non-root is more interesting to me. However, if you are sharing the root
pool with your data, what you are running application-wise is still of
interest.





amount of storage


raw or after parity ?


Either, and it's great if you indicate which.






Thanks for all the private responses. I am still compiling and cleansing 
them, and will summarize when I get their OKs!


Steffen


Re: [zfs-discuss] ZFS file disk usage

2009-09-22 Thread Andrew Deason
On Mon, 21 Sep 2009 18:20:53 -0400
Richard Elling  wrote:

> On Sep 21, 2009, at 2:43 PM, Andrew Deason wrote:
> 
> > On Mon, 21 Sep 2009 17:13:26 -0400
> > Richard Elling  wrote:
> >
> >> You don't know the max overhead for the file before it is
> >> allocated. You could guess at a max of 3x size + at least three
> >> blocks.  Since you can't control this, it seems like the worst
> >> case is when copies=3.
> >
> > Is that max with copies=3? Assume copies=1; what is it then?
> 
> 1x size + 1 block.

That seems to differ quite a bit from what I've seen; perhaps I am
misunderstanding... is the "+ 1 block" of a different size than the
recordsize? With recordsize=1k:

$ ls -ls foo
2261 -rw-r--r--   1 root root 1048576 Sep 22 10:59 foo

1024k vs 1130k
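
(For anyone wanting to reproduce the measurement - a minimal sketch, dataset
name hypothetical:)

zfs create -o recordsize=1k tank/rs1k
mkfile 1m /tank/rs1k/foo
sync
ls -ls /tank/rs1k/foo    # first column is 512-byte blocks; 2261 * 512 is roughly 1130k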

-- 
Andrew Deason
adea...@sinenomine.net


Re: [zfs-discuss] Migrate from iscsitgt to comstar?

2009-09-22 Thread Peter Cudhea

cc'ing to storage-discuss where this topic also came up recently.

By default for most backing stores, COMSTAR will put its disk metadata 
in the first 64K of the backing store as you say.   So if you take a 
backing store disk that is in use as an iscsitgt LUN and then  run 
"sbdadm create-lu /path/to/backing/store", it will corrupt the data on 
the disk. Don't do this!


There are two enhancements that were introduced with the putback of 
PSARC 2009/251 in snv_115 that may be helpful.  See  stmfadm(1m) for details


   * If the backing store is a ZVOL, the metadata is stored in a
 special data object in the ZVOL rather than overwriting the first
 64K of the ZVOL.
   * the command "stmfadm -o meta=/path/to/metadata-file create-lu
 /path/to/backing/store" can be used to redirect the metadata to a
 named file on the target system.

Here is the relevant paragraph from stmfadm(1m):

Logical units registered with the
 STMF require space for the metadata to be stored. When a
 zvol  is  specified  as  the  backing  store device, the
 default will be to use a special property of the zvol to
 contain the metadata. For all other devices, the default
 behavior will be to use the first 64k of the device.  An
 alternative  approach  would be to use the meta property
 in a create-lu command to specify an alternate  file  to
 contain the metadata. It is advisable to use a file that
 can provide sufficient storage of the logical unit meta-
 data, preferably 64k.
If you use the -o meta=file approach, remember that if the volume moves 
its metadata must move along with it.   Remembering this external 
linkage could become a long-term hassle.  Some have opted to create new 
LUNs and then copy the data over, so they can remove their dependency on 
this external metadata file.
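
(A hedged illustration of the zvol path, with hypothetical names - per the
note above, the metadata then lives in a property of the zvol rather than in
its first 64K:)

zfs create -V 100g tank/lu0
stmfadm create-lu /dev/zvol/rdsk/tank/lu0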


You asked only about migrating the *DATA* from iscsitgt to COMSTAR.  
This part is doable, given the above tools. 

What is not supported is automatic migration of the target and LUN 
*definitions* from iscsitgt to COMSTAR.  The iscsitgt uses a "one target 
per LUN" model. The COMSTAR model is more like "all the LUNs visible 
through the same target", using initiator-specific Views to control 
access.  Creating an automated tool to go between these very different
approaches would probably do more harm than good.   You are better off 
creating a new set of LUN and Target definitions to match the new 
environment.   It is up to you.


Peter

On 09/21/09 04:29, Markus Kovero wrote:


Is it possible to migrate data from iscsitgt to a COMSTAR iSCSI target?
I guess COMSTAR wants metadata at the beginning of the volume and this makes
things difficult?


 


Yours

Markus Kovero





Re: [zfs-discuss] ZFS Recv slow with high CPU

2009-09-22 Thread Tristan Ball

OK, Thanks for that.

From reading the RFE, it sounds like having a faster machine on the
receive side will be enough to alleviate the problem in the short term?


The hardware I'm using at the moment is quite old, and not particularly 
fast - although this is the first out & out performance limitation I've 
had with using it as an OpenSolaris storage system.


Regards,
   Tristan.

Matthew Ahrens wrote:

Tristan Ball wrote:

Hi Everyone,

I have a couple of systems running opensolaris b118, one of which 
sends hourly snapshots to the other. This has been working well, 
however as of today, the receiving zfs process has started running 
extremely slowly, and is running at 100% CPU on one core, completely 
in kernel mode. A little bit of exploration with lockstat and dtrace 
seems to imply that the issue is around the "dbuf_free_range" 
function - or at least, that's what it looks like to my inexperienced 
eye!
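
(Not from the thread - for readers wanting to gather the same kind of
evidence, the usual commands would be along these lines:)

# top kernel stacks while the receive is running
lockstat -kIW -D 20 sleep 30
# or sample on-CPU kernel stacks with DTrace for 30 seconds
dtrace -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-30s { trunc(@, 10); exit(0); }'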


This is probably RFE 6812603 "zfs send can aggregate free records", 
which is currently being worked on.


--matt



Re: [zfs-discuss] lots of zil_clean threads

2009-09-22 Thread Nils Goroll

Hi Neil and all,

thank you very much for looking into this:


So I don't know what's going on. What is the typical call stack for those
zil_clean() threads?


I'd say they are all blocking on their respective CVs:

ff0009066c60 fbc2c0300   0  60 ff01d25e1180
  PC: _resume_from_idle+0xf1TASKQ: zil_clean
  stack pointer for thread ff0009066c60: ff0009066b60
  [ ff0009066b60 _resume_from_idle+0xf1() ]
swtch+0x147()
cv_wait+0x61()
taskq_thread+0x10b()
thread_start+8()
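
(Not from the original mail - stacks like the one above can be pulled from a
live kernel with something along these lines:)

echo '::stacks -c zil_clean' | mdb -k    # group threads whose stacks contain zil_clean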

I should add that I have quite a lot of datasets:

r...@haggis:~# zfs list -r -t filesystem | wc -l
  49
r...@haggis:~# zfs list -r -t volume | wc -l
  14
r...@haggis:~# zfs list -r -t snapshot | wc -l
6018

Nils