Re: [zfs-discuss] lots of zil_clean threads
I should add that I have quite a lot of datasets, and maybe I should also add that I'm still running an old zpool version in order to keep the ability to boot snv_98:

aggis:~$ zpool upgrade
This system is currently running ZFS pool version 14.

The following pools are out of date, and can be upgraded. After being upgraded, these pools will no longer be accessible by older software versions.

VER  POOL
---  ------------
 13  rpool
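For anyone weighing the same trade-off, a small pre-flight sketch (standard zpool commands, not taken from the original post; "rpool" is the pool named in the listing above):

  # List the supported pool versions, i.e. what an upgrade would actually buy you:
  zpool upgrade -v
  # Check the current version per pool -- as long as this stays at 13/14,
  # older builds such as snv_98 can still import and boot it:
  zpool get version rpool
  # Only once compatibility with the old boot environment is no longer needed:
  # zpool upgrade rpool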
Re: [zfs-discuss] zfs bug
Of course I meant 2009.06 :-)

Trevor Pretty wrote:
> BTW, reading your bug, I assumed you meant:
>
>   zfs set mountpoint=/home/pool tank
>   ln -s /dev/null /home/pool
>
> I then tried on OpenSolaris 2008.11:
>
>   r...@norton:~# zfs set mountpoint=
>   r...@norton:~# zfs set mountpoint=/home/pool tank
>   r...@norton:~# zpool export tank
>   r...@norton:~# rm -r /home/pool
>   rm: cannot remove `/home/pool': No such file or directory
>   r...@norton:~# ln -s /dev/null /home/pool
>   r...@norton:~# zpool import -f tank
>   cannot mount 'tank': Not a directory
>   r...@norton:~#
>
> So it looks fixed to me.
>
> Trevor Pretty wrote:
>> Jeremy
>> You sure?
>> http://bugs.opensolaris.org/view_bug.do%3Bjsessionid=32d28f683e21e4b5c35832c2e707?bug_id=6883885
>> BTW: I only found this by hunting for one of my bugs, 6428437, and changing the URL! I think the searching is broken - but using bugster has always been a black art, even when I worked at Sun :-)
>> Trevor
>>
>> Jeremy Kister wrote:
>>> I entered CR 6883885 at bugs.opensolaris.org. Someone closed it - not reproducible. Where do I find more information, like which planet's gravitational properties affect the zfs source code?
Re: [zfs-discuss] zfs bug
On 9/22/2009 11:17 PM, Trevor Pretty wrote:
> zfs set mountpoint=/home/pool tank
> ln -s /dev/null /home/pool

Ahha, I dumbed down the process too much (trying to make it simple to reproduce). The key is in the /Auto/pool snippet that I put in the CR, but switched to /dev/null in the reproduce section.

So, I have the automounter working and in NIS. Inside auto_home is:

  pool    server:/home/pool

(where server is the host I'm importing the zfs pool on)

  zfs set mountpoint=/home/pool tank
  zfs set sharenfs=rw,anon=0 tank
  zpool export tank
  rm -r /home/pool
  ln -s /Auto/pool /home/pool
  zpool import -f tank

That is what is causing the breakage, not necessarily the softlink itself.

How do I amend the CR? Should I just make a new one?

Thanks for your follow-up, Trevor.

-- Jeremy Kister
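If the immediate goal is just to get the pool imported despite the automounted /home/pool path, one possible workaround sketch ('tank' is from the post; applying these particular commands here is an assumption, not something verified in the thread):

  # Hedged workaround sketch -- not a confirmed fix.
  zfs set mountpoint=legacy tank      # keep ZFS from mounting over the symlinked path
  zpool import -f tank
  mount -F zfs tank /home/pool        # mount explicitly once /home/pool resolves correctly
  # or sidestep the path entirely at import time with a temporary alternate root:
  # zpool import -f -R /mnt tank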
Re: [zfs-discuss] zfs bug
BTW, reading your bug, I assumed you meant:

  zfs set mountpoint=/home/pool tank
  ln -s /dev/null /home/pool

I then tried on OpenSolaris 2008.11:

  r...@norton:~# zfs set mountpoint=
  r...@norton:~# zfs set mountpoint=/home/pool tank
  r...@norton:~# zpool export tank
  r...@norton:~# rm -r /home/pool
  rm: cannot remove `/home/pool': No such file or directory
  r...@norton:~# ln -s /dev/null /home/pool
  r...@norton:~# zpool import -f tank
  cannot mount 'tank': Not a directory
  r...@norton:~#

So it looks fixed to me.

Trevor Pretty wrote:
> Jeremy
>
> You sure?
>
> http://bugs.opensolaris.org/view_bug.do%3Bjsessionid=32d28f683e21e4b5c35832c2e707?bug_id=6883885
>
> BTW: I only found this by hunting for one of my bugs, 6428437, and changing the URL! I think the searching is broken - but using bugster has always been a black art, even when I worked at Sun :-)
>
> Trevor
>
> Jeremy Kister wrote:
>> I entered CR 6883885 at bugs.opensolaris.org. Someone closed it - not reproducible. Where do I find more information, like which planet's gravitational properties affect the zfs source code?
Re: [zfs-discuss] zfs bug
Jeremy

You sure?

http://bugs.opensolaris.org/view_bug.do%3Bjsessionid=32d28f683e21e4b5c35832c2e707?bug_id=6883885

BTW: I only found this by hunting for one of my bugs, 6428437, and changing the URL! I think the searching is broken - but using bugster has always been a black art, even when I worked at Sun :-)

Trevor

Jeremy Kister wrote:
> I entered CR 6883885 at bugs.opensolaris.org. Someone closed it - not reproducible. Where do I find more information, like which planet's gravitational properties affect the zfs source code?
[zfs-discuss] zfs bug
I entered CR 6883885 at bugs.opensolaris.org. Someone closed it - not reproducible.

Where do I find more information, like which planet's gravitational properties affect the zfs source code?

-- Jeremy Kister
Re: [zfs-discuss] What does 128-bit mean
http://blogs.sun.com/bonwick/entry/128_bit_storage_are_you

Trevor Pretty wrote:
> http://en.wikipedia.org/wiki/ZFS
>
> Shu Wu wrote:
>> Hi pals, I'm now looking into zfs source and have been puzzled about 128-bit. It's announced that ZFS is a 128-bit file system. But what does 128-bit mean? Does that mean the addressing capability is 2^128? But in the source, 'zp_size' (in 'struct znode_phys'), the file size in bytes, is defined as uint64_t. So I guess 128-bit may be the bit width of the zpool pointer, but where is it defined?
>>
>> Regards, Wu Shu
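For a rough sense of the scale (plain bc arithmetic, not from the links above): the uint64_t zp_size field bounds a single file/object, while the "128-bit" figure describes the theoretical pool address space, so the two numbers answer different questions.

  $ echo '2^64' | bc
  18446744073709551616                       # ~16 EiB, ceiling for one file's zp_size
  $ echo '2^128' | bc
  340282366920938463463374607431768211456    # theoretical pool-wide addressing limit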
Re: [zfs-discuss] What does 128-bit mean
http://en.wikipedia.org/wiki/ZFS

Shu Wu wrote:
> Hi pals, I'm now looking into zfs source and have been puzzled about 128-bit. It's announced that ZFS is a 128-bit file system. But what does 128-bit mean? Does that mean the addressing capability is 2^128? But in the source, 'zp_size' (in 'struct znode_phys'), the file size in bytes, is defined as uint64_t. So I guess 128-bit may be the bit width of the zpool pointer, but where is it defined?
>
> Regards, Wu Shu
[zfs-discuss] What does 128-bit mean
Hi pals,

I'm now looking into the zfs source and have been puzzled about 128-bit. ZFS is announced as a 128-bit file system, but what does 128-bit mean? Does it mean the addressing capability is 2^128? In the source, 'zp_size' (in 'struct znode_phys'), the file size in bytes, is defined as uint64_t. So I guess 128-bit may be the bit width of the zpool pointer, but where is it defined?

Regards,
Wu Shu
Re: [zfs-discuss] ZFS file disk usage
On Tue, 22 Sep 2009 13:26:59 -0400, Richard Elling wrote:
> > That seems to differ quite a bit from what I've seen; perhaps I am
> > misunderstanding... is the "+ 1 block" of a different size than the
> > recordsize? With recordsize=1k:
> >
> > $ ls -ls foo
> > 2261 -rw-r--r-- 1 root root 1048576 Sep 22 10:59 foo
>
> Well, there it is. I suggest suitable guard bands.

So, would you say it's reasonable to assume the overhead will always be less than about 100k or 10%?

And to be sure... if we're to round up to the next recordsize boundary, are we guaranteed to be able to get the recordsize from the blocksize reported by statvfs?

-- Andrew Deason
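One way to measure the overhead directly instead of estimating it up front (standard ls/bc usage; 'foo' is the file from the example above):

  # The first column of ls -ls is 512-byte blocks actually allocated:
  $ ls -ls foo
  2261 -rw-r--r-- 1 root root 1048576 Sep 22 10:59 foo
  $ echo '2261 * 512' | bc
  1157632          # ~1130 KB allocated for a 1024 KB file at recordsize=1k
  # du -k foo reports the same allocation in kilobytes.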
Re: [zfs-discuss] Persistent errors - do I believe?
I've had an interesting time with this over the past few days...

After the resilver completed, I had the message "no known data errors" in a zpool status. I guess the title of my post should have been "how permanent are permanent errors?". Now, I don't know whether the action of completing the resilver was the thing that fixed the one remaining error (in the snapshot of the 'meerkat' zvol), or whether my looped zpool clear commands have done it.

Anyhow, for space/noise reasons, I set the machine back up with the original cables (eSATA), in its original tucked-away position, installed SXCE 119 to get me remotely up to date, and imported the pool. So far so good.

I then powered up a load of my virtual machines. None of them report errors when running a chkdsk, and SQL Server 'DBCC CHECKDB' hasn't reported any problems yet. Things are looking promising on the corruption front - it feels like the errors that were reported while the resilvers were in progress have finally been fixed by the final (successful) resilver! Microsoft Exchange 2003 did complain of corruption of mailbox stores, however I have seen this a few times as a result of unclean shutdowns, and don't think it's related to the errors that ZFS was reporting on the pool during the resilver.

Then, 'disk is gone' again - I think I can definitely put my original troubles down to cabling, which I'll sort out for good in the next few days. Now, I'm back on the same SATA cables which saw me through the resilvering operation, and one of the drives is showing read errors when I run dmesg. I'm having one problem after another with this pool! I think the disk I/O during the resilver has tipped this disk over the edge. I'll replace it ASAP, then test the drive in a separate rig and RMA it.

Anyhow, there is one last thing that I'm struggling with - getting the pool to expand to use the size of the new disk. Before my original replace, I had 3x 1TB disks and 1x 750GB disk. I replaced the 750 with another 1TB, which by my reckoning should give me around 4TB as a total size even after checksums and metadata. No:

# zpool list
NAME    SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
rpool    74G  8.81G  65.2G   11%  ONLINE  -
zp     2.73T  2.36T   379G   86%  ONLINE  -

2.73T? I'm convinced I've expanded a pool in this way before. What am I missing?

Chris
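If the missing capacity is just the pool not having grown into the larger replacement device, a hedged sketch of what to try ('zp' is from the zpool list output above; the disk name is a placeholder, and autoexpand/online -e assume a build recent enough to have them, which SXCE 119 should be):

  # Hedged sketch -- <new-disk> is a placeholder device name.
  zpool set autoexpand=on zp          # allow the pool to grow into larger replacement devices
  zpool online -e zp <new-disk>       # or expand the replaced device explicitly
  zpool list zp                       # SIZE should then reflect the extra ~250GB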
Re: [zfs-discuss] If you have ZFS in production, willing to share some details (with me)?
On 9/22/2009 1:55 PM, Jeremy Kister wrote:
> (b) 2 of them have 268GB raw: 26 HP 300GB SCA disks with mirroring + 2 hot spares

28 * 300G = 8.2T. Not 268G. "Math class is tough!"

-- Jeremy Kister
Re: [zfs-discuss] If you have ZFS in production, willing to share some details (with me)?
On 9/18/2009 1:51 PM, Steffen Weiberle wrote:
> # of systems

6, not including dozens of zfs root.

> amount of storage

(a) 2 of them have 96TB raw: 46 WD SATA 2TB disks in two raidz2 pools + 2 hot spares; each raidz2 pool is on its own shelf on its own PCI-X controller.
(b) 2 of them have 268GB raw: 26 HP 300GB SCA disks with mirroring + 2 hot spares (soon to be 3-way mirrored); each shelf of 14 disks is connected to its own U320 PCI-X card.
(c) 2 of them have 14TB raw: 14 Dell SATA 1TB disks in two raidz2 pools + 1 hot spare.

> application profile(s)

(a) and (c) are file servers via NFS; (b) are postgres database servers.

> type of workload (low, high; random, sequential; read-only, read-write, write-only)

(a) are 70/30 read/write @ an average of 40MB/s, 30 clients.
(b) are 50/50 read/write @ an average of 180MB/s, local read/write only.
(c) are 70/30 read/write @ an average of 28MB/s, 10 clients.

> storage type(s)

(a) and (c) are SATA; (b) are U320 SCSI.

> industry

call analytics

> whether it is private or I can share in a summary

Not private.

> anything else that might be of interest

35. "Because" makes any explanation rational. In a line to Kinko's copy machine, a researcher asked to jump the line by presenting a reason: "Can I jump the line, because I am in a rush?" 94% of people complied. Good reason, right? Okay, let's change the reason: "Can I jump the line because I need to make copies?" Excuse me? That's why everybody is in the line to begin with. Yet 93% of people complied. A request without "because" in it ("Can I jump the line, please?") generated 24% compliance.

-- Jeremy Kister
Re: [zfs-discuss] URGENT: very high busy and average service time with ZFS and USP1100
comment below...

On Sep 22, 2009, at 9:57 AM, Jim Mauro wrote:
> Cross-posting to zfs-discuss. This does not need to be on the confidential alias. It's a performance query - there's nothing confidential in here. Other folks post performance queries to zfs-discuss.
>
> Forget %b - it's useless. It's not the bandwidth that's hurting you, it's the IOPS. One of the hot devices did 1515.8 reads per second, the other did over 500.
>
> Is this Oracle?
>
> You never actually tell us what the huge performance problem is - what's the workload, what's the delivered level of performance? IO service times in the 32-22 millisecond range are not great, but not the worst I've seen. Do you have any data that connects the delivered performance of the workload to an IO latency issue, or did the customer just run "iostat", see "100 %b", and assume this was the problem?
>
> I need to see zpool stats. Is each of these c3tXX devices actually a raid 7+1 (which means 7 data disks and 1 parity disk)?
>
> There's nothing here that tells us there's something that needs to be done on the ZFS side. Not enough data. It looks like a very lopsided IO load distribution problem. You have 8 LUN c3tXX devices, 2 of which are getting slammed with IOPS; the other 6 are relatively idle.
>
> Thanks,
> /jim
>
> Javier Conde wrote:
>> Hello,
>>
>> IHAC with a huge performance problem in a newly installed M8000 configured with a USP1100 and ZFS.
>>
>> From what we can see, 2 disks used by different zpools are 100% busy, and the average service time is also quite high (between 30 and 5 ms).
>>
>>     r/s    w/s     kr/s    kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
>>     0.0   11.4      0.0   224.1   0.0   0.2     0.0    20.7   0   5  c3t5000C5000F94A607d0
>>     0.0   11.8      0.0   224.1   0.0   0.3     0.0    24.2   0   6  c3t5000C5000F94E38Fd0
>>     0.2    0.0     25.6     0.0   0.0   0.0     0.0     7.9   0   0  c3t60060E8015321F01321F0032d0
>>     0.0    3.6      0.0    20.8   0.0   0.0     0.0     0.5   0   0  c3t60060E8015321F01321F0020d0
>>     0.2   24.0     25.6   488.0   0.0   0.0     0.0     0.6   0   1  c3t60060E8015321F01321F001Cd0
>>    11.4    0.8     92.8     8.0   0.0   0.0     0.0     3.9   0   4  c3t60060E8015321F01321F0019d0
>>   573.4    0.0  73395.5     0.0   0.0  20.6     0.0    36.0   0 100  c3t60060E8015321F01321F000Bd0

avg read size ~128 kBytes... which is good

>>     0.8    0.8    102.4     8.0   0.0   0.0     0.0    22.8   0   4  c3t60060E8015321F01321F0008d0
>>  1515.8   10.2  30420.9   148.0   0.0  34.9     0.0    22.9   1 100  c3t60060E8015321F01321F0006d0

avg read size ~20 kBytes... not so good. These look like single-LUN pools. What is the workload?

>>     0.4    0.4     51.2     1.6   0.0   0.0     0.0     5.1   0   0  c3t60060E8015321F01321F0055d0
>>
>> The USP1100 is configured with a raid 7+1, which is the default recommendation.

Check the starting sector for the partition. For older OpenSolaris and Solaris 10 installations, the default starting sector is 34, which has the unfortunate effect of misaligning with most hardware RAID arrays. For newer installations, the default starting sector is 256, which has a better chance of aligning with hardware RAID arrays. This will be more pronounced when using RAID-5. To check, look at the partition table in format(1m) or prtvtoc(1m).

BTW, the customer is surely not expecting super database performance from RAID-5, are they?

>> The data transferred is not very high, between 50 and 150 MB/sec.
>>
>> Is it normal to see the disks busy at 100% all the time and the average time always greater than 30 ms? Is there something we can do from the ZFS side? We have followed the recommendations regarding the block size for the database file systems, we use 4 different zpools for the DB, indexes, redo log and archive logs, and vdev_cache_bshift is set to 13 (8k blocks)...

hmmm... what OS release? The vdev cache should only read metadata, unless you are running on an old OS. In other words, the solution which suggests changing vdev_cache_bshift has been superseded by later OS releases. You can check this via the kstats for the vdev cache.

The big knob for databases is recordsize. Clearly, the recordsize is set at the default on the LUN with 128 kByte average reads.
-- richard

>> Can someone help me to troubleshoot this issue?
>>
>> Thanks in advance and best regards,
>>
>> Javier
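To make the two checks above concrete, a hedged sketch (the LUN device name is taken from the iostat output; the dataset name is a placeholder, not from the post):

  # 1. Partition alignment: a first sector of 34 tends to misalign with a
  #    hardware RAID stripe, 256 tends to align (see the discussion above).
  prtvtoc /dev/rdsk/c3t60060E8015321F01321F0006d0s2
  # 2. recordsize: match it to the database block size *before* loading data;
  #    it only affects newly written files. "dbpool/oradata" is hypothetical.
  zfs get recordsize dbpool/oradata
  zfs set recordsize=8k dbpool/oradata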
Re: [zfs-discuss] ZFS file disk usage
On Sep 22, 2009, at 8:07 AM, Andrew Deason wrote:
> On Mon, 21 Sep 2009 18:20:53 -0400, Richard Elling wrote:
>> On Sep 21, 2009, at 2:43 PM, Andrew Deason wrote:
>>> On Mon, 21 Sep 2009 17:13:26 -0400, Richard Elling wrote:
>>>> You don't know the max overhead for the file before it is allocated. You could guess at a max of 3x size + at least three blocks. Since you can't control this, it seems like the worst case is when copies=3.
>>> Is that max with copies=3? Assume copies=1; what is it then?
>> 1x size + 1 block.
>
> That seems to differ quite a bit from what I've seen; perhaps I am misunderstanding... is the "+ 1 block" of a different size than the recordsize? With recordsize=1k:
>
> $ ls -ls foo
> 2261 -rw-r--r-- 1 root root 1048576 Sep 22 10:59 foo

Well, there it is. I suggest suitable guard bands.
-- richard

> 1024k vs 1130k
>
> -- Andrew Deason
Re: [zfs-discuss] ZFS Recv slow with high CPU
Tristan Ball wrote:
> OK, thanks for that. From reading the RFE, it sounds like having a faster machine on the receive side will be enough to alleviate the problem in the short term?

That's correct.

--matt
[zfs-discuss] rpool import when another rpool already mounted ?
Hi,

I've a situation that I can't find any answers to after searching the docs etc.

I'm testing a DR process of installing Solaris onto a ZFS mirror using rpool. Then I am breaking the rpool mirror, recreating the non-live half as newrpool, and restoring my backup to the non-live mirror disk via /mnt, /mnt/opt, etc. I then need to boot the server from the non-live (newrpool) disk and make it become rpool.

All the docs mention installing bootblk and booting off an alternate mirror, but not the situation I have. I've thought of booting cdrom -s and doing a zpool import rpool newrpool - would that work?

Cheers
Andy
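A hedged sketch of the import-and-rename approach (standard commands, but not a verified DR procedure; <BE> and <disk> are placeholders, and the direction of the rename assumes the restored pool is the one currently called newrpool):

  # Boot single-user from install media first.
  zpool import -f newrpool rpool                  # import the restored pool under the name rpool
  zpool set bootfs=rpool/ROOT/<BE> rpool          # point the boot property at the restored BE
  # Reinstall boot blocks on the new boot disk:
  #   SPARC: installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/<disk>s0
  #   x86:   installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/<disk>s0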
Re: [zfs-discuss] URGENT: very high busy and average service time with ZFS and USP1100
Cross-posting to zfs-discuss. This does not need to be on the confidential alias. It's a performance query - there's nothing confidential in here. Other folks post performance queries to zfs-discuss.

Forget %b - it's useless. It's not the bandwidth that's hurting you, it's the IOPS. One of the hot devices did 1515.8 reads per second, the other did over 500.

Is this Oracle?

You never actually tell us what the huge performance problem is - what's the workload, what's the delivered level of performance? IO service times in the 32-22 millisecond range are not great, but not the worst I've seen. Do you have any data that connects the delivered performance of the workload to an IO latency issue, or did the customer just run "iostat", see "100 %b", and assume this was the problem?

I need to see zpool stats. Is each of these c3tXX devices actually a raid 7+1 (which means 7 data disks and 1 parity disk)?

There's nothing here that tells us there's something that needs to be done on the ZFS side. Not enough data. It looks like a very lopsided IO load distribution problem. You have 8 LUN c3tXX devices, 2 of which are getting slammed with IOPS; the other 6 are relatively idle.

Thanks,
/jim

Javier Conde wrote:
> Hello,
>
> IHAC with a huge performance problem in a newly installed M8000 configured with a USP1100 and ZFS.
>
> From what we can see, 2 disks used by different zpools are 100% busy, and the average service time is also quite high (between 30 and 5 ms).
>
>     r/s    w/s     kr/s    kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
>     0.0   11.4      0.0   224.1   0.0   0.2     0.0    20.7   0   5  c3t5000C5000F94A607d0
>     0.0   11.8      0.0   224.1   0.0   0.3     0.0    24.2   0   6  c3t5000C5000F94E38Fd0
>     0.2    0.0     25.6     0.0   0.0   0.0     0.0     7.9   0   0  c3t60060E8015321F01321F0032d0
>     0.0    3.6      0.0    20.8   0.0   0.0     0.0     0.5   0   0  c3t60060E8015321F01321F0020d0
>     0.2   24.0     25.6   488.0   0.0   0.0     0.0     0.6   0   1  c3t60060E8015321F01321F001Cd0
>    11.4    0.8     92.8     8.0   0.0   0.0     0.0     3.9   0   4  c3t60060E8015321F01321F0019d0
>   573.4    0.0  73395.5     0.0   0.0  20.6     0.0    36.0   0 100  c3t60060E8015321F01321F000Bd0
>     0.8    0.8    102.4     8.0   0.0   0.0     0.0    22.8   0   4  c3t60060E8015321F01321F0008d0
>  1515.8   10.2  30420.9   148.0   0.0  34.9     0.0    22.9   1 100  c3t60060E8015321F01321F0006d0
>     0.4    0.4     51.2     1.6   0.0   0.0     0.0     5.1   0   0  c3t60060E8015321F01321F0055d0
>
> The USP1100 is configured with a raid 7+1, which is the default recommendation.
>
> The data transferred is not very high, between 50 and 150 MB/sec.
>
> Is it normal to see the disks busy at 100% all the time and the average time always greater than 30 ms? Is there something we can do from the ZFS side? We have followed the recommendations regarding the block size for the database file systems, we use 4 different zpools for the DB, indexes, redo log and archive logs, and vdev_cache_bshift is set to 13 (8k blocks)...
>
> Can someone help me to troubleshoot this issue?
>
> Thanks in advance and best regards,
>
> Javier
Re: [zfs-discuss] If you have ZFS in production, willing to share some details (with me)?
On 09/18/09 14:34, Jeremy Kister wrote:
> On 9/18/2009 1:51 PM, Steffen Weiberle wrote:
>> I am trying to compile some deployment scenarios of ZFS.
>> # of systems
>
> do zfs root count? or only big pools?

Non-root is more interesting to me. However, if you are sharing the root pool with your data, what you are running application-wise is still of interest.

>> amount of storage
>
> raw or after parity?

Either, and it's great if you indicate which.

Thanks for all the private responses. I am still compiling and cleansing them, and will summarize when I get their OKs!

Steffen
Re: [zfs-discuss] ZFS file disk usage
On Mon, 21 Sep 2009 18:20:53 -0400, Richard Elling wrote:
> On Sep 21, 2009, at 2:43 PM, Andrew Deason wrote:
>> On Mon, 21 Sep 2009 17:13:26 -0400, Richard Elling wrote:
>>> You don't know the max overhead for the file before it is allocated. You could guess at a max of 3x size + at least three blocks. Since you can't control this, it seems like the worst case is when copies=3.
>> Is that max with copies=3? Assume copies=1; what is it then?
> 1x size + 1 block.

That seems to differ quite a bit from what I've seen; perhaps I am misunderstanding... is the "+ 1 block" of a different size than the recordsize? With recordsize=1k:

$ ls -ls foo
2261 -rw-r--r-- 1 root root 1048576 Sep 22 10:59 foo

1024k vs 1130k

-- Andrew Deason
Re: [zfs-discuss] Migrate from iscsitgt to comstar?
cc'ing to storage-discuss, where this topic also came up recently.

By default, for most backing stores, COMSTAR will put its disk metadata in the first 64K of the backing store, as you say. So if you take a backing-store disk that is in use as an iscsitgt LUN and then run "sbdadm create-lu /path/to/backing/store", it will corrupt the data on the disk. Don't do this!

There are two enhancements that were introduced with the putback of PSARC 2009/251 in snv_115 that may be helpful. See stmfadm(1m) for details:

* If the backing store is a ZVOL, the metadata is stored in a special data object in the ZVOL rather than overwriting the first 64K of the ZVOL.
* The command "stmfadm -o meta=/path/to/metadata-file create-lu /path/to/backing/store" can be used to redirect the metadata to a named file on the target system.

Here is the relevant paragraph from stmfadm(1m):

  Logical units registered with the STMF require space for the metadata to be stored. When a zvol is specified as the backing store device, the default will be to use a special property of the zvol to contain the metadata. For all other devices, the default behavior will be to use the first 64k of the device. An alternative approach would be to use the meta property in a create-lu command to specify an alternate file to contain the metadata. It is advisable to use a file that can provide sufficient storage of the logical unit metadata, preferably 64k.

If you use the -o meta=file approach, remember that if the volume moves, its metadata must move along with it. Remembering this external linkage could become a long-term hassle. Some have opted to create new LUNs and then copy the data over, so they can remove their dependency on this external metadata file.

You asked only about migrating the *DATA* from iscsitgt to COMSTAR. This part is doable, given the above tools. What is not supported is automatic migration of the target and LUN *definitions* from iscsitgt to COMSTAR. The iscsitgt uses a "one target per LUN" model. The COMSTAR model is more like "all the LUNs visible through the same target", using initiator-specific views to control access. Creating an automated tool to go between these very different approaches would probably do more harm than good. You are better off creating a new set of LUN and target definitions to match the new environment. It is up to you.

Peter

On 09/21/09 04:29, Markus Kovero wrote:
> Is it possible to migrate data from iscsitgt to a comstar iscsi target? I guess comstar wants metadata at the beginning of the volume and this makes things difficult?
>
> Yours
> Markus Kovero
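For the zvol-backed case described above, the data-migration side can be as simple as this hedged sketch (assumes snv_115 or later; the zvol name is a placeholder that already holds the data previously exported by iscsitgt):

  # Hedged sketch -- 'tank/iscsivol' is a placeholder.
  # On snv_115+ the LU metadata goes into a zvol property, so the first 64K
  # of existing data is left untouched:
  stmfadm create-lu /dev/zvol/rdsk/tank/iscsivol
  stmfadm add-view <LU-GUID-from-create-lu>
  itadm create-target        # then define views/host groups to match the new model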
Re: [zfs-discuss] ZFS Recv slow with high CPU
OK, thanks for that.

From reading the RFE, it sounds like having a faster machine on the receive side will be enough to alleviate the problem in the short term? The hardware I'm using at the moment is quite old, and not particularly fast - although this is the first out-and-out performance limitation I've had with using it as an opensolaris storage system.

Regards,
Tristan.

Matthew Ahrens wrote:
> Tristan Ball wrote:
>> Hi everyone,
>>
>> I have a couple of systems running opensolaris b118, one of which sends hourly snapshots to the other. This has been working well, however as of today, the receiving zfs process has started running extremely slowly, and is running at 100% CPU on one core, completely in kernel mode. A little bit of exploration with lockstat and dtrace seems to imply that the issue is around the "dbuf_free_range" function - or at least, that's what it looks like to my inexperienced eye!
>
> This is probably RFE 6812603 "zfs send can aggregate free records", which is currently being worked on.
>
> --matt
Re: [zfs-discuss] lots of zil_clean threads
Hi Neil and all,

thank you very much for looking into this:

> So I don't know what's going on. What is the typical call stack for those zil_clean() threads?

I'd say they are all blocking on their respective CVs:

  ff0009066c60 fbc2c0300        0  60 ff01d25e1180
    PC: _resume_from_idle+0xf1    TASKQ: zil_clean
    stack pointer for thread ff0009066c60: ff0009066b60
    [ ff0009066b60 _resume_from_idle+0xf1() ]
      swtch+0x147()
      cv_wait+0x61()
      taskq_thread+0x10b()
      thread_start+8()

I should add that I have quite a lot of datasets:

  r...@haggis:~# zfs list -r -t filesystem | wc -l
        49
  r...@haggis:~# zfs list -r -t volume | wc -l
        14
  r...@haggis:~# zfs list -r -t snapshot | wc -l
      6018

Nils
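A quick hedged way to line the two counts up (the mdb output format matches the ::threadlist -v style shown above; treating every such taskq thread as dataset-related is an assumption, not something confirmed here):

  # Hedged sketch: compare dataset counts with zil_clean taskq threads.
  zfs list -H -r -t filesystem | wc -l
  zfs list -H -r -t volume | wc -l
  echo "::threadlist -v" | mdb -k | grep -c "TASKQ: zil_clean"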