[zfs-discuss] ZFS send/recv horribly slow on system with 1800+ filesystems
Hi all,

I have a test system with a large number of filesystems which we take snapshots of and do send/recvs with. On our test machine, we have 1800+ filesystems and about 5,000 snapshots. The system has 48GB of RAM and 8 cores (x86). The pool is comprised of 2 regular 1TB drives in a mirror, with a 320GB FusionIO flash card acting as a ZIL and read cache.

We've noticed that on systems with just a handful of filesystems, ZFS send (recursive) is quite quick, but on our 1800+ fs box, it's horribly slow. For example:

root@testbox:~# zfs send -R chunky/0@async-2011-02-28-15:11:20 | pv -i 1 > /dev/null
2.51GB 0:04:57 [47.4kB/s] [=  ]
^C

The other odd thing I've noticed is that during the 'zfs send' to /dev/null, zpool iostat shows we're actually *writing* to the zpool at a rate of 4MB-8MB/s, but reading almost nothing. How can this be the case?

So I'm left with 2 questions:

1.) Does ZFS get immensely slow once we have thousands of filesystems?
2.) Why do we see 4MB-8MB/s of *writes* to the pool when we do a 'zfs send' to /dev/null?

-Moazam
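In case anyone wants to reproduce this, here is roughly how I've been measuring it (a rough sketch only; the pool and snapshot names are the ones from the example above, and the per-filesystem loop is just my own way of checking whether the slowness is per-dataset overhead rather than raw throughput):

  # In another terminal, watch pool-level reads vs. writes while the send runs
  zpool iostat -v chunky 5

  # Time the full recursive send to /dev/null
  zfs send -R chunky/0@async-2011-02-28-15:11:20 | pv -i 1 > /dev/null

  # Rough check of per-dataset overhead: send a sample of individual
  # snapshots and time each stream separately
  zfs list -H -r -o name -t filesystem chunky/0 | head -20 | while read fs; do
      echo "== $fs"
      ptime zfs send "$fs@async-2011-02-28-15:11:20" > /dev/null
  done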
[zfs-discuss] ZFS send/receive while write is enabled on receive side?
Hi all,

From much of the documentation I've seen, the advice is to set readonly=on on volumes on the receiving side during send/receive operations. Is this still a requirement? I've been trying the send/receive while NOT setting the receiver to readonly and haven't seen any problems, even though we're traversing and ls'ing the dirs within the receiving volume during the send/recv.

So, is it OK to send/recv while having the receive volume write enabled?

-Moazam
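For reference, this is the pattern I understood the docs to recommend (a sketch only; 'tank/backup' and 'recvhost' are made-up names): keep the receiving dataset read-only and unmounted so nothing on the target side touches it between incrementals.

  # On the receiving host: make the target dataset read-only
  zfs set readonly=on tank/backup

  # On the sending host: stream the recursive snapshot across,
  # forcing a rollback on the receiver and leaving it unmounted
  zfs send -R chunky/0@async-2011-02-28-15:11:20 | \
      ssh recvhost zfs receive -Fdu tank/backup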
Re: [zfs-discuss] OCZ RevoDrive ZFS support
Agreed, SSDs with SandForce controllers are the only way to go. The controller makes a world of difference.

-Moazam

On Sat, Nov 27, 2010 at 12:27 PM, Tim Cook t...@cook.ms wrote:

On Sat, Nov 27, 2010 at 2:16 PM, Orvar Korvar knatte_fnatte_tja...@yahoo.com wrote:

Your system drive on a Solaris system generally doesn't see enough I/O activity to require the kind of IOPS you can get out of most modern SSDs.

My system drive sees a lot of activity, to the degree that everything is going slow. I have a SunRay that my girlfriend uses, and I have 5-10 torrents going on, and surf the web - often my system crawls. Very often my girlfriend gets irritated because everything lags, and she frequently asks me if she can do some task or if she should wait until I have finished copying my files. Unbearable. I have a quad core Intel 9450 at 2.66GHz and 8GB RAM. I am planning to use an SSD and really hope it will be faster.

$ iostat -xcnXCTdz 1
     cpu
 us sy wt id
 25  7  0 68
                    extended device statistics
    r/s    w/s    kr/s    kw/s  wait  actv wsvc_t asvc_t  %w  %b device
    0,0    0,0     0,0     0,0   0,0   0,0    0,0    0,0   0   0 c8
    0,0    0,0     0,0     0,0   0,0   0,0    0,0    0,0   0   0 c8t0d0
   37,0  442,1  4489,6 51326,1   7,5   2,0   15,7    4,1  98 100 c7d0

Desktop usage is a different beast, as I alluded to. A dedicated server typically doesn't have any issues. I'd strongly suggest getting one of the SandForce controller based SSDs. They're the best on the market right now by far.

--Tim
Re: [zfs-discuss] WarpDrive SLP-300
I tested the Fusion IO MLC based PCI-e cards on OpenSolaris and found the performance to be amazing. Hopefully Fusion IO will release supported drivers for Solaris 11 Express and onwards. The Fusion IO MLC card was giving me around 500MB/s write performance with O_SYNC.

-Moazam

On Wed, Nov 17, 2010 at 7:49 PM, Fred Liu fred_...@issi.com wrote:

http://www.lsi.com/channel/about_channel/whatsnew/warpdrive_slp300/index.html

Good stuff for ZFS.

Fred
Re: [zfs-discuss] X4540 RIP
I have this with 36 2TB drives (and 2 separate boot drives):

http://www.colfax-intl.com/jlrid/SpotLight_more_Acc.asp?L=134S=58B=2267

It's not exactly the same (it has cons/pros), but it is definitely less expensive. I'm running b147 on it with an LSI controller.

-Moazam

On Mon, Nov 8, 2010 at 7:22 PM, Ian Collins i...@ianshome.com wrote:

Oracle have deleted the best ZFS platform I know, the X4540. Does anyone know of an equivalent system? None of the current Oracle/Sun offerings come close.

-- Ian.
Re: [zfs-discuss] [OpenIndiana-discuss] format dumps the core
I'm having the same problem after adding 2 SSD disks to my machine. The controller is an LSI SAS9211-8i PCI Express.

# format
Searching for disks...Arithmetic Exception (core dumped)

# pstack core.format.1016
core 'core.format.1016' of 1016: format
 fee62e4a UDiv     (4, 0, 8046bf0, 8046910, 80469a0, 80469c0) + 2a
 08079799 auto_sense (4, 0, 8046bf0, 1c8) + 281
 080751a6 add_device_to_disklist (8047930, 8047530, feffb8f4, 804716c) + 62a
 080746ff do_search (0, 1, 8047d98, 8066576) + 273
 0806658d main     (1, 8047dd0, 8047dd8, 8047d8c) + c1
 0805774d _start   (1, 8047e88, 0, 8047e8f, 8047e99, 8047ead) + 7d

I'm on b147.

# uname -a
SunOS geneva5 5.11 oi_147 i86pc i386 i86pc Solaris

On Tue, Nov 2, 2010 at 7:17 AM, Joerg Schilling joerg.schill...@fokus.fraunhofer.de wrote:

Roy Sigurd Karlsbakk r...@karlsbakk.net wrote:

Hi all (crossposting to zfs-discuss)

This error also seems to occur on osol 134. Any idea what this might be?

ioctl(4, USCSICMD, 0x08046910) = 0
ioctl(4, USCSICMD, 0x08046900) = 0
ioctl(4, USCSICMD, 0x08046570) = 0
ioctl(4, USCSICMD, 0x08046570) = 0
    Incurred fault #8, FLTIZDIV  %pc = 0xFEE62E4A
      siginfo: SIGFPE FPE_INTDIV addr=0xFEE62E4A
    Received signal #8, SIGFPE [default]
      siginfo: SIGFPE FPE_INTDIV addr=0xFEE62E4A
r...@tos-backup:~#

You need to find out at which source line the integer division by zero occurs.

Jörg

--
EMail: jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       j...@cs.tu-berlin.de (uni)
       joerg.schill...@fokus.fraunhofer.de (work)
Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Re: [zfs-discuss] [OpenIndiana-discuss] format dumps the core
Fixed! It turns out the problem was that we pulled these two disks from a Linux box and they were formatted with ext3 on partition 0 for the whole disk, which was somehow causing 'format' to freak out.

So, we fdisk'ed the p0 slice to delete the Linux partition and then created a SOLARIS2 type partition on it. It worked, and no more crash during the format command.

Cindy, please let the format team know about this since I'm sure others will also run into this problem at some point if they have a mixed Linux/Solaris environment.

-Moazam

On Tue, Nov 2, 2010 at 3:15 PM, Cindy Swearingen cindy.swearin...@oracle.com wrote:

Hi Moazam,

The initial diagnosis is that the LSI controller is reporting bogus information. It looks like Roy is using a similar controller. You might report this problem to LSI, but I will pass this issue along to the format folks.

Thanks,
Cindy
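For anyone hitting the same crash, the repair we did was roughly the following (a sketch from memory; c8t1d0 is just a placeholder for whichever disk carries the leftover Linux partition):

  # Confirm the disk still carries a foreign (Linux/ext3) fdisk partition
  fdisk -W - /dev/rdsk/c8t1d0p0

  # Wipe the foreign table and lay down a single Solaris2 partition
  # spanning the whole disk, non-interactively
  fdisk -B /dev/rdsk/c8t1d0p0

  # 'format' should now enumerate the disk without dumping core
  format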
Re: [zfs-discuss] ZFS and RAM
The following is a good explanation:

http://blogs.sun.com/brendan/entry/test

-Moazam

On Sun, Oct 24, 2010 at 10:42 AM, besson3c j...@netmusician.org wrote:

Can somebody kindly clarify how Solaris and ZFS make use of RAM? I have 4 GB of RAM installed on my Solaris/ZFS box serving a pool of 6 disks. I can see that it is using all of this memory, although the machine has never had to dip into swap space.

What would be the net effect of adding more RAM? Would it just use all of that RAM and pretty much accomplish nothing, or will adding more RAM increase the size of my buffers and provide me with faster I/O that gets backlogged less frequently? How can I really tell whether my machine is RAM starved?

--
This message posted from opensolaris.org
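Not a full answer, but a couple of stock commands that show where the memory is actually going (no third-party tools needed; which fields matter is noted in the comments):

  # ARC kstats: 'size' is the current ARC size, 'c' the current target,
  # and 'c_max' the ceiling the ARC will grow toward
  kstat -n arcstats

  # Kernel debugger view of the same numbers, nicely summarized
  echo "::arc" | mdb -k

  # System-wide breakdown (kernel, ZFS file data, anon, free) to judge
  # whether the box is genuinely memory starved
  echo "::memstat" | mdb -k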
[zfs-discuss] ZFS COW and simultaneous read write of files
Hi all,

I have a ZFS question related to COW and scope. If user A is reading a file while user B is writing to the same file, when do the changes introduced by user B become visible to everyone? Is there a block level scope, or file level, or something else?

Thanks!
[zfs-discuss] ZFS host to host replication with AVS?
Hi all,

I'm trying to accomplish server to server storage replication in synchronous mode, where each server is a Solaris/OpenSolaris machine with its own local storage. For Linux, I've been able to achieve what I want with DRBD, but I'm hoping I can find a similar solution on Solaris so that I can leverage ZFS. It seems that solution is Sun Availability Suite (AVS)?

One of the major concerns I have is what happens when the primary storage server fails. Will the secondary take over automatically (using some sort of heartbeat mechanism)? Once the secondary node takes over, can it fail back to the primary node once the primary node is back? My concern is that AVS is not able to repair the primary node after it has failed, as per the conversation in this forum:

http://discuss.joyent.com/viewtopic.php?id=19096

"AVS is essentially one-way replication. If your primary fails, your secondary can take over as the primary but the disks remain in the secondary state. There is no way to reverse the replication while the secondary is acting as the primary."

Is AVS even the right solution here, or should I be looking at some other technology? Thanks.

-Moazam
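To frame the question, this is my (possibly wrong) understanding of what the AVS/SNDR side of the setup would look like; the hostnames and device paths below are made up, and I'm not sure the failback step works the way I'm hoping:

  # On both hosts: enable a synchronous SNDR replica of one slice.
  # Arguments: primary host / data device / bitmap device, then the
  # same for the secondary, then transport (ip) and mode (sync).
  sndradm -e primhost /dev/rdsk/c1t0d0s0 /dev/rdsk/c1t0d0s1 \
          sechost  /dev/rdsk/c1t0d0s0 /dev/rdsk/c1t0d0s1 ip sync

  # Check replication status
  sndradm -P

  # After the primary is repaired, my hope is that a reverse update
  # sync pushes the secondary's changes back to it:
  sndradm -n -u -r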
Re: [zfs-discuss] Re: Thumper Origins Q
Well, he did say fairly cheap. The ST 3511 is about $18.5k. That's about the same price as the low-end NetApp FAS250 unit.

-Moazam

On Jan 24, 2007, at 9:40 AM, Richard Elling wrote:

Peter Eriksson wrote:

too much of our future roadmap, suffice it to say that one should expect much, much more from Sun in this vein: innovative software and innovative hardware working together to deliver world-beating systems with undeniable economics.

Yes please. Now give me a fairly cheap (but still quality) FC-attached JBOD utilizing SATA/SAS disks and I'll be really happy! :-)

... with write cache and dual redundant controllers? I think we call that the Sun StorageTek 3511.

-- richard
[zfs-discuss] Enhance 1U eSATA storage device and Solaris 10?
Hi all,

I'm thinking of using an Enhance 1U R4 SA eSATA storage device with my Solaris 10 based Sun X2100 box. Has anyone used this before, and do you have any experiences you could share? The device starts at about $450.

http://www.enhance-tech.com/products/diskarrays/R4SA.html

Thanks.

-M