Re: [zfs-discuss] Resilver/scrub times?
On my home server (currently having problems with random reboots), it takes around 1.5hrs to do a scrub of my RAIDZ1 6 x 1.5TB array, with around 2TB of data on it. Specs are:
CPU: Core 2 Duo 2.5GHz
RAM: 2GB 800MHz DDR2
OS disks: 120GB Seagate ATA
Storage drives: 6 x 1.5TB Seagate SATA2 7200rpm
-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
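For anyone comparing scrub times, a quick sketch of kicking off a scrub and watching its progress and ETA (the pool name "tank" is a placeholder; substitute your own):

```shell
# Start a scrub of the pool (runs in the background)
zpool scrub tank

# Check progress, throughput, and estimated completion;
# -v also lists any files with checksum errors found so far
zpool status -v tank
```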
[zfs-discuss] Kernel panic - directed here from networking
Hi, I have been having problems with reboots. It usually happens when I am either sending or receiving data on the server, over CIFS, HTTP, or NNTP. So it could be a networking problem, but they directed me here or to CIFS; since it also happens when I'm not using CIFS (but the service is still running), it's probably not CIFS. I have checked for faulty RAM: memtest86+ (4.0) ran through multiple times without problems. The previous thread is http://opensolaris.org/jive/thread.jspa?threadID=116843

I have had 2 reboots today, within 10 minutes of each other. The previous 2 crashes produced the following:

r...@nas:/var/crash/NAS# echo '$c' | mdb -k 11
page_create_va+0x314(fbc30210, ff016060d000, 2, 53, ff00048c25d0, ff016060d000)
segkmem_page_create+0x8d(ff016060d000, 2, 4, fbc30210)
segkmem_xalloc+0xc0(ff0146e1f000, 0, 2, 4, 0, fb880cb8)
segkmem_alloc_vn+0xcd(ff0146e1f000, 2, 4, fbc30210)
segkmem_alloc+0x24(ff0146e1f000, 2, 4)
vmem_xalloc+0x546(ff0146e2, 2, 1000, 0, 0, 0)
vmem_alloc+0x161(ff0146e2, 2, 4)
kmem_slab_create+0x81(ff014890f858, 4)
kmem_slab_alloc+0x5b(ff014890f858, 4)
kmem_cache_alloc+0x130(ff014890f858, 4)
zio_buf_alloc+0x2c(2)
vdev_queue_io_to_issue+0x42f(ff014c9985a8, 23)
vdev_queue_io_done+0x61(ff014d1180a8)
zio_vdev_io_done+0x62(ff014d1180a8)
zio_execute+0xa0(ff014d1180a8)
taskq_thread+0x1b7(ff014c716688)
thread_start+8()

r...@nas:/var/crash/NAS# echo '$c' | mdb -k 12
fsflush_do_pages+0x1e4()
fsflush+0x3a6()
thread_start+8()

Any help on finding out the problem would be great. Thanks
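Beyond the `$c` backtrace, a few more mdb dcmds usually help narrow down a panic like this. A sketch, assuming the same saved dumps (11 and 12) under /var/crash/NAS as in the transcript above:

```shell
# Panic string and dump summary for crash dump 11
cd /var/crash/NAS
echo '::status' | mdb -k 11

# Kernel message buffer leading up to the panic
# (often shows memory-exhaustion or driver warnings just before the crash)
echo '::msgbuf' | mdb -k 11

# Registers and the panicking thread
echo '::panicinfo' | mdb -k 11
```

Since the first trace dies in `page_create_va` under `zio_buf_alloc`, `::msgbuf` is worth checking for signs of kernel memory pressure; with only 2GB of RAM that is a plausible suspect.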
Re: [zfs-discuss] Pulsing write performance
Hi, this happens on OpenSolaris builds 101b and 111b. ARC max size is set to 6GB, joined to a Windows 2003 R2 AD domain, with a pool of 4 x 15Krpm drives in a 2-way mirror. The bnx driver has been changed to have offloading enabled; not much else has been changed.

Ok, so when the cache fills and needs to be flushed, the flush locks access to it, so no reads or writes can occur from the cache, and as everything goes through the ARC, nothing can happen until the ARC has finished its flush. To compensate for this, I would have to reduce the cache to a size small enough that the disk array can write it out fast enough that the pauses are no longer really noticeable. Wouldn't that then also impact the overall burst write performance?

Why doesn't the ARC allow writes while flushing? Or just have 2 caches, so that one can keep taking writes while the other flushes. If it allowed writes to the buffer while it was flushing, it would just reduce the write speed down to what the disks can handle, wouldn't it?

Anyway, thanks for the info. I will give that parameter a go and see how it works. Thanks
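For reference, a minimal sketch of the kind of tunable being discussed, assuming the OpenSolaris-era write-throttle override (the tunable name and the 512MB value are assumptions for builds around 101-111; verify both against your build before using):

```
* /etc/system fragment: cap the amount of dirty data accepted per txg,
* so each flush is smaller and the write pauses shorter
* (name and 512MB value are assumptions; takes effect after a reboot)
set zfs:zfs_write_limit_override = 0x20000000
```

The trade-off is exactly the one raised above: a smaller per-txg limit shortens each pause at the cost of lower burst write throughput.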
Re: [zfs-discuss] Pulsing write performance
Ok, so by limiting the write cache to that of the controller you were able to remove the pauses? How did that affect your overall write performance, if at all? Thanks, I will give that a go. David
Re: [zfs-discuss] Pulsing write performance
I don't have any Windows machines connected to it over iSCSI (yet). My reference to the Windows servers was that the same hardware running Windows doesn't have these problems with its reads and writes, so it isn't the hardware causing it. But when I do eventually get iSCSI going, I will send a message if I have the same problems. Also, with your replication, what's the performance like? Does having it enabled impact the overall write performance of your server? Is the replication continuous? David
[zfs-discuss] Pulsing write performance
Hi, I was directed here after posting in CIFS discuss (as I first thought that it could be a CIFS problem). I posted the following in CIFS:

When using Iometer from Windows against the file share on OpenSolaris snv_101 and snv_111, I get pauses roughly every 5 seconds of around 5 seconds (maybe a little less) where no data is transferred. When data is transferred, it is at a fair speed and gets around 1000-2000 IOPS with 1 thread (depending on the work type). The maximum read response time is 200ms and the maximum write response time is 9824ms, which is very bad: an almost 10 second delay in being able to send data to the server. This has been experienced on 2 test servers. The same servers have also been tested with Windows Server 2008 and haven't shown this problem (the share performance was slightly lower than CIFS, but it was consistent, and the average and maximum access times were very close).

I just noticed that if the server hasn't hit its target ARC size, the pauses are maybe 0.5 seconds, but as soon as it hits its ARC target, the IOPS drop to around 50% of what they were and the pauses grow to around 4-5 seconds, and after every pause the performance slows even more. So it appears it is definitely server side. This is with 100% random IO with a spread of 33% write / 66% read, 2KB blocks, over a 50GB file, no compression, and a 5.5GB target ARC size.

I have also just run some tests with different IO patterns:
- 100% sequential writes produce a consistent 2100 IOPS, except when it pauses for maybe 0.5 seconds every 10-15 seconds.
- 100% random writes produce around 200 IOPS with a 4-6 second pause around every 10 seconds.
- 100% sequential reads produce around 3700 IOPS with no pauses, just random peaks in response time (only 16ms) after about 1 minute of running, so nothing to complain about.
- 100% random reads produce around 200 IOPS, with no pauses.

So it appears that writes cause the problem. What is causing these very long write delays?
A network capture shows that the server doesn't respond to the write from the client when these pauses occur. Also, when using Iometer, the initial file creation doesn't have any pauses, so it might only happen when modifying files. Any help on finding a solution to this would be really appreciated. David
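One way to confirm the stalls are server-side flush cycles rather than network issues is to watch pool and device activity while the Iometer run is going. A sketch (pool name "tank" is a placeholder):

```shell
# Pool-level throughput once a second while the test runs;
# a flush-induced stall shows up as write bursts followed by
# near-zero intervals that line up with the client-side pauses.
zpool iostat -v tank 1

# In another terminal, per-device stats: asvc_t (service time) and
# %b (busy) show whether the disks saturate during each burst.
iostat -xn 1
```

If the disks sit idle during the client-visible pauses, that points at throttling in the write path rather than at the disks themselves.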