Re: [zfs-discuss] Resilver/scrub times?

2009-11-22 Thread David Bond
On my home server (currently having problems with random reboots), it takes 
around 1.5hrs to do a scrub of my RAIDZ1 6 x 1.5TB array, with around 2TB of 
data on it.

Specs are:
CPU: Core 2 Duo 2.5GHz
RAM: 2GB 800MHz DDR2
OS disks: 120GB Seagate ATA
Storage drives: 6 x 1.5TB Seagate SATA2 7200rpm
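For anyone comparing numbers: the scrub can be started and its progress (percent done
and estimated time to completion) watched while it runs. A minimal sketch, assuming a
pool named tank (substitute your own pool name):

# zpool scrub tank
# zpool status tank

The status output reports how far the scrub has got, an estimate of the time
remaining, and any checksum errors found along the way.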


[zfs-discuss] Kernel panic - directed here from networking

2009-11-21 Thread David Bond
Hi,

I have been having problems with reboots. They usually happen when I am either
sending or receiving data on the server, whether over CIFS, HTTP, or NNTP. So it
could be a networking problem, but the networking list directed me here or to CIFS;
as it also happens when I'm not using CIFS (though the service is still running),
it's probably not CIFS. I have checked for faulty RAM with memtest86+ (4.0), which
ran through multiple passes without error.

The previous thread is http://opensolaris.org/jive/thread.jspa?threadID=116843

I have had 2 reboots today, within 10 minutes of each other.
The previous 2 crashes produced the following:

r...@nas:/var/crash/NAS# echo '$c' | mdb -k 11
page_create_va+0x314(fbc30210, ff016060d000, 2, 53,
ff00048c25d0, ff016060d000)
segkmem_page_create+0x8d(ff016060d000, 2, 4, fbc30210)
segkmem_xalloc+0xc0(ff0146e1f000, 0, 2, 4, 0, fb880cb8)
segkmem_alloc_vn+0xcd(ff0146e1f000, 2, 4, fbc30210)
segkmem_alloc+0x24(ff0146e1f000, 2, 4)
vmem_xalloc+0x546(ff0146e2, 2, 1000, 0, 0, 0)
vmem_alloc+0x161(ff0146e2, 2, 4)
kmem_slab_create+0x81(ff014890f858, 4)
kmem_slab_alloc+0x5b(ff014890f858, 4)
kmem_cache_alloc+0x130(ff014890f858, 4)
zio_buf_alloc+0x2c(2)
vdev_queue_io_to_issue+0x42f(ff014c9985a8, 23)
vdev_queue_io_done+0x61(ff014d1180a8)
zio_vdev_io_done+0x62(ff014d1180a8)
zio_execute+0xa0(ff014d1180a8)
taskq_thread+0x1b7(ff014c716688)
thread_start+8()

r...@nas:/var/crash/NAS# echo '$c' | mdb -k 12
fsflush_do_pages+0x1e4()
fsflush+0x3a6()
thread_start+8()
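
For reference, the stacks above come from loading the savecore dumps in
/var/crash/NAS with mdb. If more context would help, here are a few dcmds that
usually give useful detail when posting a panic; a sketch, assuming the dumps were
saved as unix.11/vmcore.11 and so on:

# cd /var/crash/NAS
# mdb 11
> ::status
> ::msgbuf
> $c

::status prints the panic string, ::msgbuf dumps the kernel message buffer leading
up to the panic, and $c is the stack already shown above.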

Any help on finding out the problem would be great.

Thanks


Re: [zfs-discuss] Pulsing write performance

2009-08-29 Thread David Bond
Hi,

This happens on OpenSolaris builds 101b and 111b.
The ARC max is set to 6GB, the server is joined to a Windows 2003 R2 AD domain, and
the pool is 4 x 15Krpm drives in a 2-way mirror configuration.
The bnx driver has been changed to have offloading enabled.

Not much else has been changed.

OK, so when the cache fills and needs to be flushed, the flush locks access to it, so
no reads or writes can be served from the cache, and since everything goes through
the ARC, nothing can happen until the ARC has finished its flush.

To compensate for this, I would have to reduce the cache size to one small enough
that the disk array can flush it quickly enough for the pauses to no longer be
noticeable.

Wouldn't that also hurt the overall burst write performance? Why doesn't the ARC
allow writes while flushing, or use two caches so that one can keep accepting writes
while the other flushes? If it allowed writes to the buffer while it was flushing,
it would just reduce the write speed down to what the disks can handle, wouldn't it?

Anyway, thanks for the info. I will give that parameter a go and see how it works.
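
For anyone who hits the same pulsing later: the parameter isn't named in the quoted
reply, so this is only my assumption, but I take it to mean the TXG write throttle
limit, zfs_write_limit_override. A rough sketch of setting it, using 512MB purely as
an example value, either persistently in /etc/system:

set zfs:zfs_write_limit_override = 0x20000000

or live, without a reboot:

# echo zfs_write_limit_override/Z 0x20000000 | mdb -kw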

Thanks


Re: [zfs-discuss] Pulsing write performance

2009-08-29 Thread David Bond
Ok,

So by limiting the write cache to the size of the controller's cache you were able
to remove the pauses?

How did that affect your overall write performance, if at all?

Thanks, I will give that a go.
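
When I do, I'll compare the before/after throughput by watching the pool while
Iometer runs, with something like the following (assuming a pool named tank):

# zpool iostat -v tank 1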

David


Re: [zfs-discuss] Pulsing write performance

2009-08-29 Thread David Bond
I don't have any Windows machines connected to it over iSCSI (yet).

My point about the Windows servers was that the same hardware running Windows
doesn't show these read/write problems, so it isn't the hardware causing it.

But when I do eventually get iSCSI going I will send a message if I have the same
problems.

Also, regarding your replication: what's the performance like? Does having it
enabled impact the overall write performance of your server, and is the replication
continuous?

David


[zfs-discuss] Pulsing write performance

2009-08-27 Thread David Bond
Hi,

I was directed here after posting in CIFS discuss (as I first thought that it could
be a CIFS problem).

I posted the following in CIFS:

When using Iometer from Windows against the file share on OpenSolaris snv_101 and
snv_111, I get pauses of around 5 seconds (maybe a little less) every 5 seconds
during which no data is transferred. When data is transferred it is at a fair speed,
around 1000-2000 IOPS with 1 thread (depending on the work type). The maximum read
response time is 200ms and the maximum write response time is 9824ms, which is very
bad: an almost 10 second delay before being able to send data to the server.
This has been experienced on 2 test servers. The same servers have also been tested
with Windows Server 2008 and they haven't shown this problem (the share performance
was slightly lower than CIFS, but it was consistent, and the average and maximum
access times were very close).


I just noticed that if the server hasn't hit its target ARC size, the pauses are
only around 0.5 seconds, but as soon as it hits the ARC target, the IOPS drop to
around 50% of what they were and the longer pauses of around 4-5 seconds appear,
and after every pause the performance slows even more. So it appears it is
definitely server side.

This is with 100% random IO, a spread of 33% write / 66% read, 2KB blocks, over a
50GB file, no compression, and a 5.5GB target ARC size.
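
For anyone wanting to reproduce this, the ARC size and its current target can be
watched with kstat while the test runs, e.g. sampled every 5 seconds:

# kstat -p zfs:0:arcstats:size zfs:0:arcstats:c 5

In my runs the drop in IOPS lines up with the point where size reaches c.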



Also, I have just run some tests with different IO patterns. 100% sequential writes
produce a consistent 2100 IOPS, except for pauses of maybe 0.5 seconds every 10-15
seconds.

100% random writes produce around 200 IOPS with a 4-6 second pause around every 
10 seconds.

100% sequential reads produce around 3700 IOPS with no pauses, just occasional peaks
in response time (only 16ms) after about 1 minute of running, so nothing to complain
about.

100% random reads produce around 200 IOPS, with no pauses.

So it appears that writes are the problem. What is causing these very long write
delays?

A network capture shows that the server doesn't respond to writes from the client
when these pauses occur.
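
To see whether the disks are actually busy during these pauses or sitting idle,
per-device service times can be watched alongside the Iometer run, e.g.:

# iostat -xnz 1

If the disks show heavy writes and high service times exactly while the share
stalls, that would point at the flush of cached writes rather than at the network
path.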

Also, when using Iometer, the initial file creation doesn't show any pauses, so it
might only happen when modifying existing files.

Any help on finding a solution to this would be really appreciated.

David