Re: [zfs-discuss] Extremely Slow ZFS Performance

2011-05-06 Thread Garrett D'Amore
Sounds like a nasty bug, and not one I've seen in illumos or
NexentaStor.  What build are you running?

- Garrett

On Wed, 2011-05-04 at 15:40 -0700, Adam Serediuk wrote:
> Dedup is disabled (confirmed). Doing some digging, it looks like
> this is a very similar issue
> to http://forums.oracle.com/forums/thread.jspa?threadID=2200577&tstart=0.
> 
> 
> 
> On May 4, 2011, at 2:26 PM, Garrett D'Amore wrote:
> 
> > My first thought is dedup... perhaps you've got dedup enabled and
> > the DDT no longer fits in RAM?  That would create a huge performance
> > cliff.
> > 
> > -----Original Message-----
> > From: zfs-discuss-boun...@opensolaris.org on behalf of Eric D.
> > Mudama
> > Sent: Wed 5/4/2011 12:55 PM
> > To: Adam Serediuk
> > Cc: zfs-discuss@opensolaris.org
> > Subject: Re: [zfs-discuss] Extremely Slow ZFS Performance
> > 
> > On Wed, May  4 at 12:21, Adam Serediuk wrote:
> > >Both iostat and zpool iostat show very little to zero load on the
> > >devices even while blocking.
> > >
> > >Any suggestions on avenues of approach for troubleshooting?
> > 
> > is 'iostat -en' error free?
> > 
> > 
> > --
> > Eric D. Mudama
> > edmud...@bounceswoosh.org
> > 


Re: [zfs-discuss] Extremely Slow ZFS Performance

2011-05-04 Thread Adam Serediuk
On May 4, 2011, at 4:16 PM, Victor Latushkin wrote:

> Try
> 
> echo metaslab_debug/W1 | mdb -kw
> 
> If it does not help, reset it back to zero 
> 
> echo metaslab_debug/W0 | mdb -kw

That appears to have resolved the issue! Within seconds of making the change, 
performance increased by an order of magnitude. I was typing the reply below 
when your message came in. Is this bug 7000208?
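
For context, metaslab_debug=1 keeps metaslab space maps loaded in memory 
instead of loading and unloading them around each allocation, which can stall 
writes on a full or fragmented pool. A minimal sketch of inspecting the 
current value and making the change persist across reboots, assuming the 
tunable lives in the zfs kernel module on this build:

    # print the current value (decimal)
    echo metaslab_debug/D | mdb -k

    # persist across reboots: add this line to /etc/system
    set zfs:metaslab_debug = 1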

On May 4, 2011, at 4:01 PM, Garrett D'Amore wrote:

> Sounds like a nasty bug, and not one I've seen in illumos or
> NexentaStor.  What build are you running?


Running snv_151a.

Running some synthetic tests right now and comparing the various stats, one 
thing that stands out as very different on this system compared to our others 
is that writes seem to be going to ~5 mirror sets at a time (of 22 configured). 
The next batch of writes moves on to the next ~5 mirror sets, and so forth, 
cycling around. The other systems write to many more mirror sets 
simultaneously. This particular machine does not appear to be buffering writes; 
it appears to be doing everything sync to disk despite having sync/zil 
disabled.
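
A simple way to watch how writes fan out across the mirror vdevs is per-vdev 
iostat (a sketch; pool1 is the pool named in the original report):

    # per-vdev throughput, sampled every 3 seconds
    zpool iostat -v pool1 3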

I'm trying to do a little more introspection into the zpool thread that is 
using CPU, but I'm not having much luck finding anything meaningful. 
Occasionally the CPU usage for that thread drops, and when it does, 
performance of the filesystem increases.
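
One way to see where that thread spends its time is a kernel profiling probe 
filtered on the pool's process name (a sketch, assuming the threads show up 
under zpool-pool1 as in prstat; let it run a few seconds, then Ctrl-C and 
read the hottest stacks at the bottom of the output):

    # sample kernel stacks at 997 Hz and count the hottest ones
    dtrace -n 'profile-997 /execname == "zpool-pool1"/ { @[stack()] = count(); }'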


> On Wed, 2011-05-04 at 15:40 -0700, Adam Serediuk wrote:
>> Dedup is disabled (confirmed). Doing some digging, it looks like
>> this is a very similar issue
>> to http://forums.oracle.com/forums/thread.jspa?threadID=2200577&tstart=0.
>> 
>> 
>> 
>> On May 4, 2011, at 2:26 PM, Garrett D'Amore wrote:
>> 
>>> My first thought is dedup... perhaps you've got dedup enabled and
>>> the DDT no longer fits in RAM?  That would create a huge performance
>>> cliff.
>>> 
>>> -----Original Message-----
>>> From: zfs-discuss-boun...@opensolaris.org on behalf of Eric D.
>>> Mudama
>>> Sent: Wed 5/4/2011 12:55 PM
>>> To: Adam Serediuk
>>> Cc: zfs-discuss@opensolaris.org
>>> Subject: Re: [zfs-discuss] Extremely Slow ZFS Performance
>>> 
>>> On Wed, May  4 at 12:21, Adam Serediuk wrote:
>>>> Both iostat and zpool iostat show very little to zero load on the
>>>> devices even while blocking.
>>>> 
>>>> Any suggestions on avenues of approach for troubleshooting?
>>> 
>>> is 'iostat -en' error free?
>>> 
>>> 
>>> --
>>> Eric D. Mudama
>>> edmud...@bounceswoosh.org
>>> 



Re: [zfs-discuss] Extremely Slow ZFS Performance

2011-05-04 Thread Adam Serediuk
Dedup is disabled (confirmed). Doing some digging, it looks like this is a 
very similar issue to 
http://forums.oracle.com/forums/thread.jspa?threadID=2200577&tstart=0.
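
For anyone following the thread, both points can be double-checked quickly 
(a sketch; pool1 is the pool from the original report, and zpool status -D 
reports dedup table statistics where the build supports it):

    # dedup should be "off" on every dataset in the pool
    zfs get -r dedup pool1

    # pool health, plus DDT statistics if any deduped data exists
    zpool status -D pool1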


On May 4, 2011, at 2:26 PM, Garrett D'Amore wrote:

> My first thought is dedup... perhaps you've got dedup enabled and the DDT no 
> longer fits in RAM?  That would create a huge performance cliff.
> 
> -----Original Message-----
> From: zfs-discuss-boun...@opensolaris.org on behalf of Eric D. Mudama
> Sent: Wed 5/4/2011 12:55 PM
> To: Adam Serediuk
> Cc: zfs-discuss@opensolaris.org
> Subject: Re: [zfs-discuss] Extremely Slow ZFS Performance
> 
> On Wed, May  4 at 12:21, Adam Serediuk wrote:
> >Both iostat and zpool iostat show very little to zero load on the devices 
> >even while blocking.
> >
> >Any suggestions on avenues of approach for troubleshooting?
> 
> is 'iostat -en' error free?
> 
> 
> --
> Eric D. Mudama
> edmud...@bounceswoosh.org
> 



Re: [zfs-discuss] Extremely Slow ZFS Performance

2011-05-04 Thread Adam Serediuk
iostat doesn't show any high service times, and fsstat also shows low 
throughput. Occasionally I can generate enough load that some very high asvc_t 
values do appear, but when that occurs the pool is performing as expected. As 
a precaution I just added two extra drives to the zpool in case ZFS was having 
difficulty finding a location to allocate new blocks, but it has made no 
difference. You can still see block allocation cycling evenly across the 
available mirrors.
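
If allocation were the bottleneck, metaslab occupancy could be inspected 
directly (a sketch; zdb -m prints per-vdev metaslab usage, and running zdb 
against a live, busy pool can be slow):

    zdb -m pool1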



iostat -xnz 3 before, during, and after the blocking
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   33.3    0.0  184.1  0.0  0.1    0.0    1.5   0   1 c12t0d0
    0.0   15.3    0.0   98.7  0.0  0.0    0.0    1.0   0   0 c12t1d0
    0.0   15.3    0.0   98.7  0.0  0.0    0.0    1.1   0   0 c12t2d0
    0.0   14.3    0.0   98.7  0.0  0.0    0.0    1.0   0   0 c12t6d0
    0.0   20.7    0.0  256.6  0.0  0.0    0.0    1.4   0   0 c11t5d0
    0.0   48.0    0.0  273.4  0.0  0.1    0.0    1.3   0   1 c11t6d0
    0.0   34.0    0.0  199.9  0.0  0.0    0.0    1.5   0   1 c11t7d0
    0.0   20.3    0.0  256.6  0.0  0.0    0.0    1.4   0   0 c10t5d0
    0.0   47.7    0.0  273.4  0.0  0.1    0.0    1.2   0   1 c10t6d0
    0.0   34.0    0.0  199.9  0.0  0.0    0.0    1.5   0   1 c10t7d0
    0.0    9.7    0.0 1237.9  0.0  0.0    0.0    4.7   0   1 c14t0d0
    0.0   33.3    0.0  184.1  0.0  0.1    0.0    1.8   0   1 c13t0d0
    0.0   15.7    0.0   98.7  0.0  0.0    0.0    1.2   0   0 c13t1d0
    0.0   15.7    0.0   98.7  0.0  0.0    0.0    1.0   0   0 c13t2d0
    0.0   13.7    0.0   98.7  0.0  0.0    0.0    1.1   0   0 c13t6d0

                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c8t1d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c8t2d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c8t3d0
    0.3    0.3    1.7    0.0  0.0  0.0    0.0    6.0   0   0 c8t4d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c8t5d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c8t6d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c8t7d0
    0.0    2.0    0.0    1.3  0.0  0.0    0.0    0.2   0   0 c12t0d0
    0.0   15.7    0.0  235.9  0.0  0.0    0.0    0.8   0   0 c12t1d0
    0.3   16.3    1.8  235.9  0.0  0.0    0.0    1.2   0   1 c12t2d0
    0.3   15.0    2.8  130.6  0.0  0.0    0.0    1.0   0   1 c12t3d0
    0.0   11.7    0.0  127.9  0.0  0.0    0.0    0.8   0   0 c12t4d0
    0.3    0.3    0.2    0.0  0.0  0.0    0.0    9.0   0   1 c12t5d0
    0.0   40.3    0.0  599.9  0.0  0.0    0.0    0.8   0   1 c12t6d0
    0.3    0.3    2.8    0.0  0.0  0.0    0.0    3.2   0   0 c11t0d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c11t1d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c11t2d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c11t3d0
    0.7    0.3    3.0    0.0  0.0  0.0    0.0    6.5   0   1 c11t4d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c11t5d0
    0.3    2.0    1.5    1.3  0.0  0.0    0.0    2.8   0   1 c11t6d0
    0.0    2.0    0.0    1.3  0.0  0.0    0.0    0.2   0   0 c11t7d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c10t0d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c10t1d0
    0.3    0.3    0.2    0.0  0.0  0.0    0.0    6.7   0   0 c10t2d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c10t3d0
    0.3    0.3    1.5    0.0  0.0  0.0    0.0    9.1   0   1 c10t4d0
    0.3    0.3    1.5    0.0  0.0  0.0    0.0    7.7   0   1 c10t5d0
    0.3    2.0    1.5    1.3  0.0  0.0    0.0    2.2   0   0 c10t6d0
    0.0    2.0    0.0    1.3  0.0  0.0    0.0    0.2   0   0 c10t7d0
    0.0    0.3    0.0   42.6  0.0  0.0    0.0    0.7   0   0 c14t1d0
    0.3    0.3    1.5    0.0  0.0  0.0    0.0    5.9   0   0 c9t1d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c9t2d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c9t3d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c9t4d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c9t5d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c9t6d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c9t7d0
    0.0    2.0    0.0    1.3  0.0  0.0    0.0    0.4   0   0 c13t0d0
    0.0   15.7    0.0  235.9  0.0  0.0    0.0    0.8   0   0 c13t1d0
    0.0   16.3    0.0  235.9  0.0  0.0    0.0    0.9   0   0 c13t2d0
    0.0   14.7    0.0  130.6  0.0  0.0    0.0    0.7   0   0 c13t3d0
    0.0   12.0    0.0  127.9  0.0  0.0    0.0    0.9   0   0 c13t4d0
    0.0    0.3    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c13t5d0
    0.0   39.6    0.0  599.9  0.0  0.0    0.0    0.8   0   1 c13t6d0
                    extended device statistics
    r/s    w/s   kr/

Re: [zfs-discuss] Extremely Slow ZFS Performance

2011-05-04 Thread Eric D. Mudama

On Wed, May  4 at 12:21, Adam Serediuk wrote:

> Both iostat and zpool iostat show very little to zero load on the devices
> even while blocking.
>
> Any suggestions on avenues of approach for troubleshooting?


is 'iostat -en' error free?
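
For reference, -e adds per-device soft, hard, and transport error counters 
and -n uses descriptive device names. A sketch that prints only devices with 
a nonzero error total (tot is the fourth column of the usual -e output):

    iostat -en | awk '$4+0 > 0'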


--
Eric D. Mudama
edmud...@bounceswoosh.org



[zfs-discuss] Extremely Slow ZFS Performance

2011-05-04 Thread Adam Serediuk
We have an X4540 running Solaris 11 Express snv_151a that has developed an 
issue where its write performance is absolutely abysmal. Even touching a file 
takes over five seconds, both locally and remotely.

/pool1/data# time touch foo

real    0m5.305s
user    0m0.001s
sys     0m0.004s

/pool1/data# time rm foo

real    0m5.912s
user    0m0.001s
sys     0m0.005s
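
One way to see where those seconds go is syscall-level tracing with per-call 
timing (a sketch; truss -D prefixes each call with the time delta since the 
previous one):

    truss -D -o /tmp/touch.truss touch /pool1/data/foo

If nearly all of the delay lands on one or two calls, the stall is inside the 
kernel rather than spread across the workload.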

The system exhibits this issue under the slightest load. We have sync=disabled 
set on all filesystems in this pool. The pool is at 75% capacity and is 
healthy. This issue started suddenly several days ago and persists after 
reboot. prstat shows zpool-pool1/150 constantly taking 10% CPU, whereas other 
similar systems in our infrastructure under the same load do not. Even doing a 
'zfs set' on a property takes up to 10 seconds, where on other systems it is 
instantaneous. Something appears to be blocking internally.
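
One way to see what the pool's kernel threads are blocked on is the kernel 
debugger's stack summary (a sketch; ::stacks is an mdb dcmd, and -m zfs 
restricts the output to threads currently in the zfs module):

    echo "::stacks -m zfs" | mdb -k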

Both iostat and zpool iostat show very little to zero load on the devices even 
while blocking.

Any suggestions on avenues of approach for troubleshooting?

Thanks,

Adam


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss