[zfs-discuss] Data distribution not even between vdevs

2011-11-08 Thread Ding Honghui
Hi list,

My ZFS write performance is poor and I need your help.

I created the zpool with 2 raidz1 vdevs. When the space was about to be used
up, I added another 2 raidz1 vdevs to extend the zpool.
After some days the zpool was almost full, so I removed some old data.

But now, as shown below, usage of the first 2 raidz1 vdevs is about 78% and
usage of the last 2 raidz1 vdevs is about 93%.
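
(For reference, the 78%/93% figures come from the capacity columns of the
zpool iostat -v output below; a rough nawk sketch like this reproduces them.
The unit handling is my own assumption and only covers the K/M/G/T suffixes
that appear in this output.)

zpool iostat -v datapool | nawk '
  function bytes(v,  n, u) {              # "4.93T" -> bytes
    n = v + 0; u = substr(v, length(v), 1)
    if (u == "K") n *= 1024; else if (u == "M") n *= 1024^2
    else if (u == "G") n *= 1024^3; else if (u == "T") n *= 1024^4
    return n
  }
  # vdev summary rows: name, used, avail, ...
  $1 ~ /^raidz/ { printf "%-8s %5.1f%% used\n", $1, 100 * bytes($2) / (bytes($2) + bytes($3)) }
'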

I have this line in /etc/system:

set zfs:metaslab_df_free_pct=4

So performance degradation will happen when vdev usage is above 90%.

All my files are small, about 150KB each.

Now the questions are:
1. Should I balance the data between the vdevs by copying the data and then
removing the originals that sit on the last 2 vdevs? (A rough sketch of what
I mean is below.)
2. Is there any method to automatically rebalance the data?
or
Is there any better solution to this problem?
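
(The sketch I have in mind for question 1, under the assumption that ZFS
biases new allocations toward the emptier vdevs because they have more free
space; the directory path is only a placeholder:)

cd /datapool/some/dir                      # placeholder path
find . -type f | while read f; do
        # rewrite the file so its blocks are newly allocated (mostly on the
        # emptier vdevs), then replace the original, freeing the old blocks
        cp -p "$f" "$f.rebal" && mv "$f.rebal" "$f"
done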

root@nas-01:~# zpool iostat -v
                                        capacity     operations    bandwidth
pool                                  used  avail   read  write   read  write
------------------------------------  -----  -----  -----  -----  -----  -----
datapool                             21.3T  3.93T     26     96  81.4K  2.81M
  raidz1                             4.93T  1.39T      8     28  25.7K   708K
    c3t600221900085486703B2490FB009d0      -      -      3     10   216K   119K
    c3t600221900085486703B4490FB063d0      -      -      3     10   214K   119K
    c3t6002219000852889055F4CB79C10d0      -      -      3     10   214K   119K
    c3t600221900085486703B8490FB0FFd0      -      -      3     10   215K   119K
    c3t600221900085486703BA490FB14Fd0      -      -      3     10   215K   119K
    c3t6002219000852889041C490FAFA0d0      -      -      3     10   215K   119K
    c3t600221900085486703C0490FB27Dd0      -      -      3     10   214K   119K
  raidz1                             4.64T  1.67T      8     32  24.6K   581K
    c3t600221900085486703C2490FB2BFd0      -      -      3     10   224K  98.2K
    c3t6002219000852889041F490FAFD0d0      -      -      3     10   222K  98.2K
    c3t60022190008528890428490FB0D8d0      -      -      3     10   222K  98.2K
    c3t60022190008528890422490FB02Cd0      -      -      3     10   223K  98.3K
    c3t60022190008528890425490FB07Cd0      -      -      3     10   223K  98.3K
    c3t60022190008528890434490FB24Ed0      -      -      3     10   223K  98.3K
    c3t6002219000852889043949100968d0      -      -      3     10   224K  98.2K
  raidz1                             5.88T   447G      5     17  16.0K  67.7K
    c3t6002219000852889056B4CB79D66d0      -      -      3     12   215K  12.2K
    c3t600221900085486704B94CB79F91d0      -      -      3     12   216K  12.2K
    c3t600221900085486704BB4CB79FE1d0      -      -      3     12   214K  12.2K
    c3t600221900085486704BD4CB7A035d0      -      -      3     12   215K  12.2K
    c3t600221900085486704BF4CB7A0ABd0      -      -      3     12   216K  12.2K
    c3t6002219000852889055C4CB79BB8d0      -      -      3     12   214K  12.2K
    c3t600221900085486704C14CB7A0FDd0      -      -      3     12   215K  12.2K
  raidz1                             5.88T   441G      4      1  14.9K  12.4K
    c3t6002219000852889042B490FB124d0      -      -      1      1   131K  2.33K
    c3t600221900085486704C54CB7A199d0      -      -      1      1   132K  2.33K
    c3t600221900085486704C74CB7A1D5d0      -      -      1      1   130K  2.33K
    c3t600221900085288905594CB79B64d0      -      -      1      1   133K  2.33K
    c3t600221900085288905624CB79C86d0      -      -      1      1   132K  2.34K
    c3t600221900085288905654CB79CCCd0      -      -      1      1   131K  2.34K
    c3t600221900085288905684CB79D1Ed0      -      -      1      1   132K  2.33K
  c3t6B8AC6FF837605864DC9E9F1d0           0   928G      0     16    289  1.47M
------------------------------------  -----  -----  -----  -----  -----  -----

root@nas-01:~#
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] solaris 10u8 hangs with message Disconnected command timeout for Target 0

2011-08-16 Thread Ding Honghui
Hi,

My Solaris storage server hangs. The messages below [1] appear on the
console, I can't log in, and it seems that I/O is completely blocked.

The system is Solaris 10u8 on a Dell R710 with a Dell MD3000 disk array;
2 HBA cables connect the server to the MD3000.
The symptom occurs at random times.
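
In case it helps, these are the checks I can run and post output from
(just a sketch using the standard Solaris tools):

# FMA error telemetry around the time of the hang
fmdump -eV | tail -100
# per-device soft/hard/transport error counters
iostat -En
# state of the two paths to the MD3000 (MPxIO)
mpathadm list lu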

It would be much appreciated if anyone could help me out.

Regards,
Ding

[1]
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /pci@0,0/pci8086,3410@9/pci8086,32c@0/pci1028,1f04@8 (mpt1):
Aug 16 13:14:16 nas-hz-02       Disconnected command timeout for Target 0
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802a44b8f0ded (sd47):
Aug 16 13:14:16 nas-hz-02       Error for Command: write(10)    Error Level: Retryable
Aug 16 13:14:16 nas-hz-02 scsi:     Requested Block: 1380679073    Error Block: 1380679073
Aug 16 13:14:16 nas-hz-02 scsi:     Vendor: DELL    Serial Number:
Aug 16 13:14:16 nas-hz-02 scsi:     Sense Key: Unit Attention
Aug 16 13:14:16 nas-hz-02 scsi:     ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa18029e4b8f0d61 (sd41):
Aug 16 13:14:16 nas-hz-02       Error for Command: write(10)    Error Level: Retryable
Aug 16 13:14:16 nas-hz-02 scsi:     Requested Block: 1380679072    Error Block: 1380679072
Aug 16 13:14:16 nas-hz-02 scsi:     Vendor: DELL    Serial Number:
Aug 16 13:14:16 nas-hz-02 scsi:     Sense Key: Unit Attention
Aug 16 13:14:16 nas-hz-02 scsi:     ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802a24b8f0dc5 (sd45):
Aug 16 13:14:16 nas-hz-02       Error for Command: write(10)    Error Level: Retryable
Aug 16 13:14:16 nas-hz-02 scsi:     Requested Block: 1380679073    Error Block: 1380679073
Aug 16 13:14:16 nas-hz-02 scsi:     Vendor: DELL    Serial Number:
Aug 16 13:14:16 nas-hz-02 scsi:     Sense Key: Unit Attention
Aug 16 13:14:16 nas-hz-02 scsi:     ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa18029c4b8f0d35 (sd39):
Aug 16 13:14:16 nas-hz-02       Error for Command: write(10)    Error Level: Retryable
Aug 16 13:14:16 nas-hz-02 scsi:     Requested Block: 1380679072    Error Block: 1380679072
Aug 16 13:14:16 nas-hz-02 scsi:     Vendor: DELL    Serial Number:
Aug 16 13:14:16 nas-hz-02 scsi:     Sense Key: Unit Attention
Aug 16 13:14:16 nas-hz-02 scsi:     ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
Aug 16 13:14:16 nas-hz-02 scsi: WARNING: /scsi_vhci/disk@g60026b900053aa1802984b8f0cd2 (sd35):
Aug 16 13:14:16 nas-hz-02       Error for Command: write(10)    Error Level: Retryable
Aug 16 13:14:16 nas-hz-02 scsi:     Requested Block: 1380679072    Error Block: 1380679072
Aug 16 13:14:16 nas-hz-02 scsi:     Vendor: DELL    Serial Number:
Aug 16 13:14:16 nas-hz-02 scsi:     Sense Key: Unit Attention
Aug 16 13:14:16 nas-hz-02 scsi:     ASC: 0x29 (device internal reset), ASCQ: 0x4, FRU: 0x0
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs usable space?

2011-06-14 Thread Ding Honghui

I have 15x 1TB disks; each disk's usable space should be
1TB = 10^12 bytes = 10^12/1024/1024/1024 GiB ≈ 931GiB.

As shown by the format command:
# echo | format | grep MD
   3. c4t60026B900053AA1502C74B8F0EADd0 DELL-MD3000-0735-931.01GB
   4. c4t60026B900053AA1502C94B8F0EE3d0 DELL-MD3000-0735-931.01GB
   5. c4t60026B900053AA1502CB4B8F0F0Dd0 DELL-MD3000-0735-931.01GB
   6. c4t60026B900053AA1502CD4B8F0F3Dd0 DELL-MD3000-0735-931.01GB
   7. c4t60026B900053AA1502CF4B8F0F6Dd0 DELL-MD3000-0735-931.01GB
   8. c4t60026B900053AA1502D14B8F0F9Cd0 DELL-MD3000-0735-931.01GB
   9. c4t60026B900053AA1502D34B8F0FC8d0 DELL-MD3000-0735-931.01GB
  10. c4t60026B900053AA1802A04B8F0D91d0 DELL-MD3000-0735-931.01GB
  11. c4t60026B900053AA1802A24B8F0DC5d0 DELL-MD3000-0735-931.01GB
  12. c4t60026B900053AA1802A44B8F0DEDd0 DELL-MD3000-0735-931.01GB
  13. c4t60026B900053AA18029C4B8F0D35d0 DELL-MD3000-0735-931.01GB
  14. c4t60026B900053AA18029E4B8F0D61d0 DELL-MD3000-0735-931.01GB
  15. c4t60026B900053AA18036E4DBF6BA6d0 DELL-MD3000-0735-931.01GB
  16. c4t60026B900053AA1802984B8F0CD2d0 DELL-MD3000-0735-931.01GB
  17. c4t60026B900053AA1503074B901CF3d0 DELL-MD3000-0735-931.01GB
#

I created the zpool with 2 raidz1 vdevs (7 disks each) and 1 global hot spare:
NAME                                     STATE     READ WRITE CKSUM
datapool                                 ONLINE       0     0     0
  raidz1                                 ONLINE       0     0     0
    c4t60026B900053AA1502C74B8F0EADd0    ONLINE       0     0     0
    c4t60026B900053AA1502C94B8F0EE3d0    ONLINE       0     0     0
    c4t60026B900053AA1502CB4B8F0F0Dd0    ONLINE       0     0     0
    c4t60026B900053AA1502CD4B8F0F3Dd0    ONLINE       0     0     0
    c4t60026B900053AA1502CF4B8F0F6Dd0    ONLINE       0     0     0
    c4t60026B900053AA1502D14B8F0F9Cd0    ONLINE       0     0     0
    c4t60026B900053AA1502D34B8F0FC8d0    ONLINE       0     0     0
  raidz1                                 ONLINE       0     0     0
    spare                                ONLINE       0     0     7
      c4t60026B900053AA1802A04B8F0D91d0  ONLINE      10     0     0  194K resilvered
      c4t60026B900053AA18036E4DBF6BA6d0  ONLINE       0     0     0  531G resilvered
    c4t60026B900053AA1802A24B8F0DC5d0    ONLINE       0     0     0
    c4t60026B900053AA1802A44B8F0DEDd0    ONLINE       0     0     0
    c4t60026B900053AA1503074B901CF3d0    ONLINE       0     0     0
    c4t60026B900053AA18029C4B8F0D35d0    ONLINE       0     0     0
    c4t60026B900053AA18029E4B8F0D61d0    ONLINE       0     0     0
    c4t60026B900053AA1802984B8F0CD2d0    ONLINE       0     0     0

spares
  c4t60026B900053AA18036E4DBF6BA6d0      INUSE     currently in use


I expected to have 14*931/1024 = 12.7TB of zpool space, but it actually has
only 12.6TB:

# zpool list
NAME   SIZE   USED  AVAILCAP  HEALTH  ALTROOT
datapool  12.6T  9.96T  2.66T78%  ONLINE  -
#

And I expected the usable ZFS space to be 12*931/1024 = 10.91TB, but it
actually has only 10.58TB.
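
(For reference, the arithmetic I am using, nothing beyond the numbers above:)

# 2 x 7-disk raidz1 -> 14 disks of raw pool space, 12 disks of data space,
# 931 GiB usable per disk as reported by format
nawk 'BEGIN {
        per_disk = 931 / 1024                            # TiB per disk
        printf "expected raw zpool size : %.2f TiB\n", 14 * per_disk
        printf "expected data (no parity): %.2f TiB\n", 12 * per_disk
}'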


Can anyone explain where the disk space went?

Regards,
Ding
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Weird write performance problem

2011-06-08 Thread Ding Honghui



On 06/08/2011 12:12 PM, Donald Stahl wrote:

One day, the write performance of zfs degraded.
The write performance decreased from 60MB/s to about 6MB/s for sequential
writes.

Command:
date;dd if=/dev/zero of=block bs=1024*128 count=1;date

See this thread:

http://www.opensolaris.org/jive/thread.jspa?threadID=139317&tstart=45

And search in the page for:
metaslab_min_alloc_size

Try adjusting the metaslab size and see if it fixes your performance problem.

-Don



metaslab_min_alloc_size is not used when the block allocator is the dynamic
block allocator [1], so it is not a tunable parameter in my case.

Thanks anyway.

[1] 
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c#496


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Weird write performance problem

2011-06-08 Thread Ding Honghui
For now, I find that a long time is spent in the function
metaslab_block_picker in metaslab.c.

I guess there may be many AVL-tree searches.

I'm still not sure what causes the AVL searches and whether there are any
parameters to tune for them.


Any suggestions?
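
This is the rough probe I plan to use to confirm where the time goes (only a
sketch, run with dtrace -s; it assumes the fbt entry/return probes exist for
this static function, i.e. that it has not been inlined):

fbt:zfs:metaslab_block_picker:entry
{
        self->ts = timestamp;
}

fbt:zfs:metaslab_block_picker:return
/self->ts/
{
        /* nanoseconds spent per call, as a power-of-two histogram */
        @picker = quantize(timestamp - self->ts);
        self->ts = 0;
}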

On 06/08/2011 05:57 PM, Markus Kovero wrote:

Hi, also see;
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg45408.html

We hit this with Sol11 though, not sure if it's possible with sol10

Yours
Markus Kovero

-Original Message-
From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Ding Honghui
Sent: 8 June 2011 6:07
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] Weird write performance problem

Hi,

I got weird write performance and need your help.

One day, the write performance of zfs degraded.
The write performance decreased from 60MB/s to about 6MB/s for sequential writes.

Command:
date;dd if=/dev/zero of=block bs=1024*128 count=1;date

The hardware configuration is 1 Dell MD3000 and 1 MD1000 with 30 disks.
The OS is Solaris 10U8, zpool version 15 and zfs version 4.

I run Dtrace to trace the write performance:

fbt:zfs:zfs_write:entry
{
        self->ts = timestamp;
}


fbt:zfs:zfs_write:return
/self->ts/
{
        @time = quantize(timestamp - self->ts);
        self->ts = 0;
}

It shows
 value  - Distribution - count
  8192 | 0
 16384 | 16
 32768 | 3270
 65536 |@@@  898
131072 |@@@  985
262144 | 33
524288 | 1
   1048576 | 1
   2097152 | 3
   4194304 | 0
   8388608 |@180
  16777216 | 33
  33554432 | 0
  67108864 | 0
 134217728 | 0
 268435456 | 1
 536870912 | 1
1073741824 | 2
2147483648 | 0
4294967296 | 0
8589934592 | 0
   17179869184 | 2
   34359738368 | 3
   68719476736 | 0

Compared to a storage system that works well (a single MD3000), where the max
zfs_write time is 4294967296 ns, this system is about 10 times slower.

Any suggestions?

Thanks
Ding

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Weird write performance problem

2011-06-08 Thread Ding Honghui

On 06/08/2011 04:05 PM, Tomas Ögren wrote:

On 08 June, 2011 - Donald Stahl sent me these 0,6K bytes:


One day, the write performance of zfs degraded.
The write performance decreased from 60MB/s to about 6MB/s for sequential
writes.

Command:
date;dd if=/dev/zero of=block bs=1024*128 count=1;date

See this thread:

http://www.opensolaris.org/jive/thread.jspa?threadID=139317&tstart=45

And search in the page for:
metaslab_min_alloc_size

Try adjusting the metaslab size and see if it fixes your performance problem.

And if pool usage is >90%, then there's another problem (a change of the
algorithm for finding free space).

/Tomas


Tomas,

Thanks for your suggestion.

You are right.

I tuned the parameter metaslab_df_free_pct from 35 to 4 a few days ago to
reduce this problem.

The performance stayed good for about 1 week and then degraded again.

And I'm still not sure how many operations fall into the best-fit block
allocation policy and how many fall into the first-fit policy in the current
situation (a quick way to at least check the live tunable is sketched below).
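
For the record, this is how I check (and could change) the live value without
a reboot; just a sketch using standard mdb syntax:

# print the current value of the tunable (decimal)
echo 'metaslab_df_free_pct/D' | mdb -k

# change it on the running kernel (0t4 = decimal 4 in mdb notation)
echo 'metaslab_df_free_pct/W 0t4' | mdb -kw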

Any help would be much appreciated.

Regards,
Ding


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Weird write performance problem

2011-06-08 Thread Ding Honghui

On 06/08/2011 09:15 PM, Donald Stahl wrote:

metaslab_min_alloc_size is not used when the block allocator is the dynamic
block allocator [1], so it is not a tunable parameter in my case.

May I ask where it says this is not a tunable in that case? I've read
through the code and I don't see what you are talking about.

The problem you are describing, including the long time spent in
metaslab_block_picker, exactly matches the block picker trying to find
a large enough block and failing.

What value do you get when you run:
echo metaslab_min_alloc_size/K | mdb -kw
?

You can always try setting it via:
echo metaslab_min_alloc_size/Z 1000 | mdb -kw

and if that doesn't work set it right back.

I'm not familiar with the specifics of Solaris 10u8, so perhaps this is
not a tunable in that version, but if it is, I would suggest you try
changing it. If your performance is as bad as you say, it can't hurt to
try.

-Don


Thanks very much, Don.

In Solaris 10u8:
root@nas-hz-01:~# uname -a
SunOS nas-hz-01 5.10 Generic_141445-09 i86pc i386 i86pc
root@nas-hz-01:~# echo metaslab_min_alloc_size/K | mdb -kw
mdb: failed to dereference symbol: unknown symbol name
root@nas-hz-01:~#

The pool version is 15 and zfs version is 4.

And this parameter is valid on my OpenIndiana build 148, where the zpool
version is 28 and the zfs version is 5.

ops@oi:~$ echo metaslab_min_alloc_size/Z 1000 | pfexec mdb -kw
metaslab_min_alloc_size:0x1000  =   0x1000
ops@oi:~$

I'm not sure which version introduced this parameter.

Should I run OpenIndiana instead? Any suggestions?

Regards,
Ding
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Weird write performance problem

2011-06-08 Thread Ding Honghui


On 06/09/2011 12:23 AM, Donald Stahl wrote:

Another (less satisfying) workaround is to increase the amount of free space
in the pool, either by reducing usage or adding more storage. Observed
behavior is that allocation is fast until usage crosses a threshold, then
performance hits a wall.

We actually tried this solution. We were at 70% usage and performance
hit a wall. We figured it was because of the change of fit algorithm
so we added 16 2TB disks in mirrors. (Added 16TB to an 18TB pool). It
made almost no difference in our pool performance. It wasn't until we
told the metaslab allocator to stop looking for such large chunks that
the problem went away.


The original poster's pool is about 78% full.  If possible, try freeing
stuff until usage goes back under 75% or 70% and see if your performance
returns.

Freeing stuff did fix the problem for us (temporarily) but only in an
indirect way. When we freed up a bunch of space, the metaslab
allocator was able to find large enough blocks to write to without
searching all over the place. This would fix the performance problem
until those large free blocks got used up. Then- even though we were
below the usage problem threshold from earlier- we would still have
the performance problem.

-Don
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Don,

From your description, my symptom is almost the same as yours.

We have examined the metaslab layout. When metaslab_df_free_pct was 35,
there were 65 completely free metaslabs (64G each), the write performance
was very low, and a rough test showed that no new free metaslab was being
loaded and activated.

Then we tuned metaslab_df_free_pct to 4; the performance stayed good for
about 1 week while the number of free metaslabs dropped to 51.
But now the write bandwidth is poor again (maybe I'd better trace the free
space of each metaslab over time, roughly as sketched below?).
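
Something like this is what I have in mind for tracking it, assuming the
layout snapshot below came from zdb -m or an equivalent; only a sketch:

# dump the per-metaslab free space once an hour so we can see which
# metaslabs are being drained (same vdev/offset/spacemap/free columns
# as the snapshot below)
while true; do
        date
        zdb -m datapool
        sleep 3600
done >> /var/tmp/metaslab-layout.log 2>&1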


Maybe there is a problem in the metaslab rating score (weight) used to
select a metaslab, or in the block allocator algorithm?


Here is a snapshot of the metaslab layout; the last 51 metaslabs each have
64G free:


vdev         offset          spacemap    free
---------    -----------     --------    -------

... snip

vdev 3       offset  270     spacemap    440     free    21.0G
vdev 3       offset  280     spacemap     31     free    7.36G
vdev 3       offset  290     spacemap     32     free    2.44G
vdev 3       offset  2a0     spacemap     33     free    2.91G
vdev 3       offset  2b0     spacemap     34     free    3.25G
vdev 3       offset  2c0     spacemap     35     free    3.03G
vdev 3       offset  2d0     spacemap     36     free    3.20G
vdev 3       offset  2e0     spacemap     90     free    3.28G
vdev 3       offset  2f0     spacemap     91     free    2.46G
vdev 3       offset  300     spacemap     92     free    2.98G
vdev 3       offset  310     spacemap     93     free    2.19G
vdev 3       offset  320     spacemap     94     free    2.42G
vdev 3       offset  330     spacemap     95     free    2.83G
vdev 3       offset  340     spacemap    252     free    41.6G
vdev 3       offset  350     spacemap      0     free      64G
vdev 3       offset  360     spacemap      0     free      64G
vdev 3       offset  370     spacemap      0     free      64G
vdev 3       offset  380     spacemap      0     free      64G
vdev 3       offset  390     spacemap      0     free      64G
vdev 3       offset  3a0     spacemap      0     free      64G
vdev 3       offset  3b0     spacemap      0     free      64G
vdev 3       offset  3c0     spacemap      0     free      64G
vdev 3       offset  3d0     spacemap      0     free      64G
vdev 3       offset  3e0     spacemap      0     free      64G
...snip
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Weird write performance problem

2011-06-08 Thread Ding Honghui

On 06/09/2011 10:14 AM, Ding Honghui wrote:


On 06/09/2011 12:23 AM, Donald Stahl wrote:
Another (less satisfying) workaround is to increase the amount of free space
in the pool, either by reducing usage or adding more storage. Observed
behavior is that allocation is fast until usage crosses a threshold, then
performance hits a wall.

We actually tried this solution. We were at 70% usage and performance
hit a wall. We figured it was because of the change of fit algorithm
so we added 16 2TB disks in mirrors. (Added 16TB to an 18TB pool). It
made almost no difference in our pool performance. It wasn't until we
told the metaslab allocator to stop looking for such large chunks that
the problem went away.


The original poster's pool is about 78% full.  If possible, try freeing
stuff until usage goes back under 75% or 70% and see if your performance
returns.

Freeing stuff did fix the problem for us (temporarily) but only in an
indirect way. When we freed up a bunch of space, the metaslab
allocator was able to find large enough blocks to write to without
searching all over the place. This would fix the performance problem
until those large free blocks got used up. Then- even though we were
below the usage problem threshold from earlier- we would still have
the performance problem.

-Don
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Don,

From your description, my symptom is almost the same as yours.

We have examined the metaslab layout. When metaslab_df_free_pct was 35,
there were 65 completely free metaslabs (64G each), the write performance
was very low, and a rough test showed that no new free metaslab was being
loaded and activated.

Then we tuned metaslab_df_free_pct to 4; the performance stayed good for
about 1 week while the number of free metaslabs dropped to 51.
But now the write bandwidth is poor again (maybe I'd better trace the free
space of each metaslab over time?).

Maybe there is a problem in the metaslab rating score (weight) used to
select a metaslab, or in the block allocator algorithm?

Here is a snapshot of the metaslab layout; the last 51 metaslabs each have
64G free:


vdev         offset          spacemap    free
---------    -----------     --------    -------

... snip

vdev 3       offset  270     spacemap    440     free    21.0G
vdev 3       offset  280     spacemap     31     free    7.36G
vdev 3       offset  290     spacemap     32     free    2.44G
vdev 3       offset  2a0     spacemap     33     free    2.91G
vdev 3       offset  2b0     spacemap     34     free    3.25G
vdev 3       offset  2c0     spacemap     35     free    3.03G
vdev 3       offset  2d0     spacemap     36     free    3.20G
vdev 3       offset  2e0     spacemap     90     free    3.28G
vdev 3       offset  2f0     spacemap     91     free    2.46G
vdev 3       offset  300     spacemap     92     free    2.98G
vdev 3       offset  310     spacemap     93     free    2.19G
vdev 3       offset  320     spacemap     94     free    2.42G
vdev 3       offset  330     spacemap     95     free    2.83G
vdev 3       offset  340     spacemap    252     free    41.6G
vdev 3       offset  350     spacemap      0     free      64G
vdev 3       offset  360     spacemap      0     free      64G
vdev 3       offset  370     spacemap      0     free      64G
vdev 3       offset  380     spacemap      0     free      64G
vdev 3       offset  390     spacemap      0     free      64G
vdev 3       offset  3a0     spacemap      0     free      64G
vdev 3       offset  3b0     spacemap      0     free      64G
vdev 3       offset  3c0     spacemap      0     free      64G
vdev 3       offset  3d0     spacemap      0     free      64G
vdev 3       offset  3e0     spacemap      0     free      64G

...snip


I freed up some disk space (about 300GB) and the performance is back again.
I'm sure it will degrade again soon.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Weird write performance problem

2011-06-07 Thread Ding Honghui

Hi,

I got weird write performance and need your help.

One day, the write performance of zfs degraded.
The write performance decreased from 60MB/s to about 6MB/s for sequential writes.

Command:
date;dd if=/dev/zero of=block bs=1024*128 count=1;date
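
Here is the kind of timing wrapper I also use so the throughput falls
straight out of the elapsed time (a sketch; the block count shown is just an
example, not the count used above):

# same sequential write, but ptime reports the elapsed real time directly,
# so MB/s = (count * 128K) / real
ptime dd if=/dev/zero of=block bs=131072 count=8192    # 8192 x 128K = 1 GiB (example size)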

The hardware configuration is 1 Dell MD3000 and 1 MD1000 with 30 disks.
The OS is Solaris 10U8, zpool version 15 and zfs version 4.

I run Dtrace to trace the write performance:

fbt:zfs:zfs_write:entry
{
        self->ts = timestamp;
}


fbt:zfs:zfs_write:return
/self->ts/
{
        @time = quantize(timestamp - self->ts);
        self->ts = 0;
}

It shows
   value  - Distribution - count
8192 | 0
   16384 | 16
   32768 | 3270
   65536 |@@@  898
  131072 |@@@  985
  262144 | 33
  524288 | 1
 1048576 | 1
 2097152 | 3
 4194304 | 0
 8388608 |@180
16777216 | 33
33554432 | 0
67108864 | 0
   134217728 | 0
   268435456 | 1
   536870912 | 1
  1073741824 | 2
  2147483648 | 0
  4294967296 | 0
  8589934592 | 0
 17179869184 | 2
 34359738368 | 3
 68719476736 | 0

Compared to a storage system that works well (a single MD3000), where the max
zfs_write time is 4294967296 ns, this system is about 10 times slower.


Any suggestions?

Thanks
Ding

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Weird write performance problem

2011-06-07 Thread Ding Honghui

And one comment:
When we do a write operation (with the dd command), heavy read activity
appears on every disk, going from zero to about 3M per disk, while the
write bandwidth stays poor.
The disk I/O %b rises from 0 to about 60.

I don't understand why this happens. The zpool iostat output captured
during the dd is below, along with how I watch it.
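
(How I watch it, for reference; just the standard tools, and the count value
is only an example:)

# terminal 1: the write test
dd if=/dev/zero of=block bs=131072 count=8192      # example count, 8192 x 128K = 1 GiB

# terminal 2: per-disk view (%b, read/write rates) every 5 seconds, idle disks hidden
iostat -xnz 5

# terminal 3: pool/vdev level view every 5 seconds
zpool iostat -v datapool 5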

                                        capacity     operations    bandwidth
pool                                  used  avail   read  write   read  write
------------------------------------  -----  -----  -----  -----  -----  -----
datapool                             19.8T  5.48T    543     47  1.74M  5.89M
  raidz1                             5.64T   687G    146     13   480K  1.66M
    c3t600221900085486703B2490FB009d0      -      -     49     13  3.26M   293K
    c3t600221900085486703B4490FB063d0      -      -     48     13  3.19M   296K
    c3t6002219000852889055F4CB79C10d0      -      -     48     13  3.19M   293K
    c3t600221900085486703B8490FB0FFd0      -      -     50     13  3.28M   284K
    c3t600221900085486703BA490FB14Fd0      -      -     50     13  3.31M   287K
    c3t6002219000852889041C490FAFA0d0      -      -     49     14  3.27M   297K
    c3t600221900085486703C0490FB27Dd0      -      -     48     14  3.24M   300K
  raidz1                             5.73T   594G    102      7   337K   996K
    c3t600221900085486703C2490FB2BFd0      -      -     52      5  3.59M   166K
    c3t6002219000852889041F490FAFD0d0      -      -     54      5  3.72M   166K
    c3t60022190008528890428490FB0D8d0      -      -     55      5  3.79M   166K
    c3t60022190008528890422490FB02Cd0      -      -     52      5  3.57M   166K
    c3t60022190008528890425490FB07Cd0      -      -     53      5  3.64M   166K
    c3t60022190008528890434490FB24Ed0      -      -     55      5  3.76M   166K
    c3t6002219000852889043949100968d0      -      -     55      5  3.83M   166K
  raidz1                             5.81T   519G    117     10   388K  1.26M
    c3t6002219000852889056B4CB79D66d0      -      -     46      9  3.09M   215K
    c3t600221900085486704B94CB79F91d0      -      -     44      9  2.91M   215K
    c3t600221900085486704BB4CB79FE1d0      -      -     44      9  2.97M   224K
    c3t600221900085486704BD4CB7A035d0      -      -     44      9  2.96M   215K
    c3t600221900085486704BF4CB7A0ABd0      -      -     44      9  2.97M   216K
    c3t6002219000852889055C4CB79BB8d0      -      -     45      9  3.04M   215K
    c3t600221900085486704C14CB7A0FDd0      -      -     46      9  3.02M   215K
  raidz1                             2.59T  3.72T    176     16   581K  2.00M
    c3t6002219000852889042B490FB124d0      -      -     48      5  3.21M   342K
    c3t600221900085486704C54CB7A199d0      -      -     46      5  2.99M   342K
    c3t600221900085486704C74CB7A1D5d0      -      -     49      5  3.27M   342K
    c3t600221900085288905594CB79B64d0      -      -     46      6  3.00M   342K
    c3t600221900085288905624CB79C86d0      -      -     47      6  3.11M   342K
    c3t600221900085288905654CB79CCCd0      -      -     50      6  3.29M   342K
    c3t600221900085288905684CB79D1Ed0      -      -     45      5  2.98M   342K
  c3t6B8AC6FF837605864DC9E9F1d0           4K   928G      0      0      0      0
------------------------------------  -----  -----  -----  -----  -----  -----


^C
root@nas-hz-01:~#


On 06/08/2011 11:07 AM, Ding Honghui wrote:

Hi,

I got weird write performance and need your help.

One day, the write performance of zfs degraded.
The write performance decreased from 60MB/s to about 6MB/s for sequential
writes.


Command:
date;dd if=/dev/zero of=block bs=1024*128 count=1;date

The hardware configuration is 1 Dell MD3000 and 1 MD1000 with 30 disks.
The OS is Solaris 10U8, zpool version 15 and zfs version 4.

I run Dtrace to trace the write performance:

fbt:zfs:zfs_write:entry
{
        self->ts = timestamp;
}


fbt:zfs:zfs_write:return
/self->ts/
{
        @time = quantize(timestamp - self->ts);
        self->ts = 0;
}

It shows
   value  - Distribution - count
8192 | 0
   16384 | 16
   32768 | 3270
   65536 |@@@  898
  131072 |@@@  985
  262144 | 33
  524288 | 1
 1048576 | 1
 2097152 | 3
 4194304 | 0
 8388608 |@180
16777216 | 33
33554432 | 0