Re: [zfs-discuss] ZFS Hard disk buffer at 100%

2010-05-09 Thread Ben Rockwood
The drive (c7t2d0) is bad and should be replaced.   The second drive
(c7t5d0) is either bad or going bad.  This is exactly the kind of
problem that can force a Thumper to its knees: ZFS performance is
horrific, and as soon as you drop the bad disks things magically return
to normal.

My first recommendation is to pull the SMART data from the disks if you
can.  I wrote a blog entry about SMART back in 2008 that addresses
exactly the behavior you're seeing:
http://www.cuddletech.com/blog/pivot/entry.php?id=993

Yes, people will claim that SMART data is useless for predicting
failures, but in a case like yours you are just looking for data to
corroborate a hypothesis.

To test this condition, 'zpool offline... c7t2d0' (which emulates
removal) and see if performance improves.  On Thumpers I'd build a
list of suspect disks based on 'iostat' output like you show, then
correlate the SMART data, and then systematically offline disks to see
whether they really were the problem.
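Roughly, and assuming the pool is called 'tank' here (substitute your pool name):

   # take the suspect disk out of service; a raidz keeps running, degraded
   zpool offline tank c7t2d0

   # watch whether the stalls disappear (asvc_t and %b on the remaining disks)
   iostat -xn 5

   # put it back when you're done testing; ZFS resilvers what it missed
   zpool online tank c7t2d0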

In my experience the only other reason you'll legitimately see really
weird bottoming-out of I/O like this is if you hit the max concurrent
I/O limit in ZFS (until recently that limit was 35), so you'd see
actv=35, and then when the device finally processed the I/Os the thing
would snap back to life.  But even in those cases you shouldn't see
request times (asvc_t) rise above 200ms.

All that to say: replace those disks, or at least test it.  SSDs won't
help; one or more drives are toast.

benr.



On 5/8/10 9:30 PM, Emily Grettel wrote:
 Hi Giovani,
  
 Thanks for the reply.
  
 Here's a bit of iostat output after uncompressing a 2.4GB RAR file that
 contains one DWF file that we use.

 extended device statistics
 r/s   w/s    kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
 1.0   13.0   26.0   18.0   0.0   0.0     0.0     0.8   0   1  c7t1d0
 2.0    5.0   77.0   12.0   2.4   1.0   343.8   142.8 100 100  c7t2d0
 1.0   16.0   25.5   15.5   0.0   0.0     0.0     0.3   0   0  c7t3d0
 0.0   10.0    0.0   17.0   0.0   0.0     3.2     1.2   1   1  c7t4d0
 1.0   12.0   25.5   15.5   0.4   0.1    32.4    10.9  14  14  c7t5d0
 1.0   15.0   25.5   18.0   0.0   0.0     0.1     0.1   0   0  c0t1d0
 extended device statistics
 r/s   w/s    kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
 0.0    0.0    0.0    0.0   2.0   1.0     0.0     0.0 100 100  c7t2d0
 1.0    0.0    0.5    0.0   0.0   0.0     0.0     0.1   0   0  c7t0d0
 extended device statistics
 r/s   w/s    kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
 5.0   15.0  128.0   18.0   0.0   0.0     0.0     1.8   0   3  c7t1d0
 1.0    9.0   25.5   18.0   2.0   1.8   199.7   179.4 100 100  c7t2d0
 3.0   13.0  102.5   14.5   0.0   0.1     0.0     5.2   0   5  c7t3d0
 3.0   11.0  102.0   16.5   0.0   0.1     2.3     4.2   1   6  c7t4d0
 1.0    4.0   25.5    2.0   0.4   0.8    71.3   158.9  12  79  c7t5d0
 5.0   16.0  128.5   19.0   0.0   0.1     0.1     2.6   0   5  c0t1d0
 extended device statistics
 r/s   w/s    kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
 0.0    4.0    0.0    2.0   2.0   2.0   496.1   498.0  99 100  c7t2d0
 0.0    0.0    0.0    0.0   0.0   1.0     0.0     0.0   0 100  c7t5d0
 extended device statistics
 r/s   w/s    kr/s   kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
 7.0    0.0  204.5    0.0   0.0   0.0     0.0     0.2   0   0  c7t1d0
 1.0    0.0   25.5    0.0   3.0   1.0  2961.6  1000.0  99 100  c7t2d0
 8.0    0.0  282.0    0.0   0.0   0.0     0.0     0.3   0   0  c7t3d0
 6.0    0.0  282.5    0.0   0.0   0.0     6.1     2.3   1   1  c7t4d0
 0.0    3.0    0.0    5.0   0.5   1.0   165.4   333.3  18 100  c7t5d0
 7.0    0.0  204.5    0.0   0.0   0.0     0.0     1.6   0   1  c0t1d0
 2.0    2.0   89.0   12.0   0.0   0.0     3.1     6.1   1   2  c3t0d0
 0.0    2.0    0.0   12.0   0.0   0.0     0.0     0.2   0   0  c3t1d0

 Sometimes two or more disks are going at 100%. How does one solve this
 issue if it's a firmware bug? I tried looking around for Western
 Digital firmware for the WD10EADS but couldn't find any available.
  
 Would adding an SSD or two help here?
  
 Thanks,
 Em
  
 
 Date: Fri, 7 May 2010 14:38:25 -0300
 Subject: Re: [zfs-discuss] ZFS Hard disk buffer at 100%
 From: gtirl...@sysdroid.com
 To: emilygrettelis...@hotmail.com
 CC: zfs-discuss@opensolaris.org


 On Fri, May 7, 2010 at 8:07 AM, Emily Grettel
 emilygrettelis...@hotmail.com wrote:

 Hi,
  
 I've had my RAIDZ volume working well on snv_131, but it has come
 to my attention that there have been some read issues with the
 drives. Previously I thought this was a CIFS problem, but I'm
 noticing that when transferring files or uncompressing some fairly
 large 7z files (1-2GB), or even smaller RARs (200-300MB),
 occasionally running iostat will show %b at 100 for a drive or
 two.



 That's 

Re: [zfs-discuss] ZFS Hard disk buffer at 100%

2010-05-09 Thread Emily Grettel

Hi Ben,

 

 The drive (c7t2d0) is bad and should be replaced. 

 The second drive (c7t5d0) is either bad or going bad. 


Dagnabbit. I'm glad you told me this, but I would have thought that running a 
scrub would have alerted me to some fault?

 

 and as soon as you drop the bad disks things magically return to
 normal.

 

Since it's a raidz, is it OK for me to actually do a 'zpool offline' on one drive 
without degrading the entire pool?

 

I'm wondering whether I should keep using the WD10EADS drives or ask the business 
to invest in the Black versions. I was thinking of the WD1002FAEX (which is 
SATA-III, but my cards only do SATA-II), which seems to be better suited to 
NASes. What are other people's thoughts on this?

 

Here's my current layout - 1, 2 & 3 are 320GB drives.


   0. c0t1d0 ATA-WDC WD10EADS-00P-0A01-931.51GB
  /p...@0,0/pci1002,5...@4/pci1458,b...@0/d...@1,0
   4. c7t1d0 ATA-WDC WD10EADS-00L-1A01-931.51GB
  /p...@0,0/pci1458,b...@11/d...@1,0
   5. c7t2d0 ATA-WDC WD10EADS-00P-0A01-931.51GB
  /p...@0,0/pci1458,b...@11/d...@2,0
   6. c7t3d0 ATA-WDC WD10EADS-00P-0A01-931.51GB
  /p...@0,0/pci1458,b...@11/d...@3,0
   7. c7t4d0 ATA-WDC WD10EADS-00P-0A01-931.51GB
  /p...@0,0/pci1458,b...@11/d...@4,0
   8. c7t5d0 ATA-WDC WD10EADS-00P-0A01-931.51GB
  /p...@0,0/pci1458,b...@11/d...@5,0


The other thing I was thinking of is redoing the way the pool is set up: instead 
of a straight raidz layout, adopting a stripe-and-mirror arrangement - so three 
disks in RAID-0, then mirror them to the other three? Something like the sketch below.
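If I've understood how ZFS does it, that would look roughly like this (new pool 
name made up, and it would of course mean rebuilding and restoring from backup):

   # ZFS stripes across top-level vdevs, so a stripe of mirrors is just several mirror vdevs
   zpool create tank2 mirror c7t1d0 c7t2d0 mirror c7t3d0 c7t4d0 mirror c7t5d0 c0t1d0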

 

 http://www.cuddletech.com/blog/pivot/entry.php?id=993

 

Great blog entry! Unfortunately the SUNWhd package isn't available in the repo 
and I haven't been able to locate a similar SMART reader :( But your 
explanations are very valuable.
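If smartmontools can be built on snv_131, I'm guessing something along these lines 
would do it, though I gather the -d device-type flag depends on the controller, so 
this is only a guess:

   # smartctl from smartmontools; the SAT pass-through type may need adjusting per HBA
   smartctl -a -d sat,12 /dev/rdsk/c7t2d0s0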

 

 In my experience the only other reason you'll legitimately see really
 weird bottoming-out of I/O like this is if you hit the max concurrent
 I/O limit in ZFS (until recently that limit was 35), so you'd see
 actv=35, and then when the device finally processed the I/Os the thing
 would snap back to life. But even in those cases you shouldn't see
 request times (asvc_t) rise above 200ms.


Hmmm, I do remember another admin tweaking the ZFS configuration. Are these to 
blame, by chance?

 

/etc/system

set pcplusmp:apic_intr_policy=1
set zfs:zfs_txg_synctime=1
set zfs:zfs_vdev_max_pending=10

 

I've tried to avoid tweaking anything in the ZFS configuration for fear it may 
give worse performance.
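If it's any use, I believe the values the running kernel is actually using can be 
read back with mdb (going by what I've read, so treat the invocation as a guess):

   # print the current values, in decimal, from the live kernel
   echo zfs_vdev_max_pending/D | mdb -k
   echo zfs_txg_synctime/D | mdb -k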

 

 All that to say: replace those disks, or at least test it. SSDs won't
 help; one or more drives are toast.

 

Thanks mate, I really appreciate some backing about this :-)

 

Cheers,

Em
  


Re: [zfs-discuss] Is it safe to disable the swap partition?

2010-05-09 Thread Roy Sigurd Karlsbakk
- Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote:

 On Sat, 8 May 2010, Edward Ned Harvey wrote:
 
  A vast majority of the time, the opposite is true.  Most of the
 time, having
  swap available increases performance.  Because the kernel is able to
 choose:
  Should I swap out this idle process, or should I dump files out of
 cache?
  With swap enabled, the kernel is given another degree of freedom, to
 choose
  which is colder:  idle process memory, or cold cached files.
 
 Are you sure about this?  It is always good to be sure ...

This is the case with most OSes now: swap stuff out early, perhaps keep it both in 
RAM and in swap at the same time, and the kernel can choose what to do later. In 
Linux you can tune it via /proc/sys/vm/swappiness.
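For reference, on Linux that knob looks roughly like this (the default of 60 is 
from memory, so don't quote me on it):

   # see the current value (0-100)
   cat /proc/sys/vm/swappiness
   # make the kernel less eager to swap; put vm.swappiness=10 in /etc/sysctl.conf to persist
   sysctl -w vm.swappiness=10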

Does anyone know how this is tuned in OpenSolaris, btw?

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases adequate and relevant synonyms exist in Norwegian.


Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 -- cfgadm won't create attach point (dsk/xxxx)

2010-05-09 Thread Roy Sigurd Karlsbakk
- Giovanni giof...@gmail.com wrote:

 Hi,
 
 Were you ever able to solve this problem on your AOC-SAT2-MV8 card? I
 am looking at purchasing one to add more drives to my server.
 

What problem was this? I have two servers with these cards and they work well.

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases adequate and relevant synonyms exist in Norwegian.


Re: [zfs-discuss] Is it safe to disable the swap partition?

2010-05-09 Thread Edward Ned Harvey
 From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
 
 On Sat, 8 May 2010, Edward Ned Harvey wrote:
 
  A vast majority of the time, the opposite is true.  Most of the time,
 having
  swap available increases performance.  Because the kernel is able to
 choose:
  Should I swap out this idle process, or should I dump files out of
 cache?
  With swap enabled, the kernel is given another degree of freedom, to
 choose
  which is colder:  idle process memory, or cold cached files.
 
 Are you sure about this?  It is always good to be sure ...


Hehheheeh ...  I am sure of it in Linux.  I am only assuming
solaris/opensolaris are as good.  So I could be wrong.  ;-)



Re: [zfs-discuss] Is it safe to disable the swap partition?

2010-05-09 Thread Edward Ned Harvey
 From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
 
 On Sat, 8 May 2010, Edward Ned Harvey wrote:
 
  A vast majority of the time, the opposite is true.  Most of the time,
 having
  swap available increases performance.  Because the kernel is able to
 choose:
  Should I swap out this idle process, or should I dump files out of
 cache?
  With swap enabled, the kernel is given another degree of freedom, to
 choose
  which is colder:  idle process memory, or cold cached files.
 
 Are you sure about this?  It is always good to be sure ...

This is the easiest way I know to show this in Linux:

After the machine has been on, and doing things for a while (maybe hours,
maybe days) run top or free.

It is natural for free memory to decrease to near zero, while buffers and
cache climb to huge numbers.  The buffers and cache are memory allocated to
the kernel.  It is also normal to see plenty of free memory, plenty of buffers
and cache, and then see the swap usage increase to something nonzero.

This is evidence that the Linux kernel is sometimes choosing to swap out
idle processes, instead of dropping the buffers or cache usage.
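Concretely, something like this (field names are the usual procps/procfs ones; 
the exact layout varies a bit by distro):

   # free shrinks while buffers/cached grow; nonzero swap used alongside a large
   # cache means the kernel chose to swap idle pages rather than drop cache
   free -m
   grep -E 'SwapTotal|SwapFree|^Cached' /proc/meminfo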

I don't really know how to do the same in solaris/opensolaris, but I haven't
tried either.  I only know that top doesn't show the same info in solaris
10.  And then I moved on.



Re: [zfs-discuss] ZFS loses configuration

2010-05-09 Thread R.G. Keen
I'm answering my own question, having just decided to try it. Yes, anything you 
want to persist beyond a reboot with EON that's not in the ZFS pools has to have 
an image update done before shutdown.

I had this Doh! moment after I did the trial. Of course all the system config 
has to be in the system directories - which exist only in the boot image for 
EON. This realization let me fix quite a number of things which I was doing 
wrong.

But it was not obvious to me as a raw beginner at EON.


Re: [zfs-discuss] zpool import hanging

2010-05-09 Thread Eduardo Bragatto
Additionally, I would like to mention that the only ZFS filesystem not
mounting -- causing the entire 'zpool import backup' command to hang --
is the only filesystem configured to be exported via NFS:


backup/insightiq  sharenfs  root=*  local

Is there any chance the NFS share is the culprit here? If so, how to  
avoid it?


Thanks,
Eduardo Bragatto


Re: [zfs-discuss] Is it safe to disable the swap partition?

2010-05-09 Thread Bob Friesenhahn

On Sun, 9 May 2010, Roy Sigurd Karlsbakk wrote:


Are you sure about this?  It is always good to be sure ...


This is the case with most OSes now. Swap out stuff early, perhaps 
keep it in RAM and swap at the same time, and the kernel can choose 
what to do later. In Linux you can set it in 
/proc/sys/vm/swappiness.


Anyone that knows how this is tuned in osol, btw?


While this is the zfs-discuss list, usually we are talking about 
Solaris/OpenSolaris here rather than most OSes.  No?


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Mirrored Servers

2010-05-09 Thread Richard Elling
On May 8, 2010, at 7:01 PM, Tony wrote:

 Ok, this is definitely the kind of feedback I was looking for. I'll have 
 to check out the docs on these technologies it looks like. Appreciate it.
 
 I figured I would load balance the hosts with a Cisco device, since I can get 
 around the IOS ok.
 
 I want to offer a online backup service that provides high availability.

Yep, for this sort of business, a replication scheme works well.  
Note that the replication is better when it is done closer to the
application. For a cloud storage company, it is relatively easy
to modify the app to perform the redundancy and keep the storage
simple.  By contrast, if you are running a legacy app that you have
no control over, you are stuck with their architecture and adding
redundancy lower in the software stack is a necessary evil.

 Check out the following concept :
 http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/

Syncing a box of this size could take a week or two.  Consider triple
redundancy.

 I like their basic idea, but they are in the market of $5/month unlimited. I 
 want to take this to the next level. While this solution provides fault 
 tolerance for drive failures, it does not seem to have a safeguard for if 
 terrorists bomb one of your servers. So I figured ZFS would take care of the 
 soft RAID, encryption (for compliance), dedup, and that kind of neat stuff.

Yep.

 I'm still not 100% on how we're going to give access to the storage. I had 
 thought about using RSync or DFS/RDC for Win hosts. VPN for encrypted 
 transfer if needed. Would avoid CIFS probably.
 
 I guess a couple of minutes' delay in the replication is OK, if the backup 
 management software is smart enough to say "wtf, get over it" and 
 retransmit the file again without user intervention.

If you put redundancy in the infeed, then you won't have a delay like this.
 -- richard

-- 
ZFS storage and performance consulting at http://www.RichardElling.com












[zfs-discuss] How can I be sure the zfs send | zfs received is correct?

2010-05-09 Thread Jim Horng
Okay, so after some testing with dedup on snv_134, I decided we cannot use the 
dedup feature for the time being.

While unable to destroy a dedupped file system, I decided to migrate the file 
system to another pool and then destroy the pool (see below):

http://opensolaris.org/jive/thread.jspa?threadID=128532&tstart=75
http://opensolaris.org/jive/thread.jspa?threadID=128620&tstart=60


Now here is my problem.  
I did a snapshot of the file system I want to migrate.
I did a send and receive of the file system 

zfs send tank/export/projects/project1...@today | zfs receive -d mpool

but the new file system, without dedup turned on, ends up smaller than the 
original file system.  How is this possible?  Can someone explain?  I am not able 
to trust the data until I can verify that they are identical.
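One brute-force check I may fall back on (paths and the @today snapshot name are 
taken from the listings below, and I'm assuming digest -v is the right Solaris 
tool for per-file checksums):

   # checksum every file in the source and destination snapshots, then compare
   cd /export/projects/project1_nb/.zfs/snapshot/today
   find . -type f -exec digest -v -a sha1 {} + | sort > /var/tmp/src.sha1
   cd /mpool/export/projects/project1_nb/.zfs/snapshot/today
   find . -type f -exec digest -v -a sha1 {} + | sort > /var/tmp/dst.sha1
   diff /var/tmp/src.sha1 /var/tmp/dst.sha1 && echo identical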

SunOS filearch1 5.11 snv_134 i86pc i386 i86xpv Solaris
r...@filearch1:/var/adm# zpool status
  pool: mpool
 state: ONLINE
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
mpool   ONLINE   0 0 0
  c7t7d0ONLINE   0 0 0

errors: No known data errors

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
rpool   ONLINE   0 0 0
  c7t0d0s0  ONLINE   0 0 0

errors: No known data errors

  pool: tank
 state: ONLINE
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
tankONLINE   0 0 0
  raidz1-0  ONLINE   0 0 0
c7t1d0  ONLINE   0 0 0
c7t2d0  ONLINE   0 0 0
c7t3d0  ONLINE   0 0 0
c7t4d0  ONLINE   0 0 0
c7t5d0  ONLINE   0 0 0
c7t6d0  ONLINE   0 0 0

errors: No known data errors

r...@filearch1:/var/adm# zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
mpool  407G   278G22K  /mpool
mpool/export   407G   278G22K  /mpool/export
mpool/export/projects  407G   278G23K  
/mpool/export/projects
mpool/export/projects/bali_nobackup407G   278G   407G  
/mpool/export/projects/project1_nb
 ...
tank   520G  4.11T  34.9K  /tank
tank/export/projects   515G  4.11T  41.5K  /export/projects
tank/export/projects/bali_nobackup 427G  4.11T   424G  
/export/projects/project1_nb

r...@filearch1:/var/adm# zfs get compressratio
NAME   PROPERTY   VALUE  SOURCE
mpool  compressratio  2.43x  -
mpool/export   compressratio  2.43x  -
mpool/export/projects  compressratio  2.43x  -
mpool/export/projects/project1_nbcompressratio  2.43x  -
mpool/export/projects/project1...@today  compressratio  2.43x  -
tank   compressratio  2.34x  -
tank/exportcompressratio  2.34x  -
tank/export/projects   compressratio  2.34x  -
tank/export/projects/project1_nb compressratio  2.44x  -
tank/export/projects/project1...@today   compressratio  2.44x  -
tank/export/projects/project1_nb_2   compressratio  1.00x  -
tank/export/projects/project1_nb_3   compressratio  1.90x  -

r...@filearch1:/var/adm# zpool list
NAMESIZE  ALLOC   FREECAP  DEDUP  HEALTH  ALTROOT
mpool   696G   407G   289G58%  1.00x  ONLINE  -
rpool  19.9G  9.50G  10.4G47%  1.00x  ONLINE  -
tank   5.44T   403G  5.04T 7%  2.53x  ONLINE  -


Re: [zfs-discuss] Is it safe to disable the swap partition?

2010-05-09 Thread Richard Elling
On May 9, 2010, at 6:30 AM, Roy Sigurd Karlsbakk wrote:
 - Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote:
 
 On Sat, 8 May 2010, Edward Ned Harvey wrote:
 
 A vast majority of the time, the opposite is true.  Most of the
 time, having
 swap available increases performance.  Because the kernel is able to
 choose:
 Should I swap out this idle process, or should I dump files out of
 cache?
 With swap enabled, the kernel is given another degree of freedom, to
 choose
 which is colder:  idle process memory, or cold cached files.
 
 Are you sure about this?  It is always good to be sure ...
 
 This is the case with most OSes now. Swap out stuff early, perhaps keep it in 
 RAM and swap at the same time, and the kernel can choose what to do later. In 
 Linux you can set it in /proc/sys/vm/swappiness.
 
 Anyone that knows how this is tuned in osol, btw?

This is a better question for perf-discuss.

For a storage server, swap is not needed. If you notice swap being used
then your storage server is undersized.
 -- richard

-- 
ZFS storage and performance consulting at http://www.RichardElling.com












Re: [zfs-discuss] How can I be sure the zfs send | zfs received is correct?

2010-05-09 Thread Richard Elling
On May 9, 2010, at 11:16 AM, Jim Horng wrote:

 Okay, so after some testing with dedup on snv_134, I decided we cannot use the 
 dedup feature for the time being.
 
 While unable to destroy a dedupped file system, I decided to migrate the 
 file system to another pool and then destroy the pool (see below):
 
 http://opensolaris.org/jive/thread.jspa?threadID=128532&tstart=75
 http://opensolaris.org/jive/thread.jspa?threadID=128620&tstart=60
 
 
 Now here is my problem.  
 I did a snapshot of the file system I want to migrate.
 I did a send and receive of the file system 
 
 zfs send tank/export/projects/project1...@today | zfs receive -d mpool
 
 but the new file system, without dedup turned on, ends up smaller than the 
 original file system.  How is this possible?

What you think you are measuring is not what you are measuring.
Compare the size of the snapshots.
 -- richard

  Can someone explain?  I am not able to trust the data until I can verify that 
 they are identical.
 
 SunOS filearch1 5.11 snv_134 i86pc i386 i86xpv Solaris
 r...@filearch1:/var/adm# zpool status
  pool: mpool
 state: ONLINE
 scrub: none requested
 config:
 
NAMESTATE READ WRITE CKSUM
mpool   ONLINE   0 0 0
  c7t7d0ONLINE   0 0 0
 
 errors: No known data errors
 
  pool: rpool
 state: ONLINE
 scrub: none requested
 config:
 
NAMESTATE READ WRITE CKSUM
rpool   ONLINE   0 0 0
  c7t0d0s0  ONLINE   0 0 0
 
 errors: No known data errors
 
  pool: tank
 state: ONLINE
 scrub: none requested
 config:
 
NAMESTATE READ WRITE CKSUM
tankONLINE   0 0 0
  raidz1-0  ONLINE   0 0 0
c7t1d0  ONLINE   0 0 0
c7t2d0  ONLINE   0 0 0
c7t3d0  ONLINE   0 0 0
c7t4d0  ONLINE   0 0 0
c7t5d0  ONLINE   0 0 0
c7t6d0  ONLINE   0 0 0
 
 errors: No known data errors
 
 r...@filearch1:/var/adm# zfs list
 NAME   USED  AVAIL  REFER  MOUNTPOINT
 mpool  407G   278G22K  /mpool
 mpool/export   407G   278G22K  /mpool/export
 mpool/export/projects  407G   278G23K  
 /mpool/export/projects
 mpool/export/projects/bali_nobackup407G   278G   407G  
 /mpool/export/projects/project1_nb
  ...
 tank   520G  4.11T  34.9K  /tank
 tank/export/projects   515G  4.11T  41.5K  /export/projects
 tank/export/projects/bali_nobackup 427G  4.11T   424G  
 /export/projects/project1_nb
 
 r...@filearch1:/var/adm# zfs get compressratio
 NAME   PROPERTY   VALUE  SOURCE
 mpool  compressratio  2.43x  -
 mpool/export   compressratio  2.43x  -
 mpool/export/projects  compressratio  2.43x  -
 mpool/export/projects/project1_nbcompressratio  2.43x  -
 mpool/export/projects/project1...@today  compressratio  2.43x  -
 tank   compressratio  2.34x  -
 tank/exportcompressratio  2.34x  -
 tank/export/projects   compressratio  2.34x  -
 tank/export/projects/project1_nb compressratio  2.44x  -
 tank/export/projects/project1...@today   compressratio  2.44x  -
 tank/export/projects/project1_nb_2   compressratio  1.00x  -
 tank/export/projects/project1_nb_3   compressratio  1.90x  -
 
 r...@filearch1:/var/adm# zpool list
 NAMESIZE  ALLOC   FREECAP  DEDUP  HEALTH  ALTROOT
 mpool   696G   407G   289G58%  1.00x  ONLINE  -
 rpool  19.9G  9.50G  10.4G47%  1.00x  ONLINE  -
 tank   5.44T   403G  5.04T 7%  2.53x  ONLINE  -

-- 
ZFS storage and performance consulting at http://www.RichardElling.com












Re: [zfs-discuss] How can I be sure the zfs send | zfs received is correct?

2010-05-09 Thread Jim Horng
size of snapshot?

r...@filearch1:/var/adm# zfs list mpool/export/projects/project1...@today
NAMEUSED  AVAIL  REFER  MOUNTPOINT
mpool/export/projects/project1...@today  0  -   407G  -
r...@filearch1:/var/adm# zfs list tank/export/projects/project1...@today
NAME   USED  AVAIL  REFER  MOUNTPOINT
tank/export/projects/project1...@today  2.44G  -   424G  -


[zfs-discuss] ZFS Disk Drive Qualification

2010-05-09 Thread Lutz Schumann
Hello,

I see strange behaviour when qualifying disk drives for ZFS. The tests I want 
to run should make sure that the drives honour the cache flush command. For 
this I do the following: 

1) Create single-disk pools (only one disk in the pool)
2) Perform I/O on the pools
This is done via SQLite transactions. As soon as a transaction is 
committed to the calling application, I record its number (an 
increasing number). See the sketch after this list.
3) Then I pull the disk 
4) I remember the last committed number 
5) I power off the server
6) I plug the disk back in 
7) I power on the server 
8) I verify that the last committed number is on disk
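The gist of step 2, in case it matters (only a sketch - the real harness does 
more, and I assume here that the test pool is mounted at /testpool):

   # sqlite3 commits synchronously by default, so each INSERT should be on stable
   # storage before the command returns; only then do I advance the marker file
   sqlite3 /testpool/qual.db 'create table if not exists txlog(seq integer);'
   i=0
   while sqlite3 /testpool/qual.db "insert into txlog(seq) values($((i+1)));"
   do
       i=$((i+1))
       echo $i > /var/tmp/last_committed
   done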

Here I find that this always fails by one transaction: one transaction is 
acknowledged as committed but is not on disk.

The strange thing is that if I wait just 10 seconds after the disk pull and only 
then do the reboot, all transactions are on disk.

To me it looks like the in-flight I/O is acknowledged as committed although it 
never reaches the disk.

Any tips on how to investigate further?

Regards, 
Robert


Re: [zfs-discuss] Is it safe to disable the swap partition?

2010-05-09 Thread Karl Dalen
I know that according to the documentation Solaris is supposed to be
fully operational in the absence of swap devices. However, I've experienced
cases -- whose root cause I have not yet been able to trace -- where disk
access has increased drastically and caused the system to hang, but it may be
more of a performance issue.

One concern is that I have applications that create a lot of /tmp files,
and they may end up consuming all RAM. I assume /tmp files cannot be
swapped out to make room for new processes without a swap device,
so the malloc failures in the applications will come much sooner.
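(If it helps, I believe /tmp can be capped with tmpfs's size option in 
/etc/vfstab - the line below is what I have in mind, though it only moves the 
failure point rather than removing it.)

   # /etc/vfstab entry limiting /tmp to 2 GB (size= is a tmpfs mount option)
   swap    -       /tmp    tmpfs   -       yes     size=2g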

I wonder whether cached files or process pages have the higher priority for
not being swapped out under the Solaris paging policy?

/Karl D


Re: [zfs-discuss] ZFS Hard disk buffer at 100%

2010-05-09 Thread Eric D. Mudama

On Sat, May  8 at 23:39, Ben Rockwood wrote:

The drive (c7t2d0) is bad and should be replaced.   The second drive
(c7t5d0) is either bad or going bad.  This is exactly the kind of
problem that can force a Thumper to its knees: ZFS performance is
horrific, and as soon as you drop the bad disks things magically return to
normal.


The problem is that the OP is mixing client 4K drives with 512B drives.  They
may not actually be bad, but they appear to be getting misused in
this application.

I doubt they're broken per se; they're just dramatically slower
than their peers in this workload.

As a replacement recommendation, we've been beating on the WD 1TB RE3
drives for 18 months or so, and we're happy with both performance and
the price for what we get.  $160/ea with a 5 year warranty.

--eric

--
Eric D. Mudama
edmud...@mail.bounceswoosh.org



Re: [zfs-discuss] Is it safe to disable the swap partition?

2010-05-09 Thread Mike Gerdts
On Sun, May 9, 2010 at 7:40 PM, Edward Ned Harvey
solar...@nedharvey.com wrote:

  From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
  boun...@opensolaris.org] On Behalf Of Richard Elling
 
  For a storage server, swap is not needed. If you notice swap being used
  then your storage server is undersized.

 Indeed, I have two solaris 10 fileservers that have uptime in the range of a
 few months.  I just checked swap usage, and they're both zero.

 So, Bob, rub it in if you wish.  ;-)  I was wrong.  I knew the behavior in
 Linux, which Roy seconded as most OSes, and apparently we both assumed the
 same here, but that was wrong.  I don't know if solaris and opensolaris both
 have the same swap behavior.  I don't know if there's *ever* a situation
 where solaris/opensolaris would swap idle processes.  But there's at least
 evidence that my two servers have not, or do not.

If Solaris is under memory pressure, pages may be paged to swap.
Under severe memory pressure, entire processes may be swapped.  This
will happen after freeing up the memory used for file system buffers,
ARC, etc.  If the processes never page in the pages that have been
paged out (or the processes that have been swapped out are never
scheduled) then those pages will not consume RAM.
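A rough way to watch for this on Solaris (nothing exotic, just the standard tools):

   # a sustained nonzero scan rate (sr) and page-outs (po) indicate memory pressure;
   # swap -s / swap -l show how much swap is actually in use and where
   vmstat 5
   swap -s
   swap -l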

The best thing to do with processes that can be swapped out forever is
to not run them.

--
Mike Gerdts
http://mgerdts.blogspot.com/


Re: [zfs-discuss] Is it safe to disable the swap partition?

2010-05-09 Thread Daniel Carosone
On Sun, May 09, 2010 at 09:24:38PM -0500, Mike Gerdts wrote:
 The best thing to do with processes that can be swapped out forever is
 to not run them.

Agreed, however:

#1  Shorter values of "forever" (like, say, daily) may still be useful.
#2  This relies on knowing in advance what these processes will be.
#3  Where are the JeOS builds without all the gnome-infested likely suspects?

--
Dan.



Re: [zfs-discuss] Is it safe to disable the swap partition?

2010-05-09 Thread Bob Friesenhahn

On Sun, 9 May 2010, Edward Ned Harvey wrote:


So, Bob, rub it in if you wish.  ;-)  I was wrong.  I knew the behavior in
Linux, which Roy seconded as most OSes, and apparently we both assumed the
same here, but that was wrong.  I don't know if solaris and opensolaris both
have the same swap behavior.  I don't know if there's *ever* a situation
where solaris/opensolaris would swap idle processes.  But there's at least
evidence that my two servers have not, or do not.


Solaris and Linux are different in many ways since they are completely 
different operating systems.  Solaris 2.X has never swapped processes. 
It only sends dirty pages to the paging device if there is a shortage 
of pages when more are requested, or if there are not enough free, but 
first it will purge seldom accessed read-only pages which can easily 
be restored.  Zfs has changed things up again by not caching file data 
via the unified page cache and using a specialized ARC instead.  It 
seems that simple paging and MMU control was found not to be smart 
enough.
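(For the curious, the ARC's footprint is visible via kstat - e.g. the line 
below, assuming the usual single zfs kstat instance.)

   # current ARC size in bytes (module:instance:name:statistic)
   kstat -p zfs:0:arcstats:size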


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


[zfs-discuss] ZFS and Comstar iSCSI BLK size

2010-05-09 Thread Geoff Nordli
I am using ZFS as the backing store for an iSCSI target running a virtual
machine.

 

I am looking at using an 8K block size on the ZFS volume.

 

I was looking at the COMSTAR iSCSI settings and there is also a blk size
setting, which defaults to 512 bytes. That would make me believe that
all of the I/O will be broken down into 512-byte chunks, which seems very
inefficient.

 

It seems this value should match the file system allocation/cluster size in
the VM -- maybe 4K if you are using an NTFS file system.
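In case it helps frame the question, the two knobs I'm looking at are roughly 
these (names and sizes invented for illustration, and I'm going from the stmfadm 
man page for the blk property):

   # 8K-block zvol for the VM (volblocksize can only be set at creation time)
   zfs create -V 100G -o volblocksize=8k tank/vm/guest01

   # COMSTAR logical unit on top of it, with a matching block size
   stmfadm create-lu -p blk=8192 /dev/zvol/rdsk/tank/vm/guest01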

 

Does anyone have any input on this?

 

Thanks,

 

Geoff 

 

 
