Re: [zfs-discuss] Opensolaris is apparently dead

2010-08-18 Thread Ethan Erchinger

Frank wrote:
> Have you dealt with RedHat Enterprise support?  lol.

Have you dealt with Sun/Oracle support lately? lololol  It's a disaster.
We've had a failed disk in a fully supported Sun system for over 3 weeks,
Explorer data turned in, and we've been given the runaround forever.  The
7000 series support is no better, possibly worse.

> The enterprise is going to continue to want Oracle on Solaris.

The enterprise wants what they used to get from Sun, not what's
currently being offered.

Ethan
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris is apparently dead

2010-08-18 Thread Ethan Erchinger
Edward wrote:
> That is really weird.  What are you calling failed?  If you're getting
> either a red blinking light, or a checksum failure on a device in a
> zpool... You should get your replacement with no trouble.

Yes, failed, with all the normal signs of failure: cfgadm not finding it,
FAULTED in zpool output.

> I have had wonderful support, up to and including recently, on my Sun
> hardware.

I wish we had the same luck.  We've been handed off between 3 different
technicians at this point, each one asking for the same information.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Storage system with 72 GB memory constantly has 11 GB free memory

2010-02-26 Thread Ethan Erchinger
I would probably tune lotsfree down as well.  With 72 GB of RAM it's
currently reserving around 1.1 GB.

http://docs.sun.com/app/docs/doc/819-2724/6n50b07bk?a=view
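
For reference, lotsfree defaults to physmem/64, which is where the ~1.1 GB
figure on a 72 GB box comes from.  A minimal sketch of checking and lowering
it (the 65536-page value is purely illustrative, not a recommendation):

echo 'lotsfree/E' | mdb -k          # current value, in pages

* in /etc/system -- 65536 pages is ~256 MB at 4 KB pages
set lotsfree = 65536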

Ethan

-Original Message-
From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Tomas Ögren
Sent: Friday, February 26, 2010 6:45 AM
To: Ronny Egner
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS Storage system with 72 GB memory constantly
 has 11 GB free memory

On 26 February, 2010 - Ronny Egner sent me these 0,6K bytes:

> Dear All,
>
> our storage system running opensolaris b133 + ZFS has a lot of memory for
> caching, 72 GB total.  While testing we observed that free memory never
> falls below 11 GB.
>
> Even if we create a ram disk, free memory drops below 11 GB but is back at
> 11 GB shortly after (I assume the ARC is shrunk in this context).
>
> As far as I know, ZFS is designed to use all memory except 1 GB for caching

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/arc.c#arc_init

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/arc.c#arc_reclaim_needed


So you have a max limit which it won't try to go past, but also a "keep
this much free for the rest of the system" floor.  Both are a bit too
protective for a pure ZFS/NFS server in my opinion (but can be tuned).

You can check most variables with f.ex:
echo freemem/D | mdb -k


On one server here, I have in /etc/system:

* 
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Limiting_the_ARC_Cache
* about 7.8*1024*1024*1024, must be < physmem*pagesize
* (2062222*4096=8446861312 right now)
set zfs:zfs_arc_max = 8350000000
set zfs:zfs_arc_meta_limit = 70
* some tuning
set ncsize = 50
set nfs:nrnode = 5


And I've done runtime modifications to swapfs_minfree to force usage of another
chunk of memory.
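
(Roughly like this, for anyone curious -- swapfs_minfree is the real kernel
variable, but the value written here is only an example, and on a 64-bit
kernel it is an 8-byte quantity, hence /E and /Z:)

echo 'swapfs_minfree/E' | mdb -k            # current value, in pages
echo 'swapfs_minfree/Z 0t65536' | mdb -kw   # set it to 65536 pages (~256 MB)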


/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] data disappear

2009-08-17 Thread Ethan Erchinger
> I have installed open solaris, build 111. I also added some packages
> from www.sunfreeware.com to my system and other tools (compiled by me)
> to /opt.
> Problem is that all new data (added by me) gets lost after some days.
> The disk looks like (for example) the packages from sunfreeware were
> never installed on my system.
> I think that there is something with snapshots, but I don't know what.
 
Did you happen to perform a pkg image-update, then install sunfreeware 
packages, then reboot?  Sounds like a zfs boot environment change to me.
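
A quick way to check -- a minimal sketch, assuming beadm is available on
build 111; the BE name is hypothetical, take it from the list output:

$ beadm list                              # which boot environment is active now?
$ pfexec beadm mount opensolaris-1 /mnt   # mount the previous BE
$ ls /mnt/opt                             # the "lost" files may still be there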
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS write I/O stalls

2009-06-24 Thread Ethan Erchinger
>>  http://opensolaris.org/jive/thread.jspa?threadID=105702&tstart=0
>
> Yes, this does sound very similar.  It looks to me like data from read
> files is clogging the ARC so that there is no more room for more
> writes when ZFS periodically goes to commit unwritten data.

I'm wondering if changing txg_time to a lower value might help.
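
Something along these lines, heavily caveated -- this assumes the tunable is
still exposed as txg_time on this build (newer builds use zfs_txg_timeout
instead, so check which symbol exists first), and the 1-second value is only
illustrative:

echo 'txg_time/D' | mdb -k          # confirm the symbol and read it (seconds)
echo 'txg_time/W 0t1' | mdb -kw     # drop it to 1 second at runtime
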
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] problems with l2arc in 2009.06

2009-06-18 Thread Ethan Erchinger
 
>> correct ratio of arc to l2arc?
>
> from http://blogs.sun.com/brendan/entry/l2arc_screenshots
 
Thanks Rob. Hmm...that ratio isn't awesome.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] problems with l2arc in 2009.06

2009-06-17 Thread Ethan Erchinger
Hi all,

Since we've started running 2009.06 on a few servers we seem to be
hitting a problem with l2arc that causes it to stop receiving evicted
arc pages.  Has anyone else seen this kind of problem?

The filesystem contains about 130G of compressed (lzjb) data, and looks
like:
$ zpool status -v data
  pool: data
 state: ONLINE
 scrub: none requested
config:

NAME   STATE READ WRITE CKSUM
data   ONLINE   0 0 0
  mirror   ONLINE   0 0 0
c1t1d0p0   ONLINE   0 0 0
c1t9d0p0   ONLINE   0 0 0
  mirror   ONLINE   0 0 0
c1t2d0p0   ONLINE   0 0 0
c1t10d0p0  ONLINE   0 0 0
  mirror   ONLINE   0 0 0
c1t3d0p0   ONLINE   0 0 0
c1t11d0p0  ONLINE   0 0 0
logs   ONLINE   0 0 0
  c1t7d0p0 ONLINE   0 0 0
  c1t15d0p0 ONLINE   0 0 0
cache
  c1t14d0p0 ONLINE   0 0 0
  c1t6d0p0 ONLINE   0 0 0

$ zpool iostat -v data
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
data          133G   275G    334    926  2.35M  8.62M
  mirror     44.4G  91.6G    111    257   799K  1.60M
    c1t1d0p0     -      -     55    145   979K  1.61M
    c1t9d0p0     -      -     54    145   970K  1.61M
  mirror     44.3G  91.7G    111    258   804K  1.61M
    c1t2d0p0     -      -     55    140   979K  1.61M
    c1t10d0p0    -      -     55    140   973K  1.61M
  mirror     44.4G  91.6G    111    258   801K  1.61M
    c1t3d0p0     -      -     55    145   982K  1.61M
    c1t11d0p0    -      -     55    145   975K  1.61M
  c1t7d0p0      12K  29.7G      0     76     71  1.90M
  c1t15d0p0    152K  29.7G      0     78     11  1.96M
cache             -      -      -      -      -      -
  c1t14d0p0   51.3G  23.2G     51     35   835K  4.07M
  c1t6d0p0    48.7G  25.9G     45     34   750K  3.86M
-----------  -----  -----  -----  -----  -----  -----

After adding quite a bit of data to l2arc, it quits getting new writes,
and read traffic is quite low, even though arc misses are quite high:
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
data          133G   275G    550    263  3.85M  1.57M
  mirror     44.4G  91.6G    180      0  1.18M      0
    c1t1d0p0     -      -     88      0  3.22M      0
    c1t9d0p0     -      -     91      0  3.36M      0
  mirror     44.3G  91.7G    196      0  1.29M      0
    c1t2d0p0     -      -     95      0  2.74M      0
    c1t10d0p0    -      -    100      0  3.60M      0
  mirror     44.4G  91.6G    174      0  1.38M      0
    c1t3d0p0     -      -     85      0  2.71M      0
    c1t11d0p0    -      -     88      0  3.34M      0
  c1t7d0p0       8K  29.7G      0    131      0   790K
  c1t15d0p0    156K  29.7G      0    131      0   816K
cache             -      -      -      -      -      -
  c1t14d0p0   51.3G  23.2G     16      0   271K      0
  c1t6d0p0    48.7G  25.9G     14      0   224K      0
-----------  -----  -----  -----  -----  -----  -----

$ perl arcstat.pl
    Time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
21:21:31   10M    5M     53    5M   53     0    0    2M   31   857M    1G
21:21:32   209    84     40    84   40     0    0    60   32   833M    1G
21:21:33   255    57     22    57   22     0    0     9    4   832M    1G
21:21:34   630   483     76   483   76     0    0   232   63   831M    1G

Arcstats output, just for completeness:
$ kstat -n arcstats
module: zfs                             instance: 0
name:   arcstats                        class:    misc
        c                               1610325248
        c_max                           2147483648
        c_min                           1610325248
        crtime                          129.137246015
        data_size                       528762880
        deleted                         14452910
        demand_data_hits                589823
        demand_data_misses              3812972
        demand_metadata_hits            4477921
        demand_metadata_misses          2069450
        evict_skip                      5347558
        hash_chain_max                  13
        hash_chains                     521232
        hash_collisions                 9991276
        hash_elements                   1750708
        hash_elements_max               2627838
        hdr_size                        25463208
        hits                            5067744
        l2_abort_lowmem                 3225
        l2_cksum_bad                    0

Re: [zfs-discuss] problems with l2arc in 2009.06

2009-06-17 Thread Ethan Erchinger
 
> This is a mysql database server, so if you are wondering about the
> smallish arc size, it's being artificially limited by set
> zfs:zfs_arc_max = 0x80000000 in /etc/system, so that the majority of
> ram can be allocated to InnoDB.
 
I was told offline that it's likely because my arc size has been limited
to a point that it cannot utilize l2arc correctly.  Can anyone tell me
the correct ratio of arc to l2arc?
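
For what it's worth, the explanation I was given is that every L2ARC block
needs a header held in the ARC itself, so a hard 2 GB ARC cap also caps how
much L2ARC can usefully be indexed.  The overhead should be visible in
arcstats -- a sketch, assuming these two statistics exist on 2009.06:

$ kstat -p zfs:0:arcstats:l2_hdr_size zfs:0:arcstats:l2_size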

Thanks again,
Ethan
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-06 Thread Ethan Erchinger


Ethan Erchinger wrote:
> Here is a sample set of messages at that time.  It looks like timeouts
> on the SSD for various requested blocks.  Maybe I need to talk with
> Intel about this issue.

Keeping everyone up-to-date, for those who care, I've RMAd the Intel 
drive, and will retest when the replacement arrives.  I'm working under 
the assumption that I have a bad drive.

Ethan
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-05 Thread Ethan Erchinger
Richard Elling wrote:
> The answer may lie in the /var/adm/messages file which should report
> if a reset was received or sent.
Here is a sample set of messages at that time.  It looks like timeouts 
on the SSD for various requested blocks.  Maybe I need to talk with 
Intel about this issue.
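
For a quick summary of what the sd driver has accumulated against that disk
before I go to Intel -- a sketch, with sd16 / serial CVEM840201EU taken from
the log below:

$ iostat -En                 # look for sd16: Soft/Hard/Transport error counts
$ fmdump -e | wc -l          # rough count of error reports logged so far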

Ethan
==

Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd16):
Dec  2 20:14:01 opensolaris Error for Command: 
write   Error Level: Retryable
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.notice]   
Requested Block: 840   Error Block: 840
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.notice]   Vendor: 
ATASerial Number: CVEM840201EU
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.notice]   Sense 
Key: Unit_Attention
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.notice]   ASC: 
0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Dec  2 20:15:08 opensolaris scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:15:08 opensolaris Disconnected command timeout for Target 15
Dec  2 20:15:09 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:15:09 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:15:09 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:15:09 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:15:09 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:15:09 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:15:09 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:15:09 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:15:09 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:15:09 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:15:09 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:15:09 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd16):
Dec  2 20:15:12 opensolaris Error for Command: 
write   Error Level: Retryable
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.notice]   
Requested Block: 810   Error Block: 810
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.notice]   Vendor: 
ATASerial Number: CVEM840201EU
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.notice]   Sense 
Key: Unit_Attention
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.notice]   ASC: 
0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Dec  2 20:16:19 opensolaris scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:16:19 opensolaris Disconnected command timeout for Target 15
Dec  2 20:16:21 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:16:21 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:16:21 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:16:21 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:16:21 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:16:21 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:16:21 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:16:21 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:16:21 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:16:21 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:16:21 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:16:21 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-04 Thread Ethan Erchinger


Richard Elling wrote:

> I've seen these symptoms when a large number of errors were reported
> in a short period of time and memory was low.  What does fmdump -eV
> show?

fmdump -eV shows lots of messages like this, and yea, I believe that to 
be sd16 which is the SSD:

Dec 03 2008 08:31:11.224690595 ereport.io.scsi.cmd.disk.dev.rqs.derr
nvlist version: 0
class = ereport.io.scsi.cmd.disk.dev.rqs.derr
ena = 0x76bbfe8ef9000c01
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /[EMAIL PROTECTED],0/pci10de,[EMAIL 
PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
devid = id1,[EMAIL PROTECTED]
(end detector)

driver-assessment = retry
op-code = 0x2a
cdb = 0x2a 0x0 0x1 0xe8 0x20 0x88 0x0 0x0 0x3 0x0
pkt-reason = 0x0
pkt-state = 0x37
pkt-stats = 0x0
stat-code = 0x2
key = 0x6
asc = 0x29
ascq = 0x0
sense-data = 0x70 0x0 0x6 0x0 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0x0 
0x29 0x0 0x0 0x0 0x0 0x0 0x0 0x0
__ttl = 0x1
__tod = 0x4936b44f 0xd6481a3

Dec 03 2008 08:31:11.224690595 ereport.io.scsi.cmd.disk.recovered
nvlist version: 0
class = ereport.io.scsi.cmd.disk.recovered
ena = 0x76bbfe8ef9000c01
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /[EMAIL PROTECTED],0/pci10de,[EMAIL 
PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
devid = id1,[EMAIL PROTECTED]
(end detector)

driver-assessment = recovered
op-code = 0x2a
cdb = 0x2a 0x0 0x1 0xe8 0x20 0x88 0x0 0x0 0x3 0x0
pkt-reason = 0x0
pkt-state = 0x1f
pkt-stats = 0x0
__ttl = 0x1
__tod = 0x4936b44f 0xd6481a3

> Also, it would help to know what OS release you are using.
Oh, it was at the bottom of my original message, but I just realized 
that I'm actually running snv_99, forgot that I had performed an 
image-update a while back.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-04 Thread Ethan Erchinger


Ross wrote:
> I'm no expert, but the first thing I'd ask is whether you could repeat that
> test without using compression?  I'd be quite worried about how a system is
> going to perform when it's basically running off a 50GB compressed file.

Yes, this does occur with compression off, but it's harder to repeat.  I 
think that's only because physical memory happens to closely match the 
size of the SSD: 32G of ram, 32G SSD, both with some overhead taking the 
respective usable sizes to 26-28G.
> There seem to be a lot of variables here, on quite a few new systems, and I'd
> want to try to focus on individual subsystems to ensure they're all working
> to spec first.

New systems?  SSD yes.  But the application is very well tested, 
although using malloc(), not mmap().  Compression isn't new, lots of 
people use it.  I've run this system, running MySQL, through many many 
tests and it's worked quite nicely.  Many variables, yes, lots of new 
code paths, I'd hope not.
> If you can try without compression, the next thing I'd want to test is to try
> with a file that's smaller than RAM.  That should hopefully eliminate paging
> or RAM pressure, and allow you to verify that the storage subsystem is
> performing as expected.  You could then re-enable compression and see if it
> still works.

Running with a file smaller than RAM does not have this issue.  As a 
test I've run things a little differently: I put the SSD in as swap and 
used malloc() again, but allocated a dataset larger than physical memory, 
and I've seen very similar stalls.  There is zero IO, but the application 
is blocked on something.  I guess I should insert some debug code, or use 
dtruss to see if the application is waiting on a syscall.
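
Something like the following, probably -- a sketch, with <pid> standing in
for the application's process id:

$ pstack <pid>               # user-level stacks of every thread
$ truss -p <pid>             # is it parked in a syscall, and which one?
$ echo "0t<pid>::pid2proc | ::walk thread | ::findstack -v" | mdb -k
                             # kernel-side stacks, if truss shows nothing
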
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-04 Thread Ethan Erchinger
Tim wrote:


> Are you leaving ANY ram for zfs to do its thing?  If you're consuming
> ALL system memory for just this file/application, I would expect the
> system to fall over and die.

Hmm.  I believe that the kernel should manage that relationship for me.  
If the system cannot manage swap or paging on large mmap() files, there 
is something very wrong with the virtual memory manager in the kernel.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-04 Thread Ethan Erchinger


Richard Elling wrote:

>> asc = 0x29
>> ascq = 0x0
>
> ASC/ASCQ 29/00 is POWER ON, RESET, OR BUS DEVICE RESET OCCURRED
> http://www.t10.org/lists/asc-num.htm#ASC_29
>
> [this should be more descriptive as the codes are, more-or-less,
> standardized, I'll try to file an RFE, unless someone beat me to it]
>
> Depending on which system did the reset, it should be noted in the
> /var/adm/messages log.  This makes me suspect the hardware (firmware,
> actually).
 actually).

Firmware of the SSD, or something else?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] help diagnosing system hang

2008-12-03 Thread Ethan Erchinger
Hi all,

First, I'll say my intent is not to spam a bunch of lists, but after 
posting to opensolaris-discuss I had someone communicate with me offline 
that these lists would possibly be a better place to start.  So here we 
are. For those on all three lists, sorry for the repetition.

Second, this message is meant to solicit help in diagnosing the issue 
described below.  Any hints on how DTrace may help, or where in general 
to start would be much appreciated.  Back to the subject at hand.

---

I'm testing an application which makes use of a large file mmap'd into 
memory, as if the application were using malloc().  The file is 
roughly 2x the size of physical ram.  Basically, I'm seeing the system 
stall for long periods of time, 60+ seconds, and then resume.  The file 
lives on an SSD (Intel x25-e) and I'm using zfs's lzjb compression to 
make more efficient use of the ~30G of space provided by that SSD.

The general flow of things is, start application, ask it to use a 50G 
file. The file is created in a sparse manner at the location
designated, then mmap is called on the entire file.  All fine up to this
point.

I then start loading data into the application, and it starts pushing
data to the file as you'd expect.  Data is pushed to the file early and 
often, as it's mmap'd with the MAP_SHARED flag.  But, when the 
application's resident size reaches about 80% of the physical ram on the 
system, the system starts paging and things are still working relatively 
well, though slower, as expected.

Soon after, when reaching about 40G of data, I get stalls accessing the
SSD (according to iostat), in other words, no IO to that drive.  When I
started looking into what could be causing it, such as IO timeouts, I
run dmesg and it hangs after printing a timestamp.  I can ctrl-c dmesg,
but subsequent runs provide no better results.  I see no new messages in
/var/adm/messages, as I'd expect.

Eventually the system recovers, the latest case took over 10 minutes to
recover, after killing the application mentioned above, and I do see
disk timeouts in dmesg.

So, I can only assume that there's either a driver bug in the SATA/SAS
controller I'm using and it's throwing timeouts, or the SSD is having
issues.  Looking at the zpool configuration, I see that failmode=wait,
and since that SSD is the only member of the zpool I would expect IO to
hang.

But, does that mean that dmesg should hang also?  Does that mean that
the kernel has at least one thread stuck?  Would failmode=continue be
more desired, or resilient?
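
The property itself is easy to inspect and flip -- a sketch, using "ssdpool"
as a stand-in for the pool name:

$ zpool get failmode ssdpool
$ pfexec zpool set failmode=continue ssdpool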

During the hang, load-avg is artificially high, fmd being the one
process that sticks out in prstat output.  But fmdump -v doesn't show
anything relevant.

Anyone have ideas on how to diagnose what's going on there?
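
In case it helps: the sort of data I can try to capture during the stall,
assuming mdb -k and DTrace still respond -- a sketch, suggestions for better
probes welcome:

echo "::threadlist -v" | mdb -k      # kernel stacks of all threads
dtrace -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-10s { exit(0); }'
                                     # sample kernel stacks for ~10 seconds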

Thanks,
Ethan

System: Sun x4240 dual-amd2347, 32G of ram
SAS/SATA Controller: LSI3081E
OS: osol snv_98
SSD: Intel x25-e


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Cannot remove slog device from zpool

2008-10-26 Thread Ethan Erchinger

Hello,

I've looked quickly through the archives and haven't found mention of 
this issue.  I'm running SXCE (snv_99), which I believe uses zfs version 
13.  I had an existing zpool:
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Cannot remove slog device from zpool

2008-10-26 Thread Ethan Erchinger
Sorry for the first incomplete send,  stupid Ctrl-Enter. :-)

Hello,

I've looked quickly through the archives and haven't found mention of 
this issue.  I'm running SXCE (snv_99), which uses zfs version 13.  I 
had an existing zpool:
--
[EMAIL PROTECTED] ~]$ zpool status -v data
  pool: data
 state: ONLINE
 scrub: none requested
config:

NAME   STATE READ WRITE CKSUM
data   ONLINE   0 0 0
  mirror   ONLINE   0 0 0
c4t1d0p0   ONLINE   0 0 0
c4t9d0p0   ONLINE   0 0 0
  ...
cache
  c4t15d0p0 ONLINE   0 0 0

errors: No known data errors

--

The cache device (c4t15d0p0) is an Intel SSD.  To test zil, I removed 
the cache device, and added it as a log device:
--
[EMAIL PROTECTED] ~]$ pfexec zpool remove data c4t15d0p0
[EMAIL PROTECTED] ~]$ pfexec zpool add data log c4t15d0p0
[EMAIL PROTECTED] ~]$ zpool status -v data
  pool: data
 state: ONLINE
 scrub: none requested
config:

NAME   STATE READ WRITE CKSUM
data   ONLINE   0 0 0
  mirror   ONLINE   0 0 0
c4t1d0p0   ONLINE   0 0 0
c4t9d0p0   ONLINE   0 0 0
  ...
logs   ONLINE   0 0 0
  c4t15d0p0 ONLINE   0 0 0

errors: No known data errors
--

The device is working fine.  I then said, that was fun, time to remove 
and add as cache device.  But that doesn't seem possible:
--
[EMAIL PROTECTED] ~]$ pfexec zpool remove data c4t15d0p0
cannot remove c4t15d0p0: only inactive hot spares or cache devices can 
be removed
--

I've also tried using detach, offline, each failing in other more 
obvious ways.  The manpage does say that those devices should be 
removable/replaceable.  At this point the only way to reclaim my SSD 
device is to destroy the zpool.

Just in-case you are wondering about versions:
--
[EMAIL PROTECTED] ~]$ zpool upgrade data
This system is currently running ZFS pool version 13.

Pool 'data' is already formatted using the current version.
[EMAIL PROTECTED] ~]$ uname -a
SunOS opensolaris 5.11 snv_99 i86pc i386 i86pc
--

Any ideas?

Thanks,
Ethan
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS write performance on boot disk

2008-10-26 Thread Ethan Erchinger

William Bauer wrote:

> I've done some more research, but would still greatly appreciate someone
> helping me understand this.
>
> It seems that writes to only the home directory of the person logged in to the
> console suffer from degraded performance.  If I write to a subdirectory
> beneath my home, or to any other directory on the system, performance is great.
> But if I have a session on the console, no matter where else I test from
> (Gnome or a remote shell), writes ONLY to my home suffer.  If I log out of the
> console and then SSH in from another system, writes to the home directory no
> longer suffer from degraded performance.
>
> This has proven true on every OpenSolaris system I've tried--all of which are
> using ZFS.  So what is it about logging into the console that slows write
> performance to ONLY the top-level home directory of the username on the same
> console?


Maybe something to do with autofs?  What happens if you move your home dir?
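
A quick way to check -- a sketch, with <user> as a placeholder and assuming
the usual /export/home layout:

$ svcs autofs                        # is the automounter online?
$ grep home /etc/auto_master /etc/auto_home
$ /usr/sbin/mount -p | grep home     # what is actually mounted on /home/<user>
                                     # versus /export/home/<user>?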

Ethan
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss