Re: [zfs-discuss] help diagnosing system hang

2008-12-06 Thread Ethan Erchinger


Ethan Erchinger wrote:
> Here is a sample set of messages at that time.  It looks like timeouts 
> on the SSD for various requested blocks.  Maybe I need to talk with 
> Intel about this issue.
>   
Keeping everyone up-to-date, for those who care, I've RMAd the Intel 
drive, and will retest when the replacement arrives.  I'm working under 
the assumption that I have a bad drive.

Ethan
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-05 Thread Ethan Erchinger
Richard Elling wrote:
> The answer may lie in the /var/adm/messages file which should report
> if a reset was received or sent.
Here is a sample set of messages at that time.  It looks like timeouts 
on the SSD for various requested blocks.  Maybe I need to talk with 
Intel about this issue.

Ethan
==

Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd16):
Dec  2 20:14:01 opensolaris Error for Command: 
write   Error Level: Retryable
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.notice]   
Requested Block: 840   Error Block: 840
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.notice]   Vendor: 
ATASerial Number: CVEM840201EU
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.notice]   Sense 
Key: Unit_Attention
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.notice]   ASC: 
0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Dec  2 20:15:08 opensolaris scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:15:08 opensolaris Disconnected command timeout for Target 15
Dec  2 20:15:09 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:15:09 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:15:09 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:15:09 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:15:09 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:15:09 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:15:09 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:15:09 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:15:09 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:15:09 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:15:09 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:15:09 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL 
PROTECTED],0 (sd16):
Dec  2 20:15:12 opensolaris Error for Command: 
write   Error Level: Retryable
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.notice]   
Requested Block: 810   Error Block: 810
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.notice]   Vendor: 
ATASerial Number: CVEM840201EU
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.notice]   Sense 
Key: Unit_Attention
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.notice]   ASC: 
0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Dec  2 20:16:19 opensolaris scsi: [ID 107833 kern.warning] WARNING: 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:16:19 opensolaris Disconnected command timeout for Target 15
Dec  2 20:16:21 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:16:21 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:16:21 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:16:21 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:16:21 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:16:21 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:16:21 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:16:21 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:16:21 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Dec  2 20:16:21 opensolaris scsi: [ID 365881 kern.info] 
/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):
Dec  2 20:16:21 opensolaris Log info 0x3114 received for target 15.
Dec  2 20:16:21 opensolaris scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-05 Thread Richard Elling
Ethan Erchinger wrote:
> 
> Richard Elling wrote:
>>>
>>>asc = 0x29
>>>ascq = 0x0
>>
>> ASC/ASCQ 29/00 is POWER ON, RESET, OR BUS DEVICE RESET OCCURRED
>> http://www.t10.org/lists/asc-num.htm#ASC_29
>>
>> [this should be more descriptive as the codes are, more-or-less,
>> standardized, I'll try to file an RFE, unless someone beat me to it]
>>
>> Depending on which system did the reset, it should be noted in the
>> /var/adm/messages log.  This makes me suspect the hardware (firmware,
>> actually).
>>
> Firmware of the SSD, or something else?

The answer may lie in the /var/adm/messages file which should report
if a reset was received or sent.
  -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-04 Thread Ethan Erchinger


Richard Elling wrote:
>>
>>asc = 0x29
>>ascq = 0x0
>
> ASC/ASCQ 29/00 is POWER ON, RESET, OR BUS DEVICE RESET OCCURRED
> http://www.t10.org/lists/asc-num.htm#ASC_29
>
> [this should be more descriptive as the codes are, more-or-less,
> standardized, I'll try to file an RFE, unless someone beat me to it]
>
> Depending on which system did the reset, it should be noted in the
> /var/adm/messages log.  This makes me suspect the hardware (firmware,
> actually).
>
Firmware of the SSD, or something else?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-04 Thread Richard Elling
Ethan Erchinger wrote:
>
>
> Richard Elling wrote:
>>
>> I've seen these symptoms when a large number of errors were reported
>> in a short period of time and memory was low.  What does "fmdump -eV"
>> show?
>>
> fmdump -eV shows lots of messages like this, and yea, I believe that 
> to be sd16 which is the SSD:
>
> Dec 03 2008 08:31:11.224690595 ereport.io.scsi.cmd.disk.dev.rqs.derr
> nvlist version: 0
>class = ereport.io.scsi.cmd.disk.dev.rqs.derr
>ena = 0x76bbfe8ef9000c01
>detector = (embedded nvlist)
>nvlist version: 0
>version = 0x0
>scheme = dev
>device-path = /[EMAIL PROTECTED],0/pci10de,[EMAIL 
> PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
>devid = id1,[EMAIL PROTECTED]
>(end detector)
>
>driver-assessment = retry

We retry, which is why there is a recovery e-report later.

>op-code = 0x2a
>cdb = 0x2a 0x0 0x1 0xe8 0x20 0x88 0x0 0x0 0x3 0x0
>pkt-reason = 0x0
>pkt-state = 0x37
>pkt-stats = 0x0
>stat-code = 0x2
>key = 0x6
>asc = 0x29
>ascq = 0x0

ASC/ASCQ 29/00 is POWER ON, RESET, OR BUS DEVICE RESET OCCURRED
http://www.t10.org/lists/asc-num.htm#ASC_29

[this should be more descriptive as the codes are, more-or-less,
standardized, I'll try to file an RFE, unless someone beat me to it]

Depending on which system did the reset, it should be noted in the
/var/adm/messages log.  This makes me suspect the hardware (firmware,
actually).
 -- richard

>sense-data = 0x70 0x0 0x6 0x0 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0x0 
> 0x29 0x0 0x0 0x0 0x0 0x0 0x0 0x0
>__ttl = 0x1
>__tod = 0x4936b44f 0xd6481a3
>
> Dec 03 2008 08:31:11.224690595 ereport.io.scsi.cmd.disk.recovered
> nvlist version: 0
>class = ereport.io.scsi.cmd.disk.recovered
>ena = 0x76bbfe8ef9000c01
>detector = (embedded nvlist)
>nvlist version: 0
>version = 0x0
>scheme = dev
>device-path = /[EMAIL PROTECTED],0/pci10de,[EMAIL 
> PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
>devid = id1,[EMAIL PROTECTED]
>(end detector)
>
>driver-assessment = recovered
>op-code = 0x2a
>cdb = 0x2a 0x0 0x1 0xe8 0x20 0x88 0x0 0x0 0x3 0x0
>pkt-reason = 0x0
>pkt-state = 0x1f
>pkt-stats = 0x0
>__ttl = 0x1
>__tod = 0x4936b44f 0xd6481a3
>
>> Also, it would help to know what OS release you are using.
> Oh, it was at the bottom of my original message, but I just realized 
> that I'm actually running snv_99, forgot that I had performed an 
> image-update a while back.
>

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-04 Thread Ethan Erchinger
Tim wrote:
>
>
> Are you leaving ANY ram for zfs to do it's thing?  If you're consuming 
> ALL system memory for just this file/application, I would expect the 
> system to fall over and die.
>
Hmm.  I believe that the kernel should manage that relationship for me.  
If the system cannot manage swap or paging on large mmap() files, there 
is something very wrong with the virtual memory manager in the kernel.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-04 Thread Tim
On Thu, Dec 4, 2008 at 11:55 AM, Ethan Erchinger <[EMAIL PROTECTED]> wrote:

>
>
> Ross wrote:
> > I'm no expert, but the first thing I'd ask is whether you could repeat
> that test without using compression?  I'd be quite worried about how a
> system is going to perform when it's basically running off a 50GB compressed
> file.
> >
> >
> Yes this does occur with compression off, but it's harder to repeat.  I
> think that's only due to the fact that physical memory happens to
> closely match the size of the SSD though.  32G of ram, 32G SSD, both
> with some overhead taking the respectable levels to 26-28G.
> > There seem to be a lot of variables here, on quite a few new systems, and
> I'd want to try to focus on individual subsystems to ensure they're all
> working to spec fist.
> >
> New systems?  SSD yes.  But the application is very well tested,
> although using malloc(), not mmap().  Compression isn't new, lots of
> people use it.  I've run this system, running MySQL, through many many
> tests and it's worked quite nicely.  Many variables, yes, lots of new
> code paths, I'd hope not.
> > If you can try without compression, the next thing I'd want to test is to
> try with a file that's smaller than RAM.  That should hopefully eliminate
> paging or RAM pressure, and allow you to verify that the storage subsystem
> is performing as expected.  You could then re-enable compression and see if
> it still works.
> >
> Running with a file smaller than ram does not have this issue.  As a
> test I've run things a little differently.  I put the SSD as swap and
> used malloc() again, but allocate a large dataset > physical memory, and
> I've seen very similar stalls. There is 0 IO, but the application is
> blocked on something.  I guess I should try to insert to some debug
> code, or use dtruss to see if the application is waiting on a syscall.
>


Are you leaving ANY ram for zfs to do it's thing?  If you're consuming ALL
system memory for just this file/application, I would expect the system to
fall over and die.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-04 Thread Ethan Erchinger


Ross wrote:
> I'm no expert, but the first thing I'd ask is whether you could repeat that 
> test without using compression?  I'd be quite worried about how a system is 
> going to perform when it's basically running off a 50GB compressed file.
>
>   
Yes this does occur with compression off, but it's harder to repeat.  I 
think that's only due to the fact that physical memory happens to 
closely match the size of the SSD though.  32G of ram, 32G SSD, both 
with some overhead taking the respectable levels to 26-28G.
> There seem to be a lot of variables here, on quite a few new systems, and I'd 
> want to try to focus on individual subsystems to ensure they're all working 
> to spec fist.  
>   
New systems?  SSD yes.  But the application is very well tested, 
although using malloc(), not mmap().  Compression isn't new, lots of 
people use it.  I've run this system, running MySQL, through many many 
tests and it's worked quite nicely.  Many variables, yes, lots of new 
code paths, I'd hope not.
> If you can try without compression, the next thing I'd want to test is to try 
> with a file that's smaller than RAM.  That should hopefully eliminate paging 
> or RAM pressure, and allow you to verify that the storage subsystem is 
> performing as expected.  You could then re-enable compression and see if it 
> still works.
>   
Running with a file smaller than ram does not have this issue.  As a 
test I've run things a little differently.  I put the SSD as swap and 
used malloc() again, but allocate a large dataset > physical memory, and 
I've seen very similar stalls. There is 0 IO, but the application is 
blocked on something.  I guess I should try to insert to some debug 
code, or use dtruss to see if the application is waiting on a syscall.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-04 Thread Ethan Erchinger


Richard Elling wrote:
>
> I've seen these symptoms when a large number of errors were reported
> in a short period of time and memory was low.  What does "fmdump -eV"
> show?
>
fmdump -eV shows lots of messages like this, and yea, I believe that to 
be sd16 which is the SSD:

Dec 03 2008 08:31:11.224690595 ereport.io.scsi.cmd.disk.dev.rqs.derr
nvlist version: 0
class = ereport.io.scsi.cmd.disk.dev.rqs.derr
ena = 0x76bbfe8ef9000c01
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /[EMAIL PROTECTED],0/pci10de,[EMAIL 
PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
devid = id1,[EMAIL PROTECTED]
(end detector)

driver-assessment = retry
op-code = 0x2a
cdb = 0x2a 0x0 0x1 0xe8 0x20 0x88 0x0 0x0 0x3 0x0
pkt-reason = 0x0
pkt-state = 0x37
pkt-stats = 0x0
stat-code = 0x2
key = 0x6
asc = 0x29
ascq = 0x0
sense-data = 0x70 0x0 0x6 0x0 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0x0 
0x29 0x0 0x0 0x0 0x0 0x0 0x0 0x0
__ttl = 0x1
__tod = 0x4936b44f 0xd6481a3

Dec 03 2008 08:31:11.224690595 ereport.io.scsi.cmd.disk.recovered
nvlist version: 0
class = ereport.io.scsi.cmd.disk.recovered
ena = 0x76bbfe8ef9000c01
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /[EMAIL PROTECTED],0/pci10de,[EMAIL 
PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
devid = id1,[EMAIL PROTECTED]
(end detector)

driver-assessment = recovered
op-code = 0x2a
cdb = 0x2a 0x0 0x1 0xe8 0x20 0x88 0x0 0x0 0x3 0x0
pkt-reason = 0x0
pkt-state = 0x1f
pkt-stats = 0x0
__ttl = 0x1
__tod = 0x4936b44f 0xd6481a3

> Also, it would help to know what OS release you are using.
Oh, it was at the bottom of my original message, but I just realized 
that I'm actually running snv_99, forgot that I had performed an 
image-update a while back.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-04 Thread Richard Elling
Ethan Erchinger wrote:
> Hi all,
>
> First, I'll say my intent is not to spam a bunch of lists, but after 
> posting to opensolaris-discuss I had someone communicate with me offline 
> that these lists would possibly be a better place to start.  So here we 
> are. For those on all three lists, sorry for the repetition.
>
> Second, this message is meant to solicit help in diagnosing the issue 
> described below.  Any hints on how DTrace may help, or where in general 
> to start would be much appreciated.  Back to the subject at hand.
>
> ---
>
> I'm testing an application which makes use of a large file mmap'd into 
> memory, as if it the application was using malloc().  The file is 
> roughly 2x the size of physical ram.  Basically, I'm seeing the system 
> stall for long periods of time, 60+ seconds, and then resume.  The file 
> lives on an SSD (Intel x25-e) and I'm using zfs's lzjb compression to 
> make more efficient use of the ~30G of space provided by that SSD.
>
> The general flow of things is, start application, ask it to use a 50G 
> file. The file is created in a sparse manner at the location
> designated, then mmap is called on the entire file.  All fine up to this
> point.
>
> I then start loading data into the application, and it starts pushing
> data to the file as you'd expect.  Data is pushed to the file early and 
> often, as it's mmap'd with the MAP_SHARED flag.  But, when the 
> application's resident size reaches about 80% of the physical ram on the 
> system, the system starts paging and things are still working relatively 
> well, though slower, as expected.
>
> Soon after, when reaching about 40G of data, I get stalls accessing the
> SSD (according to iostat), in other words, no IO to that drive.  When I
> started looking into what could be causing it, such as IO timeouts, I
> run dmesg and it hangs after printing a timestamp.  I can ctrl-c dmesg,
> but subsequent runs provide no better results.  I see no new messages in
> /var/adm/messages, as I'd expect.
>
> Eventually the system recovers, the latest case took over 10 minutes to
> recover, after killing the application mentioned above, and I do see
> disk timeouts in dmesg.
>
> So, I can only assume that there's either a driver bug in the SATA/SAS
> controller I'm using and it's throwing timeouts, or the SSD is having
> issues.  Looking at the zpool configuration, I see that failmode=wait,
> and since that SSD is the only member of the zpool I would expect IO to
> hang.
>
> But, does that mean that dmesg should hang also?  Does that mean that
> the kernel has at least one thread stuck?  Would failmode=continue be
> more desired, or resilient?
>
> During the hang, load-avg is artificially high, fmd being the one
> process that sticks out in prstat output.  But fmdump -v doesn't show
> anything relevant.
>   

I've seen these symptoms when a large number of errors were reported
in a short period of time and memory was low.  What does "fmdump -eV"
show?

Also, it would help to know what OS release you are using.
 -- richard

> Anyone have ideas on how to diagnose what's going on there?
>
> Thanks,
> Ethan
>
> System: Sun x4240 dual-amd2347, 32G of ram
> SAS/SATA Controller: LSI3081E
> OS: osol snv_98
> SSD: Intel x25-e
>
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] help diagnosing system hang

2008-12-04 Thread Ross
I'm no expert, but the first thing I'd ask is whether you could repeat that 
test without using compression?  I'd be quite worried about how a system is 
going to perform when it's basically running off a 50GB compressed file.

There seem to be a lot of variables here, on quite a few new systems, and I'd 
want to try to focus on individual subsystems to ensure they're all working to 
spec fist.  

If you can try without compression, the next thing I'd want to test is to try 
with a file that's smaller than RAM.  That should hopefully eliminate paging or 
RAM pressure, and allow you to verify that the storage subsystem is performing 
as expected.  You could then re-enable compression and see if it still works.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] help diagnosing system hang

2008-12-03 Thread Ethan Erchinger
Hi all,

First, I'll say my intent is not to spam a bunch of lists, but after 
posting to opensolaris-discuss I had someone communicate with me offline 
that these lists would possibly be a better place to start.  So here we 
are. For those on all three lists, sorry for the repetition.

Second, this message is meant to solicit help in diagnosing the issue 
described below.  Any hints on how DTrace may help, or where in general 
to start would be much appreciated.  Back to the subject at hand.

---

I'm testing an application which makes use of a large file mmap'd into 
memory, as if it the application was using malloc().  The file is 
roughly 2x the size of physical ram.  Basically, I'm seeing the system 
stall for long periods of time, 60+ seconds, and then resume.  The file 
lives on an SSD (Intel x25-e) and I'm using zfs's lzjb compression to 
make more efficient use of the ~30G of space provided by that SSD.

The general flow of things is, start application, ask it to use a 50G 
file. The file is created in a sparse manner at the location
designated, then mmap is called on the entire file.  All fine up to this
point.

I then start loading data into the application, and it starts pushing
data to the file as you'd expect.  Data is pushed to the file early and 
often, as it's mmap'd with the MAP_SHARED flag.  But, when the 
application's resident size reaches about 80% of the physical ram on the 
system, the system starts paging and things are still working relatively 
well, though slower, as expected.

Soon after, when reaching about 40G of data, I get stalls accessing the
SSD (according to iostat), in other words, no IO to that drive.  When I
started looking into what could be causing it, such as IO timeouts, I
run dmesg and it hangs after printing a timestamp.  I can ctrl-c dmesg,
but subsequent runs provide no better results.  I see no new messages in
/var/adm/messages, as I'd expect.

Eventually the system recovers, the latest case took over 10 minutes to
recover, after killing the application mentioned above, and I do see
disk timeouts in dmesg.

So, I can only assume that there's either a driver bug in the SATA/SAS
controller I'm using and it's throwing timeouts, or the SSD is having
issues.  Looking at the zpool configuration, I see that failmode=wait,
and since that SSD is the only member of the zpool I would expect IO to
hang.

But, does that mean that dmesg should hang also?  Does that mean that
the kernel has at least one thread stuck?  Would failmode=continue be
more desired, or resilient?

During the hang, load-avg is artificially high, fmd being the one
process that sticks out in prstat output.  But fmdump -v doesn't show
anything relevant.

Anyone have ideas on how to diagnose what's going on there?

Thanks,
Ethan

System: Sun x4240 dual-amd2347, 32G of ram
SAS/SATA Controller: LSI3081E
OS: osol snv_98
SSD: Intel x25-e


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss