Definitely time to bust out some mdb -k and see what it's moaning about.
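(Once you have a dump saved, a minimal post-mortem session looks roughly
like this, assuming the default savecore directory and dump number 0:

  # cd /var/crash/host
  # mdb unix.0 vmcore.0
  > ::status      # panic string, OS release, dump details
  > ::msgbuf      # console messages leading up to the panic
  > $C            # stack backtrace of the panicking thread
)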

I did not see the screenshot earlier... sorry about that.

Nathan.

Blake wrote:
I start the cp and then, with prstat -a, watch the CPU load for the
cp process climb to 25% on a 4-core machine.

Load, measured for example with 'uptime', climbs steadily until the reboot.
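(Concretely, the sort of loop that captures both over time - the log
path here is arbitrary:

  # while true; do uptime; prstat -a -c 1 1; sleep 5; done >> /tmp/load.log
)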

Note that the machine does not dump properly, panic, or hang - rather,
it reboots.

I attached a screenshot earlier in this thread showing the little bit of
the error message I could see on the console.  The machine is trying to
dump to the dump zvol, but fails to do so.  Only sometimes do I see an
error on the machine's local console - most times, it simply reboots.



On Thu, Mar 12, 2009 at 1:55 AM, Nathan Kroenert
<nathan.kroen...@sun.com> wrote:
Hm -

Crashes, or hangs? Moreover - how do you know a CPU is pegged?

Seems like we could do a little more discovery on what the actual problem
here is, as I can read it about 4 different ways.

Judging by this last piece of information, I'm guessing the system does
not crash, but goes really, really slow??

Crash == panic == we see a stack dump on the console and try to take a dump
hang == nothing works == no response -> might be worth looking at mdb -K,
       or booting with -k on the boot line.
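(On x86 that would be something like the following, assuming the usual
OpenSolaris grub entry:

  # reboot -- -k        # load kmdb on the next boot only

or add -k to the kernel$ line in menu.lst:

  kernel$ /platform/i86pc/kernel/$ISADIR/unix -k -B $ZFS-BOOTFS

Then, if it wedges, F1-A on the console should drop you into kmdb.)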

So - are we crashing, hanging, or something different?

It might simply be that you are eating up all your memory, and your
physical backing storage is taking a while to catch up...?
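(Easy enough to check while the cp runs:

  # vmstat 5                 # watch free memory and the scan rate
  # echo ::memstat | mdb -k  # breakdown of kernel, anon, page cache, free pages
)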

Nathan.

Blake wrote:
My dump device is already on a different controller - the motherboard's
built-in nVidia SATA controller.

The raidz2 vdev is the one I'm having trouble with (copying the same
files to the mirrored rpool on the nVidia controller works nicely).  I
do notice that, when using cp to copy the files to the raidz2 pool,
load on the machine climbs steadily until the crash, and one processor
core pegs at 100%.
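(For what it's worth, the sort of thing that shows where a pegged core
spends its time, if it turns out to be kernel time:

  # prstat -mL 5                  # per-thread microstates: USR vs SYS
  # lockstat -kIW -D 20 sleep 30  # profile kernel CPU for 30s, top 20 call sites
)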

Frustrating, yes.

On Thu, Mar 12, 2009 at 12:31 AM, Maidak Alexander J
<maidakalexand...@johndeere.com> wrote:
If you're having issues with a disk controller or disk I/O driver, it's
highly likely that a savecore to disk after the panic will fail.  I'm not
sure how to work around this - maybe a dedicated dump device on a
controller that uses a different driver than the one you're having
issues with?
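(In other words, something along these lines - the device name here is
made up:

  # dumpadm -d /dev/dsk/c3t0d0s1   # slice on a controller with a different driver
)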

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Blake
Sent: Wednesday, March 11, 2009 4:45 PM
To: Richard Elling
Cc: Marc Bevand; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] reboot when copying large amounts of data

I guess I didn't make it clear that I had already tried using savecore to
retrieve the core from the dump device.

I added a larger zvol for dump, to make sure that I wasn't running out of
space on the dump device:

r...@host:~# dumpadm
      Dump content: kernel pages
       Dump device: /dev/zvol/dsk/rpool/bigdump (dedicated)
Savecore directory: /var/crash/host
  Savecore enabled: yes
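(For reference, the setup amounts to the following - the 10g size is a
stand-in:

  # zfs create -V 10g rpool/bigdump
  # dumpadm -d /dev/zvol/dsk/rpool/bigdump
)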

I was using the -L option only to try to get some idea of why the system
load was climbing to 1 during a simple file copy.
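(That is, a live snapshot of the running kernel, which needs the
dedicated dump device:

  # savecore -L -v /var/crash/host
)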



On Wed, Mar 11, 2009 at 4:58 PM, Richard Elling
<richard.ell...@gmail.com> wrote:
Blake wrote:
I'm attaching a screenshot of the console just before reboot.  The
dump doesn't seem to be working, or savecore isn't working.

On Wed, Mar 11, 2009 at 11:33 AM, Blake <blake.ir...@gmail.com> wrote:

I'm working on testing this some more by doing a savecore -L right
after I start the copy.


savecore -L is not what you want.

By default on OpenSolaris, savecore is disabled at boot.  But the
core will have been dumped into the dump slice, which is not used for
swap, so you should be able to run savecore at a later time to collect
the core from the last dump.
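Something like this, assuming the savecore directory from your dumpadm
output:

  # mkdir -p /var/crash/host
  # savecore -v /var/crash/host   # extract the last dump from the dump device
  # dumpadm -y                    # and enable savecore on boot for next time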
-- richard




--
//////////////////////////////////////////////////////////////////
// Nathan Kroenert              nathan.kroen...@sun.com         //
// Systems Engineer             Phone:  +61 3 9869-6255         //
// Sun Microsystems             Fax:    +61 3 9869-6288         //
// Level 7, 476 St. Kilda Road  Mobile: 0419 305 456            //
// Melbourne 3004   Victoria    Australia                       //
//////////////////////////////////////////////////////////////////
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
