Re: [zfs-discuss] reboot when copying large amounts of data

2009-12-20 Thread Ian Collins
arnaud wrote: Hi folks, I was trying to load a large file in /tmp so that a process that parses it wouldn't be limited by a disk throughput bottleneck. My rig here has only 12GB of RAM and the file I copied is about 12GB as well. Before the copy finished, my system restarted. I'm pretty up

Re: [zfs-discuss] reboot when copying large amounts of data

2009-12-20 Thread arnaud
Hi folks, I was trying to load a large file in /tmp so that a process that parses it wouldn't be limited by a disk throughput bottleneck. My rig here has only 12GB of RAM and the file I copied is about 12GB as well. Before the copy finished, my system restarted. I'm pretty up to date, the sy

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-14 Thread Tim
On Sat, Mar 14, 2009 at 12:57 PM, Miles Nordin wrote: > > "jcm" == James C McPherson writes: > >jcm> As for "will it work on sparc" - yes, I would expect so. BUT > jcm> without fcode it won't be bootable. That's what you are really > jcm> asking, isn't it? > > just if it will work.

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-14 Thread Miles Nordin
> "jcm" == James C McPherson writes: jcm> As for "will it work on sparc" - yes, I would expect so. BUT jcm> without fcode it won't be bootable. That's what you are really jcm> asking, isn't it? just if it will work. I want to add a slog. jcm> It turns out you can change between

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-14 Thread James C. McPherson
On Fri, 13 Mar 2009 22:46:19 -0400 Miles Nordin wrote: > > "jcm" == James C McPherson writes: > >jcm> We haven't got mega_sas on SPARC at this point either. > > The card Blake found: > > http://www.provantage.com/lsi-logic-lsi00117~7LSIG03X.htm > > http://www.lsi.com/storage_home/p

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-13 Thread Miles Nordin
> "jcm" == James C McPherson writes: jcm> We haven't got mega_sas on SPARC at this point either. The card Blake found: http://www.provantage.com/lsi-logic-lsi00117~7LSIG03X.htm http://www.lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/lsisas3080xr/index.html any idea i

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-13 Thread James C. McPherson
On Fri, 13 Mar 2009 18:00:04 -0400 Miles Nordin wrote: > > "t" == Tim writes: > > t> > http://src.opensolaris.org/source/xref/zfs-crypto/phase2/usr/src/pkgdefs/SUNWmegasas/postinstall?&r=6542 > > thanks. > > t> Looks like there's a TON of supported cards. > > They are all 107

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-13 Thread Miles Nordin
> "t" == Tim writes: t> http://src.opensolaris.org/source/xref/zfs-crypto/phase2/usr/src/pkgdefs/SUNWmegasas/postinstall?&r=6542 thanks. t> Looks like there's a TON of supported cards. They are all 1078 cards though. James mentioned mega_sas supports some 1068E cards depending

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-13 Thread Tim
On Fri, Mar 13, 2009 at 3:33 PM, Miles Nordin wrote: > > "c" == Miles Nordin writes: > > c> Is there some file in OpenSolaris against which I can > c> cross-reference this? or...really, just use instead of > c> pci.ids, since only the PCI ID not the description is enough > > I f

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-13 Thread Miles Nordin
> "c" == Miles Nordin writes: c> Is there some file in OpenSolaris against which I can c> cross-reference this? or...really, just use instead of c> pci.ids, since only the PCI ID not the description is enough I found these: http://src.opensolaris.org/source/xref/onnv/onnv-

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-13 Thread Miles Nordin
> "jcm" == James C McPherson writes: jcm> the mpt(7D) driver supports that card. Then I am apparently stuck with a closed-source driver again, and again by surprise. I bought it because I thought you said 1068E was supported by mega_sas: >> http://www.osnews.com/thread?317113 jc

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-13 Thread Blake Irvin
This is really great information, though most of the controllers mentioned aren't on the OpenSolaris HCL. Seems like that should be corrected :) My thanks to the community for their support. On Mar 12, 2009, at 10:42 PM, "James C. McPherson" wrote: On Thu, 12 Mar 2009 22:24:12 -0400 Mi

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-12 Thread James C. McPherson
On Thu, 12 Mar 2009 22:24:12 -0400 Miles Nordin wrote: > > "wm" == Will Murnane writes: > > >>     * SR = Software RAID IT = Integrate. Target mode. IR mode > >> is not supported. > wm> Integrated target mode lets you export some storage attached > wm> to the host system (th

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-12 Thread Miles Nordin
> "wm" == Will Murnane writes: >>     * SR = Software RAID IT = Integrate. Target mode. IR mode >> is not supported. wm> Integrated target mode lets you export some storage attached wm> to the host system (through another adapter, presumably) as a wm> storage device. IR m

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-12 Thread Will Murnane
On Thu, Mar 12, 2009 at 18:30, Miles Nordin wrote: >  I love the way they use the numbers 3800 and 3080, so you are >  constantly transposing them thus leaving google littered with all >  this confusingly wrong information. Think of the middle two digits as (number of external ports, number of int

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-12 Thread Nathan Kroenert
For what it's worth, I have been running Nevada (so, same kernel as OpenSolaris) for ages (at least 18 months) on a Gigabyte board with the MCP55 chipset and it's been flawless. I liked it so much, I bought its newer brother, based on the nvidia 750SLI chipset... M750SLI-DS4 Cheers! Nath

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-12 Thread Miles Nordin
> "b" == Blake writes: b> http://www.provantage.com/lsi-logic-lsi00117~7LSIG03X.htm I'm having trouble matching up chips, cards, drivers, platforms, and modes with the LSI stuff. The more I look at it the more confused I get. Platforms: x86 SPARC Drivers: mpt mega_sas mfi Chip

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-12 Thread Dave
Tim wrote: On Thu, Mar 12, 2009 at 2:22 PM, Blake wrote: I've managed to get the data transfer to work by rearranging my disks so that all of them sit on the integrated SATA controller. So, I feel pretty certain that this is either an issue with

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-12 Thread Tim
On Thu, Mar 12, 2009 at 2:22 PM, Blake wrote: > I've managed to get the data transfer to work by rearranging my disks > so that all of them sit on the integrated SATA controller. > > So, I feel pretty certain that this is either an issue with the > Supermicro aoc-sat2-mv8 card, or with PCI-X on t

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-12 Thread Blake
> If you're having issues with a disk controller or disk IO driver it's >>>>> highly likely that a savecore to disk after the panic will fail.  I'm >>>>> not >>>>> sure how to work around this, maybe a dedicated dump device not on a >>>

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-12 Thread Miles Nordin
> "maj" == Maidak Alexander J writes: maj> If you're having issues with a disk controller or disk IO maj> driver it's highly likely that a savecore to disk after the maj> panic will fail. I'm not sure how to work around this not in Solaris, but as a concept for solving the problem:

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Blake
, 2009 at 12:31 AM, Maidak Alexander J >>>> wrote: >>>>> >>>>> If you're having issues with a disk controller or disk IO driver it's >>>>> highly likely that a savecore to disk after the panic will fail.  I'm >>>>> not

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Nathan Kroenert
boun...@opensolaris.org] On Behalf Of Blake Sent: Wednesday, March 11, 2009 4:45 PM To: Richard Elling Cc: Marc Bevand; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reboot when copying large amounts of data I guess I didn't make it clear that I had already tried using savecore to retr

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Nathan Kroenert
nsolaris.org] On Behalf Of Blake Sent: Wednesday, March 11, 2009 4:45 PM To: Richard Elling Cc: Marc Bevand; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reboot when copying large amounts of data I guess I didn't make it clear that I had already tried using savecore to retrieve the core

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Blake
aving >>> issues with? >>> >>> -Original Message----- >>> From: zfs-discuss-boun...@opensolaris.org >>> [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Blake >>> Sent: Wednesday, March 11, 2009 4:45 PM >>> To: Richard Elling

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Nathan Kroenert
u're having issues with? -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Blake Sent: Wednesday, March 11, 2009 4:45 PM To: Richard Elling Cc: Marc Bevand; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reboot when copy

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Blake
ensolaris.org > [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Blake > Sent: Wednesday, March 11, 2009 4:45 PM > To: Richard Elling > Cc: Marc Bevand; zfs-discuss@opensolaris.org > Subject: Re: [zfs-discuss] reboot when copying large amounts of data > > I guess I didn't m

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Maidak Alexander J
ing issues with? -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Blake Sent: Wednesday, March 11, 2009 4:45 PM To: Richard Elling Cc: Marc Bevand; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reboot when copyin

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Blake
I guess I didn't make it clear that I had already tried using savecore to retrieve the core from the dump device. I added a larger zvol for dump, to make sure that I wasn't running out of space on the dump device: r...@host:~# dumpadm Dump content: kernel pages Dump device: /dev/zvol

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Richard Elling
Blake wrote: I'm attaching a screenshot of the console just before reboot. The dump doesn't seem to be working, or savecore isn't working. On Wed, Mar 11, 2009 at 11:33 AM, Blake wrote: I'm working on testing this some more by doing a savecore -L right after I start the copy. savec

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Blake
Any chance this could be the motherboard? I suspect the controller. The boot disks are on the built-in nVidia controller. On Wed, Mar 11, 2009 at 3:41 PM, Remco Lengers wrote: > - Upgrade FW of controller to highest or known working level I think I have the latest controller firmware. > - Upgr

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Remco Lengers
looks worth a go otherwise: if the boot disk is also off that controller it may be too hosed to write anything to the boot disk hence FMA doesn't see any issue when it comes up. Possible further actions: - Upgrade FW of controller to highest or known working level - Upgrade driver or OS level

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Blake
Could the problem be related to this bug: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6793353 I'm testing setting the maximum payload size as a workaround, as noted in the bug notes. On Wed, Mar 11, 2009 at 3:14 PM, Blake wrote: > I think that TMC Research is the company that d

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Blake
I think that TMC Research is the company that designed the Supermicro-branded controller card that has the Marvell SATA controller chip on it. Googling around I see connections between Supermicro and TMC. This is the card: http://www.supermicro.com/products/accessories/addon/AOC-SAT2-MV8.cfm On

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Blake
fmdump is not helping much: r...@host:~# fmdump -eV TIME CLASS fmdump: /var/fm/fmd/errlog is empty comparing that screenshot to the output of cfgadm is interesting - looks like the controller(s): r...@host:~# cfgadm -v Ap_Id Receptacle Occupa

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Remco Lengers
Something is not right in the IO space. The messages talk about vendor ID = 11AB 0x11AB Marvell Semiconductor TMC Research Vendor Id: 0x1030 Short Name: TMC Does "fmdump -eV" give any clue when the box comes back up? ..Remco Blake wrote: I'm attaching a screenshot of the console just bef

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Blake
I'm working on testing this some more by doing a savecore -L right after I start the copy. BTW, I'm copying to a raidz2 of only 5 disks, not 16 (the chassis supports 16, but isn't fully populated). So far as I know, there is no spinup happening - these are not RAID controllers, just dumb SATA JBO

Re: [zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Marc Bevand
The copy operation will make all the disks start seeking at the same time and will make your CPU activity jump to a significant percentage to compute the ZFS checksum and RAIDZ parity. I think you could be overloading your PSU because of the sudden increase in power consumption... However if yo

[zfs-discuss] reboot when copying large amounts of data

2009-03-11 Thread Blake
I have a H8DM8-2 motherboard with a pair of AOC-SAT2-MV8 SATA controller cards in a 16-disk Supermicro chassis. I'm running OpenSolaris 2008.11, and the machine performs very well unless I start to copy a large amount of data to the ZFS (software raid) array that's on the Supermicro SATA controlle