Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash

2012-10-01 Thread guy . helmer
On Wednesday, June 6, 2012 8:36:04 PM UTC-5, Mark Felder wrote:
 Hi guys I'm excitedly posting this from my phone. Good news for you guys, bad 
 news for us -- we were building HA storage on vmware for a client and can now 
 replicate the crash on demand. I'll be posting details when I get home to my 
 PC tonight, but this hopefully is enough to replicate the crash for any 
 curious followers:
 
 
 
 ESXi 5
 9 or 9-STABLE
 HAST
 1 cpu is fine
 1GB of ram
 UFS SUJ on HAST device
 No special loader.conf, sysctl, etc
 No need for VMWare tools
 Run Bonnie++ on the HAST device
 
 
 
 We can get the crash to happen on the first run of bonnie++ right now. I'll 
 post the exact specs and precise command run in the PR. We found an old post 
 from 2004 when we looked up the process state obtained from CTRL+T -- flswai 
 -- which describes the symptoms nearly perfectly.
 
 
 
  http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2004-02/0250.html 
 
 
 
 Hopefully this gets us closer to a fix...

Is this a crash or a hang? Over the past couple of weeks, I've been working 
with a FreeBSD 9.1-RC1 system under VMware ESXi 5.0 with a 64GB UFS root FS and 
a 2TB ZFS filesystem attached via a virtual LSI SAS interface. Sometimes during 
heavy I/O load (rsync from other servers) on the ZFS FS, this shows up in 
/var/log/messages:

Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 5 ee 60 16 0 1 0 0
Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): CAM status: SCSI Status Error
Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): Retrying command
Sep 21 02:18:44 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 3 ef 42 51 0 1 0 0
Sep 21 02:18:44 backups kernel: (da1:mpt0:0:1:0): CAM status: SCSI Status Error
Sep 21 02:18:44 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
Sep 21 02:18:44 backups kernel: (da1:mpt0:0:1:0): Retrying command
Sep 21 02:18:48 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 3 ef 64 51 0 1 0 0
Sep 21 02:18:48 backups kernel: (da1:mpt0:0:1:0): CAM status: SCSI Status Error
Sep 21 02:18:48 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
Sep 21 02:18:48 backups kernel: (da1:mpt0:0:1:0): Retrying command
Sep 21 02:18:49 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 3 ef 66 51 0 1 0 0
Sep 21 02:18:49 backups kernel: (da1:mpt0:0:1:0): CAM status: SCSI Status Error
Sep 21 02:18:49 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
...
Sep 21 05:06:18 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 41 f3 94 99 0 1 0 0
Sep 21 05:06:18 backups kernel: (da1:mpt0:0:1:0): CAM status: SCSI Status Error
Sep 21 05:06:18 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
Sep 21 05:06:18 backups kernel: (da1:mpt0:0:1:0): Retrying command

These have been happening roughly every other day.
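
For anyone who wants to tally these events or see which blocks the failing writes targeted, here is a rough Python sketch. It is not part of the original report. It assumes the wrapped CDB lines have been rejoined onto one line, and the names decode_write10 and summarize are made up for illustration. The decoding relies only on the standard WRITE(10) layout: opcode 0x2a in byte 0, a big-endian LBA in bytes 2-5, and a big-endian transfer length (in blocks) in bytes 7-8.

```python
import re
from collections import Counter

# Sample lines copied from the log in this post, CDB lines rejoined.
SAMPLE = """\
Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 5 ee 60 16 0 1 0 0
Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
Sep 21 02:18:44 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 3 ef 42 51 0 1 0 0
Sep 21 02:18:44 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
"""

CDB_RE = re.compile(r"WRITE\(10\)\. CDB: ([0-9a-f ]+)")
BUSY_RE = re.compile(r"^(\w{3}\s+\d+) \S+ \S+ kernel: \((\w+):.*SCSI status: Busy")


def decode_write10(cdb_text):
    """Return (lba, blocks) from a WRITE(10) CDB printed as 10 hex bytes."""
    cdb = [int(tok, 16) for tok in cdb_text.split()]
    if len(cdb) != 10 or cdb[0] != 0x2A:
        # Also raised for lines that are still wrapped (fewer than 10 bytes).
        raise ValueError("not a complete WRITE(10) CDB: %r" % cdb_text)
    lba = int.from_bytes(bytes(cdb[2:6]), "big")
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")
    return lba, blocks


def summarize(text):
    """Count Busy statuses per (date, device) and decode each WRITE(10)."""
    busy = Counter()
    writes = []
    for line in text.splitlines():
        m = CDB_RE.search(line)
        if m:
            writes.append(decode_write10(m.group(1)))
        m = BUSY_RE.match(line)
        if m:
            busy[(m.group(1), m.group(2))] += 1
    return busy, writes


if __name__ == "__main__":
    busy, writes = summarize(SAMPLE)
    for (day, dev), count in sorted(busy.items()):
        print(day, dev, "Busy x", count)
    for lba, blocks in writes:
        print("write at LBA", lba, "for", blocks, "blocks")
```

The transfer length is reported in blocks rather than bytes because the block size of da1 is not stated here. To run it on a real log, pass the contents of /var/log/messages to summarize() in place of SAMPLE.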

mpt0 and em0 were sharing irq 18, so today I put 
hint.mpt.0.msi_enable=1
into /boot/device.hints and rebooted; now mpt0 is using irq 256. I'll see if 
it helps.

Guy
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


6.2-amd64 Hang at reboot on Supermicro X7DBR-i+

2007-03-15 Thread Guy Helmer
I'm investigating a problem where a pretty much stock 6.2 SMP kernel 
randomly hangs on multiple Supermicro X7DBR-i+ and X7DBR-8+ systems.  
The system syncs the filesystems and prints Uptime: ..., then hangs.


So far, I've narrowed it down to the MOD_SHUTDOWN request to the 
rootbus module.  Adding a printf() before and after the 
device_shutdown(child); call in bus_generic_shutdown() (in subr_bus.c) 
seems to make the problem go away, as does running a kernel with 
INVARIANTS, WITNESS, and DDB/KDB.  I'm trying to reproduce the hang on 
a plain SMP kernel with just DDB/KDB, but it hasn't hung yet.


Any ideas?

Guy

--
Guy Helmer, Ph.D.
Chief System Architect
Palisade Systems, Inc.
