On Mon, 01 Oct 2012 15:00:40 -0500, wrote:
Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): WRITE(10). CDB: 2a 0 5 ee 60 16 0 1 0 0
Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): CAM status: SCSI Status Error
Sep 21 02:14:55 backups kernel: (da1:mpt0:0:1:0): SCSI status: Busy
Sep 21 0
On Wednesday, June 6, 2012 8:36:04 PM UTC-5, Mark Felder wrote:
> Hi guys I'm excitedly posting this from my phone. Good news for you guys, bad
> news for us -- we were building HA storage on vmware for a client and can now
> replicate the crash on demand. I'll be posting details when I get home
On Thursday, September 13, 2012 12:14:49 pm Mark Felder wrote:
> On Thu, 13 Sep 2012 10:11:28 -0500, Andriy Gapon wrote:
>
> > Just curious - does VMWare provide a remote debugger support (gdb stub)?
>
> I'm not aware of one. What I have been able to successfully do is break
> into the debugge
On Sep 15, 2012, at 11:36 AM, Mark Felder wrote:
> On Fri, 14 Sep 2012 20:37:40 -0500, Mark Saad wrote:
>
>> How do you have suj on 8.3 ? Are you using a patch ?
>
> I don't have suj on 8.3
I misread the prior emails
>
>> Also can you retest 9 with the following
>> sysctl kern.timecounte
On Fri, 14 Sep 2012 20:37:40 -0500, Mark Saad
wrote:
How do you have suj on 8.3 ? Are you using a patch ?
I don't have suj on 8.3
Also can you retest 9 with the following
sysctl kern.timecounter.hardware=ACPI-fast
Yes, I'll attempt that as soon as possible. We're under a tight deadline
On Sep 14, 2012, at 8:48 AM, Mark Felder wrote:
> Hi Mark,
>
> Here's the output of our VMs running on ESXi 4.1u1
>
> FreeBSD 7.4:
> # sysctl kern.timecounter.choice
> kern.timecounter.choice: TSC(800) ACPI-safe(850) i8254(0) dummy(-100)
> # sysctl kern.timecounter.hardware
> kern.timecou
Hi Mark,
Here's the output of our VMs running on ESXi 4.1u1
FreeBSD 7.4:
# sysctl kern.timecounter.choice
kern.timecounter.choice: TSC(800) ACPI-safe(850) i8254(0) dummy(-100)
# sysctl kern.timecounter.hardware
kern.timecounter.hardware: ACPI-safe
FreeBSD 8.3:
# sysctl kern.timecounter.choi
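
A minimal sketch, not part of the original message, of how the kern.timecounter.choice line above can be read. It assumes the "name(quality)" layout shown in the 7.4 output and that the kernel's default pick is the highest non-negative quality, which is consistent with ACPI-safe(850) being the active counter there. The helper names parse_timecounter_choice and preferred are made up for illustration.

```python
import re

def parse_timecounter_choice(line):
    """Map each timecounter name to its quality from a kern.timecounter.choice line."""
    # Drop the "kern.timecounter.choice:" prefix if it is present.
    value = line.split(":", 1)[-1]
    return {name: int(q) for name, q in re.findall(r"(\S+)\((-?\d+)\)", value)}

def preferred(choices):
    """Return the name with the highest non-negative quality, or None."""
    # Assumption: counters with negative quality are never picked automatically.
    usable = {n: q for n, q in choices.items() if q >= 0}
    return max(usable, key=usable.get) if usable else None

line = "kern.timecounter.choice: TSC(800) ACPI-safe(850) i8254(0) dummy(-100)"
choices = parse_timecounter_choice(line)
print(choices)
print(preferred(choices))
```

Running the same parse against the 8.x and 9.x guests would show whether a different counter wins by default there, which is the comparison this part of the thread is after.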
---
On Sep 13, 2012, at 7:45 PM, Mark Felder wrote:
> Changing timer source has not been tested. It doesn't crash in 7.x, so did
> something timer related change in 8.x?
>
Mark
Yes, the timecounter choice priority changed in 8, favoring higher-precision
hardware like hpet over acpi-fas
Changing timer source has not been tested. It doesn't crash in 7.x, so did
something timer related change in 8.x?
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "fr
---
On Sep 13, 2012, at 5:13 PM, Andriy Gapon wrote:
> on 13/09/2012 22:57 Mark Felder said the following:
>> On Thu, 13 Sep 2012 11:28:15 -0500, Kurt Lidl wrote:
>>
>>> Isn't this what you want?
>>>
>>> http://stackframe.blogspot.com/2007/04/debugging-linux-kernels-with.html
>>>
>>> -Kur
on 13/09/2012 22:57 Mark Felder said the following:
> On Thu, 13 Sep 2012 11:28:15 -0500, Kurt Lidl wrote:
>
>> Isn't this what you want?
>>
>> http://stackframe.blogspot.com/2007/04/debugging-linux-kernels-with.html
>>
>> -Kurt
>
> Interesting -- it looks like that's an option on ESX as well. T
On Thu, 13 Sep 2012 11:28:15 -0500, Kurt Lidl wrote:
Isn't this what you want?
http://stackframe.blogspot.com/2007/04/debugging-linux-kernels-with.html
-Kurt
Interesting -- it looks like that's an option on ESX as well. The only
question is: what do I do with that? It's going to give me t
On Thu, Sep 13, 2012 at 11:14:49AM -0500, Mark Felder wrote:
> On Thu, 13 Sep 2012 10:11:28 -0500, Andriy Gapon wrote:
>
> > Just curious - does VMWare provide a remote debugger support (gdb stub)?
>
> I'm not aware of one. What I have been able to successfully do is break
> into the debugger
On Thu, 13 Sep 2012 10:11:28 -0500, Andriy Gapon wrote:
Just curious - does VMWare provide a remote debugger support (gdb stub)?
I'm not aware of one. What I have been able to successfully do is break
into the debugger during the hang but the info I've posted so far has not
been relevant
on 13/09/2012 17:50 Mark Felder said the following:
> On Wed, 12 Sep 2012 14:20:26 -0500, John Baldwin wrote:
>
>> Are you still seeing this, and if so can you get a crashdump? Also, I'm
>> curious if you only see this with SUJ or if plain UFS+SU works fine?
>
> The crash on demand right now is
On Wed, 12 Sep 2012 14:20:26 -0500, John Baldwin wrote:
Are you still seeing this, and if so can you get a crashdump? Also, I'm
curious if you only see this with SUJ or if plain UFS+SU works fine?
The crash on demand right now is reproducible on 8.x and 9.x, so SUJ isn't a
requirement. Also,
On Wednesday, June 06, 2012 9:34:02 pm Mark Felder wrote:
> Hi guys I'm excitedly posting this from my phone. Good news for you guys,
> bad news for us -- we were building HA storage on vmware for a client and can
> now replicate the crash on demand. I'll be posting details when I get home to
> my PC
Hi guys I'm excitedly posting this from my phone. Good news for you guys, bad
news for us -- we were building HA storage on vmware for a client and can now
replicate the crash on demand. I'll be posting details when I get home to my PC
tonight, but this hopefully is enough to replicate the crash
On Thursday, May 31, 2012 11:11:11 am Mark Felder wrote:
> So when this hang happens, there never is a real panic. It just sits in a
> state which I describe as like being in a deadlock. How would I go about
> getting a crashdump if it never panics? Is it possible to do the dump over
> a netw
So when this hang happens, there never is a real panic. It just sits in a
state which I describe as like being in a deadlock. How would I go about
getting a crashdump if it never panics? Is it possible to do the dump over
a network or something because I don't believe it can write through the
On Wednesday, May 30, 2012 3:56:02 pm Mark Felder wrote:
> On Wed, 30 May 2012 12:17:07 -0500, John Baldwin wrote:
>
> >
> > Humm, can you test it with 2 CPUs?
> >
>
> We primarily only run with 1 CPU. We have seen it crash on multiple CPU
> VMs. Also, Dane Foster appeared to have been using m
On Wed, 30 May 2012 12:17:07 -0500, John Baldwin wrote:
Humm, can you test it with 2 CPUs?
We primarily only run with 1 CPU. We have seen it crash on multiple CPU
VMs. Also, Dane Foster appeared to have been using multiple CPUs in his
video transcoding VMs.
Unfortunately I can't give
On Wednesday, May 30, 2012 12:07:50 pm Mark Felder wrote:
> On Wed, 30 May 2012 10:06:13 -0500, John Baldwin wrote:
>
> >
> > Do you only have one CPU in this VM? If not, do you know which threads
> > the other CPUs were running (e.g. do you have ps7.png, etc.)?
>
> correct, only one CPU in the
On Wed, 30 May 2012 10:06:13 -0500, John Baldwin wrote:
Do you only have one CPU in this VM? If not, do you know which threads
the other CPUs were running (e.g. do you have ps7.png, etc.)?
correct, only one CPU in the VM
On Thursday, May 24, 2012 9:47:46 am Mark Felder wrote:
> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd
> wrote:
>
> > Hi,
> >
> > can you please, -please- file a PR? And place all of the above
> > information in it so we don't lose it?
> >
>
> I'd be glad to post a PR and assist in helping
Hi,
You guys now absolutely, positively have enough information for a PR.
It's still not clear whether it's a device/interrupt layer issue in
FreeBSD, or whether vmware is doing something wrong with how it
implements shared interrupts, or a bit of both..
Adrian
On 24 May 2012 13:54, dane foster
On 24. May 2012, at 13:47 , Mark Felder wrote:
> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd wrote:
>
>> Hi,
>>
>> can you please, -please- file a PR? And place all of the above
>> information in it so we don't lose it?
>>
>
> I'd be glad to post a PR and assist in helping to get it per
Hey all,
On 25/05/2012, at 1:47 AM, Mark Felder wrote:
> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd wrote:
>
>> Hi,
>>
>> can you please, -please- file a PR? And place all of the above
>> information in it so we don't lose it?
>>
>
> I'd be glad to post a PR and assist in helping to ge
On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd
wrote:
Hi,
can you please, -please- file a PR? And place all of the above
information in it so we don't lose it?
I'd be glad to post a PR and assist in helping to get it permanently
fixed. I certainly don't want this data to get lost and
Hi,
can you please, -please- file a PR? And place all of the above
information in it so we don't lose it?
If this is indeed the problem then I really think we should root cause
why the driver and/or interrupt handling code is getting angry with
the shared interrupt.
I'd also appreciate it if you
On Mon, 21 May 2012 12:01:19 -0500, Andrew Boyer
wrote:
You could try switching mpt to MSI. MSI interrupts are never shared.
Add this to /boot/device.hints:
hint.mpt.0.msi_enable="1"
Currently implementing this on the known crashy servers. I've been looking
around and all of our VM
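
A rough sketch, not from the thread, of one way to check whether mpt0 shares an IRQ line before and after enabling MSI. It assumes `vmstat -i` prints lines of the form "irqN: dev1 dev2  total  rate", with every device on a shared line listed after the colon; the SAMPLE text below is hypothetical and is not output from any of the affected servers. The function name shared_irqs is also made up.

```python
import re

def shared_irqs(vmstat_output):
    """Return {irq: [devices]} for every IRQ line that lists more than one device."""
    shared = {}
    for line in vmstat_output.splitlines():
        # Assumed layout: "irq18: em0 mpt0    <total>    <rate>"
        m = re.match(r"^(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$", line)
        if not m:
            continue  # header, cpu timer, and Total lines are skipped
        # Some releases mark truncated device lists with a trailing "+".
        devices = [d.rstrip("+") for d in m.group(2).split()]
        if len(devices) > 1:
            shared[m.group(1)] = devices
    return shared

# Hypothetical sample, for illustration only.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                         342          0
irq18: em0 mpt0                  9812734         16
cpu0: timer                   1209345678       1999
Total                         1219158754       2015
"""

print(shared_irqs(SAMPLE))
```

If mpt0 no longer appears in any shared entry after the hint is applied and the VM is rebooted, the controller is no longer on a shared line, which is the condition the MSI suggestion is meant to create.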
On May 21, 2012, at 12:41 PM, Mark Felder wrote:
> OK guys I've been talking with another user who can recreate this crash and
> the last bit of information we've learned seems to be leaning towards
> interrupts/IRQ issues like someone (bz@ perhaps?) suggested.
>
> I'm still trying to test thi
OK guys I've been talking with another user who can recreate this crash
and the last bit of information we've learned seems to be leaning towards
interrupts/IRQ issues like someone (bz@ perhaps?) suggested.
I'm still trying to test this myself, but the other user was able to
recreate my cra
Quick update:
I have received word last night that this crash has been consistently
happening to someone on FreeBSD 9 and they're looking for more ideas. I
changed the following 41 days ago:
- Video memory to "auto" if it wasn't already
- SCSI controller changed from LSI Logic Parallel to L
On 4/2/2012 3:59 PM, Joe Greco wrote:
>> On 4/2/2012 11:43 AM, Joe Greco wrote:
>>> As a user, you can't win. If you don't report
>>> a problem, you get criticized. If you report a problem but can't figure
>>> out how to reproduce it, you get criticized. If you can reproduce it
>>> but you don't
Guys,
The crash on my machine with debugging has evaded me for a few days. I'm
still looking for further suggestions of things I should grab from the DDB
when it happens again.
Thanks for the help everyone!
> On 4/2/2012 11:43 AM, Joe Greco wrote:
> > As a user, you can't win. If you don't report
> > a problem, you get criticized. If you report a problem but can't figure
> > out how to reproduce it, you get criticized. If you can reproduce it
> > but you don't submit a workaround, you get criticize
On 4/2/2012 11:43 AM, Joe Greco wrote:
> As a user, you can't win. If you don't report
> a problem, you get criticized. If you report a problem but can't figure
> out how to reproduce it, you get criticized. If you can reproduce it
> but you don't submit a workaround, you get criticized. If you
> On 03/30/2012 07:41, Joe Greco wrote:
> >> On 3/29/2012 7:01 AM, Joe Greco wrote:
> On 3/28/2012 1:59 PM, Mark Felder wrote:
> > FreeBSD 8-STABLE, 8.3, and 9.0 are untested
>
> As much as I'm sensitive to your production requirements, realistically
> it's not likely that y
On 03/30/2012 07:41, Joe Greco wrote:
>> On 3/29/2012 7:01 AM, Joe Greco wrote:
On 3/28/2012 1:59 PM, Mark Felder wrote:
> FreeBSD 8-STABLE, 8.3, and 9.0 are untested
As much as I'm sensitive to your production requirements, realistically
it's not likely that you'll get a he
On Fri, 30 Mar 2012 19:49:54 -0500, Adrian Chadd
wrote:
There's no guarantee that upgrading a VM or rebooting it won't change
the config of said VM. Don't forget to diff the vm config file..
I'm not sure how this would be accomplished. Am I supposed to be
running backup software (rsync
There's no guarantee that upgrading a VM or rebooting it won't change
the config of said VM. Don't forget to diff the vm config file..
Adrian
> Subsequent inspection suggested that it was happening during the
> periodic daily, though we never managed to get it to happen by manually
> forcing periodic daily, so that's only a theory.
Perhaps due to a bunch of VMs all running periodic daily at the same time?
> We had a perfectly functiona
Mark Felder wrote:
On Thu, 29 Mar 2012 12:24:30 -0500, wrote:
I just started reading this thread, but I am wondering if I missed
something here. What does this have to do with "Windows 7"?
I emailed him off-list but I'm guessing he thought this was on VMWare
Workstation or another product t
On Fri, 30 Mar 2012 11:53:10 -0500, Joe Greco wrote:
On the same vmdk files? "Deleting the VM" makes it sound like not.
Fresh new VMDK files every time, and always thick provisioned.
None of the other VM's, even the VM's that had been abused in this
horribly insensitive manner of being pla
> On Fri, 30 Mar 2012 09:44:47 -0500, Joe Greco wrote:
> > Have you migrated these hosts, or were they installed in-place and
> > never moved?
> > fwiw the apparent integrity of things on the VM is consistent with
> > our experience too.
>
> vMotion and Storage vMotion do not seem to affect th
On Fri, 30 Mar 2012 09:44:47 -0500, Joe Greco wrote:
Have you migrated these hosts, or were they installed in-place and
never moved?
fwiw the apparent integrity of things on the VM is consistent with
our experience too.
vMotion and Storage vMotion do not seem to affect the stability. Even
> On Thu, 29 Mar 2012 19:27:31 -0500, Joe Greco wrote:
>
> > It also doesn't explain the experience here, where one VM basically
> > crapped out but only after a migration - and then stayed crapped out.
> > It would be interesting to hear about your datastore, how busy it is,
> > what technology,
> On 3/29/2012 7:01 AM, Joe Greco wrote:
> >> On 3/28/2012 1:59 PM, Mark Felder wrote:
> >>> FreeBSD 8-STABLE, 8.3, and 9.0 are untested
> >>
> >> As much as I'm sensitive to your production requirements, realistically
> >> it's not likely that you'll get a helpful result without testing a newer
>
Again, it's starting to sound like an interrupt handling issue which
may or may not be limited to the storage device.
You'll have to engage someone who knows those device drivers and
likely have them add some debugging to the driver which can be easily
flipped on (via binaries in a ramdisk - very
On Thu, 29 Mar 2012 19:27:31 -0500, Joe Greco wrote:
It also doesn't explain the experience here, where one VM basically
crapped out but only after a migration - and then stayed crapped out.
It would be interesting to hear about your datastore, how busy it is,
what technology, whether you're us
> > And then there is this one with similar symptoms and a workaround:
> >
> > http://forums.freebsd.org/showthread.php?t=27899
>
> I'm now investigating those loader.conf options. I have my crashy machine
> set to use them on next boot so we'll see if it crashes now that I'm using
> LSI SAS emu
On 3/29/2012 7:01 AM, Joe Greco wrote:
>> On 3/28/2012 1:59 PM, Mark Felder wrote:
>>> FreeBSD 8-STABLE, 8.3, and 9.0 are untested
>>
>> As much as I'm sensitive to your production requirements, realistically
>> it's not likely that you'll get a helpful result without testing a newer
>> version. 8.
On Thu, 29 Mar 2012 15:53:52 -0500, Adam Vande More
wrote:
Doesn't VMWare offer different types of emulated disk controllers? If so,
that might be the easiest way to narrow the field. Another thing maybe to
try would be to backport the mpt
Yes, they offer Paravirtual (not applicable
On Thu, Mar 29, 2012 at 1:22 PM, Mark Felder wrote:
>
> If we assume mpt is the culprit
>
Doesn't VMWare offer different types of emulated disk controllers? If so,
that might be the easiest way to narrow the field. Another thing maybe to
try would be to backport the mpt
Also, it's not VMWare'
On Thu, 29 Mar 2012 12:53:49 -0500, Dieter BSD
wrote:
FreeBSD ?? - 7.4 never crash
FreeBSD 8.0 - 8.2 crashes
Obvious short term workaround is to run production on 7.4 (assuming you can)
until you figure out what is wrong with 8.x.
We're moving our most critical servers to 7.4 this week
> FreeBSD ?? - 7.4 never crash
> FreeBSD 8.0 - 8.2 crashes
Obvious short term workaround is to run production on 7.4 (assuming you can)
until you figure out what is wrong with 8.x.
What filesystem(s) are you running? UFS? ZFS? other?
> started randomly disconnecting people every morning
Due to
> On Thursday 29 March 2012 17:49:30 Joe Greco wrote:
> > > On Thursday 29 March 2012 15:42:42 Joe Greco wrote:
> > > > > Hi,
> > >
> > > Do both 32- and 64-bit versions of FreeBSD crash?
> >
> > We've only seen it happen on one virtual machine. That was a 32-bit
> > version. And it's not so mu
On Thu, 29 Mar 2012 11:53:02 -0500, Alan Cox wrote:
Not so long ago, VMware implemented a clever scheme for reducing the
overhead of virtualized interrupts that must be delivered by at least some
(if not all) of their emulated storage controllers:
http://static.usenix.org/events/atc11/tech
On Thu, 29 Mar 2012 12:24:30 -0500, wrote:
I just started reading this thread, but I am wondering if I missed
something here. What does this have to do with "Windows 7"?
I emailed him off-list but I'm guessing he thought this was on VMWare
Workstation or another product that would virtualiz
On Thu, 29 Mar 2012 12:05:30 -0500, Mark Atkinson
wrote:
If this is an interrupt problem with disk i/o, then you might want to
look into (DDB(4))
show intr
show intrcount
maybe
show allrman
Thank you! I really don't know what things we should be running in DDB to
diagnose this and we wi
On 03/29/2012 07:03, Mark Felder wrote:
> Alright, new data. It happened to crash about 10 minutes after I
> came in this morning and I ran some stuff in the DDB. I have no
> idea what information is useful, but perhaps someone will see
> something out
On Thu, Mar 29, 2012 at 11:27 AM, Mark Felder wrote:
> On Thu, 29 Mar 2012 10:55:36 -0500, Hans Petter Selasky
> wrote:
>
>>
>> It almost sounds like the lost interrupt issue I've seen with USB EHCI
>> devices, though disk I/O should have a retry timeout?
>>
>> What does "vmstat -i" output?
>>
>
This sounds just like a race condition that happens under Windows 7 on
this laptop. The race condition, as far as I can tell involves heavy
disk access and heavy network access, and usually leaves the drive light
on, while all activity monitors (alldisk, allcpu, allnetwork) are still
active, a
On Thu, 29 Mar 2012 10:49:30 -0500, Joe Greco wrote:
I explained it at the time to one of my VMware friends:
This is 100% identical to what we see, Joe! And we're so unlucky that we
have this happen on probably a dozen servers, but a handful are the really
bad ones. We've rebuilt them fr
On Thu, 29 Mar 2012 10:55:36 -0500, Hans Petter Selasky
wrote:
It almost sounds like the lost interrupt issue I've seen with USB EHCI
devices, though disk I/O should have a retry timeout?
What does "vmstat -i" output?
--HPS
Here's a server that has a week uptime and is due for a crash any
On Thu, 29 Mar 2012 10:31:24 -0500, Eduardo Morras
wrote:
Don't know about ESXi, but on other VM managers I can change the chipset
emulation from ICH10 to ICH4. Can you change it to an older chipset too?
Unfortunately there's no setting in the GUI for that but I'll keep looking
to see
On Thursday 29 March 2012 17:49:30 Joe Greco wrote:
> > On Thursday 29 March 2012 15:42:42 Joe Greco wrote:
> > > > Hi,
> >
> > Do both 32- and 64-bit versions of FreeBSD crash?
>
> We've only seen it happen on one virtual machine. That was a 32-bit
> version. And it's not so much a crash as it
> On Thursday 29 March 2012 15:42:42 Joe Greco wrote:
> > > Hi,
>
> Do both 32- and 64-bit versions of FreeBSD crash?
We've only seen it happen on one virtual machine. That was a 32-bit
version. And it's not so much a crash as it is a "disk I/O hang".
The fact that it was happening regularly t
At 16:03 29/03/2012, you wrote:
Alright, new data. It happened to crash about 10 minutes after I came in
this morning and I ran some stuff in the DDB. I have no idea what
information is useful, but perhaps someone will see something out of the
ordinary?
http://feld.me/freebsd/esx_crash/
Don't
On Thu, 29 Mar 2012 09:58:16 -0500, Hans Petter Selasky
wrote:
Do both 32- and 64-bit versions of FreeBSD crash?
Correct, we see both i386 and amd64 flavors crash in the same way.
> On 3/28/2012 1:59 PM, Mark Felder wrote:
> > FreeBSD 8-STABLE, 8.3, and 9.0 are untested
>
> As much as I'm sensitive to your production requirements, realistically
> it's not likely that you'll get a helpful result without testing a newer
> version. 8.2 came out over a year ago, many many thing
On Thursday 29 March 2012 15:42:42 Joe Greco wrote:
> > Hi,
Do both 32- and 64-bit versions of FreeBSD crash?
--HPS
> Hi,
>
> * have you filed a PR?
> * is the crash easily reproducible?
> * are you able to boot some ramdisk-only FreeBSD-8.2 images (eg create
> a ramdisk image using nanobsd?) and do some stress testing inside
> that?
>
> It sounds like you've established it's a storage issue, or at least
> int
Alright, new data. It happened to crash about 10 minutes after I came in
this morning and I ran some stuff in the DDB. I have no idea what
information is useful, but perhaps someone will see something out of the
ordinary?
http://feld.me/freebsd/esx_crash/
Thanks...
On Thu, 29 Mar 2012 02:36:49 -0500, Doug Barton wrote:
As much as I'm sensitive to your production requirements, realistically
it's not likely that you'll get a helpful result without testing a newer
version. 8.2 came out over a year ago, many many things have changed
since then.
The sad part
On Wed, 28 Mar 2012 18:31:38 -0500, Adrian Chadd
wrote:
* have you filed a PR?
No
* is the crash easily reproducible?
Unfortunately not. It's totally random. Some servers will "get the bug"
and crash daily, some will crash weekly, some might seem to be fine but 3
months later hit th
On 3/28/2012 1:59 PM, Mark Felder wrote:
> FreeBSD 8-STABLE, 8.3, and 9.0 are untested
As much as I'm sensitive to your production requirements, realistically
it's not likely that you'll get a helpful result without testing a newer
version. 8.2 came out over a year ago, many many things have chang
Hi,
* have you filed a PR?
* is the crash easily reproducible?
* are you able to boot some ramdisk-only FreeBSD-8.2 images (eg create
a ramdisk image using nanobsd?) and do some stress testing inside
that?
It sounds like you've established it's a storage issue, or at least
interrupt handling for
Alright guys, I'm at the end of my rope here. For those that haven't seen
my previous emails here's the (not so) quick breakdown:
Overview:
FreeBSD ?? - 7.4 never crash
FreeBSD 8.0 - 8.2 crashes
FreeBSD 8-STABLE, 8.3, and 9.0 are untested (Sorry, not possible in our
production at this time,