Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
Hi,

You guys now absolutely, positively have enough information for a PR.

It's still not clear whether it's a device/interrupt layer issue in
FreeBSD, or whether vmware is doing something wrong with how it
implements shared interrupts, or a bit of both..

Adrian

On 24 May 2012 13:54, dane foster wrote:
> Hey all,
>
> On 25/05/2012, at 1:47 AM, Mark Felder wrote:
>
>> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd wrote:
>>
>>> Hi,
>>>
>>> can you please, -please- file a PR? And place all of the above
>>> information in it so we don't lose it?
>>>
>>
>> I'd be glad to post a PR and assist in helping to get it permanently fixed.
>> I certainly don't want this data to get lost and honestly our business uses
>> FreeBSD on VMWare so much that we really need a permanent fix as much as
>> anyone else :-)
>>
>> The reason I've hesitated to post a PR so far is that I didn't have any
>> truly useful or concrete evidence of where the problem lies. After Dane
>> Foster contacted me and told me he could recreate the crash on demand with
>> his workload it was easier to narrow things down. The suggestion that it was
>> an interrupts issue (by possibly Bjoern Zeeb?) and Dane's discovery that his
>> crashes ceased when em0 and mpt0 share an IRQ, but em0 is completely unused
>> was starting to prove there is some strong evidence here in favor of the
>> interrupts issue.
>>
>> Dane, what's the status on your end? Has your fix still been successful? Is
>> it also stable if you simply set hint.mpt.0.msi_enable="1" ?
>>
>
> The situation I've got that's stable now is:
>
> hw.pci.enable_msi="0"
> hw.pci.enable_msix="0"
>
> in /boot/loader.conf
>
> and:
>
> samael:~:% vmstat -i                                  [ 6:31PM]
> interrupt                          total       rate
> irq1: atkbd0                           6          0
> irq18: em0 mpt0                  3061100         15
> irq19: em1                       6891706         35
> cpu0: timer                    166383735        868
> cpu1: timer                    166382123        868
> cpu3: timer                    166382123        868
> cpu2: timer                    166382121        868
> Total                          675482914       3525
>
> Not using em0. This works for 8 (FreeBSD samael.slush.ca 8.3-STABLE FreeBSD
> 8.3-STABLE #1: Mon May 7 11:51:03 NZST 2012
> r...@samael.slush.ca:/usr/obj/usr/src/sys/DENE amd64).
>
> Neither of those settings on their own seem to stop it from happening.
>
> The 9 box I've tried this on still hangs almost every time i run handbrake,
> no matter whether MSI/MSIX is enabled, or I have separate IRQs for mpt0 and
> em0/1
>
> I can cause the hang mostly on demand, but not quite sure what information to
> provide from the hung system. If somebody can let me know what they need,
> including root access, I can make that happen.
>
> Cheers,
>
> Dane
>
>>
>> Thanks!

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
On 24. May 2012, at 13:47 , Mark Felder wrote:

> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd wrote:
>
>> Hi,
>>
>> can you please, -please- file a PR? And place all of the above
>> information in it so we don't lose it?
>>
>
> I'd be glad to post a PR and assist in helping to get it permanently fixed. I
> certainly don't want this data to get lost and honestly our business uses
> FreeBSD on VMWare so much that we really need a permanent fix as much as
> anyone else :-)
>
> The reason I've hesitated to post a PR so far is that I didn't have any truly
> useful or concrete evidence of where the problem lies. After Dane Foster
> contacted me and told me he could recreate the crash on demand with his
> workload it was easier to narrow things down. The suggestion that it was an
> interrupts issue (by possibly Bjoern Zeeb?)

Just for the public archives. Interrupts wasn't me. I might have
mentioned disabling cdrom and fdc as good as possible but everything
else I cannot remember...

> and Dane's discovery that his crashes ceased when em0 and mpt0 share an IRQ,
> but em0 is completely unused was starting to prove there is some strong
> evidence here in favor of the interrupts issue.
>
> Dane, what's the status on your end? Has your fix still been successful? Is
> it also stable if you simply set hint.mpt.0.msi_enable="1" ?

--
Bjoern A. Zeeb
You have to have visions!
It does not matter how good you are. It matters what good you do!
Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
Hey all,

On 25/05/2012, at 1:47 AM, Mark Felder wrote:

> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd wrote:
>
>> Hi,
>>
>> can you please, -please- file a PR? And place all of the above
>> information in it so we don't lose it?
>>
>
> I'd be glad to post a PR and assist in helping to get it permanently fixed. I
> certainly don't want this data to get lost and honestly our business uses
> FreeBSD on VMWare so much that we really need a permanent fix as much as
> anyone else :-)
>
> The reason I've hesitated to post a PR so far is that I didn't have any truly
> useful or concrete evidence of where the problem lies. After Dane Foster
> contacted me and told me he could recreate the crash on demand with his
> workload it was easier to narrow things down. The suggestion that it was an
> interrupts issue (by possibly Bjoern Zeeb?) and Dane's discovery that his
> crashes ceased when em0 and mpt0 share an IRQ, but em0 is completely unused
> was starting to prove there is some strong evidence here in favor of the
> interrupts issue.
>
> Dane, what's the status on your end? Has your fix still been successful? Is
> it also stable if you simply set hint.mpt.0.msi_enable="1" ?
>

The situation I've got that's stable now is:

hw.pci.enable_msi="0"
hw.pci.enable_msix="0"

in /boot/loader.conf

and:

samael:~:% vmstat -i                                  [ 6:31PM]
interrupt                          total       rate
irq1: atkbd0                           6          0
irq18: em0 mpt0                  3061100         15
irq19: em1                       6891706         35
cpu0: timer                    166383735        868
cpu1: timer                    166382123        868
cpu3: timer                    166382123        868
cpu2: timer                    166382121        868
Total                          675482914       3525

Not using em0. This works for 8 (FreeBSD samael.slush.ca 8.3-STABLE
FreeBSD 8.3-STABLE #1: Mon May 7 11:51:03 NZST 2012
r...@samael.slush.ca:/usr/obj/usr/src/sys/DENE amd64).

Neither of those settings on their own seem to stop it from happening.

The 9 box I've tried this on still hangs almost every time i run
handbrake, no matter whether MSI/MSIX is enabled, or I have separate
IRQs for mpt0 and em0/1

I can cause the hang mostly on demand, but not quite sure what
information to provide from the hung system. If somebody can let me
know what they need, including root access, I can make that happen.

Cheers,

Dane

>
> Thanks!
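
For anyone comparing their own guest against the layout above, the shared-IRQ line can be picked out of "vmstat -i" output mechanically. The following is a minimal Python sketch and not something posted in the thread: the function name shared_irqs is hypothetical, and it assumes the column layout shown in Dane's output (an "irqN:" label, one or more device names, then total and rate).

```python
# Sample taken from the vmstat -i output quoted in the message above.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                           6          0
irq18: em0 mpt0                  3061100         15
irq19: em1                       6891706         35
cpu0: timer                    166383735        868
Total                          675482914       3525
"""


def shared_irqs(vmstat_output):
    """Return {irq_label: [devices]} for IRQ lines listing more than one device.

    Assumes each interrupt line looks like 'irqN: dev [dev ...] total rate',
    which matches the sample above but is not guaranteed on every release.
    """
    shared = {}
    for line in vmstat_output.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        # Skip the header, the Total line, and per-CPU timer lines.
        if not sep or not label.startswith("irq"):
            continue
        fields = rest.split()
        devices = fields[:-2]  # the last two fields are total and rate
        if len(devices) > 1:
            shared[label] = devices
    return shared


if __name__ == "__main__":
    print(shared_irqs(SAMPLE))
```

On the sample above this reports irq18 as shared between em0 and mpt0, which is the combination discussed in this thread.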
Re: proper newfs options for SSD disk
On 2012-May-18 22:54:43 +0200, Dimitry Andric wrote:
>Be sure to use "-t enable" when creating the filesystem:

Only if your SSD supports TRIM. Some consumer-grade SSDs don't and
get very confused if sent TRIM commands.

--
Peter Jeremy
Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd wrote:

> Hi,
>
> can you please, -please- file a PR? And place all of the above
> information in it so we don't lose it?
>

I'd be glad to post a PR and assist in helping to get it permanently
fixed. I certainly don't want this data to get lost and honestly our
business uses FreeBSD on VMWare so much that we really need a permanent
fix as much as anyone else :-)

The reason I've hesitated to post a PR so far is that I didn't have any
truly useful or concrete evidence of where the problem lies. After Dane
Foster contacted me and told me he could recreate the crash on demand
with his workload it was easier to narrow things down. The suggestion
that it was an interrupts issue (by possibly Bjoern Zeeb?) and Dane's
discovery that his crashes ceased when em0 and mpt0 share an IRQ, but
em0 is completely unused was starting to prove there is some strong
evidence here in favor of the interrupts issue.

Dane, what's the status on your end? Has your fix still been
successful? Is it also stable if you simply set
hint.mpt.0.msi_enable="1" ?

Thanks!
Re: proper newfs options for SSD disk
On Wed, 23 May 2012, Tim Kientzle wrote:

> On May 22, 2012, at 7:40 AM, Warren Block wrote:
>
>> On Tue, 22 May 2012, Matthias Apitz wrote:
>>
>>> On Tuesday, May 22, 2012 at 07:42:18AM -0600, Warren Block wrote:
>>>
>>>> On Tue, 22 May 2012, Matthias Apitz wrote:
>>>>
>>>>> On Sunday, May 20, 2012 at 03:36:01AM +0900, rozhuk...@gmail.com wrote:
>>>>>
>>>>>> Do not use MBR (or manually do all to align).
>>>>>> 63 - not 4k aligned.
>>>>>
>>>>> To create the above shown partition layout I have not used gpart(8);
>>>>> I just said:
>>>>>
>>>>> # fdisk -I /dev/ada0
>>>>> # fdisk -B /dev/ada0
>>>>> ...
>>>>>
>>>>> What is wrong with this procedure?
>>>>
>>>> The filesystem partitions end up at locations that aren't even
>>>> multiples of 4K. This can reduce performance. How much probably
>>>> depends on the SSD.
>>>
>>> But this is then rather a bug in fdisk(8) and not a PEBKAC, or? :-)
>>
>> A bug in the design of MBR. Which probably can be forgiven, considering
>> when it was created and the other problems with it. :)
>>
>> gpart's alignment option can be used with MBR slices and bsdlabel
>> partitions.
>
> GPart's alignment option doesn't work for MBR slices.
> It rounds to the requested alignment, and then rounds again
> to the track size, which defaults to 63 sectors.

There's an example in my proposed rewrite of the Handbook RAID1 section:
http://www.wonkity.com/~wblock/mirror/book.html

The slice starts at block 126, two blocks shy of 4K alignment. With the
added two blocks for the bsdlabel, all of the FreeBSD partitions end up
aligned at even 4K multiples. A filesystem in the raw slice would be
misaligned. Presumably the answer is "well don't do that, then" (always
use a bsdlabel with MBR), or some trick to skip a couple of blocks like
gnop.

If there are any mistakes in that example, please help me correct them
to avert steps 4 and 5 of the traditional commit process (4: apologize,
and 5: fix and recommit).

> I'm not convinced this is a bug in the design of MBR. I don't
> think anything in the MBR design requires that partitions
> be track-aligned.

I meant "bug" in the sense of a missing feature. MBR may not have a
provision for fixed alignment, but to its credit, doesn't prevent it
either.
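
The alignment arithmetic in this message can be checked with a few lines of Python. This is only an illustrative sketch, not something from the thread: the helper names are hypothetical, and it assumes 512-byte sectors, which the messages imply but do not state.

```python
SECTOR_SIZE = 512  # bytes per sector; an assumption, not stated in the thread
ALIGNMENT = 4096   # the 4K boundary discussed above


def is_4k_aligned(start_sector, sector_size=SECTOR_SIZE, alignment=ALIGNMENT):
    """True if a partition starting at start_sector begins on a 4K boundary."""
    return (start_sector * sector_size) % alignment == 0


def sectors_to_boundary(start_sector, sector_size=SECTOR_SIZE, alignment=ALIGNMENT):
    """Number of sectors to skip to reach the next 4K boundary (0 if aligned)."""
    sectors_per_boundary = alignment // sector_size
    return (-start_sector) % sectors_per_boundary


if __name__ == "__main__":
    # 63: the default track size; 126: the slice start in the example;
    # 128: the slice start plus the two blocks taken by the bsdlabel.
    for start in (63, 126, 128):
        print(start, is_4k_aligned(start), sectors_to_boundary(start))
```

Under that assumption, sector 126 is a multiple of the 63-sector track size but sits two sectors before the boundary at sector 128, which matches the "two blocks shy" figure above.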
RE: proper newfs options for SSD disk
> -Original Message-
> From: owner-freebsd-hack...@freebsd.org
> [mailto:owner-freebsd-hack...@freebsd.org] On Behalf Of Tim Kientzle
> Sent: Thursday, May 24, 2012 12:49 AM
> To: Warren Block
> Cc: freebsd-hackers@freebsd.org; Matthias Apitz
> Subject: Re: proper newfs options for SSD disk
>
> GPart's alignment option doesn't work for MBR slices.
> It rounds to the requested alignment, and then rounds again
> to the track size, which defaults to 63 sectors.
>
> I'm not convinced this is a bug in the design of MBR. I don't
> think anything in the MBR design requires that partitions
> be track-aligned.
>
> Tim

It really doesn't. This is old school thinking based around minimizing
seek and rotation time on slow multiplatter HDDs. It also helped the
redundant superblock layout scheme of UFS make that spiral striping
down a set of disk platters. My bet is no one has ever bothered to
rethink this in the 25 years since ...

Andrew Duane
Juniper Networks
+1 978-589-0551 (o)
+1 603-770-7088 (m)
adu...@juniper.net