Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash

2012-05-24 Thread Adrian Chadd
Hi,

You guys now absolutely, positively have enough information for a PR.

It's still not clear whether it's a device/interrupt layer issue in
FreeBSD, whether VMware is doing something wrong in how it
implements shared interrupts, or a bit of both.

Adrian

On 24 May 2012 13:54, dane foster  wrote:
> Hey all,
>
> On 25/05/2012, at 1:47 AM, Mark Felder wrote:
>
>> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd  wrote:
>>
>>> Hi,
>>>
>>> can you please, -please- file a PR? And place all of the above
>>> information in it so we don't lose it?
>>>
>>
>> I'd be glad to post a PR and assist in helping to get it permanently fixed. 
>> I certainly don't want this data to get lost and honestly our business uses 
>> FreeBSD on VMWare so much that we really need a permanent fix as much as 
>> anyone else :-)
>>
>> The reason I've hesitated to post a PR so far is that I didn't have any 
>> truly useful or concrete evidence of where the problem lies. After Dane 
>> Foster contacted me and told me he could recreate the crash on demand with 
>> his workload it was easier to narrow things down. The suggestion that it was 
>> an interrupts issue (by possibly Bjoern Zeeb?) and Dane's discovery that his 
>> crashes ceased when em0 and mpt0 share an IRQ, but em0 is completely unused 
>> was starting to prove there is some strong evidence here in favor of the 
>> interrupts issue.
>>
>> Dane, what's the status on your end? Has your fix still been successful? Is 
>> it also stable if you simply set hint.mpt.0.msi_enable="1" ?
>>
>
> The situation I've got that's stable now is:
>
> hw.pci.enable_msi="0"
> hw.pci.enable_msix="0"
>
> in /boot/loader.conf
>
> and:
>
> samael:~:% vmstat -i                                    [ 6:31PM]
> interrupt                          total       rate
> irq1: atkbd0                           6          0
> irq18: em0 mpt0                  3061100         15
> irq19: em1                       6891706         35
> cpu0: timer                    166383735        868
> cpu1: timer                    166382123        868
> cpu3: timer                    166382123        868
> cpu2: timer                    166382121        868
> Total                          675482914       3525
>
> Not using em0. This works for 8 (FreeBSD samael.slush.ca 8.3-STABLE FreeBSD 
> 8.3-STABLE #1: Mon May  7 11:51:03 NZST 2012     
> r...@samael.slush.ca:/usr/obj/usr/src/sys/DENE  amd64).
>
> Neither of those settings on their own seem to stop it from happening.
>
> The 9 box I've tried this on still hangs almost every time I run HandBrake,
> no matter whether MSI/MSIX is enabled, or whether I have separate IRQs for
> mpt0 and em0/1.
>
> I can cause the hang mostly on demand, but not quite sure what information to 
> provide from the hung system. If somebody can let me know what they need, 
> including root access, I can make that happen.
>
> Cheers,
>
> Dane
>
>
>
>>
>> Thanks!
>
>
>
>
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash

2012-05-24 Thread Bjoern A. Zeeb

On 24. May 2012, at 13:47 , Mark Felder wrote:

> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd  wrote:
> 
>> Hi,
>> 
>> can you please, -please- file a PR? And place all of the above
>> information in it so we don't lose it?
>> 
> 
> I'd be glad to post a PR and assist in helping to get it permanently fixed. I 
> certainly don't want this data to get lost and honestly our business uses 
> FreeBSD on VMWare so much that we really need a permanent fix as much as 
> anyone else :-)
> 
> The reason I've hesitated to post a PR so far is that I didn't have any truly 
> useful or concrete evidence of where the problem lies. After Dane Foster 
> contacted me and told me he could recreate the crash on demand with his 
> workload it was easier to narrow things down. The suggestion that it was an 
> interrupts issue (by possibly Bjoern Zeeb?) 

Just for the public archives: the interrupts suggestion wasn't me.  I might
have mentioned disabling cdrom and fdc as far as possible, but I cannot
remember anything else...


> and Dane's discovery that his crashes ceased when em0 and mpt0 share an IRQ, 
> but em0 is completely unused was starting to prove there is some strong 
> evidence here in favor of the interrupts issue.
> 
> Dane, what's the status on your end? Has your fix still been successful? Is 
> it also stable if you simply set hint.mpt.0.msi_enable="1" ?

-- 
Bjoern A. Zeeb You have to have visions!
   It does not matter how good you are. It matters what good you do!



Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash

2012-05-24 Thread dane foster
Hey all,

On 25/05/2012, at 1:47 AM, Mark Felder wrote:

> On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd  wrote:
> 
>> Hi,
>> 
>> can you please, -please- file a PR? And place all of the above
>> information in it so we don't lose it?
>> 
> 
> I'd be glad to post a PR and assist in helping to get it permanently fixed. I 
> certainly don't want this data to get lost and honestly our business uses 
> FreeBSD on VMWare so much that we really need a permanent fix as much as 
> anyone else :-)
> 
> The reason I've hesitated to post a PR so far is that I didn't have any truly 
> useful or concrete evidence of where the problem lies. After Dane Foster 
> contacted me and told me he could recreate the crash on demand with his 
> workload it was easier to narrow things down. The suggestion that it was an 
> interrupts issue (by possibly Bjoern Zeeb?) and Dane's discovery that his 
> crashes ceased when em0 and mpt0 share an IRQ, but em0 is completely unused 
> was starting to prove there is some strong evidence here in favor of the 
> interrupts issue.
> 
> Dane, what's the status on your end? Has your fix still been successful? Is 
> it also stable if you simply set hint.mpt.0.msi_enable="1" ?
> 

The situation I've got that's stable now is:

hw.pci.enable_msi="0"
hw.pci.enable_msix="0"

in /boot/loader.conf

and:

samael:~:% vmstat -i                                    [ 6:31PM]
interrupt                          total       rate
irq1: atkbd0                           6          0
irq18: em0 mpt0                  3061100         15
irq19: em1                       6891706         35
cpu0: timer                    166383735        868
cpu1: timer                    166382123        868
cpu3: timer                    166382123        868
cpu2: timer                    166382121        868
Total                          675482914       3525

Not using em0. This works for 8 (FreeBSD samael.slush.ca 8.3-STABLE FreeBSD 
8.3-STABLE #1: Mon May  7 11:51:03 NZST 2012 
r...@samael.slush.ca:/usr/obj/usr/src/sys/DENE  amd64).

Neither of those settings on their own seem to stop it from happening.
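
For anyone comparing their own systems against the output above, here is a minimal Python sketch that picks out IRQ lines shared by more than one device. The sample text is the `vmstat -i` output from this message; the helper name `shared_irqs` and the regular expression are illustrative assumptions, not part of any FreeBSD tool.

```python
import re

# Sample copied from the `vmstat -i` output quoted in this thread.
VMSTAT_OUTPUT = """\
interrupt                          total       rate
irq1: atkbd0                           6          0
irq18: em0 mpt0                  3061100         15
irq19: em1                       6891706         35
cpu0: timer                    166383735        868
cpu1: timer                    166382123        868
cpu3: timer                    166382123        868
cpu2: timer                    166382121        868
Total                          675482914       3525
"""

# Matches lines such as "irq18: em0 mpt0   3061100   15".
# (Hypothetical pattern written for this sample only.)
LINE_RE = re.compile(r"^(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$")


def shared_irqs(text):
    """Return {irq: [devices]} for every IRQ line naming more than one device."""
    shared = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # header, per-CPU timer rows, and the Total row
        irq = match.group(1)
        devices = match.group(2).split()
        if len(devices) > 1:
            shared[irq] = devices
    return shared


print(shared_irqs(VMSTAT_OUTPUT))
```

On the sample above this reports irq18 as the only line shared between two devices (em0 and mpt0).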

The 9 box I've tried this on still hangs almost every time I run HandBrake, no 
matter whether MSI/MSIX is enabled, or whether I have separate IRQs for mpt0 
and em0/1.

I can cause the hang mostly on demand, but not quite sure what information to 
provide from the hung system. If somebody can let me know what they need, 
including root access, I can make that happen.

Cheers,

Dane



> 
> Thanks!






Re: proper newfs options for SSD disk

2012-05-24 Thread Peter Jeremy
On 2012-May-18 22:54:43 +0200, Dimitry Andric  wrote:
>Be sure to use "-t enable" when creating the filesystem:

Only if your SSD supports TRIM.  Some consumer-grade SSDs don't and
get very confused if sent TRIM commands.

-- 
Peter Jeremy




Re: Please help me diagnose this crazy VMWare/FreeBSD 8.x crash

2012-05-24 Thread Mark Felder
On Wed, 23 May 2012 17:30:40 -0500, Adrian Chadd wrote:

> Hi,
>
> can you please, -please- file a PR? And place all of the above
> information in it so we don't lose it?



I'd be glad to post a PR and assist in helping to get it permanently  
fixed. I certainly don't want this data to get lost and honestly our  
business uses FreeBSD on VMWare so much that we really need a permanent  
fix as much as anyone else :-)


The reason I've hesitated to post a PR so far is that I didn't have any  
truly useful or concrete evidence of where the problem lies. After Dane  
Foster contacted me and told me he could recreate the crash on demand with  
his workload it was easier to narrow things down. The suggestion that it
was an interrupts issue (possibly by Bjoern Zeeb?), together with Dane's
discovery that his crashes ceased when em0 and mpt0 share an IRQ but em0 is
completely unused, was starting to look like strong evidence in favor of the
interrupts issue.


Dane, what's the status on your end? Has your fix still been successful?  
Is it also stable if you simply set hint.mpt.0.msi_enable="1" ?



Thanks!


Re: proper newfs options for SSD disk

2012-05-24 Thread Warren Block

On Wed, 23 May 2012, Tim Kientzle wrote:

> On May 22, 2012, at 7:40 AM, Warren Block wrote:
>
>> On Tue, 22 May 2012, Matthias Apitz wrote:
>>
>>> On Tuesday, May 22, 2012 at 07:42:18AM -0600, Warren Block wrote:
>>>
>>>> On Tue, 22 May 2012, Matthias Apitz wrote:
>>>>
>>>>> On Sunday, May 20, 2012 at 03:36:01AM +0900, rozhuk...@gmail.com wrote:
>>>>>
>>>>>> Do not use MBR (or manually do all to align).
>>>>>> 63 - not 4k aligned.
>>>>>
>>>>> To create the above shown partition layout I have not used gpart(8); I
>>>>> just said:
>>>>>
>>>>>   # fdisk -I /dev/ada0
>>>>>   # fdisk -B /dev/ada0
>>>>>
>>>>> ...
>>>>> What is wrong with this procedure?
>>>>
>>>> The filesystem partitions end up at locations that aren't even multiples
>>>> of 4K.  This can reduce performance.  How much probably depends on the
>>>> SSD.
>>>
>>> But this is then rather a bug in fdisk(8) and not a PEBKAC, or? :-)
>>
>> A bug in the design of MBR.  Which probably can be forgiven, considering when
>> it was created and the other problems with it. :)
>>
>> gpart's alignment option can be used with MBR slices and bsdlabel partitions.
>
> GPart's alignment option doesn't work for MBR slices.
> It rounds to the requested alignment, and then rounds again
> to the track size, which defaults to 63 sectors.
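
The double rounding described above can be modelled in a few lines of Python. This is only a sketch of the behaviour as described in this thread, not gpart's actual code: the 512-byte sector size, the choice to round up at both steps, and the function names are all assumptions made for illustration.

```python
SECTOR_SIZE = 512                     # bytes per sector; assumed, not stated in the thread
TRACK_SECTORS = 63                    # default track size mentioned above
ALIGN_SECTORS = 4096 // SECTOR_SIZE   # 4K alignment expressed in sectors (8)


def round_up(value, multiple):
    """Round value up to the next multiple of `multiple`."""
    return -(-value // multiple) * multiple


def double_rounded_start(requested_start):
    """Model only: align to 4K first, then to a track boundary (assumed round-up)."""
    aligned = round_up(requested_start, ALIGN_SECTORS)
    return round_up(aligned, TRACK_SECTORS)


start = double_rounded_start(63)
# Print the resulting start sector and its remainder against 4K alignment.
print(start, start % ALIGN_SECTORS)
```

Under these assumptions a requested start of 63 is first moved to 64 and then to 126, which is no longer a multiple of 8 sectors. The second rounding undoes the first, which is the effect being described.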


There's an example in my proposed rewrite of the Handbook RAID1 
section: http://www.wonkity.com/~wblock/mirror/book.html


The slice starts at block 126, two blocks shy of 4K alignment.  With the 
added two blocks for the bsdlabel, all of the FreeBSD partitions end up 
aligned at even 4K multiples.
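
The block numbers here can be checked with a little arithmetic. The Python sketch below assumes 512-byte blocks (not stated in the thread) and reads the "added two blocks" as the first partition's offset inside the slice; the constant names are illustrative only.

```python
SECTOR_SIZE = 512   # bytes per block; assumed
ALIGNMENT = 4096    # 4K boundary in bytes


def is_4k_aligned(block):
    """True if the byte offset of `block` falls on a 4K boundary."""
    return (block * SECTOR_SIZE) % ALIGNMENT == 0


SLICE_START = 126       # slice start from the example above
PARTITION_OFFSET = 2    # the "added two blocks", read as an in-slice offset (assumption)

for label, block in [
    ("traditional MBR start", 63),
    ("slice start", SLICE_START),
    ("first partition", SLICE_START + PARTITION_OFFSET),
]:
    print(f"{label}: block {block}, 4K aligned: {is_4k_aligned(block)}")
```

Blocks 63 and 126 fall at byte offsets 32256 and 64512, neither of which is a multiple of 4096, while block 128 falls at 65536, which is.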


A filesystem in the raw slice would be misaligned.  Presumably the 
answer is "well don't do that, then" (always use a bsdlabel with MBR), 
or some trick to skip a couple of blocks like gnop.


If there are any mistakes in that example, please help me correct them 
to avert steps 4 and 5 of the traditional commit process (4: apologize, 
and 5: fix and recommit).



> I'm not convinced this is a bug in the design of MBR.  I don't
> think anything in the MBR design requires that partitions
> be track-aligned.


I meant "bug" in the sense of a missing feature.  MBR may not have a 
provision for fixed alignment, but to its credit, doesn't prevent it 
either.

RE: proper newfs options for SSD disk

2012-05-24 Thread Andrew Duane
> -----Original Message-----
> From: owner-freebsd-hack...@freebsd.org 
> [mailto:owner-freebsd-hack...@freebsd.org] On Behalf Of Tim Kientzle
> Sent: Thursday, May 24, 2012 12:49 AM
> To: Warren Block
> Cc: freebsd-hackers@freebsd.org; Matthias Apitz
> Subject: Re: proper newfs options for SSD disk
> 
> GPart's alignment option doesn't work for MBR slices.
> It rounds to the requested alignment, and then rounds again
> to the track size, which defaults to 63 sectors.
> 
> I'm not convinced this is a bug in the design of MBR.  I don't
> think anything in the MBR design requires that partitions
> be track-aligned.
> 
> Tim

It really doesn't. This is old school thinking based around minimizing seek and 
rotation time on slow multiplatter HDDs. It also helped the redundant 
superblock layout scheme of UFS make that spiral striping down a set of disk 
platters. My bet is no one has ever bothered to rethink this in the 25 years 
since


...
Andrew Duane
Juniper Networks
+1 978-589-0551 (o)
+1 603-770-7088 (m)
adu...@juniper.net

 


