:You write:
::      we can not identify the specific problem from this message.
:: without sufficient information to identify and hopefully reproduce
:: the problem, we can not address it.  please provide this information
:: if it is available to you. if it is not, please provide us contact
:: information for the commercial entities experiencing the problem.
:
:I work at Yahoo.  My address there is "[EMAIL PROTECTED]".
:
:On a recent project I encountered two show-stopping bugs with 3.3-release
:that did not exist in 2.2.8-release:
:
:1) Random crashes in FXP interrupt or low-level IP code.  Something is
:   clobbering the kernel stack--possibly the NCR driver, since using an
:   Adaptec made the problem stop, as did a backport of the CAM driver
:   Peter Wemm tried.  This was on an N440BX, which is becoming quite
:   common in server applications.  Other installations are apparently
:   seeing the same problem on this hardware.
:
:2) A hard loop in the pagedaemon.  This was especially egregious, since
:   it meant the system had to be rebooted from the console--and since
:   the application could elicit the problem within a few minutes.
:   Disabling the use of mmap() for file update in the application
:   prevented the problem.  After spending a day trying to cook up a
:   test program that elicited the same behavior that the application
:   did, I gave up for lack of time.  But there have been other reports
:   of late that sound like this problem, mostly in high VM/RAM situations.
:
:That's two serious bugs that exist in 3.3-release but not in 2.2.8-release.
:Looking back through the archives, I can see that I'm not the only one who
:has experienced them.  I came away from the experience with the feeling that
:the FreeBSD project has some serious Q/A problems... and I can assure you,
:I'm not alone in this feeling.
:
:               -Ed

    Well, #2 at least should be fixed in -current.  Unfortunately the
    changes to the VM system were too extensive to backport to 3.x.  Or,
    I should say, at the time I started working on the VM system, core
    was not interested in allowing me to backport the changes, and later
    it was simply too late: too many changes had been made.
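
    For reference, "using mmap() for file update" in #2 means roughly the
    pattern below, as opposed to seek-and-write.  This is only an
    illustrative sketch in Python, not Ed's application and not a
    reproducer for the pagedaemon loop; the function names, file size,
    and offsets are made up for the example.

```python
import mmap
import os
import tempfile


def update_via_mmap(path, offset, data):
    """Overwrite len(data) bytes at offset by storing into a shared mapping."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the default mapping is shared and writable.
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset:offset + len(data)] = data  # dirties the mapped page(s)
            m.flush()                            # ask the OS to write dirty pages back


def update_via_write(path, offset, data):
    """The same update done with an explicit seek and write instead of a mapping."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)


def demo():
    # Hypothetical two-page scratch file; sizes and offsets are arbitrary.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * 8192)
        os.close(fd)
        update_via_mmap(path, 4096, b"mmap")
        update_via_write(path, 0, b"write")
        with open(path, "rb") as f:
            buf = f.read()
        return buf[:5], buf[4096:4100]
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

    With the mmap variant, modified pages stay in the VM system as dirty
    pages until they are flushed, which is the kind of load the
    pagedaemon has to deal with; the write variant goes through the
    ordinary file I/O path instead.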

    #1 has come up a couple of times.  There was a conversation in October
    that closely relates to your problem:

:From: Joe McGuckin <[EMAIL PROTECTED]>
:Subject:  fxp related kernel panic
:
:I have a 3.3-stable machine that I use as a news router (running diablo). The
:fxp0 interface averages 10-15 Mbps bandwidth continuously.
:
:About once a week the machine crashes & reboots. We enabled the debugger
:this time and captured the following debug output:
:
:Fatal trap 12: page fault while in kernel mode
:fault virtual address   = 0x382e4641
:fault code              = supervisor write, page not present
:instruction pointer     = 0x8:0xc01a372e
:stack pointer           = 0x10:0xc02523b0
:frame pointer           = 0x10:0xc02523c0
:code segment            = base 0x0, limit 0xfffff, type 0x1b
:                        = DPL 0, pres 1, def32 1, gran 1
:processor eflags        = interrupt enabled, resume, IOPL = 0
:current process         = Idle
:interrupt mask          = net
:kernel: type 12 trap, code=0
:Stopped at      fxp_add_rfabuf+0x1de:   movw    %ax,0x4(%esi)
:db> 
:
:%uname -a
:FreeBSD feeder.via.net 3.3-STABLE FreeBSD 3.3-STABLE #7: Mon Oct 18 17:14:40 PDT 1999     [EMAIL PROTECTED]:/usr/src/sys/compile/DIABLO  i386
:
:%dmesg
:Copyright (c) 1992-1999 FreeBSD Inc.
:Copyright (c) 1982, 1986, 1989, 1991, 1993
:        The Regents of the University of California. All rights reserved.
:FreeBSD 3.3-STABLE #7: Mon Oct 18 17:14:40 PDT 1999

    To which DG responded:

:From:     David Greenman <[EMAIL PROTECTED]>
:Subject:  Re: fxp related kernel panic 
:To:       Joe McGuckin <[EMAIL PROTECTED]>
:Cc:       [EMAIL PROTECTED], [EMAIL PROTECTED]
:Date:     Tue, 26 Oct 1999 11:43:02 -0700
:
:
:   Let me guess...your system has an Intel N440BX motherboard, right? If so,
:then it's a known problem with no solution yet.
:
:-DG
:
:David Greenman
:Co-founder/Principal Architect, The FreeBSD Project - http://www.freebsd.org
:Creator of high-performance Internet servers - http://www.terasolutions.com
:Pave the road of life with opportunities.

    And he also said:

:From:     David Greenman <[EMAIL PROTECTED]>
:Subject:  Re: fxp related kernel panic 
:To:       Lew Payne <[EMAIL PROTECTED]>
:Cc:       [EMAIL PROTECTED], Joe McGuckin <[EMAIL PROTECTED]>
:Date:     Tue, 26 Oct 1999 13:19:45 -0700
:
:
:>Hi David -- What if I install a *real* EtherExpress Pro-100B (or
:>whatever it's known as today) in the PCI slot, and use it instead
:>of the on-board (N440BX motherboard) fxp0 interface?
:>
:>Judging that you probably know the nature of the problem, do you
:>think this might circumvent it?
:
:   I think it is caused by the NCR/Symbios controller. It might be a side
:effect of the NCR just using up a lot of PCI bandwidth, with the real bug
:being in the fxp driver (although I've looked and haven't found one). So
:I don't think putting in a real Pro/100 will have any effect on the problem.
:Of course I don't really know what is causing it, so just about anything
:is possible.
:
:-DG
:
:David Greenman

    And that, I'm afraid, is where it has been left.  Nobody is sure where
    the problem is.  I suspect that it may be a DMA synchronization problem
    with either the NCR or the FXP driver, or perhaps heavy PCI bandwidth
    usage is generating a FIFO overrun error during the FXP DMA that the
    driver is not handling properly.  I just don't know.

    The only workaround at present is to use an Adaptec controller.  I have
    personally had *extremely* good luck with Adaptecs: the 2940UW, 7896 (or 97)
    U2W (on-motherboard), and 7890 (or 91) U2W (PCI card).

    I think part of the reason the problem has not been fixed is that many
    of the hardcore developers are using Adaptec controllers rather than NCR
    controllers and simply cannot reproduce it.

                                        -Matt
                                        Matthew Dillon 
                                        <[EMAIL PROTECTED]>


