Re: missing interrupts (was Re: CURRENT is freezing again ...)
On Mon, 27 Nov 2000, Andrew Gallatin wrote:

>
> Bruce Evans writes:
>  > Possible causes of the problem:
>  > 1) isa_handle_intr() claims to send specific EOIs (0x30 | irq) but
>  >    actually sends non-specific ones (0x20 | garbage).  Since interrupts
>
> I think that sending non-specific EOIs is the problem.  Sending
> specific EOIs seems to eliminate my nic timeouts and the need to
> manually feed an eoi to recover from a missing interrupt.
>
> My question is: how does one send a specific EOI correctly?  I don't
> have decent documentation for this.  Above, you seem to imply that
> 0x30 is a specific EOI.  That does not seem to work for me (machine
> locks at boot).
>
> Linux uses 0xe0.  According to some Tru64 docs I have,
> that means "Rotate Priority on specific EOI".  According
> to that same documentation, 0x60 is a specific EOI.  Both of these

Oops, I misread the data sheet.  0x60 is correct, 0x30 is wrong.  The
irq number is in the lowest 3 bits.

> appear to work just fine.  What should the alpha port use?

I think it should use non-specific EOIs and send them early (when
there is no ambiguity about which interrupt is being handled), as in
the i386 port.  Sending them late mainly keeps the ICU's braindamaged
interrupt priority scheme in effect for longer than necessary.

Bruce

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message
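Since the irq number only occupies the lowest 3 bits, a specific EOI for an irq above 7 has to be split between the two cascaded controllers. The following is a minimal sketch of that split, not code from the FreeBSD tree. It assumes the conventional PC/AT layout in which the slave controller is wired to input 2 of the master; whether the UP1000 follows that layout is an assumption here. The names `specific_eoi`, `struct eoi_cmds` and `CASCADE_IRQ` are hypothetical, and the sketch only computes the command bytes rather than touching any I/O port.

```c
#include <assert.h>
#include <stdint.h>

#define OCW2_SPECIFIC_EOI 0x60  /* low 3 bits carry the irq level */
#define CASCADE_IRQ       2     /* assumption: slave hangs off master input 2 */

/* Command bytes needed to acknowledge one ISA irq (hypothetical helper type). */
struct eoi_cmds {
	int     to_slave;    /* nonzero if slave_cmd must be written as well */
	uint8_t slave_cmd;   /* byte for the slave controller, if any */
	uint8_t master_cmd;  /* byte for the master controller */
};

/* Compute the specific-EOI bytes for irq 0..15. */
static struct eoi_cmds
specific_eoi(unsigned irq)
{
	struct eoi_cmds c = { 0, 0, 0 };

	assert(irq < 16);
	if (irq >= 8) {
		/* The slave sees levels 0..7; the master only sees the cascade input. */
		c.to_slave = 1;
		c.slave_cmd = (uint8_t)(OCW2_SPECIFIC_EOI | (irq & 7));
		c.master_cmd = (uint8_t)(OCW2_SPECIFIC_EOI | CASCADE_IRQ);
	} else {
		c.master_cmd = (uint8_t)(OCW2_SPECIFIC_EOI | irq);
	}
	return c;
}
```

Under these assumptions, irq 10 (the fxp interrupt in this thread) needs 0x62 written to the slave and 0x62 to the master, while irq 3 needs only 0x63 written to the master.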
Re: missing interrupts (was Re: CURRENT is freezing again ...)
In <[EMAIL PROTECTED]>, Andrew Gallatin wrote:

> Bruce Evans writes:
>  > Possible causes of the problem:
>  > 1) isa_handle_intr() claims to send specific EOIs (0x30 | irq) but
>  >    actually sends non-specific ones (0x20 | garbage).  Since interrupts
>  >    may be handled in non-LIFO order, this results in EOIs being sent
>  >    for the wrong interrupts.  I think this just randomizes the
>  >    brokenness caused by delaying sending of EOIs.  I can't see how it
>  >    would result in an EOI being lost -- the right number of EOIs will
>  >    have been sent after all handlers have returned.
>  >
>
> I think that sending non-specific EOIs is the problem.  Sending
> specific EOIs seems to eliminate my nic timeouts and the need to
> manually feed an eoi to recover from a missing interrupt.
>
> My question is: how does one send a specific EOI correctly?  I don't
> have decent documentation for this.  Above, you seem to imply that
> 0x30 is a specific EOI.  That does not seem to work for me (machine
> locks at boot).
>
> Linux uses 0xe0.  According to some Tru64 docs I have,
> that means "Rotate Priority on specific EOI".  According
> to that same documentation, 0x60 is a specific EOI.  Both of these
> appear to work just fine.  What should the alpha port use?

My notes say:

  Non-specific EOI             : 0x20
  Specific EOI                 : 0x60 | IRQn
  EOI + rotate priority        : 0xa0
  EOI + select lowest priority : 0xe0 | IRQn

-- 
Robert S. F. Drehmel <[EMAIL PROTECTED]>
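The four values in these notes are all combinations of three bits in the 8259's OCW2 command word. The sketch below spells that out and also classifies a byte written to the command port, going by my reading of the 8259A data sheet: a write with bit 4 set is taken as ICW1 (the start of an initialization sequence) rather than as an EOI. If that reading is right, it may be why 0x30 | irq locked the machine at boot, but that link is a guess. All identifiers here are hypothetical and none of them come from the FreeBSD or Linux sources.

```c
#include <assert.h>
#include <stdint.h>

/* OCW2 bit names as used in the 8259A data sheet. */
#define OCW2_EOI 0x20  /* end of interrupt */
#define OCW2_SL  0x40  /* "specific level": low 3 bits name the irq */
#define OCW2_R   0x80  /* rotate priority */

enum pic_write_kind { PIC_ICW1, PIC_OCW2, PIC_OCW3 };

/* Classify a byte written to the command port (address line A0 = 0). */
static enum pic_write_kind
pic_classify(uint8_t b)
{
	if (b & 0x10)
		return PIC_ICW1;  /* bit 4 set: begins (re)initialization */
	if (b & 0x08)
		return PIC_OCW3;  /* bit 3 set: read-register/poll commands */
	return PIC_OCW2;          /* EOI and priority commands */
}

static uint8_t
ocw2_nonspecific_eoi(void)
{
	return OCW2_EOI;                                  /* 0x20 */
}

static uint8_t
ocw2_specific_eoi(unsigned level)
{
	assert(level < 8);
	return (uint8_t)(OCW2_SL | OCW2_EOI | level);     /* 0x60 | IRQn */
}

static uint8_t
ocw2_rotate_nonspecific_eoi(void)
{
	return OCW2_R | OCW2_EOI;                         /* 0xa0 */
}

static uint8_t
ocw2_rotate_specific_eoi(unsigned level)
{
	assert(level < 8);
	return (uint8_t)(OCW2_R | OCW2_SL | OCW2_EOI | level);  /* 0xe0 | IRQn */
}
```

With these helpers, `pic_classify(0x30 | 2)` reports `PIC_ICW1`, whereas `pic_classify(ocw2_specific_eoi(2))` reports `PIC_OCW2`.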
Re: missing interrupts (was Re: CURRENT is freezing again ...)
Bruce Evans writes:
 > Possible causes of the problem:
 > 1) isa_handle_intr() claims to send specific EOIs (0x30 | irq) but
 >    actually sends non-specific ones (0x20 | garbage).  Since interrupts
 >    may be handled in non-LIFO order, this results in EOIs being sent
 >    for the wrong interrupts.  I think this just randomizes the
 >    brokenness caused by delaying sending of EOIs.  I can't see how it
 >    would result in an EOI being lost -- the right number of EOIs will
 >    have been sent after all handlers have returned.

I think that sending non-specific EOIs is the problem.  Sending
specific EOIs seems to eliminate my nic timeouts and the need to
manually feed an eoi to recover from a missing interrupt.

My question is: how does one send a specific EOI correctly?  I don't
have decent documentation for this.  Above, you seem to imply that
0x30 is a specific EOI.  That does not seem to work for me (machine
locks at boot).

Linux uses 0xe0.  According to some Tru64 docs I have,
that means "Rotate Priority on specific EOI".  According
to that same documentation, 0x60 is a specific EOI.  Both of these
appear to work just fine.  What should the alpha port use?

Thanks,

Drew
Re: missing interrupts (was Re: CURRENT is freezing again ...)
On Fri, 17 Nov 2000, Andrew Gallatin wrote:

> [fxp isa irq pending but never occurs]

> I then wrote a hack which sends an eoi.  If I call my hack from ddb
> and send an eoi for irq10, everything goes back to normal and the
> network interface is back.
>
> So, is it a race in the interrupt code, or is it something about how
> the code is structured?
>
> On the alpha at least, we get the irq, mask the irq and set the
> ithread runnable.  When the (isa) ithread runs, it calls the interrupt
> handler and then sends an eoi.  The interrupt is then unmasked.
>
> I've peeked at the linux code and noticed that they do things
> differently.  They first mask the interrupt, and then send the eoi
> immediately -- before the handler runs.  They then run the handler
> and unmask the interrupt.  They seem to do this both on i386 and
> alpha.

FreeBSD does the same thing on i386's as Linux, except that for fast
interrupts it delays the EOI until the handler returns so that the
handler gets called as soon as possible.

> Does anybody have any ideas about this?  Does something bad
> happen if you don't send an eoi in a reasonable amount of time?

Delayed EOIs work normally, but lower priority interrupts (according
to the ICU's priority scheme) are masked until the EOI is sent.  This
is bad mainly because the ICU's priority scheme is different from
FreeBSD's priority scheme.

Possible causes of the problem:

1) isa_handle_intr() claims to send specific EOIs (0x30 | irq) but
   actually sends non-specific ones (0x20 | garbage).  Since interrupts
   may be handled in non-LIFO order, this results in EOIs being sent
   for the wrong interrupts.  I think this just randomizes the
   brokenness caused by delaying sending of EOIs.  I can't see how it
   would result in an EOI being lost -- the right number of EOIs will
   have been sent after all handlers have returned.

2) Insufficient locking for ICU accesses.  Again, I can't see how this
   would affect EOIs.  On i386's, some accesses are locked implicitly
   by sched_lock.

3) Enabling interrupts (and unlocking the ICU) before sending EOI seems
   to just make things more complicated.  It requires the specific
   EOIs in (1).

On alphas, interrupts aren't masked in the ICU while they are handled
(the disable/enable args in the call to alpha_setup_intr() in
isa_setup_intr() are NULL ...).  They are masked by some combination
of the CPU and ICU priorities.

Bruce
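The argument in (1) can be checked with a toy model of one controller's in-service register. This is only a sketch under stated assumptions: a single 8259-style controller in fixed-priority mode with level 0 as the highest priority, where a non-specific EOI clears the highest-priority in-service bit and a specific EOI clears the named bit. It models nothing of the alpha interrupt path, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: only the in-service register (ISR) of one controller. */
struct pic_model {
	uint8_t isr;
};

/* The CPU acknowledges an interrupt: its level becomes "in service". */
static void
pic_ack(struct pic_model *p, unsigned level)
{
	assert(level < 8);
	p->isr |= (uint8_t)(1u << level);
}

/* Non-specific EOI: clears the highest-priority (lowest-numbered) set bit. */
static int
pic_eoi_nonspecific(struct pic_model *p)
{
	for (unsigned level = 0; level < 8; level++) {
		if (p->isr & (1u << level)) {
			p->isr &= (uint8_t)~(1u << level);
			return (int)level;
		}
	}
	return -1;  /* nothing was in service */
}

/* Specific EOI: clears exactly the named level. */
static void
pic_eoi_specific(struct pic_model *p, unsigned level)
{
	assert(level < 8);
	p->isr &= (uint8_t)~(1u << level);
}

/*
 * irq 5 is taken, then irq 3 preempts it, but irq 5's handler finishes
 * first (non-LIFO order).  Returns the ISR after the first EOI and stores
 * the ISR after both EOIs in *final_isr.
 */
static uint8_t
demo_nonlifo(int use_specific, uint8_t *final_isr)
{
	struct pic_model p = { 0 };
	uint8_t mid;

	pic_ack(&p, 5);
	pic_ack(&p, 3);

	if (use_specific)
		pic_eoi_specific(&p, 5);
	else
		(void)pic_eoi_nonspecific(&p);  /* clears bit 3, not bit 5 */
	mid = p.isr;

	if (use_specific)
		pic_eoi_specific(&p, 3);
	else
		(void)pic_eoi_nonspecific(&p);
	*final_isr = p.isr;
	return mid;
}
```

In this model the non-specific case leaves bit 5 in service after the first EOI (0x20) even though irq 5's handler is the one that finished, while the specific case leaves bit 3 (0x08). Both cases end with an empty in-service register, which matches the observation that the right number of EOIs has been sent once all handlers have returned.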
missing interrupts (was Re: CURRENT is freezing again ...)
Valentin Chopov writes:
 > Hi,
 >
 > After last cvsup my machine (Dual PIII, SMP kernel) is freezing again in
 > 10 min after boot...
 >

I've seen one similar problem on an alpha UP1000 that I'd like some
input about.  The UP1000 is essentially an alpha 21264 stuffed into an
AMD Athlon system.  It has an AMD-751 chipset and handles all device
interrupts via an isa interrupt controller.

I've noticed that under "heavy" load (gdb -k kernel.debug /dev/mem on
an NFS filesystem), the network interface goes away, never to
reappear.  All I see is "fxp0: device timeout" on console.  This
started with SMPng.

After a little bit of investigation with ddb, I discovered that the
NIC's irq was pending.  E.g.:

login: fxp0: device timeout
Stopped at      siointr1+0x17c: br      zero,siointr1+0x32c
db> call isa_irq_pending()
0x410

The fxp interface is at irq 10, so 0x410 means there's an irq 10
pending.

I then wrote a hack which sends an eoi.  If I call my hack from ddb
and send an eoi for irq10, everything goes back to normal and the
network interface is back.

So, is it a race in the interrupt code, or is it something about how
the code is structured?

On the alpha at least, we get the irq, mask the irq and set the
ithread runnable.  When the (isa) ithread runs, it calls the interrupt
handler and then sends an eoi.  The interrupt is then unmasked.

I've peeked at the linux code and noticed that they do things
differently.  They first mask the interrupt, and then send the eoi
immediately -- before the handler runs.  They then run the handler
and unmask the interrupt.  They seem to do this both on i386 and
alpha.

Does anybody have any ideas about this?  Does something bad
happen if you don't send an eoi in a reasonable amount of time?

Drew

--
Andrew Gallatin, Sr Systems Programmer  http://www.cs.duke.edu/~gallatin
Duke University                         Email: [EMAIL PROTECTED]
Department of Computer Science          Phone: (919) 660-6590
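The 0x410 reading can be decoded mechanically. The sketch below assumes, as the message above does, that the value returned by isa_irq_pending() is a bitmask in which bit n stands for irq n; that layout is inferred from the ddb output and is not checked against the source. The helper name `pending_irqs` is hypothetical.

```c
/*
 * List the irq numbers set in a 16-bit pending mask.
 * Assumption: bit n of the mask corresponds to ISA irq n.
 * Fills out[] in ascending order and returns the number of entries.
 */
static int
pending_irqs(unsigned mask, int out[16])
{
	int n = 0;

	for (int irq = 0; irq < 16; irq++) {
		if (mask & (1u << irq))
			out[n++] = irq;
	}
	return n;
}
```

Under that assumption, 0x410 decodes to two pending lines, irq 4 and irq 10, so the fxp interrupt at irq 10 is indeed among them.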