On 2013-02-25 15:54, Jan Kiszka wrote:
On 2013-02-25 15:39, Anders Blomdell wrote:
On 2013-02-25 13:27, Gilles Chanteperdrix wrote:
On 02/25/2013 11:18 AM, Anders Blomdell wrote:

On 2013-02-15 16:26, Jan Kiszka wrote:
On 2013-02-15 16:15, Anders Blomdell wrote:
Hi,

I have a DX79SI that dies with "kernel BUG at
arch/x86/kernel/ipipe.c:589!" when running Xenomai. This is not very
surprising since when running the system with an ordinary kernel there
are a few 'do_IRQ: X.Y No irq handler for vector (irq -1)' each day.

Question is if it would be possible to do something less fatal than
'BUG_ON(irq < 0);' in the code below:

This remains a bug that has to be understood.


int __ipipe_handle_irq(struct pt_regs *regs)
{
       struct ipipe_percpu_data *p = __ipipe_this_cpu_ptr(&ipipe_percpu);
       int irq, vector = regs->orig_ax, flags = 0;
       struct pt_regs *tick_regs;

       if (likely(vector < 0)) {
           irq = __this_cpu_read(vector_irq[~vector]);
           BUG_ON(irq < 0);
       } else { /* Software-generated. */
           irq = vector;
           flags = IPIPE_IRQF_NOACK;
       }

Kernel 3.5.7 with latest I-pipe?
Yes.

This is the second report of this kind,
see [1] for the discussion and suggestions. If you don't have KGDB and
the like enabled, try Gilles' instrumentations.
After running Xenomai for five and a half days on a DX58SO motherboard, the
system crashed, leaving a single 'do_IRQ: 2.166 No irq handler for
vector (irq -1)' on our logserver.

I'm planning to put in Gilles' instrumentations and change the BUG_ON to
a WARN_ON/WARN, but what should I return after that (my guess is a
'return 1', but waiting a week to be proved wrong would be a waste of
time :-).


Returning 1 is incorrect:
- you should probably jump to the end of the __ipipe_handle_irq function
- if the irq is irq 7, meaning a spurious irq, Linux should handle it,
so, __ipipe_dispatch_irq should be called.
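
To make the two points above concrete, here is a minimal user-space sketch of the suggested control flow, not the actual I-pipe code. Everything in it is a hypothetical stand-in: `model_vector_irq` plays the role of the per-CPU `vector_irq[]` table, `model_dispatch_irq` the role of `__ipipe_dispatch_irq`, and `model_tail` (with its `root_stalled` flag) is only an assumed placeholder for whatever the real function does at its end. The point is that an unassigned vector warns and skips the dispatch but still runs the common tail, while any valid irq (irq 7 included) is dispatched as usual.

```c
#include <stdio.h>

#define MODEL_NR_VECTORS 256

/* Hypothetical stand-in for the per-CPU vector_irq[] table; -1 = unassigned. */
static int model_vector_irq[MODEL_NR_VECTORS];

/* Bookkeeping so the behaviour can be observed from outside. */
static unsigned int dispatched_count;
static unsigned int last_dispatched_irq;
static unsigned int tail_runs;
static int root_stalled; /* assumption: the tail's result depends on some state */

static void model_init(void)
{
    for (int i = 0; i < MODEL_NR_VECTORS; i++)
        model_vector_irq[i] = -1;
    dispatched_count = 0;
    tail_runs = 0;
    root_stalled = 0;
}

/* Hypothetical stand-in for __ipipe_dispatch_irq(); note the unsigned irq. */
static void model_dispatch_irq(unsigned int irq)
{
    dispatched_count++;
    last_dispatched_irq = irq;
}

/* Assumed placeholder for the end of the handler: it computes the return value. */
static int model_tail(void)
{
    tail_runs++;
    return root_stalled ? 0 : 1;
}

/* orig_ax < 0 encodes a hardware vector as ~vector, as in the quoted code. */
static int model_handle_irq(long orig_ax)
{
    int irq;

    if (orig_ax < 0) {
        unsigned long vector = (unsigned long)~orig_ax;

        irq = vector < MODEL_NR_VECTORS ? model_vector_irq[vector] : -1;
        if (irq < 0) {
            /* Instrumentation only: report and skip the dispatch. */
            fprintf(stderr, "no irq for vector %lu (irq %d)\n", vector, irq);
            goto out;
        }
    } else {
        irq = (int)orig_ax; /* software-generated */
    }

    model_dispatch_irq((unsigned int)irq);
out:
    return model_tail(); /* not a hard-coded 'return 1' */
}
```

Calling `model_handle_irq(~0x37L)` after setting `model_vector_irq[0x37] = 7` dispatches irq 7, whereas a vector that was never assigned only produces the warning and still goes through `model_tail()`.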
OK, so you mean that I'm probably looking at two different problems:
DX58SO: a spurious interrupt (irq==7) passes through __ipipe_handle_irq
without triggering BUG_ON, but something else breaks.
DX79SI: some (spurious?) interrupt results in irq < 0, triggering BUG_ON

Would the following changes be what you have in mind:

        if (likely(vector < 0)) {
                irq = __this_cpu_read(vector_irq[~vector]);
                if (irq < 0) {
                        WARN(irq < 0, "irq(%d) < 0", irq);

Again, that's only an instrumentation to help finding the bug.
I know (have to crawl before I can walk :-))

My understanding of the code is that if irq < 0, I should not call __ipipe_dispatch_irq, since its irq argument is unsigned, but perhaps the MAYDAY and IPIPE_STALL_FLAG stuff should be executed (moving the out: label up a few lines).
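
As a small illustration of the unsigned-argument concern, the sketch below shows what a callee with an `unsigned int` parameter sees when handed -1, and a guard that rejects such values first. Both function names are hypothetical and exist only for this example; `nr_irqs` is an assumed upper bound, not a value taken from the I-pipe sources.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical callee taking an unsigned irq, like the dispatch function. */
static unsigned int seen_by_callee(unsigned int irq)
{
    return irq;
}

/* Guard: only non-negative irqs below the assumed bound may be dispatched. */
static bool irq_is_dispatchable(int irq, unsigned int nr_irqs)
{
    return irq >= 0 && (unsigned int)irq < nr_irqs;
}
```

Passing -1 to `seen_by_callee` converts it to `UINT_MAX`, which is far outside any irq table, so the check has to happen before the call.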

I guess that I don't need to consider the special case when p->hrtimer_irq == -1 and irq == -1?



According to my reading of the code, Linux should behave incorrectly
over invalid vector_irq entries as well.
The difference on DX79SI being that Linux (3.6.11) only logs do_IRQ, while Xenomai (Linux 3.5.7/xenomai-2.6.2.1) gives a BUG_ON which makes all filesystems read-only, and after that everything more or less freezes :-(


Jan

                        goto out;
                }
        } else { /* Software-generated. */
                irq = vector;
                flags = IPIPE_IRQF_NOACK;
        }
        ...
out:
        return 1;




Regards

Anders

--
Anders Blomdell                  Email: [email protected]
Department of Automatic Control
Lund University                  Phone:    +46 46 222 4625
P.O. Box 118                     Fax:      +46 46 138118
SE-221 00 Lund, Sweden


_______________________________________________
Xenomai mailing list
[email protected]
http://www.xenomai.org/mailman/listinfo/xenomai
