Re: Freeze or Oops on recent kernels

2012-09-24 Thread yvahk-xreary
>
>HINT: We only care about the very most recent kernel.  If you can
>take a photo of the stack trace, then file a bug report and attach
>the .jpg.
>

After a bit of experimentation my guess is that it is all
about a bad Intel DX58SO2 motherboard.

And my guess is that it has something to do with memory
mapping during an interrupt. Don't know much about memory
mapping but I suspect that something quite fundamental
goes wrong with this motherboard only.

Exact same type of oops also happens with a PCIe RS232 serial card.

Trouble is that one tends to select hardware for stability so that
means slightly older hardware and hopefully fewer bugs.
Unfortunately the latest hardware may not be supported by older,
more stable kernels, and it also won't be well supported by newer
kernels.

Worse still, everything lasts less than a year with manufacturers getting
away with vandalistic change of sockets or connectors or chipsets just to
achieve more sales, and new bugs. This constant cycle of pointless
change doesn't seem to be heading anywhere in particular.

John W

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Freeze or Oops on recent kernels

2012-09-07 Thread yvahk-xreary

I am getting either a kernel Oops or a freeze (without any console output)
on recent kernels.  I have tested on 2.6.32.26 PAE, 3.1.9 PAE, and 3.4.9 PAE,
all with similar results.

The hardware involved comprises mainly:

12 GB ram
Intel DX58SO2 motherboard
Intel Xeon E5220
2 x onboard ethernet
Intel PRO/1000 PT Dual Port ethernet
Hauppauge HVR1700 DVB-T PCIe card
Technisat SkyStar2 DVB-T PCI card

The oops or freeze occurs when both DVB cards are recording
simultaneously. With either card installed on its own there
is never any problem.

I should also add that with the exact same kernel and cards on
a Gigabyte GA-P31-S3G motherboard + Intel Pentium 4 the problem
NEVER occurs. So there may be a DX58SO2/timing/interrupt issue.

When there is an oops it is like this (hand-transcribed
from 2.6.32.26 PAE kernel):

c0789444 panic + 3e/e9
oops_end + 97/a6
no_context + 13b/145
__bad_area_nosemaphore + ec/f4
? do_page_fault + 0/29f
bad_area_nosemaphore + 12/15
do_page_fault + 139/29f
? do_page_fault + 139/29f
c078b54b error_code + 73/78
f82084fb ? cx23885_video_irq + d8/1dc
f820b04d cx23885_irq + 3df/3fe
c04834ac handle_irq_event + 57/fd
handle_fasteoi_irq + 6f/a2
handle_irq + 40/4d
do_IRQ + 46/9a
common_interrupt + 30/38
c045b7a9 ? prepare_to_wait + 14c
f7fef5a5 ? videobug_waitm + 90/133
c045b601 ? autoremove_waker_function + 0/34
f805e5bc videobuf_dvb_thread + 73/135
f805e4f9 ? videobuf_dvb_thread + 0/135
c045b3c9 kthread + 64/69
? kthread + 0/69
kernel_thread_helper + 7/10
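
For anyone wanting to work through the trace above: each entry reads
as symbol + offset/size in hex, and a leading "?" marks an address that
was found on the stack but may not be part of the real call chain. The
following is only a rough sketch in Python of how the entries could be
split into fields, for example to look up the offset inside
cx23885_video_irq in a disassembly. The names Frame, FRAME_RE and
parse_frame are hypothetical helpers made up for illustration, and the
pattern assumes 32-bit addresses as seen on these PAE kernels.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    address: Optional[int]  # return address, if one was transcribed
    symbol: str             # function name
    offset: int             # offset into the function
    size: Optional[int]     # function size, if it was transcribed
    reliable: bool          # False when the entry carries a "?" marker


# Hypothetical pattern for one hand-transcribed entry. It accepts an
# address that is fused to the symbol or separated from it by a space.
FRAME_RE = re.compile(r"""
    ^\s*
    (?:(?P<addr>[0-9a-f]{8})\s*)?   # optional 32-bit hex address
    (?P<guess>\?\s*)?               # optional unreliable-entry marker
    (?P<symbol>[A-Za-z_]\w*)        # symbol name
    \s*\+\s*
    (?P<offset>[0-9a-f]+)           # hex offset into the function
    (?:/(?P<size>[0-9a-f]+))?       # optional hex function size
    \s*$
""", re.VERBOSE)


def parse_frame(line: str) -> Optional[Frame]:
    """Split one trace entry into fields, or return None if it does not match."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    return Frame(
        address=int(m.group("addr"), 16) if m.group("addr") else None,
        symbol=m.group("symbol"),
        offset=int(m.group("offset"), 16),
        size=int(m.group("size"), 16) if m.group("size") else None,
        reliable=m.group("guess") is None,
    )


if __name__ == "__main__":
    for entry in ("f82084fb? cx23885_video_irq + d8/1dc",
                  "do_page_fault + 139/29f"):
        print(parse_frame(entry))
```

The entries without a "?" are the ones most worth following first, since
those are the frames the kernel considered part of the actual call chain.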

Although I am a very experienced programmer I have next to zero
kernel expertise except for minor patching of a few drivers.

My guess from the stack trace is that there might be an issue
with page fault recursion (if that is at all possible).

Anyhow I don't want to waste too much of my time or anybody else's
on this - although with the problem occurring with the 3.4.9 kernel, which
has significant interrupt handling changes, it probably is something
that somebody might want to know about. If anybody can spot a clue
as to where I should be looking and how I should go about isolating
the problem (if only the kernel core dumped!) please let me know and
I will possibly try and assist. I need some guidance.

Regards
John W.


