Re: Regression in 32-bit ppc kernel

2012-04-28 Thread Larry Finger

On 04/27/2012 07:42 PM, Benjamin Herrenschmidt wrote:


> Ok, so you do have a serial port, probably two even :-) One of them is
> connected to the infra red transceiver and the other one is probably
> connected to the internal modem.
>
> (The modem itself might not use it, some of these machines use an
> i2s/i2c modem, some use a usb modem, but the serial port is wired to the
> connector regardless).


I have done a little more debugging. The problem is definitely coming from 
drivers/tty/serial/pmac_zilog.c. I am getting ChanB interrupts while open, which 
causes the following code segment to return IRQ_NONE:


   if (r3 & (CHBEXT | CHBTxIP | CHBRxIP)) {
           if (!ZS_IS_OPEN(uap_a)) {
                   pmz_debug("ChanB interrupt while open !\n");
                   goto skip_b;
           }
           write_zsreg(uap_b, R0, RES_H_IUS);
           zssync(uap_b);
           if (r3 & CHBEXT)

When this section is entered, r3 == 0x2 (CHBTxIP).
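
To make the r3 value concrete, here is a minimal standalone sketch of the two channel tests. The bit values are assumed to match the RR3 definitions in drivers/tty/serial/pmac_zilog.h, and the helper names chan_a_pending() and chan_b_pending() are made up for illustration; they do not exist in the driver.

```c
/* Interrupt-pending bits of Z8530 read register 3 (RR3).
 * Assumption: same values as in drivers/tty/serial/pmac_zilog.h. */
#define CHBEXT  0x01    /* channel B external/status */
#define CHBTxIP 0x02    /* channel B transmit */
#define CHBRxIP 0x04    /* channel B receive */
#define CHAEXT  0x08    /* channel A external/status */
#define CHATxIP 0x10    /* channel A transmit */
#define CHARxIP 0x20    /* channel A receive */

/* Hypothetical helper: 1 if any channel A source is pending in r3. */
static int chan_a_pending(unsigned char r3)
{
	return (r3 & (CHAEXT | CHATxIP | CHARxIP)) != 0;
}

/* Hypothetical helper: 1 if any channel B source is pending in r3. */
static int chan_b_pending(unsigned char r3)
{
	return (r3 & (CHBEXT | CHBTxIP | CHBRxIP)) != 0;
}
```

With r3 == 0x2, chan_b_pending() is true and chan_a_pending() is false, so under these assumptions only the channel B branch of the handler is entered.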

Larry
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: Regression in 32-bit ppc kernel

2012-04-28 Thread Andreas Schwab
Larry Finger <larry.fin...@lwfinger.net> writes:

> I have done a little more debugging. The problem is definitely coming from
> drivers/tty/serial/pmac_zilog.c. I am getting ChanB interrupts while open,
> which causes the following code segment to return IRQ_NONE:
>
>       if (r3 & (CHBEXT | CHBTxIP | CHBRxIP)) {
>               if (!ZS_IS_OPEN(uap_a)) {

s/uap_a/uap_b/?

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: Regression in 32-bit ppc kernel

2012-04-28 Thread Benjamin Herrenschmidt
On Sat, 2012-04-28 at 13:09 -0500, Larry Finger wrote:
> I have done a little more debugging. The problem is definitely coming
> from drivers/tty/serial/pmac_zilog.c. I am getting ChanB interrupts while
> open, which causes the following code segment to return IRQ_NONE:
>
>       if (r3 & (CHBEXT | CHBTxIP | CHBRxIP)) {
>               if (!ZS_IS_OPEN(uap_a)) {
>                       pmz_debug("ChanB interrupt while open !\n");
>                       goto skip_b;
>               }
>               write_zsreg(uap_b, R0, RES_H_IUS);
>               zssync(uap_b);
>               if (r3 & CHBEXT)
>
> When this section is entered, r3 == 0x2 (CHBTxIP).

Ok. The debug code was meant to spell "while not open" btw :-)

I have some ideas what's going on. I think the irda stuff can trigger
interrupts during the open/close sequence before ZS_IS_OPEN is true.

I'll send a fix.

Cheers,
Ben.




Re: Regression in 32-bit ppc kernel

2012-04-28 Thread Benjamin Herrenschmidt
On Sat, 2012-04-28 at 20:23 +0200, Andreas Schwab wrote:
> Larry Finger <larry.fin...@lwfinger.net> writes:
>
> > I have done a little more debugging. The problem is definitely coming from
> > drivers/tty/serial/pmac_zilog.c. I am getting ChanB interrupts while open,
> > which causes the following code segment to return IRQ_NONE:
> >
> >       if (r3 & (CHBEXT | CHBTxIP | CHBRxIP)) {
> >               if (!ZS_IS_OPEN(uap_a)) {
>
> s/uap_a/uap_b/?

Good catch... Let's see if that fixes it for Larry...

Cheers,
Ben.



Re: Regression in 32-bit ppc kernel

2012-04-28 Thread Benjamin Herrenschmidt
On Sun, 2012-04-29 at 08:41 +1000, Benjamin Herrenschmidt wrote:
> On Sat, 2012-04-28 at 13:09 -0500, Larry Finger wrote:
> > I have done a little more debugging. The problem is definitely coming
> > from drivers/tty/serial/pmac_zilog.c. I am getting ChanB interrupts while
> > open, which causes the following code segment to return IRQ_NONE:
> >
> >       if (r3 & (CHBEXT | CHBTxIP | CHBRxIP)) {
> >               if (!ZS_IS_OPEN(uap_a)) {
> >                       pmz_debug("ChanB interrupt while open !\n");
> >                       goto skip_b;
> >               }
> >               write_zsreg(uap_b, R0, RES_H_IUS);
> >               zssync(uap_b);
> >               if (r3 & CHBEXT)
> >
> > When this section is entered, r3 == 0x2 (CHBTxIP).
>
> Ok. The debug code was meant to spell "while not open" btw :-)
>
> I have some ideas what's going on. I think the irda stuff can trigger
> interrupts during the open/close sequence before ZS_IS_OPEN is true.
>
> I'll send a fix.

Hrm, actually, Andreas also found an actual bug here, as we aren't
testing uap_b but uap_a ... oops. I think when I tested chan b I always
had chan a open :-) That will be easy to fix.

Can you try turning the uap_a to uap_b test above and see if that fixes
some of it for you ?
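
For illustration, a minimal userspace model of why testing the wrong port makes the handler return IRQ_NONE. This is a sketch under stated assumptions, not the driver code: struct port_model and handle_chan_b() are hypothetical, the enum only mirrors the kernel's IRQ_NONE/IRQ_HANDLED names, and everything except the open check of the channel B branch is left out.

```c
#include <stdbool.h>

/* Channel B pending bits of RR3 (assumed to match pmac_zilog.h). */
#define CHBEXT  0x01
#define CHBTxIP 0x02
#define CHBRxIP 0x04

/* Local stand-ins for the kernel's irqreturn_t values. */
enum irq_result { IRQ_NONE = 0, IRQ_HANDLED = 1 };

/* Hypothetical stand-in for the per-port state; only "open" is modelled. */
struct port_model {
	bool open;
};

/*
 * Channel B half of the handler, reduced to the open check.
 * check_port is the port whose open state gets tested: the code under
 * discussion tests channel A here, the proposed change tests channel B.
 */
static enum irq_result handle_chan_b(unsigned char r3,
				     const struct port_model *check_port)
{
	if (!(r3 & (CHBEXT | CHBTxIP | CHBRxIP)))
		return IRQ_NONE;	/* nothing pending on channel B */
	if (!check_port->open)
		return IRQ_NONE;	/* the "goto skip_b" path: nothing serviced */
	return IRQ_HANDLED;		/* the driver would reset IUS and service it */
}
```

With channel A closed and channel B open, handle_chan_b(0x2, &chan_a) gives IRQ_NONE while handle_chan_b(0x2, &chan_b) gives IRQ_HANDLED, which is the difference the one-character change is expected to make.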

Cheers,
Ben.




Re: Regression in 32-bit ppc kernel

2012-04-28 Thread Larry Finger

On 04/28/2012 05:48 PM, Benjamin Herrenschmidt wrote:

> On Sat, 2012-04-28 at 20:23 +0200, Andreas Schwab wrote:
> > Larry Finger <larry.fin...@lwfinger.net> writes:
> >
> > > I have done a little more debugging. The problem is definitely coming from
> > > drivers/tty/serial/pmac_zilog.c. I am getting ChanB interrupts while open,
> > > which causes the following code segment to return IRQ_NONE:
> > >
> > >       if (r3 & (CHBEXT | CHBTxIP | CHBRxIP)) {
> > >               if (!ZS_IS_OPEN(uap_a)) {
> >
> > s/uap_a/uap_b/?
>
> Good catch... Let's see if that fixes it for Larry...


Yes, good catch by Andreas. That change does fix the problem.

Ben - Do you want to fix the typos for open/not open with the same patch?

Larry



Re: Regression in 32-bit ppc kernel

2012-04-28 Thread Benjamin Herrenschmidt
On Sat, 2012-04-28 at 18:17 -0500, Larry Finger wrote:
> Yes, good catch by Andreas. That change does fix the problem.
>
> Ben - Do you want to fix the typos for open/not open with the same
> patch?
 
Sure, if you're going to do a proper patch, by all means please fix
those too :-)

Does it fix all the occurrences of the problem for you ?

Cheers,
Ben.




Re: Regression in 32-bit ppc kernel

2012-04-28 Thread Larry Finger

On 04/28/2012 06:23 PM, Benjamin Herrenschmidt wrote:

> On Sat, 2012-04-28 at 18:17 -0500, Larry Finger wrote:
> > Yes, good catch by Andreas. That change does fix the problem.
> >
> > Ben - Do you want to fix the typos for open/not open with the same
> > patch?
>
> Sure, if you're going to do a proper patch, by all means please fix
> those too :-)
>
> Does it fix all the occurrences of the problem for you ?

Yes. After the patch is applied, no more "nobody cared" IRQ messages.

I will prepare the patch and send it to you.

Larry





Re: Regression in 32-bit ppc kernel

2012-04-27 Thread Larry Finger

On 04/25/2012 04:44 PM, Benjamin Herrenschmidt wrote:


> Do we know what the bad interrupt maps to ? Also what is the value of
> NR_IRQ and do you have SPARSE_IRQ enabled ? Can you try with the latter
> disabled and NR_IRQ set to something large, such as 128 ?
>
> (You may be able to check the interrupt mapping in debugfs)


Sorry, I was unable to find anything in debugfs to help me learn about interrupt 
mapping. The value of CONFIG_NR_IRQS is already 512. I have not tried reducing 
it to 128. The setting for CONFIG_SPARSE_IRQ was on, and changing it to off did 
not make any difference.


I finished the bisection, which led to

commit a79dd5ae5a8f49688d65b89a859f2b98a7ee5538
Author: Benjamin Herrenschmidt <b...@kernel.crashing.org>
Date:   Thu Dec 15 11:13:03 2011 +1100

    tty/serial/pmac_zilog: Fix suspend & resume

As this seemed to be an improbable result, I did the full test by checking out 
the previous commit (43ca5d3). That resulted in a good result. Then I used 
quilt to add commit a79dd5a as a patch and the fault returned. I then noticed 
that you said in the commit message that "I removed some code for handling
unexpected interrupt which should never be hit". It appears that my box does
indeed hit such an unexpected interrupt.


I could always get rid of the fault by disabling CONFIG_SERIAL_PMACZILOG, but I 
would like to fix the problem if possible.


Thanks,

Larry



Re: Regression in 32-bit ppc kernel

2012-04-27 Thread Benjamin Herrenschmidt
On Fri, 2012-04-27 at 10:38 -0500, Larry Finger wrote:

> Sorry, I was unable to find anything in debugfs to help me learn about
> interrupt mapping. The value of CONFIG_NR_IRQS is already 512. I have not
> tried reducing it to 128. The setting for CONFIG_SPARSE_IRQ was on, and
> changing it to off did not make any difference.
>
> I finished the bisection, which led to
>
> commit a79dd5ae5a8f49688d65b89a859f2b98a7ee5538
> Author: Benjamin Herrenschmidt <b...@kernel.crashing.org>
> Date:   Thu Dec 15 11:13:03 2011 +1100
>
>     tty/serial/pmac_zilog: Fix suspend & resume
>
> As this seemed to be an improbable result, I did the full test by checking
> out the previous commit (43ca5d3). That resulted in a good result. Then I
> used quilt to add commit a79dd5a as a patch and the fault returned. I then
> noticed that you said in the commit message that "I removed some code for
> handling unexpected interrupt which should never be hit". It appears that
> my box does indeed hit such an unexpected interrupt.
>
> I could always get rid of the fault by disabling CONFIG_SERIAL_PMACZILOG,
> but I would like to fix the problem if possible.

Right, it should be fixed. I need to understand where the unexpected
interrupt comes from. Can you tell me (or remind me) what specific
machine model you are using ? Are you putting the console on the serial
port ?

Cheers,
Ben.




Re: Regression in 32-bit ppc kernel

2012-04-27 Thread Larry Finger

On 04/27/2012 05:26 PM, Benjamin Herrenschmidt wrote:

> On Fri, 2012-04-27 at 10:38 -0500, Larry Finger wrote:
> > Sorry, I was unable to find anything in debugfs to help me learn about interrupt
> > mapping. The value of CONFIG_NR_IRQS is already 512. I have not tried reducing
> > it to 128. The setting for CONFIG_SPARSE_IRQ was on, and changing it to off did
> > not make any difference.
> >
> > I finished the bisection, which led to
> >
> > commit a79dd5ae5a8f49688d65b89a859f2b98a7ee5538
> > Author: Benjamin Herrenschmidt <b...@kernel.crashing.org>
> > Date:   Thu Dec 15 11:13:03 2011 +1100
> >
> >     tty/serial/pmac_zilog: Fix suspend & resume
> >
> > As this seemed to be an improbable result, I did the full test by checking out
> > the previous commit (43ca5d3). That resulted in a good result. Then I used
> > quilt to add commit a79dd5a as a patch and the fault returned. I then noticed
> > that you said in the commit message that "I removed some code for handling
> > unexpected interrupt which should never be hit". It appears that my box does
> > indeed hit such an unexpected interrupt.
> >
> > I could always get rid of the fault by disabling CONFIG_SERIAL_PMACZILOG, but I
> > would like to fix the problem if possible.
>
> Right, it should be fixed. I need to understand where the unexpected
> interrupt comes from. Can you tell me (or remind me) what specific
> machine model you are using ? Are you putting the console on the serial
> port ?


It is a 15" Powerbook G4. I think they call it a Titanium. The console is not on
a serial port. In fact, the reason that I did not think this patch was a problem 
is because the serial port does not appear to be connected to an external port. 
I was unaware that there was a serial port on the motherboard. There is a modem 
jack, but no 9 or 25-pin connectors that would indicate a standard serial port.


There are two stack dumps with the same trace. I posted the first, but the 
second is preceded by the lines


[c02adca0] pmz_interrupt
Disabling IRQ #23
ttyPZ1: IrDA setup for 57600 bps, dongle version: 4
ttyPZ1: IrDA setup for 115200 bps, dongle version: 4
irq23: nobody cared (try booting with the irqpoll option

As I am not sure how to put options in with yaboot, I have not tried that.

Larry




Re: Regression in 32-bit ppc kernel

2012-04-27 Thread Benjamin Herrenschmidt
On Fri, 2012-04-27 at 19:02 -0500, Larry Finger wrote:

> It is a 15" Powerbook G4. I think they call it a Titanium. The console is
> not on a serial port. In fact, the reason that I did not think this patch
> was a problem is because the serial port does not appear to be connected to
> an external port. I was unaware that there was a serial port on the
> motherboard. There is a modem jack, but no 9 or 25-pin connectors that
> would indicate a standard serial port.
>
> There are two stack dumps with the same trace. I posted the first, but the
> second is preceded by the lines
>
> [c02adca0] pmz_interrupt
> Disabling IRQ #23
> ttyPZ1: IrDA setup for 57600 bps, dongle version: 4
> ttyPZ1: IrDA setup for 115200 bps, dongle version: 4
> irq23: nobody cared (try booting with the irqpoll option
>
> As I am not sure how to put options in with yaboot, I have not tried that.

Ok, so you do have a serial port, probably two even :-) One of them is
connected to the infra red transceiver and the other one is probably
connected to the internal modem.

(The modem itself might not use it, some of these machines use an
i2s/i2c modem, some use a usb modem, but the serial port is wired to the
connector regardless).

Cheers,
Ben.



Re: Regression in 32-bit ppc kernel

2012-04-25 Thread Larry Finger

On 04/24/2012 11:11 PM, Benjamin Herrenschmidt wrote:

> On Tue, 2012-04-24 at 21:37 -0500, Larry Finger wrote:
> > > > Somewhere between v3.2 and v3.3, the kernel in my Powerbook G4
> > > > started issuing the following traceback on bootup:
> > >
> > > Does it continue working afterward or not at all ?
> > >
> > > Are you using the old IDE driver or the newer libata based
> > > pata_macio ?
> >
> > Yes, it finishes the boot, and appears to work correctly. If a device
> > is missing, I do not know what it is.
> >
> > I think I am using the old IDE driver.
>
> Interesting. Does it make a difference if you switch to pata_macio ?


After a few tries, I managed to change over to pata_macio. Fortunately, most of 
the system used dev-by-id or UUID, thus most of the process was getting all the 
kernel pieces built in.


Unfortunately, the original problem remains. I have resumed the bisecting - only 
11 steps to go. I should have it by Friday! :)


Larry


Re: Regression in 32-bit ppc kernel

2012-04-25 Thread Benjamin Herrenschmidt
On Wed, 2012-04-25 at 10:00 -0500, Larry Finger wrote:
 
> After a few tries, I managed to change over to pata_macio. Fortunately,
> most of the system used dev-by-id or UUID, thus most of the process was
> getting all the kernel pieces built in.
>
> Unfortunately, the original problem remains. I have resumed the
> bisecting - only 11 steps to go. I should have it by Friday! :)
 
Thanks !

Do we know what the bad interrupt maps to ? Also what is the value of
NR_IRQ and do you have SPARSE_IRQ enabled ? Can you try with the latter
disabled and NR_IRQ set to something large, such as 128 ?

(You may be able to check the interrupt mapping in debugfs)

Cheers,
Ben.




Regression in 32-bit ppc kernel

2012-04-24 Thread Larry Finger

Hi,

Somewhere between v3.2 and v3.3, the kernel in my Powerbook G4 started issuing 
the following traceback on bootup:


[   40.264006] irq 23: nobody cared (try booting with the "irqpoll" option)
[   40.264031] Call Trace:
[   40.264070] [dfff3f00] [c000984c] show_stack+0x7c/0x194 (unreliable)
[   40.264102] [dfff3f40] [c00a6840] __report_bad_irq+0x44/0xf4
[   40.264119] [dfff3f60] [c00a6adc] note_interrupt+0x1ec/0x2ac
[   40.264135] [dfff3f80] [c00a48a8] handle_irq_event_percpu+0x250/0x2b8
[   40.264152] [dfff3fd0] [c00a4944] handle_irq_event+0x34/0x54
[   40.264169] [dfff3fe0] [c00a7514] handle_fasteoi_irq+0xb4/0x124
[   40.264192] [dfff3ff0] [c000f5bc] call_handle_irq+0x18/0x28
[   40.264208] [dec85ce0] [c000719c] do_IRQ+0x114/0x1cc
[   40.264226] [dec85d10] [c0015868] ret_from_except+0x0/0x1c
[   40.264254] --- Exception: 501 at find_vma+0x10/0x80
[   40.264259] LR = do_page_fault+0x26c/0x6ac
[   40.264272] [dec85dd0] [c03f0128] do_page_fault+0x25c/0x6ac (unreliable)
[   40.264289] [dec85f40] [c00155e4] handle_page_fault+0xc/0x80
[   40.264327] --- Exception: 301 at 0x4800a174

The problem still exists in v3.4-rc3. I am currently doing a bisection of this 
problem, but it will take a long time to complete.


Note: IRQ 23 is not active in v3.2.
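
For context on the "nobody cared" line, here is a simplified model of the spurious-interrupt accounting behind it. It is a sketch from memory of kernel/irq/spurious.c in kernels of that period, not a copy: the thresholds of 100000 and 99900 are assumptions to check against the source, struct irq_stats and note_interrupt_model() are hypothetical names, and the real code also applies a time-based rule before counting an interrupt as unhandled.

```c
#include <stdbool.h>

/* Assumed thresholds: evaluate every 100000 interrupts and give up on the
 * line if more than 99900 of them went unclaimed by any handler. */
#define IRQ_WINDOW      100000u
#define UNHANDLED_LIMIT 99900u

/* Hypothetical per-line bookkeeping. */
struct irq_stats {
	unsigned int count;     /* interrupts seen in the current window */
	unsigned int unhandled; /* of those, how many returned IRQ_NONE */
	bool disabled;          /* set once the line has been shut off */
};

/* Record one interrupt; returns true if this call disabled the line. */
static bool note_interrupt_model(struct irq_stats *s, bool handled)
{
	if (s->disabled)
		return false;
	if (!handled)
		s->unhandled++;
	if (++s->count < IRQ_WINDOW)
		return false;           /* window not complete yet */

	s->count = 0;
	if (s->unhandled > UNHANDLED_LIMIT) {
		s->disabled = true;     /* kernel reports and disables here */
		return true;
	}
	s->unhandled = 0;
	return false;
}
```

Under this model, a handler that returns IRQ_NONE for essentially every interrupt on a line gets that line disabled at the end of the first window, which is consistent with the message appearing once during boot.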

Thanks,

Larry



Re: Regression in 32-bit ppc kernel

2012-04-24 Thread Benjamin Herrenschmidt
On Tue, 2012-04-24 at 17:58 -0500, Larry Finger wrote:
> Hi,
>
> Somewhere between v3.2 and v3.3, the kernel in my Powerbook G4 started
> issuing the following traceback on bootup:

Does it continue working afterward or not at all ?

Are you using the old IDE driver or the newer libata based pata_macio ?

Cheers,
Ben.

> [   40.264006] irq 23: nobody cared (try booting with the "irqpoll" option)
> [   40.264031] Call Trace:
> [   40.264070] [dfff3f00] [c000984c] show_stack+0x7c/0x194 (unreliable)
> [   40.264102] [dfff3f40] [c00a6840] __report_bad_irq+0x44/0xf4
> [   40.264119] [dfff3f60] [c00a6adc] note_interrupt+0x1ec/0x2ac
> [   40.264135] [dfff3f80] [c00a48a8] handle_irq_event_percpu+0x250/0x2b8
> [   40.264152] [dfff3fd0] [c00a4944] handle_irq_event+0x34/0x54
> [   40.264169] [dfff3fe0] [c00a7514] handle_fasteoi_irq+0xb4/0x124
> [   40.264192] [dfff3ff0] [c000f5bc] call_handle_irq+0x18/0x28
> [   40.264208] [dec85ce0] [c000719c] do_IRQ+0x114/0x1cc
> [   40.264226] [dec85d10] [c0015868] ret_from_except+0x0/0x1c
> [   40.264254] --- Exception: 501 at find_vma+0x10/0x80
> [   40.264259] LR = do_page_fault+0x26c/0x6ac
> [   40.264272] [dec85dd0] [c03f0128] do_page_fault+0x25c/0x6ac (unreliable)
> [   40.264289] [dec85f40] [c00155e4] handle_page_fault+0xc/0x80
> [   40.264327] --- Exception: 301 at 0x4800a174
>
> The problem still exists in v3.4-rc3. I am currently doing a bisection of
> this problem, but it will take a long time to complete.
>
> Note: IRQ 23 is not active in v3.2.
>
> Thanks,
>
> Larry




Re: Regression in 32-bit ppc kernel

2012-04-24 Thread Larry Finger

On 04/24/2012 06:53 PM, Benjamin Herrenschmidt wrote:

> On Tue, 2012-04-24 at 17:58 -0500, Larry Finger wrote:
> > Hi,
> >
> > Somewhere between v3.2 and v3.3, the kernel in my Powerbook G4 started issuing
> > the following traceback on bootup:
>
> Does it continue working afterward or not at all ?
>
> Are you using the old IDE driver or the newer libata based pata_macio ?


Yes, it finishes the boot, and appears to work correctly. If a device is 
missing, I do not know what it is.


I think I am using the old IDE driver.

Larry




Re: Regression in 32-bit ppc kernel

2012-04-24 Thread Benjamin Herrenschmidt
On Tue, 2012-04-24 at 21:37 -0500, Larry Finger wrote:
> > > Somewhere between v3.2 and v3.3, the kernel in my Powerbook G4
> > > started issuing the following traceback on bootup:
> >
> > Does it continue working afterward or not at all ?
> >
> > Are you using the old IDE driver or the newer libata based
> > pata_macio ?
>
> Yes, it finishes the boot, and appears to work correctly. If a device
> is missing, I do not know what it is.
>
> I think I am using the old IDE driver.
 
Interesting. Does it make a difference if you switch to pata_macio ?

Cheers,
Ben.

