Re: [PATCH] aacraid: [Fastboot] Panics for AACRAID driver during 'insmod' for kexec test [take 4]

2007-04-03 Thread Judith Lebzelter
Hi Mark,

I was going to try and test this patch rather than the last, but I am getting 
this compile error again where line 640 is the beginning of function 
aac_rx_init():

CC [M]  drivers/scsi/aacraid/rx.o
drivers/scsi/aacraid/rx.c: In function '_aac_rx_init':
drivers/scsi/aacraid/rx.c:640: warning: ISO C90 forbids mixed declarations and code
drivers/scsi/aacraid/rx.c:649: error: expected declaration or statement at end of input
drivers/scsi/aacraid/rx.c:649: warning: control reaches end of non-void function
make[3]: *** [drivers/scsi/aacraid/rx.o] Error 1
make[2]: *** [drivers/scsi/aacraid] Error 2
make[1]: *** [drivers/scsi] Error 2
make: *** [drivers] Error 2

I applied it to the scsi-misc tree I pulled yesterday after removing the old 
patch. 

Judith


On Tue, Apr 03, 2007 at 11:58:17AM -0400, Salyzyn, Mark wrote:
 I will do you one better, James, I will slip in a little cleanup in sa.c 
 (support file for the old PPC based ARC cards) where I discovered the restart 
 platform function was ALSO left unset which could result in similar pain of 
 null pointer discovery.
 
 Please note: The issue Judith ran into, where the card took longer than 3 
 minutes to initialize because of a problem drive may require the extension of 
 the timeout to address (insmod parameter aacraid.startup_timeout=540 may do 
 the trick). Extending the timeout may have been a fact of life given that the 
 restart of the adapter normally occurs on BIOS load long before the driver 
 instantiates settling the problem drives; if this is the case a small and 
 lower priority follow-up hardening patch can help the users that find adding 
 the insmod parameter repugnant in order to support kexec and kdump in the 
 face of problem drives. Problem drives may have led to the need to get a 
 kernel dump ...
 
 You will find enclosed the pristine patch based on the initial patch, 
 dropping the static function, and adding the three missing platform function 
 initializations.
 
 Attached is the patch I feel will address this interrupt issue. As an added 
 'perk' I have also added the code to detect if the controller was previously 
 initialized for interrupted operations by ANY operating system should the 
 reset_devices kernel parameter not be set and we are dealing with a naïve 
 kexec without the addition of this kernel parameter. The reset handler is 
 also improved. Related to reset operations, but not pertinent specifically to 
 this issue, I have also altered the handling somewhat so that we reset the 
 adapter if we feel it is taking too long (three minutes) to start up.
 
 ObligatoryDisclaimer: Please accept my condolences regarding Outlook's 
 handling of patches.
 
 This attached patch is against current scsi-misc-2.6 MINUS the initial 
 version of this patch and the first patch that sets the missing platform 
 function related to this discussion.
  
 Signed-off-by: Mark Salyzyn [EMAIL PROTECTED]
 
 ---
 
 Sincerely -- Mark Salyzyn
 
  -----Original Message-----
  From: James Bottomley [mailto:[EMAIL PROTECTED] 
  Sent: Tuesday, April 03, 2007 10:52 AM
  To: Salyzyn, Mark
  Cc: Judith Lebzelter; [EMAIL PROTECTED]
  Subject: RE: [PATCH] aacraid: [Fastboot] Panics for AACRAID driver during
  'insmod' for kexec test.
  
  
  On Tue, 2007-04-03 at 09:30 -0400, Salyzyn, Mark wrote:
   0x48 status code means the Firmware is trying to boot the Kernel. This
   phase is most likely blocked because of the hard drive failure as you
   suspected; the kernel is not declared up and running until after the
   drives have spun up, and a problem drive could be tricking the Firmware
   into a recovery loop holding things back ...
  
  I'm constructing what I hope will be the last pre 2.6.21 merge tree ...
  do you have a clean patch with the two necessary fixes for the panic you
  can send to the list?
  
  James

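The startup handling Mark describes above (wait up to a timeout for the firmware, reset the adapter if it takes too long) can be sketched roughly as follows. This is a minimal userspace illustration, not the aacraid code: struct fake_fw, fw_ready(), wait_for_fw() and start_adapter() are hypothetical names, and the firmware is modeled as a simple counter. Only the three-minute default and the 540-second alternative come from the thread.

```c
#include <stdbool.h>

#define STARTUP_TIMEOUT_SECS 180  /* three minutes, as described in the thread */

/* Hypothetical firmware model: it becomes ready once `elapsed` seconds
 * since power-on (or the last reset) reach `ready_after`. */
struct fake_fw {
	int elapsed;      /* seconds since start or last reset */
	int ready_after;  /* seconds the firmware needs to come up */
	int resets;       /* number of resets issued so far */
};

static bool fw_ready(const struct fake_fw *fw)
{
	return fw->elapsed >= fw->ready_after;
}

/* Poll once per simulated second, for at most `timeout` seconds. */
static bool wait_for_fw(struct fake_fw *fw, int timeout)
{
	for (int t = 0; t < timeout; t++) {
		if (fw_ready(fw))
			return true;
		fw->elapsed++;  /* stands in for sleeping one second */
	}
	return fw_ready(fw);
}

/* Wait once; on timeout, reset the adapter and wait one more time.
 * `ready_after_reset` models how long the firmware needs after a reset. */
static bool start_adapter(struct fake_fw *fw, int timeout, int ready_after_reset)
{
	if (wait_for_fw(fw, timeout))
		return true;
	fw->resets++;
	fw->elapsed = 0;
	fw->ready_after = ready_after_reset;
	return wait_for_fw(fw, timeout);
}
```

In this model a firmware that needs 400 seconds fails the 180-second wait and is reset, but comes up without a reset when the timeout is raised to 540, which is the trade-off discussed for problem drives.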

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] aacraid: [Fastboot] Panics for AACRAID driver during 'insmod' for kexec test.

2007-03-30 Thread Judith Lebzelter
On Fri, Mar 30, 2007 at 10:30:48AM -0400, Salyzyn, Mark wrote:
 Thanks for the info.
 
 Attached is the patch I feel will address this issue. As an added 'perk' I 
 have also added the code to detect if the controller was previously 
 initialized for interrupted operations by ANY operating system should the 
 reset_devices kernel parameter not be set and we are dealing with a naïve 
 kexec without the addition of this kernel parameter. The reset handler is 
 also improved. Related to reset operations, but not pertinent specifically to 
 this issue, I have also altered the handling somewhat so that we reset the 
 adapter if we feel it is taking too long (three minutes) to start up.
 
 We have not unit tested the reset_devices flag propagation to this driver 
 code, nor have we unit tested the check for the interrupted operations under 
 the conditions of a naively issued kexec. We are submitting this modified 
 driver to our Q/A department for integration testing in our current programs. 
 I would appreciate an ACK to this patch should it resolve the issue described 
 in this thread...
 

Mark; 

I am getting an error applying this patch:

-bash-3.1# patch -p1 < ../../aacraid_kexec.patch
patching file drivers/scsi/aacraid/rx.c
patch:  malformed patch at line 36: @@ -526,6 +529,7 @@

Do you think you could regenerate it?

Thanks;
Judith

 ObligatoryDisclaimer: Please accept my condolences regarding Outlook's 
 handling of patches.
 
 This attached patch is against current scsi-misc-2.6
  
 Signed-off-by: Mark Salyzyn [EMAIL PROTECTED]
 
 ---
 
 Sincerely -- Mark Salyzyn
 
  -----Original Message-----
  From: Vivek Goyal [mailto:[EMAIL PROTECTED] 
  Sent: Friday, March 30, 2007 2:06 AM
  To: Salyzyn, Mark
  Cc: Judith Lebzelter; linux-scsi@vger.kernel.org; AACRAID; 
  fastboot@lists.osdl.org
  Subject: Re: [Fastboot] Panics for AACRAID driver during 
  'insmod' for kexec test.
  
  
  On Thu, Mar 29, 2007 at 10:17:18AM -0400, Salyzyn, Mark wrote:
   I have been working on a patch to the driver to do just this, reset the
   adapter during init if necessary. We want to limit the adapter's reset
   as it takes time (an additional 45 seconds or longer) for the Firmware
   to cycle... I will bump the priority of the testing for this patch.
   
  Hi,
  
  Thanks for looking into this. You can make device reset conditional. Now
  one command line parameter reset_devices has been defined for the kernel.
  You can reset the device only if the user has passed reset_devices command
  line option otherwise you can continue to boot normally.
  
  I have introduced this parameter to handle the concern that in normal
  BIOS boot total boot time will increase.
  
  kexec/kdump will pass this parameter to second kernel so that device will
  be reset during initialization and normal BIOS boot will remain unaffected.
  
  Thanks
  Vivek
  

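The conditional reset discussed in this message can be sketched in plain C. This is a hedged illustration only: reset_devices here is an ordinary variable standing in for the kernel flag of that name, and struct fake_adapter, its interrupts_enabled field, adapter_needs_reset() and adapter_init() are hypothetical names, not the driver's. The second condition mirrors Mark's description of detecting a controller that an earlier kernel left initialized for interrupt operation.

```c
#include <stdbool.h>

/* Stand-in for the kernel's reset_devices flag (normally set by the
 * "reset_devices" boot parameter); a plain variable in this sketch. */
static int reset_devices;

/* Hypothetical adapter state, for illustration only. */
struct fake_adapter {
	bool interrupts_enabled;  /* left configured by a previous kernel */
	int  resets;              /* number of resets performed */
};

/* Reset only when asked to, or when the adapter looks as if a previous
 * kernel already initialized it (the "naive kexec" case). */
static bool adapter_needs_reset(const struct fake_adapter *a)
{
	return reset_devices != 0 || a->interrupts_enabled;
}

static void adapter_init(struct fake_adapter *a)
{
	if (adapter_needs_reset(a)) {
		a->resets++;
		a->interrupts_enabled = false;
	}
	/* normal initialization would continue here */
}
```

With reset_devices clear and a cold adapter, adapter_init() skips the reset, so a normal BIOS boot does not pay the extra firmware cycle time.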



Re: [PATCH] aacraid: [Fastboot] Panics for AACRAID driver during 'insmod' for kexec test.

2007-03-30 Thread Judith Lebzelter
On Fri, Mar 30, 2007 at 01:21:33PM -0400, Salyzyn, Mark wrote:
 Resending patch file.
 
 I looked at the submission that showed on the list, and the original email, 
 and a blank line dropped away at line 20 of the patch (!)
 
 Dunno, hope this comes through this second time. But if not, please add the 
 blank line as referenced.
 

Now I got this error which does not seem to be the result of the missing line:

Hunk #3 FAILED at 529.
Hunk #4 succeeded at 541 (offset 3 lines).
Hunk #5 FAILED at 576.


I tried manually editing in those two hunks and got an error on compile:

CC [M]  drivers/scsi/aacraid/rx.o
drivers/scsi/aacraid/rx.c: In function '_aac_rx_init':
drivers/scsi/aacraid/rx.c:641: warning: ISO C90 forbids mixed declarations and code
drivers/scsi/aacraid/rx.c:650: error: expected declaration or statement at end of input
drivers/scsi/aacraid/rx.c:650: warning: control reaches end of non-void function
make[3]: *** [drivers/scsi/aacraid/rx.o] Error 1
make[2]: *** [drivers/scsi/aacraid] Error 2
make[1]: *** [drivers/scsi] Error 2
make: *** [drivers] Error 2

I am pretty sure that I pasted okay, it is not that big a hunk and 
I tried it twice.  Are you sure that the git tree you used is up to date?  
I am not sure why this is failing; it doesn't look off.  Line 641 is actually
the start of the next function aac_rx_init(), not _aac_rx_init().

Judith

 




Panics for AACRAID driver during 'insmod' for kexec test.

2007-03-28 Thread Judith Lebzelter
Hello, 

I have been running a series of kexec tests using LKDTT on the 
aacraid driver on this card (ASR-4805SAS (Marauder-E)) on x86_64
using the latest top of the scsi-misc git tree (as of yesterday), and 
I have found that it is not coming up consistently when booted 
through kexec.

I have included 4 different types of failures I found here because 
I assume they might be related, and thought maybe there could 
be an issue with the card's state on reboot (through kexec).

The most common problem is this oops/panic, which has happened 
with various types of crash points (6 times out of 40):

Loading aacraid.Adaptec aacraid driver (1.1-5[2437]-mh4)
ko module
ACPI: PCI Interrupt :03:0e.0[A] - Link [LNKC] - GSI 3 (level, low) - IRQ 3
general protection fault:  [1] 
CPU 0 
Modules linked in: aacraid
Pid: 0, comm: swapper Not tainted 2.6.21-rc3-kdump #1
RIP: 0010:[88008a99]  [88008a99] :aacraid:aac_intr_normal+0x17a/0x1b1
RSP: :81523ed8  EFLAGS: 00010006
RAX: 810004102000 RBX: 8100014f01e0 RCX: 0086
RDX: 810004041540 RSI: 8100014f01e0 RDI: 
RBP: 810004702cd8 R08: a6037e6c R09: 0016001562d7
R10: 0023 R11:  R12: 0011
R13: 810004702cd8 R14: 810004001400 R15: 
FS:  () GS:814d5000() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 006ba5a0 CR3: 0474d000 CR4: 06e0
Process swapper (pid: 0, threadinfo 814e4000, task 81470360)
Stack:  0011 810004702cd8 0100 0003
 0001 88009470  810004041540
 814d5080 810428f4  814d5080
Call Trace:
 IRQ  [88009470] :aacraid:aac_rx_intr_message+0x2c/0x60
 [810428f4] note_interrupt+0xd3/0x1db
 [8104319b] handle_level_irq+0x7e/0xab
 [8100b0b1] do_IRQ+0xd7/0x132
 [810085a1] mwait_idle+0x0/0x43
 [81009651] ret_from_intr+0x0/0xa
 EOI  [810085e0] mwait_idle+0x3f/0x43
 [81008540] cpu_idle+0x3d/0x5c
 [814e78d2] start_kernel+0x28f/0x29b
 [814e7140] _sinittext+0x140/0x144

Code: ff 53 38 eb 20 9c 58 fa 83 7b 30 00 75 07 c7 43 30 01 00 00 
RIP  [88008a99] :aacraid:aac_intr_normal+0x17a/0x1b1
Kernel panic - not syncing: Aiee, killing interrupt handler!
 

Another failure: for crash point 'TIMERADD-bug' I got this error 
during insmod:

Loading aacraid.Adaptec aacraid driver (1.1-5[2437]-mh4)
ko module
ACPI: PCI Interrupt :03:0e.0[A] - Link [LNKC] - GSI 3 (level, low) - IRQ 3
input: ImExPS/2 Generic Explorer Mouse as /class/input/input3
aacraid: aac_fib_send: adapter blinkLED 0xc2.
Usually a result of a serious unrecoverable hardware problem
aac_fib_free, XferState != 0, fibptr = 0x8100014f, XferState = 0x810ad
aacraid: probe of :03:0e.0 failed with error -14


Yet another Failure: for crash point 'TIMERADD-panic' I got this error 
during insmod:

Loading aacraid.Adaptec aacraid driver (1.1-5[2437]-mh4)
ko module
ACPI: PCI Interrupt :03:0e.0[A] - Link [LNKC] - GSI 3 (level, low) - IRQ 3
input: ImExPS/2 Generic Explorer Mouse as /class/input/input3
BUG: soft lockup detected on CPU#0!

Call Trace:
 IRQ  [8102bcbb] update_process_times+0x3b/0x5f
 [8100bebf] main_timer_handler+0x2f/0x1ae
 [8102b504] run_timer_softirq+0x14/0x161
 [8100c050] timer_interrupt+0x12/0x27
 [81041f9c] handle_IRQ_event+0x25/0x53
 [81028c1b] __do_softirq+0x46/0x90
 [81043186] handle_level_irq+0x69/0xab
 [8100b0b1] do_IRQ+0xd7/0x132
 [81009651] ret_from_intr+0x0/0xa
 EOI  [811229ed] __delay+0x8/0x10
 [88007c68] :aacraid:aac_fib_send+0x1ba/0x234
 [880048aa] :aacraid:aac_get_adapter_info+0x76/0x536
 [88002bb3] :aacraid:aac_probe_one+0x236/0x457
 [8112bd6d] pci_device_probe+0x4c/0x75
 [8117d0da] really_probe+0xc4/0x148
 [8117d30b] __driver_attach+0x6d/0xab
 [8117d29e] __driver_attach+0x0/0xab
 [8117d29e] __driver_attach+0x0/0xab
 [8117c5b2] bus_for_each_dev+0x43/0x6e
 [8117c8f4] bus_add_driver+0x6b/0x18d
 [8112bf0b] __pci_register_driver+0x72/0xa7
 [8801203a] :aacraid:aac_init+0x3a/0x75
 [8103bafc] sys_init_module+0x1195/0x12e6
 [8100913e] system_call+0x7e/0x83

BUG: soft lockup detected on CPU#0!

One last error I got for INT_TASKLET_ENTRY-exception was this
after the filesystem is mounted and I am copying the vmcore 
file to it:

Copying the dump
aacraid: Host adapter abort request (4,0,0,0)
aacraid: Host adapter abort request (4,0,0,0)
aacraid: Host adapter reset request. SCSI hang 

[ PATCH ] mptsas: Fix oops during driver load time (rev 2)

2007-03-12 Thread Judith Lebzelter
This fixes an oops during driver load time.   

mptsas_probe calls mpt_attach (over in mptbase.c).  Inside that 
call, we read some manufacturing config pages to set up some 
defaults.  While reading the config pages, the firmware doesn't 
complete the reply in time, and we have a timeout. The timeout 
results in the hardreset handler being called.  The hardreset 
handler calls all the fusion upper layer driver reset callback 
handlers.  The mptsas_ioc_reset function is the callback handler 
in mptsas.c.   In summary, mptsas_ioc_reset is getting called 
before scsi_host_alloc is called, and the pointer ioc->sh is 
NULL, as well as the hostdata.

Signed-off-by:  Judith Lebzelter [EMAIL PROTECTED]

---
Sorry I was not more descriptive.  Here is the patch with Eric's 
description as requested.

Index: linux-2.6.21-rc3/drivers/message/fusion/mptsas.c
===================================================================
--- linux-2.6.21-rc3.orig/drivers/message/fusion/mptsas.c
+++ linux-2.6.21-rc3/drivers/message/fusion/mptsas.c
@@ -815,7 +815,7 @@ mptsas_taskmgmt_complete(MPT_ADAPTER *io
 static int
 mptsas_ioc_reset(MPT_ADAPTER *ioc, int reset_phase)
 {
-	MPT_SCSI_HOST	*hd = (MPT_SCSI_HOST *)ioc->sh->hostdata;
+	MPT_SCSI_HOST	*hd;
 	struct mptsas_target_reset_event *target_reset_list, *n;
 	int rc;
 
@@ -827,7 +827,10 @@ mptsas_ioc_reset(MPT_ADAPTER *ioc, int r
 	if (reset_phase != MPT_IOC_POST_RESET)
 		goto out;
 
-	if (!hd || !hd->ioc)
+	if (!ioc->sh || !ioc->sh->hostdata)
+		goto out;
+	hd = (MPT_SCSI_HOST *)ioc->sh->hostdata;
+	if (!hd->ioc)
 		goto out;
 
 	if (list_empty(&hd->target_reset_list))

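The ordering problem this patch guards against (a reset callback running before the host structure has been allocated) can be reproduced in a small standalone sketch. The types and the function below are hypothetical stand-ins, not the fusion driver's definitions; they only mirror the pointer chain ioc, sh, hostdata named in the description, and the check order follows the patched code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures named in the patch. */
struct fake_hostdata { void *ioc; };
struct fake_host     { struct fake_hostdata *hostdata; };
struct fake_ioc      { struct fake_host *sh; };

/* Returns 1 if the callback would go on to do real work,
 * 0 if it bails out early because the host is not set up yet. */
static int reset_callback(struct fake_ioc *ioc)
{
	struct fake_hostdata *hd;

	/* Check each pointer before following it; the pre-patch code
	 * dereferenced ioc->sh in the declaration, before any check. */
	if (!ioc->sh || !ioc->sh->hostdata)
		return 0;
	hd = ioc->sh->hostdata;
	if (!hd->ioc)
		return 0;
	return 1;
}
```

Calling reset_callback() on an ioc whose sh is still NULL returns 0 instead of faulting, which is the early-timeout case described for driver load.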



[ PATCH ] mptsas: Fix oops for insmod during kexec

2007-03-09 Thread Judith Lebzelter
Hello,

This patch is to fix an oops on insmod for mptsas during kexec.
This applies to 2.6.21-rc3.

Signed-off-by:  Judith Lebzelter [EMAIL PROTECTED]

---


Index: linux-2.6.21-rc3/drivers/message/fusion/mptsas.c
===================================================================
--- linux-2.6.21-rc3.orig/drivers/message/fusion/mptsas.c
+++ linux-2.6.21-rc3/drivers/message/fusion/mptsas.c
@@ -815,7 +815,7 @@ mptsas_taskmgmt_complete(MPT_ADAPTER *io
 static int
 mptsas_ioc_reset(MPT_ADAPTER *ioc, int reset_phase)
 {
-	MPT_SCSI_HOST	*hd = (MPT_SCSI_HOST *)ioc->sh->hostdata;
+	MPT_SCSI_HOST	*hd;
 	struct mptsas_target_reset_event *target_reset_list, *n;
 	int rc;
 
@@ -827,7 +827,10 @@ mptsas_ioc_reset(MPT_ADAPTER *ioc, int r
 	if (reset_phase != MPT_IOC_POST_RESET)
 		goto out;
 
-	if (!hd || !hd->ioc)
+	if (!ioc->sh || !ioc->sh->hostdata)
+		goto out;
+	hd = (MPT_SCSI_HOST *)ioc->sh->hostdata;
+	if (!hd->ioc)
 		goto out;
 
 	if (list_empty(&hd->target_reset_list))
