Re: [PATCH] improve bsg device allocation

2007-03-28 Thread Jens Axboe
On Thu, Mar 22 2007, FUJITA Tomonori wrote:
 This patch addresses two issues in bsg device allocation.
 
 - The current maximum number of bsg devices is 256. That is too small if
 we allocate bsg devices to all SCSI devices, transport entities, etc.
 This increases the maximum number to 32768 (taken from the sg driver).
 
 - SCSI devices are dynamically added and removed. Currently, bsg can't
 handle this well, since bsg_device->minor is simply incremented.
 
 This is dependent on the patchset that I posted yesterday:
 
 http://marc.info/?l=linux-scsi&m=117440208726755&w=2

applied
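
For readers following the minor-number discussion above, here is a minimal user-space sketch of reusing freed minors instead of only incrementing a counter. It is not the code from the patch: every demo_* name is hypothetical, and a plain bitmap scan is assumed purely for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical names for illustration; not the bsg driver's identifiers. */
#define DEMO_MAX_MINORS 32768
#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS (DEMO_MAX_MINORS / DEMO_BITS_PER_WORD)

/* One bit per minor: set means "in use". */
static unsigned long demo_minor_map[DEMO_WORDS];

/* Return the lowest free minor and mark it used, or -1 if all are taken. */
static int demo_minor_alloc(void)
{
	for (size_t w = 0; w < DEMO_WORDS; w++) {
		if (demo_minor_map[w] == ULONG_MAX)
			continue;	/* every minor in this word is in use */
		for (size_t b = 0; b < DEMO_BITS_PER_WORD; b++) {
			if (!(demo_minor_map[w] & (1UL << b))) {
				demo_minor_map[w] |= 1UL << b;
				return (int)(w * DEMO_BITS_PER_WORD + b);
			}
		}
	}
	return -1;
}

/* Release a minor so that a later allocation can reuse it. */
static void demo_minor_free(int minor)
{
	assert(minor >= 0 && minor < DEMO_MAX_MINORS);
	demo_minor_map[(size_t)minor / DEMO_BITS_PER_WORD] &=
		~(1UL << ((size_t)minor % DEMO_BITS_PER_WORD));
}
```

With this scheme, a device that is removed and later re-added can get its old minor back, which a simple incrementing counter cannot do.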

-- 
Jens Axboe

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch] zfcp: fix initialization of FSF timer

2007-03-28 Thread Christof Schmitt
From: Christof Schmitt [EMAIL PROTECTED]

Correctly initialize the timer for FSF requests with jiffies + timeout.

Cc: Swen Schillig [EMAIL PROTECTED]
Acked-by: Heiko Carstens [EMAIL PROTECTED]
Signed-off-by: Christof Schmitt [EMAIL PROTECTED]
---

Please consider this patch for 2.6.21.

 drivers/s390/scsi/zfcp_erp.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux/drivers/s390/scsi/zfcp_erp.c.orig 2007-03-07 09:53:22.0 +0100
+++ linux/drivers/s390/scsi/zfcp_erp.c  2007-03-07 09:53:35.0 +0100
@@ -186,7 +186,7 @@
 {
 	fsf_req->timer.function = zfcp_fsf_request_timeout_handler;
 	fsf_req->timer.data = (unsigned long) fsf_req->adapter;
-	fsf_req->timer.expires = timeout;
+	fsf_req->timer.expires = jiffies + timeout;
 	add_timer(&fsf_req->timer);
 }
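
The fix above relies on timer expiry being an absolute tick count rather than a relative delay. A minimal user-space sketch of that distinction follows; demo_jiffies and the demo_* helpers are stand-ins invented for this example, and only the comparison idiom mirrors the kernel's time_after().

```c
#include <assert.h>
#include <limits.h>

/* User-space stand-in for the kernel's jiffies counter (assumption). */
static unsigned long demo_jiffies;

/* Wraparound-safe "a is after b", the same idea as the kernel's time_after(). */
static int demo_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Convert a relative timeout into the absolute expiry a timer expects. */
static unsigned long demo_expires(unsigned long timeout)
{
	/* The signed comparison only works for deltas below half the range. */
	assert(timeout <= (unsigned long)LONG_MAX);
	return demo_jiffies + timeout;
}

/* A timer has expired once the current tick count reaches its expiry. */
static int demo_timer_expired(unsigned long expires)
{
	return !demo_time_after(expires, demo_jiffies);
}
```

Passing the bare timeout as the expiry makes the timer look expired as soon as the tick counter exceeds that small value, which is the behaviour the patch corrects.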
 


Re: [PATCH 1/3] add a request_queue argument to scsi_cmd_ioctl()

2007-03-28 Thread Jens Axboe
On Tue, Mar 20 2007, FUJITA Tomonori wrote:
 bsg uses scsi_cmd_ioctl() for some SCSI/sg ioctl
 commands. scsi_cmd_ioctl() gets a request queue from a gendisk
 argument. This prevents bsg from being bound to SCSI devices that don't
 have a gendisk (like OSD). This adds a request_queue argument to
 scsi_cmd_ioctl(). The SCSI/sg ioctl commands don't use a gendisk, so
 it's safe for any SCSI device to use scsi_cmd_ioctl().

applied all 3


-- 
Jens Axboe



Re: Disabling block layer

2007-03-28 Thread Jens Axboe
On Mon, Mar 26 2007, Mark Lobo wrote:
 Hello!
 
 I had a question about disabling the block layer for SCSI devices. We
 have an embedded device, and it runs 2.4.30. We need to be able to
 support a lot of SCSI devices (in the thousands) for our device, and we
 talk to the devices via SG. We are facing a memory allocation problem
 after discovering a few thousand devices. For every device,  there
 seems to be a lot of memory allocated in the block layer. This memory
 includes cache memory (which IIRC is reclaimable by the kernel memory
 subsystem when it needs it) and also pages that are used for the
 alloc_pages pool.

A much easier approach would be to limit the memory used for each device
in the block layer. Since SCSI uses the block layer as a transport for
commands, you cannot disable the block layer in any easy manner.

But your memory is likely being eaten by the queue freelist. So edit
drivers/block/ll_rw_blk.c and hardcode nr_requests to a low number (like
2).
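
As a rough illustration of why a smaller nr_requests helps, the sketch below multiplies out the preallocated request memory. The per-request size and the default queue depth used in the usage note are assumptions for the arithmetic only, not values taken from the 2.4.30 source.

```c
/*
 * Estimate memory pinned by preallocated requests.
 * All inputs are caller-supplied assumptions; nothing here reads
 * real kernel structures.
 */
static unsigned long long demo_freelist_bytes(unsigned long long devices,
					      unsigned long long nr_requests,
					      unsigned long long request_size)
{
	return devices * nr_requests * request_size;
}
```

For example, with 4000 devices and an assumed 200 bytes per request, 128 requests per queue would pin about 102 MB, while 2 requests per queue would pin about 1.6 MB.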

-- 
Jens Axboe



RE: [PATCH] aacraid: Add likely() and unlikely()

2007-03-28 Thread Arjan van de Ven
On Tue, 2007-03-27 at 09:17 -0700, Allexio Ju wrote:
 On Thu, 2007-03-22 at 2007 2:24 AM, Arjan van de Ven wrote:
  (I assume you're aware that likely/unlikely should only be
  used for 99:1 or higher ratios, this one looks correct for sure)
 Could you share the reasons why those macros should be used in that way?
 I thought those macros simply tell the compiler to lay out code in such a
 way that minimizes unnecessary jumps.


it's more than that. it generally also tells the processor what the
branch will be, at which point most processors disable their own branch
prediction logic. Trying to hand-layout code is almost always a
mistake... don't do that. GCC also is quite good at recognizing certain
patterns to keep the code flow working. Trying to override that only
hurts...
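
For reference, the macros under discussion are thin wrappers around a compiler builtin. The sketch below uses demo_ prefixed copies built on GCC's __builtin_expect; the demo_sum() helper is hypothetical and only shows an annotation on a branch that is expected to be taken almost never.

```c
#include <stddef.h>

/* Same shape as the kernel macros; requires GCC or a compatible compiler. */
#define demo_likely(x)   __builtin_expect(!!(x), 1)
#define demo_unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: sum a buffer, treating a NULL pointer as a rare error. */
static long demo_sum(const int *buf, size_t len)
{
	long sum = 0;

	if (demo_unlikely(buf == NULL))
		return -1;	/* cold path: expected almost never */

	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}
```

The annotation does not change the result of the condition; it only passes the expected outcome to the compiler, which is why it should be reserved for heavily skewed branches.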
-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org



[PATCH] aacraid: remove unused or deprecated firmware constants

2007-03-28 Thread Salyzyn, Mark
Just sweeping the floor clean in one spot. Some of these constants have
never been used in the driver or in the firmware (and thus are
meaningless). Triggered this patch because I discovered one of the
unused constants was actually incorrect and figured it was better to
clean them out than correct and update. There are no side effects at all
regarding this patch, it is purely cosmetic.

ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patches.

This attached patch is against current scsi-misc-2.6

Signed-off-by: Mark Salyzyn [EMAIL PROTECTED]


aacraid_unused_firmware_constants.patch
Description: aacraid_unused_firmware_constants.patch


Re: [PATCH] [qla2xxx] Remove duplicate pci_disable_device() call

2007-03-28 Thread Andrew Vasquez
On Wed, 28 Mar 2007, Bernhard Walle wrote:

 [PATCH] [qla2xxx] Remove duplicate pci_disable_device() call
 
 On the path qla2x00_probe_one() -> probe_failed -> qla2x00_free_device(),
 pci_disable_device() is executed twice, once in qla2x00_free_device()
 and once in qla2x00_probe_one().
 
 This patch removes the unnecessary call.
 
 
 Signed-off-by: Bernhard Walle [EMAIL PROTECTED]

Acked-by: Andrew Vasquez [EMAIL PROTECTED]


Re: [PATCH] aacraid: Add likely() and unlikely()

2007-03-28 Thread Allexio Ju

On 3/28/07, Arjan van de Ven [EMAIL PROTECTED] wrote:

On Tue, 2007-03-27 at 09:17 -0700, Allexio Ju wrote:
 I thought those macros simply tell compiler to layout code in such a
 way that minimizes unnecessary jumps.
it's more than that. it generally also tells the processor what the
branch will be, at which point most processors disable their own branch
prediction logic. Trying to hand-layout code is almost always a
mistake... don't do that. GCC also is quite good at recognizing certain
patterns to keep the code flow working. Trying to override that only
hurts...

I see... thanks for clarification.

allexio


[PATCH 0/2][resend] ibmvscsi: dynamic request_limit and device restart

2007-03-28 Thread Robert Jennings
James,

Resending these patches for inclusion in your tree.

There are two fixes for the ibmvscsi client driver in this set.

- Dynamic request_limit
The request_limit for the driver was not properly reflecting the value on
the server side and could cause can_queue to be set to improper values (-1).
The patch corrects this so that request_limit mirrors the value
on the server and sets can_queue appropriately.

- Device restart
When a drive was removed from the server and then re-added the client
would not be able to use that device.  The device would return a unit
attention and then not ready.  By adding a slave_configure function we
can set the allow_restart flag for all disk devices.  Now devices will
resume functioning when they are re-added to the server.

---


[PATCH] aacraid: fix print of Firmware Build Date and add TSID

2007-03-28 Thread Salyzyn, Mark
The Adapter build date that is to be printed on instantiation was not
displayed as a result of the supplemental adapter information structure
not being in sync with the Firmware; the driver took an early test cycle
version that had a mis-sized padded region at the head and the
structure was not re-checked at the end of qualification. The Build Date
was not a priority and is merely a cosmetic enhancement, and the wrong
location for the start of the structure member would not induce any
side-effect problems. We updated the structure to match the actual
format, and added the TSID (Tech Support Identification) value print,
should it be present, to the adapter instantiation announcements during
driver load.

This latter enhancement should improve the relationship between Service
folk & Tech Support if the printed value of the TSID found its way into
the circular file labeled G...

Neither of these values show in sysfs (yet).

ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patches.

This attached patch is against current scsi-misc-2.6
 
Signed-off-by: Mark Salyzyn [EMAIL PROTECTED]


aacraid_build_date_tsid.patch
Description: aacraid_build_date_tsid.patch


[PATCH 1/2] ibmvscsi: allow for dynamic adjustment of server request_limit

2007-03-28 Thread Robert Jennings
The request limit calculations used previously on the client failed to
mirror the state of the server.  Additionally, when a value < 3 was provided
there could be problems setting can_queue and handling abort and reset
commands.
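
The admission rule described above can be sketched as a pure function. This is an illustrative reduction of the logic in the patch below, not the driver code: the demo_* names are hypothetical, and the in-flight count is passed in directly rather than derived from a list walk.

```c
enum demo_verdict { DEMO_SEND, DEMO_BUSY, DEMO_ERROR };

/*
 * request_status: value seen after trying to take one request credit
 * in_flight:      requests already sent and awaiting a response
 * is_task_mgmt:   nonzero for abort/reset (task management) requests
 */
static enum demo_verdict demo_admit(int request_status, int in_flight,
				    int is_task_mgmt)
{
	if (request_status < -1)
		return DEMO_ERROR;

	if (request_status < 2 && !is_task_mgmt) {
		/* Estimate the server limit: credits left plus in-flight. */
		int server_limit = request_status + in_flight;

		/* Keep the last two slots free for abort and reset. */
		if (server_limit > 2)
			return DEMO_BUSY;
	}
	return DEMO_SEND;
}
```

Ordinary commands are refused once only the reserved slots remain, while task management requests still go through; a server that offers two or fewer slots in total has nothing to reserve, so ordinary commands are allowed.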

Signed-off-by: Robert Jennings [EMAIL PROTECTED]
Signed-off-by: Santiago Leon [EMAIL PROTECTED]

---
 drivers/scsi/ibmvscsi/ibmvscsi.c |   58 +--
 drivers/scsi/ibmvscsi/ibmvscsi.h |2 +
 2 files changed, 40 insertions(+), 20 deletions(-)

Index: b/drivers/scsi/ibmvscsi/ibmvscsi.c
===
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -85,7 +85,7 @@
 static int max_id = 64;
 static int max_channel = 3;
 static int init_timeout = 5;
-static int max_requests = 50;
+static int max_requests = IBMVSCSI_MAX_REQUESTS_DEFAULT;
 
 #define IBMVSCSI_VERSION "1.5.8"
 
@@ -538,7 +538,8 @@
int request_status;
int rc;
 
-   /* If we have exhausted our request limit, just fail this request.
+   /* If we have exhausted our request limit, just fail this request,
+* unless it is for a reset or abort.
 * Note that there are rare cases involving driver generated requests 
 * (such as task management requests) that the mid layer may think we
 * can handle more requests (can_queue) when we actually can't
@@ -551,9 +552,30 @@
 */
 		if (request_status < -1)
 			goto send_error;
-		/* Otherwise, if we have run out of requests */
-		else if (request_status < 0)
-			goto send_busy;
+		/* Otherwise, we may have run out of requests. */
+		/* Abort and reset calls should make it through.
+		 * Nothing except abort and reset should use the last two
+		 * slots unless we had two or less to begin with.
+		 */
+		else if (request_status < 2 &&
+		         evt_struct->iu.srp.cmd.opcode != SRP_TSK_MGMT) {
+			/* In the case that we have less than two requests
+			 * available, check the server limit as a combination
+			 * of the request limit and the number of requests
+			 * in-flight (the size of the send list).  If the
+			 * server limit is greater than 2, return busy so
+			 * that the last two are reserved for reset and abort.
+			 */
+			int server_limit = request_status;
+			struct srp_event_struct *tmp_evt;
+
+			list_for_each_entry(tmp_evt, &hostdata->sent, list) {
+				server_limit++;
+			}
+
+			if (server_limit > 2)
+				goto send_busy;
+		}
}
 
/* Copy the IU into the transfer area */
@@ -572,6 +594,7 @@
 
 		printk(KERN_ERR "ibmvscsi: send error %d\n",
 		       rc);
+		atomic_inc(&hostdata->request_limit);
 		goto send_error;
 	}
 
@@ -581,7 +604,8 @@
 	unmap_cmd_data(&evt_struct->iu.srp.cmd, evt_struct, hostdata->dev);
 
 	free_event_struct(&hostdata->pool, evt_struct);
-	return SCSI_MLQUEUE_HOST_BUSY;
+	atomic_inc(&hostdata->request_limit);
+	return SCSI_MLQUEUE_HOST_BUSY;
 
  send_error:
 	unmap_cmd_data(&evt_struct->iu.srp.cmd, evt_struct, hostdata->dev);
@@ -831,23 +855,16 @@
 
 	printk(KERN_INFO "ibmvscsi: SRP_LOGIN succeeded\n");
 
-	if (evt_struct->xfer_iu->srp.login_rsp.req_lim_delta >
-	    (max_requests - 2))
-		evt_struct->xfer_iu->srp.login_rsp.req_lim_delta =
-		    max_requests - 2;
+	if (evt_struct->xfer_iu->srp.login_rsp.req_lim_delta < 0)
+		printk(KERN_ERR "ibmvscsi: Invalid request_limit.\n");
 
-	/* Now we know what the real request-limit is */
+	/* Now we know what the real request-limit is.
+	 * This value is set rather than added to request_limit because
+	 * request_limit could have been set to -1 by this client.
+	 */
 	atomic_set(&hostdata->request_limit,
 		   evt_struct->xfer_iu->srp.login_rsp.req_lim_delta);
 
-	hostdata->host->can_queue =
-	    evt_struct->xfer_iu->srp.login_rsp.req_lim_delta - 2;
-
-	if (hostdata->host->can_queue < 1) {
-		printk(KERN_ERR "ibmvscsi: Invalid request_limit_delta\n");
-		return;
-	}
-
 	/* If we had any pending I/Os, kick them */
 	scsi_unblock_requests(hostdata->host);
 
@@ -1483,7 +1500,7 @@
.eh_abort_handler = ibmvscsi_eh_abort_handler,
.eh_device_reset_handler = ibmvscsi_eh_device_reset_handler,
.cmd_per_lun = 16,
-   .can_queue = 1, /* Updated after SRP_LOGIN */
+   .can_queue = 

Re: [PATCH 2/2] ibmvscsi: add slave_configure to allow device restart

2007-03-28 Thread Randy Dunlap
On Wed, 28 Mar 2007 12:47:04 -0500 Robert Jennings wrote:

 Adding a slave_configure function for the driver. Now the disks can be
 restarted by the scsi mid-layer when they are disconnected and reconnected.
 
 Signed-off-by: Robert Jennings [EMAIL PROTECTED]
 Signed-off-by: Santiago Leon [EMAIL PROTECTED]
 
 ---
  drivers/scsi/ibmvscsi/ibmvscsi.c |   18 ++
  1 file changed, 18 insertions(+)
 
 Index: b/drivers/scsi/ibmvscsi/ibmvscsi.c
 ===
 --- a/drivers/scsi/ibmvscsi/ibmvscsi.c
 +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
 @@ -1354,6 +1354,23 @@
   return rc;
  }
  
 +/**
 + * ibmvscsi_slave_configure: For each slave device that is a disk,
 + * ensure that the allow_restart flag is enabled.

Hi,
Please don't use kernel-doc notation (/**) unless the following
documentation block is in kernel-doc format.  See
Documentation/kernel-doc-nano-HOWTO.txt for more info, or just
ask me questions if you have any.

 + */
 +static int ibmvscsi_slave_configure(struct scsi_device *sdev)
 +{
 +	struct Scsi_Host *shost = sdev->host;
 +	unsigned long lock_flags = 0;
 +
 +	spin_lock_irqsave(shost->host_lock, lock_flags);
 +	if (sdev->type == TYPE_DISK)
 +		sdev->allow_restart = 1;
 +	scsi_adjust_queue_depth(sdev, 0, shost->cmd_per_lun);
 +	spin_unlock_irqrestore(shost->host_lock, lock_flags);
 +	return 0;
 +}
 +
  /* 
   * sysfs attributes
   */
 @@ -1499,6 +1516,7 @@
   .queuecommand = ibmvscsi_queuecommand,
   .eh_abort_handler = ibmvscsi_eh_abort_handler,
   .eh_device_reset_handler = ibmvscsi_eh_device_reset_handler,
 + .slave_configure = ibmvscsi_slave_configure,
   .cmd_per_lun = 16,
   .can_queue = IBMVSCSI_MAX_REQUESTS_DEFAULT,
   .this_id = -1,
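
To illustrate the kernel-doc remark made in this review, here is a sketch of a comment block in kernel-doc layout: a "name - summary" line, one @parameter line per argument, a blank comment line, then the longer description. The struct and function are hypothetical stand-ins written for user space, not the ibmvscsi code.

```c
/* Hypothetical stand-in for a SCSI device; not the kernel's struct. */
struct demo_device {
	int type;
	unsigned int allow_restart;
};

#define DEMO_TYPE_DISK 0

/**
 * demo_slave_configure - enable restart for disk devices
 * @dev: device being configured
 *
 * Sets the allow_restart flag when @dev is a disk and leaves other
 * device types untouched.
 *
 * Return: always 0.
 */
static int demo_slave_configure(struct demo_device *dev)
{
	if (dev->type == DEMO_TYPE_DISK)
		dev->allow_restart = 1;
	return 0;
}
```

A comment that does not follow this layout should open with a plain /* so the documentation tools do not try to parse it.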


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


Panics for AACRAID driver during 'insmod' for kexec test.

2007-03-28 Thread Judith Lebzelter
Hello, 

I have been running a series of kexec tests using LKDTT on the 
aacraid driver on this card (ASR-4805SAS (Marauder-E)) on x86_64
using the latest top of the scsi-misc git tree (as of yesterday), and 
I have found that it is not coming up consistently when booted 
through kexec.

I have included 4 different types of failures I found here because 
I assume they might be related, and thought maybe there could 
be an issue with the card's state on reboot (through kexec).

The most common problem is this oops/panic, which has happened 
with various types of crash points (6 times out of 40):

Loading aacraid.Adaptec aacraid driver (1.1-5[2437]-mh4)
ko module
ACPI: PCI Interrupt :03:0e.0[A] -> Link [LNKC] -> GSI 3 (level, low) -> IRQ 3
general protection fault:  [1]
CPU 0
Modules linked in: aacraid
Pid: 0, comm: swapper Not tainted 2.6.21-rc3-kdump #1
RIP: 0010:[88008a99]  [88008a99] :aacraid:aac_intr_normal+0x17a/0x1b1
RSP: :81523ed8  EFLAGS: 00010006
RAX: 810004102000 RBX: 8100014f01e0 RCX: 0086
RDX: 810004041540 RSI: 8100014f01e0 RDI:
RBP: 810004702cd8 R08: a6037e6c R09: 0016001562d7
R10: 0023 R11:  R12: 0011
R13: 810004702cd8 R14: 810004001400 R15:
FS:  () GS:814d5000() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 006ba5a0 CR3: 0474d000 CR4: 06e0
Process swapper (pid: 0, threadinfo 814e4000, task 81470360)
Stack:  0011 810004702cd8 0100 0003
 0001 88009470  810004041540
 814d5080 810428f4  814d5080
Call Trace:
 IRQ  [88009470] :aacraid:aac_rx_intr_message+0x2c/0x60
 [810428f4] note_interrupt+0xd3/0x1db
 [8104319b] handle_level_irq+0x7e/0xab
 [8100b0b1] do_IRQ+0xd7/0x132
 [810085a1] mwait_idle+0x0/0x43
 [81009651] ret_from_intr+0x0/0xa
 EOI  [810085e0] mwait_idle+0x3f/0x43
 [81008540] cpu_idle+0x3d/0x5c
 [814e78d2] start_kernel+0x28f/0x29b
 [814e7140] _sinittext+0x140/0x144

Code: ff 53 38 eb 20 9c 58 fa 83 7b 30 00 75 07 c7 43 30 01 00 00
RIP  [88008a99] :aacraid:aac_intr_normal+0x17a/0x1b1
Kernel panic - not syncing: Aiee, killing interrupt handler!
 

Another failure:   for crash point 'TIMERADD-bug' I got this error 
loading insmod:

Loading aacraid.Adaptec aacraid driver (1.1-5[2437]-mh4)
ko module
ACPI: PCI Interrupt :03:0e.0[A] -> Link [LNKC] -> GSI 3 (level, low) -> IRQ 3
input: ImExPS/2 Generic Explorer Mouse as /class/input/input3
aacraid: aac_fib_send: adapter blinkLED 0xc2.
Usually a result of a serious unrecoverable hardware problem
aac_fib_free, XferState != 0, fibptr = 0x8100014f, XferState = 0x810ad
aacraid: probe of :03:0e.0 failed with error -14


Yet another Failure: for crash point 'TIMERADD-panic' I got this error 
during insmod:

Loading aacraid.Adaptec aacraid driver (1.1-5[2437]-mh4)
ko module
ACPI: PCI Interrupt :03:0e.0[A] -> Link [LNKC] -> GSI 3 (level, low) -> IRQ 3
input: ImExPS/2 Generic Explorer Mouse as /class/input/input3
BUG: soft lockup detected on CPU#0!

Call Trace:
 IRQ  [8102bcbb] update_process_times+0x3b/0x5f
 [8100bebf] main_timer_handler+0x2f/0x1ae
 [8102b504] run_timer_softirq+0x14/0x161
 [8100c050] timer_interrupt+0x12/0x27
 [81041f9c] handle_IRQ_event+0x25/0x53
 [81028c1b] __do_softirq+0x46/0x90
 [81043186] handle_level_irq+0x69/0xab
 [8100b0b1] do_IRQ+0xd7/0x132
 [81009651] ret_from_intr+0x0/0xa
 EOI  [811229ed] __delay+0x8/0x10
 [88007c68] :aacraid:aac_fib_send+0x1ba/0x234
 [880048aa] :aacraid:aac_get_adapter_info+0x76/0x536
 [88002bb3] :aacraid:aac_probe_one+0x236/0x457
 [8112bd6d] pci_device_probe+0x4c/0x75
 [8117d0da] really_probe+0xc4/0x148
 [8117d30b] __driver_attach+0x6d/0xab
 [8117d29e] __driver_attach+0x0/0xab
 [8117d29e] __driver_attach+0x0/0xab
 [8117c5b2] bus_for_each_dev+0x43/0x6e
 [8117c8f4] bus_add_driver+0x6b/0x18d
 [8112bf0b] __pci_register_driver+0x72/0xa7
 [8801203a] :aacraid:aac_init+0x3a/0x75
 [8103bafc] sys_init_module+0x1195/0x12e6
 [8100913e] system_call+0x7e/0x83

BUG: soft lockup detected on CPU#0!

One last error I got for INT_TASKLET_ENTRY-exception was this
after the filesystem is mounted and I am copying the vmcore 
file to it:

Copying the dump
aacraid: Host adapter abort request (4,0,0,0)
aacraid: Host adapter abort request (4,0,0,0)
aacraid: Host adapter reset request. SCSI hang 

[PATCH] scsi: megaraid_sas - intercepts cmd timeout and throttle io

2007-03-28 Thread Sumant Patro

The eh_timed_out callback (megasas_reset_timer) is used to throttle I/O to the
adapter when it is called the first time for a scmd.
The MEGASAS_FW_BUSY flag is set and can_queue is reduced to 16. can_queue is
restored from the completion routine once both of the following conditions
hold: 5 seconds have elapsed and the number of outstanding cmds in FW is < 17.
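
The throttle-and-restore behaviour described above can be sketched as a small state machine. This is a user-space illustration only: struct demo_hba, DEMO_HZ and the demo_* functions are hypothetical, the tick value is passed in explicitly, and locking is omitted.

```c
#define DEMO_HZ 1000UL			/* assumed ticks per second */
#define DEMO_THROTTLED_QUEUE 16
#define DEMO_RESTORE_BELOW 17

struct demo_hba {
	int fw_busy;			/* set while I/O is throttled */
	int can_queue;			/* current queue depth limit */
	int max_queue;			/* normal queue depth limit */
	unsigned long last_time;	/* tick at which throttling began */
};

/* First command timeout: shrink the queue and remember when. */
static void demo_on_timeout(struct demo_hba *h, unsigned long now)
{
	if (!h->fw_busy) {
		h->can_queue = DEMO_THROTTLED_QUEUE;
		h->last_time = now;
		h->fw_busy = 1;
	}
}

/* Completion path: restore the queue once the firmware has caught up. */
static void demo_on_completion(struct demo_hba *h, unsigned long now,
			       int outstanding)
{
	/* Wraparound-safe check that now is later than last_time + 5 s. */
	int waited = (long)(h->last_time + 5 * DEMO_HZ - now) < 0;

	if (h->fw_busy && waited && outstanding < DEMO_RESTORE_BELOW) {
		h->fw_busy = 0;
		h->can_queue = h->max_queue;
	}
}
```

Requiring both the elapsed time and the low outstanding count keeps the queue depth from bouncing back while the firmware is still draining commands.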

Signed-off-by: Sumant Patro [EMAIL PROTECTED]
---
 drivers/scsi/megaraid/megaraid_sas.c |   65 +++--
 drivers/scsi/megaraid/megaraid_sas.h |   13 +++--
 2 files changed, 70 insertions(+), 8 deletions(-)

This patch requires the patch submitted by James with subject line : 

[PATCH] expose eh_timed_out to the host template

diff -uprN linux-2.6.orig/drivers/scsi/megaraid/megaraid_sas.c 
linux-2.6.new/drivers/scsi/megaraid/megaraid_sas.c
--- linux-2.6.orig/drivers/scsi/megaraid/megaraid_sas.c 2007-03-28 
08:41:49.0 -0700
+++ linux-2.6.new/drivers/scsi/megaraid/megaraid_sas.c  2007-03-28 
08:36:38.0 -0700
@@ -10,7 +10,7 @@
  *2 of the License, or (at your option) any later version.
  *
  * FILE: megaraid_sas.c
- * Version : v00.00.03.10-rc1
+ * Version : v00.00.03.10-rc3
  *
  * Authors:
  * (email-id : [EMAIL PROTECTED])
@@ -886,6 +886,7 @@ megasas_queue_command(struct scsi_cmnd *
goto out_return_cmd;
 
cmd-scmd = scmd;
+   scmd-SCp.ptr = (char *)cmd;
 
/*
 * Issue the command to the FW
@@ -981,8 +982,8 @@ static int megasas_generic_reset(struct 
 
instance = (struct megasas_instance *)scmd-device-host-hostdata;
 
-	scmd_printk(KERN_NOTICE, scmd, "megasas: RESET -%ld cmd=%x\n",
-	       scmd->serial_number, scmd->cmnd[0]);
+	scmd_printk(KERN_NOTICE, scmd, "megasas: RESET -%ld cmd=%x retries=%x\n",
+		 scmd->serial_number, scmd->cmnd[0], scmd->retries);
 
if (instance-hw_crit_error) {
printk(KERN_ERR megasas: cannot recover from previous reset 
@@ -1000,6 +1001,40 @@ static int megasas_generic_reset(struct 
 }
 
 /**
+ * megasas_reset_timer - quiesce the adapter if required
+ * @scmd:  scsi cmnd
+ *
+ * Sets the FW busy flag and reduces the host-can_queue if the
+ * cmd has not been completed within the timeout period.
+ */
+static enum
+scsi_eh_timer_return megasas_reset_timer(struct scsi_cmnd *scmd)
+{
+	struct megasas_cmd *cmd = (struct megasas_cmd *)scmd->SCp.ptr;
+	struct megasas_instance *instance;
+	unsigned long flags;
+
+	if (cmd) {
+		if (time_after(jiffies, scmd->jiffies_at_alloc + 170 * HZ))
+			return EH_NOT_HANDLED;
+
+		instance = cmd->instance;
+		if (!(instance->flag & MEGASAS_FW_BUSY)) {
+   /* FW is busy, throttle IO */
+   spin_lock_irqsave(instance-throttle_io_lock, flags);
+
+			instance->host->can_queue = 16;
+			instance->last_time = jiffies;
+			instance->flag |= MEGASAS_FW_BUSY;
+
+   spin_unlock_irqrestore(instance-throttle_io_lock, 
flags);
+   }
+   return EH_RESET_TIMER;
+   }
+   return EH_HANDLED;
+}
+
+/**
  * megasas_reset_device -  Device reset handler entry point
  */
 static int megasas_reset_device(struct scsi_cmnd *scmd)
@@ -1112,6 +1147,7 @@ static struct scsi_host_template megasas
.eh_device_reset_handler = megasas_reset_device,
.eh_bus_reset_handler = megasas_reset_bus_host,
.eh_host_reset_handler = megasas_reset_bus_host,
+   .eh_timed_out = megasas_reset_timer,
.bios_param = megasas_bios_param,
.use_clustering = ENABLE_CLUSTERING,
 };
@@ -1215,9 +1251,8 @@ megasas_complete_cmd(struct megasas_inst
int exception = 0;
struct megasas_header *hdr = cmd-frame-hdr;
 
-   if (cmd-scmd) {
+   if (cmd-scmd)
cmd-scmd-SCp.ptr = (char *)0;
-   }
 
switch (hdr-cmd) {
 
@@ -1806,6 +1841,7 @@ static void megasas_complete_cmd_dpc(uns
u32 context;
struct megasas_cmd *cmd;
struct megasas_instance *instance = (struct megasas_instance 
*)instance_addr;
+   unsigned long flags;
 
/* If we have already declared adapter dead, donot complete cmds */
if (instance-hw_crit_error)
@@ -1828,6 +1864,22 @@ static void megasas_complete_cmd_dpc(uns
}
 
*instance-consumer = producer;
+
+   /*
+* Check if we can restore can_queue
+*/
+	if (instance->flag & MEGASAS_FW_BUSY
+		&& time_after(jiffies, instance->last_time + 5 * HZ)
+		&& atomic_read(&instance->fw_outstanding) < 17) {
+
+   spin_lock_irqsave(instance-throttle_io_lock, flags);
+
+		instance->flag &= ~MEGASAS_FW_BUSY;
+		instance->host->can_queue =
+			instance->max_fw_cmds - MEGASAS_INT_CMDS;
+
+   

Re: [patch 2/3] libata: expose AN support to user space via sysfs

2007-03-28 Thread Jeff Garzik

Kristen Carlson Accardi wrote:

Allow user space to determine if an ATAPI device supports
async notification (AN) of media changes.  This is done by
adding a new sysfs file async_notification to genhd.
If the file reads 1, then the device supports async 
notification.  If the file reads 0, it does not.  


A flag is set in the generic disk to indicate whether
or not AN is supported.  This flag is set by the SCSI
subsystem when it registers with add_disk.  The SCSI
system gets information from libata on whether the
device supports AN during dev_configure. 


Signed-off-by: Kristen Carlson Accardi [EMAIL PROTECTED]

Index: 2.6-mm/block/genhd.c
===
--- 2.6-mm.orig/block/genhd.c
+++ 2.6-mm/block/genhd.c
@@ -372,6 +372,11 @@ static ssize_t disk_size_read(struct gen
 {
return sprintf(page, %llu\n, (unsigned long long)get_capacity(disk));
 }
+static ssize_t disk_AN_read(struct gendisk * disk, char *page)
+{
+	return sprintf(page, "%d\n",
+		(disk->flags & GENHD_FL_ASYNC_NOTIFICATION ? 1 : 0));
+}
 
 static ssize_t disk_stats_read(struct gendisk * disk, char *page)

 {
@@ -419,6 +424,10 @@ static struct disk_attribute disk_attr_s
.attr = {.name = stat, .mode = S_IRUGO },
.show   = disk_stats_read
 };
+static struct disk_attribute disk_attr_AN = {
+	.attr = {.name = "media_change_events", .mode = S_IRUGO },
+	.show	= disk_AN_read
+};
 
 #ifdef CONFIG_FAIL_MAKE_REQUEST
 
@@ -455,6 +464,7 @@ static struct attribute * default_attrs[

disk_attr_removable.attr,
disk_attr_size.attr,
disk_attr_stat.attr,
+   disk_attr_AN.attr,
 #ifdef CONFIG_FAIL_MAKE_REQUEST
disk_attr_fail.attr,
 #endif
Index: 2.6-mm/include/linux/genhd.h
===
--- 2.6-mm.orig/include/linux/genhd.h
+++ 2.6-mm/include/linux/genhd.h
@@ -94,6 +94,7 @@ struct hd_struct {
 
 #define GENHD_FL_REMOVABLE			1

 #define GENHD_FL_DRIVERFS  2
+#define GENHD_FL_ASYNC_NOTIFICATION4
 #define GENHD_FL_CD8
 #define GENHD_FL_UP16
 #define GENHD_FL_SUPPRESS_PARTITION_INFO   32
Index: 2.6-mm/include/scsi/scsi_device.h
===
--- 2.6-mm.orig/include/scsi/scsi_device.h
+++ 2.6-mm/include/scsi/scsi_device.h
@@ -126,7 +126,7 @@ struct scsi_device {
unsigned fix_capacity:1;/* READ_CAPACITY is too high by 1 */
unsigned guess_capacity:1;  /* READ_CAPACITY might be too high by 1 
*/
unsigned retry_hwerror:1;   /* Retry HARDWARE_ERROR */
-
+   unsigned async_notification:1;  /* device supports async notification */
unsigned int device_blocked;/* Device returned QUEUE_FULL. */
 
 	unsigned int max_device_blocked; /* what device_blocked counts down from  */

Index: 2.6-mm/drivers/ata/libata-scsi.c
===
--- 2.6-mm.orig/drivers/ata/libata-scsi.c
+++ 2.6-mm/drivers/ata/libata-scsi.c
@@ -899,6 +899,9 @@ static void ata_scsi_dev_config(struct s
blk_queue_max_hw_segments(q, q-max_hw_segments - 1);
}
 
+	if (dev->flags & ATA_DFLAG_AN)
+		sdev->async_notification = 1;
+
 	if (dev->flags & ATA_DFLAG_NCQ) {
int depth;
 
Index: 2.6-mm/drivers/scsi/sr.c

===
--- 2.6-mm.orig/drivers/scsi/sr.c
+++ 2.6-mm/drivers/scsi/sr.c
@@ -603,6 +603,8 @@ static int sr_probe(struct device *dev)
 
 	dev_set_drvdata(dev, cd);

 	disk->flags |= GENHD_FL_REMOVABLE;
+	if (sdev->async_notification)
+		disk->flags |= GENHD_FL_ASYNC_NOTIFICATION;
add_disk(disk);


(added linux-scsi to CC)

Comments:

1) From a procedural standpoint, you'll want to separate this patch into 
three patches:  generic block layer stuff, SCSI stuff, and libata stuff.


2) I don't claim to be a sysfs expert, but this seems like a reasonable 
approach for reporting async-notification capabilities


3) I would make the contents of 'media_change_events' be a list of 
flags, rather than a boolean.  Thus, when AN is present, 
media_change_events would return "AN\n".  It would return "\n" (no 
flags) when AN is absent.  This permits future expansion of this 
capabilities reporting variable.


4) Figure out some place to document 'media_change_events', in 
Documentation/*


5) I think the method of delivery probably needs discussing, and some 
work.  Presumably the normal hotplug paths should be traversed for this 
sort of thing.
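
Comment 3 above proposes a list of flag names in place of a boolean. A minimal user-space sketch of such a formatter follows; the DEMO_EVT_AN bit, the name table and demo_show_events() are hypothetical, and the space separator between names is an assumption, since the review only specifies the one-flag and no-flag outputs.

```c
#include <stdio.h>
#include <stddef.h>

#define DEMO_EVT_AN 0x1u	/* hypothetical capability bit */

struct demo_event_name {
	unsigned int bit;
	const char *name;
};

/* New capabilities would be added as further rows in this table. */
static const struct demo_event_name demo_event_names[] = {
	{ DEMO_EVT_AN, "AN" },
};

/*
 * Write the names of all set flags, separated by spaces and followed by
 * a newline. Returns the number of characters written, or -1 if the
 * output does not fit in the buffer.
 */
static int demo_show_events(char *page, size_t size, unsigned int flags)
{
	size_t count = sizeof(demo_event_names) / sizeof(demo_event_names[0]);
	size_t used = 0;

	if (size == 0)
		return -1;
	page[0] = '\0';

	for (size_t i = 0; i < count; i++) {
		int n;

		if (!(flags & demo_event_names[i].bit))
			continue;
		n = snprintf(page + used, size - used, "%s%s",
			     used ? " " : "", demo_event_names[i].name);
		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
	}

	if (size - used < 2)
		return -1;
	page[used++] = '\n';
	page[used] = '\0';
	return (int)used;
}
```

With only the AN bit set the buffer holds "AN" and a newline; with no bits set it holds just the newline, which matches the two cases given in the comment.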




Re: [Fastboot] Panics for AACRAID driver during 'insmod' for kexec test.

2007-03-28 Thread Vivek Goyal
On Wed, Mar 28, 2007 at 02:54:32PM -0700, Judith Lebzelter wrote:
 Hello, 
 
 I have been running a series of kexec tests using LKDTT on the 
 aacraid driver on this card (ASR-4805SAS (Marauder-E)) on x86_64
 using the latest top of the scsi-misc git tree (as of yesterday), and 
 I have found that it is not coming up consistently when booted 
 through kexec.
 
 I have included 4 different types of failures I found here because 
 I assume they might be related, and thought maybe there could 
 be an issue with the card's state on reboot (through kexec).
 
 The most common problem is this oops/panic, which has happened 
 with various types of crash points (6 times out of 40):
 
 Loading aacraid.Adaptec aacraid driver (1.1-5[2437]-mh4)
 ko module
 ACPI: PCI Interrupt :03:0e.0[A] -> Link [LNKC] -> GSI 3 (level, low) -> IRQ 3
 general protection fault:  [1]
 CPU 0
 Modules linked in: aacraid
 Pid: 0, comm: swapper Not tainted 2.6.21-rc3-kdump #1
 RIP: 0010:[88008a99]  [88008a99] :aacraid:aac_intr_normal+0x17a/0x1b1
 RSP: :81523ed8  EFLAGS: 00010006
 RAX: 810004102000 RBX: 8100014f01e0 RCX: 0086
 RDX: 810004041540 RSI: 8100014f01e0 RDI:
 RBP: 810004702cd8 R08: a6037e6c R09: 0016001562d7
 R10: 0023 R11:  R12: 0011
 R13: 810004702cd8 R14: 810004001400 R15:
 FS:  () GS:814d5000() knlGS:
 CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
 CR2: 006ba5a0 CR3: 0474d000 CR4: 06e0
 Process swapper (pid: 0, threadinfo 814e4000, task 81470360)
 Stack:  0011 810004702cd8 0100 0003
  0001 88009470  810004041540
  814d5080 810428f4  814d5080
 Call Trace:
  IRQ  [88009470] :aacraid:aac_rx_intr_message+0x2c/0x60
  [810428f4] note_interrupt+0xd3/0x1db
  [8104319b] handle_level_irq+0x7e/0xab
  [8100b0b1] do_IRQ+0xd7/0x132
  [810085a1] mwait_idle+0x0/0x43
  [81009651] ret_from_intr+0x0/0xa
  EOI  [810085e0] mwait_idle+0x3f/0x43
  [81008540] cpu_idle+0x3d/0x5c
  [814e78d2] start_kernel+0x28f/0x29b
  [814e7140] _sinittext+0x140/0x144
 
 Code: ff 53 38 eb 20 9c 58 fa 83 7b 30 00 75 07 c7 43 30 01 00 00
 RIP  [88008a99] :aacraid:aac_intr_normal+0x17a/0x1b1
 Kernel panic - not syncing: Aiee, killing interrupt handler!
  
 

I don't know much about the aacraid code, but after looking at it a little,
this looks like the typical case where the driver in the second kernel
receives a pending interrupt from the device, and the interrupt handler
accesses some data structures which are not even initialized yet. This
interrupt must have been pending from the crashed kernel's context.

Either we should reset the device before doing request_irq(), so that
interrupts are cleared, or send some kind of ABORT or FLUSH message (or
whatever the card firmware supports) to clear the pending interrupts and
flush existing commands before doing request_irq().
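
The ordering problem described above can be sketched in user space. Everything here is a hypothetical model: struct demo_adapter and the demo_* functions do not correspond to aacraid code, and "registering" the handler is simulated by calling it when a stale interrupt is still pending.

```c
struct demo_adapter {
	int irq_pending;	/* device still asserts an old interrupt */
	int queues_ready;	/* driver data structures are initialised */
	int handled;		/* interrupts serviced successfully */
};

/* Hypothetical: tell the device to drop any stale interrupt state. */
static void demo_reset_device(struct demo_adapter *a)
{
	a->irq_pending = 0;
}

/* Returns 0 on success, -1 if it ran before the driver was ready. */
static int demo_irq_handler(struct demo_adapter *a)
{
	if (!a->queues_ready)
		return -1;	/* a real handler would touch bad memory here */
	a->handled++;
	a->irq_pending = 0;
	return 0;
}

/* Registering the handler delivers an interrupt that is already pending. */
static int demo_request_irq(struct demo_adapter *a)
{
	return a->irq_pending ? demo_irq_handler(a) : 0;
}

/* Probe sequence; reset_first selects the ordering suggested above. */
static int demo_probe(struct demo_adapter *a, int reset_first)
{
	int rc;

	if (reset_first)
		demo_reset_device(a);
	rc = demo_request_irq(a);
	a->queues_ready = 1;
	return rc;
}
```

In this model, probing a device with a stale interrupt fails unless the reset happens before the handler is registered, which is the ordering the message recommends.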

Thanks
Vivek


Re: [patch 2/3] libata: expose AN support to user space via sysfs

2007-03-28 Thread Tejun Heo

Jeff Garzik wrote:

Kristen Carlson Accardi wrote:

Allow user space to determine if an ATAPI device supports
async notification (AN) of media changes.  This is done by
adding a new sysfs file async_notification to genhd.
If the file reads 1, then the device supports async notification.  If
the file reads 0, it does not.
A flag is set in the generic disk to indicate whether
or not AN is supported.  This flag is set by the SCSI
subsystem when it registers with add_disk.  The SCSI
system gets information from libata on whether the
device supports AN during dev_configure.
Signed-off-by: Kristen Carlson Accardi [EMAIL PROTECTED]



3) I would make the contents of 'media_change_events' be a list of 
flags, rather than a boolean.  Thus, when AN is present, 
media_change_events would return AN\n.  It would return \n (no 
flags) when AN is absent.  This permits future expansion of this 
capabilities reporting variable.
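
A rough user-space sketch of the flag-list format suggested above,
assuming space-separated names terminated by a newline. The function
format_event_flags, the EV_FLAG_* constants and the FUTURE flag are
hypothetical and only illustrate how the list could grow; none of them
come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define EV_FLAG_AN     (1u << 0)
#define EV_FLAG_FUTURE (1u << 1)   /* hypothetical, shows expansion only */

static const struct {
    unsigned bit;
    const char *name;
} ev_names[] = {
    { EV_FLAG_AN,     "AN" },
    { EV_FLAG_FUTURE, "FUTURE" },
};

/*
 * Writes the set flags as a space-separated list followed by '\n'.
 * Returns the number of characters written (excluding the trailing
 * NUL), or -1 if the buffer is too small.
 */
static int format_event_flags(unsigned flags, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < sizeof(ev_names) / sizeof(ev_names[0]); i++) {
        int n;

        if (!(flags & ev_names[i].bit))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", ev_names[i].name);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    if (len - used < 2)             /* room for '\n' and NUL */
        return -1;
    buf[used++] = '\n';
    buf[used] = '\0';
    return (int)used;
}
```

With no flags set the buffer holds just a newline, and with only
EV_FLAG_AN set it holds AN followed by a newline, matching the two
cases described above.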


I'm not sure about this.  AN is kind of an ATA-specific term, while 
media change event is generic.  So, I think the original approach is 
okay.  No matter how the actual thing is implemented, it's the same 
media change event, and as long as the event delivery interface is the 
same, the upper layer shouldn't care about how it's done.


Thanks.

--
tejun


Re: [patch 2/3] libata: expose AN support to user space via sysfs

2007-03-28 Thread Jeff Garzik

Tejun Heo wrote:

Jeff Garzik wrote:

Kristen Carlson Accardi wrote:

Allow user space to determine if an ATAPI device supports
async notification (AN) of media changes.  This is done by
adding a new sysfs file async_notification to genhd.
If the file reads 1, then the device supports async notification.  If
the file reads 0, it does not.
A flag is set in the generic disk to indicate whether
or not AN is supported.  This flag is set by the SCSI
subsystem when it registers with add_disk.  The SCSI
system gets information from libata on whether the
device supports AN during dev_configure.
Signed-off-by: Kristen Carlson Accardi [EMAIL PROTECTED]



3) I would make the contents of 'media_change_events' be a list of 
flags, rather than a boolean.  Thus, when AN is present, 
media_change_events would return AN\n.  It would return \n (no 
flags) when AN is absent.  This permits future expansion of this 
capabilities reporting variable.


I'm not sure about this.  AN is kind of an ATA-specific term, while 
media change event is generic.  So, I think the original approach is 
okay.  No matter how the actual thing is implemented, it's the same 
media change event, and as long as the event delivery interface is the 
same, the upper layer shouldn't care about how it's done.


AN is a generic concept that I feel will propagate elsewhere.

Though perhaps it should be in a 'capability_flags' file rather than a 
'media_change_event' file.


Jeff





Re: [patch 2/3] libata: expose AN support to user space via sysfs

2007-03-28 Thread Tejun Heo

Jeff Garzik wrote:

AN is a generic concept that I feel will propagate elsewhere.


I think SCSI already has it or am I imagining things again?  :-)

Though perhaps it should be in a 'capability_flags' file rather than a 
'media_change_event' file.


IMHO, if it's genhd.capability_flags, then the flag should be 
MEDIA_CHANGE_NOTIFY, not ASYNC_NOTIFICATION, because AN itself doesn't 
imply any specific event.  It's just a notification mechanism.  For 
ATAPI devices it means media change; for PMP it has a different 
meaning.  So I think we need to export the processed meaning, not the 
specific mechanism, to userland.
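
A small sketch of that distinction, assuming a capability mask is
derived from the device kind plus its AN support. Only the name
MEDIA_CHANGE_NOTIFY comes from the text above; dev_kind, the CAP_
prefix and exported_caps are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device kinds for illustration. */
enum dev_kind { DEV_ATAPI, DEV_PMP };

#define CAP_MEDIA_CHANGE_NOTIFY (1u << 0)

/*
 * Translate the mechanism (AN) into the meaning exported to userland.
 * AN on an ATAPI device means media change; AN on a PMP means something
 * else, so it is not reported as a media change capability.
 */
static unsigned exported_caps(enum dev_kind kind, bool supports_an)
{
    assert(kind == DEV_ATAPI || kind == DEV_PMP);
    if (!supports_an)
        return 0;
    if (kind == DEV_ATAPI)
        return CAP_MEDIA_CHANGE_NOTIFY;
    return 0;
}
```

Userland then sees only whether media change notification is available,
regardless of which mechanism delivers it.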


Thanks.

--
tejun