Re: [PATCH V4 0/5] virtio-scsi multiqueue

2013-03-18 Thread Asias He
On Mon, Mar 11, 2013 at 10:43:57AM +0800, Wanlong Gao wrote:
> This series implements virtio-scsi queue steering, which gives
> performance improvements of up to 50% (measured both with QEMU and
> tcm_vhost backends).
> 
> This version rebased on Rusty's virtio ring rework patches.
> We hope this can go into virtio-next together with the virtio ring
> rework pathes.
> 
> 
> V4: rebase on virtio ring rework patches (rusty's pending-rebases branch)
> 
> V3 and be found http://marc.info/?l=linux-virtualization&m=136067440717154&w=2
> 
> 
> It would probably be easier to get it in via Rusty's tree
> because of the prerequisites.  James, can we get your Acked-by?
> 
> Paolo Bonzini (4):
>   virtio-scsi: redo allocation of target data
>   virtio-scsi: pass struct virtio_scsi to virtqueue completion function
>   virtio-scsi: push vq lock/unlock into virtscsi_vq_done
>   virtio-scsi: introduce multiqueue support
> 
> Wanlong Gao (1):
>   virtio-scsi: reset virtqueue affinity when doing cpu hotplug
> 
>  drivers/scsi/virtio_scsi.c | 364 
> -
>  1 file changed, 291 insertions(+), 73 deletions(-)
> 
> -- 
> 1.8.2.rc2
> 

Patch 2-5
Reviewed-by: Asias He 

-- 
Asias
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V4 1/5] virtio-scsi: redo allocation of target data

2013-03-18 Thread Asias He
On Mon, Mar 11, 2013 at 10:43:58AM +0800, Wanlong Gao wrote:
> From: Paolo Bonzini 
> 
> virtio_scsi_target_state is now empty, but we will find new uses
> for it in the next few patches.  However, dropping the sglist lets
> us turn the array-of-pointers into a simple array, which simplifies
> the allocation.
> 
> However, we do not leave the virtio_scsi_target_state structs in the
> flexible array member at the end of struct virtio_scsi, because
> we will place the virtqueues there in the next patches.
> 
> Cc: linux-scsi@vger.kernel.org
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: Wanlong Gao 
> ---
>  drivers/scsi/virtio_scsi.c | 45 -
>  1 file changed, 12 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 6307099..75ccde1 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -76,7 +76,7 @@ struct virtio_scsi {
>   /* Get some buffers ready for event vq */
>   struct virtio_scsi_event_node event_list[VIRTIO_SCSI_EVENT_LEN];
>  
> - struct virtio_scsi_target_state *tgt[];
> + struct virtio_scsi_target_state *tgt;
>  };
>  
>  static struct kmem_cache *virtscsi_cmd_cache;
> @@ -568,18 +568,9 @@ static void virtscsi_init_vq(struct virtio_scsi_vq 
> *virtscsi_vq,
>   virtscsi_vq->vq = vq;
>  }
>  
> -static struct virtio_scsi_target_state *virtscsi_alloc_tgt(
> - struct virtio_device *vdev, int sg_elems)
> +static void virtscsi_init_tgt(struct virtio_scsi_target_state *tgt)
>  {
> - struct virtio_scsi_target_state *tgt;
> - gfp_t gfp_mask = GFP_KERNEL;
> -
> - tgt = kmalloc(sizeof(*tgt), gfp_mask);
> - if (!tgt)
> - return NULL;
> -
>   spin_lock_init(&tgt->tgt_lock);
> - return tgt;
>  }
>  
>  static void virtscsi_scan(struct virtio_device *vdev)
> @@ -593,17 +584,10 @@ static void virtscsi_remove_vqs(struct virtio_device 
> *vdev)
>  {
>   struct Scsi_Host *sh = virtio_scsi_host(vdev);
>   struct virtio_scsi *vscsi = shost_priv(sh);
> - u32 i, num_targets;
>  
>   /* Stop all the virtqueues. */
>   vdev->config->reset(vdev);
> -
> - num_targets = sh->max_id;
> - for (i = 0; i < num_targets; i++) {
> - kfree(vscsi->tgt[i]);
> - vscsi->tgt[i] = NULL;
> - }
> -
> + kfree(vscsi->tgt);
>   vdev->config->del_vqs(vdev);
>  }
>  
> @@ -612,7 +596,7 @@ static int virtscsi_init(struct virtio_device *vdev,
>  {
>   int err;
>   struct virtqueue *vqs[3];
> - u32 i, sg_elems;
> + u32 i;
>  
>   vq_callback_t *callbacks[] = {
>   virtscsi_ctrl_done,
> @@ -640,16 +624,14 @@ static int virtscsi_init(struct virtio_device *vdev,
>   if (virtio_has_feature(vdev, VIRTIO_SCSI_F_HOTPLUG))
>   virtscsi_kick_event_all(vscsi);
>  
> - /* We need to know how many segments before we allocate.  */
> - sg_elems = virtscsi_config_get(vdev, seg_max) ?: 1;

The drop of sg_elems does not belong to this patch. It should be done in
'virtio-scsi: use virtqueue_add_sgs for command buffers'.

> - for (i = 0; i < num_targets; i++) {
> - vscsi->tgt[i] = virtscsi_alloc_tgt(vdev, sg_elems);
> - if (!vscsi->tgt[i]) {
> - err = -ENOMEM;
> - goto out;
> - }
> + vscsi->tgt = kmalloc(num_targets * sizeof(vscsi->tgt[0]), GFP_KERNEL);
> + if (!vscsi->tgt) {
> + err = -ENOMEM;
> + goto out;
>   }
> + for (i = 0; i < num_targets; i++)
> + virtscsi_init_tgt(&vscsi->tgt[i]);
> +
>   err = 0;
>  
>  out:
> @@ -668,10 +650,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
>  
>   /* Allocate memory and link the structs together.  */
>   num_targets = virtscsi_config_get(vdev, max_target) + 1;
> - shost = scsi_host_alloc(&virtscsi_host_template,
> - sizeof(*vscsi)
> - + num_targets * sizeof(struct virtio_scsi_target_state));
> -
> + shost = scsi_host_alloc(&virtscsi_host_template, sizeof(*vscsi));
>   if (!shost)
>   return -ENOMEM;
>  
> -- 
> 1.8.2.rc2
> 

-- 
Asias
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi_transport_fc: Make 'port_state' writeable

2013-03-18 Thread Jeremy Linton
On 3/15/2013 8:28 AM, Bryn M. Reeves wrote:
> On 03/15/2013 12:46 PM, Bart Van Assche wrote:
>> The SCSI EH keeps trying until all outstanding request have been 
>> finished. Does lpfc_host_reset_handler() invoke scsi_done() for
> 
> I don't think so (ends up calling lpfc_sli_cancel_iocbs() via 
> lpfc_hba_down_post() after shutting down the mailbox) but I've not seen the
> EH escalate all the way to host reset in most of my testing -
...
> The problem is that getting to this stage can take a very long time - much
> longer than most cluster's node eviction timer for e.g. which is the source
> of much of the complaint about this behaviour.

>> outstanding requests ? If not, how about modifying 
>> lpfc_host_reset_handler() such that it finishes all outstanding requests 
>> if the remote port is not reachable ?

It does call the done() function on the outstanding command IOCBs after 
the
lpfc_reset_flush_io_context() call aborts them. The "problem" is that they are
returned with ScsiResult(DID_REQUEUE, 0) which basically queues them back to the
port as long as the port is still "up". Which results in the commands hanging
out until their timeouts expire (if the device isn't responding).

If the device does resume after the reset, in the case of a tape device 
it is
possible corrupt the tape because the 2900's get trapped by the TUR in the eh
routines depending on which commands were hung. Take write for example, the
reset can result in a tape rewind, and when the write gets fired back at the
device the tape is at BOT and effectively erases all data already on the tape.
Whops!

Also, as I stated elsewhere, in my testing its impossible to escalate 
beyond
the flush_io_context() in the lpfc_device_reset_handler driver because it always
returns true if the card firmware is responding.



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -next] target: fix possible memory leak in core_tpg_register()

2013-03-18 Thread Nicholas A. Bellinger
On Fri, 2013-03-15 at 17:19 +0800, Wei Yongjun wrote:
> From: Wei Yongjun 
> 
> 'se_tpg->tpg_lun_list' is malloced in core_tpg_register() and should be freed
> before leaving from the error handling cases, otherwise it will cause memory
> leak. 
> 'se_tpg' is malloced out of this function, and will be freed if we return 
> error, so
> remove free for 'se_tpg'.
> 
> Signed-off-by: Wei Yongjun 
> ---

Applied to target-pending/master, and including in the next
v3.9-rc-fixes pull request.

Thanks Wei!

--nab

>  drivers/target/target_core_tpg.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/target/target_core_tpg.c 
> b/drivers/target/target_core_tpg.c
> index 9169d6a..aac9d27 100644
> --- a/drivers/target/target_core_tpg.c
> +++ b/drivers/target/target_core_tpg.c
> @@ -711,7 +711,8 @@ int core_tpg_register(
>  
>   if (se_tpg->se_tpg_type == TRANSPORT_TPG_TYPE_NORMAL) {
>   if (core_tpg_setup_virtual_lun0(se_tpg) < 0) {
> - kfree(se_tpg);
> + array_free(se_tpg->tpg_lun_list,
> +TRANSPORT_MAX_LUNS_PER_TPG);
>   return -ENOMEM;
>   }
>   }
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe target-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tcm_fc: using kfree_rcu() to simplify the code

2013-03-18 Thread Nicholas A. Bellinger
Hi Wei,

On Mon, 2013-03-11 at 21:48 +0800, Wei Yongjun wrote:
> From: Wei Yongjun 
> 
> The callback function of call_rcu() just calls a kfree(), so we
> can use kfree_rcu() instead of call_rcu() + callback function.
> 
> Signed-off-by: Wei Yongjun 
> ---

Apologies for the delay.

Applied to target-pending/for-next.

Thanks!

--nab

>  drivers/target/tcm_fc/tfc_sess.c | 9 +
>  1 file changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/drivers/target/tcm_fc/tfc_sess.c 
> b/drivers/target/tcm_fc/tfc_sess.c
> index 113f335..4859505 100644
> --- a/drivers/target/tcm_fc/tfc_sess.c
> +++ b/drivers/target/tcm_fc/tfc_sess.c
> @@ -428,19 +428,12 @@ static int ft_prli(struct fc_rport_priv *rdata, u32 
> spp_len,
>   return ret;
>  }
>  
> -static void ft_sess_rcu_free(struct rcu_head *rcu)
> -{
> - struct ft_sess *sess = container_of(rcu, struct ft_sess, rcu);
> -
> - kfree(sess);
> -}
> -
>  static void ft_sess_free(struct kref *kref)
>  {
>   struct ft_sess *sess = container_of(kref, struct ft_sess, kref);
>  
>   transport_deregister_session(sess->se_sess);
> - call_rcu(&sess->rcu, ft_sess_rcu_free);
> + kfree_rcu(sess, rcu);
>  }
>  
>  void ft_sess_put(struct ft_sess *sess)
> 
> --
> To unsubscribe from this list: send the line "unsubscribe target-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -next] [SCSI] lpfc 8.3.36: fix potential NULL pointer dereference in lpfc_sli4_rq_put()

2013-03-18 Thread James Smart

James,

Can you please merge this fix into scsi git for merging into 3.9 ?

-- james s



On 12/6/2012 10:46 AM, James Smart wrote:

Acked-By: James Smart  

Thanks

-- james s


On 12/2/2012 8:33 AM, Wei Yongjun wrote:

From: Wei Yongjun 

The dereference to 'put_index' should be moved below the NULL test.

Signed-off-by: Wei Yongjun 
---
  drivers/scsi/lpfc/lpfc_sli.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 624eab3..a9a7fd7 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -431,11 +431,12 @@ lpfc_sli4_rq_put(struct lpfc_queue *hq, struct 
lpfc_queue *dq,

  struct lpfc_rqe *temp_hrqe;
  struct lpfc_rqe *temp_drqe;
  struct lpfc_register doorbell;
-int put_index = hq->host_index;
+int put_index;
/* sanity check on queue memory */
  if (unlikely(!hq) || unlikely(!dq))
  return -ENOMEM;
+put_index = hq->host_index;
  temp_hrqe = hq->qe[hq->host_index].rqe;
  temp_drqe = dq->qe[dq->host_index].rqe;






--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html




--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][RFC] scsi: Use W_LUN for scanning

2013-03-18 Thread Hannes Reinecke

On 03/15/2013 04:54 PM, Steffen Maier wrote:

While we're at it: I recently figured that there are targets
responding to inquiry with PQ=1 && PDT=31 for LUN0 if LUN0 has no
backing device (e.g. no disk mapped for the initiator host). While
this is likely to work with in-kernel lun scanning, the kernel does
not even allocate an sg dev in this case.

Yes.


If the target (such do exist) now also does not support report_luns
on WLUN, this effectively prevents any user space lun discovery.
E.g. non-NPIV zfcp-attached LUNs cannot be discovered, because we
cannot do in-kernel scanning due to the lack of initiator lun masking.


Yes, I know; the NetApp case.
But you can always initiate a user-space discovery by

echo '- - -' > /sys/class/scsi_host/hostX/scan


The reason for Linux ignoring LUNs with PDT=31 and PQ=1 is:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=84961f28e9d13a4b193d0c8545f3c060c1890ff3

[SCSI] Don't add scsi_device for devices that return PQ=1, PDT=0x1f
from kernel 2.6.19.
Since Linux on System z could no longer perform report luns with IBM
DS8000, the following workaround was implemented:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=01b291bd66564b4bd826326af6bd0b6d17e99439

[SCSI] fix check of PQ and PDT bits for WLUNs
in kernel 2.6.27.
However, this only works for WLUNs reporting PDT=31 and PQ=1 but not
for LUN0.
What made it worse is that, the attached LUN looks perfect from a
zfcp status point of view but is still missing any user visible
handle for the scsi midlayer, so I was puzzled as a user despite the
otherwise clear scsi log message:
"scsi scan: peripheral device type of 31, ***no device added***".



Hmm. You sure this is still valid today?
Assuming the device supports REPORT LUNS we're having this:

res = scsi_probe_and_add_lun(starget, 0, &bflags, NULL, rescan, NULL);
if (res == SCSI_SCAN_LUN_PRESENT || res == SCSI_SCAN_TARGET_PRESENT) {
if (scsi_report_lun_scan(starget, bflags, rescan) != 0)
/*
 * The REPORT LUN did not scan the target,
 * do a sequential scan.
 */
scsi_sequential_lun_scan(starget, bflags,
 starget->scsi_level, rescan);
}

 out_reap:
scsi_autopm_put_target(starget);
/* now determine if the target has any children at all
 * and if not, nuke it */
scsi_target_reap(starget);

So we don't actually need an sdev to invoke a report lun scan;
all we require is a correct return code from scsi_probe_and_add_lun().
Which we'll get irrespective of the PQ setting.
So from what I'm seeing this case should be covered.
Unless I'm missing something ...


On 03/15/2013 10:46 AM, Hannes Reinecke wrote:

SAM advertises the use of a Well-known LUN (W_LUN) for scanning.
As this avoids exposing LUN 0 (which might be a valid LUN) for
all initiators it is the preferred method for LUN scanning on
some arrays.
So we should be using W_LUN for scanning, too. If the W_LUN is
not supported we'll fall back to use LUN 0.
For broken W_LUN implementations a new blacklist flag
'BLIST_NO_WLUN' is added.

Signed-off-by: Hannes Reinecke 

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 3e58b22..f4ccdea 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1312,6 +1312,7 @@ static int scsi_report_lun_scan(struct
scsi_target *starget, int bflags,
  unsigned int num_luns;
  unsigned int retries;
  int result;
+int w_lun = SCSI_W_LUN_REPORT_LUNS;
  struct scsi_lun *lunp, *lun_data;
  u8 *data;
  struct scsi_sense_hdr sshdr;
@@ -1337,11 +1338,20 @@ static int scsi_report_lun_scan(struct
scsi_target *starget, int bflags,
  return 0;
  if (starget->no_report_luns)
  return 1;
+if (bflags & BLIST_NO_WLUN)
+w_lun = 0;

+retry_report_lun_scan:
  if (!(sdev = scsi_device_lookup_by_target(starget, 0))) {
-sdev = scsi_alloc_sdev(starget, 0, NULL);
-if (!sdev)
-return 0;
+sdev = scsi_alloc_sdev(starget, w_lun, NULL);
+if (!sdev) {
+if (w_lun != 0) {
+w_lun = 0;
+sdev = scsi_alloc_sdev(starget, w_lun, NULL);
+}



+if (!sdev)
+return 0;


Semantically this belongs to the 2nd scsi_alloc_sdev(starget, w_lun,
NULL) i.e. inside (at the end of) the body of if (w_lun != 0).
Then you can even merge the outer and inner if statement to
if (!sdev && wlun != 0)
Semantics: WLUN did not work and we haven't already tried LUN0.
Maybe it's just me but I found it more difficult to read if the 2nd
check is on its own inside the outer if statement with exactly the
same boolean expression.



+}
  if (scsi_device_get(sdev)) {
  __scsi_remove_device(sdev);
  return 0;


Now th

Possible reentrancy issue in be_iopoll

2013-03-18 Thread Shlomo Pongratz

Hi Jayamohan.

I think that there is a reentrancy issue in 
"drivers/scsi/be2iscsi/be_main.c::be_iopoll".
The driver creates "NAPI" context per core which is fine, however the 
above routine declares the ret variable as static!
Thus there is only one instance of this variable, so if this routine is 
called from more than one thread of execution the result is unpredictable.


static unsigned int ret;
.

ret = beiscsi_process_cq(pbe_eq);
<--Another 
thread can enter here and change "ret".

if (ret < budget) {
   
}
<--Another 
thread can enter here and change "ret".

return ret;

I can't submit a patch without checking it first, which is impossible 
since I don't have the HW, but the fix seems trivial, just take out the 
static.


Best regards,

S.P.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] aacraid: Dual firmware image support

2013-03-18 Thread Tomas Henzl
On 03/18/2013 05:58 AM, Mahesh Rajashekhara wrote:
> This patch adds dual flash firmware support for Series 7 and above 
> controllers.
>
> Signed-off-by: Mahesh Rajashekhara 
> ---
>  drivers/scsi/aacraid/aacraid.h  |6 +-
>  drivers/scsi/aacraid/comminit.c |2 +-
>  drivers/scsi/aacraid/src.c  |   31 ++-
>  3 files changed, 36 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
> index a6f7190..9323d05 100644
> --- a/drivers/scsi/aacraid/aacraid.h
> +++ b/drivers/scsi/aacraid/aacraid.h
> @@ -12,7 +12,7 @@
>   
> **/
>  
>  #ifndef AAC_DRIVER_BUILD
> -# define AAC_DRIVER_BUILD 3
> +# define AAC_DRIVER_BUILD 30200
>  # define AAC_DRIVER_BRANCH "-ms"
>  #endif
>  #define MAXIMUM_NUM_CONTAINERS   32
> @@ -1918,6 +1918,10 @@ extern struct aac_common aac_config;
>  #define  MONITOR_PANIC   0x0020
>  #define  KERNEL_UP_AND_RUNNING   0x0080
>  #define  KERNEL_PANIC0x0100
> +#define  FLASH_UPD_PENDING   0x2000
> +#define  FLASH_UPD_SUCCESS   0x4000
> +#define  FLASH_UPD_FAILED0x8000
> +#define  FWUPD_TIMEOUT   (5 * 60)
>  
>  /*
>   *   Doorbell bit defines
> diff --git a/drivers/scsi/aacraid/comminit.c b/drivers/scsi/aacraid/comminit.c
> index 3f75995..177b094 100644
> --- a/drivers/scsi/aacraid/comminit.c
> +++ b/drivers/scsi/aacraid/comminit.c
> @@ -214,7 +214,7 @@ int aac_send_shutdown(struct aac_dev * dev)
>   cmd = (struct aac_close *) fib_data(fibctx);
>  
>   cmd->command = cpu_to_le32(VM_CloseAll);
> - cmd->cid = cpu_to_le32(0x);
> + cmd->cid = cpu_to_le32(0xfffe);
>  
>   status = aac_fib_send(ContainerCommand,
> fibctx,
> diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
> index e2e3492..b8be2ab 100644
> --- a/drivers/scsi/aacraid/src.c
> +++ b/drivers/scsi/aacraid/src.c
> @@ -685,6 +685,7 @@ int aac_srcv_init(struct aac_dev *dev)
>   unsigned long status;
>   int restart = 0;
>   int instance = dev->id;
> + int waitCount;
>   const char *name = dev->name;
>  
>   dev->a_ops.adapter_ioremap = aac_srcv_ioremap;
> @@ -703,6 +704,32 @@ int aac_srcv_init(struct aac_dev *dev)
>   !aac_src_restart_adapter(dev, 0))
>   ++restart;
>   /*
> +  *  Check to see if flash update is running.
> +  *  Wait for the adapter to be up and running. Wait up to 5 minutes
> +  */
> + status = src_readl(dev, MUnit.OMR);
> + if (status & FLASH_UPD_PENDING) {
> + start = jiffies;
> + do {
> + status = src_readl(dev, MUnit.OMR);
> + if (time_after(jiffies, start+HZ*FWUPD_TIMEOUT)) {
> + printk(KERN_ERR "%s%d: adapter flash update 
> failed.\n",
> + dev->name, instance);
> + goto error_iounmap;
> + }
> + } while (!(status & FLASH_UPD_SUCCESS) &&
> +  !(status & FLASH_UPD_FAILED));
> + /* Delay 10 seconds.
> +  * Because right now FW is doing a soft reset,
> +  * do not read scratch pad register at this time
> +  */
> + waitCount = 10 * 1;
> + while (waitCount) {
> + udelay(100);/* delay 100 microseconds */
> + waitCount--;

Hi Mahesh,
what is the reason for udelay here ? 
Maybe a ssleep (10); does the same job.
Regards, Tomas

> + }
> + }
> + /*
>*  Check to see if the board panic'd while booting.
>*/
>   status = src_readl(dev, MUnit.OMR);
> @@ -730,7 +757,9 @@ int aac_srcv_init(struct aac_dev *dev)
>   /*
>*  Wait for the adapter to be up and running. Wait up to 3 minutes
>*/
> - while (!((status = src_readl(dev, MUnit.OMR)) & KERNEL_UP_AND_RUNNING)) 
> {
> + while (!((status = src_readl(dev, MUnit.OMR)) &
> + KERNEL_UP_AND_RUNNING) ||
> + status == 0x) {
>   if ((restart &&
> (status & (KERNEL_PANIC|SELF_TEST_FAILED|MONITOR_PANIC))) ||
> time_after(jiffies, start+HZ*startup_timeout)) {

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [SCSI]: print the msgbytes and statusbyte from scsi result

2013-03-18 Thread Namjae Jeon
2013/3/18, James Bottomley :
> On Sun, 2013-03-17 at 17:29 +0900, Namjae Jeon wrote:
>> From: Namjae Jeon 
>>
>> Introduce msgbyte and statusbyte in the prints as part of the
>> result which is returned by the lower layer driver in response to
>> SCSI command issued, in case of any error conditions.
>>
>> Purpose of adding these prints is to convey, during any I/O
>> error case, which condition exactly has happened in lower device and
>> from the prints we can directly deduce, what is the status of command
>> issued. This will help to quickly debug the scenario and also making
>> a test case to create new scenarios.
>>
>> Also change the printk to more appropriate pr_* macro.
>>
>> Signed-off-by: Namjae Jeon 
>> Signed-off-by: Amit Sahrawat 
>> ---
>>  drivers/scsi/constants.c |6 --
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c
>> index 76e4c03..77bb1dc 100644
>> --- a/drivers/scsi/constants.c
>> +++ b/drivers/scsi/constants.c
>> @@ -1445,8 +1445,10 @@ void scsi_show_result(int result)
>>
>>  void scsi_show_result(int result)
>>  {
>> -printk("Result: hostbyte=0x%02x driverbyte=0x%02x\n",
>> -   host_byte(result), driver_byte(result));
>> +pr_info("Result: hostbyte=0x%02x driverbyte=0x%02x"
>> +"msgbyte=0x%02x statusbyte=0x%02x\n",
>> +   host_byte(result), driver_byte(result), msg_byte(result),
>> +status_byte(result));
Hi James.
>
> You didn't test this, did you? If you did, you'd have noticed the change
> from printk to pr_info gives you an unwanted "6" in the message.
Yes, we tested with "printk" version of the patch, but before sending
the patch, when we checked for issues using "checkpatch.pl" it showed
warning.
So, we thought that to be a cosmetic change and replaced the printk
with  pr_info. Sorry, if below your douting is clear, I will change
log level as current one.

>
> Also, what are you hoping to achieve? scsi_show_result() is only used by
> sd in a very few special command situations.  I can't believe the msg
> byte would be anything other than zero and the status byte check
> condition.
Regarding the introduction of additional information in prints.
We encountered an error with error logs like:
[ 131.673096] sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
[ 131.679038] sd 0:0:0:0: [sda] Sense Key : 0xb [current]
[ 131.684801] sd 0:0:0:0: [sda] ASC=0x8 ASCQ=0x3
[ 131.689241] sd 0:0:0:0: [sda] CDB: cdb[0]=0x2a: 2a 00 00 cb 0c 00 00 00 f0 00

Looking at the logs it was clear it was due "ABORTED command" but we
wanted to check in the code if there was any retry in such case:
In ‘scsi_decide_disposition’ there are "3" main conditions

switch (host_byte(scmd->result)) -> this returned "DID_OK"
if (msg_byte(scmd->result) != COMMAND_COMPLETE)
return FAILED;

And the last was:
switch (status_byte(scmd->result)) {
…
case TASK_ABORTED:
goto maybe_retry;
case CHECK_CONDITION:
…

So, if the status/host bytes were known - we could have directly
deduced from the code. Instead we needed to introduce prints and then
check the path.

Thanks.
>
> James
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 53281] megaraid_mbox kernel panic during boot

2013-03-18 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=53281


justgivemeafkenaccount...@yahoo.co.uk changed:

   What|Removed |Added

 CC||justgivemeafkenaccountplx@y
   ||ahoo.co.uk




--- Comment #10 from justgivemeafkenaccount...@yahoo.co.uk  2013-03-18 11:52:33 
---
I'm getting a very similar kernel dump on an LSI MegaRaid SATA 300-8X card.
(1000:0409) This card also uses the megaraid_mbox driver. Boots fine, but
crashes consistently when running badblocks -wsv over an array, usually within
an hour. Sometimes it will dump to console and flash the keyboard lights,
sometimes it just hangs.

Server is running nothing else at the moment beyond the basics, and the LSI
currently has no filesystems. I have no reason to suspect the card or the
server as both were running Win2003 server for the last 5 - 6 years without any
issues beyond the fact Windows was running on it :)

Ubuntu 12.04 i386 server, same results with Ubuntu kernel 3.5.0-25 and latest
compiled kernel 3.8.3. If there is any interest here I can post the kernel
dump, hardware details, etc. In a nutshell it's a single hyperthread P4 Xeon
2.4GHz on an Intel PCI-X server board with one jigglybyte of DDR1 ECC. LSI card
is running four 200GB SATA drives in a RAID5, configured with Write Through
caching and DirectIO, 128MB of cache, no backup battery.

I have tried various kernel flags to no avail but seemed to have some success
when I turned off all the performance enhancing settings in the LSI BIOS.
(Multiple PCI delayed transactions, command queuing, HDD write caching) It got
4 - 5 hours through badblocks without crashing but I stopped it as it was
taking forever. Could well have been just a result of decreased load though.
Turning hyperthread on/off makes no difference.

Currently testing various different settings in the LSI BIOS but it's slow
going. Any help would be appreciated here.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/1] aacraid: Dual firmware image support

2013-03-18 Thread Mahesh Rajashekhara
This patch adds dual flash firmware support for Series 7 and above controllers.

Signed-off-by: Mahesh Rajashekhara 
---
 drivers/scsi/aacraid/aacraid.h  |6 +-
 drivers/scsi/aacraid/comminit.c |2 +-
 drivers/scsi/aacraid/src.c  |   31 ++-
 3 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
index a6f7190..9323d05 100644
--- a/drivers/scsi/aacraid/aacraid.h
+++ b/drivers/scsi/aacraid/aacraid.h
@@ -12,7 +12,7 @@
  
**/
 
 #ifndef AAC_DRIVER_BUILD
-# define AAC_DRIVER_BUILD 3
+# define AAC_DRIVER_BUILD 30200
 # define AAC_DRIVER_BRANCH "-ms"
 #endif
 #define MAXIMUM_NUM_CONTAINERS 32
@@ -1918,6 +1918,10 @@ extern struct aac_common aac_config;
 #defineMONITOR_PANIC   0x0020
 #defineKERNEL_UP_AND_RUNNING   0x0080
 #defineKERNEL_PANIC0x0100
+#defineFLASH_UPD_PENDING   0x2000
+#defineFLASH_UPD_SUCCESS   0x4000
+#defineFLASH_UPD_FAILED0x8000
+#defineFWUPD_TIMEOUT   (5 * 60)
 
 /*
  * Doorbell bit defines
diff --git a/drivers/scsi/aacraid/comminit.c b/drivers/scsi/aacraid/comminit.c
index 3f75995..177b094 100644
--- a/drivers/scsi/aacraid/comminit.c
+++ b/drivers/scsi/aacraid/comminit.c
@@ -214,7 +214,7 @@ int aac_send_shutdown(struct aac_dev * dev)
cmd = (struct aac_close *) fib_data(fibctx);
 
cmd->command = cpu_to_le32(VM_CloseAll);
-   cmd->cid = cpu_to_le32(0x);
+   cmd->cid = cpu_to_le32(0xfffe);
 
status = aac_fib_send(ContainerCommand,
  fibctx,
diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
index e2e3492..b8be2ab 100644
--- a/drivers/scsi/aacraid/src.c
+++ b/drivers/scsi/aacraid/src.c
@@ -685,6 +685,7 @@ int aac_srcv_init(struct aac_dev *dev)
unsigned long status;
int restart = 0;
int instance = dev->id;
+   int waitCount;
const char *name = dev->name;
 
dev->a_ops.adapter_ioremap = aac_srcv_ioremap;
@@ -703,6 +704,32 @@ int aac_srcv_init(struct aac_dev *dev)
!aac_src_restart_adapter(dev, 0))
++restart;
/*
+*  Check to see if flash update is running.
+*  Wait for the adapter to be up and running. Wait up to 5 minutes
+*/
+   status = src_readl(dev, MUnit.OMR);
+   if (status & FLASH_UPD_PENDING) {
+   start = jiffies;
+   do {
+   status = src_readl(dev, MUnit.OMR);
+   if (time_after(jiffies, start+HZ*FWUPD_TIMEOUT)) {
+   printk(KERN_ERR "%s%d: adapter flash update 
failed.\n",
+   dev->name, instance);
+   goto error_iounmap;
+   }
+   } while (!(status & FLASH_UPD_SUCCESS) &&
+!(status & FLASH_UPD_FAILED));
+   /* Delay 10 seconds.
+* Because right now FW is doing a soft reset,
+* do not read scratch pad register at this time
+*/
+   waitCount = 10 * 1;
+   while (waitCount) {
+   udelay(100);/* delay 100 microseconds */
+   waitCount--;
+   }
+   }
+   /*
 *  Check to see if the board panic'd while booting.
 */
status = src_readl(dev, MUnit.OMR);
@@ -730,7 +757,9 @@ int aac_srcv_init(struct aac_dev *dev)
/*
 *  Wait for the adapter to be up and running. Wait up to 3 minutes
 */
-   while (!((status = src_readl(dev, MUnit.OMR)) & KERNEL_UP_AND_RUNNING)) 
{
+   while (!((status = src_readl(dev, MUnit.OMR)) &
+   KERNEL_UP_AND_RUNNING) ||
+   status == 0x) {
if ((restart &&
  (status & (KERNEL_PANIC|SELF_TEST_FAILED|MONITOR_PANIC))) ||
  time_after(jiffies, start+HZ*startup_timeout)) {
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [SCSI]: print the msgbytes and statusbyte from scsi result

2013-03-18 Thread James Bottomley
On Sun, 2013-03-17 at 17:29 +0900, Namjae Jeon wrote:
> From: Namjae Jeon 
> 
> Introduce msgbyte and statusbyte in the prints as part of the
> result which is returned by the lower layer driver in response to
> SCSI command issued, in case of any error conditions.
> 
> Purpose of adding these prints is to convey, during any I/O
> error case, which condition exactly has happened in lower device and
> from the prints we can directly deduce, what is the status of command
> issued. This will help to quickly debug the scenario and also making
> a test case to create new scenarios.
> 
> Also change the printk to more appropriate pr_* macro.
> 
> Signed-off-by: Namjae Jeon 
> Signed-off-by: Amit Sahrawat 
> ---
>  drivers/scsi/constants.c |6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c
> index 76e4c03..77bb1dc 100644
> --- a/drivers/scsi/constants.c
> +++ b/drivers/scsi/constants.c
> @@ -1445,8 +1445,10 @@ void scsi_show_result(int result)
>  
>  void scsi_show_result(int result)
>  {
> - printk("Result: hostbyte=0x%02x driverbyte=0x%02x\n",
> -host_byte(result), driver_byte(result));
> + pr_info("Result: hostbyte=0x%02x driverbyte=0x%02x"
> + "msgbyte=0x%02x statusbyte=0x%02x\n",
> +host_byte(result), driver_byte(result), msg_byte(result),
> + status_byte(result));

You didn't test this, did you? If you did, you'd have noticed the change
from printk to pr_info gives you an unwanted "6" in the message.

Also, what are you hoping to achieve? scsi_show_result() is only used by
sd in a very few special command situations.  I can't believe the msg
byte would be anything other than zero and the status byte check
condition.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi_transport_fc: Make 'port_state' writeable

2013-03-18 Thread Hannes Reinecke

On 03/15/2013 08:13 PM, Bart Van Assche wrote:

On 03/15/13 19:51, Mike Christie wrote:

On 03/15/2013 08:41 AM, Bart Van Assche wrote:

How about using the value of scsi_cmnd.jiffies_at_alloc to finish
only
those SCSI commands in the host reset handler that exceeded a
certain
processing time ?


We basically do this now. When a scsi command times out the scsi
layer
blocks the host from processing new commands and waits for all
outstanding commands to either finish normally or timeout. When all
commands have finished or timedout, then we start the scsi eh
code. So,
by the time we have go to the scsi eh callbacks we are in a state
where
all the commands being processed by the eh have exceeded a certain
processing time.

If you mean you want to drop the block and wait part, then I think it
could speed things up to do the abort callbacks while other IO is
running (as long as the driver can support it). However if the abort
fails and you need to escalate to operations like resets which
interfere
with multiple commands, then the driver/scsi-ml does not have much
choice in what it does cleanup wise. There would be no point in
checking
the jiffies_at_alloc. The commands that are going to be affected
by the
tmf or host reset operation must be returned to the scsi-ml for
retries
or failure upwards.


Hello Mike,

It seems like there is a misunderstanding. With my comment I was not
referring to the SCSI ML but to the SCSI LLD. LLD drivers like
ib_srp keep track of outstanding SCSI requests. With the SRP
protocol it is possible to tell the InfiniBand HCA not to deliver
completions for outstanding requests by closing the connection used
for SRP communication. Hence my suggestion to finish SCSI commands
that were queued longer than a certain time ago from inside the LLD
host reset handler. I'm not sure though whether all types of FC
HBA's allow something equivalent.


Well, this is not quite identical to what I've been trying to
achieve with this patch.
This patch is for an individual rport which has gone out to lunch.
Sure we could down the link from the HBA, but that would terminate
I/O to _all_ connected rports, not just the malfunctioning one.
So that wouldn't help us here.

The closest equivalent to that would be a port logout; however, as 
discussed in the I_T nexus reset thread we would need another 
callout to the LLDs here as this definitely needs LLD support

and none of the current LLDs have it implemented.

Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html