Re: [PATCH] mpt2sas: setpci reset kernel panic fix

2015-06-16 Thread Johannes Thumshirn
On Wed, Jun 17, 2015 at 11:37:53AM +0530, Nagarajkumar Narayanan wrote:
> Problem Description:
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=95101
> 
> Due to a lack of synchronization between the ioctl path, BRM status access,
> and PCI resource removal, a kernel oops occurs because the ioctl path and the
> BRM status access path still try to access the removed resources.
> 
> kernel: BUG: unable to handle kernel paging request at c900171e
> 
> Oops:  [#1] SMP
> 
> 
> Patch Description:
> 
> Two locks are added to provide synchronization:
> 
> 1. pci_access_mutex: Mutex to synchronize the ioctl and sysfs show paths with
> PCI resource handling. Freeing PCI resources releases vital
> hardware/memory resources that might still be in use by the cli/sysfs
> path functions, resulting in a NULL pointer dereference followed by a kernel
> crash. To avoid this race condition, a mutex serializes the cli/sysfs show
> paths against PCI resource freeing.
> 
> 2. Spinlock on list operations over IOCs
> 
> Case: multiple WarpDrive cards (IOCs) are in use.
> Each IOC is added to the ioc list structure on initialization.
> Watchdog threads run at regular intervals to check each IOC for
> fault conditions; a fault triggers the dead_ioc thread to
> deallocate the PCI resources, which deletes the IOC entry from the list.
> This deletion needs to be protected by a spinlock to ensure that
> IOC removal is synchronized. If it is not synchronized, it might lead to
> list_del corruption, as the ioc list is traversed in the cli path.
> 
> 
> 
> Please find the patch to apply to the scsi/mpt2sas driver:
> 
> http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
> 
> 
> From ba692140278e6e2b660896c32206b26dac98d215 Mon Sep 17 00:00:00 2001
> From: Nagarajkumar Narayanan 
> Date: Thu, 19 Mar 2015 12:02:07 +0530
> Subject: [PATCH] mpt2sas setpci kernel oops fix
> 
> Signed-off-by: Nagarajkumar Narayanan 
> ---
>  drivers/scsi/mpt2sas/mpt2sas_base.c  |   10 +++
>  drivers/scsi/mpt2sas/mpt2sas_base.h  |   20 +-
>  drivers/scsi/mpt2sas/mpt2sas_ctl.c   |   48 +
>  drivers/scsi/mpt2sas/mpt2sas_scsih.c |   32 ++-
>  4 files changed, 102 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
> b/drivers/scsi/mpt2sas/mpt2sas_base.c
> index 11248de..d2a498c 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> @@ -108,13 +108,18 @@ _scsih_set_fwfault_debug(const char *val, struct
> kernel_param *kp)
>  {
>   int ret = param_set_int(val, kp);
>   struct MPT2SAS_ADAPTER *ioc;
> + unsigned long flags;
> 
>   if (ret)
>   return ret;
> 
> + /* global ioc spinlock to protect controller list on list operations */
> + mpt2sas_initialize_gioc_lock();
>   printk(KERN_INFO "setting fwfault_debug(%d)\n", mpt2sas_fwfault_debug);
> + spin_lock_irqsave(&gioc_lock, flags);
>   list_for_each_entry(ioc, &mpt2sas_ioc_list, list)
>   ioc->fwfault_debug = mpt2sas_fwfault_debug;
> + spin_unlock_irqrestore(&gioc_lock, flags);
>   return 0;
>  }
> 
> @@ -4436,6 +4441,9 @@ mpt2sas_base_free_resources(struct MPT2SAS_ADAPTER *ioc)
>  __func__));
> 
>   if (ioc->chip_phys && ioc->chip) {
> + /* synchronizing freeing resource with pci_access_mutex lock */
> + if (ioc->is_warpdrive)
> + mutex_lock(&ioc->pci_access_mutex);
>   _base_mask_interrupts(ioc);
>   ioc->shost_recovery = 1;
>   _base_make_ioc_ready(ioc, CAN_SLEEP, SOFT_RESET);
> @@ -4454,6 +4462,8 @@ mpt2sas_base_free_resources(struct MPT2SAS_ADAPTER *ioc)
>   pci_disable_pcie_error_reporting(pdev);
>   pci_disable_device(pdev);
>   }
> + if (ioc->is_warpdrive)
> + mutex_unlock(&ioc->pci_access_mutex);
>   return;
>  }
> 
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h
> b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..a0d26f0 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -799,6 +799,12 @@ typedef void (*MPT2SAS_FLUSH_RUNNING_CMDS)(struct
> MPT2SAS_ADAPTER *ioc);
>   * @delayed_tr_list: target reset link list
>   * @delayed_tr_volume_list: volume target reset link list
>   * @@temp_sensors_count: flag to carry the number of temperature sensors
> + * @pci_access_mutex: Mutex to synchronize ioctl,sysfs show path and
> + * pci resource handling. PCI resource freeing will lead to free
> + * vital hardware/memory resource, which might be in use by cli/sysfs
> + * path functions resulting in Null pointer reference followed by kernel
> + * crash. To avoid the above race condition we use mutex syncrhonization
> + * which ensures the syncrhonization between cli/sysfs_show path
>   */
>  struct MPT2SAS_ADAPTER {
>   struct list_head list;
> @@ -1015,6 +1021,7 @@ struct MPT2SAS_ADAPTER {
>   u8 mfg_pg10_hide_flag;
>   u8 hide_drives;
> 
> + struct mutex pci_access_mutex;
>  };
> 
>  typedef u8 (*MPT_CALLBACK)(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8
> msix_index,
> @@ -1023,6 +1030,17 @@ typedef u8 (*MPT_CALLBACK)(struct
> MPT2SAS_ADAPTER *ioc, u16

Re: [PATCH 0/6] target: Update UA handling

2015-06-16 Thread Hannes Reinecke
On 06/17/2015 08:10 AM, Nicholas A. Bellinger wrote:
> On Thu, 2015-06-11 at 10:01 +0200, Hannes Reinecke wrote:
>> Hi Nic,
>>
>> lio-target is very minimalistic when it comes to generating UAs;
>> primarily they are generated for persistent reservations, but
>> generic changes tend to be ignored.
>>
>> This patchset updates the UA handling and generates UA for internal
>> state changes (REPORTED LUNS DATA CHANGED, INQUIRY DATA CHANGED,
>> and LUN RESET OCCURRED).
>>
>> Funnily enough this triggers some issues with the SCSI stack;
>> I'll be sending out patches for that, too.
>>
>> Hannes Reinecke (6):
>>   target_core_alua: Correct UA handling when switching states
>>   target: Remove 'ua_nacl' pointer from se_ua structure
>>   target: use 'se_dev_entry' when allocating UAs
>>   target: Send UA on ALUA target port group change
>>   target: Send UA upon LUN RESET tmr completion
>>   target: Send UA when changing LUN inventory
>>
>>  drivers/target/target_core_alua.c      | 56 +-
>>  drivers/target/target_core_device.c    | 26 +++-
>>  drivers/target/target_core_pr.c        | 31 +++
>>  drivers/target/target_core_transport.c | 29 ++
>>  drivers/target/target_core_ua.c        | 24 ++-
>>  drivers/target/target_core_ua.h        |  5 ++-
>>  include/target/target_core_base.h      |  1 -
>>  7 files changed, 121 insertions(+), 51 deletions(-)
>>
> 
> Applied to target-pending/for-next, with the extra incremental patch for
> a common target_ua_alloc_lun() caller.
> 
> Btw, very happy to see REPORTED_LUNS_DATA_HAS_CHANGED support included
> for v4.2-rc1 code.  8-)
> 
Yeah; I needed a quick testbed for my ALUA update, and thought that
tcm_loop would fit the bill.

As it turned out, not quite. Hence the patches.

BTW: the main issue I have with current lio-target is that you can
only configure it _after_ the target has been enabled.

I.e., if you want to add another ALUA state you have to create another
TPG, and set this to the required ALUA state.
But you can modify the TPG allegiance only _after_ the LUN has been
created and is visible to the host.
Which means that the initiator inevitably sees both states, and it's
impossible to have the LUN start off with an ALUA state other than
the default.
(This is especially important if one would want to test the READ
CAPACITY support in ALUA standby state).

Would you be okay with changing that?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/6] target: use 'se_dev_entry' when allocating UAs

2015-06-16 Thread Hannes Reinecke
On 06/17/2015 08:06 AM, Nicholas A. Bellinger wrote:
> Hey Hannes,
> 
> Apologies for the delayed follow-up on these, one comment below.
> 
> On Thu, 2015-06-11 at 10:01 +0200, Hannes Reinecke wrote:
>> We need to use 'se_dev_entry' as argument when allocating
>> UAs, otherwise we'll never see any UAs for an implicit
>> ALUA state transition triggered from userspace.
>>
>> Signed-off-by: Hannes Reinecke 
>> ---
>>  drivers/target/target_core_alua.c      | 27 ++-
>>  drivers/target/target_core_pr.c        | 31 +--
>>  drivers/target/target_core_transport.c | 18 --
>>  drivers/target/target_core_ua.c        | 23 +++
>>  drivers/target/target_core_ua.h        |  2 +-
>>  5 files changed, 59 insertions(+), 42 deletions(-)
>>
> 
> 
> 
>> diff --git a/drivers/target/target_core_pr.c 
>> b/drivers/target/target_core_pr.c
>> index 436e30b..bb28a97 100644
>> --- a/drivers/target/target_core_pr.c
>> +++ b/drivers/target/target_core_pr.c
>> @@ -125,6 +125,25 @@ static struct t10_pr_registration 
>> *core_scsi3_locate_pr_reg(struct se_device *,
>>  struct se_node_acl *, struct se_session 
>> *);
>>  static void core_scsi3_put_pr_reg(struct t10_pr_registration *);
>>  
>> +static void core_scsi3_pr_ua_allocate(struct se_node_acl *nacl,
>> +  u32 unpacked_lun, u8 asc, u8 ascq)
>> +{
>> +struct se_dev_entry *deve;
>> +
>> +if (!nacl)
>> +return;
>> +
>> +rcu_read_lock();
>> +deve = target_nacl_find_deve(nacl, unpacked_lun);
>> +if (!deve) {
>> +rcu_read_unlock();
>> +return;
>> +}
>> +
>> +core_scsi3_ua_allocate(deve, asc, ascq);
>> +rcu_read_unlock();
>> +}
>> +
> 
> This should be common for the TCM_RESERVATION_CONFLICT case outside of PR
> code too.
> 
> Any objections to squashing the following into your original patch..?
> 
> Thank you,
> 
> --nab
> 
[ .. ]

None at all.
Do go ahead.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH 0/6] target: Update UA handling

2015-06-16 Thread Nicholas A. Bellinger
On Thu, 2015-06-11 at 10:01 +0200, Hannes Reinecke wrote:
> Hi Nic,
> 
> lio-target is very minimalistic when it comes to generating UAs;
> primarily they are generated for persistent reservations, but
> generic changes tend to be ignored.
> 
> This patchset updates the UA handling and generates UA for internal
> state changes (REPORTED LUNS DATA CHANGED, INQUIRY DATA CHANGED,
> and LUN RESET OCCURRED).
> 
> Funnily enough this triggers some issues with the SCSI stack;
> I'll be sending out patches for that, too.
> 
> Hannes Reinecke (6):
>   target_core_alua: Correct UA handling when switching states
>   target: Remove 'ua_nacl' pointer from se_ua structure
>   target: use 'se_dev_entry' when allocating UAs
>   target: Send UA on ALUA target port group change
>   target: Send UA upon LUN RESET tmr completion
>   target: Send UA when changing LUN inventory
> 
>  drivers/target/target_core_alua.c      | 56 +-
>  drivers/target/target_core_device.c    | 26 +++-
>  drivers/target/target_core_pr.c        | 31 +++
>  drivers/target/target_core_transport.c | 29 ++
>  drivers/target/target_core_ua.c        | 24 ++-
>  drivers/target/target_core_ua.h        |  5 ++-
>  include/target/target_core_base.h      |  1 -
>  7 files changed, 121 insertions(+), 51 deletions(-)
> 

Applied to target-pending/for-next, with the extra incremental patch for
a common target_ua_alloc_lun() caller.

Btw, very happy to see REPORTED_LUNS_DATA_HAS_CHANGED support included
for v4.2-rc1 code.  8-)

Thanks Hannes!

--nab



[PATCH] mpt2sas: setpci reset kernel panic fix

2015-06-16 Thread Nagarajkumar Narayanan
Problem Description:

https://bugzilla.kernel.org/show_bug.cgi?id=95101

Due to a lack of synchronization between the ioctl path, BRM status access,
and PCI resource removal, a kernel oops occurs because the ioctl path and the
BRM status access path still try to access the removed resources.

kernel: BUG: unable to handle kernel paging request at c900171e

Oops:  [#1] SMP


Patch Description:

Two locks are added to provide synchronization:

1. pci_access_mutex: Mutex to synchronize the ioctl and sysfs show paths with
PCI resource handling. Freeing PCI resources releases vital
hardware/memory resources that might still be in use by the cli/sysfs
path functions, resulting in a NULL pointer dereference followed by a kernel
crash. To avoid this race condition, a mutex serializes the cli/sysfs show
paths against PCI resource freeing.

2. Spinlock on list operations over IOCs

Case: multiple WarpDrive cards (IOCs) are in use.
Each IOC is added to the ioc list structure on initialization.
Watchdog threads run at regular intervals to check each IOC for
fault conditions; a fault triggers the dead_ioc thread to
deallocate the PCI resources, which deletes the IOC entry from the list.
This deletion needs to be protected by a spinlock to ensure that
IOC removal is synchronized. If it is not synchronized, it might lead to
list_del corruption, as the ioc list is traversed in the cli path.
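
As an illustration of item 1 only (this is not code from the patch), here is a
minimal userspace sketch of the pattern, assuming pthread mutexes in place of
the kernel's struct mutex. The struct and function names (demo_ioc,
demo_free_resources, demo_read_status) are hypothetical, and the real driver
has more state than a single pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the adapter; not the real struct MPT2SAS_ADAPTER. */
struct demo_ioc {
	pthread_mutex_t pci_access_mutex; /* serializes users against teardown */
	unsigned int *chip;               /* stands in for the mapped registers */
};

void demo_ioc_init(struct demo_ioc *ioc)
{
	assert(ioc != NULL);
	pthread_mutex_init(&ioc->pci_access_mutex, NULL);
	ioc->chip = calloc(1, sizeof(*ioc->chip));
}

/* Teardown path: free the resource and clear the pointer under the mutex. */
void demo_free_resources(struct demo_ioc *ioc)
{
	assert(ioc != NULL);
	pthread_mutex_lock(&ioc->pci_access_mutex);
	free(ioc->chip);
	ioc->chip = NULL;
	pthread_mutex_unlock(&ioc->pci_access_mutex);
}

/*
 * ioctl/sysfs-style reader: take the same mutex, then check that the
 * resource still exists before touching it. Returns 0 on success, or -1
 * if the resource has already been removed.
 */
int demo_read_status(struct demo_ioc *ioc, unsigned int *out)
{
	int ret = -1;

	assert(ioc != NULL && out != NULL);
	pthread_mutex_lock(&ioc->pci_access_mutex);
	if (ioc->chip) {
		*out = *ioc->chip;
		ret = 0;
	}
	pthread_mutex_unlock(&ioc->pci_access_mutex);
	return ret;
}
```

The point of the sketch is that the reader both takes the lock and re-checks
the pointer; taking the lock alone would not help a reader that runs after
teardown has finished.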



Please find the patch to apply to the scsi/mpt2sas driver:

http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/


From ba692140278e6e2b660896c32206b26dac98d215 Mon Sep 17 00:00:00 2001
From: Nagarajkumar Narayanan 
Date: Thu, 19 Mar 2015 12:02:07 +0530
Subject: [PATCH] mpt2sas setpci kernel oops fix

Signed-off-by: Nagarajkumar Narayanan 
---
 drivers/scsi/mpt2sas/mpt2sas_base.c  |   10 +++
 drivers/scsi/mpt2sas/mpt2sas_base.h  |   20 +-
 drivers/scsi/mpt2sas/mpt2sas_ctl.c   |   48 +
 drivers/scsi/mpt2sas/mpt2sas_scsih.c |   32 ++-
 4 files changed, 102 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
b/drivers/scsi/mpt2sas/mpt2sas_base.c
index 11248de..d2a498c 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -108,13 +108,18 @@ _scsih_set_fwfault_debug(const char *val, struct
kernel_param *kp)
 {
  int ret = param_set_int(val, kp);
  struct MPT2SAS_ADAPTER *ioc;
+ unsigned long flags;

  if (ret)
  return ret;

+ /* global ioc spinlock to protect controller list on list operations */
+ mpt2sas_initialize_gioc_lock();
  printk(KERN_INFO "setting fwfault_debug(%d)\n", mpt2sas_fwfault_debug);
+ spin_lock_irqsave(&gioc_lock, flags);
  list_for_each_entry(ioc, &mpt2sas_ioc_list, list)
  ioc->fwfault_debug = mpt2sas_fwfault_debug;
+ spin_unlock_irqrestore(&gioc_lock, flags);
  return 0;
 }

@@ -4436,6 +4441,9 @@ mpt2sas_base_free_resources(struct MPT2SAS_ADAPTER *ioc)
 __func__));

  if (ioc->chip_phys && ioc->chip) {
+ /* synchronizing freeing resource with pci_access_mutex lock */
+ if (ioc->is_warpdrive)
+ mutex_lock(&ioc->pci_access_mutex);
  _base_mask_interrupts(ioc);
  ioc->shost_recovery = 1;
  _base_make_ioc_ready(ioc, CAN_SLEEP, SOFT_RESET);
@@ -4454,6 +4462,8 @@ mpt2sas_base_free_resources(struct MPT2SAS_ADAPTER *ioc)
  pci_disable_pcie_error_reporting(pdev);
  pci_disable_device(pdev);
  }
+ if (ioc->is_warpdrive)
+ mutex_unlock(&ioc->pci_access_mutex);
  return;
 }

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h
b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..a0d26f0 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -799,6 +799,12 @@ typedef void (*MPT2SAS_FLUSH_RUNNING_CMDS)(struct
MPT2SAS_ADAPTER *ioc);
  * @delayed_tr_list: target reset link list
  * @delayed_tr_volume_list: volume target reset link list
  * @@temp_sensors_count: flag to carry the number of temperature sensors
+ * @pci_access_mutex: Mutex to synchronize ioctl,sysfs show path and
+ * pci resource handling. PCI resource freeing will lead to free
+ * vital hardware/memory resource, which might be in use by cli/sysfs
+ * path functions resulting in Null pointer reference followed by kernel
+ * crash. To avoid the above race condition we use mutex syncrhonization
+ * which ensures the syncrhonization between cli/sysfs_show path
  */
 struct MPT2SAS_ADAPTER {
  struct list_head list;
@@ -1015,6 +1021,7 @@ struct MPT2SAS_ADAPTER {
  u8 mfg_pg10_hide_flag;
  u8 hide_drives;

+ struct mutex pci_access_mutex;
 };

 typedef u8 (*MPT_CALLBACK)(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8
msix_index,
@@ -1023,6 +1030,17 @@ typedef u8 (*MPT_CALLBACK)(struct
MPT2SAS_ADAPTER *ioc, u16 smid, u8 msix_index,

 /* base shared API */
 extern struct list_head mpt2sas_ioc_list;
+/* spinlock on list operations over IOCs
++ * Case: when multiple warpdrive cards(IOCs) are in use
++ * Each IOC will added to the ioc list stucture on initialization.
++ * Watchdog threads run at regular intervals to check IOC for any
++ 

Re: [PATCH 3/6] target: use 'se_dev_entry' when allocating UAs

2015-06-16 Thread Nicholas A. Bellinger
Hey Hannes,

Apologies for the delayed follow-up on these, one comment below.

On Thu, 2015-06-11 at 10:01 +0200, Hannes Reinecke wrote:
> We need to use 'se_dev_entry' as argument when allocating
> UAs, otherwise we'll never see any UAs for an implicit
> ALUA state transition triggered from userspace.
> 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/target/target_core_alua.c      | 27 ++-
>  drivers/target/target_core_pr.c        | 31 +--
>  drivers/target/target_core_transport.c | 18 --
>  drivers/target/target_core_ua.c        | 23 +++
>  drivers/target/target_core_ua.h        |  2 +-
>  5 files changed, 59 insertions(+), 42 deletions(-)
> 



> diff --git a/drivers/target/target_core_pr.c b/drivers/target/target_core_pr.c
> index 436e30b..bb28a97 100644
> --- a/drivers/target/target_core_pr.c
> +++ b/drivers/target/target_core_pr.c
> @@ -125,6 +125,25 @@ static struct t10_pr_registration 
> *core_scsi3_locate_pr_reg(struct se_device *,
>   struct se_node_acl *, struct se_session 
> *);
>  static void core_scsi3_put_pr_reg(struct t10_pr_registration *);
>  
> +static void core_scsi3_pr_ua_allocate(struct se_node_acl *nacl,
> +   u32 unpacked_lun, u8 asc, u8 ascq)
> +{
> + struct se_dev_entry *deve;
> +
> + if (!nacl)
> + return;
> +
> + rcu_read_lock();
> + deve = target_nacl_find_deve(nacl, unpacked_lun);
> + if (!deve) {
> + rcu_read_unlock();
> + return;
> + }
> +
> + core_scsi3_ua_allocate(deve, asc, ascq);
> + rcu_read_unlock();
> +}
> +

This should be common for the TCM_RESERVATION_CONFLICT case outside of PR
code too.

Any objections to squashing the following into your original patch..?

Thank you,

--nab

diff --git a/drivers/target/target_core_pr.c b/drivers/target/target_core_pr.c
index bb28a97..0bb3292 100644
--- a/drivers/target/target_core_pr.c
+++ b/drivers/target/target_core_pr.c
@@ -125,25 +125,6 @@ static struct t10_pr_registration 
*core_scsi3_locate_pr_reg(struct se_device *,
struct se_node_acl *, struct se_session 
*);
 static void core_scsi3_put_pr_reg(struct t10_pr_registration *);
 
-static void core_scsi3_pr_ua_allocate(struct se_node_acl *nacl,
- u32 unpacked_lun, u8 asc, u8 ascq)
-{
-   struct se_dev_entry *deve;
-
-   if (!nacl)
-   return;
-
-   rcu_read_lock();
-   deve = target_nacl_find_deve(nacl, unpacked_lun);
-   if (!deve) {
-   rcu_read_unlock();
-   return;
-   }
-
-   core_scsi3_ua_allocate(deve, asc, ascq);
-   rcu_read_unlock();
-}
-
 static int target_check_scsi2_reservation_conflict(struct se_cmd *cmd)
 {
struct se_session *se_sess = cmd->se_sess;
@@ -2216,7 +2197,7 @@ core_scsi3_emulate_pro_register(struct se_cmd *cmd, u64 
res_key, u64 sa_res_key,
&pr_tmpl->registration_list,
pr_reg_list) {
 
-   core_scsi3_pr_ua_allocate(
+   target_ua_allocate_lun(
pr_reg_p->pr_reg_nacl,
pr_reg_p->pr_res_mapped_lun,
0x2A,
@@ -2643,7 +2624,7 @@ core_scsi3_emulate_pro_release(struct se_cmd *cmd, int 
type, int scope,
if (pr_reg_p == pr_reg)
continue;
 
-   core_scsi3_pr_ua_allocate(pr_reg_p->pr_reg_nacl,
+   target_ua_allocate_lun(pr_reg_p->pr_reg_nacl,
pr_reg_p->pr_res_mapped_lun,
0x2A, ASCQ_2AH_RESERVATIONS_RELEASED);
}
@@ -2728,7 +2709,7 @@ core_scsi3_emulate_pro_clear(struct se_cmd *cmd, u64 
res_key)
 *additional sense code set to RESERVATIONS PREEMPTED.
 */
if (!calling_it_nexus)
-   core_scsi3_pr_ua_allocate(pr_reg_nacl, 
pr_res_mapped_lun,
+   target_ua_allocate_lun(pr_reg_nacl, pr_res_mapped_lun,
0x2A, ASCQ_2AH_RESERVATIONS_PREEMPTED);
}
spin_unlock(&pr_tmpl->registration_lock);
@@ -2937,7 +2918,7 @@ core_scsi3_pro_preempt(struct se_cmd *cmd, int type, int 
scope, u64 res_key,
NULL, 0);
}
if (!calling_it_nexus)
-   core_scsi3_pr_ua_allocate(pr_reg_nacl,
+   target_ua_allocate_lun(pr_reg_nacl,
pr_res_mapped_lun, 0x2A,
ASCQ_2AH_REGISTRATIONS_PREEMPTED);
}
@@ -3043,7 +3024,7 @@ core_scsi3_pro_preempt(struct se_cmd *cmd, int type, int

[PATCH] iSCSI: let session recovery_tmo sysfs writes persist across recovery

2015-06-16 Thread Chris Leech
The iSCSI session recovery_tmo setting is writeable in sysfs, but it's
also set every time a connection is established when parameters are set
from iscsid over netlink.  That results in the timeout being reset to
the default value after every recovery.

The DM multipath tools want to use the sysfs interface to lower the
default timeout when there are multiple paths to fail over.  It has
caused confusion that we have a writeable sysfs value that seems to keep
resetting itself.

This patch adds an in-kernel flag that gets set once a sysfs write
occurs, and then ignores netlink parameter setting once it's been
modified via the sysfs interface.  My thinking here is that the sysfs
interface is much simpler for external tools to influence the session
timeout, but if we're going to allow it to be modified directly we
should ensure that setting is maintained.

Signed-off-by: Chris Leech 
---
 drivers/scsi/scsi_transport_iscsi.c | 11 ---
 include/scsi/scsi_transport_iscsi.h |  1 +
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_transport_iscsi.c 
b/drivers/scsi/scsi_transport_iscsi.c
index 67d43e3..35ef55f 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2040,6 +2040,7 @@ iscsi_alloc_session(struct Scsi_Host *shost, struct 
iscsi_transport *transport,
session->transport = transport;
session->creator = -1;
session->recovery_tmo = 120;
+   session->recovery_tmo_sysfs_override = false;
session->state = ISCSI_SESSION_FREE;
INIT_DELAYED_WORK(&session->recovery_work, session_recovery_timedout);
INIT_LIST_HEAD(&session->sess_list);
@@ -2784,7 +2785,8 @@ iscsi_set_param(struct iscsi_transport *transport, struct 
iscsi_uevent *ev)
switch (ev->u.set_param.param) {
case ISCSI_PARAM_SESS_RECOVERY_TMO:
sscanf(data, "%d", &value);
-   session->recovery_tmo = value;
+   if (!session->recovery_tmo_sysfs_override)
+   session->recovery_tmo = value;
break;
default:
err = transport->set_param(conn, ev->u.set_param.param,
@@ -4047,13 +4049,15 @@ store_priv_session_##field(struct device *dev,  
\
if ((session->state == ISCSI_SESSION_FREE) ||   \
(session->state == ISCSI_SESSION_FAILED))   \
return -EBUSY;  \
-   if (strncmp(buf, "off", 3) == 0)\
+   if (strncmp(buf, "off", 3) == 0) {  \
session->field = -1;\
-   else {  \
+   session->field##_sysfs_override = true; \
+   } else {\
val = simple_strtoul(buf, &cp, 0);  \
if (*cp != '\0' && *cp != '\n') \
return -EINVAL; \
session->field = val;   \
+   session->field##_sysfs_override = true; \
}   \
return count;   \
 }
@@ -4064,6 +4068,7 @@ store_priv_session_##field(struct device *dev,
\
 static ISCSI_CLASS_ATTR(priv_sess, field, S_IRUGO | S_IWUSR,   \
show_priv_session_##field,  \
store_priv_session_##field)
+
 iscsi_priv_session_rw_attr(recovery_tmo, "%d");
 
 static struct attribute *iscsi_session_attrs[] = {
diff --git a/include/scsi/scsi_transport_iscsi.h 
b/include/scsi/scsi_transport_iscsi.h
index 2555ee5..6183d20 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -241,6 +241,7 @@ struct iscsi_cls_session {
 
/* recovery fields */
int recovery_tmo;
+   bool recovery_tmo_sysfs_override;
struct delayed_work recovery_work;
 
unsigned int target_id;
-- 
2.1.0
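
For readers who want the behaviour in isolation, the following is a minimal
userspace sketch of the override flag described in the commit message above.
It is not the kernel code: demo_session and the demo_* functions are
hypothetical names, strtol() stands in for the kernel's simple_strtoul(), a
plain -1 stands in for -EINVAL, and the session state check that returns
-EBUSY is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of the session; only the two fields from the patch. */
struct demo_session {
	int recovery_tmo;
	bool recovery_tmo_sysfs_override;
};

void demo_session_init(struct demo_session *s)
{
	assert(s != NULL);
	s->recovery_tmo = 120; /* default, as in iscsi_alloc_session() */
	s->recovery_tmo_sysfs_override = false;
}

/* Netlink-style update from iscsid: ignored once sysfs has set a value. */
void demo_netlink_set_tmo(struct demo_session *s, int value)
{
	assert(s != NULL);
	if (!s->recovery_tmo_sysfs_override)
		s->recovery_tmo = value;
}

/* sysfs-style write: "off" means -1, otherwise a number; marks the override. */
int demo_sysfs_store_tmo(struct demo_session *s, const char *buf)
{
	char *end;
	long val;

	assert(s != NULL && buf != NULL);
	if (strncmp(buf, "off", 3) == 0) {
		s->recovery_tmo = -1;
	} else {
		val = strtol(buf, &end, 0);
		if (end == buf || (*end != '\0' && *end != '\n'))
			return -1; /* reject empty input and trailing garbage */
		s->recovery_tmo = (int)val;
	}
	s->recovery_tmo_sysfs_override = true;
	return 0;
}
```

In this sketch a netlink update of 60 takes effect on a fresh session, but
after a sysfs write of "5" a later netlink update leaves the value at 5.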



Re: [PATCH 1/1] [PATCH] block: Add blk_max_rw_sectors limit

2015-06-16 Thread Martin K. Petersen

Brian,

I only have minor nits wrt. your patch since you did what I asked.
However, now that I'm less jet lagged and blurry eyed I wonder if
the tweak below wouldn't suffice?


sd: Fix maximum I/O size for BLOCK_PC requests

Commit bcdb247c6b6a ("sd: Limit transfer length") clamped the maximum
size of an I/O request to the MAXIMUM TRANSFER LENGTH field in the BLOCK
LIMITS VPD. This had the unfortunate effect of also limiting the maximum
size of non-filesystem requests sent to the device through sg/bsg.

Avoid using blk_queue_max_hw_sectors() and set the max_sectors queue
limit directly.

Also update the comment in blk_limits_max_hw_sectors() to clarify that
max_hw_sectors defines the limit for the I/O controller only.

Signed-off-by: Martin K. Petersen 
Reported-by: Brian King 
Cc: sta...@vger.kernel.org # 3.17+

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 12600bfffca9..e0057d035200 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -241,8 +241,8 @@ EXPORT_SYMBOL(blk_queue_bounce_limit);
  * Description:
  *Enables a low level driver to set a hard upper limit,
  *max_hw_sectors, on the size of requests.  max_hw_sectors is set by
- *the device driver based upon the combined capabilities of I/O
- *controller and storage device.
+ *the device driver based upon the capabilities of the I/O
+ *controller.
  *
  *max_sectors is a soft limit imposed by the block layer for
  *filesystem type requests.  This value can be overridden on a
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 79beebf53302..cfc0de75d763 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2779,9 +2779,9 @@ static int sd_revalidate_disk(struct gendisk *disk)
max_xfer = sdkp->max_xfer_blocks;
max_xfer <<= ilog2(sdp->sector_size) - 9;
 
-   max_xfer = min_not_zero(queue_max_hw_sectors(sdkp->disk->queue),
-   max_xfer);
-   blk_queue_max_hw_sectors(sdkp->disk->queue, max_xfer);
+   sdkp->disk->queue->limits.max_sectors =
+   min_not_zero(queue_max_hw_sectors(sdkp->disk->queue), max_xfer);
+
set_capacity(disk, sdkp->capacity);
sd_config_write_same(sdkp);
kfree(buffer);
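
The arithmetic in the sd.c hunk above can be tried outside the kernel with a
small sketch like the one below. It assumes a power-of-two logical sector size
of at least 512 bytes; the demo_* helpers are hypothetical userspace stand-ins
for ilog2() and min_not_zero(), not the kernel implementations.

```c
#include <assert.h>
#include <stdint.h>

/* Smaller of the two values, ignoring a zero (meaning "no limit reported"). */
uint32_t demo_min_not_zero(uint32_t a, uint32_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/* log2 of a power-of-two sector size (512 -> 9, 4096 -> 12). */
unsigned int demo_ilog2(uint32_t v)
{
	unsigned int n = 0;

	assert(v != 0);
	while (v >>= 1)
		n++;
	return n;
}

/*
 * Convert the device's MAXIMUM TRANSFER LENGTH (in logical blocks) to
 * 512-byte sectors and clamp it against the controller limit. The result
 * corresponds to the soft limit (max_sectors); the controller limit
 * (max_hw_sectors) is only read, never lowered.
 */
uint32_t demo_soft_limit(uint32_t max_xfer_blocks, uint32_t sector_size,
			 uint32_t max_hw_sectors)
{
	uint32_t max_xfer;

	assert(sector_size >= 512);
	max_xfer = max_xfer_blocks << (demo_ilog2(sector_size) - 9);
	return demo_min_not_zero(max_hw_sectors, max_xfer);
}
```

A device that reports no MAXIMUM TRANSFER LENGTH (0 blocks) simply ends up
with the controller limit, since the zero is skipped by the clamp.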


Re: optimal io size / custom alignment

2015-06-16 Thread Martin K. Petersen
> "Tom" == Tom Yan  writes:

Tom> All drives I have are flash drives so none of them reports 4k
Tom> physical sectors.

There are plenty of SSDs that report 4K physical sectors, fwiw.

Tom> The usb-storage driver does not read vpd so it won't be a thing,
Tom> but the uas driver does.

We gave up on USB-SATA bridges long ago. Their designers appear to have
a pretty comprehensive misunderstanding of both the ATA and SCSI
protocols.

We had higher hopes for UAS since it provided a clean slate. So far,
however, the results are equally discouraging.

Tom> I just feel like the kernel shouldn't bind values from totally
Tom> different source (raid stripe vs vpd limit) to the same variable.

RAID devices communicate the stripe width through the Block Limits VPD.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: optimal io size / custom alignment

2015-06-16 Thread Tom Yan
On 17 June 2015 at 01:08, Martin K. Petersen  wrote:
> The two values have nothing to do with each other. They just happen to
> be the same in your case (65535 is the maximum block count for the WRITE
> SAME(10) command).
>
> Your device sets the transfer length granularity to 1 logical block and
> the optimal transfer length to 65535 logical blocks. If it then reports
> a 4096-byte physical block size in response to READ CAPACITY(16) then
> it's clearly on crack.
>
> There's only so much we can do about devices that report garbage.

All drives I have are flash drives so none of them reports 4k physical
sectors. But it does seem possible in the case I linked. The thing is
these VPDs/transfer lengths are probably provided by the USB to
ATA(/SCSI?) bridges. I can't judge if they are wrong to set the
lengths that way but it seems to be a common practice. I have two USB
devices that provide the SBC-2 Block Limits VPD: one is a SanDisk Extreme
USB (SDCZ80), the other an Intel X25-M Gen1 on an ASMedia SATA adapter,
and both of them set the Optimal transfer length. The usb-storage
driver does not read VPD pages so it won't be a thing there, but the uas
driver does.

> Also, the kernel only reports things. It is up to Karel to decide
> whether to sanity check the values before he uses them.

I just feel like the kernel shouldn't bind values from totally
different sources (RAID stripe vs VPD limit) to the same variable. I
don't know what else makes use of this variable, but considering
only the fdisk case, it seems the SCSI disk driver is the one that
should stop binding them.

> The best fix, of course, is to complain to the manufacturer of your
> broken widget and hope for a firmware upgrade.

This is simply too idealistic, especially when it seems that this issue
mostly happens on USB bridges. I am not even sure if the SCSI
standards have anything to say about this practice.

> Failing that, adjust your partitions manually.

Yeah that's why I said fdisk should allow custom alignment.

On 17 June 2015 at 01:08, Martin K. Petersen  wrote:
>> "Tom" == Tom Yan  writes:
>
> Tom> From the adapter/drive I have, it is the same as the "Maximum
> Tom> transfer length" and they seem to be simply limits of SCSI "WRITE
> Tom> SAME (10/16)" command:
>
> The two values have nothing to do with each other. They just happen to
> be the same in your case (65535 is the maximum block count for the WRITE
> SAME(10) command).
>
> Tom> [tom@localhost ~]$ sudo sg_inq -p 0xb0 /dev/sdb VPD INQUIRY: Block
> Tom> limits page (SBC) Maximum compare and write length: 0 blocks
> Tom> Optimal transfer length granularity: 1 blocks Maximum transfer
> Tom> length: 65535 blocks Optimal transfer length: 65535 blocks
>
> Your device sets the transfer length granularity to 1 logical block and
> the optimal transfer length to 65535 logical blocks. If it then reports
> a 4096-byte physical block size in response to READ CAPACITY(16) then
> it's clearly on crack.
>
> There's only so much we can do about devices that report garbage.
>
> Also, the kernel only reports things. It is up to Karel to decide
> whether to sanity check the values before he uses them.
>
> I would probably err on the side of trusting the physical block size
> reporting more than anything seeded from the Block Limits VPD. And in
> this case, assuming the alignment offset is reported to be 0, I guess
> one could entertain aligning to the nearest 4K boundary. But on the
> other hand it'll quickly get hairy to have to maintain this kind of
> heuristics.
>
> The best fix, of course, is to complain to the manufacturer of your
> broken widget and hope for a firmware upgrade. Failing that, adjust your
> partitions manually.
>
> Tom> The thing is, why any io/transfer size/length should be considered
> Tom> when it comes to partition alignment? From what I understand,
> Tom> partition alignment is only to make sure partition starts at
> Tom> physical boundaries of the disk because of the mismatch between
> Tom> logical sectors (512 bytes) and physical sectors (4096 bytes) or
> Tom> pages/erase blocks of SSDs.
>
> For RAID it makes a big difference to ensure the partition is aligned on
> a stripe boundary.
>
> --
> Martin K. Petersen  Oracle Linux Engineering
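
The sanity check discussed above could look roughly like the sketch below. It assumes the caller already has the optimal I/O size and the physical block size in bytes, and that the logical block size is 512 bytes as in the output quoted above. The helper names `io_opt_usable` and `round_up_to` are hypothetical; they are not kernel or util-linux functions. With the reported values, 65535 blocks of 512 bytes is 33553920 bytes, which is not a multiple of 4096.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when a reported optimal I/O size can
 * be used for alignment, i.e. it is non-zero and a whole multiple of the
 * physical block size. A value of 0 is treated as "not reported".
 */
bool io_opt_usable(uint64_t io_opt_bytes, uint64_t phys_block_bytes)
{
	if (io_opt_bytes == 0 || phys_block_bytes == 0)
		return false;
	return io_opt_bytes % phys_block_bytes == 0;
}

/*
 * Hypothetical helper: round value up to the next multiple of boundary.
 * boundary must be non-zero.
 */
uint64_t round_up_to(uint64_t value, uint64_t boundary)
{
	return (value + boundary - 1) / boundary * boundary;
}
```

Whether a tool should silently round a bogus value like this, or ignore it and fall back to the physical block size, is a policy decision the thread leaves open.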
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mpt2sas: Abort initialization if no memory I/O resources, detected

2015-06-16 Thread Timothy Pearson

On 06/16/2015 12:42 PM, Joe Lawrence wrote:

> On 06/16/2015 12:28 PM, Timothy Pearson wrote:
>> On 06/12/2015 05:05 PM, Timothy Pearson wrote:
>>> The mpt2sas driver crashes if the BIOS does not set up at least one
>>> memory I/O resource. This failure can happen if the device is too
>>> slow to respond during POST and is missed by the BIOS, but Linux
>>> then detects the device later in the boot process.
>>>
>>> This patch aborts initialization and prints a warning if no memory I/O
>>> resources are found.
>>>
>>> Signed-off-by: Timothy Pearson
>>> Tested-by: Timothy Pearson
>>> ---
>>> drivers/scsi/mpt2sas/mpt2sas_base.c | 9 +
>>> 1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
>>> b/drivers/scsi/mpt2sas/mpt2sas_base.c
>>> index 11248de..15c9504 100644
>>> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
>>> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
>>> @@ -6,6 +6,8 @@
>>> * Copyright (C) 2007-2014 LSI Corporation
>>> * Copyright (C) 20013-2014 Avago Technologies
>>> * (mailto: mpt-fusionlinux@avagotech.com)
>>> + * Copyright (C) 2015 Raptor Engineering
>>> + * (mailto: supp...@araptorengineeringinc.com)
>>> *
>>> * This program is free software; you can redistribute it and/or
>>> * modify it under the terms of the GNU General Public License
>>> @@ -1582,6 +1584,13 @@ mpt2sas_base_map_resources(struct MPT2SAS_ADAPTER
>>> *ioc)
>>> }
>>> }
>>>
>>> + if (ioc->chip == NULL) {
>>> + printk(MPT2SAS_ERR_FMT "unable to map "
>>> + "adapter memory (resource not found)!\n", ioc->name);
>>> + r = -EINVAL;
>>> + goto out_fail;
>>> + }
>>> +
>>> _base_mask_interrupts(ioc);
>>>
>>> r = _base_get_ioc_facts(ioc, CAN_SLEEP);
>>
>> Just following up on this patch as I have not yet received any response.
>>
>> Thanks!
>
> Hi Tim -- just curious, why was the similar check on ioc->chip just a
> few lines above the one added by the patch insufficient?
>
> That loop block sets memap_sz when it finds an IORESOURCE_MEM so that it
> only sets ioc->chip once.  I wonder if the fix might be simpler if the
> existing ioc->chip check relocated entirely to where you put it (maybe
> also pulling the entire error text onto one line for easier grepping).
>
> Regards,
>
> -- Joe


If there are no IORESOURCE_MEM resources allocated by the BIOS (i.e. if 
the BIOS does not run resource allocation on the mpt2sas device) then 
the check you are referring to is not executed, and the driver attempts 
to perform operations on a null ioc->chip pointer.


I can relocate the check if desired.
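
To illustrate the failure mode described above, here is a simplified, self-contained model of the resource loop. It is only a sketch: `struct fake_bar`, `struct fake_ioc`, `fake_map_resources` and the `FAKE_IORESOURCE_*` flags are hypothetical stand-ins, not the driver's real types, and the error code used for the in-loop check is chosen arbitrarily so the two paths can be told apart.

```c
#include <errno.h>
#include <stddef.h>

#define FAKE_IORESOURCE_IO  0x100UL
#define FAKE_IORESOURCE_MEM 0x200UL

/* Hypothetical stand-in for one PCI BAR: its type and the mapping result. */
struct fake_bar {
	unsigned long flags;
	void *mapping;	/* what mapping the BAR would return; NULL = failure */
};

/* Hypothetical stand-in for the adapter object. */
struct fake_ioc {
	void *chip;
};

int fake_map_resources(struct fake_ioc *ioc, const struct fake_bar *bars,
		       size_t nbars)
{
	size_t i;
	int mem_seen = 0;

	ioc->chip = NULL;
	for (i = 0; i < nbars; i++) {
		if (!(bars[i].flags & FAKE_IORESOURCE_MEM) || mem_seen)
			continue;
		mem_seen = 1;
		ioc->chip = bars[i].mapping;
		/* In-loop check: only reached when a memory BAR exists. */
		if (ioc->chip == NULL)
			return -ENOMEM;
	}

	/* Post-loop check: also catches "no memory BAR at all". */
	if (ioc->chip == NULL)
		return -EINVAL;

	return 0;
}
```

With only an I/O BAR in the array the loop body never reaches the in-loop check, so without the post-loop test the caller would continue with a NULL `chip` pointer.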

--
Timothy Pearson
Raptor Engineering
+1 (415) 727-8645
http://www.raptorengineeringinc.com


Re: [PATCH] mpt2sas: Abort initialization if no memory I/O resources, detected

2015-06-16 Thread Joe Lawrence
On 06/16/2015 12:28 PM, Timothy Pearson wrote:
> On 06/12/2015 05:05 PM, Timothy Pearson wrote:
>> The mpt2sas driver crashes if the BIOS does not set up at least one
>> memory I/O resource. This failure can happen if the device is too
>> slow to respond during POST and is missed by the BIOS, but Linux
>> then detects the device later in the boot process.
>>
>> This patch aborts initialization and prints a warning if no memory I/O
>> resources are found.
>>
>> Signed-off-by: Timothy Pearson 
>> Tested-by: Timothy Pearson 
>> ---
>> drivers/scsi/mpt2sas/mpt2sas_base.c | 9 +
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
>> b/drivers/scsi/mpt2sas/mpt2sas_base.c
>> index 11248de..15c9504 100644
>> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
>> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
>> @@ -6,6 +6,8 @@
>> * Copyright (C) 2007-2014 LSI Corporation
>> * Copyright (C) 20013-2014 Avago Technologies
>> * (mailto: mpt-fusionlinux@avagotech.com)
>> + * Copyright (C) 2015 Raptor Engineering
>> + * (mailto: supp...@araptorengineeringinc.com)
>> *
>> * This program is free software; you can redistribute it and/or
>> * modify it under the terms of the GNU General Public License
>> @@ -1582,6 +1584,13 @@ mpt2sas_base_map_resources(struct MPT2SAS_ADAPTER
>> *ioc)
>> }
>> }
>>
>> + if (ioc->chip == NULL) {
>> + printk(MPT2SAS_ERR_FMT "unable to map "
>> + "adapter memory (resource not found)!\n", ioc->name);
>> + r = -EINVAL;
>> + goto out_fail;
>> + }
>> +
>> _base_mask_interrupts(ioc);
>>
>> r = _base_get_ioc_facts(ioc, CAN_SLEEP);
> 
> Just following up on this patch as I have not yet received any response.
> 
> Thanks!

Hi Tim -- just curious, why was the similar check on ioc->chip just a
few lines above the one added by the patch insufficient?

That loop block sets memap_sz when it finds an IORESOURCE_MEM so that it
only sets ioc->chip once.  I wonder if the fix might be simpler if the
existing ioc->chip check relocated entirely to where you put it (maybe
also pulling the entire error text onto one line for easier grepping).

Regards,

-- Joe


Re: [PATCH] mpt2sas: Abort initialization if no memory I/O resources, detected

2015-06-16 Thread Timothy Pearson

On 06/12/2015 05:05 PM, Timothy Pearson wrote:

> The mpt2sas driver crashes if the BIOS does not set up at least one
> memory I/O resource. This failure can happen if the device is too
> slow to respond during POST and is missed by the BIOS, but Linux
> then detects the device later in the boot process.
>
> This patch aborts initialization and prints a warning if no memory I/O
> resources are found.
>
> Signed-off-by: Timothy Pearson
> Tested-by: Timothy Pearson
> ---
> drivers/scsi/mpt2sas/mpt2sas_base.c | 9 +
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c
> b/drivers/scsi/mpt2sas/mpt2sas_base.c
> index 11248de..15c9504 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> @@ -6,6 +6,8 @@
> * Copyright (C) 2007-2014 LSI Corporation
> * Copyright (C) 20013-2014 Avago Technologies
> * (mailto: mpt-fusionlinux@avagotech.com)
> + * Copyright (C) 2015 Raptor Engineering
> + * (mailto: supp...@araptorengineeringinc.com)
> *
> * This program is free software; you can redistribute it and/or
> * modify it under the terms of the GNU General Public License
> @@ -1582,6 +1584,13 @@ mpt2sas_base_map_resources(struct MPT2SAS_ADAPTER
> *ioc)
> }
> }
>
> + if (ioc->chip == NULL) {
> + printk(MPT2SAS_ERR_FMT "unable to map "
> + "adapter memory (resource not found)!\n", ioc->name);
> + r = -EINVAL;
> + goto out_fail;
> + }
> +
> _base_mask_interrupts(ioc);
>
> r = _base_get_ioc_facts(ioc, CAN_SLEEP);


Just following up on this patch as I have not yet received any response.

Thanks!

--
Timothy Pearson
Raptor Engineering
+1 (415) 727-8645
http://www.raptorengineeringinc.com


Re: [PATCH 02/20] [SCSI] mpt3sas: Get IOC_FACTS information using handshake protocol only after HBA card gets into READY or Operational state.

2015-06-16 Thread Tomas Henzl
On 06/12/2015 11:42 AM, Sreekanth Reddy wrote:
> Driver initialization fails if driver tries to send IOC facts request message 
> when the IOC is in reset or in a fault state.
> 
> This patch will make sure that
>  1. The driver sends the IOC facts request message only if the HBA is in
>     operational or ready state.
>  2. If the IOC is in fault state, a diagnostic reset is issued.
>  3. If the IOC is in reset state, the driver waits up to 10 seconds for it
>     to exit the reset state. If the HBA remains in reset state, the HBA
>     is not claimed by the driver.
> 
> Signed-off-by: Sreekanth Reddy 
> ---
>  drivers/scsi/mpt3sas/mpt3sas_base.c | 65 
> +
>  1 file changed, 65 insertions(+)
> 
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
> b/drivers/scsi/mpt3sas/mpt3sas_base.c
> index c13a365..ce57320 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
> @@ -3169,6 +3169,9 @@ _base_wait_on_iocstate(struct MPT3SAS_ADAPTER *ioc, u32 
> ioc_state, int timeout,
>   * Notes: MPI2_HIS_IOC2SYS_DB_STATUS - set to one when IOC writes to 
> doorbell.
>   */
>  static int
> +_base_diag_reset(struct MPT3SAS_ADAPTER *ioc, int sleep_flag);
> +
> +static int
>  _base_wait_for_doorbell_int(struct MPT3SAS_ADAPTER *ioc, int timeout,
>   int sleep_flag)
>  {
> @@ -3711,6 +3714,61 @@ _base_get_port_facts(struct MPT3SAS_ADAPTER *ioc, int 
> port, int sleep_flag)
>  }
>  
>  /**
> + * _base_wait_for_iocstate - Wait until the card is in READY or OPERATIONAL
> + * @ioc: per adapter object
> + * @timeout:
> + * @sleep_flag: CAN_SLEEP or NO_SLEEP
> + *
> + * Returns 0 for success, non-zero for failure.
> + */
> +static int
> +_base_wait_for_iocstate(struct MPT3SAS_ADAPTER *ioc, int timeout,
> + int sleep_flag)
> +{
> + u32 ioc_state;
> + int rc;
> +
> + dinitprintk(ioc, printk(MPT3SAS_FMT "%s\n", ioc->name,
> + __func__));
> +
> + if (ioc->pci_error_recovery)
> + return 0;
Hi Sreekanth, isn't that ^ an error condition? 'return -EFAULT;'
would be better.
Tomas
> +
> + ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
> + dhsprintk(ioc, printk(MPT3SAS_FMT "%s: ioc_state(0x%08x)\n",
> + ioc->name, __func__, ioc_state));
> +
> + if (((ioc_state & MPI2_IOC_STATE_MASK) == MPI2_IOC_STATE_READY) ||
> + (ioc_state & MPI2_IOC_STATE_MASK) == MPI2_IOC_STATE_OPERATIONAL)
> + return 0;
> +
> + if (ioc_state & MPI2_DOORBELL_USED) {
> + dhsprintk(ioc, printk(MPT3SAS_FMT
> + "unexpected doorbell active!\n", ioc->name));
> + goto issue_diag_reset;
> + }
> +
> + if ((ioc_state & MPI2_IOC_STATE_MASK) == MPI2_IOC_STATE_FAULT) {
> + mpt3sas_base_fault_info(ioc, ioc_state &
> + MPI2_DOORBELL_DATA_MASK);
> + goto issue_diag_reset;
> + }
> +
> + ioc_state = _base_wait_on_iocstate(ioc, MPI2_IOC_STATE_READY,
> + timeout, sleep_flag);
> + if (ioc_state) {
> + dfailprintk(ioc, printk(MPT3SAS_FMT
> + "%s: failed going to ready state (ioc_state=0x%x)\n",
> + ioc->name, __func__, ioc_state));
> + return -EFAULT;
> + }
> +
> + issue_diag_reset:
> + rc = _base_diag_reset(ioc, sleep_flag);
> + return rc;
> +}
> +
> +/**
>   * _base_get_ioc_facts - obtain ioc facts reply and save in ioc
>   * @ioc: per adapter object
>   * @sleep_flag: CAN_SLEEP or NO_SLEEP
> @@ -3728,6 +3786,13 @@ _base_get_ioc_facts(struct MPT3SAS_ADAPTER *ioc, int 
> sleep_flag)
>   dinitprintk(ioc, pr_info(MPT3SAS_FMT "%s\n", ioc->name,
>   __func__));
>  
> + r = _base_wait_for_iocstate(ioc, 10, sleep_flag);
> + if (r) {
> + dfailprintk(ioc, printk(MPT3SAS_FMT
> + "%s: failed getting to correct state\n",
> + ioc->name, __func__));
> + return r;
> + }
>   mpi_reply_sz = sizeof(Mpi2IOCFactsReply_t);
>   mpi_request_sz = sizeof(Mpi2IOCFactsRequest_t);
>   memset(&mpi_request, 0, mpi_request_sz);
> 



Re: how to "decode" SG_IO: bad/missing sense data?

2015-06-16 Thread Tom Yan
I knew of both sg_decode_sense and the SAT spec, but I didn't think of
using them together to decode this. (I didn't bother to download the
spec either, because it requires a bit of "registration" :P) Thanks
for the pointer.

According to the latest draft, if "CK_COND" is set to 1, the sense
data returned will be:

"No error, successful completion or command in progress. The SATL shall
terminate the command with CHECK CONDITION status with the sense key
set to RECOVERED ERROR with the additional sense code set to ATA
PASS-THROUGH INFORMATION AVAILABLE (see SPC-4)."

However, I did not find anything useful for further decoding the
additional fields like "status=0x50". (Some Google results did tell me
that 0x50 means success.)
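
For reference, the status value can be broken down by hand. The sketch below assumes the classic ATA status register bit layout (the same values the kernel defines as ATA_BUSY, ATA_DRDY and friends in linux/ata.h); the `ST_*` names and the `ata_status_is_clean` helper are made up for this example. Under that layout 0x50 is DRDY (0x40) plus DSC (0x10) with BSY, DF, DRQ and ERR all clear, which is consistent with "success".

```c
#include <stdbool.h>
#include <stdint.h>

/* ATA status register bits (names local to this sketch). */
#define ST_BSY  0x80	/* device busy */
#define ST_DRDY 0x40	/* device ready */
#define ST_DF   0x20	/* device fault */
#define ST_DSC  0x10	/* historically "seek complete" */
#define ST_DRQ  0x08	/* data request pending */
#define ST_ERR  0x01	/* error, see the error register */

/*
 * Hypothetical helper: true when the status byte shows a ready device
 * with no busy, fault, pending-data or error condition.
 */
bool ata_status_is_clean(uint8_t status)
{
	if (status & (ST_BSY | ST_DF | ST_DRQ | ST_ERR))
		return false;
	return (status & ST_DRDY) != 0;
}
```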

The man page of sg_sat_identify also mentions it:

   -c, --ck_cond
          sets the CK_COND bit in the ATA PASS-THROUGH SCSI cdb. The
          default setting is clear (i.e. 0). When set the SATL should
          yield a sense buffer containing a ATA Result descriptor
          irrespective of whether the command succeeded or failed. When
          clear the SATL should only yield a sense buffer containing a
          ATA Result descriptor if the command failed.

When I run `sg_sat_identify` with `-c` against my different drives, the
drive that makes hdparm give the SG_IO error seems to be the only
one that actually responds to the "CK_COND" bit:

[tom@localhost ~]$ sudo sg_sat_identify -c /dev/sdb
expected descriptor sense format, response code=0xf0

The others simply react as if the bit were not set.

Anyway the SG_IO error is given by hdparm, so I guess I should talk to
the hdparm guys now.

On 16 June 2015 at 19:24, Douglas Gilbert  wrote:
> On 15-06-16 01:05 PM, Tom Yan wrote:
>>
>> When I "ATA Secure Erase" a USB Flash Drive, I got:
>>
>> SG_IO: bad/missing sense data, sb[]:  f0 00 01 00 50 40 00 0a 00 00 00
>> 00 00 1d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>
>> While the erase seems to work without bricking the device (multiple
>> trials with shred and hexdump), this message still bothers me a bit.
>> So is there any way I could "decode" it so that I can know what it
>> actually means?
>
>
> Hi,
> Install the sg3_utils package then copy that hex sequence and place
> it after the sg_decode_sense command. In this case:
>
> $ sg_decode_sense f0 00 01 00 50 40 00 0a 00 00 00 00 00 1d 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00
>  Fixed format, current;  Sense key: Recovered Error
>  Additional sense: ATA pass through information available
>   error=0x0, status=0x50, device=0x40, sector_count(7:0)=0x0
>   extend=0, log_index=0x0, lba_high,mid,low(7:0)=0x0,0x0,0x0
>
> So that ATA Secure Erase command is sending a SCSI error back
> through the SAT mechanism. Check the SAT standard (at www.t10.org)
> for details.
>
> Doug Gilbert
>
>
>


Re: [PATCH 1/4] ipr: Fix locking for unit attention handling

2015-06-16 Thread Gabriel Krisman Bertazi
Brian King  writes:

> Make sure we have the host lock held when calling scsi_report_bus_reset. Fixes
> a crash seen as the __devices list in the scsi host was changing as we were
> iterating through it.

Brian,

The patch series looks good to me as a whole, thanks for doing that.
Please add the tag:

Reviewed-by: Gabriel Krisman Bertazi 

Thanks,

-- 
Gabriel Krisman Bertazi



Re: [Patch V2 9/9] [SCSI] aacraid: Update driver version

2015-06-16 Thread Johannes Thumshirn
On Wed, Jun 10, 2015 at 06:42:31PM -0700, rajinikanth.panduran...@pmcs.com 
wrote:
> From: Rajinikanth Pandurangan 
> 
> Signed-off-by: Rajinikanth Pandurangan 
> ---
>  drivers/scsi/aacraid/aacraid.h | 2 +-
>  drivers/scsi/aacraid/linit.c   | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
> index 7b95227..73c3384 100644
> --- a/drivers/scsi/aacraid/aacraid.h
> +++ b/drivers/scsi/aacraid/aacraid.h
> @@ -62,7 +62,7 @@ enum {
>  #define  PMC_GLOBAL_INT_BIT0 0x0001
>  
>  #ifndef AAC_DRIVER_BUILD
> -# define AAC_DRIVER_BUILD 40709
> +# define AAC_DRIVER_BUILD 41010
>  # define AAC_DRIVER_BRANCH "-ms"
>  #endif
>  #define MAXIMUM_NUM_CONTAINERS   32
> diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
> index 3df0dfb..1627928 100644
> --- a/drivers/scsi/aacraid/linit.c
> +++ b/drivers/scsi/aacraid/linit.c
> @@ -56,7 +56,7 @@
>  
>  #include "aacraid.h"
>  
> -#define AAC_DRIVER_VERSION   "1.2-1"
> +#define AAC_DRIVER_VERSION   "1.2-2"
>  #ifndef AAC_DRIVER_BRANCH
>  #define AAC_DRIVER_BRANCH""
>  #endif
> -- 
> 1.9.3
> 

Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn   Storage
jthumsh...@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [Patch V2 7/9] [SCSI] aacraid: Unblock IOCTLs to controller once system resumed from suspend

2015-06-16 Thread Johannes Thumshirn
On Wed, Jun 10, 2015 at 06:42:29PM -0700, rajinikanth.panduran...@pmcs.com 
wrote:
> From: Rajinikanth Pandurangan 
> 
> Description:
>   Driver blocks ioctls once it has received a shutdown/suspend request during
>   suspend/hibernation. This patch unblocks ioctls on the resume path.
> 
> Signed-off-by: Rajinikanth Pandurangan 
> ---
>  drivers/scsi/aacraid/linit.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
> index 8020348..1142c28 100644
> --- a/drivers/scsi/aacraid/linit.c
> +++ b/drivers/scsi/aacraid/linit.c
> @@ -1448,6 +1448,11 @@ static int aac_resume(struct pci_dev *pdev)
>   pci_set_master(pdev);
>   if (aac_acquire_resources(aac))
>   goto fail_device;
> + /*
> + * reset this flag to unblock ioctl() as it was set at
> + * aac_send_shutdown() to block ioctls from upperlayer
> + */
> + aac->adapter_shutdown = 0;
>   scsi_unblock_requests(shost);
>  
>   return 0;
> -- 
> 1.9.3
> 

Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn   Storage
jthumsh...@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [Patch V2 6/9] [SCSI] aacraid: Reset irq affinity hints before releasing irq

2015-06-16 Thread Johannes Thumshirn
On Wed, Jun 10, 2015 at 06:42:28PM -0700, rajinikanth.panduran...@pmcs.com 
wrote:
> From: Rajinikanth Pandurangan 
> 
> Description:
> Reset irq affinity hints before releasing IRQ
> Removed duplicate code of IRQ acquire/release
> 
> Signed-off-by: Rajinikanth Pandurangan 
> ---
>  drivers/scsi/aacraid/aacraid.h |   2 +
>  drivers/scsi/aacraid/commsup.c | 113 
> ++---
>  drivers/scsi/aacraid/src.c |  48 ++---
>  3 files changed, 88 insertions(+), 75 deletions(-)
> 
> diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
> index e54f597..7b95227 100644
> --- a/drivers/scsi/aacraid/aacraid.h
> +++ b/drivers/scsi/aacraid/aacraid.h
> @@ -2110,6 +2110,8 @@ static inline unsigned int cap_to_cyls(sector_t 
> capacity, unsigned divisor)
>  #define AAC_OWNER_ERROR_HANDLER  0x103
>  #define AAC_OWNER_FIRMWARE   0x106
>  
> +int aac_acquire_irq(struct aac_dev *dev);
> +void aac_free_irq(struct aac_dev *dev);
>  const char *aac_driverinfo(struct Scsi_Host *);
>  struct fib *aac_fib_alloc(struct aac_dev *dev);
>  int aac_fib_setup(struct aac_dev *dev);
> diff --git a/drivers/scsi/aacraid/commsup.c b/drivers/scsi/aacraid/commsup.c
> index 4da5749..a1f90fe 100644
> --- a/drivers/scsi/aacraid/commsup.c
> +++ b/drivers/scsi/aacraid/commsup.c
> @@ -1270,13 +1270,12 @@ retry_next:
>  static int _aac_reset_adapter(struct aac_dev *aac, int forced)
>  {
>   int index, quirks;
> - int retval, i;
> + int retval;
>   struct Scsi_Host *host;
>   struct scsi_device *dev;
>   struct scsi_cmnd *command;
>   struct scsi_cmnd *command_list;
>   int jafo = 0;
> - int cpu;
>  
>   /*
>* Assumptions:
> @@ -1339,35 +1338,7 @@ static int _aac_reset_adapter(struct aac_dev *aac, int 
> forced)
>   aac->comm_phys = 0;
>   kfree(aac->queues);
>   aac->queues = NULL;
> - cpu = cpumask_first(cpu_online_mask);
> - if (aac->pdev->device == PMC_DEVICE_S6 ||
> - aac->pdev->device == PMC_DEVICE_S7 ||
> - aac->pdev->device == PMC_DEVICE_S8 ||
> - aac->pdev->device == PMC_DEVICE_S9) {
> - if (aac->max_msix > 1) {
> - for (i = 0; i < aac->max_msix; i++) {
> - if (irq_set_affinity_hint(
> - aac->msixentry[i].vector,
> - NULL)) {
> - printk(KERN_ERR "%s%d: Failed to reset 
> IRQ affinity for cpu %d\n",
> - aac->name,
> - aac->id,
> - cpu);
> - }
> - cpu = cpumask_next(cpu,
> - cpu_online_mask);
> - free_irq(aac->msixentry[i].vector,
> -  &(aac->aac_msix[i]));
> - }
> - pci_disable_msix(aac->pdev);
> - } else {
> - free_irq(aac->pdev->irq, &(aac->aac_msix[0]));
> - }
> - } else {
> - free_irq(aac->pdev->irq, aac);
> - }
> - if (aac->msi)
> - pci_disable_msi(aac->pdev);
> + aac_free_irq(aac);
>   kfree(aac->fsa_dev);
>   aac->fsa_dev = NULL;
>   quirks = aac_get_driver_ident(index)->quirks;
> @@ -1978,3 +1949,83 @@ int aac_command_thread(void *data)
>   dev->aif_thread = 0;
>   return 0;
>  }
> +
> +int aac_acquire_irq(struct aac_dev *dev)
> +{
> + int i;
> + int j;
> + int ret = 0;
> + int cpu;
> +
> + cpu = cpumask_first(cpu_online_mask);
> + if (!dev->sync_mode && dev->msi_enabled && dev->max_msix > 1) {
> + for (i = 0; i < dev->max_msix; i++) {
> + dev->aac_msix[i].vector_no = i;
> + dev->aac_msix[i].dev = dev;
> + if (request_irq(dev->msixentry[i].vector,
> + dev->a_ops.adapter_intr,
> + 0, "aacraid", &(dev->aac_msix[i]))) {
> + printk(KERN_ERR "%s%d: Failed to register IRQ 
> for vector %d.\n",
> + dev->name, dev->id, i);
> + for (j = 0 ; j < i ; j++)
> + free_irq(dev->msixentry[j].vector,
> +  &(dev->aac_msix[j]));
> + pci_disable_msix(dev->pdev);
> + ret = -1;
> + }
> + if (irq_set_affinity_hint(dev->msixentry[i].vector,
> + get_cpu_mask(cpu))) {
> + printk(KERN_ERR "%s%d: Failed to set IRQ 
> affinity for cpu %d\n",
> + dev-

Re: [Patch V2 3/9] [SCSI] aacraid: Enable MSI interrupt for series-6 controller

2015-06-16 Thread Johannes Thumshirn
On Wed, Jun 10, 2015 at 06:42:25PM -0700, rajinikanth.panduran...@pmcs.com 
wrote:
> From: Rajinikanth Pandurangan 
> 
> Description:
>   Enable MSI interrupt mode for series-6 controller.
> 
> Signed-off-by: Rajinikanth Pandurangan 
> ---
>  drivers/scsi/aacraid/src.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
> index b147341..eb07b3d 100644
> --- a/drivers/scsi/aacraid/src.c
> +++ b/drivers/scsi/aacraid/src.c
> @@ -742,7 +742,7 @@ int aac_src_init(struct aac_dev *dev)
>   if (dev->comm_interface != AAC_COMM_MESSAGE_TYPE1)
>   goto error_iounmap;
>  
> - dev->msi = aac_msi && !pci_enable_msi(dev->pdev);
> + dev->msi = !pci_enable_msi(dev->pdev);
>  
>   dev->aac_msix[0].vector_no = 0;
>   dev->aac_msix[0].dev = dev;
> -- 
> 1.9.3
> 

Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn   Storage
jthumsh...@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [Patch V2 1/9] [SCSI] aacraid: Fix for logical device name and UID not exposed to the OS

2015-06-16 Thread Johannes Thumshirn
On Wed, Jun 10, 2015 at 06:42:23PM -0700, rajinikanth.panduran...@pmcs.com 
wrote:
> From: Rajinikanth Pandurangan 
> 
> Description:
>   Driver sends the right size of the response buffer.
> 
> Signed-off-by: Rajinikanth Pandurangan 
> ---
>  drivers/scsi/aacraid/aachba.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/aacraid/aachba.c b/drivers/scsi/aacraid/aachba.c
> index 9b3dd6e..fe59b00 100644
> --- a/drivers/scsi/aacraid/aachba.c
> +++ b/drivers/scsi/aacraid/aachba.c
> @@ -570,7 +570,7 @@ static int aac_get_container_name(struct scsi_cmnd * 
> scsicmd)
>  
>   status = aac_fib_send(ContainerCommand,
> cmd_fibcontext,
> -   sizeof (struct aac_get_name),
> +   sizeof(struct aac_get_name_resp),
> FsaNormal,
> 0, 1,
> (fib_callback)get_container_name_callback,
> @@ -1052,7 +1052,7 @@ static int aac_get_container_serial(struct scsi_cmnd * 
> scsicmd)
>  
>   status = aac_fib_send(ContainerCommand,
> cmd_fibcontext,
> -   sizeof (struct aac_get_serial),
> +   sizeof(struct aac_get_serial_resp),
> FsaNormal,
> 0, 1,
> (fib_callback) get_container_serial_callback,
> -- 
> 1.9.3
> 

Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn   Storage
jthumsh...@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: how to "decode" SG_IO: bad/missing sense data?

2015-06-16 Thread Douglas Gilbert

On 15-06-16 01:05 PM, Tom Yan wrote:

> When I "ATA Secure Erase" a USB Flash Drive, I got:
>
> SG_IO: bad/missing sense data, sb[]:  f0 00 01 00 50 40 00 0a 00 00 00
> 00 00 1d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
> While the erase seems to work without bricking the device (multiple
> trials with shred and hexdump), this message still bothers me a bit.
> So is there any way I could "decode" it so that I can know what it
> actually means?


Hi,
Install the sg3_utils package then copy that hex sequence and place
it after the sg_decode_sense command. In this case:

$ sg_decode_sense f0 00 01 00 50 40 00 0a 00 00 00 00 00 1d 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00

 Fixed format, current;  Sense key: Recovered Error
 Additional sense: ATA pass through information available
  error=0x0, status=0x50, device=0x40, sector_count(7:0)=0x0
  extend=0, log_index=0x0, lba_high,mid,low(7:0)=0x0,0x0,0x0

So that ATA Secure Erase command is sending a SCSI error back
through the SAT mechanism. Check the SAT standard (at www.t10.org)
for details.
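
For readers who want to see where those fields come from, here is a minimal decoding sketch for this one case: fixed-format sense data carrying "ATA pass through information available" (additional sense code 0x00, qualifier 0x1d). The byte offsets are a reading of the SAT fixed-format layout and should be checked against the standard; the struct and function names are hypothetical and are not part of sg3_utils.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields sg_decode_sense prints above. */
struct ata_fixed_sense {
	uint8_t sense_key, asc, ascq;
	uint8_t error, status, device, count;
	uint8_t extend, log_index;
	uint8_t lba_low, lba_mid, lba_high;
};

/*
 * Returns 0 on success, -1 if the buffer is too short, is not fixed
 * format (response code 0x70/0x71), or does not carry the ATA
 * pass-through additional sense code.
 */
int decode_ata_fixed_sense(const uint8_t *sb, size_t len,
			   struct ata_fixed_sense *out)
{
	uint8_t code;

	if (len < 14)
		return -1;
	code = sb[0] & 0x7f;		/* bit 7 is the VALID bit */
	if (code != 0x70 && code != 0x71)
		return -1;

	out->sense_key = sb[2] & 0x0f;
	out->asc = sb[12];
	out->ascq = sb[13];
	if (out->asc != 0x00 || out->ascq != 0x1d)
		return -1;

	/* INFORMATION field, bytes 3..6 */
	out->error = sb[3];
	out->status = sb[4];
	out->device = sb[5];
	out->count = sb[6];

	/* COMMAND-SPECIFIC INFORMATION field, bytes 8..11 */
	out->extend = (sb[8] >> 7) & 0x01;
	out->log_index = sb[8] & 0x0f;
	out->lba_low = sb[9];
	out->lba_mid = sb[10];
	out->lba_high = sb[11];
	return 0;
}
```

Fed the 32 bytes from the hdparm message, this yields sense key 0x1 (Recovered Error), error 0x00, status 0x50 and device 0x40, matching the sg_decode_sense output.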

Doug Gilbert





how to "decode" SG_IO: bad/missing sense data?

2015-06-16 Thread Tom Yan
When I "ATA Secure Erase" a USB Flash Drive, I got:

SG_IO: bad/missing sense data, sb[]:  f0 00 01 00 50 40 00 0a 00 00 00
00 00 1d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

While the erase seems to work without bricking the device (multiple
trials with shred and hexdump), this message still bothers me a bit.
So is there any way I could "decode" it so that I can know what it
actually means?


Re: optimal io size / custom alignment

2015-06-16 Thread Tom Yan
I have heard that it matters for RAID, but since I don't really know
RAID I can't comment.

I do wonder whether the SCSI disk driver should derive the minimum/optimal
I/O size from the VPD at all, then. It might still be "tolerable" if it's
the limit of WRITE SAME(10), but definitely not if it's that of WRITE
SAME(16):

[tom@localhost ~]$ sudo fdisk /dev/sdc

Welcome to fdisk (util-linux 2.26.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0xccb261a9.

Command (m for help): p
Disk /dev/sdc: 29.2 GiB, 31376707072 bytes, 61282631 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 4294966784 bytes
Disklabel type: dos
Disk identifier: 0xccb261a9

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p):

Using default response p.
Partition number (1-4, default 1):
First sector (8388607-61282630, default 8388607):
Last sector, +sectors or +size{K,M,G,T,P} (8388607-61282630, default 61282630):

Created a new partition 1 of type 'Linux' and of size 25.2 GiB.

On 16 June 2015 at 17:43, Karel Zak  wrote:
> On Tue, Jun 16, 2015 at 01:20:37PM +0800, Tom Yan wrote:
>> The thing is, why should any I/O or transfer size/length be considered
>> when it comes to partition alignment? From what I understand,
>> partition alignment is only to make sure a partition starts at physical
>> boundaries of the disk because of the mismatch between logical sectors
>> (512 bytes) and physical sectors (4096 bytes) or pages/erase blocks of
>> SSDs.
>
> It's more complicated. The I/O limits matter most for RAID, where
> the optimal I/O size is usually the stripe size and you want to use it
> for partition alignment for better performance (if you align only to
> the sector size, a read/write on RAID may be performed on more disks
> for unaligned partitions). And it's not only fdisk that cares; it's also
> important for mkfs (for example, XFS aligns according to the I/O limits).
>
> And because all this is a mess, and sometimes HW does not provide
> relevant information, and because people use dd(1) to copy partition
> tables, we have decided to use 1MiB granularity if possible. If 1MiB is
> useless then we use optimal_io_size, if undefined then minimal_io_size
> and if undefined then sector_size.
>
> http://people.redhat.com/msnitzer/docs/io-limits.txt
>
>
> Unfortunately the current code does not check whether optimal_io_size
> makes sense, so something like 33553920 for a 4k device is blindly
> accepted ;-(
>
> Karel
>
>
> --
>  Karel Zak  
>  http://karelzak.blogspot.com
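
The fallback order Karel describes can be sketched as follows. This is not the libfdisk implementation; `pick_grain` is a hypothetical name, and since the message does not spell out when 1MiB counts as "useless", the sketch assumes it means that 1MiB is not a whole multiple of the best available size hint.

```c
#include <stdint.h>

#define ONE_MIB (1024ULL * 1024ULL)

/*
 * Hypothetical helper: choose an alignment granularity in bytes.
 * The hint is optimal_io if defined (non-zero), else minimal_io,
 * else sector_size. 1 MiB is used whenever it is a whole multiple
 * of that hint (assumption), otherwise the hint itself is returned.
 */
uint64_t pick_grain(uint64_t optimal_io, uint64_t minimal_io,
		    uint64_t sector_size)
{
	uint64_t hint = optimal_io ? optimal_io :
			minimal_io ? minimal_io : sector_size;

	if (hint == 0 || ONE_MIB % hint == 0)
		return ONE_MIB;
	return hint;
}
```

Under that assumption, the 4294966784-byte optimal I/O size from the fdisk session above is returned unchanged, and 4294966784 / 512 = 8388607 is exactly the default first sector fdisk offered.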