Re: Debugging scsi abort handling ?

2014-08-28 Thread Martin Peschke
On Thu, 2014-08-28 at 14:04 +0200, Hannes Reinecke wrote:
 On 08/25/2014 01:15 PM, Paolo Bonzini wrote:
  - abort the command, and then the driver should never call the
  -scsi_done callback for the Scsi_Cmnd*.
 
 In practice we rely on the latter behaviour; when -scsi_done is 
 called while the command is under eh_abort _really bad things_
 will happen.
 As soon as eh_abort is called control is transferred back to the
 SCSI midlayer, so any LLDD should never send completions for these
 commands back to the midlayer.

Mmh, then there is a small race window starting at the point in time
when the abort function of an LLDD is called up to the point in time
when that abort function has taken the necessary measures to make sure
that scsi_done won't be called for a racing command completion , isn't
it?

Cheers
Martin

-- 
Linux on System z Development

IBM Deutschland Research  Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/7] scsi: Use Scsi_Host as argument for eh_host_reset_handler

2014-06-27 Thread Martin Peschke
On Fri, 2014-06-27 at 13:04 +0200, Hannes Reinecke wrote:
 On 06/27/2014 12:47 PM, Steffen Maier wrote:
  On 06/27/2014 08:27 AM, Hannes Reinecke wrote:
  Issuing a host reset should not rely on any commands.
  So use Scsi_Host as argument for eh_host_reset_handler.
 
  Signed-off-by: Hannes Reinecke h...@suse.de
  ---
 
drivers/s390/scsi/zfcp_scsi.c |  3 +-
 
69 files changed, 289 insertions(+), 379 deletions(-)
 
  diff --git a/drivers/s390/scsi/zfcp_scsi.c
  b/drivers/s390/scsi/zfcp_scsi.c
  index dc42c93..fe50f69 100644
  --- a/drivers/s390/scsi/zfcp_scsi.c
  +++ b/drivers/s390/scsi/zfcp_scsi.c
  @@ -281,13 +281,14 @@ static int
  zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt)
return zfcp_task_mgmt_function(scpnt, FCP_TMF_TGT_RESET);
}
 
  -static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
  +static int zfcp_scsi_eh_host_reset_handler(struct Scsi_Host *host)
{
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt-device);
 
  Scpnt argument no longer exists, so this line must likely go away to
  compile.
 
struct zfcp_adapter *adapter = zfcp_sdev-port-adapter;
 
  This needs the assignment from the added line below since zfcp_sdev
  no longer exists.
  Alternatively, drop the assignment here, only have the auto variable
  definition, and do the assignment below with the added line.
 
struct fc_rport *rport = zfcp_sdev-port-rport;
int ret;
 
  +adapter = (struct zfcp_adapter *)host-hostdata[0];
zfcp_erp_adapter_reopen(adapter, 0, schrh_1);
zfcp_erp_wait(adapter);
ret = fc_block_scsi_eh(rport);
 
  A question just for my understanding: Calling fc_block_scsi_eh at
  the end *after* our adapter recovery is OK to wait for rports to
  become unblocked (or fast_io_failed)?
  I would guess so.
 
 Hehe. That's actually a question for _you_ :-)

git says:

commit af4de36d911ab907b92c5f3f81ceff8474ed7485
Author: Christof Schmitt christof.schm...@de.ibm.com
Date:   Tue Nov 24 16:54:16 2009 +0100

[SCSI] zfcp: Block scsi_eh thread for rport state BLOCKED

In case the SCSI error recovery starts because of a SCSI command
timeout, but then something else triggers the rport to be deleted,
the SCSI error recovery will run to the end and set the SCSI device
offline. To prevent this, call the FC transport function
fc_block_scsi_eh which waits until the rport leaves the BLOCKED
state. This guarantees that communication is possible if the rport
is ONLINE, or the SCSI devices will be removed if the rport state
switches to NOT_PRESENT.


Not sure if this reasoning is (still) valid.

Isn't command timeouts the only reason for scsi eh to kick in?

Martin
-- 
Linux on System z Development

IBM Deutschland Research  Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] scsi_debug: fix endianness bug in sdebug_build_parts()

2013-07-29 Thread Martin Peschke
On Sat, 2013-07-27 at 02:07 +0900, Akinobu Mita wrote:
 2013/7/26 Martin Peschke mpesc...@linux.vnet.ibm.com:
  On Mon, 2013-07-15 at 20:52 +0900, Akinobu Mita wrote:
  With module parameter num_parts  0, partition table is built on the
  ramdisk storage when loading the driver.  Unfortunately, there is an
  endianness bug in sdebug_build_parts().  So the partition table is not
  correctly initialized on big-endian systems.
 
  Signed-off-by: Akinobu Mita akinobu.m...@gmail.com
  Cc: James E.J. Bottomley jbottom...@parallels.com
  Cc: Douglas Gilbert dgilb...@interlog.com
  Cc: linux-scsi@vger.kernel.org
  ---
   drivers/scsi/scsi_debug.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)
 
  diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
  index cb4fefa..2f39b13 100644
  --- a/drivers/scsi/scsi_debug.c
  +++ b/drivers/scsi/scsi_debug.c
  @@ -2659,8 +2659,8 @@ static void __init sdebug_build_parts(unsigned char 
  *ramp,
   / sdebug_sectors_per;
pp-end_sector = (end_sec % sdebug_sectors_per) + 1;
 
  - pp-start_sect = start_sec;
  - pp-nr_sects = end_sec - start_sec + 1;
  + pp-start_sect = cpu_to_le32(start_sec);
  + pp-nr_sects = cpu_to_le32(end_sec - start_sec + 1);
pp-sys_ind = 0x83; /* plain Linux partition */
}
   }
 
 
  I have posted the same fix several times, e.g.
  http://marc.info/?l=linux-scsim=137051617907423w=2
  Good luck!
 
  Acked-by: Martin Peschke mpesc...@linux.vnet.ibm.com
 
 Thanks.  I found this problem from sparse warning noticed by 0-DAY kernel
 build testing.

It surfaced as a real problem on System z (big endian).
That is why I can confirm that the above patch works.

Thanks,
Martin


-- 
Linux on System z Development

IBM Deutschland Research  Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] scsi_debug: fix endianness bug in sdebug_build_parts()

2013-07-25 Thread Martin Peschke
On Mon, 2013-07-15 at 20:52 +0900, Akinobu Mita wrote:
 With module parameter num_parts  0, partition table is built on the
 ramdisk storage when loading the driver.  Unfortunately, there is an
 endianness bug in sdebug_build_parts().  So the partition table is not
 correctly initialized on big-endian systems.
 
 Signed-off-by: Akinobu Mita akinobu.m...@gmail.com
 Cc: James E.J. Bottomley jbottom...@parallels.com
 Cc: Douglas Gilbert dgilb...@interlog.com
 Cc: linux-scsi@vger.kernel.org
 ---
  drivers/scsi/scsi_debug.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
 index cb4fefa..2f39b13 100644
 --- a/drivers/scsi/scsi_debug.c
 +++ b/drivers/scsi/scsi_debug.c
 @@ -2659,8 +2659,8 @@ static void __init sdebug_build_parts(unsigned char 
 *ramp,
  / sdebug_sectors_per;
   pp-end_sector = (end_sec % sdebug_sectors_per) + 1;
 
 - pp-start_sect = start_sec;
 - pp-nr_sects = end_sec - start_sec + 1;
 + pp-start_sect = cpu_to_le32(start_sec);
 + pp-nr_sects = cpu_to_le32(end_sec - start_sec + 1);
   pp-sys_ind = 0x83; /* plain Linux partition */
   }
  }


I have posted the same fix several times, e.g.
http://marc.info/?l=linux-scsim=137051617907423w=2
Good luck!

Acked-by: Martin Peschke mpesc...@linux.vnet.ibm.com


-- 
Linux on System z Development

IBM Deutschland Research  Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RESEND] scsi_debug: Fix endianess in partition table

2013-06-06 Thread Martin Peschke
James,
could you pick up this fix, please?

Btw., it looks like others where bitten by this one:
https://nazar.karan.org/blob/distro!
parted.git/8780c767126938173f49dfbcd4360813932f7756/SOURCES!
disable-t9020.patch

Thanks,
Martin





Both start_sect and nr_sects in struct partition are __le32 and
require cpu_to_le32() on assignment.

Without this fix tools like fdisk show an invalid partition table
for SCSI devices emulated by scsi_debug on big-endian architectures,
like s390x. Besides a kernel message like this was emitted:

sda: p1 start 536870912 is beyond EOD, enabling native capacity
sda: p1 start 536870912 is beyond EOD, truncated

For verification 'xxd -l 512 /dev/sda' has been used to make sure
that this fix makes scsi_debug generated partition tables on s390x
look like the ones generated on my laptop.

Signed-off-by: Martin Peschke mpesc...@linux.vnet.ibm.com
Reviewed-by: Steffen Maier ma...@linux.vnet.ibm.com
Acked-by: Douglas Gilbert dgilb...@interlog.com

---
 drivers/scsi/scsi_debug.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -2662,8 +2662,8 @@ static void __init sdebug_build_parts(un
   / sdebug_sectors_per;
pp-end_sector = (end_sec % sdebug_sectors_per) + 1;
 
-   pp-start_sect = start_sec;
-   pp-nr_sects = end_sec - start_sec + 1;
+   pp-start_sect = cpu_to_le32(start_sec);
+   pp-nr_sects = cpu_to_le32(end_sec - start_sec + 1);
pp-sys_ind = 0x83; /* plain Linux partition */
}
 }


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


How to fix this sleep-inside-lock problem?

2013-06-05 Thread Martin Peschke
Hi,

I would like to ask for advice, prior to submitting a patch for our lldd
zfcp, or alternatively for common code.

Someone reported this warning and function call stack:

BUG: sleeping function called from invalid context at kernel/workqueue.c:2752
in_atomic(): 1, irqs_disabled(): 1, pid: 360, name: zfcperp0.0.1700
CPU: 1 Not tainted 3.9.3+ #69
Process zfcperp0.0.1700 (pid: 360, task: 75b7e080, ksp: 
7476bc30)
snip
Call Trace:
([001165de] show_trace+0x106/0x154)
 [001166a0] show_stack+0x74/0xf4
 [006ff646] dump_stack+0xc6/0xd4
 [0017f3a0] __might_sleep+0x128/0x148
 [0015ece8] flush_work+0x54/0x1f8
 [001630de] __cancel_work_timer+0xc6/0x128
 [005067ac] scsi_device_dev_release_usercontext+0x164/0x23c
 [00161816] execute_in_process_context+0x96/0xa8
 [004d33d8] device_release+0x60/0xc0
 [0048af48] kobject_release+0xa8/0x1c4
 [004f4bf2] __scsi_iterate_devices+0xfa/0x130
 [03ff801b307a] zfcp_erp_strategy+0x4da/0x1014 [zfcp]
 [03ff801b3caa] zfcp_erp_thread+0xf6/0x2b0 [zfcp]
 [0016b75a] kthread+0xf2/0xfc
 [0070c9de] kernel_thread_starter+0x6/0xc
 [0070c9d8] kernel_thread_starter+0x0/0xc

Apparently, the ref_count for some scsi_device drops down to zero,
triggering device removal through execute_in_process_context(), while
the lldd error recovery thread iterates through a scsi device list.
Unfortunately, execute_in_process_context() decides to immediately
execute that device removal function, instead of scheduling asynchronous
execution, since it detects process context and thinks it is safe to do
so. But almost all calls to shost_for_each_device() in our lldd are
inside spin_lock_irq, even in thread context. Obviously, schedule()
inside spin_lock_irq sections is a bad idea.

Does it make sense to teach execute_in_process_context() to run a
function asynchronously when irqs are disabled, as inside a
spin_lock_irq section? If so, is the addition of an irqs_disabled()
check sufficient?

Or is it preferable to change the lldd to use __shost_for_each_device()
with shost-host_lock? (hoping that doesn't result in locking order
issues with host_lock inside our lldd erp_lock...)

Other suggestions?

The problem has been introduced when our LUN related data was moved to
the midlayer (commit b62a8d9b45b971a67a0f8413338c230e3117dff5) back in
2010.


Thanks,
Martin

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


PATCH] scsi_debug: Fix endianess in partition table

2013-04-26 Thread Martin Peschke
James,
could you pick up this fix, preferably for 3.10, please?

I am not the only one you has run into the issue:
https://nazar.karan.org/blob/distro!
parted.git/8780c767126938173f49dfbcd4360813932f7756/SOURCES!
disable-t9020.patch

Thanks,
Martin




Both start_sect and nr_sects in struct partition are __le32 and
require cpu_to_le32() on assignment.

Without this fix tools like fdisk show an invalid partition table
for SCSI devices emulated by scsi_debug on big-endian architectures,
like s390x. Besides a kernel message like this was emitted:

sda: p1 start 536870912 is beyond EOD, enabling native capacity
sda: p1 start 536870912 is beyond EOD, truncated

For verification 'xxd -l 512 /dev/sda' has been used to make sure
that this fix makes scsi_debug generated partition tables on s390x
look like the ones generated on my laptop.

Signed-off-by: Martin Peschke mpesc...@linux.vnet.ibm.com
Reviewed-by: Steffen Maier ma...@linux.vnet.ibm.com
Acked-by: Douglas Gilbert dgilb...@interlog.com

---
 drivers/scsi/scsi_debug.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -2662,8 +2662,8 @@ static void __init sdebug_build_parts(un
   / sdebug_sectors_per;
pp-end_sector = (end_sec % sdebug_sectors_per) + 1;
 
-   pp-start_sect = start_sec;
-   pp-nr_sects = end_sec - start_sec + 1;
+   pp-start_sect = cpu_to_le32(start_sec);
+   pp-nr_sects = cpu_to_le32(end_sec - start_sec + 1);
pp-sys_ind = 0x83; /* plain Linux partition */
}
 }


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] scsi_debug: Fix endianess in partition table

2013-02-15 Thread Martin Peschke
James,
could you pick up this fix, please?

Btw., it looks like others where bitten by this one:
https://nazar.karan.org/blob/distro!
parted.git/8780c767126938173f49dfbcd4360813932f7756/SOURCES!
disable-t9020.patch

Thanks,
Martin





Both start_sect and nr_sects in struct partition are __le32 and
require cpu_to_le32() on assignment.

Without this fix tools like fdisk show an invalid partition table
for SCSI devices emulated by scsi_debug on big-endian architectures,
like s390x. Besides a kernel message like this was emitted:

sda: p1 start 536870912 is beyond EOD, enabling native capacity
sda: p1 start 536870912 is beyond EOD, truncated

For verification 'xxd -l 512 /dev/sda' has been used to make sure
that this fix makes scsi_debug generated partition tables on s390x
look like the ones generated on my laptop.

Signed-off-by: Martin Peschke mpesc...@linux.vnet.ibm.com
Reviewed-by: Steffen Maier ma...@linux.vnet.ibm.com
Acked-by: Douglas Gilbert dgilb...@interlog.com

---
 drivers/scsi/scsi_debug.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -2662,8 +2662,8 @@ static void __init sdebug_build_parts(un
   / sdebug_sectors_per;
pp-end_sector = (end_sec % sdebug_sectors_per) + 1;
 
-   pp-start_sect = start_sec;
-   pp-nr_sects = end_sec - start_sec + 1;
+   pp-start_sect = cpu_to_le32(start_sec);
+   pp-nr_sects = cpu_to_le32(end_sec - start_sec + 1);
pp-sys_ind = 0x83; /* plain Linux partition */
}
 }


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi_debug: Fix endianess in partition table

2013-02-13 Thread Martin Peschke
On Tue, 2013-02-12 at 09:45 -0500, Douglas Gilbert wrote:
 However since SCSI is big endian and you are introducing
 some le code then a line or so of explanation (comments)
 in your revised patch might be helpful.

I would argue that the definition of struct partition itself should be
sufficient explanation. Only scsi_debug failed to assign values in a
correct manner. 

 BTW Finding a big endian architecture to test this patch on is
 not easy.

No big deal ;-) I can help out with test data from my System z (aka
s390x).


Without fix:

[root@ ~]# xxd -l 512 /dev/sda
snip
1b0:        0001  
1c0: 0100 8307 203f  0020  3fe0    ?... ..?...
1d0:          
1e0:          
1f0:        55aa  ..U.


With fix:

[root@ ~]# xxd -l 512 /dev/sda
snip
1b0:        0001  
1c0: 0100 8307 203f 2000  e03f     ? ?
1d0:          
1e0:          
1f0:        55aa  ..U.


Which makes it exactly look like the table seen on my laptop:

root@:~# xxd -l 512 /dev/sdb
snip
1b0:        0001  
1c0: 0100 8307 203f 2000  e03f     ? ?
1d0:          
1e0:          
1f0:        55aa  ..U.


Assuming:
modprobe scsi_debug physblk_exp=3 lowest_aligned=7 num_parts=1

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi_debug: Fix endianess in partition table

2013-02-12 Thread Martin Peschke
On Mon, 2013-02-11 at 18:34 +0100, Martin Peschke wrote:
 Both start_sect and nr_sects in struct partition are __le32 and
 require cpu_to_le32() on assignment.

Steffen Maier has pointed me at:

block/partitions/msdos.c:   return
(sector_t)get_unaligned_le32(p-start_sect);

Unfortunately, both get_unaligned_le32() and le32_to_cpu() appear to be
in use for start_sect and nr_sects.

Any one who would argue for changing my patch from cpu_to_le32 to
put_unaligned_le32()?

Thanks,
Martin



 
 Without this fix tools like fdisk show an invalid partition table
 for SCSI devices emulated by scsi_debug on big-endian architectures,
 like s390x. Besides a kernel message like this was emitted:
 
 sda: p1 start 536870912 is beyond EOD, enabling native capacity
 sda: p1 start 536870912 is beyond EOD, truncated
 
 For verification 'xxd -l 512 /dev/sda' has been used to make sure
 that this fix makes scsi_debug generated partition tables on s390x
 look like the ones generated on my laptop.
 
 Signed-off-by: Martin Peschke mpesc...@linux.vnet.ibm.com
 Reviewed-by: Steffen Maier ma...@linux.vnet.ibm.com
 
 ---
  drivers/scsi/scsi_debug.c |4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 --- a/drivers/scsi/scsi_debug.c
 +++ b/drivers/scsi/scsi_debug.c
 @@ -2662,8 +2662,8 @@ static void __init sdebug_build_parts(un
  / sdebug_sectors_per;
   pp-end_sector = (end_sec % sdebug_sectors_per) + 1;
 
 - pp-start_sect = start_sec;
 - pp-nr_sects = end_sec - start_sec + 1;
 + pp-start_sect = cpu_to_le32(start_sec);
 + pp-nr_sects = cpu_to_le32(end_sec - start_sec + 1);
   pp-sys_ind = 0x83; /* plain Linux partition */
   }
  }
 
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] scsi_debug: Fix endianess in partition table

2013-02-11 Thread Martin Peschke
Both start_sect and nr_sects in struct partition are __le32 and
require cpu_to_le32() on assignment.

Without this fix tools like fdisk show an invalid partition table
for SCSI devices emulated by scsi_debug on big-endian architectures,
like s390x. Besides a kernel message like this was emitted:

sda: p1 start 536870912 is beyond EOD, enabling native capacity
sda: p1 start 536870912 is beyond EOD, truncated

For verification 'xxd -l 512 /dev/sda' has been used to make sure
that this fix makes scsi_debug generated partition tables on s390x
look like the ones generated on my laptop.

Signed-off-by: Martin Peschke mpesc...@linux.vnet.ibm.com
Reviewed-by: Steffen Maier ma...@linux.vnet.ibm.com

---
 drivers/scsi/scsi_debug.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -2662,8 +2662,8 @@ static void __init sdebug_build_parts(un
   / sdebug_sectors_per;
pp-end_sector = (end_sec % sdebug_sectors_per) + 1;
 
-   pp-start_sect = start_sec;
-   pp-nr_sects = end_sec - start_sec + 1;
+   pp-start_sect = cpu_to_le32(start_sec);
+   pp-nr_sects = cpu_to_le32(end_sec - start_sec + 1);
pp-sys_ind = 0x83; /* plain Linux partition */
}
 }


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2][RFC] scsi_transport_fc: Implement I_T nexus reset

2012-12-11 Thread Martin Peschke
Hello Hannes,

 fc_eh_it_nexus_loss_handler() is invoked as the
 eh_target_reset_handler() callback and the
 eh_bus_reset_handler() is removed.

lpfc_target_reset_handler(), which is replaced by your patch, used to
issue a TARGET_RESET task management function over FCP in the
eh_target_reset_handler() callback. What's wrong with that?

Martin


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch 0/2] zfcp: bug fixes for error recovery

2007-11-15 Thread Martin Peschke
I am posting 2 bug fixes which are required for the zfcp error recovery
to work correctly. It takes some error injection tests to strain the
zfcp error recovery to the extend which triggers both bugs. The bugs are
related to recovery actions which need to be dismissed because some
stronger action has been requested. We have verified both patches
running our error injection tests. Patches are against 2.6.24-rc2-git5.
Please apply.

[Patch 1/2] zfcp: fix dismissal of error recovery actions
[Patch 2/2] zfcp: fix cleanup of dismissed error recovery actions

Signed-off-by: Martin Peschke [EMAIL PROTECTED]
Acked-by: Swen Schillig [EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch 2/2] zfcp: fix cleanup of dismissed error recovery actions

2007-11-15 Thread Martin Peschke
Calling zfcp_erp_strategy_check_action() after zfcp_erp_action_to_running()
in zfcp_erp_strategy() might cause an unbalanced up() for erp_ready_sem,
which makes the zfcp recovery fail somewhere along the way:


erp thread processing erp_action:
|
|   someone waking up erp thread for erp_action
|   |
|   |   someone else dismissing erp_action:
|   |   |
V   V   V

write_lock_irqsave(adapter-erp_lock, flags);
...
if (zfcp_erp_action_exists(erp_action) == ZFCP_ERP_ACTION_RUNNING) {
zfcp_erp_action_to_ready(erp_action);
up(adapter-erp_ready_sem);/* first up() for erp_action */
}
write_unlock_irqrestore(adapter-erp_lock, flags);


write_lock_irqsave(adapter-erp_lock, flags);
...
zfcp_erp_action_to_running(erp_action);
write_unlock_restore(adapter-erp_lock, flags);
/* processing erp_action */


write_lock_irqsave(adapter-erp_lock, flags);
...
erp_action-status |= ZFCP_STATUS_ERP_DISMISSED;
if (zfcp_erp_action_exists(erp_action) ==
ZFCP_ERP_ACTION_RUNNING) {
zfcp_erp_action_to_ready(erp_action);
up(adapter-erp_ready_sem);
/* second, unbalanced up() for erp_action */
}
...
write_unlock_restore(adapter-erp_lock, flags);


write_lock_irqsave(adapter-erp_lock, flags);
if (erp_action-status  ZFCP_STATUS_ERP_DISMISSED) {
zfcp_erp_action_dequeue(erp_action);
retval = ZFCP_ERP_DISMISSED;
}
...
write_unlock_restore(adapter-erp_lock, flags);
down(adapter-erp_ready_sem);
/* this down() is meant to balance the first up() */


The erp thread must not dismiss an erp_action after moving that action to
erp_running_head. Instead it should just go through the down() operation,
which balances the first up(), and run through zfcp_erp_strategy one more
time for the second up(), which eventually cleans up erp_action. Which
is similar to the normal processing of an event for erp_action doing
something asynchronously (e.g. waiting for the completion of an fsf_req).

This only works if we make sure that a dismissed erp_action is passed to
zfcp_erp_strategy() prior to the other action, which caused actions to be
dismissed. Therefore the patch implements this rule: running actions go to
the head of the ready list; new actions go to the tail of the ready list;
the erp thread picks actions to be processed from the ready list's head.

Signed-off-by: Martin Peschke [EMAIL PROTECTED]
Acked-by: Swen Schillig [EMAIL PROTECTED]

---
 drivers/s390/scsi/zfcp_erp.c |   14 ++
 1 files changed, 6 insertions(+), 8 deletions(-)

--- linux-2.6.24-rc2-git5.orig/drivers/s390/scsi/zfcp_erp.c
+++ linux-2.6.24-rc2-git5/drivers/s390/scsi/zfcp_erp.c
@@ -1065,7 +1065,7 @@ zfcp_erp_thread(void *data)
 adapter-status)) {
 
write_lock_irqsave(adapter-erp_lock, flags);
-   next = adapter-erp_ready_head.prev;
+   next = adapter-erp_ready_head.next;
write_unlock_irqrestore(adapter-erp_lock, flags);
 
if (next != adapter-erp_ready_head) {
@@ -1155,15 +1155,13 @@ zfcp_erp_strategy(struct zfcp_erp_action
 
/*
 * check for dismissed status again to avoid follow-up actions,
-* failing of targets and so on for dismissed actions
+* failing of targets and so on for dismissed actions,
+* we go through down() here because there has been an up()
 */
-   retval = zfcp_erp_strategy_check_action(erp_action, retval);
+   if (erp_action-status  ZFCP_STATUS_ERP_DISMISSED)
+   retval = ZFCP_ERP_CONTINUES;
 
switch (retval) {
-   case ZFCP_ERP_DISMISSED:
-   /* leave since this action has ridden to its ancestors */
-   debug_text_event(adapter-erp_dbf, 6, a_st_dis2);
-   goto unlock;
case ZFCP_ERP_NOMEM:
/* no memory to continue immediately, let it sleep */
if (!(erp_action-status  ZFCP_STATUS_ERP_LOWMEM)) {
@@ -3091,7 +3089,7 @@ zfcp_erp_action_enqueue(int action,
++adapter-erp_total_count;
 
/* finally put it into 'ready' queue and kick erp thread */
-   list_add(erp_action-list, adapter-erp_ready_head);
+   list_add_tail(erp_action-list, adapter-erp_ready_head);
up(adapter-erp_ready_sem);
retval = 0;
  out:


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] zfcp: zfcp: add statistics and other zfcp relatedinformation to sysfs

2007-09-07 Thread Martin Peschke

I am afraid this patch can't be applied as it is.

The problem is that the code doesn't take into account that older adapters do 
not provide the data which your patch tries to extract. There is a feature flag 
for that in the hardware specification. So the right response to Heiko's 
complaint about an unused feature flag wasn't to remove the flag definition... :-)


Is this patch against 2.6.23-rc or something? It doesn't apply without some 
minor noise. Besides there are some HEADs below, which makes me think this is 
against some internal CVS revision...


For more nit-picking please see below.

Thanks,
Martin

Swen Schillig wrote:

From: Swen Schillig [EMAIL PROTECTED]

add statistics and other zfcp related information to sysfs

The zfcp adapter provides a variety of information which was never
published to the external world. This patch adds a few of those statistics
to the sysfs tree structure.

These are reflected in the following attributes

/sys/bus/ccw/drivers/zfcp/0.0.1707 
/* information for the virtual adapter */
statistic_services/ 
requests
megabytes

utilization
seconds_active
/* LUN specific info for channel- and fabric-latency */
0x500507630300c562/0x401040a6/  
read_latency

write_latency
cmd_latency

Signed-off-by: Swen Schillig [EMAIL PROTECTED]
---
 drivers/s390/scsi/Makefile|2 
 drivers/s390/scsi/zfcp_aux.c  |   30 -

 drivers/s390/scsi/zfcp_def.h  |   14 ++
 drivers/s390/scsi/zfcp_ext.h  |2 
 drivers/s390/scsi/zfcp_fsf.c  |1 
 drivers/s390/scsi/zfcp_fsf.h  |   28 -

 drivers/s390/scsi/zfcp_qdio.c |   39 +++
 drivers/s390/scsi/zfcp_sysfs_statistics.c |  162 ++
 drivers/s390/scsi/zfcp_sysfs_unit.c   |   63 +++
 9 files changed, 333 insertions(+), 8 deletions(-)

Index: HEAD/drivers/s390/scsi/zfcp_def.h
===
--- HEAD.orig/drivers/s390/scsi/zfcp_def.h
+++ HEAD/drivers/s390/scsi/zfcp_def.h
@@ -868,6 +868,17 @@ struct zfcp_erp_action {
struct timer_list timer;
 };

+struct latency_cont {
+   u32 channel;
+   u32 fabric;
+   u32 counter;
+};


Are you sure you want just 32 bits for sums? Given the unit used (hw ticks) it 
overruns quickly.



+
+struct zfcp_latencies {
+   struct latency_cont read;
+   struct latency_cont write;
+   struct latency_cont cmd;
+};

 struct zfcp_adapter {
struct list_headlist;  /* list of adapters */
@@ -883,6 +894,7 @@ struct zfcp_adapter {
u32 adapter_features;  /* FCP channel features */
u32 connection_features; /* host connection 
features */
 u32hardware_version;  /* of FCP channel */
+   u16 timer_ticks;   /* time int for a tick */
struct Scsi_Host*scsi_host;/* Pointer to mid-layer */
struct list_headport_list_head;/* remote port list */
struct list_headport_remove_lh;/* head of ports to be
@@ -930,6 +942,7 @@ struct zfcp_adapter {
struct zfcp_scsi_dbf_record scsi_dbf_buf;
struct zfcp_adapter_mempool pool;  /* Adapter memory pools */
struct qdio_initialize  qdio_init_data;/* for qdio_establish */
+   struct device   stat_services;


After my fancy, stat_services is not a good choice. Shorten it; statistics is 
descriptive enough.


In case you have been inspired by generic_services below: generic_services 
refers to Generic Services (GS) as defined by FC standard documents.



struct device   generic_services;  /* directory for WKA ports */
struct fc_host_statistics *fc_stats;
struct fsf_qtcb_bottom_port *stats_reset_data;
@@ -986,6 +999,7 @@ struct zfcp_unit {
  all scsi_scan_target
  requests have been
  completed. */
+   struct zfcp_latencies   latencies;
 };

 /* FSF request */
Index: HEAD/drivers/s390/scsi/zfcp_fsf.c
===
--- HEAD.orig/drivers/s390/scsi/zfcp_fsf.c
+++ HEAD/drivers/s390/scsi/zfcp_fsf.c
@@ -2079,6 +2079,7 @@ zfcp_fsf_exchange_config_evaluate(struct
fc_host_supported_classes(shost) =
FC_COS_CLASS2 | FC_COS_CLASS3;
adapter-hydra_version = bottom-adapter_type;
+   adapter-timer_ticks = bottom-timer_interval;
if (fc_host_permanent_port_name(shost) == -1)
fc_host_permanent_port_name(shost) =

Re: [PATCH, RFC] SCSI cleanups for 2.5

2001-01-18 Thread Martin Peschke



On Tue, 16 Jan 2001, Prasenjit Sarkar wrote:

 
 
 
 Well, WWN may be either WWPN (port name) or WWNN (node name).
 According to the mapping of targetID and D_ID, WWPN seems to be the right
 choice.
 
  Of course, some devices allow access to a given storage through
  different nodes as well as different ports.  There are a vendor
 
 an example:
 http://www.storage.ibm.com/hardsoft/products/ess/support/essfcwp.pdf
 
 
  unique methods of determining when that is the case, unfortunately.
  I presume such devices are beyond the scope of this proposal.
 
 Do you know more about such methods?
 
 -
 
 Actually, the SCSI INQUIRY command with EVPD bit on and page code 83 should
 return the VPID of every LU -- this command is used by some SCSI initiators
 and most multipathing drivers to determine whether the same LU has been
 accessed through different paths.
 
 You can find more details in SPC-2 (or SPC-3).

I am aware of these drafts or standards. But I can't find VPID.
I know VPD (Vital Product Data).

Regards
Martin Peschke

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]