[LSF/MM TOPIC][LSF/MM ATTEND] blk-mq and I/O scheduling

2016-02-26 Thread Andreas Herrmann
Hi,

I'd like to participate in LSF/MM and would like to present/discuss
ideas for introducing I/O scheduling support to blk-mq.

Motivation for this is to be able to use scsi-mq even on systems that
have slow (spinning) devices attached to the SCSI stack.

I think the presentation/discussion should consist of the following:

(1) a short overview of how blk-mq currently performs with spinning
devices (in comparison to CFQ)

(2) information about my attempt to introduce a per-sw-queue time slice
in blk-mq to mitigate the performance degradation with spinning
devices (in comparison to CFQ)

(3) hopefully, information about working code for another
approach to introducing I/O scheduling for blk-mq (which I am
currently looking into)

(4) other ideas (e.g. toggling blk-mq per host and/or why we won't look
into it)


Thanks,

Andreas
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v2] blk-mq: Introduce per sw queue time-slice

2016-02-10 Thread Andreas Herrmann
On Wed, Feb 10, 2016 at 08:47:15PM +0100, Markus Trippelsdorf wrote:
> On 2016.02.10 at 20:34 +0100, Andreas Herrmann wrote:
> > On Tue, Feb 09, 2016 at 06:41:56PM +0100, Markus Trippelsdorf wrote:
> > > > Recently Johannes sent a patch to enable scsi-mq per driver, see
> > > > http://marc.info/?l=linux-scsi&m=145347009631192&w=2
> > > > 
> > > > Probably that is a good solution (at least in the short term) to allow
> > > > users to switch to blk-mq for some host adapters (with fast storage
> > > > attached) but to stick to legacy stuff on other host adapters with
> > > > rotary devices.
> > > 
> > > I don't think that Johannes' patch is a good solution.
> > 
> > Why? Because it's not per device?
> 
> Yes. Like Christoph said in his reply to the patch: »The host is simply
> the wrong place to decide these things.«
> 
> > > The best solution for the user would be if blk-mq could be toggled
> > > per drive (or even automatically enabled if queue/rotational == 0).
> > 
> > Yes, I agree, but ...
> > 
> > > Is there a fundamental reason why this is not feasible?
> > 
> > ... it's not possible (*) with the current implementation.
> > 
> > Tag handling/command allocation differs. Respective functions are set
> > per host.
> > 
> > (*) Or maybe it's possible but just hard to achieve and I didn't look
> > long enough into relevant code to get an idea how to do it.
> > 
> > > Your solution is better than nothing, but it requires that the user
> > > finds out the drive <=> host mapping by hand and then runs something
> > > like: 
> > > echo "250" > /sys/devices/pci:00/:00:11.0/ata2/host1/target1:0:0/1:0:0:0/block/sdb/mq/0/time_slice_us
> > > during boot for spinning rust drives...
> > 
> > Or it could automatically be set in case of rotational device.
> > (Once we know for sure that it doesn't cause performance degradation.)
> 
> Yes, this sounds like a good idea.
> 
> But, if I understand things correctly, your patch is only an interim
> solution until proper I/O scheduler support gets implemented for blk-mq, no?

That's to be discussed. (Hence the RFC)

My (potentially wrong) claims are

- I don't think that fast storage (e.g. SSDs) requires I/O scheduler
  support with blk-mq. blk-mq is very good at pushing a large number
  of requests from per CPU sw queues to hw queue(s). Why then
  introduce any overhead for I/O scheduler support?

- Slow storage (e.g. spinning drives) is fine with the old code which
  provides scheduler support and I doubt that there is any benefit for
  those devices when switching to blk-mq.

- The big hammer (scsi_mod.use_blk_mq) for the entire scsi stack to
  decide what to use is suboptimal. You can't have optimal performance
  when you have both slow and fast storage devices in your system.

I doubt that it is possible to add I/O scheduling support to blk-mq
which can be on par with what CFQ is able to achieve for slow devices
at the moment.

Requests are scattered among per-CPU software queues (and almost
instantly passed to hardware queue(s)). Due to CPU scheduling,
requests initiated from one process might come down via different
software queues. What is an efficient way to sort/merge requests from
all the software queues in such a way that the result is comparable to
what CFQ does (assuming that CFQ provides optimal performance)? So far
I didn't find a solution to this problem. (I just have this patch
which adds not too much overhead and improves the situation a little
bit.)
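
For readers without the patch at hand, the selection policy it adds can
be modelled in user space roughly as below. This is only a sketch under
stated assumptions, not the patch code: struct swq_model, swq_select()
and NR_SWQ are hypothetical names, time is a plain counter in usecs, and
the extension of a slice (up to 3 times the initial value) is left out.
With tslice_us == 0 the model simply rotates on every call, which is a
simplification of what unmodified blk-mq does.

```c
#include <stddef.h>

/* Hypothetical model: one hardware context fed by NR_SWQ software queues. */
#define NR_SWQ 4

struct swq_model {
	unsigned int pending[NR_SWQ];	/* queued requests per software queue */
	unsigned int cur;		/* queue that currently owns the slice */
	unsigned long slice_end;	/* time (usecs) at which the slice expires */
	unsigned long tslice_us;	/* slice length; 0 disables slicing here */
};

/*
 * Return the index of the software queue to service at time 'now',
 * or -1 if no queue has pending requests.
 */
static int swq_select(struct swq_model *m, unsigned long now)
{
	unsigned int i, q;

	/* Stay on the current queue while its slice runs and it has work. */
	if (m->tslice_us && now < m->slice_end && m->pending[m->cur])
		return (int)m->cur;

	/* Otherwise take the next queue with pending requests (round robin). */
	for (i = 1; i <= NR_SWQ; i++) {
		q = (m->cur + i) % NR_SWQ;
		if (m->pending[q]) {
			m->cur = q;
			m->slice_end = now + m->tslice_us;
			return (int)q;
		}
	}
	return -1;
}
```

The model also shows the weak point discussed above: the slice is tied
to a software queue, not to a process, so it only helps as long as a
process keeps submitting from the same CPU.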

Maybe the solution is to avoid per-CPU queues for slow storage and
fall back to a set of queues comparable to what CFQ uses.

One way to do this is by falling back to non-blk-mq code and direct
use of CFQ.

Code that allows selecting blk-mq per host would help to some
extent. But when you have both device types connected to the same host
adapter, it doesn't help either.


Andreas


Re: [RFC PATCH v2] blk-mq: Introduce per sw queue time-slice

2016-02-10 Thread Andreas Herrmann
On Tue, Feb 09, 2016 at 06:41:56PM +0100, Markus Trippelsdorf wrote:
> On 2016.02.09 at 18:12 +0100, Andreas Herrmann wrote:
> > [CC-ing linux-block and linux-scsi and adding some comments]
> > 
> > On Mon, Feb 01, 2016 at 11:43:40PM +0100, Andreas Herrmann wrote:
> > > This introduces a new blk_mq hw attribute time_slice_us which allows
> > > specifying a time slice in usecs.
> > > 
> > > Fio test results are sent in a separate mail to this.
> > 
> > See http://marc.info/?l=linux-kernel&m=145436682607949&w=2
> > 
> > In short it shows significant performance gains in some tests,
> > e.g. sequential read iops up by >40% with 8 jobs. But it's never on
> > par with CFQ when more than 1 job was used during the test.
> > 
> > > Results for fio improved to some extent with this patch. But in
> > > reality the picture is quite mixed. Performance is highly dependent on
> > > task scheduling. There is no guarantee that the requests originated
> > > from one CPU belong to the same process.
> > > 
> > > I think for rotary devices CFQ is by far the best choice. A simple
> > > illustration is:
> > > 
> > >   Copying two files (750MB in this case) in parallel on a rotary
> > >   device. The elapsed wall clock time (seconds) for this is
> > >                                 mean    stdev
> > >    cfq, slice_idle=8            16.18   4.95
> > >    cfq, slice_idle=0            23.74   2.82
> > >    blk-mq, time_slice_usec=0    24.37   2.05
> > >    blk-mq, time_slice_usec=250  25.58   3.16
> > 
> > This illustrates that although there was a performance gain with fio
> > tests, the patch can cause higher variance and lower performance in
> > comparison to unmodified blk-mq with other tests. And it underscores
> > the superiority of CFQ for rotary disks.
> > 
> > Meanwhile my opinion is that it's not really worth looking further
> > into introducing I/O scheduling support in blk-mq. I don't see the
> > need for scheduling support (deadline or something else) for fast
> > storage devices. And rotary devices should really avoid usage of blk-mq
> > and stick to CFQ.
> > 
> > Thus I think that introducing some coexistence of blk-mq and the
> > legacy block layer with CFQ is the best option.
> > 
> > Recently Johannes sent a patch to enable scsi-mq per driver, see
> > http://marc.info/?l=linux-scsi&m=145347009631192&w=2
> > 
> > Probably that is a good solution (at least in the short term) to allow
> > users to switch to blk-mq for some host adapters (with fast storage
> > attached) but to stick to legacy stuff on other host adapters with
> > rotary devices.
> 
> I don't think that Johannes' patch is a good solution.

Why? Because it's not per device?

> The best solution for the user would be if blk-mq could be toggled
> per drive (or even automatically enabled if queue/rotational == 0).

Yes, I agree, but ...

> Is there a fundamental reason why this is not feasible?

... it's not possible (*) with the current implementation.

Tag handling/command allocation differs. Respective functions are set
per host.

(*) Or maybe it's possible but just hard to achieve and I didn't look
long enough into relevant code to get an idea how to do it.

> Your solution is better than nothing, but it requires that the user
> finds out the drive <=> host mapping by hand and then runs something
> like: 
> echo "250" > /sys/devices/pci:00/:00:11.0/ata2/host1/target1:0:0/1:0:0:0/block/sdb/mq/0/time_slice_us
> during boot for spinning rust drives...

Or it could automatically be set in case of rotational device.
(Once we know for sure that it doesn't cause performance degradation.)
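
A minimal sketch of that idea, done from user space for illustration:
read the device's queue/rotational attribute and choose a non-zero time
slice only for rotational devices. The helper names read_attr_int() and
pick_time_slice_us() are hypothetical, and the value 250 in the usage
note is just the one from the example above, not a recommendation.

```c
#include <stdio.h>

/*
 * Read a single integer from a sysfs-style attribute file,
 * e.g. the queue/rotational file of a block device.
 * Returns 0 on success, -1 if the file cannot be read or parsed.
 */
static int read_attr_int(const char *path, int *val)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", val) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Choose the time slice (usecs): non-zero only for rotational devices. */
static unsigned int pick_time_slice_us(int rotational, unsigned int slice_us)
{
	return rotational ? slice_us : 0;
}
```

A caller would pass the value read from queue/rotational to
pick_time_slice_us(rotational, 250) and write the result to the
time_slice_us attribute of each hardware queue.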

> -- 
> Markus

Andreas


Re: [RFC PATCH v2] blk-mq: Introduce per sw queue time-slice

2016-02-09 Thread Andreas Herrmann
[CC-ing linux-block and linux-scsi and adding some comments]

On Mon, Feb 01, 2016 at 11:43:40PM +0100, Andreas Herrmann wrote:
> This introduces a new blk_mq hw attribute time_slice_us which allows
> specifying a time slice in usecs.
> 
> Default value is 0 and implies no modification to blk-mq behaviour.
> 
> A positive value changes blk-mq to service only one software queue
> within this time slice until it expires or the software queue is
> empty. Then the next software queue with pending requests is selected.
> 
> Signed-off-by: Andreas Herrmann 
> ---
>  block/blk-mq-sysfs.c   |  27 +++
>  block/blk-mq.c         | 208 +
>  include/linux/blk-mq.h |   9 +++
>  3 files changed, 211 insertions(+), 33 deletions(-)
> 
> Hi,
> 
> This update is long overdue (sorry for the delay).
> 
> Change to v1:
> - time slice is now specified in usecs instead of msecs.
> - time slice is extended (up to 3 times the initial value) when there
>   was actually a request to be serviced for the software queue
> 
> Fio test results are sent in a separate mail to this.

See http://marc.info/?l=linux-kernel&m=145436682607949&w=2

In short it shows significant performance gains in some tests,
e.g. sequential read iops up by >40% with 8 jobs. But it's never on
par with CFQ when more than 1 job was used during the test.

> Results for fio improved to some extent with this patch. But in
> reality the picture is quite mixed. Performance is highly dependent on
> task scheduling. There is no guarantee that the requests originated
> from one CPU belong to the same process.
> 
> I think for rotary devices CFQ is by far the best choice. A simple
> illustration is:
> 
>   Copying two files (750MB in this case) in parallel on a rotary
>   device. The elapsed wall clock time (seconds) for this is
>                                 mean    stdev
>    cfq, slice_idle=8            16.18   4.95
>    cfq, slice_idle=0            23.74   2.82
>    blk-mq, time_slice_usec=0    24.37   2.05
>    blk-mq, time_slice_usec=250  25.58   3.16

This illustrates that although there was a performance gain with fio
tests, the patch can cause higher variance and lower performance in
comparison to unmodified blk-mq with other tests. And it underscores
the superiority of CFQ for rotary disks.

Meanwhile my opinion is that it's not really worth looking further
into introducing I/O scheduling support in blk-mq. I don't see the
need for scheduling support (deadline or something else) for fast
storage devices. And rotary devices should really avoid usage of blk-mq
and stick to CFQ.

Thus I think that introducing some coexistence of blk-mq and the
legacy block layer with CFQ is the best option.

Recently Johannes sent a patch to enable scsi-mq per driver, see
http://marc.info/?l=linux-scsi&m=145347009631192&w=2

Probably that is a good solution (at least in the short term) to allow
users to switch to blk-mq for some host adapters (with fast storage
attached) but to stick to legacy stuff on other host adapters with
rotary devices.

What do others think?


Thanks,

Andreas


> Regards,
> 
> Andreas
> 
> diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c
> index 1cf1878..77c875c 100644
> --- a/block/blk-mq-sysfs.c
> +++ b/block/blk-mq-sysfs.c
> @@ -247,6 +247,26 @@ static ssize_t blk_mq_hw_sysfs_cpus_show(struct blk_mq_hw_ctx *hctx, char *page)
>  	return ret;
>  }
>  
> +static ssize_t blk_mq_hw_sysfs_tslice_show(struct blk_mq_hw_ctx *hctx,
> +					   char *page)
> +{
> +	return sprintf(page, "%u\n", hctx->tslice_us);
> +}
> +
> +static ssize_t blk_mq_hw_sysfs_tslice_store(struct blk_mq_hw_ctx *hctx,
> +					    const char *page, size_t length)
> +{
> +	unsigned long long store;
> +	int err;
> +
> +	err = kstrtoull(page, 10, &store);
> +	if (err)
> +		return -EINVAL;
> +
> +	hctx->tslice_us = (unsigned)store;
> +	return length;
> +}
> +
>  static struct blk_mq_ctx_sysfs_entry blk_mq_sysfs_dispatched = {
>  	.attr = {.name = "dispatched", .mode = S_IRUGO },
>  	.show = blk_mq_sysfs_dispatched_show,
> @@ -305,6 +325,12 @@ static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_poll = {
>  	.show = blk_mq_hw_sysfs_poll_show,
>  };
>  
> +static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_tslice = {
> +	.attr = {.name = "time_slice_us", .mode = S_IRUGO | S_IWUSR },
> +	.show = blk_mq_hw_sysfs_tslice_show,
> +	.store = blk_mq_hw_sysfs_tslice_store,
> +};
> +
>  static struct attribute *default_hw_ctx_attrs[] = {
>   &blk_mq_hw_sysfs_queue

Re: Documents for SCSI mid layer

2007-04-05 Thread Andreas Herrmann
On Fri, Apr 06, 2007 at 03:30:52AM +0200, mahesh wrote:
> Where can I find documents for Linux SCSI mid layer

How about

  Documentation/scsi/scsi_mid_low_api.txt
  Documentation/scsi/scsi_eh.txt

and Documentation/scsi in general as a start?


Regards,

Andreas

-- 
AMD Saxony, Dresden, Germany
Operating System Research Center





Re: [PATCH] zfcp: Invalid locking order

2007-02-08 Thread Andreas Herrmann
On Thu, Feb 08, 2007 at 03:40:03PM +0100, Heiko Carstens wrote:
> On Wed, Feb 07, 2007 at 05:06:43PM +0100, Andreas Herrmann wrote:
> > On Wed, Feb 07, 2007 at 01:17:57PM +0100, Swen Schillig wrote:
> > > From: Swen Schillig <[EMAIL PROTECTED]>
> > > 
> > > Invalid locking order. Kernel hangs after trying to take two locks
> > > which are dependent on each other. Introducing temporary variable
> > > to free requests. Free lock after requests are copied.
> > > 
> > 
> > I am just curious. You didn't mention which locks are causing the
> > deadlock.
> > 
> > I've glanced through the code and it seems that locking order
> > of abort_lock and req_list_lock for adapters is inconsistent.
> > Is that the bug you are trying to fix?
> 
> It's a possible A-B-B-A deadlock on the erp_lock and req_list_lock
> (see output below).  The bug was introduced with
> fea9d6c7bcd8ff1d60ff74f27ba483b3820b18a3 and this patch reverts
> parts of it.
> 

Pretty good catch.

I might be wrong but the patch seems to fix another potential deadlock:


CPU#0
0: zfcp_fsf_req_dismiss_all()
       spin_lock_irqsave(&adapter->req_list_lock, flags);
1: zfcp_fsf_req_dismiss()
2: zfcp_fsf_protstatus_eval()
3: zfcp_fsf_fsfstatus_eval()
4: zfcp_fsf_req_dispatch()
5: zfcp_fsf_send_fcp_command_handler()
6: zfcp_fsf_send_fcp_command_task_handler()
       read_lock_irqsave(&fsf_req->adapter->abort_lock, flags);

CPU#1
0: zfcp_scsi_eh_abort_handler()
       write_lock_irqsave(&adapter->abort_lock, flags);
       spin_lock(&adapter->req_list_lock);

But I currently cannot tell whether zfcp_fsf_req_dismiss_all()
and zfcp_scsi_eh_abort_handler() can run simultaneously for the
same adapter.

However that may be, with Swen's patch this case won't happen.
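
As I read the patch description ("Introducing temporary variable to
free requests. Free lock after requests are copied."), the pattern is:
detach the requests while holding the list lock, drop the lock, and only
then run the per-request handling that may take other locks. A minimal
user-space model of that pattern, assuming nothing about the real zfcp
code (all names are hypothetical and the lock is just a flag):

```c
#include <stdbool.h>
#include <stddef.h>

struct req {
	struct req *next;
};

struct adapter_model {
	bool req_list_locked;		/* stands in for req_list_lock */
	struct req *req_list;		/* pending requests */
	unsigned int handled_unlocked;	/* requests handled without the lock */
};

/*
 * Stand-in for the dismiss path. In the driver this path may end up
 * taking abort_lock, so it must not run with req_list_lock held.
 */
static void handle_req(struct adapter_model *a, struct req *r)
{
	(void)r;
	if (!a->req_list_locked)
		a->handled_unlocked++;
}

static void dismiss_all(struct adapter_model *a)
{
	struct req *tmp, *next;

	a->req_list_locked = true;	/* take the list lock */
	tmp = a->req_list;		/* detach the whole list ... */
	a->req_list = NULL;
	a->req_list_locked = false;	/* ... and drop the lock again */

	/* Handle the detached requests with no list lock held. */
	for (; tmp; tmp = next) {
		next = tmp->next;
		handle_req(a, tmp);
	}
}
```

Because the handler never runs under the list lock, the two lock orders
shown above can no longer meet.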


Regards,

Andreas

-- 
AMD Saxony, Dresden, Germany
Operating System Research Center





Re: [PATCH] zfcp: Invalid locking order

2007-02-07 Thread Andreas Herrmann
On Wed, Feb 07, 2007 at 01:17:57PM +0100, Swen Schillig wrote:
> From: Swen Schillig <[EMAIL PROTECTED]>
> 
> Invalid locking order. Kernel hangs after trying to take two locks
> which are dependent on each other. Introducing temporary variable
> to free requests. Free lock after requests are copied.
> 

I am just curious. You didn't mention which locks are causing the
deadlock.

I've glanced through the code and it seems that locking order
of abort_lock and req_list_lock for adapters is inconsistent.
Is that the bug you are trying to fix?


Regards,

Andreas





Re: [PATCH 1/7] zfcp: introduce eh_timed_out handler

2005-09-04 Thread Andreas Herrmann

On Sat, Sep 03, 2005 at 14:53:00, Christoph Hellwig wrote:
 > Please don't do this.  We have proper midlayer handling (plus FC
 > transport class wrappers) to handle that case without introducing a
 > big mess in the driver.  Please take a look at
 > fc_remote_port_{block,unblock}

Thanks for that hint.

But does fc_remote_port_unblock set devices online again that were
previously offlined?

If not, the problem of offlined devices is not solved by this. If zfcp
blocked requests when it detects a path error, a SCSI request might
already have been sent to the device, which might cause the device to
be offlined. Now if zfcp detects that the path is up again and calls
fc_remote_port_unblock, the device still cannot be used because it is
offline.

I'll have to check the code when I am back from vacation.


Regards,

Andreas

(PS: sorry for using this alternate email account)


Re: [PATCH 1/7] zfcp: introduce eh_timed_out handler

2005-09-04 Thread Andreas Herrmann

James,

I wish I had put patch 1/7 at the end of the patch series ;-(
If patch 1 is not applied, the other patches won't apply without
rejects.

I am on vacation for 1 week and will not be able to recreate the
patches before the 12th of September.

Do you see any problem with bringing the new features (patches 6/7 and
7/7) into 2.6.14 if I resend patches 2-7 that late?  (I remember there
is a deadline set by Linus for integration of new features into
2.6.14.)


Regards,

Andreas

(PS: sorry for using this alternate email account)


Re: [PATCH 1/7] zfcp: introduce eh_timed_out handler

2005-09-04 Thread Andreas Herrmann

On Sat, 2005-09-03 13:45:01, James Bottomley wrote:
  > But that's not what the patch does.  It short circuits the error
  > handler globally, not just in the cable pulled case.

  > For any error induced timeout, you're going to follow this logic. In
  > particular, if the device itself actually has an issue and genuinely
  > needs to be reset, that's never going to happen.

Ok, I agree. It is short-sighted to introduce the patch. I was
totally focused on a multipath setup and the cable pull case.

Now there is still the question of how to prevent the SCSI stack from
taking SCSI devices offline if dm-multipath is used.

The goal should be to re-enable paths when they come up again.  But
this only works if the SCSI device is online. This is required, for
instance, by multipathd to successfully check the paths (e.g. using
the TUR checker).

To "short circuit the error handler globally" is wrong. So how about
changing error handling while running
scsi_unjam_host/scsi_eh_ready_devs?  The problem that I observed is
that the timed-out scsi command is kept in work_q and not moved to
done_q before scsi_eh_offline_sdevs is called.  How about moving all
scsi commands to done_q if blk_noretry_request(scmd->request) is true
before scsi_eh_offline_sdevs is called, e.g. changing
scsi_eh_ready_devs to something like:

if (!scsi_device_online(...))
    if (!scsi_eh_bus_device_reset(...))
        if (!scsi_eh_bus_reset(...))
            if (!scsi_eh_host_reset(...))
                if (!scsi_eh_move_blk_noretry_requests(...))
                    scsi_eh_offline_sdevs(...);

or as an alternative perform the move from work_q to done_q in one
(which?) of the reset functions.
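
To make the intended effect concrete, here is a small user-space model
of that chain. It is a sketch only: it does not use any kernel
interface, all names are hypothetical, and the reset steps are reduced
to a stand-in that recovers nothing (as during a long cable pull).
Convention in this model: a step returns true when nothing is left in
work_q.

```c
#include <stdbool.h>
#include <stddef.h>

struct cmd_model {
	bool noretry;	/* models blk_noretry_request() being true */
	bool done;	/* true once the command moved from work_q to done_q */
};

struct eh_model {
	struct cmd_model *cmds;
	size_t n;
	bool offlined;	/* set when the device would be taken offline */
};

/* Stand-in for a reset step that recovers nothing. */
static bool reset_recovers_nothing(struct eh_model *eh)
{
	(void)eh;
	return false;
}

/* Proposed extra step: complete all no-retry commands. */
static bool move_noretry_requests(struct eh_model *eh)
{
	size_t i, left = 0;

	for (i = 0; i < eh->n; i++) {
		if (eh->cmds[i].noretry)
			eh->cmds[i].done = true;
		if (!eh->cmds[i].done)
			left++;
	}
	return left == 0;
}

static void ready_devs(struct eh_model *eh)
{
	if (!reset_recovers_nothing(eh))		/* device reset */
		if (!reset_recovers_nothing(eh))	/* bus reset */
			if (!reset_recovers_nothing(eh))	/* host reset */
				if (!move_noretry_requests(eh))
					eh->offlined = true;
}
```

In this model the device is offlined only if a command without the
no-retry property is still left after all steps; commands with that
property are handed back so that dm-multipath can decide what to do.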

  > Is this really what you want to do?

No, I don't.


Regards,

Andreas

PS: sorry for using this alternate email account



Re: [PATCH 1/7] zfcp: introduce eh_timed_out handler

2005-09-03 Thread Andreas Herrmann
On 03.09.2005 14:56 Arjan van de Ven <[EMAIL PROTECTED]> wrote:

  > > zfcp: introduce eh_timed_out handler
  > > 
  > > This handler is required to avoid offlined SCSI devices in a
  > > multipath setup if scsi commands time out on cable pulls lasting
  > > longer than 30 seconds.

  > hmm why is this needed? doesn't the fc transport class do this for you
  > already? If not.. I think it should since it's not unique to your FC
  > driver but common to all..

Sorry, I am not sure whether fc transport does this already. The
problem may be that zfcp does not fully support the fc transport class
at the moment ... ;-(

The problem was that SCSI devices were offlined during error injection
tests (long cable pulls) even though multipath was used. This was due
to error escalation in scsi_error.c (see function scsi_eh_ready_devs)
when commands timed out.

By providing an eh_timed_out handler the timed out requests are
returned as fast as possible to the upper layers. zfcp returns
EH_HANDLED for timed-out commands and scsi_times_out calls scsi_done:

    if (scmd->device->host->hostt->eh_timed_out)
        switch (scmd->device->host->hostt->eh_timed_out(scmd)) {
        case EH_HANDLED:
            __scsi_done(scmd);
            return;

Then dm-multipath has to decide what to do with the request.

With this patch and a proper multipath configuration
(e.g. queue_if_no_path option) Linux survived hard stress tests
with different kinds of error injection (firmware updates, cable pulls
etc.) without I/O errors at user level.

In any case this patch will complete zfcp's set of eh_handlers.


Regards,

Andreas



[PATCH 7/7] zfcp: provide support for NPIV

2005-09-03 Thread Andreas Herrmann
From: Maxim Shchetynin <[EMAIL PROTECTED]>

zfcp: provide support for NPIV

N_Port ID Virtualization (NPIV) allows a single FCP port to appear as
multiple, distinct ports providing separate port identification. NPIV
is supported by FC HBAs on System z9. zfcp was adapted to support this
new feature.

Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>

 zfcp_dbf.c   |   11 +
 zfcp_def.h   |   13 +-
 zfcp_erp.c   |   95 +-
 zfcp_ext.h   |3 
 zfcp_fsf.c   |  330 +--
 zfcp_fsf.h   |   51 ++-
 zfcp_sysfs_adapter.c |4 
 7 files changed, 367 insertions(+), 140 deletions(-)

diff -Nup linux-2.6.13/drivers/s390/scsi-orig/zfcp_dbf.c linux-2.6.13/drivers/s390/scsi/zfcp_dbf.c
--- linux-2.6.13/drivers/s390/scsi-orig/zfcp_dbf.c  2005-09-03 12:23:43.0 +0200
+++ linux-2.6.13/drivers/s390/scsi/zfcp_dbf.c   2005-09-03 12:32:48.0 +0200
@@ -279,9 +279,18 @@ zfcp_hba_dbf_event_fsf_unsol(const char 
break;
 
case FSF_STATUS_READ_LINK_DOWN:
-   rec->type.status.payload_size = sizeof(u64);
+   switch (status_buffer->status_subtype) {
+   case FSF_STATUS_READ_SUB_NO_PHYSICAL_LINK:
+   case FSF_STATUS_READ_SUB_FDISC_FAILED:
+   rec->type.status.payload_size =
+   sizeof(struct fsf_link_down_info);
+   }
break;
 
+   case FSF_STATUS_READ_FEATURE_UPDATE_ALERT:
+   rec->type.status.payload_size =
+   ZFCP_DBF_UNSOL_PAYLOAD_FEATURE_UPDATE_ALERT;
+   break;
}
memcpy(&rec->type.status.payload,
   &status_buffer->payload, rec->type.status.payload_size);
diff -Nup linux-2.6.13/drivers/s390/scsi-orig/zfcp_def.h linux-2.6.13/drivers/s390/scsi/zfcp_def.h
--- linux-2.6.13/drivers/s390/scsi-orig/zfcp_def.h  2005-09-03 12:23:43.0 +0200
+++ linux-2.6.13/drivers/s390/scsi/zfcp_def.h   2005-09-03 12:32:48.0 +0200
@@ -66,7 +66,7 @@
 /* GENERAL DEFINES */
 
 /* zfcp version number, it consists of major, minor, and patch-level number */
-#define ZFCP_VERSION   "4.4.0"
+#define ZFCP_VERSION   "4.5.0"
 
 /**
  * zfcp_sg_to_address - determine kernel address from struct scatterlist
@@ -154,6 +154,11 @@ typedef u32 scsi_lun_t;
 #define ZFCP_EXCHANGE_CONFIG_DATA_FIRST_SLEEP  100
 #define ZFCP_EXCHANGE_CONFIG_DATA_RETRIES  7
 
+/* Retry 5 times every 2 second, then every minute */
+#define ZFCP_EXCHANGE_PORT_DATA_SHORT_RETRIES  5
+#define ZFCP_EXCHANGE_PORT_DATA_SHORT_SLEEP200
+#define ZFCP_EXCHANGE_PORT_DATA_LONG_SLEEP 6000
+
 /* timeout value for "default timer" for fsf requests */
 #define ZFCP_FSF_REQUEST_TIMEOUT (60*HZ);
 
@@ -638,6 +643,7 @@ do { \
 #define ZFCP_STATUS_ADAPTER_ERP_THREAD_KILL0x0080
 #define ZFCP_STATUS_ADAPTER_ERP_PENDING0x0100
 #define ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED 0x0200
+#define ZFCP_STATUS_ADAPTER_XPORT_OK   0x0800
 
 #define ZFCP_STATUS_ADAPTER_SCSI_UP\
(ZFCP_STATUS_COMMON_UNBLOCKED | \
@@ -915,13 +921,16 @@ struct zfcp_adapter {
wwn_t   peer_wwnn; /* P2P peer WWNN */
wwn_t   peer_wwpn; /* P2P peer WWPN */
fc_id_t peer_d_id; /* P2P peer D_ID */
+   wwn_t   physical_wwpn; /* WWPN of physical port */
+   fc_id_t physical_s_id; /* local FC port ID */
struct ccw_device   *ccw_device;   /* S/390 ccw device */
u8  fc_service_class;
u32 fc_topology;   /* FC topology */
u32 fc_link_speed; /* FC interface speed */
u32 hydra_version; /* Hydra version */
u32 fsf_lic_version;
-   u32 supported_features;/* of FCP channel */
+   u32 adapter_features;  /* FCP channel features */
+   u32 connection_features; /* host connection features */
    u32 hardware_version;  /* of FCP channel */
    u8  serial_number[32]; /* of hardware */
struct Scsi_Host*scsi_host;/* Pointer to mid-layer */
diff -Nup linux-2.6.13/drivers/s390/scsi-orig/zfcp_erp.c linux-2.6.13/drivers/s390/scsi/zfcp_erp.c
--- linux-2.6.13/drivers/s390/scsi-orig/zfcp_erp.c  2005-09-03 12:20:07.0 +0200
+++ linux-2.6.13/drivers/s390/scsi/zfcp_erp.c   2005-09-03 12:32:48.

[PATCH 5/7] zfcp: shorten eh_bus_reset and eh_host_reset handlers

2005-09-03 Thread Andreas Herrmann
zfcp: shorten eh_bus_reset and eh_host_reset handlers

Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>

diff -Nup linux-2.6.13/drivers/s390/scsi-orig/zfcp_scsi.c linux-2.6.13/drivers/s390/scsi/zfcp_scsi.c
--- linux-2.6.13/drivers/s390/scsi-orig/zfcp_scsi.c 2005-09-03 12:21:58.0 +0200
+++ linux-2.6.13/drivers/s390/scsi/zfcp_scsi.c  2005-09-03 12:22:13.0 +0200
@@ -643,50 +643,38 @@ zfcp_task_management_function(struct zfc
 	return retval;
 }
 
-/*
- * function:   zfcp_scsi_eh_bus_reset_handler
- *
- * purpose:
- *
- * returns:
+/**
+ * zfcp_scsi_eh_bus_reset_handler - reset bus (reopen adapter)
  */
 int
 zfcp_scsi_eh_bus_reset_handler(struct scsi_cmnd *scpnt)
 {
-	int retval = 0;
-	struct zfcp_unit *unit;
+	struct zfcp_unit *unit = (struct zfcp_unit*) scpnt->device->hostdata;
+	struct zfcp_adapter *adapter = unit->port->adapter;
 
-	unit = (struct zfcp_unit *) scpnt->device->hostdata;
 	ZFCP_LOG_NORMAL("bus reset because of problems with "
 			"unit 0x%016Lx\n", unit->fcp_lun);
-	zfcp_erp_adapter_reopen(unit->port->adapter, 0);
-	zfcp_erp_wait(unit->port->adapter);
-	retval = SUCCESS;
+	zfcp_erp_adapter_reopen(adapter, 0);
+	zfcp_erp_wait(adapter);
 
-	return retval;
+	return SUCCESS;
 }
 
-/*
- * function:   zfcp_scsi_eh_host_reset_handler
- *
- * purpose:
- *
- * returns:
+/**
+ * zfcp_scsi_eh_host_reset_handler - reset host (reopen adapter)
  */
 int
 zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
 {
-	int retval = 0;
-	struct zfcp_unit *unit;
+	struct zfcp_unit *unit = (struct zfcp_unit*) scpnt->device->hostdata;
+	struct zfcp_adapter *adapter = unit->port->adapter;
 
-	unit = (struct zfcp_unit *) scpnt->device->hostdata;
 	ZFCP_LOG_NORMAL("host reset because of problems with "
 			"unit 0x%016Lx\n", unit->fcp_lun);
-	zfcp_erp_adapter_reopen(unit->port->adapter, 0);
-	zfcp_erp_wait(unit->port->adapter);
-	retval = SUCCESS;
+	zfcp_erp_adapter_reopen(adapter, 0);
+	zfcp_erp_wait(adapter);
 
-	return retval;
+	return SUCCESS;
 }
 
 /*


[PATCH 4/7] zfcp: remove function zfcp_fsf_req_wait_and_cleanup

2005-09-03 Thread Andreas Herrmann
zfcp: remove function zfcp_fsf_req_wait_and_cleanup

Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>

 zfcp_ext.h  |1 -
 zfcp_fsf.c  |   46 --
 zfcp_scsi.c |   21 +
 3 files changed, 9 insertions(+), 59 deletions(-)

diff -Nup linux-2.6.13/drivers/s390/scsi-orig/zfcp_ext.h linux-2.6.13/drivers/s390/scsi/zfcp_ext.h
--- linux-2.6.13/drivers/s390/scsi-orig/zfcp_ext.h  2005-09-03 12:17:16.0 +0200
+++ linux-2.6.13/drivers/s390/scsi/zfcp_ext.h   2005-09-03 12:21:24.0 +0200
@@ -109,7 +109,6 @@ extern int zfcp_fsf_req_create(struct zf
 extern int zfcp_fsf_send_ct(struct zfcp_send_ct *, mempool_t *,
struct zfcp_erp_action *);
 extern int zfcp_fsf_send_els(struct zfcp_send_els *);
-extern int  zfcp_fsf_req_wait_and_cleanup(struct zfcp_fsf_req *, int, u32 *);
 extern int  zfcp_fsf_send_fcp_command_task(struct zfcp_adapter *,
   struct zfcp_unit *,
   struct scsi_cmnd *,
diff -Nup linux-2.6.13/drivers/s390/scsi-orig/zfcp_fsf.c linux-2.6.13/drivers/s390/scsi/zfcp_fsf.c
--- linux-2.6.13/drivers/s390/scsi-orig/zfcp_fsf.c  2005-09-03 12:21:07.0 +0200
+++ linux-2.6.13/drivers/s390/scsi/zfcp_fsf.c   2005-09-03 12:21:24.0 +0200
@@ -4548,52 +4548,6 @@ skip_fsfstatus:
return retval;
 }
 
-
-/*
- * function:zfcp_fsf_req_wait_and_cleanup
- *
- * purpose:
- *
- * FIXME(design): signal seems to be <0 !!!
- * returns:0   - request completed (*status is valid), cleanup succ.
- * <0  - request completed (*status is valid), cleanup failed
- * >0  - signal which interrupted waiting (*status invalid),
- *   request not completed, no cleanup
- *
- * *status is a copy of status of completed fsf_req
- */
-int
-zfcp_fsf_req_wait_and_cleanup(struct zfcp_fsf_req *fsf_req,
- int interruptible, u32 * status)
-{
-   int retval = 0;
-   int signal = 0;
-
-   if (interruptible) {
-   __wait_event_interruptible(fsf_req->completion_wq,
-  fsf_req->status &
-  ZFCP_STATUS_FSFREQ_COMPLETED,
-  signal);
-   if (signal) {
-   ZFCP_LOG_DEBUG("Caught signal %i while waiting for the "
-  "completion of the request at %p\n",
-  signal, fsf_req);
-   retval = signal;
-   goto out;
-   }
-   } else {
-   __wait_event(fsf_req->completion_wq,
-fsf_req->status & ZFCP_STATUS_FSFREQ_COMPLETED);
-   }
-
-   *status = fsf_req->status;
-
-   /* cleanup request */
-   zfcp_fsf_req_free(fsf_req);
- out:
-   return retval;
-}
-
 static inline int
 zfcp_fsf_req_sbal_check(unsigned long *flags,
struct zfcp_qdio_queue *queue, int needed)
diff -Nup linux-2.6.13/drivers/s390/scsi-orig/zfcp_scsi.c linux-2.6.13/drivers/s390/scsi/zfcp_scsi.c
--- linux-2.6.13/drivers/s390/scsi-orig/zfcp_scsi.c 2005-09-03 12:21:07.0 +0200
+++ linux-2.6.13/drivers/s390/scsi/zfcp_scsi.c  2005-09-03 12:21:24.0 +0200
@@ -614,9 +614,8 @@ static int
 zfcp_task_management_function(struct zfcp_unit *unit, u8 tm_flags)
 {
 	struct zfcp_adapter *adapter = unit->port->adapter;
-	int retval;
-	int status;
 	struct zfcp_fsf_req *fsf_req;
+	int retval = 0;
 
 	/* issue task management function */
 	fsf_req = zfcp_fsf_send_fcp_command_task_management
@@ -630,18 +629,16 @@ zfcp_task_management_function(struct zfc
 		goto out;
 	}
 
-	retval = zfcp_fsf_req_wait_and_cleanup(fsf_req,
-					       ZFCP_UNINTERRUPTIBLE, &status);
-	/*
-	 * check completion status of task management function
-	 * (status should always be valid since no signals permitted)
-	 */
-	if (status & ZFCP_STATUS_FSFREQ_TMFUNCFAILED)
+	__wait_event(fsf_req->completion_wq,
+		     fsf_req->status & ZFCP_STATUS_FSFREQ_COMPLETED);
+
+	/* check completion status of task management function */
+	if (fsf_req->status & ZFCP_STATUS_FSFREQ_TMFUNCFAILED)
 		retval = -EIO;
-	else if (status & ZFCP_STATUS_FSFREQ_TMFUNCNOTSUPP)
+	else if (fsf_req->status & ZFCP_STATUS_FSFREQ_TMFUNCNOTSUPP)
 		retval = -ENOTSUPP;
-	else
-		retval = 0;
+
+	zfcp_fsf_req_free(fsf_req);
  out:
 	return retval;
 }


[PATCH 3/7] zfcp: remove union zfcp_req_data, use unit refcount for FCP commands

2005-09-03 Thread Andreas Herrmann
zfcp: remove union zfcp_req_data, use unit refcount for FCP commands

o union zfcp_req_data removed
o increment unit refcount when processing FCP commands
 (This fixes a theoretical race: When all scsi commands of a unit
  are aborted and the scsi_device is removed then the unit could be
  removed before all fsf_requests of that unit are completely processed.)

Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>

 zfcp_aux.c  |8 +---
 zfcp_def.h  |   69 +---
 zfcp_fsf.c  |  101 +---
 zfcp_scsi.c |   69 ++--
 4 files changed, 72 insertions(+), 175 deletions(-)

diff -Nup linux-2.6.13/drivers/s390/scsi-orig/zfcp_aux.c linux-2.6.13/drivers/s390/scsi/zfcp_aux.c
--- linux-2.6.13/drivers/s390/scsi-orig/zfcp_aux.c  2005-09-03 12:17:16.0 +0200
+++ linux-2.6.13/drivers/s390/scsi/zfcp_aux.c   2005-09-03 12:20:22.0 +0200
@@ -141,7 +141,7 @@ zfcp_cmd_dbf_event_fsf(const char *text,
 
spin_lock_irqsave(&adapter->dbf_lock, flags);
if (zfcp_fsf_req_is_scsi_cmnd(fsf_req)) {
-   scsi_cmnd = fsf_req->data.send_fcp_command_task.scsi_cmnd;
+   scsi_cmnd = (struct scsi_cmnd*) fsf_req->data;
debug_text_event(adapter->cmd_dbf, level, "fsferror");
debug_text_event(adapter->cmd_dbf, level, text);
debug_event(adapter->cmd_dbf, level, &fsf_req,
@@ -167,14 +167,12 @@ void
 zfcp_cmd_dbf_event_scsi(const char *text, struct scsi_cmnd *scsi_cmnd)
 {
struct zfcp_adapter *adapter;
-   union zfcp_req_data *req_data;
struct zfcp_fsf_req *fsf_req;
int level = ((host_byte(scsi_cmnd->result) != 0) ? 1 : 5);
unsigned long flags;
 
adapter = (struct zfcp_adapter *) scsi_cmnd->device->host->hostdata[0];
-   req_data = (union zfcp_req_data *) scsi_cmnd->host_scribble;
-   fsf_req = (req_data ? req_data->send_fcp_command_task.fsf_req : NULL);
+   fsf_req = (struct zfcp_fsf_req  *) scsi_cmnd->host_scribble;
spin_lock_irqsave(&adapter->dbf_lock, flags);
debug_text_event(adapter->cmd_dbf, level, "hostbyte");
debug_text_event(adapter->cmd_dbf, level, text);
@@ -1609,7 +1607,7 @@ zfcp_fsf_incoming_els(struct zfcp_fsf_re
u32 els_type;
struct zfcp_adapter *adapter;
 
-   status_buffer = fsf_req->data.status_read.buffer;
+   status_buffer = (struct fsf_status_read_buffer *) fsf_req->data;
els_type = *(u32 *) (status_buffer->payload);
adapter = fsf_req->adapter;
 
diff -Nup linux-2.6.13/drivers/s390/scsi-orig/zfcp_def.h linux-2.6.13/drivers/s390/scsi/zfcp_def.h
--- linux-2.6.13/drivers/s390/scsi-orig/zfcp_def.h  2005-09-03 12:17:16.0 +0200
+++ linux-2.6.13/drivers/s390/scsi/zfcp_def.h   2005-09-03 12:20:22.0 +0200
@@ -635,45 +635,6 @@ struct zfcp_adapter_mempool {
mempool_t *data_gid_pn;
 };
 
-struct  zfcp_exchange_config_data{
-};
-
-struct zfcp_open_port {
-struct zfcp_port *port;
-};
-
-struct zfcp_close_port {
-   struct zfcp_port *port;
-};
-
-struct zfcp_open_unit {
-   struct zfcp_unit *unit;
-};
-
-struct zfcp_close_unit {
-   struct zfcp_unit *unit;
-};
-
-struct zfcp_close_physical_port {
-struct zfcp_port *port;
-};
-
-struct zfcp_send_fcp_command_task {
-   struct zfcp_fsf_req *fsf_req;
-   struct zfcp_unit *unit;
-   struct scsi_cmnd *scsi_cmnd;
-   unsigned long start_jiffies;
-};
-
-struct zfcp_send_fcp_command_task_management {
-   struct zfcp_unit *unit;
-};
-
-struct zfcp_abort_fcp_command {
-   struct zfcp_fsf_req *fsf_req;
-   struct zfcp_unit *unit;
-};
-
 /*
  * header for CT_IU
  */
@@ -781,33 +742,6 @@ struct zfcp_send_els {
int status;
 };
 
-struct zfcp_status_read {
-   struct fsf_status_read_buffer *buffer;
-};
-
-struct zfcp_fsf_done {
-   struct completion *complete;
-   int status;
-};
-
-/* request specific data */
-union zfcp_req_data {
-   struct zfcp_exchange_config_data exchange_config_data;
-   struct zfcp_open_port open_port;
-   struct zfcp_close_portclose_port;
-   struct zfcp_open_unit open_unit;
-   struct zfcp_close_unitclose_unit;
-   struct zfcp_close_physical_port   close_physical_port;
-   struct zfcp_send_fcp_command_task send_fcp_command_task;
-struct zfcp_send_fcp_command_task_management
- send_fcp_command_task_management;
-   struct zfcp_abort_fcp_command abort_fcp_command;
-   struct zfcp_send_ct *send_ct;
-   struct zfcp_send_els *send_els;
-   struct zfcp_status_read   status_read;
-   struct fsf_qtcb_bottom_port *port_data;
-};
-
 struct zfcp_qdio_queue {
struct qdio_buffer *buffer[Q

[PATCH 2/7] zfcp: fix race conditions when accessing erp_action lists

2005-09-03 Thread Andreas Herrmann
zfcp: fix race conditions when accessing erp_action lists

o always use locking when changing erp_action lists,
o avoid escalation to ERP_ACTION_REOPEN_PORT_FORCED if erp_action is
  still in use for ERP_ACTION_REOPEN_PORT

Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>

diff -Nup linux-2.6.13/drivers/s390/scsi-orig/zfcp_erp.c linux-2.6.13/drivers/s390/scsi/zfcp_erp.c
--- linux-2.6.13/drivers/s390/scsi-orig/zfcp_erp.c  2005-09-03 12:17:16.0 +0200
+++ linux-2.6.13/drivers/s390/scsi/zfcp_erp.c   2005-09-03 12:19:22.0 +0200
@@ -886,7 +886,7 @@ static int
 zfcp_erp_strategy_check_fsfreq(struct zfcp_erp_action *erp_action)
 {
int retval = 0;
-   struct zfcp_fsf_req *fsf_req;
+   struct zfcp_fsf_req *fsf_req = NULL;
struct zfcp_adapter *adapter = erp_action->adapter;
 
if (erp_action->fsf_req) {
@@ -896,7 +896,7 @@ zfcp_erp_strategy_check_fsfreq(struct zf
list_for_each_entry(fsf_req, &adapter->fsf_req_list_head, list)
if (fsf_req == erp_action->fsf_req)
break;
-   if (fsf_req == erp_action->fsf_req) {
+   if (fsf_req && (fsf_req->erp_action == erp_action)) {
/* fsf_req still exists */
debug_text_event(adapter->erp_dbf, 3, "a_ca_req");
debug_event(adapter->erp_dbf, 3, &fsf_req,
@@ -2291,7 +2291,9 @@ zfcp_erp_adapter_strategy_open_fsf_xconf
atomic_clear_mask(ZFCP_STATUS_ADAPTER_HOST_CON_INIT,
  &adapter->status);
ZFCP_LOG_DEBUG("Doing exchange config data\n");
+   write_lock(&adapter->erp_lock);
zfcp_erp_action_to_running(erp_action);
+   write_unlock(&adapter->erp_lock);
zfcp_erp_timeout_init(erp_action);
if (zfcp_fsf_exchange_config_data(erp_action)) {
retval = ZFCP_ERP_FAILED;
@@ -3194,11 +3196,19 @@ zfcp_erp_action_enqueue(int action,
/* fall through !!! */
 
case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED:
-   if (atomic_test_mask
-   (ZFCP_STATUS_COMMON_ERP_INUSE, &port->status)
-   && port->erp_action.action ==
-   ZFCP_ERP_ACTION_REOPEN_PORT_FORCED) {
-   debug_text_event(adapter->erp_dbf, 4, "pf_actenq_drp");
+   if (atomic_test_mask(ZFCP_STATUS_COMMON_ERP_INUSE,
+&port->status)) {
+   if (port->erp_action.action !=
+   ZFCP_ERP_ACTION_REOPEN_PORT_FORCED) {
+   ZFCP_LOG_INFO("dropped erp action %i (port "
+ "0x%016Lx, action in use: %i)\n",
+ action, port->wwpn,
+ port->erp_action.action);
+   debug_text_event(adapter->erp_dbf, 4,
+"pf_actenq_drp");
+   } else 
+   debug_text_event(adapter->erp_dbf, 4,
+"pf_actenq_drpcp");
debug_event(adapter->erp_dbf, 4, &port->wwpn,
sizeof (wwn_t));
goto out;


[PATCH 1/7] zfcp: introduce eh_timed_out handler

2005-09-03 Thread Andreas Herrmann
zfcp: introduce eh_timed_out handler

This handler is required to avoid offlined SCSI devices in a multipath
setup if scsi commands time out on cable pulls lasting longer than 30
seconds.

Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>

diff -Nup linux-2.6.13/drivers/s390/scsi-orig/zfcp_scsi.c linux-2.6.13/drivers/s390/scsi/zfcp_scsi.c
--- linux-2.6.13/drivers/s390/scsi-orig/zfcp_scsi.c 2005-09-03 12:17:16.0 +0200
+++ linux-2.6.13/drivers/s390/scsi/zfcp_scsi.c  2005-09-03 12:17:53.0 +0200
@@ -44,6 +44,7 @@ static int zfcp_scsi_eh_abort_handler(st
 static int zfcp_scsi_eh_device_reset_handler(struct scsi_cmnd *);
 static int zfcp_scsi_eh_bus_reset_handler(struct scsi_cmnd *);
 static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *);
+static enum scsi_eh_timer_return zfcp_scsi_eh_timed_out(struct scsi_cmnd *);
 static int zfcp_task_management_function(struct zfcp_unit *, u8);
 
 static struct zfcp_unit *zfcp_unit_lookup(struct zfcp_adapter *, int, scsi_id_t,
@@ -69,6 +70,7 @@ struct zfcp_data zfcp_data = {
  eh_device_reset_handler: zfcp_scsi_eh_device_reset_handler,
  eh_bus_reset_handler:zfcp_scsi_eh_bus_reset_handler,
  eh_host_reset_handler:   zfcp_scsi_eh_host_reset_handler,
+ eh_timed_out:zfcp_scsi_eh_timed_out,
   /* FIXME(openfcp): Tune */
  can_queue:   4096,
  this_id: 0,
@@ -242,7 +244,6 @@ static void
 zfcp_scsi_command_fail(struct scsi_cmnd *scpnt, int result)
 {
set_host_byte(&scpnt->result, result);
-   zfcp_cmd_dbf_event_scsi("failing", scpnt);
/* return directly */
scpnt->scsi_done(scpnt);
 }
@@ -414,59 +415,18 @@ zfcp_port_lookup(struct zfcp_adapter *ad
return (struct zfcp_port *) NULL;
 }
 
-/*
- * function:   zfcp_scsi_eh_abort_handler
- *
- * purpose:tries to abort the specified (timed out) SCSI command
- *
- * note:   We do not need to care for a SCSI command which completes
- * normally but late during this abort routine runs.
- * We are allowed to return late commands to the SCSI stack.
- * It tracks the state of commands and will handle late commands.
- * (Usually, the normal completion of late commands is ignored with
- * respect to the running abort operation. Grep for 'done_late'
- * in the SCSI stacks sources.)
- *
- * returns:SUCCESS - command has been aborted and cleaned up in internal
- *   bookkeeping,
- *   SCSI stack won't be called for aborted command
- * FAILED  - otherwise
- */
 int
-__zfcp_scsi_eh_abort_handler(struct scsi_cmnd *scpnt)
+zfcp_scsi_abort_async(struct scsi_cmnd *scpnt,
+ struct zfcp_fsf_req **fsf_req_ptr)
 {
-   int retval = SUCCESS;
-   struct zfcp_fsf_req *new_fsf_req, *old_fsf_req;
-   struct zfcp_adapter *adapter = (struct zfcp_adapter *) scpnt->device->host->hostdata[0];
+   struct Scsi_Host *host = scpnt->device->host;
+   struct zfcp_adapter *adapter = (struct zfcp_adapter *) host->hostdata[0];
struct zfcp_unit *unit = (struct zfcp_unit *) scpnt->device->hostdata;
-   struct zfcp_port *port = unit->port;
-   struct Scsi_Host *scsi_host = scpnt->device->host;
union zfcp_req_data *req_data = NULL;
+   struct zfcp_fsf_req *new_fsf_req;
+   struct zfcp_fsf_req *old_fsf_req;
+   int req_flags;
unsigned long flags;
-   u32 status = 0;
-
-   /* the components of a abort_dbf record (fixed size record) */
-   u64 dbf_scsi_cmnd = (unsigned long) scpnt;
-   char dbf_opcode[ZFCP_ABORT_DBF_LENGTH];
-   wwn_t dbf_wwn = port->wwpn;
-   fcp_lun_t dbf_fcp_lun = unit->fcp_lun;
-   u64 dbf_retries = scpnt->retries;
-   u64 dbf_allowed = scpnt->allowed;
-   u64 dbf_timeout = 0;
-   u64 dbf_fsf_req = 0;
-   u64 dbf_fsf_status = 0;
-   u64 dbf_fsf_qual[2] = { 0, 0 };
-   char dbf_result[ZFCP_ABORT_DBF_LENGTH] = "##undef";
-
-   memset(dbf_opcode, 0, ZFCP_ABORT_DBF_LENGTH);
-   memcpy(dbf_opcode,
-  scpnt->cmnd,
-  min(scpnt->cmd_len, (unsigned char) ZFCP_ABORT_DBF_LENGTH));
-
-   ZFCP_LOG_INFO("aborting scsi_cmnd=%p on adapter %s\n",
- scpnt, zfcp_get_busid_by_adapter(adapter));
-
-   spin_unlock_irq(scsi_host->host_lock);
 
/*
 * Race condition between normal (late) completion and abort has
@@ -494,31 +454,18 @@ __zfcp_scsi_eh_abort_handler(struct scsi
 * Do not initiate abort but return SUCCESS.
 */
write_unlock_irqrestore(&adapter->abort_lock, flags);
-   retval = SUCCESS;
-   strncpy(dbf_result, "##la

[PATCH 0/7] zfcp: update to driver version 4.5.0

2005-09-03 Thread Andreas Herrmann
Hi,

Following is a series of 7 patches (based on 2.6.13) to update the zfcp
device driver. The patches contain bugfixes, some cleanups and new
features. The overall diffstat of all patches is:

 Makefile |2 
 zfcp_aux.c   |  188 -
 zfcp_dbf.c   |  999 ++-
 zfcp_def.h   |  295 +--
 zfcp_erp.c   |  119 +-
 zfcp_ext.h   |   30 +
 zfcp_fsf.c   |  718 
 zfcp_fsf.h   |   54 ++
 zfcp_qdio.c  |   30 -
 zfcp_scsi.c  |  403 
 zfcp_sysfs_adapter.c |4 
 11 files changed, 1874 insertions(+), 968 deletions(-)

Please apply.


Regards,

Andreas


Re: correct deregistration from scsi_transport_fc

2005-08-29 Thread Andreas Herrmann
On 29.08.2005 13:03 Christoph Hellwig <[EMAIL PROTECTED]> wrote:

  > > Won't work for all zSeries FC host adapters because they are
  > > virtualized and you can have several virtual adapters using the same
  > > WWPN/WWNN.  Using LUN masking and zoning it is not possible to
  > > configure the SAN such that one virtual adapter sees just those LUNs
  > > that are supposed to be used by it. There is a tool to write an access
  > > control table to the adapter. This ACT specifies which virtual adapter
  > > can access which ports and FCP LUNs ...

  > That's totally broken.  most FC sans have zoning and access control,
  > but this is by no way a feature of the HBA.  Your feature is totally
  > broken, different from other FC setups and must go away.

Of course, in my opinion too, this is not the optimal way to virtualize
adapters. But that is how the hardware works, and I cannot change it.
zfcp can only exploit the hardware "as is". Therefore, for compatibility
reasons, zfcp needs the scsi_add_device interface.

  > > A REPORT LUNs scan from adapter X of port Y might report thousands of
  > > LUNs that I don't like to use with adapter X because they are for
  > > another virtual adapter.

  > So what?  That's the same as for every FC SAN, and it is necessary to
  > support proper management applications.

The main point is that we do not want every LUN reported by REPORT LUNS
to be configured for the virtual adapter.
I think Martin Peschke gave you further information about this.
BTW, are there any proper management applications for Linux (all
architectures, based on which interface)?

  > > That's the reason why I like to stick to
  > > manual configuration (triggered from zfcp) of scsi_devices. Hence zfcp
  > > has not enabled a proper lun scan when fc_remote_port_add is
  > > called. slave_alloc will fail for scsi_devices not added by zfcp
  > > itself.  (BTW, new FC cards that are already announced will provide a
  > > feature called NPort Id Virtualization. With this feature each virtual
  > > adapter will have its own WWPN. This will allow zfcp to use the lun
  > > scan during fc_remote_port_add.)
  > > 
  > > Do you mean that scsi_add_device is not supposed to work with
  > > fc_transport code anymore?

  > It works by accident, but I will veto any updates you're going to send
  > for this broken behaviour.

So scsi_add_device will soon be mentioned in
Documentation/feature-removal-schedule.txt?

What is the rationale of proscribing usage of scsi_add_device() when
scsi_transport_fc is used?

  > >   > fc_remove_host removes all rports for you.
  > > 
  > > Ok, works. But it still fails to remove the scsi target representation
  > > of that rport.

  > That's intentional.  See the discussion during development of the FC
  > transport class.  I don't like that behaviour but it's a compromise we
  > agreed on.

Where is the function to remove the scsi target representation of
an rport? You did not agree on having memory leaks in the kernel, did
you?


Regards,

Andreas


Re: correct deregistration from scsi_transport_fc

2005-08-29 Thread Andreas Herrmann
On 29.08.2005 10:48 Christoph Hellwig <[EMAIL PROTECTED]> wrote:

  > Never use scsi_add_device with the fc_transport code. fc_remote_port_add
  > will do a proper lun scan if the added rport is a scsi target.

Won't work for all zSeries FC host adapters because they are
virtualized and you can have several virtual adapters using the same
WWPN/WWNN.  Using LUN masking and zoning it is not possible to
configure the SAN such that one virtual adapter sees just those LUNs
that are supposed to be used by it. There is a tool to write an access
control table to the adapter. This ACT specifies which virtual adapter
can access which ports and FCP LUNs ... 

A REPORT LUNs scan from adapter X of port Y might report thousands of
LUNs that I don't like to use with adapter X because they are for
another virtual adapter. That's the reason why I like to stick to
manual configuration (triggered from zfcp) of scsi_devices. Hence zfcp
has not enabled a proper lun scan when fc_remote_port_add is
called. slave_alloc will fail for scsi_devices not added by zfcp
itself.  (BTW, new FC cards that are already announced will provide a
feature called NPort Id Virtualization. With this feature each virtual
adapter will have its own WWPN. This will allow zfcp to use the lun
scan during fc_remote_port_add.)

Do you mean that scsi_add_device is not supposed to work with
fc_transport code anymore?

Or is it merely a recommendation not to stick to scsi_add_device but to use
the automatic LUN scanning provided with fc_transport code?


  > fc_remove_host removes all rports for you.

Ok, works. But it still fails to remove the scsi target representation
of that rport.


Regards,

Andreas


correct deregistration from scsi_transport_fc

2005-08-29 Thread Andreas Herrmann
Hi,

What is the correct sequence to register and deregister host/rport/scsi_devices
at the scsi stack?

ZFCP's sequence is as follows:

scsi_add_host
fc_remote_port_add (if port successfully configured/opened in/by zfcp)
scsi_add_device (if unit successfully configured/opened in/by zfcp)

If a zfcp-adapter is set offline I do the following to get rid of the
entries within /sys/class/fc_*:

fc_remote_port_delete (for each rport registered for this adapter)
fc_remove_host
scsi_remove_host

When zfcp adapters are set offline, it happens that the entries target0:0:0
for the old host_id 0 are still present in /sys/class/fc_transport.
The symlink "device" points to a non-existent directory
../../../devices/css0/0.0.000f/0.0.50d3/host0/rport-0:0-0/target0:0:0
of the removed host.

I observed the same behaviour when trying to delete the SCSI device
using its delete attribute before removing the host.  Furthermore, I
put some printk calls into the release function for the SCSI device
(scsi_device_dev_release in scsi_scan.c), but this function is never
called -- neither when the delete attribute was used nor when the host
was completely removed.

Is there a problem with proper deregistration of kobjects for SCSI
devices in the SCSI subsystem if fc_transport is used? Or is this
apparent memory leak caused by incorrect usage of the scsi_transport_fc
interface?

Any thoughts about this problem?


Regards,

Andreas



Re: [PATCH] zfcp: add rports to enable scsi_add_device to work again

2005-08-29 Thread Andreas Herrmann
On 27.08.2005 19:38 James Bottomley <[EMAIL PROTECTED]> wrote:
 
  > > this patch fixes a severe problem with 2.6.13-rc7.
  > > 
  > > Due to recent SCSI changes it is not possible to add any
  > > LUNs to the zfcp device driver anymore. With registration
  > > of remote ports this is fixed.
  > > 
  > > Please integrate the patch in the 2.6.13 kernel or if it
  > > is already too late for this release then please integrate it
  > > in 2.6.13.1
  > > 
  > > Thanks a lot.

  > Well, OK, but your usage isn't quite optimal.  The fibre channel
  > transport class retains a list of ports per host, so your maintenance of
  > an identical list in zfcp_adapter duplicates this.

I know what you mean. It would be better to store all "private" zfcp
data at dd_data in fc_rport.  Unfortunately, that does not fit zfcp's
current behaviour.  The rport depends on a specific host_id. Even the
name of the rport in sys/class/fc_remote_port inherits this id. This
means that if the host is deregistered and registered again, the old rport
structure is useless because new host_ids are assigned. Or am I missing
something here?

The zfcp_port structure is meant to be persistent once configured
by the user, i.e. even if the host is deregistered and registered
again the port structure is kept.

I am not sure at the moment how this can be solved with the current
fc_transport. I think it would have been better to use the WWPN of an
rport as the name in sys/class/fc_remote_port. This would be a start
to keep rport structures independent of the host_id. (Why should the
transport depend on OS assigned ids anyway? The transport has already
unique identifiers like WWPNs.)
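
To illustrate the naming point, here is a small userspace sketch (not
kernel code). The rport-<host>:<channel>-<number> pattern is the one
visible in sysfs entries such as rport-0:0-0, and the 0x%016llx format is
the one zfcp uses for its own port bus_id. The helper names and the
sample WWPN in the test are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Name derived from OS-assigned ids: host number, channel, rport number. */
static void name_by_host_id(char *buf, size_t len,
			    int host_no, int channel, int number)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "rport-%d:%d-%d", host_no, channel, number);
}

/* Name derived from the port's WWPN, which the transport already knows. */
static void name_by_wwpn(char *buf, size_t len, unsigned long long wwpn)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "0x%016llx", wwpn);
}

/*
 * Return 1 if the name survives a host re-registration, i.e. it is the
 * same for the old and the new host number, and 0 otherwise.
 */
static int name_is_stable(int use_wwpn, int old_host_no, int new_host_no,
			  unsigned long long wwpn)
{
	char before[32], after[32];

	if (use_wwpn) {
		name_by_wwpn(before, sizeof(before), wwpn);
		name_by_wwpn(after, sizeof(after), wwpn);
	} else {
		name_by_host_id(before, sizeof(before), old_host_no, 0, 0);
		name_by_host_id(after, sizeof(after), new_host_no, 0, 0);
	}
	return strcmp(before, after) == 0;
}
```

A name built from the host number changes as soon as the host comes back
with a new number, while a WWPN-based name stays the same.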

  > However, we can put this in for now and worry about removing all of the
  > fc transport class duplication from zfcp later.

  > James

In any case attributes provided by an rport should be removed from the
zfcp_port structure. This is what I will do next.


Regards,

Andreas


[PATCH] zfcp: add rports to enable scsi_add_device to work again

2005-08-27 Thread Andreas Herrmann
Hi,

this patch fixes a severe problem with 2.6.13-rc7.

Due to recent SCSI changes it is not possible to add any
LUNs to the zfcp device driver anymore. With registration
of remote ports this is fixed.

Please integrate the patch in the 2.6.13 kernel or if it
is already too late for this release then please integrate it
in 2.6.13.1

Thanks a lot.


Regards,

Andreas


Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>

diffstat:
 zfcp_aux.c  |   29 +++--
 zfcp_ccw.c  |   10 ++
 zfcp_def.h  |2 +-
 zfcp_erp.c  |   25 ++---
 zfcp_ext.h  |2 ++
 zfcp_fsf.c  |1 +
 zfcp_scsi.c |   25 -
 7 files changed, 63 insertions(+), 31 deletions(-)

diff -urN linux-2.6.13-rcx/drivers/s390/scsi/zfcp_aux.c linux-2.6.13-zfcpfctc/drivers/s390/scsi/zfcp_aux.c
--- linux-2.6.13-rcx/drivers/s390/scsi/zfcp_aux.c   2005-08-25 10:53:15.0 +0200
+++ linux-2.6.13-zfcpfctc/drivers/s390/scsi/zfcp_aux.c  2005-08-27 13:05:17.0 +0200
@@ -1299,13 +1299,10 @@
 zfcp_port_enqueue(struct zfcp_adapter *adapter, wwn_t wwpn, u32 status,
  u32 d_id)
 {
-   struct zfcp_port *port, *tmp_port;
+   struct zfcp_port *port;
int check_wwpn;
-   scsi_id_t scsi_id;
-   int found;
 
check_wwpn = !(status & ZFCP_STATUS_PORT_NO_WWPN);
-
/*
 * check that there is no port with this WWPN already in list
 */
@@ -1368,7 +1365,7 @@
} else {
snprintf(port->sysfs_device.bus_id,
 BUS_ID_SIZE, "0x%016llx", wwpn);
-   port->sysfs_device.parent = &adapter->ccw_device->dev;
+   port->sysfs_device.parent = &adapter->ccw_device->dev;
}
port->sysfs_device.release = zfcp_sysfs_port_release;
dev_set_drvdata(&port->sysfs_device, port);
@@ -1388,24 +1385,8 @@
 
zfcp_port_get(port);
 
-   scsi_id = 1;
-   found = 0;
write_lock_irq(&zfcp_data.config_lock);
-   list_for_each_entry(tmp_port, &adapter->port_list_head, list) {
-   if (atomic_test_mask(ZFCP_STATUS_PORT_NO_SCSI_ID,
-&tmp_port->status))
-   continue;
-   if (tmp_port->scsi_id != scsi_id) {
-   found = 1;
-   break;
-   }
-   scsi_id++;
-   }
-   port->scsi_id = scsi_id;
-   if (found)
-   list_add_tail(&port->list, &tmp_port->list);
-   else
-   list_add_tail(&port->list, &adapter->port_list_head);
+   list_add_tail(&port->list, &adapter->port_list_head);
atomic_clear_mask(ZFCP_STATUS_COMMON_REMOVE, &port->status);
atomic_set_mask(ZFCP_STATUS_COMMON_RUNNING, &port->status);
if (d_id == ZFCP_DID_DIRECTORY_SERVICE)
@@ -1422,11 +1403,15 @@
 void
 zfcp_port_dequeue(struct zfcp_port *port)
 {
+   struct fc_port *rport;
+
zfcp_port_wait(port);
write_lock_irq(&zfcp_data.config_lock);
list_del(&port->list);
port->adapter->ports--;
write_unlock_irq(&zfcp_data.config_lock);
+   if (port->rport)
+   fc_remote_port_delete(port->rport);
zfcp_adapter_put(port->adapter);
zfcp_sysfs_port_remove_files(&port->sysfs_device,
 atomic_read(&port->status));
diff -urN linux-2.6.13-rcx/drivers/s390/scsi/zfcp_ccw.c linux-2.6.13-zfcpfctc/drivers/s390/scsi/zfcp_ccw.c
--- linux-2.6.13-rcx/drivers/s390/scsi/zfcp_ccw.c   2005-03-02 08:37:50.0 +0100
+++ linux-2.6.13-zfcpfctc/drivers/s390/scsi/zfcp_ccw.c  2005-08-27 13:28:35.0 +0200
@@ -202,9 +202,19 @@
 zfcp_ccw_set_offline(struct ccw_device *ccw_device)
 {
struct zfcp_adapter *adapter;
+   struct zfcp_port *port;
+   struct fc_port *rport;
 
down(&zfcp_data.config_sema);
adapter = dev_get_drvdata(&ccw_device->dev);
+   /* might be racy, but we cannot take config_lock due to the fact that
+  fc_remote_port_delete might sleep */
+   list_for_each_entry(port, &adapter->port_list_head, list)
+   if (port->rport) {
+   rport = port->rport;
+   port->rport = NULL;
+   fc_remote_port_delete(rport);
+   }
zfcp_erp_adapter_shutdown(adapter, 0);
zfcp_erp_wait(adapter);
zfcp_adapter_scsi_unregister(adapter);
diff -urN linux-2.6.13-rcx/drivers/s390/scsi/zfcp_def.h linux-2.6.13-zfcpfctc/drivers/s390/scsi/zfcp_def.h
--- linux-2.6.13-rcx/drivers/s390/scsi/zfcp_def.h   2005-08-25 10:53:15.0 +0200
+++ linux-2.6.13-zfcpfctc/drivers/s390/scsi/zfcp_def.h  2005-08-26 19:00:18.0 +0200
@@ -906,6 +906,7 @@

Re: [PATCH] remove name length check in a workqueue

2005-08-11 Thread Andreas Herrmann
Simon Derr <[EMAIL PROTECTED]> wrote:

  > It is sufficient to have a few HBAs and to insmod/rmmod the driver a few
  > times.

  > Since the host_no is chosen with a mere counter increment
  > in scsi_host_alloc():

  >   shost->host_no = scsi_host_next_hn++; /* XXX(hch): still racy */

  > Unused `host_no's are not reused and the 100 limit is reached even on
  > smaller systems.

  > I have no idea of why someone would do repeated insmod/rmmods, though.
  > (But someone did).

You don't even have to use insmod/rmmod.  On s390 (using zfcp) it
suffices to take adapters offline and online (triggered via VM,
hardware, or within Linux). Just do so about 100 times ... You
know the result.
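
The effect can be reproduced in a few lines of userspace C. This is a
minimal sketch, not kernel code: next_host_no, mock_host_alloc() and
WQ_NAME_LIMIT are hypothetical names, and both the "scsi_wq_%d" name
format and the 10-character limit are assumptions about the check that
this patch removes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define WQ_NAME_LIMIT 10	/* assumed maximum workqueue name length */

static int next_host_no;	/* mimics a counter that is never rewound */

/* Hand out the next host number; numbers of removed hosts are not reused. */
static int mock_host_alloc(void)
{
	return next_host_no++;
}

/* Return 1 if the (assumed) work queue name for host_no fits the limit. */
static int wq_name_fits(int host_no)
{
	char name[32];

	assert(host_no >= 0);
	snprintf(name, sizeof(name), "scsi_wq_%d", host_no);
	return strlen(name) <= (size_t) WQ_NAME_LIMIT;
}

/*
 * Take one adapter offline and online again `cycles` times and return the
 * first host number whose name no longer fits, or -1 if all names fit.
 */
static int first_bad_host_no(int cycles)
{
	int i;

	for (i = 0; i < cycles; i++) {
		int host_no = mock_host_alloc();	/* online: new host */

		if (!wq_name_fits(host_no))
			return host_no;
		/* offline: the host goes away, its number is not recycled */
	}
	return -1;
}
```

Under these assumptions a single adapter gets host numbers 0 to 99
without trouble; the next online cycle is handed number 100, and the
resulting name is one character too long.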


Regards,

Andreas


Re: [dm-devel] Re: fastfail operation and retries

2005-04-21 Thread Andreas Herrmann
Lars Marowsky-Bree <[EMAIL PROTECTED]>
21.04.2005 21:54
 
> On 2005-04-21T09:42:05, Patrick Mansfield <[EMAIL PROTECTED]> wrote:

> > On Tue, Apr 19, 2005 at 07:19:53PM +0200, Andreas Herrmann wrote:

  

> > 
> > We need a patch like Mike Christie had, this:
> > 
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=107961883914541&w=2
> > 
> > The scsi core should decode the sense data and pass up the result, then dm
> > need not decode sense data, and we don't need sense data passed around via
> > the block layer.

> The most recent udm patchset has a patch by Jens Axboe and myself to
> pass up sense data / error codes in the bio so the dm mpath module can
> deal with it. 

> Only issue still is that the SCSI midlayer does only generate a single
> "EIO" code also for timeouts; however, that pretty much means it's a
> transport error, because if it was a media error, we'd be getting sense
> data ;-)

Well, there are various situations when all paths to the ESS are
"temporarily unavailable". In some cases TASK_SET_FULL/BUSY is
reported as it should be. In other cases we just encounter data
underruns or exchange sequences are aborted and finally it might be
that requests just time out. BTW, it is not only ESS where I have seen
such (broken) behaviour.

> Together with the "queue_if_no_path" feature flag for dm-mpath that
> should do what you need to handle this (arguably broken) array
> behaviour: It'll queue until the error goes away and multipathd retests
> and reactivates the paths. That ought to work, but given that I don't
> have an IBM ESS accessible, please confirm that.

Sounds good. Will make some tests using the "queue_if_no_path" feature.

> It is possible that to fully support them a dm mpath hardware handler
> (like for the EMC CX family) might be required, too.

For the time being I hope the "queue_if_no_path" feature is sufficient
to successfully pass our tests ;-)

> (For easier testing, you'll find that all this functionality is
> available in the latest SLES9 SP2 betas, to which you ought to have
> access at IBM, and the kernels are also available via
> ftp://ftp.suse.com/pub/projects/kernel/kotd/.)

> > scsi core could be changed to handle device specific decoding via sense
> > tables that can be modified via sysfs, similar to devinfo code (well,
> > devinfo still lacks a sysfs interface).

> dm-path's capabilities go a bit beyond just the error decoding (which
> for generic devices is also provided for in a generic
> dm_scsi_err_handler()); for example you can code special initialization
> commands and behaviour an array might need.

> Maybe this could indeed be abstracted further to download the command
> and/or specific decoding tables from user-space via sysfs or configfs by
> a generic user-space customizable dm-hw-handler-generic.[ch] plugin; I
> think patches are being accepted ;-)

Thanks for the information.


Regards,

Andreas



[PATCH] zfcp: fix compile error

2005-04-21 Thread Andreas Herrmann
Jan Dittmer <[EMAIL PROTECTED]> wrote:
21.04.2005 10:49

> > [EMAIL PROTECTED]:
> > [PATCH] zfcp: convert to compat_ioctl

> This does not seem to compile anymore with defconfig:

>   CC  drivers/s390/scsi/zfcp_aux.o
> /usr/src/ctest/rc/kernel/drivers/s390/scsi/zfcp_aux.c:63: warning: initialization from incompatible pointer type
> /usr/src/ctest/rc/kernel/drivers/s390/scsi/zfcp_aux.c:366: error: conflicting types for `zfcp_cfdc_dev_ioctl'


Oops. The submitted patch was incorrect.
The attached patch (against 2.6.12-rc3) fixes the problem.
Sorry for any inconvenience.


Regards,

Andreas


zfcp: fix compile error

Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>


diff -bBrauN linux-2.6.x/drivers/s390/scsi-orig/zfcp_aux.c linux-2.6.x/drivers/s390/scsi/zfcp_aux.c
--- linux-2.6.x/drivers/s390/scsi-orig/zfcp_aux.c   2005-04-21 12:36:44.0 +0200
+++ linux-2.6.x/drivers/s390/scsi/zfcp_aux.c    2005-04-21 12:40:48.0 +0200
@@ -52,7 +52,7 @@
 static inline int zfcp_sg_list_copy_to_user(void __user *,
struct zfcp_sg_list *, size_t);
 
-static int zfcp_cfdc_dev_ioctl(struct file *, unsigned int, unsigned long);
+static long zfcp_cfdc_dev_ioctl(struct file *, unsigned int, unsigned long);
 
 #define ZFCP_CFDC_IOC_MAGIC 0xDD
 #define ZFCP_CFDC_IOC \
diff -bBrauN linux-2.6.x/drivers/s390/scsi-orig/zfcp_def.h linux-2.6.x/drivers/s390/scsi/zfcp_def.h
--- linux-2.6.x/drivers/s390/scsi-orig/zfcp_def.h   2005-04-21 12:36:44.0 +0200
+++ linux-2.6.x/drivers/s390/scsi/zfcp_def.h    2005-04-21 12:41:56.0 +0200
@@ -61,7 +61,6 @@
 #include 
 #include 
 #include 
-#include 
 
 / DEBUG FLAGS 
*/
 



Re: fastfail operation and retries

2005-04-20 Thread Andreas Herrmann
??? <[EMAIL PROTECTED]> wrote:
20.04.2005 03:17
 
> what multipath are you using? Software, or hardware,
> or both?

We are using udm with evms (Linux on zSeries).
Hardware setup is:
- switched fabric FC-SAN,
- 4 paths to each FC-LUN on the ESS 800

All 4 paths are "failing fast" during operations on
the ESS, and our stress test tool encounters I/O errors.


Regards,

Andreas



fastfail operation and retries

2005-04-19 Thread Andreas Herrmann
Hi,

I have some questions regarding the fastfail operation of the SCSI stack.

Performing multipath-tests with an IBM ESS I encountered problems.
During certain operations on an ESS (quiesce/resume and such) requests
on all paths fail temporarily with a data underrun (resid is set in
the FCP-response).  In another situation abort sequences happen (see
FC-FS).

In both cases it is not a path failure but the device (ESS) reports
error conditions temporarily (some seconds).

Now, on an error on the first path, the multipath layer initiates failover
to the other available path(s), where requests fail immediately as well.

With linux-2.4 and LVM such problems did not occur. There were
enough retries (5 for each path) to handle such situations.

Now if the FASTFAIL flag is set the SCSI stack prevents retries for
failed SCSI commands.

Problem is that the multipath layer cannot distinguish between path
and device failures (and won't do any retries for the failed request
on the same path anyway).

How can an LLD force the SCSI stack to retry a failed SCSI command
(without using DID_REQUEUE or DID_IMM_RETRY, neither of which changes
the retry counter)?

What about a DID_FORCE_RETRY?  Or is there any outlook on when there
will be a better interface between the SCSI stack and the multipath
layer to handle retries properly?
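
To make the distinction concrete, here is a minimal user-space sketch (not
kernel code) of a retry decision that keeps a per-path retry budget for
conditions reported by the device itself, even when fastfail is requested.
The names failure_kind, cmd_state and should_retry are hypothetical, and the
budget of 5 only mirrors the linux-2.4/LVM behaviour mentioned above.

```c
#include <stdbool.h>

/* Hypothetical classification for illustration; not a kernel interface. */
enum failure_kind {
    FAILURE_PATH,   /* link or port problem: failing over makes sense */
    FAILURE_DEVICE  /* transient condition reported by the device itself */
};

/* Hypothetical per-command bookkeeping. */
struct cmd_state {
    int  retries;   /* retries already consumed on this path */
    int  allowed;   /* retry budget, e.g. 5 as with linux-2.4 and LVM */
    bool fastfail;  /* models a request with the FASTFAIL flag set */
};

/*
 * Return true if the command should be retried on the same path.
 * A fastfail request gives up at once on a path failure, but a
 * device-side condition may still use the retry budget.
 */
bool should_retry(struct cmd_state *cmd, enum failure_kind kind)
{
    if (cmd->retries >= cmd->allowed)
        return false;               /* budget exhausted */
    if (cmd->fastfail && kind == FAILURE_PATH)
        return false;               /* let the multipath layer fail over */
    cmd->retries++;
    return true;
}
```

With this policy a fastfail command that hits a path failure is handed back
immediately, while one that hits a temporary device condition is retried up
to the budget before the error is reported upwards.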


Regards,

Andreas



evaluation of scsi_cmnd->resid

2005-04-12 Thread Andreas Herrmann
Hi,

Am I right in the assumption that scsi_cmnd->resid is just of use for
requests initiated by sg?

How does the SCSI stack handle normal (non-sg) requests for SCSI disks
for which a scsi_cmnd->resid is set?  AFAIK, resid is ignored by
sd. So such requests are returned to the block layer although the
amount of data transferred is less than the amount of data that should
have been transferred.  For FCP drivers this means that some error
situations cannot be handled by just using scsi_cmnd->resid.
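
As an illustration of the check I would expect somewhere, here is a small
user-space model (not the kernel's struct scsi_cmnd) that derives the number
of bytes actually transferred from resid and compares it with underflow.
The struct cmd_model and both helper functions are hypothetical; only the
field names are borrowed from scsi_cmnd.

```c
#include <stdbool.h>

/* User-space model for illustration; not the kernel's struct scsi_cmnd. */
struct cmd_model {
    unsigned int request_bufflen; /* bytes requested */
    int          resid;           /* bytes not transferred, set by the LLD */
    unsigned int underflow;       /* minimum byte count regarded as success */
};

/* Bytes that actually arrived, clamped to [0, request_bufflen]. */
unsigned int bytes_transferred(const struct cmd_model *cmd)
{
    if (cmd->resid <= 0)
        return cmd->request_bufflen;    /* nothing left over */
    if ((unsigned int)cmd->resid >= cmd->request_bufflen)
        return 0;                       /* nothing arrived */
    return cmd->request_bufflen - (unsigned int)cmd->resid;
}

/* True if fewer bytes arrived than the caller declared acceptable. */
bool is_short_transfer(const struct cmd_model *cmd)
{
    return bytes_transferred(cmd) < cmd->underflow;
}
```

A 4096-byte READ with resid == 4096 yields zero transferred bytes here and
would count as a short transfer, which is the case that seems to go
unnoticed for disk requests today.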

What am I missing here?


Regards,

Andreas


Re: Proper handling of data underrun

2005-04-08 Thread Andreas Herrmann
Douglas Gilbert <[EMAIL PROTECTED]> wrote:

>  underflow- LLD should place (DID_ERROR << 16) in 'result' if
> actual number of bytes transferred is less than this
> figure. Not many LLDs implement this check and some
> that do just output an error message to the log
> rather than report a DID_ERROR. Better for an LLD
> to implement 'resid'.

> Andreas,
> The last sentence is where the stress should be. It implies
> the LLD should use one or the other, preferably resid.

Ok.
BTW, resid is used in scsi_lib.c to set rq->data_len.
Who is actually evaluating this field -- the block layer 
(I have seen some usage in elevator.c)?

> Historically 'underflow' has been there the longest but was
> insufficient to distinguish between serious underflows
> (e.g. on a READ of a block device) and informative underflows
> (e.g. fetching a mode page with an arbitrarily large buffer).

> So 'resid' was added later and conveys more information and
> doesn't jump to conclusions that it is a serious error.
> Perhaps 'underflow' should be marked as deprecated.

> Is any data conveyed or is the underflow value the same
> as the requested length?

Value is the same as the requested length.

> See what happens if 'underflow' is ignored (i.e. not
> written to by the LLD) and DID_ERROR is not set in
> the host_byte.

Yes I will give it a try.
Thanks.


Regards,

Andreas



Proper handling of data underrun

2005-04-08 Thread Andreas Herrmann
Hi,

Documentation/scsi/scsi_mid_low_api.txt says:

resid- an LLD should set this signed integer to the ...



underflow- LLD should place (DID_ERROR << 16) in 'result' if ...



ZFCP is setting resid and DID_ERROR if an underrun is indicated in the
FCP-response.

In some error situations the storage box reports a BUSY or
TASK_SET_FULL SCSI status as well as a data underrun in the
FCP-response.

In that case zfcp sets DID_ERROR in host_byte as suggested in
scsi_mid_low_api.txt, and the BUSY/TASK_SET_FULL state is returned in
status_byte.

The problem is:
Due to its fastfail operation the SCSI stack won't retry
this kind of failed command, because DID_ERROR is
evaluated before BUSY/TASK_SET_FULL.

What is the proper handling of situations where the device reports a
BUSY/TASK_SET_FULL and a data underrun?
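
To show what I mean by the evaluation order, here is a user-space sketch (not
the mid-layer's code) of a result word with DID_ERROR in the host byte and
the SCSI status in the low byte, together with two ways of evaluating it.
The constants are the usual values (DID_ERROR 0x07, BUSY 0x08, TASK SET FULL
0x28); the enum disposition and the functions make_result, decide_host_first
and decide_status_first are hypothetical names for illustration only.

```c
/* Host byte codes (bits 16-23 of the result word). */
#define DID_OK                  0x00
#define DID_ERROR               0x07

/* SAM status codes (low byte of the result word). */
#define SAM_STAT_GOOD           0x00
#define SAM_STAT_BUSY           0x08
#define SAM_STAT_TASK_SET_FULL  0x28

/* Hypothetical outcome of evaluating a completed command. */
enum disposition { DISP_SUCCESS, DISP_RETRY, DISP_FAIL };

/* Combine host byte and SCSI status into one result word. */
unsigned int make_result(unsigned int host, unsigned int status)
{
    return ((host & 0xff) << 16) | (status & 0xff);
}

/* Host byte wins: any host error fails the command without a retry. */
enum disposition decide_host_first(unsigned int result)
{
    unsigned int host = (result >> 16) & 0xff;
    unsigned int status = result & 0xff;

    if (host != DID_OK)
        return DISP_FAIL;
    if (status == SAM_STAT_BUSY || status == SAM_STAT_TASK_SET_FULL)
        return DISP_RETRY;
    return status == SAM_STAT_GOOD ? DISP_SUCCESS : DISP_FAIL;
}

/* Device status wins: BUSY/TASK SET FULL is retried despite DID_ERROR. */
enum disposition decide_status_first(unsigned int result)
{
    unsigned int host = (result >> 16) & 0xff;
    unsigned int status = result & 0xff;

    if (status == SAM_STAT_BUSY || status == SAM_STAT_TASK_SET_FULL)
        return DISP_RETRY;
    if (host != DID_OK)
        return DISP_FAIL;
    return status == SAM_STAT_GOOD ? DISP_SUCCESS : DISP_FAIL;
}
```

For make_result(DID_ERROR, SAM_STAT_BUSY) the first variant fails the command
while the second one retries it. Whether the LLD should instead leave
DID_ERROR unset in this case is exactly the open question.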


Regards,

Andreas


[PATCH] zfcp: add point-2-point support

2005-04-06 Thread Andreas Herrmann
Hi,

This patch mainly introduces support for point-2-point
topology.


Regards,

Andreas

From: Heiko Carstens <[EMAIL PROTECTED]>
From: Maxim Shchetynin <[EMAIL PROTECTED]>
From: Andreas Herrmann <[EMAIL PROTECTED]>

zfcp changes:
 - add support for point-2-point topology
 - correct permissions for module parameters
 - add log message if CFDC was hardened

Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>

diff -bBrauN zfcp-bk/zfcp_aux.c zfcp-bk-adapt/zfcp_aux.c
--- zfcp-bk/zfcp_aux.c  2005-04-06 15:48:24.620792000 +0200
+++ zfcp-bk-adapt/zfcp_aux.c2005-04-06 15:53:44.489164616 +0200
@@ -88,10 +88,10 @@
 ("FCP (SCSI over Fibre Channel) HBA driver for IBM eServer zSeries");
 MODULE_LICENSE("GPL");
 
-module_param(device, charp, 0);
+module_param(device, charp, 0400);
 MODULE_PARM_DESC(device, "specify initial device");
 
-module_param(loglevel, uint, 0);
+module_param(loglevel, uint, 0400);
 MODULE_PARM_DESC(loglevel,
 "log levels, 8 nibbles: "
 "FC ERP QDIO CIO Config FSF SCSI Other, "
diff -bBrauN zfcp-bk/zfcp_def.h zfcp-bk-adapt/zfcp_def.h
--- zfcp-bk/zfcp_def.h  2005-04-06 15:48:24.681782728 +0200
+++ zfcp-bk-adapt/zfcp_def.h2005-04-06 15:53:44.490164464 +0200
@@ -69,7 +69,7 @@
 /* GENERAL DEFINES */
 
 /* zfcp version number, it consists of major, minor, and patch-level number */
-#define ZFCP_VERSION   "4.2.0"
+#define ZFCP_VERSION   "4.3.0"
 
 /**
  * zfcp_sg_to_address - determine kernel address from struct scatterlist
@@ -850,6 +850,9 @@
wwn_t   wwnn;  /* WWNN */
wwn_t   wwpn;  /* WWPN */
fc_id_t s_id;  /* N_Port ID */
+   wwn_t   peer_wwnn; /* P2P peer WWNN */
+   wwn_t   peer_wwpn; /* P2P peer WWPN */
+   fc_id_t peer_d_id; /* P2P peer D_ID */
struct ccw_device   *ccw_device;   /* S/390 ccw device */
u8  fc_service_class;
u32 fc_topology;   /* FC topology */
diff -bBrauN zfcp-bk/zfcp_erp.c zfcp-bk-adapt/zfcp_erp.c
--- zfcp-bk/zfcp_erp.c  2005-04-06 15:48:24.681782728 +0200
+++ zfcp-bk-adapt/zfcp_erp.c2005-04-06 15:53:44.490164464 +0200
@@ -2568,6 +2568,23 @@
case ZFCP_ERP_STEP_UNINITIALIZED:
case ZFCP_ERP_STEP_PHYS_PORT_CLOSING:
case ZFCP_ERP_STEP_PORT_CLOSING:
+   if (adapter->fc_topology == FSF_TOPO_P2P) {
+   if (port->wwpn != adapter->peer_wwpn) {
+   ZFCP_LOG_NORMAL("Failed to open port 0x%016Lx "
+   "on adapter %s.\nPeer WWPN "
+   "0x%016Lx does not match\n",
+   port->wwpn,
+   zfcp_get_busid_by_adapter(adapter),
+   adapter->peer_wwpn);
+   zfcp_erp_port_failed(port);
+   retval = ZFCP_ERP_FAILED;
+   break;
+   }
+   port->d_id = adapter->peer_d_id;
+   atomic_set_mask(ZFCP_STATUS_PORT_DID_DID, &port->status);
+   retval = zfcp_erp_port_strategy_open_port(erp_action);
+   break;
+   }
if (!(adapter->nameserver_port)) {
retval = zfcp_nameserver_enqueue(adapter);
if (retval != 0) {
@@ -3516,8 +3533,9 @@
debug_text_event(adapter->erp_dbf, 3, "a_access_unblock");
debug_event(adapter->erp_dbf, 3, &adapter->name, 8);
 
-   zfcp_erp_port_access_changed(adapter->nameserver_port);
read_lock_irqsave(&zfcp_data.config_lock, flags);
+   if (adapter->nameserver_port)
+   zfcp_erp_port_access_changed(adapter->nameserver_port);
list_for_each_entry(port, &adapter->port_list_head, list)
if (port != adapter->nameserver_port)
zfcp_erp_port_access_changed(port);
diff -bBrauN zfcp-bk/zfcp_fsf.c zfcp-bk-adapt/zfcp_fsf.c
--- zfcp-bk/zfcp_fsf.c  2005-04-06 15:48:24.682782576 +0200
+++ zfcp-bk-adapt/zfcp_fsf.c2005-04-06 15:53:44.491164312 +0200
@@ -2107,6 +2107,9 @@
   bottom->low_qtcb_version, bottom->high_qtcb_version);
adapter->fsf_lic_version = bottom->lic_version;
adapter->supported_features = bottom->supported_features;
+   adapter->peer_wwpn = 0;
+   adapter->peer_wwnn = 0;
+   adapter->pee

[PATCH] zfcp: convert to compat_ioctl

2005-04-06 Thread Andreas Herrmann
Hi,

This patch converts zfcp to use compat_ioctl.

Thanks to Christoph, who reminded me to adapt zfcp.


Regards,

Andreas


zfcp changes: convert zfcp to compat_ioctl

Signed-off-by: Andreas Herrmann <[EMAIL PROTECTED]>

= zfcp_aux.c 1.19 vs edited =
--- 1.19/drivers/s390/scsi/zfcp_aux.c   2005-03-29 06:39:24 +02:00
+++ edited/zfcp_aux.c   2005-04-06 14:55:52 +02:00
@@ -52,19 +52,18 @@
 static inline int zfcp_sg_list_copy_to_user(void __user *,
struct zfcp_sg_list *, size_t);
 
-static int zfcp_cfdc_dev_ioctl(struct inode *, struct file *,
-   unsigned int, unsigned long);
+static long zfcp_cfdc_dev_ioctl(struct file *, unsigned int, unsigned long);
 
 #define ZFCP_CFDC_IOC_MAGIC 0xDD
 #define ZFCP_CFDC_IOC \
_IOWR(ZFCP_CFDC_IOC_MAGIC, 0, struct zfcp_cfdc_sense_data)
 
-#ifdef CONFIG_COMPAT
-static struct ioctl_trans zfcp_ioctl_trans = {ZFCP_CFDC_IOC, (void*) sys_ioctl};
-#endif
 
 static struct file_operations zfcp_cfdc_fops = {
-   .ioctl = zfcp_cfdc_dev_ioctl
+   .unlocked_ioctl = zfcp_cfdc_dev_ioctl,
+#ifdef CONFIG_COMPAT
+   .compat_ioctl = zfcp_cfdc_dev_ioctl
+#endif
 };
 
 static struct miscdevice zfcp_cfdc_misc = {
@@ -308,23 +307,16 @@
if (!zfcp_transport_template)
return -ENODEV;
 
-   retval = register_ioctl32_conversion(zfcp_ioctl_trans.cmd,
-zfcp_ioctl_trans.handler);
-   if (retval != 0) {
-   ZFCP_LOG_INFO("registration of ioctl32 conversion failed\n");
-   goto out;
-   }
-
retval = misc_register(&zfcp_cfdc_misc);
if (retval != 0) {
ZFCP_LOG_INFO("registration of misc device "
  "zfcp_cfdc failed\n");
-   goto out_misc_register;
-   } else {
-   ZFCP_LOG_TRACE("major/minor for zfcp_cfdc: %d/%d\n",
-  ZFCP_CFDC_DEV_MAJOR, zfcp_cfdc_misc.minor);
+   goto out;
}
 
+   ZFCP_LOG_TRACE("major/minor for zfcp_cfdc: %d/%d\n",
+  ZFCP_CFDC_DEV_MAJOR, zfcp_cfdc_misc.minor);
+
/* Initialise proc semaphores */
sema_init(&zfcp_data.config_sema, 1);
 
@@ -348,8 +340,6 @@
 
  out_ccw_register:
misc_deregister(&zfcp_cfdc_misc);
- out_misc_register:
-   unregister_ioctl32_conversion(zfcp_ioctl_trans.cmd);
  out:
return retval;
 }
@@ -370,9 +360,9 @@
  *  -EPERM  - Cannot create or queue FSF request or create SBALs
  *  -ERESTARTSYS- Received signal (is mapped to EAGAIN by VFS)
  */
-static int
-zfcp_cfdc_dev_ioctl(struct inode *inode, struct file *file,
-unsigned int command, unsigned long buffer)
+static long
+zfcp_cfdc_dev_ioctl(struct file *file, unsigned int command,
+   unsigned long buffer)
 {
struct zfcp_cfdc_sense_data *sense_data, __user *sense_data_user;
struct zfcp_adapter *adapter = NULL;
= zfcp_def.h 1.20 vs edited =
--- 1.20/drivers/s390/scsi/zfcp_def.h   2004-12-07 11:19:06 +01:00
+++ edited/zfcp_def.h   2005-04-06 14:19:31 +02:00
@@ -61,7 +61,6 @@
 #include 
 #include 
 #include 
-#include 
 
 / DEBUG FLAGS 
*/
 
