On 05/10/2013 03:24 PM, Hannes Reinecke wrote:
> However, this time is only defined _on the initiator_.
> The specification does _NOT_ have any fixed timeout values for _any_
> command. As such it could in theory (and does, if you happen to run
> against certain arrays under certain conditions) take several
> minutes to return a completion.

That's my understanding too - in a multipath configuration we're waiting only for our own fast_io_fail_tmo (if set), which is essentially an arbitrary, administrator-controlled interval. You can tune it between extremes of rapid fault identification vs. paths twitching at every transient glitch.
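
As a rough illustration of that tuning, the following is a minimal sketch that writes a value to the fast_io_fail_tmo attribute of every FC remote port through sysfs. The function name and the RPORT_DIR override are hypothetical, added only so the loop can be pointed at a scratch directory; the value 5 in the usage line is an arbitrary example, not a recommendation.

```shell
#!/bin/sh
# Sketch only: set fast_io_fail_tmo (in seconds) on every FC remote port.
# RPORT_DIR is a hypothetical override for testing; by default it points
# at the fc_remote_ports class directory in sysfs.
RPORT_DIR="${RPORT_DIR:-/sys/class/fc_remote_ports}"

set_fast_io_fail_tmo() {
    tmo="$1"
    for rport in "$RPORT_DIR"/rport-*; do
        # Skip when the glob matched nothing or the attribute is missing.
        [ -e "$rport/fast_io_fail_tmo" ] || continue
        echo "$tmo" > "$rport/fast_io_fail_tmo"
    done
}

# Example (as root): set_fast_io_fail_tmo 5
```

If multipathd is running with its own fast_io_fail_tmo setting in multipath.conf, it may overwrite values written this way, so treat the sysfs write as a quick experiment rather than persistent configuration.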

> Yes, that was the idea.
> Which I'll get down to eventually; if only customers wouldn't have
> all these obnoxious issues no-one has ever seen...

The class I've been looking at is really very easy to reproduce and we've seen it at least a half dozen times at different sites with different FC switches (so it's certainly not that unusual).

To recreate it artificially you just need a target, a host, and a switch that can block RSCN propagation on a per-port basis. I've been using Brocade switches with the rscnsupr portcfg attribute.

It's important that you block a port on the switch<->target side; otherwise the host will see a link event, which short-circuits everything.

E.g. if you have one port of an array attached to port 1 on a Brocade switch, the following two commands will set up this scenario:

portcfg rscnsupr 1 --enable
portdisable 1
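
While reproducing this, it can help to check what the host currently believes about its remote ports. The sketch below prints each rport's port_name and port_state from sysfs; the function name and the RPORT_DIR override are hypothetical, and no particular output is implied for the scenario above.

```shell
#!/bin/sh
# Sketch only: list each FC remote port with its WWPN and current state.
# RPORT_DIR is a hypothetical override for testing; by default it points
# at the fc_remote_ports class directory in sysfs.
RPORT_DIR="${RPORT_DIR:-/sys/class/fc_remote_ports}"

show_rport_state() {
    for rport in "$RPORT_DIR"/rport-*; do
        # Skip when the glob matched nothing or the attribute is unreadable.
        [ -r "$rport/port_state" ] || continue
        printf '%s %s %s\n' \
            "$(basename "$rport")" \
            "$(cat "$rport/port_name")" \
            "$(cat "$rport/port_state")"
    done
}

# Example: show_rport_state
```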

Regards,
Bryn.
