[PATCH] Undo __scsi_kill_request()

2007-12-06 Thread Hannes Reinecke
Hi all,

the main goal of using FAILFAST is to have requests terminated
early if the link to the target is lost. This is indicated by
the device state SDEV_BLOCK.
So we only have to check for the FAILFAST flag when we check
the queue state in scsi_prep_fn().

This patch reverts parts of the original patch
'Do not requeue requests if REQ_FAILFAST is set'
as the return values of ->queuecommand() should not be affected
by the FAILFAST setting.

James, please apply.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
[EMAIL PROTECTED] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
Remove __scsi_kill_request()

We only have to evaluate the FAILFAST flag if the device is in
status SDEV_BLOCK, as this indicates a link failure.
The return status of ->queuecommand() should not be affected by
this, so remove the check there.

Signed-off-by: Hannes Reinecke <[EMAIL PROTECTED]>

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 1148c40..6f4862b 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1424,17 +1424,6 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
 	return 1;
 }
 
-static void __scsi_kill_request(struct request *req)
-{
-	struct scsi_cmnd *cmd = req->special;
-	struct scsi_device *sdev = cmd->device;
-
-	cmd->result = DID_NO_CONNECT << 16;
-	atomic_inc(&cmd->device->iorequest_cnt);
-	sdev->device_busy--;
-	__scsi_done(cmd);
-}
-
 /*
  * Kill a request for a dead device
  */
@@ -1639,12 +1628,9 @@ static void scsi_request_fn(struct request_queue *q)
 	 * later time.
 	 */
 	spin_lock_irq(q->queue_lock);
-	if (unlikely(req->cmd_flags & REQ_FAILFAST))
-		__scsi_kill_request(req);
-	else {
-		blk_requeue_request(q, req);
-		sdev->device_busy--;
-	}
+	blk_requeue_request(q, req);
+	sdev->device_busy--;
+
 	if(sdev->device_busy == 0)
 		blk_plug_device(q);
  out:


Re: [PATCH 27/28] blk_end_request: changing scsi mid-layer for bidi (take 3)

2007-12-06 Thread Boaz Harrosh
On Thu, Dec 06 2007 at 2:26 +0200, Kiyoshi Ueda <[EMAIL PROTECTED]> wrote:
> Hi Boaz,
> 
> On Tue, 04 Dec 2007 15:39:12 +0200, Boaz Harrosh <[EMAIL PROTECTED]> wrote:
>> On Sat, Dec 01 2007 at 1:35 +0200, Kiyoshi Ueda <[EMAIL PROTECTED]> wrote:
>>> This patch converts bidi of scsi mid-layer to use blk_end_request().
>>>
>>> rq->next_rq represents a pair of bidi requests.
>>> (There is no other use of 'next_rq' in struct request.)
>>> For both requests in the pair, end_that_request_chunk() should be
>>> called before end_that_request_last() is called for one of them.
>>> Since the calls to end_that_request_first()/chunk() and
>>> end_that_request_last() are packaged into blk_end_request(),
>>> the handling of next_rq completion has to be moved into
>>> blk_end_request(), too.
>>>
>>> Bidi sets its specific value to rq->data_len before the request is
>>> completed so that upper-layer can read it.
>>> This setting must be between end_that_request_chunk() and
>>> end_that_request_last(), because rq->data_len may be used
>>> in end_that_request_chunk() by blk_trace and so on.
>>> To satisfy the requirement, use blk_end_request_callback() which
>>> is added in PATCH 25 only for the tricky drivers.
>>>
>>> If bidi didn't reuse rq->data_len and added new members to request
>>> for the specific value, it could set before end_that_request_chunk()
>>> and use the standard blk_end_request() like below.
>>>
>>> void scsi_end_bidi_request(struct scsi_cmnd *cmd)
>>> {
>>> 	struct request *req = cmd->request;
>>>
>>> 	req->resid = scsi_out(cmd)->resid;
>>> 	req->next_rq->resid = scsi_in(cmd)->resid;
>>>
>>> 	if (blk_end_request(req, 1, req->data_len))
>>> 		BUG();
>>>
>>> 	scsi_release_buffers(cmd);
>>> 	scsi_next_command(cmd);
>>> }
> ...
> snip
> ...
>> rq->data_len = scsi_out(cmd)->resid is not just a problem of bidi;
>> it is a general problem of scsi residual handling and user code.
>>
>> Even today, before any bidi, scsi_io_completion() in scsi_lib.c
>> does req->data_len = scsi_get_resid(cmd);
>> (or req->data_len = cmd->resid; depending on which version you look at)
>> And then call scsi_end_request() which calls __end_that_request_first/last
>> So it is assumed even today that req->data_len is not touched by
>> __end_that_request_first/last unless __end_that_request_first returned
>> that there is more work to do and the command is resubmitted in which
>> case the resid information is discarded.
>>
>> So if the regular resid handling is acceptable - Set req->data_len
>> before the call to __end_that_request_first/last, or blk_end_request()
>> in your case, then here goes your second client of the _callback and
>> it can be removed.
>> But if it is found that req->data_len is touched and the resid information
>> gets lost, then it should be fixed for the common uni-io case, for example
>> by passing resid to the blk_end_request() function.
>> (So either way the _callback can go)
> 
> Thank you for the explanation of scsi's rq->data_len usage.
> I see that scsi usually uses rq->data_len for cmd->resid.
> 
> I have investigated the possibility of setting data_len before
> the call to blk_end_request.
> But no matter whether data_len is touched or not, we need a callback
> for bidi.  So I would like to go with the current patch.
> 
> I explained the reason and some details below.
> 
> 
> As far as I can see, rq->data_len is just referenced
> by blk_add_trace_rq() in __end_that_request_first(), not modified.
> And I don't change any logic around there in the block-layer.
> So there shouldn't be any critical problem for scsi residual handling.
> (although I'm not sure that scsi expects cmd->resid to be traced
>  by blk_trace.)
> 
> Anyway, I see that it is no critical problem for bidi to set cmd->resid
> to rq->data_len before blk_end_request() call.
> But if I do that, blk_end_request() can't get the next_rq's size
> to complete in its code below.
> 
>> +/* Bidi request must be completed as a whole */
>> +if (blk_bidi_rq(rq) &&
>> +__end_that_request_first(rq->next_rq, uptodate,
>> + blk_rq_bytes(rq->next_rq)))
>> +return 1;
> 
> So I will have to move next_rq completion to bidi and use _callback()
> anyway like the following.
> -
> static int dummy_cb(struct request *rq)
> {
>   return 1;
> }
> 
> void scsi_end_bidi_request(struct scsi_cmnd *cmd)
> {
>   struct request *req = cmd->request;
>   unsigned int dlen = req->data_len;
>   unsigned int next_dlen = req->next_rq->data_len;
>  
>   req->data_len = scsi_out(cmd)->resid;
>   req->next_rq->data_len = scsi_in(cmd)->resid;
>  
>   /* Complete only DATA of next_rq using _callback and dummy function */
>   if (!blk_end_request_callback(req->next_rq, 1, next_dlen, dummy_cb))
>   BUG();
>  
>   if (blk_end_request(req, 1, dlen))

Re: [PATCH] zfcp: add some internal zfcp adapter statistics

2007-12-06 Thread Swen Schillig
On Monday 26 November 2007 11:23, Swen Schillig wrote:
> On Sunday 25 November 2007 12:16, James Bottomley wrote:
> > 
> > On Wed, 2007-10-31 at 11:33 +0100, Swen Schillig wrote:
> > > From: Swen Schillig <[EMAIL PROTECTED]>
> > > 
> > > add some statistics provided by the zFCP adapter to the sysfs
> > > 
> > > > The new zFCP adapter statistics provide a variety of information
> > > > about the virtual adapter (subchannel). In order to collect this
> > > > information the zFCP driver is extended on one side to query the
> > > > adapter and on the other side summarize certain values which can
> > > > then be fetched on demand.
> > > > This information is made available via files (attributes) in the
> > > > sysfs filesystem.
> > > 
> > > The information provided by the new zFCP adapter statistics can be fetched
> > > by reading from the following files in the sysfs filesystem
> > > 
> > >  /sys/class/scsi_host/host/seconds_active
> > >  /sys/class/scsi_host/host/requests
> > >  /sys/class/scsi_host/host/megabytes
> > >  /sys/class/scsi_host/host/utilization
> > 
> > This lot all look like they belong in the FC transport class statistics
> > (some even already exist there).
> 
> They might look alike but they are not the same. The values provided
> through the FC transport class always refer to the physical port whereas
> the new values here refer to a virtual adapter or subchannel.
> The attributes provided here are all new and not covered or displayed
> anywhere else!
> 
> > 
> > > These are the statistics on a virtual adapter (subchannel) level.
> > > > In addition latency information is provided on a SCSI device level
> > > > (LUN) which can be found at the following location
> > > 
> > >  /sys/class/scsi_device//device/cmd_latency
> > >  /sys/class/scsi_device//device/read_latency
> > >  /sys/class/scsi_device//device/write_latency
> > 
> > These look to duplicate to some degree the figures
> > in /sys/block//stat.  Isn't the block device the best place to
> > gather these, if they're useful?  Since user latencies should probably
> > include elevator times.
> 
> Actually no, the latencies covered here are channel- and fabric-latencies
> grouped by scsi-devices and not device-, scsi- or block-latencies.
> In contrast to the stats provided by the block-layer structure,
> tape devices will be covered here as well.
> 
> > 
> > James
> > 
> > 
> > 
> 
> Cheers Swen
> 

James

Were my answers and comments sufficient for you to apply my patch,
or is something required still missing?

Thanks

Cheers Swen
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.24-rc4-mm1: hostbyte=0x01 driverbyte=0x00 (now bisected)

2007-12-06 Thread Jens Axboe
On Thu, Dec 06 2007, Hannes Reinecke wrote:
> Alexey Dobriyan wrote:
> >>  git-scsi-misc.patch
> > 
> > Apologies for not looking into the problem earlier. See
> > http://marc.info/?t=11962802235&r=1&w=2
> > "2.6.24-rc3-mm2: Result: hostbyte=0x01 driverbyte=0x00\nend_request: I/O error"
> > for previous installment.
> > 
> > I've bisected it to the following patch in git-scsi-misc branch.
> > Revert on top of 2.6.24-rc4-mm1 also helps.
> > 
> > commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> > Author: Hannes Reinecke <[EMAIL PROTECTED]>
> > Date:   Tue Nov 6 09:23:40 2007 +0100
> > 
> > [SCSI] Do not requeue requests if REQ_FAILFAST is set
> > 
> > Any requests with the REQ_FAILFAST flag set should not be requeued
> > > to the request queue, but rather terminated directly.
> > Otherwise the multipath failover will stall until the command
> > timeout triggers.
> > 
> > Signed-off-by: Hannes Reinecke <[EMAIL PROTECTED]>
> > Signed-off-by: James Bottomley <[EMAIL PROTECTED]>
> > 
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index 0f44bdb..0da0dd0 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -1286,6 +1286,11 @@ int scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
> >  */
> > if (!(req->cmd_flags & REQ_PREEMPT))
> > ret = BLKPREP_DEFER;
> > +   /*
> > +* Return failfast requests immediately
> > +*/
> > +   if (req->cmd_flags & REQ_FAILFAST)
> > +   ret = BLKPREP_KILL;
> > break;
> > default:
> > /*
> > @@ -1414,6 +1419,17 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
> > return 1;
> >  }
> >  
> > +static void __scsi_kill_request(struct request *req)
> > +{
> > +   struct scsi_cmnd *cmd = req->special;
> > +   struct scsi_device *sdev = cmd->device;
> > +
> > +   cmd->result = DID_NO_CONNECT << 16;
> > +   atomic_inc(&cmd->device->iorequest_cnt);
> > +   sdev->device_busy--;
> > +   __scsi_done(cmd);
> > +}
> > +
> >  /*
> >   * Kill a request for a dead device
> >   */
> > @@ -1527,8 +1543,16 @@ static void scsi_request_fn(struct request_queue *q)
> >  * accept it.
> >  */
> > req = elv_next_request(q);
> > -   if (!req || !scsi_dev_queue_ready(q, sdev))
> > +   if (!req)
> > +   break;
> > +
> > +   if (!scsi_dev_queue_ready(q, sdev)) {
> > +   if (req->cmd_flags & REQ_FAILFAST) {
> > +   scsi_kill_request(req, q);
> > +   continue;
> > +   }
> > break;
> > +   }
> >  
> > if (unlikely(!scsi_device_online(sdev))) {
> > sdev_printk(KERN_ERR, sdev,
> > @@ -1609,8 +1633,12 @@ static void scsi_request_fn(struct request_queue *q)
> >  * later time.
> >  */
> > spin_lock_irq(q->queue_lock);
> > -   blk_requeue_request(q, req);
> > -   sdev->device_busy--;
> > +   if (unlikely(req->cmd_flags & REQ_FAILFAST))
> > +   __scsi_kill_request(req);
> > +   else {
> > +   blk_requeue_request(q, req);
> > +   sdev->device_busy--;
> > +   }
> > if(sdev->device_busy == 0)
> > blk_plug_device(q);
> >   out:
> Yeah, sorry. That patch was bad. Please use the attached one instead.
> Andrew, can you replace them?
> 
> Cheers,
> 
> Hannes
> -- 
> Dr. Hannes Reinecke zSeries & Storage
> [EMAIL PROTECTED]   +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Markus Rex, HRB 16746 (AG Nürnberg)

> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 13e7e09..9ec1566 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1284,13 +1284,15 @@ int scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
>   /*
>* If the devices is blocked we defer normal commands.
>*/
> - if (!(req->cmd_flags & REQ_PREEMPT))
> - ret = BLKPREP_DEFER;
> - /*
> -  * Return failfast requests immediately
> -  */
> - if (req->cmd_flags & REQ_FAILFAST)
> - ret = BLKPREP_KILL;
> + if (!(req->cmd_flags & REQ_PREEMPT)) {
> + /*
> +  * Return failfast requests immediately
> +  */
> + if (req->cmd_flags & REQ_FAILFAST)
> + ret = BLKPREP_KILL;
> + else
> + ret = 

[PATCH 10/20] drivers/scsi/ipr.c: use LIST_HEAD instead of LIST_HEAD_INIT

2007-12-06 Thread Denis Cheng
A single list_head variable initialized with LIST_HEAD_INIT can almost
always be replaced with a LIST_HEAD declaration; this shrinks the code
and looks better.

Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>
---
 drivers/scsi/ipr.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index 0841df0..9018ee8 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -84,7 +84,7 @@
 /*
  *   Global Data
  */
-static struct list_head ipr_ioa_head = LIST_HEAD_INIT(ipr_ioa_head);
+static LIST_HEAD(ipr_ioa_head);
 static unsigned int ipr_log_level = IPR_DEFAULT_LOG_LEVEL;
 static unsigned int ipr_max_speed = 1;
 static int ipr_testmode = 0;
-- 
1.5.3.4
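
The equivalence this patch relies on can be checked outside the kernel. The following is a minimal userspace sketch: the struct and the two macros are re-declared locally in the shape of the kernel's list macros, while old_style_head, new_style_head and list_is_empty() are names made up for illustration and are not part of ipr.c.

```c
/* Local stand-ins for the kernel list type and macros (illustration only). */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

/* Old spelling: explicit type plus initializer. */
static struct list_head old_style_head = LIST_HEAD_INIT(old_style_head);

/* New spelling: the macro expands to the same declaration. */
static LIST_HEAD(new_style_head);

/* Hypothetical helper: an empty list head points back at itself. */
static int list_is_empty(const struct list_head *head)
{
	return head->next == head;
}
```

Both heads end up as empty lists whose next and prev pointers refer to the head itself, so under these assumptions the change is purely one of spelling at the declaration site.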



Re: [PATCH] libsas: Don't issue commands to devices that have been hot-removed.

2007-12-06 Thread Brian King
Darrick J. Wong wrote:
> In general, I agree that sas-ata should adopt the new EH.
> Unfortunately, I believe the old way of sas-ata configuring ATA ports is
> somehow not compatible with the new EH stuff and causes a crash during
> the device probe with my patch to move sas-ata to the new EH.  If I
> apply the patch that migrates sas-ata to use brking's latest ata-sas
> configuration mechanism (the one that creates real ata_hosts), I see
> (a) lots and lots of ATA hosts getting created (one per ATA port;
> possibly undesirable if you've a SAS topology with a lot of SATA disks)

The new libata EH ends up spending more time in the error handling thread
than the old code did. One of the reasons having multiple ATA/SCSI hosts
is a good thing is that the host is the granularity of error handling, so it
prevents stalling all the other devices under that SAS HBA while we are
hitting errors on an ATAPI SATA device, for example.

Arguably, SATA users of libata already have one SCSI host per ATA port,
so my SAS patches really just bring SAS in line with that design...

-Brian

-- 
Brian King
Linux on Power Virtualization
IBM Linux Technology Center


Re: [PATCH 10/20] drivers/scsi/ipr.c: use LIST_HEAD instead of LIST_HEAD_INIT

2007-12-06 Thread Brian King
Acked-by: Brian King <[EMAIL PROTECTED]>

Denis Cheng wrote:
> A single list_head variable initialized with LIST_HEAD_INIT can almost
> always be replaced with a LIST_HEAD declaration; this shrinks the code
> and looks better.
> 
> Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>
> ---
>  drivers/scsi/ipr.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
> index 0841df0..9018ee8 100644
> --- a/drivers/scsi/ipr.c
> +++ b/drivers/scsi/ipr.c
> @@ -84,7 +84,7 @@
>  /*
>   *   Global Data
>   */
> -static struct list_head ipr_ioa_head = LIST_HEAD_INIT(ipr_ioa_head);
> +static LIST_HEAD(ipr_ioa_head);
>  static unsigned int ipr_log_level = IPR_DEFAULT_LOG_LEVEL;
>  static unsigned int ipr_max_speed = 1;
>  static int ipr_testmode = 0;


-- 
Brian King
Linux on Power Virtualization
IBM Linux Technology Center


Re: BUG 2.6.24-rc4-mm1 -- Boot still hangs w/ async scsi scan

2007-12-06 Thread Lee Schermerhorn
On Wed, 2007-12-05 at 13:20 -0800, Andrew Morton wrote:
> On Wed, 05 Dec 2007 11:36:39 -0500
> Lee Schermerhorn <[EMAIL PROTECTED]> wrote:
> 
> > As reported here:
> > 
> > http://marc.info/?l=linux-scsi&m=119645761124683&w=4
> > 
> > against 24-rc3-mm2, I'm still seeing the hang on my HP ia64 NUMA
> > platform under 24-rc4-mm1 with async scsi scan enabled.  I'm still
> > seeing the message  "mptspi: ioc#: mpt_config failed" when it hangs. 
> > 
> > I can boot by disabling async scan.  However, I've also noticed some
> > disks attached via one of the "mpt" adapters ["scsi8" in console log in
> > message linked above] going "off-line" during stress tests.  This was
> > under 24-rc3-mm2.  Haven't got that far yet with 24-rc4-mm1.
> > 
> 
> Is there any way of tricking you into
> http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt?
> 
> Obvious culprits to start with would be git-scsi-misc and maybe
> scsi-early-detection-of-medium-not-present-updated.patch.  But there are
> only 20-odd scsi patches in there.

The reported hang occurs after pushing the git-scsi-misc patch.  I'm
looking into it now, but it's rather large and I'm a neophyte in this
area.  If James can point me at a broken-out quilt series for this
patch, I'd be willing to try to bisect that--assuming that it IS
bisectable.

Lee



Re: BUG 2.6.24-rc4-mm1 -- Boot still hangs w/ async scsi scan

2007-12-06 Thread Andrew Morton
On Thu, 06 Dec 2007 13:14:22 -0500 Lee Schermerhorn <[EMAIL PROTECTED]> wrote:

> On Wed, 2007-12-05 at 13:20 -0800, Andrew Morton wrote:
> > On Wed, 05 Dec 2007 11:36:39 -0500
> > Lee Schermerhorn <[EMAIL PROTECTED]> wrote:
> > 
> > > As reported here:
> > > 
> > >   http://marc.info/?l=linux-scsi&m=119645761124683&w=4
> > > 
> > > against 24-rc3-mm2, I'm still seeing the hang on my HP ia64 NUMA
> > > platform under 24-rc4-mm1 with async scsi scan enabled.  I'm still
> > > seeing the message  "mptspi: ioc#: mpt_config failed" when it hangs. 
> > > 
> > > I can boot by disabling async scan.  However, I've also noticed some
> > > disks attached via one of the "mpt" adapters ["scsi8" in console log in
> > > message linked above] going "off-line" during stress tests.  This was
> > > under 24-rc3-mm2.  Haven't got that far yet with 24-rc4-mm1.
> > > 
> > 
> > Is there any way of tricking you into
> > http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt?
> > 
> > Obvious culprits to start with would be git-scsi-misc and maybe
> > scsi-early-detection-of-medium-not-present-updated.patch.  But there are
> > only 20-odd scsi patches in there.
> 
> The reported hang occurs after pushing the git-scsi-misc patch.

OK, thanks.

>  I'm
> looking into it now, but it's rather large and I'm a neophyte in this
> area.  If James can point me at a broken-out quilt series for this
> patch, I'd be willing to try to bisect that--

I doubt if such a thing exists.

> assuming that it IS
> bisectable.

Often git trees are not bisectable.  But they should be.

Your best bet is to do a git-bisect on
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git

http://www.kernel.org/doc/local/git-quick.html



Re: BUG 2.6.24-rc4-mm1 -- Boot still hangs w/ async scsi scan

2007-12-06 Thread Lee Schermerhorn
On Thu, 2007-12-06 at 10:35 -0800, Andrew Morton wrote:
> On Thu, 06 Dec 2007 13:14:22 -0500 Lee Schermerhorn <[EMAIL PROTECTED]> wrote:
> 
> > On Wed, 2007-12-05 at 13:20 -0800, Andrew Morton wrote:
> > > On Wed, 05 Dec 2007 11:36:39 -0500
> > > Lee Schermerhorn <[EMAIL PROTECTED]> wrote:
> > > 
> > > > As reported here:
> > > > 
> > > > http://marc.info/?l=linux-scsi&m=119645761124683&w=4
> > > > 
> > > > against 24-rc3-mm2, I'm still seeing the hang on my HP ia64 NUMA
> > > > platform under 24-rc4-mm1 with async scsi scan enabled.  I'm still
> > > > seeing the message  "mptspi: ioc#: mpt_config failed" when it hangs. 
> > > > 
> > > > I can boot by disabling async scan.  However, I've also noticed some
> > > > disks attached via one of the "mpt" adapters ["scsi8" in console log in
> > > > message linked above] going "off-line" during stress tests.  This was
> > > > under 24-rc3-mm2.  Haven't got that far yet with 24-rc4-mm1.
> > > > 
> > > 
> > > Is there any way of tricking you into
> > > http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt?
> > > 
> > > Obvious culprits to start with would be git-scsi-misc and maybe
> > > scsi-early-detection-of-medium-not-present-updated.patch.  But there are
> > > only 20-odd scsi patches in there.
> > 
> > The reported hang occurs after pushing the git-scsi-misc patch.
> 
> OK, thanks.
> 
> >  I'm
> > looking into it now, but it's rather large and I'm a neophyte in this
> > area.  If James can point me at a broken-out quilt series for this
> > patch, I'd be willing to try to bisect that--
> 
> I doubt if such a thing exists.
> 
> > assuming that it IS
> > bisectable.
> 
> Often git trees are not bisectable.  But they should be.
> 
> Your best bet is to do a git-bisect on
> git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git
> 
> http://www.kernel.org/doc/local/git-quick.html
> 

Ah, well... Can't promise that will happen any time soon...

Lee



Re: 2.6.24-rc4-mm1: hostbyte=0x01 driverbyte=0x00 (now bisected)

2007-12-06 Thread Alexey Dobriyan
On Thu, Dec 06, 2007 at 08:52:29AM +0100, Hannes Reinecke wrote:
> Alexey Dobriyan wrote:
> >>  git-scsi-misc.patch
> > 
> > Apologies for not looking into the problem earlier. See
> > http://marc.info/?t=11962802235&r=1&w=2
> > "2.6.24-rc3-mm2: Result: hostbyte=0x01 driverbyte=0x00\nend_request: I/O error"
> > for previous installment.
> > 
> > I've bisected it to the following patch in git-scsi-misc branch.
> > Revert on top of 2.6.24-rc4-mm1 also helps.
> > 
> > commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> > Author: Hannes Reinecke <[EMAIL PROTECTED]>
> > Date:   Tue Nov 6 09:23:40 2007 +0100
> > 
> > [SCSI] Do not requeue requests if REQ_FAILFAST is set
> > 
> > Any requests with the REQ_FAILFAST flag set should not be requeued
> > > to the request queue, but rather terminated directly.
> > Otherwise the multipath failover will stall until the command
> > timeout triggers.
> > 
> > Signed-off-by: Hannes Reinecke <[EMAIL PROTECTED]>
> > Signed-off-by: James Bottomley <[EMAIL PROTECTED]>
> > 
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index 0f44bdb..0da0dd0 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -1286,6 +1286,11 @@ int scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
> >  */
> > if (!(req->cmd_flags & REQ_PREEMPT))
> > ret = BLKPREP_DEFER;
> > +   /*
> > +* Return failfast requests immediately
> > +*/
> > +   if (req->cmd_flags & REQ_FAILFAST)
> > +   ret = BLKPREP_KILL;
> > break;
> > default:
> > /*
> > @@ -1414,6 +1419,17 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
> > return 1;
> >  }
> >  
> > +static void __scsi_kill_request(struct request *req)
> > +{
> > +   struct scsi_cmnd *cmd = req->special;
> > +   struct scsi_device *sdev = cmd->device;
> > +
> > +   cmd->result = DID_NO_CONNECT << 16;
> > +   atomic_inc(&cmd->device->iorequest_cnt);
> > +   sdev->device_busy--;
> > +   __scsi_done(cmd);
> > +}
> > +
> >  /*
> >   * Kill a request for a dead device
> >   */
> > @@ -1527,8 +1543,16 @@ static void scsi_request_fn(struct request_queue *q)
> >  * accept it.
> >  */
> > req = elv_next_request(q);
> > -   if (!req || !scsi_dev_queue_ready(q, sdev))
> > +   if (!req)
> > +   break;
> > +
> > +   if (!scsi_dev_queue_ready(q, sdev)) {
> > +   if (req->cmd_flags & REQ_FAILFAST) {
> > +   scsi_kill_request(req, q);
> > +   continue;
> > +   }
> > break;
> > +   }
> >  
> > if (unlikely(!scsi_device_online(sdev))) {
> > sdev_printk(KERN_ERR, sdev,
> > @@ -1609,8 +1633,12 @@ static void scsi_request_fn(struct request_queue *q)
> >  * later time.
> >  */
> > spin_lock_irq(q->queue_lock);
> > -   blk_requeue_request(q, req);
> > -   sdev->device_busy--;
> > +   if (unlikely(req->cmd_flags & REQ_FAILFAST))
> > +   __scsi_kill_request(req);
> > +   else {
> > +   blk_requeue_request(q, req);
> > +   sdev->device_busy--;
> > +   }
> > if(sdev->device_busy == 0)
> > blk_plug_device(q);
> >   out:
> Yeah, sorry. That patch was bad. Please use the attached one instead.
> Andrew, can you replace them?

Instead? It won't apply. And it doesn't help on top of git-scsi.
It helps if 3 hunks involving __scsi_kill_request() are ducked.

> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1284,13 +1284,15 @@ int scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
>   /*
>* If the devices is blocked we defer normal commands.
>*/
> - if (!(req->cmd_flags & REQ_PREEMPT))
> - ret = BLKPREP_DEFER;
> - /*
> -  * Return failfast requests immediately
> -  */
> - if (req->cmd_flags & REQ_FAILFAST)
> - ret = BLKPREP_KILL;
> + if (!(req->cmd_flags & REQ_PREEMPT)) {
> + /*
> +  * Return failfast requests immediately
> +  */
> + if (req->cmd_flags & REQ_FAILFAST)
> + ret = BLKPREP_KILL;
> + else
> + ret = BLKPREP_DEFER;
> + }
>   break;
>   default:
>   /*
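
The difference between the two versions of the blocked-device hunk quoted in this message is easy to miss: in the first patch the REQ_FAILFAST test ran after, and independently of, the REQ_PREEMPT test, so a request carrying both flags was killed, whereas the replacement hunk only looks at REQ_FAILFAST for non-PREEMPT requests. A minimal userspace sketch of that difference follows. It is not kernel code: the flag bits, the PREP_* codes and the names blocked_prep_old() / blocked_prep_new() are stand-ins invented for illustration, and it assumes the return value starts out as "OK" before the check.

```c
/* Illustrative stand-ins; the real kernel flag and BLKPREP_* values differ. */
enum { REQ_FAILFAST = 1 << 0, REQ_PREEMPT = 1 << 1 };
enum prep_ret { PREP_OK, PREP_KILL, PREP_DEFER };

/* Shape of the first patch: the FAILFAST test runs unconditionally. */
static enum prep_ret blocked_prep_old(unsigned int cmd_flags)
{
	enum prep_ret ret = PREP_OK;	/* assumed default */

	if (!(cmd_flags & REQ_PREEMPT))
		ret = PREP_DEFER;
	if (cmd_flags & REQ_FAILFAST)
		ret = PREP_KILL;	/* also overrides PREEMPT requests */
	return ret;
}

/* Shape of the replacement hunk: FAILFAST only matters without PREEMPT. */
static enum prep_ret blocked_prep_new(unsigned int cmd_flags)
{
	enum prep_ret ret = PREP_OK;	/* assumed default */

	if (!(cmd_flags & REQ_PREEMPT)) {
		if (cmd_flags & REQ_FAILFAST)
			ret = PREP_KILL;
		else
			ret = PREP_DEFER;
	}
	return ret;
}
```

In this model the two functions agree for every flag combination except REQ_PREEMPT | REQ_FAILFAST, where the old shape kills the request and the new one lets it through.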


Re: [PATCH 27/28] blk_end_request: changing scsi mid-layer for bidi (take 3)

2007-12-06 Thread Kiyoshi Ueda
Hi Boaz, Jens,

On Thu, 06 Dec 2007 11:24:44 +0200, Boaz Harrosh <[EMAIL PROTECTED]> wrote:
> > Index: 2.6.24-rc3-mm2/drivers/scsi/scsi_lib.c
> > ===
> > --- 2.6.24-rc3-mm2.orig/drivers/scsi/scsi_lib.c
> > +++ 2.6.24-rc3-mm2/drivers/scsi/scsi_lib.c
> > @@ -629,28 +629,6 @@ void scsi_run_host_queues(struct Scsi_Ho
> > scsi_run_queue(sdev->request_queue);
> >  }
> >  
> > -static void scsi_finalize_request(struct scsi_cmnd *cmd, int uptodate)
> > -{
> > -   struct request_queue *q = cmd->device->request_queue;
> > -   struct request *req = cmd->request;
> > -   unsigned long flags;
> > -
> > -   add_disk_randomness(req->rq_disk);
> > -
> > -   spin_lock_irqsave(q->queue_lock, flags);
> > -   if (blk_rq_tagged(req))
> > -   blk_queue_end_tag(q, req);
> > -
> > -   end_that_request_last(req, uptodate);
> > -   spin_unlock_irqrestore(q->queue_lock, flags);
> > -
> > -   /*
> > -* This will goose the queue request function at the end, so we don't
> > -* need to worry about launching another command.
> > -*/
> > -   scsi_next_command(cmd);
> > -}
> > -
> >  /*
> >   * Function:scsi_end_request()
> >   *
> > @@ -921,6 +899,20 @@ void scsi_release_buffers(struct scsi_cm
> >  EXPORT_SYMBOL(scsi_release_buffers);
> >  
> >  /*
> > + * Called from blk_end_request_callback() after all DATA in rq and its next_rq
> > + * are completed before rq is completed/freed.
> > + */
> > +static int scsi_end_bidi_request_cb(struct request *rq)
> > +{
> > +   struct scsi_cmnd *cmd = rq->special;
> > +
> > +   rq->data_len = scsi_out(cmd)->resid;
> > +   rq->next_rq->data_len = scsi_in(cmd)->resid;
> > +
> > +   return 0;
> > +}
> > +
> > +/*
> >   * Bidi commands Must be complete as a whole, both sides at once.
> >   * If part of the bytes were written and lld returned
> >   * scsi_in()->resid and/or scsi_out()->resid this information will be left
> > @@ -931,22 +923,28 @@ void scsi_end_bidi_request(struct scsi_c
> >  {
> > struct request *req = cmd->request;
> >  
> > -   end_that_request_chunk(req, 1, req->data_len);
> > -   req->data_len = scsi_out(cmd)->resid;
> > -
> > -   end_that_request_chunk(req->next_rq, 1, req->next_rq->data_len);
> > -   req->next_rq->data_len = scsi_in(cmd)->resid;
> > -
> > -   scsi_release_buffers(cmd);
> > -
> > /*
> >  *FIXME: If ll_rw_blk.c is changed to also put_request(req->next_rq)
> > -*   in end_that_request_last() then this WARN_ON must be removed.
> > +*   in blk_end_request() then this WARN_ON must be removed.
> >  *   for now, upper-driver must have registered an end_io.
> >  */
> > WARN_ON(!req->end_io);
> >  
> > -   scsi_finalize_request(cmd, 1);
> > +   /*
> > +* blk_end_request() family take care of data completion of next_rq.
> > +* blk_end_request() family use next_rq->data_len for 
> > +* the completion data size of next_rq.
> > +* So resid can't be set before the data completion of next_rq
> > +* in blk_end_request().
> > +* To resolve that, use the callback feature of blk_end_request().
> > +*/
> > +   if (blk_end_request_callback(req, 1, req->data_len,
> > +scsi_end_bidi_request_cb))
> > +   /* req has not been completed */
> > +   BUG();
> > +
> > +   scsi_release_buffers(cmd);
> > +   scsi_next_command(cmd);
> >  }
> >  
> >  /*
> > Index: 2.6.24-rc3-mm2/block/ll_rw_blk.c
> > ===
> > --- 2.6.24-rc3-mm2.orig/block/ll_rw_blk.c
> > +++ 2.6.24-rc3-mm2/block/ll_rw_blk.c
> > @@ -3817,6 +3817,12 @@ int blk_end_request(struct request *rq, 
> > if (blk_fs_request(rq) || blk_pc_request(rq)) {
> > if (__end_that_request_first(rq, uptodate, nr_bytes))
> > return 1;
> > +
> > +   /* Bidi request must be completed as a whole */
> > +   if (blk_bidi_rq(rq) &&
> > +   __end_that_request_first(rq->next_rq, uptodate,
> > +blk_rq_bytes(rq->next_rq)))
> > +   return 1;
> > }
> >  
> > add_disk_randomness(rq->rq_disk);
> > @@ -3840,6 +3846,12 @@ int __blk_end_request(struct request *rq
> > if (blk_fs_request(rq) || blk_pc_request(rq)) {
> > if (__end_that_request_first(rq, uptodate, nr_bytes))
> > return 1;
> > +
> > +   /* Bidi request must be completed as a whole */
> > +   if (blk_bidi_rq(rq) &&
> > +   __end_that_request_first(rq->next_rq, uptodate,
> > +blk_rq_bytes(rq->next_rq)))
> > +   return 1;
> > }
> >  
> > add_disk_randomness(rq->rq_disk);
> > @@ -3884,6 +3896,12 @@ int blk_end_request_callback(struct requ
> > if (blk_fs_request(rq) || blk_pc_request(rq)) {
> > if (__end_that_request_first(rq, uptodate, nr_bytes))
> > 

Re: Patch submission question [not in the FAQ]

2007-12-06 Thread adam radford
On Dec 5, 2007 3:36 AM, Gabriele Gorla <[EMAIL PROTECTED]> wrote:
> Hello,
> I have submitted a patch for the 3x- driver on
> alpha several months ago to both the driver maintainer
> and the linux-scsi mailing list.
> I have read all the FAQ and I tried to stick to the
> instructions to the letter.
> However the patch has been completely ignored. No
> reply, no comment, no flames, absolutely nothing...
>
> the original email submission is at the end of the
> email.
>
> could anyone please explain what I am doing wrong?
>
> thanks,
> GG

Gabriele,

I ignored your patch because:

1. I do not believe you have the 3w- driver running on an
   alpha SMP system.

2. I removed the bitfields from the 3w- driver but I have
   yet to add full big endian support due to lack of demand.  I have
   such a patch for this driver (which already includes the unpacking
   of the wait_queue_head_t variable) but I have not submitted it to
   the main-line kernel.  The in-kernel 3w- driver is still missing
   the byte-swaps.

   The 3w-9xxx (9000 series 3ware driver) has full big endian support.

3. Your patch was garbled.
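
To illustrate point 2, here is a minimal sketch of what "removing bitfields" and "adding byte-swaps" usually amount to. It assumes the controller lays out multi-byte fields in little-endian order; the helper names and the 5-bit/3-bit split are hypothetical and are not taken from the 3w- driver.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, not driver code. Decoding byte by byte gives the
 * same result on little-endian and big-endian hosts, which is the job the
 * missing byte-swaps would do.
 */
static uint32_t sketch_get_le32(const unsigned char *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

static void sketch_put_le32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v & 0xff);
	p[1] = (unsigned char)((v >> 8) & 0xff);
	p[2] = (unsigned char)((v >> 16) & 0xff);
	p[3] = (unsigned char)((v >> 24) & 0xff);
}

/*
 * Bitfield layout is compiler- and endian-dependent, so a shared byte is
 * split with masks and shifts instead. The split below (opcode in the low
 * 5 bits, an offset in the high 3 bits) is an assumed example layout.
 */
#define SKETCH_OPCODE(x)	((x) & 0x1f)
#define SKETCH_SGLOFF(x)	(((x) >> 5) & 0x07)
```

With accessors of this kind the structure definitions need no bitfields, and the same source builds for either byte order.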

Is this an official request for big endian support for the 3w- driver or
are you looking for anybody who has a packed 'wait_queue_head_t' and
submitting a patch to fix it?

-Adam
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Reduce stack used by lib/hexdump.c

2007-12-06 Thread Joe Perches
On Wed, 2007-12-05 at 16:01 -0800, Andrew Morton wrote:
> No, I think print_hex_dump() is too low-level to be doing allocations. 
> For example, one could easily choose to call print_hex_dump() at oops time,
> and then what happens if we oops in kmalloc() (as we often do...)?
> 
> You could trim linebuf[] to 80 chars or so.  Extra points for making it
> very clear when someone tries to exceed that - strcpy(linebuf, "stop being
> stupid").

No extra points, but here's a revised patch to hexdump against
Linus' current:

hex_dump_to_buffer:
  Removes casts to type for non-1 group sizes
    Used by: fs/ext(3|4)/super.c, fs/jfs
    If someone really dislikes this change, please say so.
    I think casting to type in a hex dump is odd, especially
    for mixed type structures.
    If you want an array of type dumper, it probably
    shouldn't be called hex_dump_to_buffer.
  Groups by arbitrary size

print_hex_dump:
  Removes rowsize argument
  Reduces linebuf stack use to ~120 bytes
    (prefix:25 + address:20 + data:48 + ascii:20)
  Aligns multiline ascii output
  Changes return to size_t, number of bytes actually output

include/linux/kernel.h
  Removes hex_asc define
  Updates hex_dump prototypes

The rest are trivial conversions to new argument list.
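
For readers following along without the full diff, the idea of formatting one row into a small fixed buffer can be sketched in userspace roughly as follows. This is an illustration under assumptions, not the code in this patch: the names sketch_put and sketch_hex_line, the 16-byte row, and the truncation behaviour are all made up for the example.

```c
#include <ctype.h>
#include <stddef.h>

#define SKETCH_ROWSIZE 16	/* bytes per line, like ROWSIZE in the patch */

/* Append one character if it fits, always leaving room for the NUL. */
static int sketch_put(char *dst, size_t size, size_t *pos, char c)
{
	if (*pos + 1 >= size)
		return 0;
	dst[(*pos)++] = c;
	dst[*pos] = '\0';
	return 1;
}

/*
 * Userspace illustration only (not the kernel's hex_dump_to_buffer):
 * formats at most SKETCH_ROWSIZE bytes as hex in groups of groupsize
 * bytes, optionally followed by an aligned ASCII column.
 * Returns the number of input bytes whose hex form fit in linebuf.
 */
static size_t sketch_hex_line(const void *buf, size_t len, size_t groupsize,
			      char *linebuf, size_t linebuflen, int ascii)
{
	static const char digits[] = "0123456789abcdef";
	const unsigned char *p = buf;
	size_t pos = 0, hexwidth, i;

	if (linebuflen == 0)
		return 0;
	linebuf[0] = '\0';
	if (groupsize == 0)
		groupsize = 1;
	if (len > SKETCH_ROWSIZE)
		len = SKETCH_ROWSIZE;

	for (i = 0; i < len; i++) {
		int sep = (i != 0 && i % groupsize == 0);

		/* Stop before a byte that would not fit whole. */
		if (pos + 2 + sep >= linebuflen)
			return i;
		if (sep)
			sketch_put(linebuf, linebuflen, &pos, ' ');
		sketch_put(linebuf, linebuflen, &pos, digits[p[i] >> 4]);
		sketch_put(linebuf, linebuflen, &pos, digits[p[i] & 0x0f]);
	}
	if (!ascii)
		return len;

	/* Pad to the width of a full row so the ASCII column lines up. */
	hexwidth = 2 * SKETCH_ROWSIZE + (SKETCH_ROWSIZE - 1) / groupsize;
	while (pos < hexwidth + 2)
		if (!sketch_put(linebuf, linebuflen, &pos, ' '))
			return len;
	for (i = 0; i < len; i++)
		if (!sketch_put(linebuf, linebuflen, &pos,
				isprint(p[i]) ? (char)p[i] : '.'))
			break;
	return len;
}
```

In this sketch a full 16-byte row with groupsize 1 and the ASCII column needs 66 bytes including the NUL, so the data part fits in a buffer of about 80 chars with no allocation.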

size before:
   text    data     bss     dec     hex filename
   1142       0       0    1142     476 lib/hexdump.o

size after:
   text    data     bss     dec     hex filename
    823       0       0     823     337 lib/hexdump.o

Signed-off-by: Joe Perches <[EMAIL PROTECTED]>
---
 include/linux/kernel.h  |   13 +-
 lib/hexdump.c   |  164 ---
 drivers/mtd/ubi/debug.c |2 +-
 drivers/mtd/ubi/io.c|2 +-
 drivers/net/wireless/iwlwifi/iwl3945-base.c |4 +-
 drivers/net/wireless/iwlwifi/iwl4965-base.c |4 +-
 drivers/scsi/ide-scsi.c |8 +-
 drivers/usb/gadget/file_storage.c   |4 +-
 fs/ext3/super.c |6 +-
 fs/ext4/super.c |6 +-
 fs/jffs2/wbuf.c |4 +-
 fs/jfs/jfs_imap.c   |2 +-
 fs/jfs/jfs_logmgr.c |6 +-
 fs/jfs/jfs_metapage.c   |2 +-
 fs/jfs/jfs_txnmgr.c |8 +-
 fs/jfs/xattr.c  |4 +-
 16 files changed, 110 insertions(+), 129 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 94bc996..ab45524 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -248,15 +248,14 @@ enum {
DUMP_PREFIX_ADDRESS,
DUMP_PREFIX_OFFSET
 };
-extern void hex_dump_to_buffer(const void *buf, size_t len,
-   int rowsize, int groupsize,
-   char *linebuf, size_t linebuflen, bool ascii);
+extern size_t hex_dump_to_buffer(const void *buf, size_t len,
+size_t rowsize, size_t groupsize,
+char *linebuf, size_t linebuflen, bool ascii);
 extern void print_hex_dump(const char *level, const char *prefix_str,
-   int prefix_type, int rowsize, int groupsize,
-   const void *buf, size_t len, bool ascii);
+  int prefix_type, size_t groupsize,
+  const void *buf, size_t len, bool ascii);
 extern void print_hex_dump_bytes(const char *prefix_str, int prefix_type,
-   const void *buf, size_t len);
-#define hex_asc(x) "0123456789abcdef"[x]
+const void *buf, size_t len);
 
 #define pr_emerg(fmt, arg...) \
printk(KERN_EMERG fmt, ##arg)
diff --git a/lib/hexdump.c b/lib/hexdump.c
index 3435465..df82012 100644
--- a/lib/hexdump.c
+++ b/lib/hexdump.c
@@ -12,18 +12,21 @@
 #include 
 #include 
 
+#define ROWSIZE ((size_t)16)
+#define MAX_PREFIX_LEN ((size_t)20)
+
 /**
  * hex_dump_to_buffer - convert a blob of data to "hex ASCII" in memory
  * @buf: data blob to dump
  * @len: number of bytes in the @buf
- * @rowsize: number of bytes to print per line; must be 16 or 32
+ * @rowsize: maximum number of bytes to output (aligns ascii)
  * @groupsize: number of bytes to print at a time (1, 2, 4, 8; default = 1)
  * @linebuf: where to put the converted data
  * @linebuflen: total size of @linebuf, including space for terminating NUL
  * @ascii: include ASCII after the hex output
  *
  * hex_dump_to_buffer() works on one "line" of output at a time, i.e.,
- * 16 or 32 bytes of input data converted to hex + ASCII output.
+ * input data converted to hex + ASCII output.
  *
  * Given a buffer of u8 data, hex_dump_to_buffer() converts the input data
  * to a hex + ASCII 

Re: sym53c8xx2: incredible sloth after parity error / SCSI bus reset

2007-12-06 Thread Andrew Morton
On Sun, 02 Dec 2007 00:14:21 +
Nix <[EMAIL PROTECTED]> wrote:

> About once a year I get a SCSI parity error on one of my systems (the
> only one with SCSI). I presume the cabling is substandard, but given my
> coordination deficits and the rarity of the errors I'd do far more
> damage replacing it than leaving it be.
> 
> I had one of these today.
> 
> The system (2.6.23.9) spotted the error, and seemingly recovered:
> 
> Dec  1 12:53:40 loki warning: kernel: sym0: SCSI parity error detected: 
> SCR1=132 DBC=5000 SBCL=0
> Dec  1 12:53:40 loki warning: kernel: sym0:0: ERROR (81:0) (8-0-0) (10/9d/0) 
> @ (mem c2800048:).
> Dec  1 12:53:40 loki warning: kernel: sym0: regdump: da 00 00 9d 47 10 00 0e 
> 00 08 80 00 80 00 0f 0a d0 58 3c 01 02 ff ff ff.
> Dec  1 12:53:40 loki warning: kernel: sym0: SCSI BUS reset detected.
> Dec  1 12:53:40 loki notice: kernel: sym0: SCSI BUS has been reset.
> 
> However, after that reset I/O to any device on that controller is
> *incredibly* slow.
> 
> A monthly RAID check kicked off shortly afterwards and provided my first
> clue. Load average >15, and:
> 
> md1 : active raid5 sda6[0] hdc5[3] sdb6[1]
>   76807296 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
>   [>]  check = 42.8% (16450780/38403648) 
> finish=253.3min speed=1442K/sec
> 
> 1442Kb/s is a bit less than I'd expect from a three-drive array with
> disks capable of 40Mb/s easily.
> 
> /dev/sda:
>  Timing buffered disk reads:8 MB in  3.50 seconds =   2.29 MB/sec
> 
> A somewhat slower ATAPI disk on the same machine:
> 
> /dev/hdc:
>  Timing buffered disk reads:  110 MB in  3.05 seconds =  36.08 MB/sec
> 
> So, um, what could cause this? Can I speed it up again other than by
> rebooting (which I'm just about to do, but it is annoying).
> 

cc linux-scsi.

Nothing is likely to happen.  Please raise a report at bugzilla.kernel.org.


Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-06 Thread Andrew Morton
On Thu, 6 Dec 2007 18:16:12 -0600 (CST)
[EMAIL PROTECTED] (Bob Tracy) wrote:

> OK.  Finally have this thing painted into a corner: git has identified
> 6f37ac793d6ba7b35d338f791974166f67fdd9ba as the first bad commit.
> 
> From "git bisect log", this corresponds to
> 
> # bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of 
> master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
> 
> Here's the full log:
> 
> git-bisect start
> # good: [9aae299f7fd1888ea3a195cfe0edef17bb647415] Linux 2.6.24-rc2
> git-bisect good 9aae299f7fd1888ea3a195cfe0edef17bb647415
> # bad: [f05092637dc0d9a3f2249c9b283b973e6e96b7d2] Linux 2.6.24-rc3
> git-bisect bad f05092637dc0d9a3f2249c9b283b973e6e96b7d2
> # good: [e6a5c27f3b0fef72e528fc35e343af4b2db790ff] Merge branch 'for-linus' 
> of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
> git-bisect good e6a5c27f3b0fef72e528fc35e343af4b2db790ff
> # good: [42614fcde7bfdcbe43a7b17035c167dfebc354dd] vmstat: fix section 
> mismatch warning
> git-bisect good 42614fcde7bfdcbe43a7b17035c167dfebc354dd
> # bad: [a052f4473603765eb6b4c19754689977601dc1d1] Merge 
> git://git.kernel.org/pub/scm/linux/kernel/git/sam/x86
> git-bisect bad a052f4473603765eb6b4c19754689977601dc1d1
> # good: [d8e5219f9f5ca7518eb820db9f3d287a1d46fcf5] CRISv10 improve and bugfix 
> fasttimer
> git-bisect good d8e5219f9f5ca7518eb820db9f3d287a1d46fcf5
> # good: [d90bf5a976793edfa88d3bb2393f0231eb8ce1e5] [NET]: rt_check_expire() 
> can take a long time, add a cond_resched()
> git-bisect good d90bf5a976793edfa88d3bb2393f0231eb8ce1e5
> # good: [2a113281f5cd2febbab21a93c8943f8d3eece4d3] kconfig: use $K64BIT to 
> set 64BIT with all*config targets
> git-bisect good 2a113281f5cd2febbab21a93c8943f8d3eece4d3
> # good: [2e2cd8bad6e03ceea73495ee6d557044213d95de] CRISv10 memset library add 
> lineendings to asm
> git-bisect good 2e2cd8bad6e03ceea73495ee6d557044213d95de
> # bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of 
> master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
> git-bisect bad 6f37ac793d6ba7b35d338f791974166f67fdd9ba
> # good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap 
> INLINE and name timeval_cmp better
> git-bisect good 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3

commit 6f37ac793d6ba7b35d338f791974166f67fdd9ba
Merge: 2f1f53b... d90bf5a...
Author: Linus Torvalds <[EMAIL PROTECTED]>
Date:   Wed Nov 14 18:51:48 2007 -0800

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/n

* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: rt_check_expire() can take a long time, add a cond_resched()
  [ISDN] sc: Really, really fix warning
  [ISDN] sc: Fix sndpkt to have the correct number of arguments
  [TCP] FRTO: Clear frto_highmark only after process_frto that uses it
  [NET]: Remove notifier block from chain when register_netdevice_notifier f
  [FS_ENET]: Fix module build.
  [TCP]: Make sure write_queue_from does not begin with NULL ptr
  [TCP]: Fix size calculation in sk_stream_alloc_pskb
  [S2IO]: Fixed memory leak when MSI-X vector allocation fails
  [BONDING]: Fix resource use after free
  [SYSCTL]: Fix warning for token-ring from sysctl checker
  [NET] random : secure_tcp_sequence_number should not assume CONFIG_KTIME_S
  [IWLWIFI]: Not correctly dealing with hotunplug.
  [TCP] FRTO: Plug potential LOST-bit leak
  [TCP] FRTO: Limit snd_cwnd if TCP was application limited
  [E1000]: Fix schedule while atomic when called from mii-tool.
  [NETX]: Fix build failure added by 2.6.24 statistics cleanup.
  [EP93xx_ETH]: Build fix after 2.6.24 NAPI changes.
  [PKT_SCHED]: Check subqueue status before calling hard_start_xmit

I'm struggling to see how any of those could have broken block device
mounting on alpha.  Are you sure you bisected right?



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-06 Thread Rafael J. Wysocki
On Friday, 7 of December 2007, Bob Tracy wrote:
> OK.  Finally have this thing painted into a corner: git has identified
> 6f37ac793d6ba7b35d338f791974166f67fdd9ba as the first bad commit.
> 
> From "git bisect log", this corresponds to 
> 
> # bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of 
> master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Something's gone wrong, as this commit doesn't modify code.

> Here's the full log:
> 
> git-bisect start
> # good: [9aae299f7fd1888ea3a195cfe0edef17bb647415] Linux 2.6.24-rc2
> git-bisect good 9aae299f7fd1888ea3a195cfe0edef17bb647415
> # bad: [f05092637dc0d9a3f2249c9b283b973e6e96b7d2] Linux 2.6.24-rc3
> git-bisect bad f05092637dc0d9a3f2249c9b283b973e6e96b7d2
> # good: [e6a5c27f3b0fef72e528fc35e343af4b2db790ff] Merge branch 'for-linus' 
> of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
> git-bisect good e6a5c27f3b0fef72e528fc35e343af4b2db790ff
> # good: [42614fcde7bfdcbe43a7b17035c167dfebc354dd] vmstat: fix section 
> mismatch warning
> git-bisect good 42614fcde7bfdcbe43a7b17035c167dfebc354dd
> # bad: [a052f4473603765eb6b4c19754689977601dc1d1] Merge 
> git://git.kernel.org/pub/scm/linux/kernel/git/sam/x86
> git-bisect bad a052f4473603765eb6b4c19754689977601dc1d1
> # good: [d8e5219f9f5ca7518eb820db9f3d287a1d46fcf5] CRISv10 improve and bugfix 
> fasttimer
> git-bisect good d8e5219f9f5ca7518eb820db9f3d287a1d46fcf5
> # good: [d90bf5a976793edfa88d3bb2393f0231eb8ce1e5] [NET]: rt_check_expire() 
> can take a long time, add a cond_resched()
> git-bisect good d90bf5a976793edfa88d3bb2393f0231eb8ce1e5
> # good: [2a113281f5cd2febbab21a93c8943f8d3eece4d3] kconfig: use $K64BIT to 
> set 64BIT with all*config targets
> git-bisect good 2a113281f5cd2febbab21a93c8943f8d3eece4d3
> # good: [2e2cd8bad6e03ceea73495ee6d557044213d95de] CRISv10 memset library add 
> lineendings to asm
> git-bisect good 2e2cd8bad6e03ceea73495ee6d557044213d95de
> # bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of 
> master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
> git-bisect bad 6f37ac793d6ba7b35d338f791974166f67fdd9ba
> # good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap 
> INLINE and name timeval_cmp better
> git-bisect good 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3


Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-06 Thread Bob Tracy
OK.  Finally have this thing painted into a corner: git has identified
6f37ac793d6ba7b35d338f791974166f67fdd9ba as the first bad commit.

From "git bisect log", this corresponds to

# bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of 
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Here's the full log:

git-bisect start
# good: [9aae299f7fd1888ea3a195cfe0edef17bb647415] Linux 2.6.24-rc2
git-bisect good 9aae299f7fd1888ea3a195cfe0edef17bb647415
# bad: [f05092637dc0d9a3f2249c9b283b973e6e96b7d2] Linux 2.6.24-rc3
git-bisect bad f05092637dc0d9a3f2249c9b283b973e6e96b7d2
# good: [e6a5c27f3b0fef72e528fc35e343af4b2db790ff] Merge branch 'for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
git-bisect good e6a5c27f3b0fef72e528fc35e343af4b2db790ff
# good: [42614fcde7bfdcbe43a7b17035c167dfebc354dd] vmstat: fix section mismatch 
warning
git-bisect good 42614fcde7bfdcbe43a7b17035c167dfebc354dd
# bad: [a052f4473603765eb6b4c19754689977601dc1d1] Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/sam/x86
git-bisect bad a052f4473603765eb6b4c19754689977601dc1d1
# good: [d8e5219f9f5ca7518eb820db9f3d287a1d46fcf5] CRISv10 improve and bugfix 
fasttimer
git-bisect good d8e5219f9f5ca7518eb820db9f3d287a1d46fcf5
# good: [d90bf5a976793edfa88d3bb2393f0231eb8ce1e5] [NET]: rt_check_expire() can 
take a long time, add a cond_resched()
git-bisect good d90bf5a976793edfa88d3bb2393f0231eb8ce1e5
# good: [2a113281f5cd2febbab21a93c8943f8d3eece4d3] kconfig: use $K64BIT to set 
64BIT with all*config targets
git-bisect good 2a113281f5cd2febbab21a93c8943f8d3eece4d3
# good: [2e2cd8bad6e03ceea73495ee6d557044213d95de] CRISv10 memset library add 
lineendings to asm
git-bisect good 2e2cd8bad6e03ceea73495ee6d557044213d95de
# bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of 
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
git-bisect bad 6f37ac793d6ba7b35d338f791974166f67fdd9ba
# good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap 
INLINE and name timeval_cmp better
git-bisect good 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3

-- 

Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [OpenFCoE PATCH] [PATCH] Performance improvement, combine received data copy with CRC.

2007-12-06 Thread Joe Eykholt
Rob Love wrote:
> Joe Eykholt wrote:
>> [PATCH] Performance improvement, combine received data copy with CRC.
>>   
> Shouldn't we remove  openfc_cp_to_user() if we're moving that
> functionality into openfc_scsi_recv_data()? 

Yes.  That was an oversight.  I did intend to remove it.
Do you want a new patch for that or will you take care of it?

Joe

> I don't see anything calling
> it anymore. My guess is that you're leaving it in there for future usage,
> but my argument would be to remove it since it's not being used and then
> add it back if needed in the future.


Re: [OpenFCoE PATCH] [PATCH] Performance improvement, combine received data copy with CRC.

2007-12-06 Thread Rob Love

Joe Eykholt wrote:

[PATCH] Performance improvement, combine received data copy with CRC.
  
Shouldn't we remove  openfc_cp_to_user() if we're moving that 
functionality into openfc_scsi_recv_data()? I don't see anything calling 
it anymore. My guess is that you're leaving it in there for future usage,
but my argument would be to remove it since it's not being used and then 
add it back if needed in the future.



Re: everything in wait_for_completion, what is my system doing?

2007-12-06 Thread Andrew Morton
On Wed, 5 Dec 2007 21:44:54 +0100
Bernd Schubert <[EMAIL PROTECTED]> wrote:

> after scsi-recovery a system here went into some kind of lock-up, everything 
> seems to be in wait_for_completion(). Please see the attached 
> blocked_states.txt and all_states.txt files.
> This is 2.6.22.12, I can easily find out the line numbers if required.
> 
> Any help is highly appreciated.
> 
> 

Please cc linux-scsi on scsi-related reports.

> 
> 
> [blocked_states.txt  text/plain (20.5KB)]
> [generate break]
> [ 1818.566436] SysRq : Show Blocked State
> [ 1818.570260]
> [ 1818.570261]  free
> sibling
> [ 1818.579253]   task PCstack   pid father child 
> younger older
> [ 1818.586987] events/7  D 0155dd642280 026  2 (L-TLB)
> [ 1818.593747]  81012b529ac0 0046  
> 810128280d18
> [ 1818.601321]  8100ba2376f8 81012b689630 81012aff76b0 
> 00078023e215
> [ 1818.608870]  00010003ca14  810001065400 
> 000780430c13
> [ 1818.616222] Call Trace:
> [ 1818.618925]  [] io_schedule+0x28/0x36
> [ 1818.624207]  [] get_request_wait+0x104/0x158
> [ 1818.630112]  [] blk_get_request+0x36/0x6b
> [ 1818.635755]  [] scsi_execute+0x51/0x129
> [ 1818.641240]  [] :scsi_transport_spi:spi_execute+0x87/0xf8
> [ 1818.648271]  [] 
> :scsi_transport_spi:spi_dv_device_echo_buffer+0x181/0x27d
> [ 1818.656739]  [] 
> :scsi_transport_spi:spi_dv_retrain+0x4e/0x240
> [ 1818.664139]  [] 
> :scsi_transport_spi:spi_dv_device+0x615/0x69c
> [ 1818.671542]  [] :mptspi:mptspi_dv_device+0xb3/0x14b
> [ 1818.678042]  [] 
> :mptspi:mptspi_dv_renegotiate_work+0xcb/0xef
> [ 1818.685348]  [] run_workqueue+0x8e/0x120
> [ 1818.690905]  [] worker_thread+0x106/0x117
> [ 1818.696540]  [] kthread+0x4b/0x82
> [ 1818.701474]  [] child_rip+0xa/0x12
> [ 1818.706495]
> [ 1818.708022] unionfs-fuse- D 01a76ef63463 0  1119  1 (NOTLB)
> [ 1818.714764]  810129765988 0082  
> 80337e22
> [ 1818.722329]  8101297658c8 81012b652f20 810129eec810 
> 0006
> [ 1818.729895]  00010005204e  81000105c400 
> 000680337c3e
> [ 1818.737249] Call Trace:
> [ 1818.739953]  [] schedule_timeout+0x8a/0xb6
> [ 1818.745673]  [] io_schedule_timeout+0x28/0x36
> [ 1818.751664]  [] congestion_wait+0x9d/0xc2
> [ 1818.757300]  [] 
> balance_dirty_pages_ratelimited_nr+0x196/0x22f
> [ 1818.764781]  [] generic_file_buffered_write+0x52a/0x60d
> [ 1818.771641]  [] 
> __generic_file_aio_write_nolock+0x45a/0x491
> [ 1818.778852]  [] generic_file_aio_write+0x61/0xc1
> [ 1818.785101]  [] nfs_file_write+0x138/0x1b7
> [ 1818.790822]  [] do_sync_write+0xcc/0x112
> [ 1818.796372]  [] vfs_write+0xc3/0x165
> [ 1818.801575]  [] sys_pwrite64+0x68/0x96
> [ 1818.806959]  [] system_call+0x7e/0x83
> [ 1818.812250]  [<2b4eeec3ea73>]
>
> [snippage]
>

Possibly your device driver had conniptions and stopped generating
completion interrupts.

Which driver is in use?

I don't suppose it is repeatable.


[PATCH 2.6.23] 3w-xxxx: Fix bad unaligned accesses on alpha SMP

2007-12-06 Thread Gabriele Gorla
disable packing of the TAG_TW_Device_Extension
structure to prevent kernel unaligned accesses when
accessing the spinlock inside the ioctl_wqueue structure.
Fixes smartmontools kernel panic on alpha SMP

Signed-off-by: Gabriele Gorla <[EMAIL PROTECTED]>
---

--- linux-2.6.23/drivers/scsi/3w-.h 2007-10-09 13:31:38.0 -0700
+++ linux-2.6.23a/drivers/scsi/3w-.h 2007-12-06 17:46:05.0 -0800
@@ -392,6 +392,8 @@
unsigned char padding[12];
 } TW_Passthru;

+#pragma pack()
+
 typedef struct TAG_TW_Device_Extension {
u32 base_addr;
unsigned long   *alignment_virtual_address[TW_Q_LENGTH];
@@ -430,6 +432,4 @@
wait_queue_head_t   ioctl_wqueue;
 } TW_Device_Extension;

-#pragma pack()
-
 #endif /* _3W__H */
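
The effect this patch works around can be reproduced in a small userspace sketch. It assumes the header's packed region uses a 1-byte packing such as #pragma pack(1); the structure names below are hypothetical stand-ins, not the driver's types.

```c
#include <stddef.h>

/* Hypothetical stand-in for a wait queue head that embeds a lock word. */
struct sketch_waitq {
	long lock;
	void *head;
};

/* Inside a pack(1) region every member is placed at the next free byte. */
#pragma pack(1)
struct sketch_packed {
	unsigned char flag;
	struct sketch_waitq wq;		/* lands at offset 1: misaligned */
};
#pragma pack()

/* Outside the region the compiler pads so wq keeps its natural alignment. */
struct sketch_natural {
	unsigned char flag;
	struct sketch_waitq wq;
};

static size_t sketch_packed_offset(void)
{
	return offsetof(struct sketch_packed, wq);
}

static size_t sketch_natural_offset(void)
{
	return offsetof(struct sketch_natural, wq);
}
```

On a strict-alignment machine such as alpha, atomic or locked operations on the lock word inside the packed variant can trap, which would match the unaligned accesses described above. Moving #pragma pack() ahead of the structure restores the natural layout.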




Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-06 Thread Bob Tracy
Andrew Morton wrote:
> commit 6f37ac793d6ba7b35d338f791974166f67fdd9ba
> Merge: 2f1f53b... d90bf5a...
> Author: Linus Torvalds <[EMAIL PROTECTED]>
> Date:   Wed Nov 14 18:51:48 2007 -0800
> 
> Merge branch 'master' of 
> master.kernel.org:/pub/scm/linux/kernel/git/davem/n
> 
> * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
>   [NET]: rt_check_expire() can take a long time, add a cond_resched()
>   [ISDN] sc: Really, really fix warning
>   [ISDN] sc: Fix sndpkt to have the correct number of arguments
>   [TCP] FRTO: Clear frto_highmark only after process_frto that uses it
>   [NET]: Remove notifier block from chain when 
> register_netdevice_notifier f
>   [FS_ENET]: Fix module build.
>   [TCP]: Make sure write_queue_from does not begin with NULL ptr
>   [TCP]: Fix size calculation in sk_stream_alloc_pskb
>   [S2IO]: Fixed memory leak when MSI-X vector allocation fails
>   [BONDING]: Fix resource use after free
>   [SYSCTL]: Fix warning for token-ring from sysctl checker
>   [NET] random : secure_tcp_sequence_number should not assume 
> CONFIG_KTIME_S
>   [IWLWIFI]: Not correctly dealing with hotunplug.
>   [TCP] FRTO: Plug potential LOST-bit leak
>   [TCP] FRTO: Limit snd_cwnd if TCP was application limited
>   [E1000]: Fix schedule while atomic when called from mii-tool.
>   [NETX]: Fix build failure added by 2.6.24 statistics cleanup.
>   [EP93xx_ETH]: Build fix after 2.6.24 NAPI changes.
>   [PKT_SCHED]: Check subqueue status before calling hard_start_xmit
> 
> I'm struggling to see how any of those could have broken block device
> mounting on alpha.  Are you sure you bisected right?

Based on what's in that commit, it *does* appear something went wrong
with bisection.  If the implicated commit is the next one in time
sequence relative to

# good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap 
INLINE and name timeval_cmp better

then the test of whether I bisected correctly is as simple as applying
the commit and seeing if things break, because I'm running on the
kernel corresponding to 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 right
now.  Let me give that a try and I'll report back.  Worst case, I'll
have to start over and write off the past four days...

Sorry about this...

-- 

Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War



Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-06 Thread Bob Tracy
I wrote:
> If the implicated commit is the next one in time
> sequence relative to
> 
> # good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap 
> INLINE and name timeval_cmp better
> 
> then the test of whether I bisected correctly is as simple as applying
> the commit and seeing if things break, because I'm running on the
> kernel corresponding to 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 right
> now.  Let me give that a try and I'll report back.

Verified that 6f37ac793d6ba7b35d338f791974166f67fdd9ba is the next
commit after the "good" kernel I'm running now.  The build is running,
and I should have an answer for us in a few hours.

-- 

Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War
