Andrew Morton wrote:
> On Tue, 20 Mar 2007 00:25:02 +0100
> Andreas Steinmetz <[EMAIL PROTECTED]> wrote:
> 
>> Mike Christie wrote:
>>> Mike Christie wrote:
>>>> James Bottomley wrote:
>>>>> On Mon, 2007-03-19 at 12:49 -0500, Mike Christie wrote:
>>>>>>> I can't even say if the tapes are written correctly as I can't read them
>>>>>>> (one does not reboot production machines back to 2.4.x just to try to
>>>>>>> read a backup tape - I don't have 2.6.x older than 2.6.20 on these
>>>>>>> machines).
>>>>>> Could you try this patch
>>>>>> http://marc.info/?l=linux-scsi&m=116464965414878&w=2
>>>>>> I thought st was modified to not send offsets in the last elements but
>>>>>> it looks like it wasn't.
>>>>> Actually, there are two patches in the email referred to.  If the
>>>>> analysis that we're passing NULL to mempool_free is correct, it should
>>>>> be the second one that fixes the problem (the one that checks
>>>>> bio->bi_io_vec before freeing it).  Which would mean we have a
>>>>> nr_vecs==0 bio generated by the tar somehow.
>>>>>
>>>> I think we might only need the first patch if the problem is similar to
>>>> what the lsi guys were seeing. I thought the problem is that we are not
>>>> estimating how large the transfer is correctly because we do not take
>>>> into account offsets at the end. This results in nr_vecs being zero when
>>>> it should be a valid value. I thought Kai's patch:
>>>> http://bugzilla.kernel.org/show_bug.cgi?id=7919
>>>> http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=9abe16c670bd3d4ab5519257514f9f291383d104
>>>> fixed the problem on st's side,
>>> Oh, I noticed that the subject for the mail references 2.6.30.3 and the
>>> patch for st in the bugzilla did not make into 2.6.20 and is not in .3.
>>> Could we try the st patch in the bugzilla first?
>> Ok, the st patch from bugzilla solves the problem (tested on both
>> affected machines).
> 
> 
> If you're referring to the below patch then it's already in mainline, and
> has been for a month.
> 

Yes, that's the patch I'm referring to.

> Have you tested 2.6.21-rc4?  If not, please do so.
> 

Sorry, this is not possible on these machines. They are production
servers and every problem on them that cannot be easily solved via
remote access is a 40km (one way) drive in the middle of the night.

> Perhaps we should merge this into 2.6.20.x?
> 

I would suggest so.

> 
> 
> commit 9abe16c670bd3d4ab5519257514f9f291383d104
> Author: Kai Makisara <[EMAIL PROTECTED]>
> Date:   Sat Feb 3 13:21:29 2007 +0200
> 
>     [SCSI] st: fix Tape dies if wrong block size used, bug 7919
>     
>     On Thu, 1 Feb 2007, Andrew Morton wrote:
>     > On Thu, 1 Feb 2007 15:34:29 -0800
>     > [EMAIL PROTECTED] wrote:
>     >
>     > > http://bugzilla.kernel.org/show_bug.cgi?id=7919
>     > >
>     > >            Summary: Tape dies if wrong block size used
>     > >     Kernel Version: 2.6.20-rc5
>     > >             Status: NEW
>     > >           Severity: normal
>     > >              Owner: [EMAIL PROTECTED]
>     > >          Submitter: [EMAIL PROTECTED]
>     > >
>     > >
>     > > Most recent kernel where this bug did *NOT* occur: 2.6.17.14
>     > >
>     > > Other Kernels Tested and Results:
>     > >
>     > >     OK 2.6.15.7
>     > >     OK 2.6.16.37
>     > >     OK 2.6.17.14
>     > >     BAD 2.6.18.6
>     > >     BAD 2.6.18-1.2869.fc6
>     > >     BAD 2.6.19.2 +
>     > >     BAD 2.6.20-rc5
>     > >
>     > > NOTE: 2.6.18-1.2869.fc6 is a Fedora modified kernel, all others are 
> from kernel.org
>     > >
>     ...
>     > > Steps to reproduce:
>     > > Get a Adaptec AHA-2940U/UW/D / AIC-7881U card and a tape drive,
>     > > install a recent kernel
>     > > set the tape block size - mt setblk 4096
>     > > read from or write to tape using wrong block size - tar -b 7 -cvf 
> /dev/tape foo
>     > >
>     Write does not trigger this bug because the driver refuses in fixed block
>     mode writes that are not a multiple of the block size. Read does trigger
>     it in my system.
>     
>     The bug is not associated with any specific HBA. st tries to do direct i/o
>     in fixed block mode with reads that are not a multiple of tape block size.
>     
>     The patch in this message fixes the st problem by switching to using the
>     driver buffer up to the next close of the device file in fixed block mode
>     if the user asks for a read like this.
>     
>     I don't know why the bug has surfaced only after 2.6.17 although the st
>     problem is old. There may be another bug in the block subsystem and this
>     patch works around it. However, the patch fixes a problem in st and in
>     this way it is a valid fix.
>     
>     This patch may also fix the bug 7900.
>     
>     The patch compiles and is lightly tested.
>     
>     Signed-off-by: Kai Makisara <[EMAIL PROTECTED]>
>     Signed-off-by: James Bottomley <[EMAIL PROTECTED]>
> 
> diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
> index e016e09..fba8b20 100644
> --- a/drivers/scsi/st.c
> +++ b/drivers/scsi/st.c
> @@ -9,7 +9,7 @@
>     Steve Hirsch, Andreas Koppenh"ofer, Michael Leodolter, Eyal Lebedinsky,
>     Michael Schaefer, J"org Weule, and Eric Youngdale.
>  
> -   Copyright 1992 - 2006 Kai Makisara
> +   Copyright 1992 - 2007 Kai Makisara
>     email [EMAIL PROTECTED]
>  
>     Some small formal changes - aeb, 950809
> @@ -17,7 +17,7 @@
>     Last modified: 18-JAN-1998 Richard Gooch <[EMAIL PROTECTED]> Devfs support
>   */
>  
> -static const char *verstr = "20061107";
> +static const char *verstr = "20070203";
>  
>  #include <linux/module.h>
>  
> @@ -1168,6 +1168,7 @@ static int st_open(struct inode *inode, 
>               STps = &(STp->ps[i]);
>               STps->rw = ST_IDLE;
>       }
> +     STp->try_dio_now = STp->try_dio;
>       STp->recover_count = 0;
>       DEB( STp->nbr_waits = STp->nbr_finished = 0;
>            STp->nbr_requests = STp->nbr_dio = STp->nbr_pages = 
> STp->nbr_combinable = 0; )
> @@ -1400,9 +1401,9 @@ static int setup_buffering(struct scsi_t
>       struct st_buffer *STbp = STp->buffer;
>  
>       if (is_read)
> -             i = STp->try_dio && try_rdio;
> +             i = STp->try_dio_now && try_rdio;
>       else
> -             i = STp->try_dio && try_wdio;
> +             i = STp->try_dio_now && try_wdio;
>  
>       if (i && ((unsigned long)buf & queue_dma_alignment(
>                                       STp->device->request_queue)) == 0) {
> @@ -1599,7 +1600,7 @@ st_write(struct file *filp, const char _
>                       STm->do_async_writes && STps->eof < ST_EOM_OK;
>  
>               if (STp->block_size != 0 && STm->do_buffer_writes &&
> -                 !(STp->try_dio && try_wdio) && STps->eof < ST_EOM_OK &&
> +                 !(STp->try_dio_now && try_wdio) && STps->eof < ST_EOM_OK &&
>                   STbp->buffer_bytes < STbp->buffer_size) {
>                       STp->dirty = 1;
>                       /* Don't write a buffer that is not full enough. */
> @@ -1769,7 +1770,7 @@ static long read_tape(struct scsi_tape *
>       if (STp->block_size == 0)
>               blks = bytes = count;
>       else {
> -             if (!(STp->try_dio && try_rdio) && STm->do_read_ahead) {
> +             if (!(STp->try_dio_now && try_rdio) && STm->do_read_ahead) {
>                       blks = (STp->buffer)->buffer_blocks;
>                       bytes = blks * STp->block_size;
>               } else {
> @@ -1948,10 +1949,12 @@ st_read(struct file *filp, char __user *
>               goto out;
>  
>       STm = &(STp->modes[STp->current_mode]);
> -     if (!(STm->do_read_ahead) && STp->block_size != 0 &&
> -         (count % STp->block_size) != 0) {
> -             retval = (-EINVAL);     /* Read must be integral number of 
> blocks */
> -             goto out;
> +     if (STp->block_size != 0 && (count % STp->block_size) != 0) {
> +             if (!STm->do_read_ahead) {
> +                     retval = (-EINVAL);     /* Read must be integral number 
> of blocks */
> +                     goto out;
> +             }
> +             STp->try_dio_now = 0;  /* Direct i/o can't handle split blocks 
> */
>       }
>  
>       STps = &(STp->ps[STp->partition]);
> diff --git a/drivers/scsi/st.h b/drivers/scsi/st.h
> index 05a5cae..50f3deb 100644
> --- a/drivers/scsi/st.h
> +++ b/drivers/scsi/st.h
> @@ -117,7 +117,8 @@ struct scsi_tape {
>       unsigned char cln_sense_value;
>       unsigned char cln_sense_mask;
>       unsigned char use_pf;                   /* Set Page Format bit in all 
> mode selects? */
> -     unsigned char try_dio;                  /* try direct i/o? */
> +     unsigned char try_dio;                  /* try direct i/o in general? */
> +     unsigned char try_dio_now;              /* try direct i/o before next 
> close? */
>       unsigned char c_algo;                   /* compression algorithm */
>       unsigned char pos_unknown;                      /* after reset position 
> unknown */
>       int tape_type;
> 


-- 
Andreas Steinmetz                       SPAMmers use [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to