Re: [Linux-usb-users] Read errors on Flash Drive Transcend TS1GJF2A
On Wed, 5 Dec 2007, Boaz Harrosh wrote: Hi Alen Yes, I have not inspected sr.c very carefully, you are absolutely right. Could you submit a unified patch for sd, sr and scsi.c I have hit this bug 2 in my error injection tests. I was doing sg_chaining tests and now with the possibly very large requests the problem gets worse. At the time, I could not identify the exact problem, thanks I'd like to keep the sd and sr patches separate. The sd patch has already been tested and is known to fix an existing problem. The sr patch below is new and it has not been tested. All I know is that it compiles. Alan Stern Index: usb-2.6/drivers/scsi/scsi.c === --- usb-2.6.orig/drivers/scsi/scsi.c +++ usb-2.6/drivers/scsi/scsi.c @@ -700,6 +700,8 @@ void scsi_finish_command(struct scsi_cmn good_bytes = cmd-request_bufflen; if (cmd-request-cmd_type != REQ_TYPE_BLOCK_PC) { + if (cmd-resid 0 cmd-resid good_bytes) + good_bytes -= cmd-resid; drv = scsi_cmd_to_driver(cmd); if (drv-done) good_bytes = drv-done(cmd); Index: usb-2.6/drivers/scsi/sr.c === --- usb-2.6.orig/drivers/scsi/sr.c +++ usb-2.6/drivers/scsi/sr.c @@ -218,15 +218,29 @@ static int sr_media_change(struct cdrom_ static int sr_done(struct scsi_cmnd *SCpnt) { int result = SCpnt-result; - int this_count = SCpnt-request_bufflen; - int good_bytes = (result == 0 ? this_count : 0); - int block_sectors = 0; - long error_sector; + unsigned int xfer_size = SCpnt-request_bufflen; + unsigned int good_bytes = (result == 0 ? xfer_size : 0); + unsigned long start_lba = SCpnt-request-sector; + u64 bad_lba; + unsigned long num_blocks; + unsigned long bad_sector; struct scsi_cd *cd = scsi_cd(SCpnt-request-rq_disk); + struct scsi_sense_hdr sshdr; + int sense_valid = 0; + int sense_deferred = 0; + int info_valid; #ifdef DEBUG printk(sr.c done: %x\n, result); #endif + if (result) { + sense_valid = scsi_command_normalize_sense(SCpnt, sshdr); + if (sense_valid) + sense_deferred = scsi_sense_is_deferred(sshdr); + } + if (driver_byte(result) != DRIVER_SENSE + (!sense_valid || sense_deferred)) + goto out; /* * Handle MEDIUM ERRORs or VOLUME OVERFLOWs that indicate partial @@ -234,60 +248,83 @@ static int sr_done(struct scsi_cmnd *SCp * care is taken to avoid unnecessary additional work such as * memcpy's that could be avoided. */ - if (driver_byte(result) != 0 /* An error occurred */ - (SCpnt-sense_buffer[0] 0x7f) == 0x70) { /* Sense current */ - switch (SCpnt-sense_buffer[2]) { - case MEDIUM_ERROR: - case VOLUME_OVERFLOW: - case ILLEGAL_REQUEST: - if (!(SCpnt-sense_buffer[0] 0x90)) - break; - error_sector = (SCpnt-sense_buffer[3] 24) | - (SCpnt-sense_buffer[4] 16) | - (SCpnt-sense_buffer[5] 8) | - SCpnt-sense_buffer[6]; - if (SCpnt-request-bio != NULL) - block_sectors = - bio_sectors(SCpnt-request-bio); - if (block_sectors 4) - block_sectors = 4; - if (cd-device-sector_size == 2048) - error_sector = 2; - error_sector = ~(block_sectors - 1); - good_bytes = (error_sector - SCpnt-request-sector) 9; - if (good_bytes 0 || good_bytes = this_count) - good_bytes = 0; - /* -* The SCSI specification allows for the value -* returned by READ CAPACITY to be up to 75 2K -* sectors past the last readable block. -* Therefore, if we hit a medium error within the -* last 75 2K sectors, we decrease the saved size -* value. -*/ - if (error_sector get_capacity(cd-disk) - cd-capacity - error_sector 4 * 75) - set_capacity(cd-disk, error_sector); + switch (sshdr.sense_key) { + case HARDWARE_ERROR: + case MEDIUM_ERROR: + case VOLUME_OVERFLOW: + case ILLEGAL_REQUEST: + if (!blk_fs_request(SCpnt-request)) + goto out; + info_valid =
Re: [Linux-usb-users] Read errors on Flash Drive Transcend TS1GJF2A
On Tue, Dec 04 2007 at 20:03 +0200, Alan Stern [EMAIL PROTECTED] wrote: On Tue, 4 Dec 2007, Boaz Harrosh wrote: perhaps below hunk should be added to your patch. It looks like a good idea. Was it decided when this data corruption bugfix is merged. I don't know -- I haven't heard anything back from James. Also in sr.c. It does the range check but it might enjoy the resid handling as well. I think the range checking in sr.c is completely wrong. The code doesn't check carefully to see that the error sector lies within the range of sectors being accessed; there's a possibility of overflow if the capacity is larger than 2**31 bytes. Also this line in particular makes no sense: error_sector = ~(block_sectors - 1); Like you said, using the residue value too wouldn't hurt. Furthermore the check for the Valid bit is done wrongly: if (!(SCpnt-sense_buffer[0] 0x90)) This will never be true because of the earlier test: if (driver_byte(result) != 0 /* An error occurred */ (SCpnt-sense_buffer[0] 0x7f) == 0x70) { /* Sense current */ It probably should test against 0x80 instead of 0x90. Alan Stern Hi Alen Yes, I have not inspected sr.c very carefully, you are absolutely right. Could you submit a unified patch for sd, sr and scsi.c I have hit this bug 2 in my error injection tests. I was doing sg_chaining tests and now with the possibly very large requests the problem gets worse. At the time, I could not identify the exact problem, thanks Boaz - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Linux-usb-users] Read errors on Flash Drive Transcend TS1GJF2A
On Mon, Nov 26 2007 at 17:35 +0200, Alan Stern [EMAIL PROTECTED] wrote: Not all devices correctly report the error-causing LBA in the Information field of their sense data -- even when they set the Valid bit. This patch (as1019) makes sd much more cautious about accepting the reported LBA. If the value isn't within the range of blocks accessed during the I/O operation it is rejected, and instead the driver will try looking at the residue field (which currently it ignores completely). This fixes a data-corruption bug reported here: http://marc.info/?t=11874576475r=1w=2 Signed-off-by: Alan Stern [EMAIL PROTECTED] CC: Luben Tuikov [EMAIL PROTECTED] --- This patch should be considered for inclusion in 2.6.24. The bug in question has always existed, as far as I know, but before 2.6.18 it was masked by a different bug. This doesn't use the new SCSI accessors. In the development trees I've seen, those accessors haven't yet been imported into sd.c. If the patch needs to be rebased, please let me know where to find the current sd source. Presumably sr should use the same algorithm. That's grist for another patch. Index: usb-2.6/drivers/scsi/sd.c === --- usb-2.6.orig/drivers/scsi/sd.c +++ usb-2.6/drivers/scsi/sd.c @@ -968,7 +968,17 @@ static int sd_done(struct scsi_cmnd *SCp /* This computation should always be done in terms of * the resolution of the device's medium. */ - good_bytes = (bad_lba - start_lba)*SCpnt-device-sector_size; + if (start_lba = bad_lba bad_lba start_lba + + (xfer_size / SCpnt-device-sector_size)) + good_bytes = SCpnt-device-sector_size * + (unsigned int) (bad_lba - start_lba); + + /* If the bad_lba value is no good, maybe the residue value + * is better. + */ + else if (SCpnt-resid 0 SCpnt-resid xfer_size) + good_bytes = (xfer_size - SCpnt-resid) + (- SCpnt-device-sector_size); break; case RECOVERED_ERROR: case NO_SENSE: - perhaps below hunk should be added to your patch. Was it decided when this data corruption bugfix is merged. Also in sr.c. It does the range check but it might enjoy the resid handling as well. - drivers/scsi/scsi.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 796c0bd..1a576bc 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -714,6 +714,8 @@ void scsi_finish_command(struct scsi_cmnd *cmd) good_bytes = cmd-request_bufflen; if (cmd-request-cmd_type != REQ_TYPE_BLOCK_PC) { + if ( cmd-resid 0 cmd-resid good_bytes) + good_bytes -= cmd-resid; drv = scsi_cmd_to_driver(cmd); if (drv-done) good_bytes = drv-done(cmd); - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Linux-usb-users] Read errors on Flash Drive Transcend TS1GJF2A
On Tue, 4 Dec 2007, Boaz Harrosh wrote: perhaps below hunk should be added to your patch. It looks like a good idea. Was it decided when this data corruption bugfix is merged. I don't know -- I haven't heard anything back from James. Also in sr.c. It does the range check but it might enjoy the resid handling as well. I think the range checking in sr.c is completely wrong. The code doesn't check carefully to see that the error sector lies within the range of sectors being accessed; there's a possibility of overflow if the capacity is larger than 2**31 bytes. Also this line in particular makes no sense: error_sector = ~(block_sectors - 1); Like you said, using the residue value too wouldn't hurt. Furthermore the check for the Valid bit is done wrongly: if (!(SCpnt-sense_buffer[0] 0x90)) This will never be true because of the earlier test: if (driver_byte(result) != 0 /* An error occurred */ (SCpnt-sense_buffer[0] 0x7f) == 0x70) { /* Sense current */ It probably should test against 0x80 instead of 0x90. Alan Stern - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Linux-usb-users] Read errors on Flash Drive Transcend TS1GJF2A
Not all devices correctly report the error-causing LBA in the Information field of their sense data -- even when they set the Valid bit. This patch (as1019) makes sd much more cautious about accepting the reported LBA. If the value isn't within the range of blocks accessed during the I/O operation it is rejected, and instead the driver will try looking at the residue field (which currently it ignores completely). This fixes a data-corruption bug reported here: http://marc.info/?t=11874576475r=1w=2 Signed-off-by: Alan Stern [EMAIL PROTECTED] CC: Luben Tuikov [EMAIL PROTECTED] --- This patch should be considered for inclusion in 2.6.24. The bug in question has always existed, as far as I know, but before 2.6.18 it was masked by a different bug. This doesn't use the new SCSI accessors. In the development trees I've seen, those accessors haven't yet been imported into sd.c. If the patch needs to be rebased, please let me know where to find the current sd source. Presumably sr should use the same algorithm. That's grist for another patch. Index: usb-2.6/drivers/scsi/sd.c === --- usb-2.6.orig/drivers/scsi/sd.c +++ usb-2.6/drivers/scsi/sd.c @@ -968,7 +968,17 @@ static int sd_done(struct scsi_cmnd *SCp /* This computation should always be done in terms of * the resolution of the device's medium. */ - good_bytes = (bad_lba - start_lba)*SCpnt-device-sector_size; + if (start_lba = bad_lba bad_lba start_lba + + (xfer_size / SCpnt-device-sector_size)) + good_bytes = SCpnt-device-sector_size * + (unsigned int) (bad_lba - start_lba); + + /* If the bad_lba value is no good, maybe the residue value +* is better. +*/ + else if (SCpnt-resid 0 SCpnt-resid xfer_size) + good_bytes = (xfer_size - SCpnt-resid) + (- SCpnt-device-sector_size); break; case RECOVERED_ERROR: case NO_SENSE: - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Linux-usb-users] Read errors on Flash Drive Transcend TS1GJF2A
On Mon, Nov 26 2007 at 17:35 +0200, Alan Stern [EMAIL PROTECTED] wrote: Not all devices correctly report the error-causing LBA in the Information field of their sense data -- even when they set the Valid bit. This patch (as1019) makes sd much more cautious about accepting the reported LBA. If the value isn't within the range of blocks accessed during the I/O operation it is rejected, and instead the driver will try looking at the residue field (which currently it ignores completely). This fixes a data-corruption bug reported here: http://marc.info/?t=11874576475r=1w=2 Signed-off-by: Alan Stern [EMAIL PROTECTED] CC: Luben Tuikov [EMAIL PROTECTED] --- This patch should be considered for inclusion in 2.6.24. The bug in question has always existed, as far as I know, but before 2.6.18 it was masked by a different bug. This doesn't use the new SCSI accessors. In the development trees I've seen, those accessors haven't yet been imported into sd.c. If the patch needs to be rebased, please let me know where to find the current sd source. It's currently only in mm patchset has: bidi-support-sr-sd-remove-dead-code.patch bidi-support-scsi_data_buffer.patch Presumably sr should use the same algorithm. That's grist for another patch. Index: usb-2.6/drivers/scsi/sd.c === --- usb-2.6.orig/drivers/scsi/sd.c +++ usb-2.6/drivers/scsi/sd.c @@ -968,7 +968,17 @@ static int sd_done(struct scsi_cmnd *SCp snip If this is a bugfix for 2.6.24 than I will be the one to rebase, as scsi_data_buffer is only for 2.6.25, I hope ;) Andrew once above goes into scsi-rc-fixes or scsi-misc I will send a rebase, if I've fallen asleep please bang me on the head, thanks. Alen thanks for doing this It was on my must-investigate list. (Though I'm usually not using sd) Boaz - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html