Re: [PATCH] st: implement sysfs based tape statistics v2
On 1/26/2015 6:11 PM, Seymour, Shane M wrote:
> I was wondering if anyone had any feedback or had any chance to review the changes?

Per the other discussion about having the same stat format forever, it seems to me that you might want to preemptively add a few additional counters.

A counter for WRITE_FILEMARKS, particularly the non-immediate count=0 ones, which are often used to flush the drive write buffer. A counter for movement-related commands like SPACE/LOCATE/REWIND would also be helpful. Finally, abnormal read conditions, like ILIs and hit FMs, should have their own stat.

Those three should provide a better view into how the drive is being used and why performance may not be what is expected. There may be others, but those three are high on my list of things I want to know about a tape stream that is not performing up to expectations.

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
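The three suggested counter groups can be sketched as a classifier over tape CDBs. This is only an illustration in user space, not the st driver's implementation: the bucket names are hypothetical, and only the 6-byte READ/WRITE/WRITE FILEMARKS/SPACE opcodes plus REWIND and LOCATE(10) are handled. It assumes the usual WRITE FILEMARKS(6) layout (IMMED in bit 0 of byte 1, filemark count in bytes 2 to 4).

```python
from collections import Counter

# Standard opcode values; the bucket names below are hypothetical.
REWIND = 0x01
READ_6 = 0x08
WRITE_6 = 0x0A
WRITE_FILEMARKS_6 = 0x10
SPACE_6 = 0x11
LOCATE_10 = 0x2B


def classify_cdb(cdb):
    """Map a tape CDB to a (hypothetical) statistics bucket."""
    opcode = cdb[0]
    if opcode == WRITE_FILEMARKS_6:
        immed = bool(cdb[1] & 0x01)
        count = int.from_bytes(cdb[2:5], "big")
        # A non-immediate WRITE FILEMARKS with count 0 acts as a buffer flush.
        if count == 0 and not immed:
            return "filemark_flush"
        return "filemark"
    if opcode in (REWIND, SPACE_6, LOCATE_10):
        return "movement"
    if opcode == READ_6:
        return "read"
    if opcode == WRITE_6:
        return "write"
    return "other"


def count_cdbs(cdbs):
    """Tally a sequence of CDBs into per-bucket counters."""
    return Counter(classify_cdb(cdb) for cdb in cdbs)
```

Keeping the flush-style filemarks in their own bucket is the point of the suggestion: a stream with many of them is being forced to empty the drive buffer repeatedly.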
Re: [PATCH v3] SG_SCSI_RESET ioctl: add no_escalate values
Reviewed-by: Jeremy Linton jlin...@tributary.com

I will test it (next week or so) when I have access to a configuration that can test it in a meaningful way.

Thanks,
Re: Write cache and surface error behaviour
On 7/20/2014 4:54 PM, joystick wrote:
> So what happens when the disk tries to write it to the platter and discovers that there is a media error on that sector? (suppose relocation does not happen ; maybe sectors exhausted)
>
> Does Linux receive the write error upon the next flush it issues?

At least for SCSI, I believe the situation you describe is covered in the SCSI specifications as a deferred error. Basically, the device returns a check condition indicating a deferred failure in response to another command. My understanding (and I'm sure others can correct it) is that the device server can issue these check conditions anytime it wants. The only guarantee is that data written before the last successful SYNC is on the media (doesn't mean you can read it!). So, in order to guarantee data is not lost, a system using writeback should retain all of the writeback data until a successful SYNC CACHE operation.

For example, see SPC4 4.5.7 note 6.

If you consider what happens during power loss to a write-back cache, it's the same situation.

Bottom line: make sure to issue syncs for data you want to retain, and use a filesystem/device that supports barriers and SYNC CACHE/CACHE FLUSH correctly. Still, YMMV.
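The "retain the data until a successful sync" rule can be sketched from the application side. This is a minimal sketch, assuming a POSIX system and a hypothetical helper name `write_then_sync`; whether `fsync()` actually results in a SYNCHRONIZE CACHE reaching the device depends on the filesystem and device, as noted above.

```python
import os


def write_all(fd, data):
    """Write the whole buffer, looping over short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def write_then_sync(path, chunks):
    """Append chunks to path and return the chunks NOT known to be durable.

    The caller's data is retained until fsync() succeeds; an error from
    write() or fsync() (which may report a deferred failure) means nothing
    written since the last successful sync can be assumed to be on media.
    """
    retained = list(chunks)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for chunk in retained:
            write_all(fd, chunk)  # may only reach a volatile cache
        os.fsync(fd)              # the only point where durability is claimed
    except OSError:
        return retained           # keep everything; caller decides how to recover
    finally:
        os.close(fd)
    return []
```

An empty return value means the data may be dropped from memory; anything else should be kept and rewritten elsewhere.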
Re: [PATCH v2] sg: add SG_FLAG_Q_AT_TAIL flag
On 6/5/2014 10:27 AM, Boaz Harrosh wrote:
> 1. aio scatter_gather type io. (ie multiple pointers multiple length buffers that are written / read from same linear range on device) [The async aspect of aio can be implemented via bsg with the write+read system calls]
> 2. mmap of direct device range to user vm memory

I suspect that betrays a bit of a gap in the understanding of the kinds of applications that use pass-through (vs just using sd, or using it for a guest OS).

These use cases don't tend to be useful for things like SCSI changers, tape devices, or SES devices. What is useful is the ability to reset devices, or maybe some of the other edge features provided by SG that never managed to make it into bsg. Nor are they useful for the monitoring type applications that use pass-through to pull some vendor-specific statistic or device state.

Furthermore, I've seen a fair number of cases where people slap together shell scripts using the /dev/sg* handles instead of the /dev/bsg/* ones, probably because it's simply more convenient.
Re: [PATCH 6/6] Invalidate VPD page data
On 3/15/2014 3:51 AM, Hannes Reinecke wrote:
> Add a flag 'vpd_invalid' to the SCSI device to indicate that the VPD data needs to be refreshed. This is required if either a manual rescan is triggered or if the sense code INQUIRY DATA HAS CHANGED has been received.
>
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -393,6 +393,7 @@ static void scsi_report_sense(struct scsi_device *sdev,
>  	if (sshdr->sense_key == UNIT_ATTENTION) {
>  		if (sshdr->asc == 0x3f && sshdr->ascq == 0x03) {
> +			sdev->vpd_invalid = 1;

I didn't study the whole code path, but does the VPD data get updated on a 6/2900? I suspect it should be.

I can imagine a number of cases where the luns changed check condition gets preempted/lost by a device reset. I guess much of that should be masked by the port login/logout, but it's probably better to be safe...
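The question above amounts to widening the condition that marks the VPD data stale. A minimal sketch of that decision, outside the kernel and with a hypothetical function name, could look like this; the 0x3f/0x03 case is what the quoted patch checks, while the 0x29/0x00 (power on/reset) case is only the suggestion made in this reply, not part of the patch.

```python
UNIT_ATTENTION = 0x06


def vpd_needs_refresh(sense_key, asc, ascq):
    """Return True if cached VPD data should be treated as stale."""
    if sense_key != UNIT_ATTENTION:
        return False
    if (asc, ascq) == (0x3F, 0x03):
        return True   # INQUIRY DATA HAS CHANGED (checked by the patch)
    if (asc, ascq) == (0x29, 0x00):
        return True   # power on / reset: suggested here, an assumption
    return False
```

Treating a reset as a reason to refresh covers the case where the more specific unit attention was lost.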
Re: [PATCH 01/16] scsi_dh_alua: Improve error handling
On 3/7/2014 1:12 AM, Hannes Reinecke wrote:
> Can you file a bug with bugzilla.novell.com and assign it to me?

Thanks, it's bug 867371: https://bugzilla.novell.com/show_bug.cgi?id=867371.
Re: [PATCH 01/16] scsi_dh_alua: Improve error handling
On 1/31/2014 3:29 AM, Hannes Reinecke wrote:
> Improve error handling and use standard logging functions instead of hand-crafted ones.
>
> @@ -182,11 +185,13 @@ static unsigned submit_rtpg(struct scsi_device *sdev, struct alua_dh_data *h, bool rtpg_ext_hdr_req)

Can you shorten the timeouts for this? I think submit_rtpg failures are causing a boot failure on a SLES 11 SP2 machine (3.0.101-0.7.17-default).

The basic configuration is an ACNC JetStor, mapped as a raw LUN through a VMware (ESXi 5.5) virtual mptsas adapter.

The sd driver seems to be jamming up udev's creation of the boot /dev entries with a ~360 second delay probing /dev/sdb. This causes the /dev/sda entry for the root partition (on another device) not to show up in time.

(output from a screen capture sent to me and OCR'ed)

[ 2.918164] sdev dma_alignment 511
[ 3.917810] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 1, phy 1, sas_addr 0x5000c29f6c78b0af
[ 3.920204] scsi target0:0:1: mptsas: ioc0: add device: fw_channel 0, fw_id 1, phy 1, sas_addr 0x5000c29f6c78b0af
[ 3.921923] scsi 0:0:1:0: Direct-Access JetStor VMB-00 R001 PQ: 0 ANSI: 5
[ 4.824734] scsi 0:0:1:0: mptscsih: ioc0: qdepth=64, tagged=1, simple=1, ordered=0, scsi_level=6, cmd_que=1
[ 4.827184] sdev dma_alignment 511
[ 4.828006] scsi 0:0:1:0: alua: supports implicit TPGS
[ 4.828897] scsi 0:0:1:0: alua: target naa.201B4D01CA59 port group 01 rel port 1b
[ 4.830084] scsi 0:0:1:0: alua: Attached
[ 4.834297] sd 0:0:0:0: [sda] 33554432 512-byte logical blocks: (17.1 GB/16.0 GiB)
[ 4.835141] sd 0:0:0:0: [sda] Write Protect is off
[ 4.835954] sd 0:0:0:0: [sda] Mode Sense: 61 00 00 00
[ 4.835994] sd 0:0:0:0: [sda] Cache data unavailable
[ 4.836808] sd 0:0:0:0: [sda] Assuming drive cache: write through
[ 4.837718] sd 0:0:1:0: [sdb] 390623744 512-byte logical blocks: (199 GB/186 GiB)
[ 4.837756] sd 0:0:0:0: [sda] Cache data unavailable
[ 4.837758] sd 0:0:0:0: [sda] Assuming drive cache: write through
[ 4.870098] sda: sda1 sda2
[ 4.871675] sd 0:0:0:0: [sda] Cache data unavailable
[ 4.872461] sd 0:0:0:0: [sda] Assuming drive cache: write through
[ 4.873243] sd 0:0:0:0: [sda] Attached SCSI disk
[ 5.622838] sd 0:0:1:0: [sdb] Write Protect is off
[ 5.624410] sd 0:0:1:0: [sdb] Mode Sense: bf 00 00 08
[ 5.624654] sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 364.290276] sd 0:0:1:0: timing out command, waited 360s
[ 364.291495] sd 0:0:1:0: alua: rtpg failed, Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[ 364.309198] sdb: sdb1
[ 367.890274] sd 0:0:1:0: [sdb] Attached SCSI disk

VMware says this about the disk:

naa.6001b4d437346863
   Device Display Name: JetStor Fibre Channel Disk (naa.6001b4d437346863)
   Storage Array Type: VMW_SATP_ALUA
   Storage Array Type Device Config: {implicit_support=on;explicit_support=off; explicit_allow=on;alua_followover=on;{TPG_id=1,TPG_state=ANO}}
   Path Selection Policy: VMW_PSP_MRU
   Path Selection Policy Device Config: Current Path=vmhba5:C0:T0:L1
   Path Selection Policy Device Custom Config:
   Working Paths: vmhba5:C0:T0:L1
   Is Local SAS Device: false
   Is Boot USB Device: false
Re: [PATCHv2] Add EVPD page 0x83 entries to sysfs
On 2/11/2014 2:32 AM, Hannes Reinecke wrote:
> The problem with page 0x80 is that (per spec) it's vendor-defined. So there is no guarantee for it to be unique in any sense. Which makes it rather impractical for normal use. Hence we typically rely on page 0x83 to identify a device, be it for udev or multipath.

AFAIK it is not a vendor-defined page; it's just not marked as mandatory by T10.

For tape (which is what I thought brought much of this on) it is basically mandatory. Which is another place where the spec doesn't sync with the real world.

That is because it is the _ONLY_ vendor-neutral method to autoconfigure tape libraries. The tape libraries export the drive serial numbers via READ ELEMENT STATUS, dvcid=1. Which means it's the de facto method for backup/media manager applications. A tape/media changer device that crashes or fails to return useful 0x80 information will have a very short life in the market.

For sure, it would work better than the existing method being used by udev (for tape), which fails (per my other posting) because there is often insufficient information in 0x83 to uniquely identify devices.
Re: [PATCHv2] Add EVPD page 0x83 entries to sysfs
On 2/10/2014 5:11 AM, Hannes Reinecke wrote:
> EVPD page 0x83 is used to uniquely identify the device. So instead of having each and every program issue a separate SG_IO call to retrieve this information it does make far more sense to display it in sysfs.

Tested-by: Jeremy Linton jlin...@tributary.com

So, I just ran it in 3.14-rc2. No OOPS, that is good. It even survived probing a SPC-2 device without a page 0x83.

I tested it with a fairly narrow set of devices: a couple of IBM libraries with LTO/359x and a VTL.

I did notice this on an old IBM raid adapter running in the machine:

cat: ident_lun_scsi_name: Invalid argument

(that came from this device)

sg_inq --page=0x83 --hex /dev/sg2
VPD INQUIRY, page code=0x83:
 00     00 83 00 48 01 03 00 08  50 01 0b 90 00 12 1d 90    ...H....P.......
 10     61 93 00 08 50 01 0b 90  00 12 1d 8e 61 94 00 04    a...P.......a...
 20     00 00 00 01 61 a3 00 08  50 01 0b 90 00 12 1d 8d    ....a...P.......
 30     63 a8 00 18 6e 61 61 2e  35 30 30 31 30 42 39 30    c...naa.50010B90
 40     30 30 31 32 31 44 38 44  00 00 00 00                00121D8D....

And there may be a couple of descriptors missing here and there. For example, 3592E05 is missing the total port count (I think).

VPD INQUIRY, page code=0x83:
 00     01 83 00 5c 02 01 00 24  49 42 4d 20 20 20 20 20    ...\...$IBM
 10     30 33 35 39 32 45 30 35  20 20 20 20 20 20 20 20    03592E05
 20     30 30 30 30 30 37 38 33  36 33 32 33 01 03 00 08    000007836323....
 30     50 05 07 63 02 41 0c 2c  01 13 00 08 50 05 07 63    P..c.A.,....P..c
 40     02 81 0c 2c 01 14 00 04  00 00 00 02 01 23 00 08    ...,.........#..
 50     50 05 07 63 02 41 0c 2c  01 24 00 04 00 00 00 01    P..c.A.,.$......

/sys/class/scsi_tape/nst14/device # ls ident_*
ident_lun_naa  ident_lun_t10  ident_port_naa  ident_port_relport  ident_target_naa

This almost seems like a case where exporting the raw 0x83 data may be better...

Also, as I stated previously, my personal bias is to include the page 0x80 serial number data for tape devices as well. That seems to be the most reliable. Mostly because a lot of the VTLs now just give you the same wwnn/wwpn in 0x83 for multiple LUNs. Meaning you can't uniquely identify the device over different physical ports.

The IBM devices are nice in that they export a T10 Vendor ID with the man/model/serial in 0x83, but that is not common in my experience. For example (old T10k):

VPD INQUIRY, page code=0x83:
 00     01 83 00 20 01 03 00 08  50 01 04 f0 00 93 ac f6    ... ....P.......
 10     01 13 00 08 50 01 04 f0  00 93 ac f7 01 14 00 04    ....P...........
 20     00 00 00 01                                         ....
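To make the dumps above easier to read, here is a small sketch that walks the designation descriptors of a page 0x83 buffer. It is an illustration, not the patch's kernel code: the function names and the short labels (modelled loosely on the `ident_*` attribute names listed above) are assumptions, and only a few association and designator type codes are named.

```python
ASSOCIATION = {0: "lun", 1: "port", 2: "target"}
DESIGNATOR_TYPE = {1: "t10", 2: "eui64", 3: "naa", 4: "relport", 8: "scsi_name"}


def parse_vpd_83(page):
    """Return (label, code_set, designator bytes) for each descriptor."""
    if len(page) < 4 or page[1] != 0x83:
        raise ValueError("not a VPD page 0x83 buffer")
    end = min(4 + int.from_bytes(page[2:4], "big"), len(page))
    descriptors = []
    offset = 4
    while offset + 4 <= end:
        code_set = page[offset] & 0x0F
        association = (page[offset + 1] >> 4) & 0x03
        dtype = page[offset + 1] & 0x0F
        length = page[offset + 3]
        designator = bytes(page[offset + 4:offset + 4 + length])
        label = "%s_%s" % (ASSOCIATION.get(association, "assoc%d" % association),
                           DESIGNATOR_TYPE.get(dtype, "type%d" % dtype))
        descriptors.append((label, code_set, designator))
        offset += 4 + length
    return descriptors


# The old T10k page quoted above.
T10K_PAGE = bytes.fromhex(
    "01 83 00 20 01 03 00 08 50 01 04 f0 00 93 ac f6"
    "01 13 00 08 50 01 04 f0 00 93 ac f7 01 14 00 04"
    "00 00 00 01"
)

for label, code_set, designator in parse_vpd_83(T10K_PAGE):
    print(label, designator.hex())
```

For the T10k page this yields a LUN NAA, a port NAA and a relative port number, and nothing carrying vendor/model/serial, which is the gap described above.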
Re: [PATCH] st: Do not rewind for SG_IO
On 2/2/2014 5:42 AM, Hannes Reinecke wrote:
> This is due to the strictly sequential design udev has. Essentially udev spawns a worker for every device which gets created (= udev receives a uevent for).

The part I fail to see in this explanation is why the nst/st/st*a/st*m/etc handles are being treated as separate devices. They aren't. They are all the same physical tape device, so why are the nst devices being handled separately from the st ones?

Maybe the problem is that you have too many workers for the tape device.
Re: [PATCH] st: Do not rewind for SG_IO
On 2/3/2014 9:06 AM, Hannes Reinecke wrote:
> That's due to udev. Udev is getting events for each device it should create a device node for. So for 'st' it'll get a series of events for 'stX', and another series of events for 'nstX'. Udev will treat each of these events separately, with distinct worker programs handling them. Each of those workers run fully asynchronous and cannot access information from other workers.

So what's wrong with the simple solution? You throw the ones for st away, and create the st handles from the nst worker.
Re: [PATCH] st: Do not rewind for SG_IO
On 2/3/2014 2:51 PM, Kay Sievers wrote:
> This is not simple and not going to happen. Sibling devices in /sys cannot have a relationship in udev, there are only parent/child dependencies.

OK, so if we can't ignore all the spare device nodes in a given /sys entry for a given device, why open the device to scan it? I've often wondered why the serial number isn't part of the data in /sys along with the manufacturer/model. The last tape drive I saw that failed to respond to inquiry page 0x80 was over a decade ago (probably manufactured in the early 90s). So enabling it just for tape is pretty safe.

Matching manufacturer/model/serial is going to be better than anything you're going to get out of 0x83 anyway. The 0x83 data is guaranteed to be there, but it's also guaranteed to be unreliable (every device, and every port, has a slightly different set of descriptors it chooses to support).

Plus, you're not going to have issues accidentally rewinding a device, or resetting a tape density, or accidentally turning compression off if you don't open the device.

> Hannes, can't you just drop the weird auto-rewinding device matches from the persistent rules, is that really useful today?

The relationship between the st and nst devices is leveraged by a large number of backup applications in the field. If you change it, it's likely lots of breakage will ensue.
Re: [PATCH] st: Do not rewind for SG_IO
On 1/31/2014 2:46 AM, Hannes Reinecke wrote:
> This patch make the tape always non-rewinding when SG_IO is used, thus allowing udev to get a proper device id for tapes.

This is wholly bad. Just because someone fires a SG ioctl at the device (usually to perform an operation that cannot be done with the st_ops) doesn't mean they don't want the tape rewound on close.
Re: [PATCH] st: Do not rewind for SG_IO
On 1/31/2014 2:46 AM, Hannes Reinecke wrote:
> This patch make the tape always non-rewinding when SG_IO is used, thus allowing udev to get a proper device id for tapes.

Maybe instead of silently changing the behavior, if you just _HAVE_ to open the st device, add an ioctl or st/mt_op that disables the rewind on close. That way applications have to explicitly disable the rewind on close.
Re: Open/INQUIRY fails on RESERVE'd tape device
On 1/23/2014 4:02 PM, Matthias Eble wrote:
> So: should open() fail on a reserved tape device?

Yes, this is expected behavior for tape devices. Reserve 6/release is sometimes used by backup applications in SAN environments as an arbitration mechanism across multiple machines.

It's not that the INQUIRY is failing, it's that the st open sequence is doing a reserve/TUR/etc during the open. If that fails, then you can't open the drive sufficiently to send an inquiry via pass-through. In some environments you can bypass that processing with O_NDELAY/O_NONBLOCK. Or you just use the sg device, which doesn't perform the tape open processing that st does.
Re: [dm-devel] SCSI's heuristics for enabling WRITE SAME still need work [was: dm mpath: disable WRITE SAME if it fails]
On 9/24/2013 12:39 AM, Hannes Reinecke wrote:
> My drives support 'report opcodes', and report that write same is supported:
> ...
> 93 16 Write same(16)
> ...
> but no support for page 'b0'. And yes, these are real SAS drives.

So the question is, how many devices get the protect bit in the std inquiry incorrect? If that is mostly correct, how about std inquiry (SPC2), protect bit, and then VPD page 0x86 (or alternatively then do the READ CAPACITY, P_TYPE instead of page 0x86)? After all, the set of valid read/write opcodes is limited by the protection mode format, yes?
Re: [PATCH 1/3] scsi: Fix erratic device offline during EH
On 9/2/2013 6:58 AM, Hannes Reinecke wrote:
> +static int scsi_eh_action(struct scsi_cmnd *scmd, int rtn)
> +{
> +	static unsigned char tur_command[6] = {TEST_UNIT_READY, 0, 0, 0, 0, 0};
> +
> +	if (scmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
> +		struct scsi_driver *sdrv = scsi_cmd_to_driver(scmd);
> +		if (sdrv->eh_action)
> +			rtn = sdrv->eh_action(scmd, tur_command, 6, rtn);
> +	}
> +	return rtn;
> +}
> +

Is there a reason for using TUR here instead of STD inquiry? STD inquiry has the advantage that it can act like a ping but doesn't return unit attentions.

Per my previous comments, trapping unit attentions in the error handler has caused UAs like luns changed or power loss to get lost without being processed. For tape devices, losing UAs like this often means that the higher level driver won't be notified that the tape is rewound, resulting in serious issues.
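For reference, the two candidate "ping" commands differ only in a few CDB bytes. The sketch below builds both in user space; the helper names are hypothetical and this is not the kernel's error-handler code. It assumes the SPC-3 style INQUIRY layout with a two-byte allocation length in bytes 3 and 4.

```python
TEST_UNIT_READY = 0x00
INQUIRY = 0x12


def tur_cdb():
    """TEST UNIT READY: six zero bytes. A pending unit attention is
    reported (and consumed) as the response to this command."""
    return bytes([TEST_UNIT_READY, 0, 0, 0, 0, 0])


def std_inquiry_cdb(alloc_len=36):
    """Standard INQUIRY (EVPD=0, page code 0). A pending unit attention is
    left queued for a later command rather than reported here."""
    if not 0 <= alloc_len <= 0xFFFF:
        raise ValueError("allocation length must fit in 16 bits")
    return bytes([INQUIRY, 0x00, 0x00, (alloc_len >> 8) & 0xFF, alloc_len & 0xFF, 0x00])
```

The design point is which layer ends up seeing the unit attention: with INQUIRY as the probe it remains available to the upper-level driver.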
Re: [PATCH 17/18] lpfc 8.3.42: Fixed issue of task management commands having a fixed timeout
On 9/8/2013 8:59 AM, James Smart wrote:
> The other issue - we seem to have missed your prior post. I'll look into it shortly.

Thanks. Those patches were the result of various error injection test cases we were performing. The ones that come to mind were hung sequences, TM rejects, failures to respond to the TM, etc.
Re: [PATCH 17/18] lpfc 8.3.42: Fixed issue of task management commands having a fixed timeout
On 9/6/2013 11:22 AM, James Smart wrote:
> Fixed issue of task management commands having a fixed timeout

I'm surprised about this change, since it appears a number of issues in send_taskmgmt() still exist that keep it from handling task management failures correctly. It also continues to have a number of smaller issues, for example dead code of the form

if (status != IOCB_STATUS) else if (status == IOCB_BUSY) else

See patch: lpfc should check return status for task mgmt IOCB
http://marc.info/?l=linux-scsi&m=136242124409687
Re: PATCH: scsi: make scsi reset permissions more relaxed (RFC)
On 8/30/2013 7:47 AM, Douglas Gilbert wrote:
> I proposed the following patch some time back to give the user space finer resolution on resets with the option of stopping the escalation but it has gone nowhere:
> http://marc.info/?l=linux-scsi&m=136104139102048&w=2
>
> With that patch you might only allow an unprivileged user the non-escalating LU and target reset variants.
>
> If changes are made in that area, we might like to think about adding a new RESET variant mapping through to the I_T Nexus Reset TMF.

And a fine, incredibly useful patch it is. To the point of basically being a requirement for SAN environments. Without it, all kinds of havoc can ensue.

But the problem of burners going out to lunch shows why it's a stopgap. As most burners are going to be SATA attached (without an expander), you probably want to escalate all the way to the host reset if none of the other options work.

With a few other tweaks and the no-escalate patch, it's possible to implement escalation logic outside of the kernel that is aware of the device topology and individual device states. That way HBAs aren't getting reset under active functional devices.
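As a rough sketch of what user-space control over escalation could look like, the snippet below builds the SG_SCSI_RESET argument with the no-escalate flag. The numeric values are those I understand the Linux `scsi/sg.h` header to define once the no_escalate patch is applied; treat them, and the helper names, as assumptions to verify against your kernel headers. `issue_reset` needs an open sg device and suitable privileges.

```python
import fcntl
import struct

# Values as I understand them from <scsi/sg.h>; verify before use.
SG_SCSI_RESET = 0x2284
SG_SCSI_RESET_NO_ESCALATE = 0x100
RESET_SCOPES = {"device": 1, "bus": 2, "host": 3, "target": 4}


def reset_arg(scope, escalate=False):
    """Build the integer passed to the SG_SCSI_RESET ioctl."""
    value = RESET_SCOPES[scope]
    if not escalate:
        value |= SG_SCSI_RESET_NO_ESCALATE  # attempt only this level
    return value


def issue_reset(fd, scope, escalate=False):
    """Send the reset to an open /dev/sg* file descriptor."""
    fcntl.ioctl(fd, SG_SCSI_RESET, struct.pack("i", reset_arg(scope, escalate)))
```

A topology-aware tool would call `issue_reset(fd, "device")` first and decide for itself, based on the state of the neighbouring devices, whether a wider reset is justified.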
Re: SCSI error handling -- one error blocks the whole SCSI host
On 5/27/2013 8:32 PM, Baruch Even wrote:
> necessary but the command itself if it is already actively handled continues in its path. The abort only cancels those commands that are in the queue and if there really was a problem and the disk is engaging in error recovery of its own you'll just have no response from it and it will seem dead (abort may timeout).

Yes, the abort seems to be handled more like a hint in many cases. Having coded a couple targets, abort handling is often _REALLY_ hard to get 100% right. Especially when it's an actual error that is causing the delay, rather than a correctly functional long running command. That said, I've seen devices actually respond to aborts on tape ERASE and similar commands by actually aborting the command as one would expect. So it does sometimes work.

Besides abort timeouts (which is major bad karma), the abort may be accepted, and the next non inquiry/tur type command that gets queued simply blocks waiting for the abort to internally complete.

From the target device perspective, if you don't send a response for ABTS out in 2*RA_TOV then your problems start to multiply. So it encourages the target devices to treat aborts in an async manner. As you said, the device simply finds the indicated command on a queue, marks it as being aborted and hopes whatever is processing the command notices and terminates its operation. On subsequent commands the nicer devices will notice the abort hasn't completed and return becoming ready or similar in response to TUR/etc for some number of minutes.

> This view of aborts also means that reducing timeouts for commands and TMFs is mostly useless and sometimes even a really bad idea. I prefer to just let the device go on with its error recovery and just forget about the command. I want to forget about the DMA so I issue an abort but anything higher than that means a link is dead to me.
Well, invariably the manufacturers have timeouts that are really long and based on internal error recovery logic. See http://www-01.ibm.com/support/docview.wss?uid=ssg1S7003556&aid=1 page 468. Notice the timeouts are specified in minutes, not seconds. Furthermore, the commands that normally complete in fractions of a second have actual timeouts that can be tens of minutes (READ/WRITE for example). So, doing anything before that timeout has expired is a good way to knock the device offline.

Some of the newer disks have mode page options to shorten their read/write error recovery, but short error recovery can still be many tens of seconds rather than a couple minutes. Plus, it doesn't help compound commands like SYNCHRONIZE CACHE which may take multiple errors during operation.

This is another part of what formed my opinions about error isolation. If one of your devices goes out to lunch and isn't recovering via abort/lun reset, it's done! Wrecking the rest of the SAN doing bus resets and HBA resets is a good way to take a serious problem and turn it into a full blown catastrophe.
Re: Question: eh_abort_handler() and terminate commands
On 5/24/2013 5:57 AM, Hannes Reinecke wrote:
> Which leads to the interesting question: What happens with the actual command once eh_abort_handler returns?

Well, eventually it ends up on the done_q and gets returned up the stack via flush_done_q(). But that wasn't what you were asking?

> As normally 'eh_abort_handler' is implemented as a TMF, one does assume that the command itself will be returned by the target with an appropriate status.

Uh, well you don't get a proper SCSI status on a TMF or an ABTS/ABTX. So basically, the abort just kills processing of the commands.

> OTOH it also means that the HBA firmware might receive a completion for a command which the upper layer has already completed.

Well, I think there is some rule here (scsi_eh.txt, everyone forgets about the command) that by the time the eh_abort_handler() completes you won't get any new scsi_done()s. This doesn't appear to mean that you won't get them while the abort_handler is running. Hence if you look at send_eh_cmnd() you see that the done completion being triggered at any time after the wait_for_completion_timeout() doesn't really do anything useful. The normal abort path completion doesn't appear to care either. Abort success/failure doesn't appear to fundamentally change the eventual return status of the commands.

> Will this completion ever being mirrored to the LLDD? Or discarded by the firmware?

Yes, if for some reason a status comes in for an aborted exchange, the HBA firmware rejects it because it's against an invalid exchange (or should; the HBA I'm most familiar with does it this way). This is fairly easy to test if you have a jammer: just inject a FCP_RSP_IU into an aborted exchange.

> And how is one expected to handle the case where the TMF _failed_ on the target?

Doesn't the current path eventually just end up doing the lun reset? What's wrong with that: stop all the IO, let the existing commands complete or timeout, then hit the device with the big hammer?
If the lun reset succeeds you can pretty much feel safe that everything is aborted. That is assuming you get the correct return from the bus_device_reset(). It is potentially possible for the lun reset to be rejected, and in the case of some of the drivers, for success to be returned anyway (consider lpfc_sli_issue_iocb_wait). I bet I could corrupt some disk data like that (format unit, abts reject, lun reset reject, continue operation with format unit still running on the target).

> I would rather prefer to have the LLDD terminate the command; this way we at least have a chance of getting a decent status back ...

Well, you might be able to simplify a few things in scsi_* if eh_abort_handler() were more like the Windows async cancel IO IRP and didn't block. It simply marks the IO as being canceled and then the completion eventually runs as normal within the devloss timeout. You probably could abort right out of a function in front of scsi_times_out() and avoid the whole error handling queues/blocking/task/etc. Then you use the abort accept/failure out of scsi_done to either queue the command into the current scsi_times_out logic, or you complete it with a timeout. Pretty clean, except for the fact you're going to have to rewrite a lot of stuff in the LLDs to assure that they get the abort status returned within a reasonable amount of time.

OTOH, the cancel IO model in Windows is one of the things people writing IO drivers on that platform despise.
Re: [PATCH] scsi: Allow error handling timeout to be specified
On 5/13/2013 12:46 AM, Hannes Reinecke wrote:

True. But at the end of the day, we _do_ want to recover the failed LUN. If we were to disable that faulty LUN and continue running with the others we won't have a chance of _ever_ recovering that one LUN.

I don't buy this. Especially for FC devices, the vast majority of errors I see are related to zoning, SFP and cabling problems. Once one of those happens you tend to get a lot of shotgun debugging, which injects all kinds of further errors. None of these errors are fixed by the Linux error recovery paths. That said, if the admin fixes something, for FC/SAS (and potentially others) you _WILL_ get notification that the device is online again.

SET when the link is down). So we basically _have_ to escalate it to the next level. Even though that will mean to stop I/O to other, hitherto unaffected instances.

And a single failure turns into performance bubbles and further errors on other devices, particularly if the functional devices are stateful and the error recovery mechanism isn't sufficiently intelligent about that state (see tape drives). Think about what happens when a marginal SFP on a target causes a device to repeatedly drop off and reappear at some random point in the future.

Anyway, it is possible to make a determination about the topology and make decisions about the likelihood of any given portion being at fault. For example, if one lun on a target has failed and the remainder continue to work, then it's unlikely, if abort and lun reset fail, that anything higher up in the stack is going to succeed. I feel pretty strongly that at that point you're better off providing good diagnostics about the failure and expecting user interaction rather than muddying the waters by causing other device interruptions. If the user tries everything and determines that an HBA reset is the right choice, provide that option; don't do it for them. If every device attached to the HBA fails then resetting the HBA is a valid choice, not before. Same for I_T.
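The escalation rule argued for above (widen the reset only as far as the observed failures justify) can be sketched as follows. This is a minimal illustration, not midlayer code: struct dev_state, the eh_level values, and max_escalation() are hypothetical names, and "failed" is assumed to be known for every device, which in practice would require probing idle ones.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical escalation levels, least to most disruptive. */
enum eh_level { EH_LUN_RESET, EH_TARGET_RESET, EH_HOST_RESET };

/* Hypothetical per-device record: owning target (I_T) and health. */
struct dev_state {
	int target_id;
	bool failed;
};

/* True when every device behind target_id (or every device on the
 * HBA when target_id < 0) has failed. An empty match set returns
 * false so that nothing is escalated without evidence. */
static bool all_failed(const struct dev_state *devs, size_t n, int target_id)
{
	size_t matched = 0;

	for (size_t i = 0; i < n; i++) {
		if (target_id >= 0 && devs[i].target_id != target_id)
			continue;
		matched++;
		if (!devs[i].failed)
			return false;
	}
	return matched > 0;
}

/* Widest reset the evidence supports for a failure behind target_id. */
static enum eh_level max_escalation(const struct dev_state *devs, size_t n,
				    int target_id)
{
	if (all_failed(devs, n, -1))
		return EH_HOST_RESET;
	if (all_failed(devs, n, target_id))
		return EH_TARGET_RESET;
	return EH_LUN_RESET;
}
```

With one failed LUN and healthy siblings, the sketch never goes past a LUN reset; anything wider would be left to the administrator.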
Re: [PATCH] scsi: Allow error handling timeout to be specified
On 5/13/2013 10:03 AM, Hannes Reinecke wrote:

The other LUNs haven't reported an error. But how do you know whether they are still okay? The other LUNs might simply be idle, and no commands have been sent to them.

Well, how about generating a std inquiry against them if they are idle and the given HBA has a device in error state? Then you can make a rough approximation of what has failed, and escalate the error handling if all the devices at a particular level have failed.

The midlayer may not even need to send the inquiries. If the individual device drivers (sd/st/etc) are responsible for monitoring and error recovery then they can be tasked with determining device availability as well.

I think this solves other problems too. For example, the use of TUR in the midlayer is a problem because it doesn't have enough knowledge about the possible check conditions being returned to act on them appropriately.
Re: [PATCH] scsi: Allow error handling timeout to be specified
On 5/13/2013 3:29 PM, Martin K. Petersen wrote:

others. We see cases fairly often where a misbehaving target has confused the HBA enough that we can not bring the device back without doing an HBA firmware reset. Despite I/O completing successfully on other targets connected to the same HBA.

This would seem to indicate an HBA/driver bug...

So at some point we do need to give up and escalate to a full HBA reset. We would just like to defer that hammer until we have run out of other options.

Except that I've seen the Linux error recovery cause more problems than it solves on a fairly regular basis. I would rather have a solution designed to isolate failures than one that makes a lot of mistakes and causes further problems (sometimes with other machines). I'm pretty convinced that attempting everything possible to recover a device when the underlying problem is unknown is a bad strategy.

I think maybe it's a perspective difference. If the device that is failing is an OS disk, then giving up is tantamount to crashing the machine. On the other hand, if the failing device is some shared tape drive in a SAN with a few hundred alternatives, then killing the OS in an attempt to recover that drive is a problem. Maybe the super aggressive recovery paths should be reserved for devices marked critical to system operation.
Re: T10 WCE interpretation in Linux device level access
On 4/24/2013 7:57 AM, Paolo Bonzini wrote:

If the device can promise this, we don't care (and don't know) how it manages that promise. It can leave the data on battery backed DRAM, can archive it to flash or any other scheme that works. That's exactly the point of SYNC_NV=1.

Well, it's the point, but the specification is written such that the vendors can choose to implement it any way they wish, especially for split cache systems where there is both volatile and non-volatile cache.

Flushing the NV cache to medium (as is the current behavior) may not be a bad idea anyway. That's because I know of a large vendor's array where the non-volatile cache might be better described as the sometimes non-volatile cache: a failure to flush the volatile portions results in the non-volatile portions being considered invalid when power is restored. This fences the volume, and the usual method for recovering the array is to call support and have them invalidate the NV portions of the cache, thereby negating the whole reason for having an NV cache. I'm sure they don't tell customers this fact when they sell the array; when it happened in our lab I was in a state of shock for about a week.
Re: T10 WCE interpretation in Linux device level access
On 4/23/2013 3:07 PM, James Bottomley wrote:

I bet they don't; they probably obey the spec. There's a SYNC_NV bit which if unset (which it is in our implementation) means only sync your non-NV cache. For a device with all NV, that equates to nop.

Yes, Linux leaves the SYNC_NV bit unset in scsi_setup_flush_cmnd(). The draft specs, and a couple of others I have lying about, say the device shall sync cache to medium for both volatile and non-volatile cache data if SYNC_NV is _unset_. With it set, the table could be more confusing!

For volatile cache blocks with SYNC_NV set: "If a non-volatile cache is present, then the device server shall synchronize to non-volatile cache or to the medium. If a non-volatile cache is not present, then the device server shall synchronize to the medium."

And for non-volatile cache with it set: "No Requirement".

Which to me says: don't expect any particular behavior if you set this bit and have NV; it could flush to medium, flush to NV cache, or do nothing at all. But it seems pretty clear that with it unset it's probably going to get synchronized to the medium.

If T10 were to do something, maybe they could stop putting bits in the docs that aren't guaranteed to do anything (fill in rant). As for Linux, it seems the state of the spec really doesn't leave any good options other than providing the user the ability to disable the flush_cmnd() if the NV_SUP bit is set. Or maybe a white list (ick!)...
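The SYNC_NV behavior table as paraphrased in this message can be written down as a small lookup. This is a sketch of the wording quoted above only; the enum and function names are hypothetical, and the actual SBC text should be consulted before relying on it.

```c
#include <stdbool.h>

/* Which cache the data currently sits in. */
enum cache_kind { CACHE_VOLATILE, CACHE_NON_VOLATILE };

/* What the device server is required to do on SYNCHRONIZE CACHE. */
enum sync_req {
	SYNC_TO_MEDIUM,       /* must reach the medium                 */
	SYNC_TO_NV_OR_MEDIUM, /* NV cache or medium, device's choice   */
	SYNC_NO_REQUIREMENT   /* spec places no requirement            */
};

/* Encodes the table as described in the message above. */
static enum sync_req sync_cache_requirement(bool sync_nv,
					    enum cache_kind kind,
					    bool nv_cache_present)
{
	if (!sync_nv)                   /* bit unset: both kinds go to medium */
		return SYNC_TO_MEDIUM;
	if (kind == CACHE_NON_VOLATILE) /* bit set, data already in NV cache  */
		return SYNC_NO_REQUIREMENT;
	return nv_cache_present ? SYNC_TO_NV_OR_MEDIUM : SYNC_TO_MEDIUM;
}
```

Only the SYNC_NV-unset row gives a guarantee about the medium, which is the row the Linux flush path relies on.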
Re: [PATCH v2 0/8] [SCSI] Enhanced sense and Unit Attention handling
On 4/15/2013 9:13 AM, Ewan Milne wrote:

patch could attempt to clear the check conditions from LUNs that share the I_T. I think the mid-layer will handle that automatically. If check conditions are reported the commands will have to be reissued.

But, not automatically (unless I'm missing something again). The UA is going to arrive when each lun gets sent a command, which could be a long time from the initial UA if the lun is idle. Enough time that the attempts to coalesce the events are going to fail.

I guess it depends on what you have udev doing when it gets the event. If it triggers a rescan involving something besides inquiry/report luns then that will trigger the remaining UAs from the luns on the target that changed. But if it does something other than that, I don't see it by reading the patch/scsi_scan.c code.
Re: [PATCH v2 0/8] [SCSI] Enhanced sense and Unit Attention handling
On 2/1/2013 11:53 AM, Ewan D. Milne wrote:

The mechanism used is to flag when certain UA ASC/ASCQ codes are received that report asynchronous changes to the storage device configuration. An appropriate uevent is then generated for the scsi_device or scsi_target object. An aggregation mechanism is used to avoid generating uevents at too high a rate, and to coalesce multiple UAs reported by LUNs on the same target for a REPORTED LUNS DATA HAS CHANGED sense code.

What happened to this patch? The trail of suggested fixes for the REPORTED LUNS DATA HAS CHANGED check condition is getting pretty long. The number of devices in the field (our product included) that can modify the luns on an I_T nexus on the fly is not decreasing. Is it because these patches are trying to fix more than one thing?

What is the preferred way to fix this? Why not simply add a couple of sdev_evt_send_simple() calls and an event coalesce function to collapse this event when it's received from multiple LUNs on the I_T? A couple of extra uevents isn't going to kill udev, right? A really fancy patch could attempt to clear the check conditions from LUNs that share the I_T.
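A rough sketch of the two pieces discussed above: recognizing the REPORTED LUNS DATA HAS CHANGED unit attention (sense key 6h, ASC/ASCQ 3Fh/0Eh) in a sense buffer, and collapsing repeats per I_T nexus inside a time window. The struct and function names are hypothetical and this is not the patch under discussion; the window-based coalescing is an assumption, and it cannot catch a UA from an idle LUN that shows up long after the window has closed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SK_UNIT_ATTENTION 0x06

struct sense_triple {
	uint8_t key, asc, ascq;
};

/* Extract sense key/ASC/ASCQ from fixed (70h/71h) or descriptor
 * (72h/73h) format sense data. Returns false if buf is too short
 * or the response code is not recognized. */
static bool decode_sense(const uint8_t *buf, size_t len,
			 struct sense_triple *out)
{
	uint8_t code;

	if (len < 1)
		return false;
	code = buf[0] & 0x7f;
	if (code == 0x70 || code == 0x71) {
		if (len < 14)
			return false;
		out->key = buf[2] & 0x0f;
		out->asc = buf[12];
		out->ascq = buf[13];
		return true;
	}
	if (code == 0x72 || code == 0x73) {
		if (len < 4)
			return false;
		out->key = buf[1] & 0x0f;
		out->asc = buf[2];
		out->ascq = buf[3];
		return true;
	}
	return false;
}

static bool is_report_luns_changed(const uint8_t *buf, size_t len)
{
	struct sense_triple s;

	return decode_sense(buf, len, &s) && s.key == SK_UNIT_ATTENTION &&
	       s.asc == 0x3f && s.ascq == 0x0e;
}

/* Hypothetical per-target (I_T) coalescing state. */
struct ua_coalesce {
	bool pending;          /* an event was emitted in this window */
	uint64_t window_start; /* time of that event, in ms           */
	uint64_t window_ms;    /* window length, in ms                */
};

/* Returns true if the caller should emit a uevent for this UA. */
static bool ua_should_emit(struct ua_coalesce *c, uint64_t now_ms)
{
	if (c->pending && now_ms - c->window_start < c->window_ms)
		return false; /* same burst: swallow it */
	c->pending = true;
	c->window_start = now_ms;
	return true;
}
```

A caller would keep one ua_coalesce per target and call ua_should_emit() each time is_report_luns_changed() matches on any LUN behind it.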
Re: [PATCH] scsi_transport_fc: Make 'port_state' writeable
On 3/15/2013 8:28 AM, Bryn M. Reeves wrote:

On 03/15/2013 12:46 PM, Bart Van Assche wrote: The SCSI EH keeps trying until all outstanding requests have been finished. Does lpfc_host_reset_handler() invoke scsi_done() for

I don't think so (ends up calling lpfc_sli_cancel_iocbs() via lpfc_hba_down_post() after shutting down the mailbox) but I've not seen the EH escalate all the way to host reset in most of my testing - ... The problem is that getting to this stage can take a very long time - much longer than most cluster's node eviction timer for e.g. which is the source of much of the complaint about this behaviour.

outstanding requests ? If not, how about modifying lpfc_host_reset_handler() such that it finishes all outstanding requests if the remote port is not reachable ?

It does call the done() function on the outstanding command IOCBs after the lpfc_reset_flush_io_context() call aborts them. The problem is that they are returned with ScsiResult(DID_REQUEUE, 0), which basically queues them back to the port as long as the port is still up. This results in the commands hanging out until their timeouts expire (if the device isn't responding).

If the device does resume after the reset, in the case of a tape device it is possible to corrupt the tape, because the 2900's get trapped by the TUR in the eh routines depending on which commands were hung. Take write for example: the reset can result in a tape rewind, and when the write gets fired back at the device the tape is at BOT, which effectively erases all data already on the tape. Whoops!

Also, as I stated elsewhere, in my testing it's impossible to escalate beyond the flush_io_context() call in lpfc_device_reset_handler() because it always returns true if the card firmware is responding.
Re: [PATCH V3 1/4] Encapsulate scsi_do_report_luns
On 3/7/2013 9:47 AM, Elliott, Robert (Server Storage) wrote:

+int scsi_do_report_luns(struct scsi_device *sdev, int length,
+ * We can get a UNIT ATTENTION, for example a power on/reset, so
+ * retry a few times (like sd.c does for TEST UNIT READY).
+ * Experience shows some combinations of adapter/devices get at
+ * least two power on/resets.
+ for (retries = 0; retries < 3; retries++) {
+ SCSI_LOG_SCAN_BUS(3, printk(KERN_INFO "scsi scan: Sending"
+ " REPORT LUNS to %s (try %d)\n", devname,
+ retries));
+ result = scsi_execute_req(sdev, scsi_cmd, DMA_FROM_DEVICE,
+ lun_data, length, &sshdr,
+ SCSI_TIMEOUT + 4 * HZ, 3, NULL);

There's no guarantee that you'll get no more than two unit attention conditions at any particular time;

Actually, if you're getting any unit attentions from a report luns the device is broken. SAM5 5.14: "if a REPORT LUNS command enters the enabled command state, the device server shall process the REPORT LUNS command and shall not report any unit attention conditions".

This is not new behavior either. There are a couple of other places that say similar things; INQUIRY and REPORT LUNS get special status for UA. That is how you can scan a target/lun configuration without interfering with its operation.

Personally, I think the TUR in the mid layer is incorrect: the TUR functionality needs to be hoisted higher up the stack, and the mid layer needs to use inquiry to validate device communications. (Got a patch for that too, but no point in posting it, as it will be ignored.)
Re: [PATCH V3 1/4] Encapsulate scsi_do_report_luns
On 3/7/2013 11:30 AM, James Bottomley wrote:

This also means we can't go through the linux SCSI subsystem changing behaviour based on what SAM says the behaviour should be. Most of what the SCSI subsystem does is an accumulation based on years of trying to fix it for annoying and out of spec devices.

Well, I wasn't suggesting removing the retries for this patch, because, yes, there are a lot of non-compliant devices. I was complaining about a case where there are known problems with the way the code is executing on non-broken devices. Basically, prioritizing the functionality of a broken device over the functionality of a working one.
Re: [PATCH v2][RFC] scsi_transport_fc: Implement I_T nexus reset
On 3/7/2013 1:19 PM, Mike Christie wrote:

What happens for lpfc? It seems __fc_remote_port_delete ends up calling the fast io fail code right away and that sets FC_RPORT_FAST_FAIL_TIMEDOUT. We will then call lpfc_terminate_rport_io which only will send aborts for the commands. We will then call fc_block_scsi_eh above and that returns FAST_IO_FAIL and we will pass that back up to the scsi eh right away.

For lpfc, you never get to the code. Or rather, when I was testing it, I couldn't find any way to propagate an error beyond the initial lpfc_reset_flush_io_context() call in lpfc_device_reset_handler(). That call pretty much always returns success independent of the remote device because the firmware acks the context clear aborts, resulting in the outstanding iocb count being zero (independent of both the mid layer status and the actual device state).

Result: all the code beyond the device reset handler never gets called.
Re: [PATCH v2][RFC] scsi_transport_fc: Implement I_T nexus reset
On 3/7/2013 2:20 PM, Mike Christie wrote:

On 03/07/2013 02:13 PM, Jeremy Linton wrote: For lpfc, you never get to the code. Or rather when I was testing it, I couldn't find any way to propagate an error beyond the initial lpfc_reset_flush_io_context() call in lpfc_device_reset_handler(). That call pretty much always returns success independent of the remote device because the firmware acks the context clear aborts, resulting in the outstanding iocb count being zero (independent of both the mid layer status and the actual device state).

Your lpfc patch fixes that right?

Yes. It allows the device reset to fail if the device doesn't respond to the task mgmt request, or rejects it, etc. It doesn't unjam the commands that get aborted by the flush_io_context() call. Those have to depend on their timeouts. That is another patch...
Re: [RESEND][PATCH] lpfc should check return status for task mgmt IOCBs
Ping? Comments, suggestions, rejections for this patch?

I understand it's a little long, but it seems checking the return status from a task management routine could be considered important. Plus, it helps to bring the behavior in line with the other LLDs.

Thanks, Jeremy Linton
[RESEND][PATCH] lpfc should check return status for task mgmt IOCBs (now with correct code formatting)
I realized after I sent the last patch that it was the wrong file (incorrectly formatted) and missing the signature. That has been corrected in this version. Other than that, the two patches are functionally identical.

This patch adds code to the lpfc driver to check the return status from the firmware/wire for task mgmt commands. As the firmware tends to also return IOSTAT_FCP_RSP_ERROR, indicating an attached FCP_RSP info field, there is a new function to parse the FCP RSP info field. If the info field indicates the given task mgmt function succeeded, then the error status is cleared and the command is marked as completing successfully.

The lun and target reset callbacks have been slightly tweaked to run the flush io context routines only if the reset is successful. The return code from the reset routines now correctly indicates whether the reset was successful, allowing the mid layer to escalate the error handling. I considered simply returning SUCCESS from the target reset callback, as that approximates the behavior before this patch. That stops the bus/host reset from being called while allowing lun reset to become a target reset if necessary.

As part of this patch a couple of general code errors were also corrected. The ulpFCP2Rcvy bit is now set/cleared for task management routines in the same way as SCSI commands, some code which could never execute has been removed from lpfc_sli_issue_iocb_wait(), and the out of resources error in lpfc_prep_seq is now tagged as an IOSTAT_LOCAL_REJECT as there isn't an FCP RSP associated with the error.
Signed-off-by: Jeremy Linton jlin...@tributary.com

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 60e5a17..b940f04 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -4080,7 +4080,9 @@ lpfc_scsi_prep_task_mgmt_cmd(struct lpfc_vport *vport,
         }
         if (ndlp->nlp_fcp_info & NLP_FCP_2_DEVICE) {
                 piocb->ulpFCP2Rcvy = 1;
-        }
+        } else
+                piocb->ulpFCP2Rcvy = 0;
+
         piocb->ulpClass = (ndlp->nlp_fcp_info & 0x0f);
 
         /* ulpTimeout is only one byte */
@@ -4569,6 +4571,73 @@ lpfc_taskmgmt_name(uint8_t task_mgmt_cmd)
         }
 }
 
+
+/**
+ * lpfc_check_fcp_rsp - check the returned fcp_rsp to see if task failed
+ * @vport: The virtual port for which this call is being executed.
+ * @lpfc_cmd: Pointer to lpfc_scsi_buf data structure.
+ *
+ * This routine checks the FCP RSP INFO to see if the tsk mgmt command succeded
+ *
+ * Return code :
+ *   0x2003 - Error
+ *   0x2002 - Success
+ **/
+
+static int
+lpfc_check_fcp_rsp(struct lpfc_vport *vport, struct lpfc_scsi_buf *lpfc_cmd)
+{
+        struct fcp_rsp *fcprsp = lpfc_cmd->fcp_rsp;
+        uint32_t rsp_info;
+        uint32_t rsp_len;
+        uint8_t rsp_info_code;
+        int ret = FAILED;
+
+
+        if (fcprsp == NULL)
+                lpfc_printf_vlog(vport, KERN_INFO, LOG_FCP,
+                                 "0702X fcp_rsp is missing\n");
+        else {
+                rsp_info = fcprsp->rspStatus2;
+                rsp_len = be32_to_cpu(fcprsp->rspRspLen);
+                rsp_info_code = fcprsp->rspInfo3;
+
+
+                lpfc_printf_vlog(vport, KERN_INFO,
+                                 LOG_FCP,
+                                 "0702XX fcp_rsp valid 0x%x,"
+                                 " rsp len=%d code 0x%x\n",
+                                 rsp_info,
+                                 rsp_len, rsp_info_code);
+
+                if ((fcprsp->rspStatus2 & RSP_LEN_VALID) && (rsp_len == 8)) {
+                        switch (rsp_info_code) {
+                        case RSP_NO_FAILURE:
+                                lpfc_printf_vlog(vport, KERN_INFO, LOG_FCP,
+                                                 "0702XX Task Mgmt actually OK,"
+                                                 " cancel error\n");
+                                ret = SUCCESS;
+                                break;
+                        case RSP_TM_NOT_SUPPORTED: /* TM rejected */
+                                lpfc_printf_vlog(vport, KERN_INFO, LOG_FCP,
+                                                 "0702XX Target rejected task "
+                                                 "management\n");
+                                break;
+                        case RSP_TM_NOT_COMPLETED: /* TM failed */
+                                lpfc_printf_vlog(vport, KERN_INFO, LOG_FCP,
+                                                 "0702XX Target failed TM\n");
+                                break;
+                        case RSP_TM_INVALID_LU: /* TM to invalid LU! */
+                                lpfc_printf_vlog(vport, KERN_INFO, LOG_FCP,
+                                                 "0702XX Task Mgmt "
+                                                 "to invalid LUN\n");
+                                break;
+                        }
+                }
+        }
+        return ret;
+}
+
 /**
  * lpfc_send_taskmgmt - Generic SCSI Task Mgmt Handler
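For readers without the lpfc headers at hand, the decision made by lpfc_check_fcp_rsp() can be restated as a standalone sketch. The names below are hypothetical stand-ins, not the driver's definitions. The 0x2002/0x2003 return values come from the patch comment, and codes 0x00 (success) and 0x04 (rejected) match the traces posted with the earlier version of this patch; the 0x05 and 0x09 values are recalled from the FCP RSP_CODE numbering and should be treated as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Midlayer-style return values, as listed in the patch comment. */
#define EH_SUCCESS 0x2002
#define EH_FAILED  0x2003

/* Hypothetical names for the FCP_RSP_INFO response codes. */
enum tm_rsp_code {
	TM_RSP_COMPLETE      = 0x00, /* task management function done */
	TM_RSP_REJECTED      = 0x04, /* target rejected the TM        */
	TM_RSP_FAILED        = 0x05, /* assumed: TM function failed   */
	TM_RSP_INCORRECT_LUN = 0x09  /* assumed: TM sent to wrong LUN */
};

/* Only a valid, 8-byte response info field carrying code 0 counts
 * as a successful task management function. */
static int tm_result_from_rsp_info(bool rsp_len_valid, uint32_t rsp_len,
				   uint8_t rsp_code)
{
	if (!rsp_len_valid || rsp_len != 8)
		return EH_FAILED;
	return rsp_code == TM_RSP_COMPLETE ? EH_SUCCESS : EH_FAILED;
}

/* Human-readable label for logging. */
static const char *tm_rsp_name(uint8_t rsp_code)
{
	switch (rsp_code) {
	case TM_RSP_COMPLETE:      return "complete";
	case TM_RSP_REJECTED:      return "rejected";
	case TM_RSP_FAILED:        return "failed";
	case TM_RSP_INCORRECT_LUN: return "incorrect LUN";
	default:                   return "unknown";
	}
}
```

Defaulting to failure mirrors the patch: a missing or malformed response info field is never treated as a successful reset.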
Re: [GIT PULL] Final round of SCSI updates for the 3.8+ merge window
On 3/1/2013 9:06 AM, James Bottomley wrote:

The results were interesting, there are some really strange things that happen in some of the LLD error paths. It's obvious that error injection is not part of testing many of them, and what at first glance should be a fairly straightforward error can create quite a mess. So anyone sending any kind of reset (especially without the ESCALATE flag which tends to isolate the error handling) to the LLDs should be aware that behavior between them can vary significantly.

So the patch does seem to have dangerous side effects.

Those are due to bugs in the LLDs that are there regardless of that patch. For example, the lpfc patch I posted a couple of days ago fixes the lpfc driver so that it actually checks the return status from the task management IOCBs being sent to the firmware. As it stands, the reset paths in the lpfc driver always return SUCCESS independently of the status of any aborts or resets being sent as part of the reset handlers. This is completely non-obvious at first glance at the code.

This means that the error handling behavior of lpfc is significantly different (and not necessarily better) than that of the zfcp and qlogic drivers I also tested. I didn't find any cases where this patch makes the problem worse; in fact, in general the behavior is significantly better.
[PATCH][BUG] lpfc doesn't handle failures in lpfc_send_taskmgmt()
The lpfc_send_taskmgmt() routine fails to check the return IOCB from the firmware. This means that all taskmgmt functions appear to complete even when they are failing due to device failures or task mgmt errors. This patch corrects this by checking the iocb.ulpStatus after the command has completed. Of course, even when the command completes successfully the firmware sets IOSTAT_FCP_RSP_ERROR. This indicates that the driver needs to verify the return code in the FCP RSP. So a new routine lpfc_check_fcp_rsp() has been added which verifies that the RSP has an info field, and that the info field indicates success.

I've also added a check to see if the task mgmt function succeeds in the reset handlers before running lpfc_reset_flush_io_context(). In a way this is bad because now it's possible to actually fall through the mid layer error handlers into the bus and host reset logic. This behavior itself changed not too long ago when the io_context calls were added. The lpfc driver would never get past device_reset_handler() because it would _ALWAYS_ return success even if the io_context failed to abort properly, because the firmware would handshake the flushes. This leads to another set of bugs when there are actually commands hung against the device. I have a partial set of patches to fix that problem too.

Trace with successful LU reset.
[16785.323122] lpfc :10:00.1: 3:(0):0702 Issue FCP_LUN_RESET to TGT 5 LUN 510 rpi xa nlp_flag x8000 Data: x0 x4
[16785.323329] lpfc :10:00.1: 3:0336 Rsp Ring 0 error: IOCB Data: xff20 x60 x0 x0 xfe x0 x28208ce x3ca29d12
[16785.323349] lpfc :10:00.1: 3:0331 IOCB wake signaled
[16785.323356] lpfc :10:00.1: 3:(0):0727 TMF FCP_LUN_RESET to TGT 5 LUN 510 failed (1, 254) iocb_flag x6
[16785.323359] lpfc :10:00.1: 3:(0):0702XX fcp_rsp valid 0x1, rsp len=8 code 0x0
[16785.323362] lpfc :10:00.1: 3:(0):0702XX Task Mgmt actually OK, cancel error
[16785.323366] lpfc :10:00.1: 3:(0):0713 SCSI layer issued Device Reset (5, 510) return x2002
[16785.323562] scsi_reset_provider: waking up host to restart after TMF

Trace with LS reject Target reset.

[16870.975793] lpfc :10:00.1: 3:(0):0702 Issue FCP_TARGET_RESET to TGT 5 LUN 510 rpi xa nlp_flag x8000 Data: x0 x4
[16870.976043] lpfc :10:00.1: 3:0336 Rsp Ring 0 error: IOCB Data: xff20 x60 x0 x0 xfe x0 x28408d1 x3ca29d12
[16870.976061] lpfc :10:00.1: 3:0331 IOCB wake signaled
[16870.976067] lpfc :10:00.1: 3:(0):0727 TMF FCP_TARGET_RESET to TGT 5 LUN 510 failed (1, 254) iocb_flag x6
[16870.976071] lpfc :10:00.1: 3:(0):0702XX fcp_rsp valid 0x1, rsp len=8 code 0x4
[16870.976074] lpfc :10:00.1: 3:(0):0702XX Target rejected task management
[16870.976078] lpfc :10:00.1: 3:(0):0723 SCSI layer issued Target Reset (5, 510) return x2003

Trace with bad device failing to respond to target reset.

[17383.880074] lpfc :10:00.1: 3:(0):0702 Issue FCP_TARGET_RESET to TGT 5 LUN 510 rpi xa nlp_flag x8000 Data: x0 x6
[17443.116283] lpfc :10:00.1: 3:0336 Rsp Ring 0 error: IOCB Data: x120 x60 x0 x0 x2 x0 x2408d3 x10229d32
[17443.116310] lpfc :10:00.1: 3:0331 IOCB wake signaled
[17443.116316] lpfc :10:00.1: 3:(0):0727 TMF FCP_TARGET_RESET to TGT 5 LUN 510 failed (3, 2) iocb_flag x6
[17443.116321] lpfc :10:00.1: 3:(0):0723 SCSI layer issued Target Reset (5, 510) return x2003

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 60e5a17..0ff3883 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -4081,6 +4081,9 @@ lpfc_scsi_prep_task_mgmt_cmd(struct lpfc_vport *vport,
         if (ndlp->nlp_fcp_info & NLP_FCP_2_DEVICE) {
                 piocb->ulpFCP2Rcvy = 1;
         }
+        else
+                piocb->ulpFCP2Rcvy = 0;
+
         piocb->ulpClass = (ndlp->nlp_fcp_info & 0x0f);
 
         /* ulpTimeout is only one byte */
@@ -4569,6 +4572,76 @@ lpfc_taskmgmt_name(uint8_t task_mgmt_cmd)
         }
 }
 
+
+/**
+ * lpfc_check_fcp_rsp - check the returned fcp_rsp to see if task failed
+ * @vport: The virtual port for which this call is being executed.
+ * @lpfc_cmd: Pointer to lpfc_scsi_buf data structure.
+ *
+ * This routine checks the FCP RSP INFO to see if the tsk mgmt command succeded
+ *
+ * Return code :
+ *   0x2003 - Error
+ *   0x2002 - Success
+ **/
+
+static int
+lpfc_check_fcp_rsp(struct lpfc_vport *vport,struct lpfc_scsi_buf *lpfc_cmd)
+{
+        struct fcp_rsp *fcprsp = lpfc_cmd->fcp_rsp;
+        uint32_t rsp_info;
+        uint32_t rsp_len;
+        uint8_t rsp_info_code;
+        int ret=FAILED;
+
+
+        if (fcprsp==NULL)
+        {
+                lpfc_printf_vlog(vport, KERN_INFO, LOG_FCP,
+                                 "0702X fcp_rsp is missing\n");
+        }
+        else
+        {
+                rsp_info = fcprsp->rspStatus2;
+                rsp_len = be32_to_cpu(fcprsp->rspRspLen);
+                rsp_info_code=fcprsp->rspInfo3;
+
+
+                lpfc_printf_vlog(vport, KERN_INFO,
+                                 LOG_FCP,
+                                 "0702XX fcp_rsp valid 0x%x,"
+                                 " rsp len=%d code 0x%x\n",
+                                 rsp_info,
+                                 rsp_len,rsp_info_code);
+
+                if ( (fcprsp->rspStatus2&RSP_LEN_VALID) && (rsp_len==8) ) {
+                        switch (rsp_info_code) {
+                        case
Re: [PATCH v2] SG_SCSI_RESET ioctl: add no_escalate values
Tested-by: Jeremy Linton jlin...@tributary.com

I tested this patch in an environment where the lun and target reset is failing because the target device is misbehaving. This patch appears to work as advertised.

That said, I changed my testing methodology for this patch (vs the one I originally posted). The results were interesting: there are some really strange things that happen in some of the LLD error paths. It's obvious that error injection is not part of testing many of them, and what at first glance should be a fairly straightforward error can create quite a mess. So anyone sending any kind of reset (especially without the ESCALATE flag, which tends to isolate the error handling) to the LLDs should be aware that behavior between them can vary significantly.
Re: [PATCH] Use a more selective error recovery strategy based on device capabilities
On 2/14/2013 5:42 PM, Elliott, Robert (Server Storage) wrote: Each logical unit is independent and is allowed to be different.

I was actually just thinking about the target reset and IT reset flags, two flags which affect the I_T, not the I_T_L. For the target reset it's probably a small proportion of devices anyway. The patch already disables target reset if the device is known to support lun reset.
Re: [PATCH] scsi: Allow 64-bit LUNs during report lun scan
On 2/13/2013 9:37 PM, James Bottomley wrote: What advantage does this have over setting max_lun to ~0?

Actually, after having all those other discussions, I've come to see the elegance in this suggestion.
Re: [PATCH] SG_SCSI_RESET ioctl: add no_escalate values
On 2/15/2013 1:39 PM, Douglas Gilbert wrote: Further to the thread titled: [PATCH] SG_SCSI_RESET ioctl should only perform requested operation by Jeremy Linton a patch is presented that adds no_escalate versions to the existing ioctl. This should not break any existing code.

Looks good, I will apply it here and try it out.
Re: [PATCH] scsi: Allow 64-bit LUNs during report lun scan
On 2/13/2013 9:38 PM, James Bottomley wrote: Yes. The two functions are simple transforms ensuring that we can pack up to two levels of luns into a u32 whatever address method is used. At the time it was done, no array or other extant system went beyond this. At the end of the day, a LUN is just a handle, so even if we go to 64 bits we're still going to be packing the address method into the logical unit number.

Ok, I will buy that (it probably violates SAM5, 4.7.1, but no big deal). Two points.

First, this requires basically every adapter capable of receiving address method!=0 LUNs to set the 64-bit capable flag that is included in this patch. Otherwise the "scsi: %s lun%d has a LUN larger than allowed by the host adapter\n" path fires even for a small number of luns, because the address method bit creates a lun > max_luns in all cases.

Second, it's possible with address method 11b that none of the devices are actually visible even with this patch, as a device that chooses to use address method=11b and one of the 16 bit addressing methods gets its LSB truncated by the 32-bit return from scsilun_to_int(). Not that I have seen one of those, no one needs that many LUNs (chuckle).

So, the flag in this patch is somewhat misnamed, as it doesn't really support 64-bit luns. To stick to the existing method, scsilun_to_int needs to be u64.

BTW: Tiny syntax cleanup, scsilun_to_int() should have a return type of unsigned.
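To make the two points above concrete, here is a small user-space sketch of the packing being discussed. It is only a model of how I understand scsilun_to_int() to fold the first two 16-bit levels into 32 bits; the names lun8, pack_lun32, pack_lun64 and exceeds_max_lun are made up for illustration and are not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* 8-byte LUN as it appears in REPORT LUNS data (hypothetical wrapper type). */
struct lun8 {
	uint8_t b[8];
};

/*
 * Model of the 32-bit packing: bytes 0..1 land in bits 0..15,
 * bytes 2..3 in bits 16..31. Bytes 4..7 are silently dropped.
 */
uint32_t pack_lun32(const struct lun8 *l)
{
	uint32_t v = 0;
	unsigned int i;

	assert(l != 0);
	for (i = 0; i < 4; i += 2)
		v |= (((uint32_t)l->b[i] << 8) | l->b[i + 1]) << (i * 8);
	return v;
}

/* Same transform widened to 64 bits, so no level is truncated. */
uint64_t pack_lun64(const struct lun8 *l)
{
	uint64_t v = 0;
	unsigned int i;

	assert(l != 0);
	for (i = 0; i < 8; i += 2)
		v |= (((uint64_t)l->b[i] << 8) | l->b[i + 1]) << (i * 8);
	return v;
}

/* The scan-time comparison: packed value against the host's max_lun. */
int exceeds_max_lun(const struct lun8 *l, uint32_t max_lun)
{
	return pack_lun32(l) > max_lun;
}
```

With a flat-space LUN 1 (bytes 40 01 00 ...) the packed value is 0x4001, so it trips a max_lun of 512 even though only one LUN exists. A LUN whose distinguishing data sits in bytes 4..5 packs to the same 32-bit value as a plain LUN, which is the truncation problem.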
Re: [PATCH] Use a more selective error recovery strategy based on device capabilities
On 2/13/2013 8:43 PM, Michael Christie wrote: For the case where report supported TMFs is not supported can we just have the LLD return some new return code from the eh callback when it gets FUNCTION_REJECTED. scsi-ml would then clear the eh_*_ok bit, so at least it would not be called again..

Hmm, that seems like a good idea. The question is, does propagating the flag change to all the devices on the I_T make sense?
Re: [PATCH] scsi: Allow 64-bit LUNs during report lun scan
On 2/13/2013 9:37 PM, James Bottomley wrote: What advantage does this have over setting max_lun to ~0?

Is it possible the adapters have LUN resource limits as well as ID limits? In those cases it would be nice to notify the user that LUNs exist, but are not addressable with the given HBA. Of course, ignoring the address mode bits keeps this from working properly, as the max_lun needs to be set much larger than the actual supported lun limit.
Re: [PATCH] scsi: Allow 64-bit LUNs during report lun scan
On 2/14/2013 4:04 PM, Elliott, Robert (Server Storage) wrote: Like James notes, LUNs should generally be treated as opaque values.

I agree, except there is a max host lun check based on a decoded lun value. Not really sure why it's there, other than maybe some of the HBAs have resource issues with a large number of luns.

scsilun_to_int() does not appear to be used very much; I see 35 matches in linux-3.7-rc5. Perhaps the callers should be updated to support 64-bit LUNs and decide what to do if they cannot handle larger values.

Which is a perfectly valid fix, but if that is being done, why do any swizzling in scsilun_to_int()?
Re: [PATCH] scsi: Allow 64-bit LUNs during report lun scan
On 2/14/2013 4:04 PM, Elliott, Robert (Server Storage) wrote: Like James notes, LUNs should generally be treated as opaque values.

Maybe another issue to consider is how they are being displayed in userland. A device with two luns using one of the alternative lun addressing methods is going to get some pretty strange looking lun numbers showing up in userspace if they aren't decoded properly.
Re: [PATCH] scsi: Allow 64-bit LUNs during report lun scan
On 2/13/2013 9:06 AM, Hannes Reinecke wrote: So add a new flag 'support_64bit_luns' to the scsi host and modify report lun scan to not check for max_luns during scanning if that flag is set. This will get rid of the

Along these lines, I don't think the scsilun_to_int() and int_to_scsilun() routines are correct for 2^14 luns. SAM 4.6 defines bits 6,7 of byte zero in the LU representation format as the address method. When set to 00b, that limits it to 256 luns, but the overflow into the bus ID probably works for some devices. Those routines should probably select/detect an alternative address method for luns > 256. Or am I missing something?
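For what it's worth, selecting and detecting the address method could look roughly like the sketch below. This is a user-space illustration only, covering single-level peripheral (00b) and flat space (01b) addressing as I read SAM; lun_encode, lun_decode and lun_method are hypothetical names, not the existing kernel routines.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Address method, bits 7..6 of byte 0 of the 8-byte LUN. */
enum lun_addr_method {
	LUN_PERIPHERAL = 0,	/* 00b: bus id in byte 0, 8-bit LUN in byte 1 */
	LUN_FLAT = 1,		/* 01b: 14-bit LUN in bytes 0..1 */
	LUN_LOGICAL_UNIT = 2,	/* 10b: not handled in this sketch */
	LUN_EXTENDED = 3	/* 11b: not handled in this sketch */
};

enum lun_addr_method lun_method(const uint8_t lun[8])
{
	return (enum lun_addr_method)(lun[0] >> 6);
}

/* Pick the smallest method that can represent the value; -1 if none can. */
int lun_encode(unsigned int value, uint8_t out[8])
{
	assert(out != 0);
	memset(out, 0, 8);
	if (value < 256) {
		out[1] = (uint8_t)value;		/* peripheral, bus 0 */
		return 0;
	}
	if (value < 16384) {
		out[0] = (uint8_t)(0x40 | (value >> 8));	/* flat space */
		out[1] = (uint8_t)(value & 0xff);
		return 0;
	}
	return -1;
}

/* Decode the first level back to a plain number; -1 if unsupported here. */
int lun_decode(const uint8_t lun[8])
{
	switch (lun_method(lun)) {
	case LUN_PERIPHERAL:
		return lun[1];
	case LUN_FLAT:
		return ((lun[0] & 0x3f) << 8) | lun[1];
	default:
		return -1;
	}
}
```

Encoding 300 this way yields bytes 41 2c, and decoding strips the method bits again, so userspace would see 300 rather than 16684.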
[PATCH] Use a more selective error recovery strategy based on device capabilities
Ideally, Linux should not be sending task management commands to devices that don't support the given task mgmt operation. This patch uses the REPORT SUPPORTED TASK MGMT FUNCTIONS command to enable or disable error recovery paths for a given device. For older devices, we make an educated guess about what kind of error recovery the device supports. This isn't going to be 100% accurate as it should probably take the transport as well as the SCSI version into account, but it is a start.

While this patch improves the error recovery paths for modern SCSI networks, the error recovery logic continues to fall through to host reset. It also continues to send bus and target resets in cases where they may affect working devices. I have a partial set of patches which attempt to make intelligent decisions in these cases, but they are far more intrusive and at this point not as clear cut.

Just in case...

Signed-off-by: Jeremy Linton jlin...@tributary.com
---
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c1b05a8..b249c2f 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -572,24 +572,25 @@ static int scsi_try_host_reset(struct scsi_cmnd *scmd)
 static int scsi_try_bus_reset(struct scsi_cmnd *scmd)
 {
 	unsigned long flags;
-	int rtn;
+	int rtn = FAILED ;
 	struct Scsi_Host *host = scmd->device->host;
 	struct scsi_host_template *hostt = host->hostt;
+	struct scsi_device *sdev = scmd->device;
 
 	SCSI_LOG_ERROR_RECOVERY(3, printk("%s: Snd Bus RST\n",
 					  __func__));
 
-	if (!hostt->eh_bus_reset_handler)
-		return FAILED;
+	if ((sdev->bus_reset_ok) && (hostt->eh_bus_reset_handler)) {
 
-	rtn = hostt->eh_bus_reset_handler(scmd);
+		rtn = hostt->eh_bus_reset_handler(scmd);
 
-	if (rtn == SUCCESS) {
-		if (!hostt->skip_settle_delay)
-			ssleep(BUS_RESET_SETTLE_TIME);
-		spin_lock_irqsave(host->host_lock, flags);
-		scsi_report_bus_reset(host, scmd_channel(scmd));
-		spin_unlock_irqrestore(host->host_lock, flags);
+		if (rtn == SUCCESS) {
+			if (!hostt->skip_settle_delay)
+				ssleep(BUS_RESET_SETTLE_TIME);
+			spin_lock_irqsave(host->host_lock, flags);
+			scsi_report_bus_reset(host, scmd_channel(scmd));
+			spin_unlock_irqrestore(host->host_lock, flags);
+		}
 	}
 
 	return rtn;
@@ -601,6 +602,7 @@ static void __scsi_report_device_reset(struct scsi_device *sdev, void *data)
 	sdev->expecting_cc_ua = 1;
 }
 
+
 /**
  * scsi_try_target_reset - Ask host to perform a target reset
  * @scmd:	SCSI cmd used to send a target reset
@@ -614,19 +616,26 @@ static void __scsi_report_device_reset(struct scsi_device *sdev, void *data)
 static int scsi_try_target_reset(struct scsi_cmnd *scmd)
 {
 	unsigned long flags;
-	int rtn;
+	int rtn = FAILED;
+	struct scsi_device *sdev = scmd->device;
 	struct Scsi_Host *host = scmd->device->host;
 	struct scsi_host_template *hostt = host->hostt;
 
-	if (!hostt->eh_target_reset_handler)
-		return FAILED;
+	if ((sdev->target_reset_ok) && (hostt->eh_target_reset_handler)) {
 
-	rtn = hostt->eh_target_reset_handler(scmd);
-	if (rtn == SUCCESS) {
-		spin_lock_irqsave(host->host_lock, flags);
-		__starget_for_each_device(scsi_target(scmd->device), NULL,
-					  __scsi_report_device_reset);
-		spin_unlock_irqrestore(host->host_lock, flags);
+		// TODO: Determine if other devices on this IT are experiencing
+		// issues. If not, return success without doing anything.
+		SCSI_LOG_ERROR_RECOVERY(3, printk("%s: Snd target RST\n",
+						  __func__));
+
+		rtn = hostt->eh_target_reset_handler(scmd);
+
+		if (rtn == SUCCESS) {
+			spin_lock_irqsave(host->host_lock, flags);
+			__starget_for_each_device(scsi_target(scmd->device), NULL,
+						  __scsi_report_device_reset);
+			spin_unlock_irqrestore(host->host_lock, flags);
+		}
 	}
 
 	return rtn;
@@ -644,24 +653,36 @@ static int scsi_try_target_reset(struct scsi_cmnd *scmd)
  */
 static int scsi_try_bus_device_reset(struct scsi_cmnd *scmd)
 {
-	int rtn;
+	int rtn = FAILED;
 	struct scsi_host_template *hostt = scmd->device->host->hostt;
+	struct scsi_device *sdev = scmd->device;
 
-	if (!hostt->eh_device_reset_handler)
-		return FAILED;
+	if ((sdev->task_unit_reset_ok) && (hostt->eh_device_reset_handler)) {
+		SCSI_LOG_ERROR_RECOVERY(3, printk("%s: Snd LUN RST\n",
+						  __func__));
+		rtn = hostt->eh_device_reset_handler(scmd);
+
+		if (rtn == SUCCESS)
+			__scsi_report_device_reset(scmd->device, NULL);
+	}
 
-	rtn = hostt->eh_device_reset_handler(scmd);
-	if (rtn == SUCCESS)
-		__scsi_report_device_reset(scmd->device, NULL);
 	return rtn;
 }
 
 static int scsi_try_to_abort_cmd(struct scsi_host_template *hostt, struct scsi_cmnd *scmd)
 {
-	if (!hostt->eh_abort_handler)
-		return FAILED;
+	int rtn = FAILED;
+	struct scsi_device *sdev = scmd->device;
+
+	if ((sdev->task_abort_ok) && (hostt->eh_abort_handler))
+	{
+		SCSI_LOG_ERROR_RECOVERY(3, printk("%s: Snd Host RST\n",
+						  __func__));
 
-	return hostt->eh_abort_handler(scmd);
+		rtn=hostt->eh_abort_handler(scmd);
+	}
+
+	return rtn;
 }
 
 static void scsi_abort_eh_cmnd(struct scsi_cmnd *scmd
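The part of the patch that fills in the *_ok flags is not shown above, so as a rough illustration only, here is how the first byte of a REPORT SUPPORTED TASK MANAGEMENT FUNCTIONS response might be mapped onto them. The bit positions are from my reading of SPC-4 (ATS 0x80, LURS 0x08, TRS 0x02) and should be checked against the spec; struct eh_caps and eh_caps_from_rstmf are hypothetical names, and the "no response means allow everything" fallback is an assumption standing in for the educated guess mentioned above. bus_reset_ok is left out because nothing in this response describes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte 0 bits of the response, as I read SPC-4 (assumption, verify). */
#define RSTMF_ATS	0x80	/* ABORT TASK supported */
#define RSTMF_LURS	0x08	/* LOGICAL UNIT RESET supported */
#define RSTMF_TRS	0x02	/* TARGET RESET supported */

/* Hypothetical mirror of the per-device flags used in the patch. */
struct eh_caps {
	int task_abort_ok;
	int task_unit_reset_ok;
	int target_reset_ok;
};

/*
 * buf/len: raw response data, or NULL/0 if the device rejected the
 * command. In that case every recovery step stays enabled.
 */
struct eh_caps eh_caps_from_rstmf(const uint8_t *buf, size_t len)
{
	struct eh_caps c = { 1, 1, 1 };

	assert(buf != NULL || len == 0);
	if (buf == NULL || len < 1)
		return c;

	c.task_abort_ok = (buf[0] & RSTMF_ATS) != 0;
	c.task_unit_reset_ok = (buf[0] & RSTMF_LURS) != 0;
	/* Skip target reset when the device is known to support lun reset. */
	c.target_reset_ok = (buf[0] & RSTMF_TRS) != 0 && !c.task_unit_reset_ok;
	return c;
}
```

A device answering 0x88 in byte 0 would get abort and lun reset enabled and target reset disabled, so the error handler would skip straight past the target reset step for it.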
Re: [PATCH] Use a more selective error recovery strategy based on device capabilities
On 2/12/2013 2:57 PM, Elliott, Robert (Server Storage) wrote: Ideally the device driver for the SCSI initiator port would report those attributes, and higher level code would combine them with support information from the device server (REPORT SUPPORTED TMF command, REPORT SUPPORTED OPCODES command, etc.) to decide what is supported.

Well, for the eh_xxx_handler functions, that is basically what happens now. The host driver can fail to set a callback for the eh_xxx_handlers if it doesn't support the operation. At that point, even if the target device supports a function (say target reset), if the host driver doesn't, then the target reset will be skipped. Of course, a number of the drivers define functions their underlying protocols don't support, for example bus reset on fibre channel, which I personally believe is an error.
[PATCH] SG_SCSI_RESET ioctl should only perform requested operation
From all the documentation I've found, it is not clear that users of the SG_SCSI_RESET ioctl may have their requests progress up the hierarchy of reset operations. Basically, requests for a SCSI_TRY_RESET_DEVICE may eventually result in a TARGET, BUS, or HOST reset.

The sg_reset utility hints at the error handling, but actually leads the user to believe that they should reissue the sg_reset command themselves to enact more aggressive reset functions. It also says: Note that a host reset and a bus reset may cause collateral damage.

In the interest of error isolation, I have attached a patch which changes the behavior. The existing behavior is obviously intentional, but I don't believe it's the best choice.

There are other alternatives to this patch. For one, to avoid the possibility of breaking an existing application, maybe SG_SCSI_RESET needs some new values like SG_SCSI_RESET_DEVICE_ONLY, while leaving the existing operations alone.

Just in case...

Signed-off-by: Jeremy Linton jlin...@tributary.com
---
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c1b05a8..6f05bc2 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1999,19 +1999,13 @@ scsi_reset_provider(struct scsi_device *dev, int flag)
 	switch (flag) {
 	case SCSI_TRY_RESET_DEVICE:
 		rtn = scsi_try_bus_device_reset(scmd);
-		if (rtn == SUCCESS)
-			break;
-		/* FALLTHROUGH */
+		break;
 	case SCSI_TRY_RESET_TARGET:
 		rtn = scsi_try_target_reset(scmd);
-		if (rtn == SUCCESS)
-			break;
-		/* FALLTHROUGH */
+		break;
 	case SCSI_TRY_RESET_BUS:
 		rtn = scsi_try_bus_reset(scmd);
-		if (rtn == SUCCESS)
-			break;
-		/* FALLTHROUGH */
+		break;
 	case SCSI_TRY_RESET_HOST:
 		rtn = scsi_try_host_reset(scmd);
 		break;
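To show the difference in behavior without a kernel, here is a small user-space simulation of the two policies. Everything in it is made up for illustration (reset_sim, reset_escalating, reset_only); only the 0x2002/0x2003 values mirror the kernel's SUCCESS and FAILED codes. Each level's outcome is canned, and the simulation counts how often each level gets invoked.

```c
#include <assert.h>

#define SIM_SUCCESS	0x2002	/* same value as the kernel's SUCCESS */
#define SIM_FAILED	0x2003	/* same value as the kernel's FAILED */

enum { RST_DEVICE, RST_TARGET, RST_BUS, RST_HOST, RST_LEVELS };

/* Canned result per level plus a call counter per level. */
struct reset_sim {
	int result[RST_LEVELS];
	int calls[RST_LEVELS];
};

static int try_level(struct reset_sim *s, int level)
{
	s->calls[level]++;
	return s->result[level];
}

/* Existing behavior: a failure falls through to the next, wider reset. */
int reset_escalating(struct reset_sim *s, int level)
{
	int rtn = SIM_FAILED;

	assert(level >= 0 && level < RST_LEVELS);
	for (; level < RST_LEVELS; level++) {
		rtn = try_level(s, level);
		if (rtn == SIM_SUCCESS)
			break;
	}
	return rtn;
}

/* Behavior with the patch: only the requested operation is performed. */
int reset_only(struct reset_sim *s, int level)
{
	assert(level >= 0 && level < RST_LEVELS);
	return try_level(s, level);
}
```

With device and target reset both failing, reset_escalating(&s, RST_DEVICE) ends up issuing a bus reset behind the caller's back, while reset_only() reports the failure and leaves the decision to escalate to the application.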
Re: [PATCH RFC 0/9] [SCSI] Enhanced sense and Unit Attention handling
On 1/24/2013 8:38 AM, Bart Van Assche wrote: Let me ask this another way. SAN users expect that the LUN list at the initiator side gets updated automatically after a SAN configuration change. How should a SAN system communicate to a SCSI initiator that the LUN list has been changed ? Some FC SAN systems send a LIP after a configuration change to force the initiator to rescan LUNs. But how to

What I think you're looking for is RSCN (Registered State Change Notification). Hook that, and then check the name server. This will tell you when ports get added/removed. You can then report luns against lun 0 of all the known target ports. This allows you to transparently detect changes. Otherwise, you run the risk of trapping UAs in the lower level portions of the stack that _REALLY_ need to be propagated to the controlling driver or application.
Re: [PATCH RFC 0/9] [SCSI] Enhanced sense and Unit Attention handling
On 1/28/2013 9:44 AM, Bart Van Assche wrote: when using Fibre Channel as transport layer. I'm looking for a solution that also works with other SCSI transports, e.g. iSCSI and SRP.

Doesn't iSCSI have a SNS server you can subscribe to that provides similar functionality (name services and port add/remove)? SAS has the BROADCAST functionality too which can act as an AEN.
Re: [PATCH][RFC] scsi_transport_fc: Implement I_T nexus reset
On 12/7/2012 3:20 PM, Mike Christie wrote: On 12/07/2012 03:05 PM, Jeremy Linton wrote: That said, it's far from perfect. The code (as I understand it) isn't differentiating between isolating the failure, or bringing out the big hammer in an attempt to correct problems on a specific I_T_L. If you drop/reset the I_T because one of the LUNs is misbehaving before verifying the status of other LUNs on the target, you risk interrupting operations to functional devices.

When this code is called the scsi eh has run the abort handler for each outstanding command and that has failed, and it has run the lun/device reset handler and that has failed (or the eh operations succeeded but the TUR checkup the scsi eh does failed).

I think my issue with the error handler (rather than this patch in particular) surrounds the fact that when scsi_eh_bus_device_reset (which maps to lun reset) fails, it falls to scsi_eh_target_reset, which issues a TARGET RESET, which then broadens the problem to devices which may be working fine and just happen to be on the same I_T. I think there should be some attempt to determine if there are other devices on the I_T, and whether they have failed, before going into target_reset. It looks like there may have been a plan to do that in bus_device_reset, but it doesn't appear to be complete.

Now, all that said, I have a few things I wonder about in the eh_bus_device_reset code. For one, the use of TUR rather than a command with a more straightforward return status like INQUIRY, which also preserves the check conditions.
Re: Error handling on FC devices
On 12/3/2012 1:15 AM, Hannes Reinecke wrote: Well, looking at QLogic and Emulex both emulate a bus reset with a loop over each target and invoke a target reset there. I somewhat fail to see the rationale behind it, other than emulating the bus reset behaviour on SPI.

It is actually a _VERY_ bad idea in multiple initiator tape environments with switched fibre, where the resets can affect devices that are visible but not owned/controlled by the machine broadcasting resets. Many tape environments operate this way, as the physical drives are assigned dynamically to initiators as necessary. In some cases (ACSLS) the machine/OS/backup applications aren't even homogeneous. The rewind and loss of PR, etc., if not handled properly by all the other machines on the SAN, can be quite disastrous.

It's also somewhat problematic even in single initiator environments, as the reset can affect devices not having problems, and the 6/2900's can get eaten by the logic attempting the reset, which leaves the user of a functional device in the dark that it was reset/rewound.

I was told last time I brought this up that it was impossible for a single device's failure to result in that bus reset path being called. That was patently false, as the problem was only tracked down because of a repeatable case of a single device failing in a manner which triggered progressively more aggressive recovery, culminating in the bus reset being called. The result was a single device cascading a failure to a bunch of functional devices and interrupting their operation.
Re: [patch 07/30] Incorrect SCSI transfer length computation from odd sized scsi_execute_async() transfers.
[EMAIL PROTECTED] wrote: From: Jeremy Linton [EMAIL PROTECTED] Any function which use scsi_execute_async() and transfers odd sized data that doesn't align correctly with the segment sizes may have its transfer length padded out to the closest segment size.

I would like to strongly suggest that Mike Christie's patch be used instead. http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg06032.html

I finally hit the case he was talking about (the block layer retries 0 length commands caused by a size mismatch) and it's ugly. I'm not really sure why my initial tests weren't hitting that case; I was trying to understand why some blocks were getting an extra command generated while others weren't. Suffice it to say, his patch fixes both problems: the incorrect transfer lengths and the extra 0 length transfer being generated.
Re: [PATCH][BUG] Incorrect SCSI transfer length computation from odd sized scsi_execute_async() transfers.
Mike Christie wrote: I think you needed some other bits in there. See this patch I tried just setting the bufflen first, and that still had problems. Could you try the patch here http://marc.info/?l=linux-scsi&m=117392208211297&w=2

I just read the thread. I didn't see any strange retries with my test case. I will try duplicating the problem tomorrow. Then I will apply your patch and rerun my test.

I'm curious: if this has been known since 2.6.19, why hasn't the patch propagated to the main kernel tree?
REQ_SPECIAL from scsi_execute_async()?
I was just looking at the REQ_SPECIAL handling and I was curious why REQ_SPECIAL isn't being set for commands being queued by scsi_execute_async()? It is set for scsi_execute() commands. Did someone overlook setting the flag or is this behavior intentional?
Re: Disabling block layer
Mark Lobo wrote: I had a question about disabling the block layer for SCSI devices. We have an embedded device, and it runs 2.4.30. We need to be able to support a lot of SCSI devices (in the thousands) for our device, and we talk to the devices via SG. We are facing a memory allocation problem after discovering a few thousand devices. For every device, there seems to be a lot of memory allocated in the block layer. This memory includes cache memory (which IIRC is reclaimable by the kernel memory subsystem when it needs it) and also pages that are used for the alloc_pages pool. My questions were relating to disabling the block layer for the devices. We always talk direct passthrough to the storage(except the local hard disk), and do not need the block layer at all.

You may consider something we experimented with here (for performance reasons). We basically recompiled one of the scsi drivers to call our own version of scsi_host_alloc() and then made calls to the queuecommand() routine directly. You then allow the kernel probe routines to only discover the first target with the local disk. I assume you know ahead of time which scsi cards you're using in your system. The point is that you could just build a heavily modified scsi driver with application specific hooks.

BTW we aren't currently doing this because in the end we got most of what we needed by writing a driver which replaces sg and bypasses most of the kernel without being as invasive. In the long run we may still use a modified LLDD, since the interfaces we depend on are changing a little too fast for our liking and we are not running any system devices on the interface cards we need to directly access.

I'm not sure how you would go about tearing down enough of the system that the device doesn't consume any resources, yet leave enough of it around to be accessible. I will be interested to find out what you end up doing.
Re: [PATCH] scsi_execute_async() should add to the tail of the queue
So instead of adding a parameter, we can make scsi_execute_async() decide for itself based on the SCSI command, with read/write I/Os taking the lowest priority.

This seems like a bad idea. I can come up with a number of cases where the priority of a request would be better chosen by a higher level subsystem than by a simple prioritization based on the command type. The original suggestion to provide both head and tail insertion options seems like the obvious solution, short of a full priority queuing system.
Re: [Bug 7026] CD/DVD burning with USB writer doesn't work
On Wednesday 06 December 2006 16:50, Mike Christie wrote: For iscsi, we could negotiate a value like MaxBurstLength which says don't send commands with a payload larger than that size. I would guess other transports have something similar. We have to check or make sure ... Oh yeah the exception I am thinking about may not be max sectors exactly but something close like iscsi's MaxBurstLength limit. Maybe iscsi LLDs are supposed to be translating that iscsi limit to max_sectors in which case we are talking about the same thing. For this limit we do not want

Sort of off topic, but the iSCSI MaxBurstLength doesn't set the max transfer size; it simply is the amount of data that can be sent without an R2T. If the transfer is larger, then you have to wait for the R2T. In practice it ends up controlling the _minimum_ amount of buffer space that needs to be available _before_ the transfer starts, otherwise performance sucks.
Re: [Bug 7026] CD/DVD burning with USB writer doesn't work
On Wednesday 06 December 2006 17:42, Jeremy Linton wrote: On Wednesday 06 December 2006 16:50, Mike Christie wrote: For iscsi, we could negotiate a value like MaxBurstLength which says don't send commands with a payload larger than that size. I would guess other transports have something similar. We have to check or make sure ... Oh yeah the exception I am thinking about may not be max sectors exactly but something close like iscsi's MaxBurstLength limit. Maybe iscsi LLDs are supposed to be translating that iscsi limit to max_sectors in which case we are talking about the same thing. For this limit we do not want

Sort of off topic, but the iSCSI MaxBurstLength doesn't set the max transfer size, it simply is the amount of data that can be sent without a R2T. If the transfer is larger then you have to wait for the R2T. In practice it ends up controlling the _minimum_ amount of buffer space that needs to be available _before_ the transfer starts, otherwise performance sucks.

Whoops, slight clarification: the MaxBurstLength is the max sent between R2Ts; what I described above is closer to the FirstBurstLength. What you guys are describing might better be the MaxRecvDataSegmentLength, but not really, since that parameter should be hidden within the iSCSI driver.
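To put numbers on the FirstBurstLength/MaxBurstLength distinction, here is a back-of-the-envelope sketch for a single WRITE. It assumes the target always requests a full MaxBurstLength per R2T, so the count is a lower bound (a target is free to ask for less per R2T), and it folds the negotiation of unsolicited data into a single flag. The names burst_plan and plan_write are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

struct burst_plan {
	uint32_t unsolicited;	/* bytes sent before any R2T arrives */
	uint32_t r2t_count;	/* minimum number of R2Ts for the rest */
};

/*
 * len:                 total write payload in bytes
 * first_burst:         negotiated FirstBurstLength
 * max_burst:           negotiated MaxBurstLength (must be non-zero)
 * unsolicited_allowed: non-zero if the session lets the initiator send
 *                      data without waiting for an R2T
 */
struct burst_plan plan_write(uint32_t len, uint32_t first_burst,
			     uint32_t max_burst, int unsolicited_allowed)
{
	struct burst_plan p = { 0, 0 };
	uint32_t rest;

	assert(max_burst > 0);
	if (unsolicited_allowed)
		p.unsolicited = len < first_burst ? len : first_burst;

	/* Everything past the first burst needs an R2T per MaxBurstLength. */
	rest = len - p.unsolicited;
	p.r2t_count = rest / max_burst + (rest % max_burst != 0);
	return p;
}
```

For a 1 MiB write with FirstBurstLength 64 KiB and MaxBurstLength 256 KiB, 64 KiB goes out immediately and the remaining 960 KiB needs at least four R2Ts. Neither value caps the command size, which is why neither maps cleanly onto max_sectors.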