RE: [PATCH] sd: Limit WRITE SAME / WRITE SAME(16) w/UNMAP length for certain devices
I agree that it is disappointing that so many vendors seem to have trouble reading the spec. This case is pretty clear.

The best the T10 committee could do is add a bit to indicate that the device uses the MAXIMUM UNMAP LBA COUNT field, rather than the MAXIMUM WRITE SAME LENGTH field, as the length limit for unmaps done via WRITE SAME w/UNMAP=1. BUT, I'll be very clear that any such new bit would be defined so that bit=0 is the backward-compatible setting for COMPLIANT devices, and bit=1 is the new setting for "backwards" devices - which means they would STILL require a firmware change to tell you they are backwards, and you'd STILL need a blacklist for their older revisions. And this would just make the host's job all that much harder!

Once a device is broken (violates the spec), there is not very much we can do in the spec to fix it (they have to fix their broken device).

Fred

-----Original Message-----
From: Ewan D. Milne [mailto:emi...@redhat.com]
Sent: Wednesday, September 27, 2017 12:28 PM
To: Martin K. Petersen
Cc: linux-scsi@vger.kernel.org; Knight, Frederick
Subject: Re: [PATCH] sd: Limit WRITE SAME / WRITE SAME(16) w/UNMAP length for certain devices

On Mon, 2017-09-25 at 21:46 -0400, Martin K. Petersen wrote:
> Ewan,
>
> > Some devices do not support a WRITE SAME / WRITE SAME(16) with the
> > UNMAP bit set up to the length specified in the MAXIMUM WRITE SAME
> > LENGTH field in the block limits VPD page (or, the field is zero,
> > indicating there is no limit). Limit the length by the MAXIMUM UNMAP
> > LBA COUNT value. Otherwise the command might be rejected.
>
> From SBC4:
>
> "A MAXIMUM UNMAP LBA COUNT field set to a non-zero value indicates the
> maximum number of LBAs that may be unmapped by an UNMAP command"
>
> Note that it explicitly states "UNMAP command" and not "unmap
> operation".
>
> "A MAXIMUM WRITE SAME LENGTH field set to a non-zero value indicates
> the maximum number of contiguous logical blocks that the device server
> allows to be unmapped or written in a single WRITE SAME command."
>
> It says "unmapped or written" and "WRITE SAME command".
>
> The spec is crystal clear. The device needs to be fixed. We can
> blacklist older firmware revs.
>

Yes, I know that is what SBC-4 says, and I agree that the devices are not conforming. Unfortunately, I've come across 3 different arrays now from 3 different manufacturers that exhibit this behavior.

cc: Fred Knight for his opinion on this (NetApp was not one of the arrays that I've run into, though).

-Ewan
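For readers following the thread, the host-side workaround described in the patch summary can be sketched roughly as below. This is only an illustration in Python, not the sd driver code: the function names, the `limit_by_unmap_count` quirk flag, and the choice to ignore a zero MAXIMUM UNMAP LBA COUNT are all assumptions made for the example.

```python
from typing import Iterator, Optional, Tuple


def ws_unmap_limit(max_write_same_len: int,
                   max_unmap_lba_count: int,
                   limit_by_unmap_count: bool) -> Optional[int]:
    """Blocks allowed in one WRITE SAME w/UNMAP=1, or None for "no limit".

    Hypothetical helper. A zero MAXIMUM WRITE SAME LENGTH is taken to mean
    no limit is reported, as in the patch description. The quirk flag marks
    a non-conforming device that also enforces MAXIMUM UNMAP LBA COUNT.
    """
    limit = max_write_same_len if max_write_same_len != 0 else None
    # Assumption: only a non-zero MAXIMUM UNMAP LBA COUNT is used to clamp.
    if limit_by_unmap_count and max_unmap_lba_count != 0:
        if limit is None or max_unmap_lba_count < limit:
            limit = max_unmap_lba_count
    return limit


def split_range(start_lba: int, nr_blocks: int,
                limit: Optional[int]) -> Iterator[Tuple[int, int]]:
    """Yield (lba, count) pairs so that no single command exceeds limit."""
    while nr_blocks > 0:
        chunk = nr_blocks if limit is None else min(nr_blocks, limit)
        yield start_lba, chunk
        start_lba += chunk
        nr_blocks -= chunk


# A quirky device reporting "no limit" for WRITE SAME but 1024 for UNMAP.
limit = ws_unmap_limit(0, 1024, limit_by_unmap_count=True)
commands = list(split_range(0, 2500, limit))
```

With the quirk flag cleared, the same device would be sent a single 2500-block command, which is what a conforming device is required to accept.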
RE: [Lsf] Notes from the four separate IO track sessions at LSF/MM
There are multiple possible situations being intermixed in this discussion. First, I assume you're talking only about random access devices (if you try transport level error recovery on a sequential access device - tape or SMR disk - there are lots of additional complexities).

Failures can occur at multiple places:
a) Transport layer failures that the transport layer is able to detect quickly;
b) SCSI device layer failures that the transport layer never even knows about.

For (a) there are two competing goals. If a port drops off the fabric and comes back again, should you be able to just recover and continue? But how long do you wait during that drop? Some devices use this technique to "move" a WWPN from one place to another. The port drops from the fabric, and a short time later, shows up again (the WWPN moves from one physical port to a different physical port). There are FC driver layer timers that define the length of time allowed for this operation. The goal is fast failover, but not too fast - because too fast will break this kind of "transparent failover". This timer also allows for the "OH crap, I pulled the wrong cable - put it back in; quick" kind of stupid user bug.

For (b) the transport never has a failure. A LUN (or a group of LUNs) has an ALUA transition from one set of ports to a different set of ports. Some of the LUNs on the port continue to work just fine, but others enter ALUA TRANSITION state so they can "move" to a different part of the hardware. After the move completes, you now have different sets of optimized and non-optimized paths (or possibly standby, or unavailable). The transport will never even know this happened. This kind of "failure" is handled by the SCSI layer drivers.

There are other cases too, but these are the most common.

Fred

-----Original Message-----
From: lsf-boun...@lists.linux-foundation.org [mailto:lsf-boun...@lists.linux-foundation.org] On Behalf Of Bart Van Assche
Sent: Thursday, April 28, 2016 11:54 AM
To: James Bottomley; Mike Snitzer
Cc: linux-bl...@vger.kernel.org; l...@lists.linux-foundation.org; device-mapper development; linux-scsi
Subject: Re: [Lsf] Notes from the four separate IO track sessions at LSF/MM

On 04/28/2016 08:40 AM, James Bottomley wrote:
> Well, the entire room, that's vendors, users and implementors
> complained that path failover takes far too long. I think in their
> minds this is enough substance to go on.

The only complaints I heard about path failover taking too long came from people working on FC drivers. Aren't SCSI transport layer implementations expected to fail I/O after fast_io_fail_tmo expired instead of waiting until the SCSI error handler has finished? If so, why is it considered an issue that error handling for the FC protocol can take very long (hours)?

Thanks,

Bart.
___
Lsf mailing list
l...@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/lsf
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: T10 adds locally assigned UUID designation descriptor
To add a little more information, the reason for the "NO" votes was as follows:

If a storage device implements this TODAY - and the only unique identifier in VPD page 0x83 is the UUID identifier - then any existing shipping host will not find any unique identifier that it recognizes. That host could do any number of other things (including but not limited to):
1) prevent the device from being used;
2) enable only a SINGLE path to the device (do not allow MPIO to operate on a device for which it cannot find a unique ID);
3) enable MPIO to the device using the unique ID of "NONE".

Both 1 & 2 are workable situations. BUT, #3 is a problem; not if you have just 1 of these UUID only devices, but if you have a bunch of them, and the host incorrectly assumes they are all the SAME device, and it tries to do IO based on that assumption.

So that is the background. The "NO" votes were based on the belief by those companies that situation #3 was a foregone conclusion, and they didn't want to add any new features to the storage until after the hosts added code to support those new features - which the hosts can't do until there are storage devices built (based on a standard) which they can use for testing - CATCH-22.

The "YES" votes were based on the assumption that storage would not be configured with ONLY the UUID value unless the storage manager knew that the host to which it would be connected could actually support a UUID only storage system. A configuration of a UUID only storage and a host that does not support UUID only storage is a configuration error. No different than a "thin provisioned" LUN being configured for use by a host that prohibits the use of thin provisioned LUNs. Basically it is assumed that initial deployments of UUID identifiers would be in conjunction with other (NAA/EUI/etc) identifiers in page 0x83. Remember, real H/W vendors already own NAA and EUI values. The primary creator of the UUID form will be S/W defined storage LUNs (as indicated in the preface material in the proposal), where there is no NAA or EUI available.

It simply goes back to the catch-22 - which comes first, the host support or the storage device support. The solution is expected to show up in the next revision of the standard - there will be a temporary editor's note added indicating something along the lines of: a UUID only VPD page 0x83 should not be implemented in a storage device until it is known that the host supports such a configuration. That note will be removed before final ANSI/ISO publication, but it will remain during the draft cycle. At least, that is where the discussion ended up last I knew - we'll find out at the next meeting.

There was some minor discussion about the lack of uniqueness guarantees, but basically the committee said, you get what you get, and if you don't like it, don't use it.

You can also see that the data structure is already primed for the addition of the 32 byte UUID value (if/when anyone ever invents such a beast, we'll examine whether it too should be added).

So I hope that clarifies some of the background around the controversy.

Fred Knight

-----Original Message-----
From: Douglas Gilbert [mailto:dgilb...@interlog.com]
Sent: Monday, February 08, 2016 3:04 PM
To: James Bottomley; SCSI development list
Cc: Knight, Frederick
Subject: Re: T10 adds locally assigned UUID designation descriptor

On 16-02-08 02:00 PM, James Bottomley wrote:
> On Mon, 2016-02-08 at 12:33 -0500, Douglas Gilbert wrote:
>> Recently, in draft spc5r08, T10 added a locally assigned RFC 4122
>> UUID *** designation descriptor. That descriptor can now be
>> returned for VPD page 0x83 (device identification) amongst others.
>> It can be used anywhere SCSI needs a unique identifier expanding
>> the previous set of preferred identifiers: EUI, NAA and SCSI_name
>> (iSCSI).
>>
>> In the soon to be released sg3_utils version 1.42 the new UUID
>> designation descriptor is decoded including Hannes' --export
>> option found in sg_inq, for example:
>>
>> # sg_inq --export /dev/sg0
>> ...
>> SCSI_IDENT_LUN_UUID=11223344-5566-7788-aabb-ccddeeee
>>
>> Perhaps some udev work is needed to incorporate this new identifier.
>
> Hm, we're going to have to do this carefully. With the move to GPT
> partitions, both the UUID= designator in fstab and the
> /dev/disk/by-uuid/ of udev means the GPT UUID. In theory the design
> of the UUID space is to allow random selection without clashing, so
> we could just place the SCSI ones in here as well and perhaps there
> won't be a problem, but I'd like us to think about the consequences
> first.

The UUID proposal (16-005r1 from Fred Knight and "Dr. Hannes Reinecke") was somewhat contro
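The concern behind situation #3 in Fred's list (a host that falls back to a unique ID of "NONE") can be illustrated with a small sketch. It is a toy model in Python, not any real multipath implementation: the designator type names, the `LEGACY_TYPES` preference order, the device names, and the UUID strings are all hypothetical values chosen for the example.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Assumption: identifier types an existing host recognizes, in preference order.
LEGACY_TYPES: Tuple[str, ...] = ("naa", "eui64", "scsi_name")


def pick_id(designators: Dict[str, str],
            known_types: Tuple[str, ...] = LEGACY_TYPES) -> Optional[str]:
    """Return the first designator of a type the host understands, if any."""
    for dtype in known_types:
        if dtype in designators:
            return f"{dtype}:{designators[dtype]}"
    return None


def group_paths(paths: Dict[str, Dict[str, str]],
                known_types: Tuple[str, ...] = LEGACY_TYPES,
                fallback: str = "NONE") -> Dict[str, List[str]]:
    """Group paths into multipath devices keyed by unique ID.

    Paths with no recognized designator are keyed by the fallback string,
    which is the problematic behavior described as situation #3.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for path, designators in paths.items():
        groups[pick_id(designators, known_types) or fallback].append(path)
    return dict(groups)


# Two DIFFERENT UUID-only LUNs (made-up UUID values).
paths = {
    "sdb": {"uuid": "11111111-2222-3333-4444-555555555555"},
    "sdc": {"uuid": "66666666-7777-8888-9999-aaaaaaaaaaaa"},
}
legacy_view = group_paths(paths)
uuid_aware_view = group_paths(paths, known_types=LEGACY_TYPES + ("uuid",))
```

In this model the legacy host places both LUNs under the single key "NONE" and would treat them as two paths to one device, while a host that also recognizes the UUID designator keeps them apart. A device that reports a NAA or EUI value alongside the UUID is grouped correctly by either host, which matches the deployment the "YES" voters assumed.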
FW: [LSF/MM TOPIC] New Storage capabilities
Several new features are becoming a reality in SCSI and ATA this year, and I would like to participate in the discussions on supporting these new features.

a) SCSI conglomerate LUNs (using more bits in the LUN to manage groupings of logical units);
b) Atomic commands;
c) IO and LBA HINTS (for both SCSI and ATA/IDE);
   a. For storage tiering;
   b. For cache management;
d) FC - 128Gig parallel and breakout mode

I attend T10 (SCSI), T11 (FC), T13 (ATA/IDE), IETF (iSCSI), and SNIA and can provide expertise in the areas listed above as well as the topics covered in those standards committees.

Fred Knight