Hi, Muneendra

  Thanks for your clarification. I will adopt this renaming. If it is 
convenient for you, please review the V5 patch that I sent out 2 hours ago.

Regards,
Guan

On 2017/9/20 20:58, Muneendra Kumar M wrote:
> Hi Guan,
>>>> Shall we use existing PATH_SHAKY ?
> As PATH_SHAKY indicates a path not available for "normal" operations, we can 
> use this state. That's a good idea.
> 
> My explanation of marginal paths is below. Brocade is also publishing a 
> couple of white papers on the same topic to educate SAN administrators and 
> the SAN community.
> 
> Marginal path:
> 
> A host/target/LUN (ITL path) flow goes through the SAN. Note that each I/O 
> request that reaches the SCSI layer is transformed into a single SCSI 
> exchange. In a single SAN there are typically multiple network paths for an 
> ITL flow/path, and each SCSI exchange can take any one of the network paths 
> available for that ITL path. A SAN can be based on Ethernet, FC, or 
> InfiniBand physical networks to carry block storage traffic (SCSI, NVMe, 
> etc.).
> 
> There are typically two types of SAN network problems that are categorized 
> as marginal issues. By nature these issues are not permanent; they come and 
> go over time.
> 1) Switches in the SAN can have intermittent frame drops or frame corruption 
> due to a bad optics cable (SFP) or similar port wear-and-tear issues. This 
> causes ITL flows that go through the faulty switch/port to intermittently 
> experience frame drops.
> 2) There are SAN topologies in which certain switch ports in the fabric 
> become the only conduit for many different ITL flows across multiple hosts, 
> so these single network paths are effectively shared across multiple ITL 
> flows. Under these conditions, if the port's link bandwidth cannot handle 
> the net sum of the shared ITL flows' bandwidth going through the single 
> path, we can see intermittent network congestion. This condition is called 
> network oversubscription. The intermittent congestion can delay SCSI 
> exchange completion times (observed as increased I/O latency).
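
To make the oversubscription arithmetic concrete, here is a minimal sketch 
(all link and flow numbers below are made up for illustration):

```python
# Hypothetical illustration of network oversubscription: the shared
# switch port is oversubscribed when the combined demand of the ITL
# flows exceeds the port's link bandwidth (numbers are made up).

def oversubscription_ratio(flow_gbps, link_gbps):
    """Return total demand over capacity; > 1.0 means the shared
    path cannot carry the net demand, so congestion is possible."""
    return sum(flow_gbps) / link_gbps

# Three ITL flows sharing one 16 Gb/s switch port:
flows = [8.0, 6.0, 5.0]   # per-flow peak demand in Gb/s
ratio = oversubscription_ratio(flows, 16.0)
print(ratio)  # 1.1875 -> intermittent congestion under peak load
```

Whether congestion is actually observed then depends on how often the flows 
peak at the same time, which is why the problem is intermittent.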
> 
> To overcome the above network issues and many similar target issues, there 
> are frame-level retries in the HBA device firmware and I/O retries in the 
> SCSI layer. These retries might succeed for several reasons:
> 1) The intermittent switch/port issue is not hit again.
> 2) The retried I/O is a new SCSI exchange, which can take an alternate SAN 
> path for the ITL flow, if such a path exists.
> 3) The network congestion disappears momentarily because the net I/O 
> bandwidth from the multiple ITL flows on the single shared network path is 
> something the path can handle.
> 
> However, in some cases we have seen I/O retries fail because the retried 
> I/Os hit a SAN network path with an intermittent switch/port issue and/or 
> network congestion.
> 
> On the host we thus see configurations with two or more ITL paths sharing 
> the same target/LUN and going through two or more HBA ports, each connected 
> through two or more SANs to the same target/LUN.
> If an I/O fails at the multipath layer, the ITL path is moved to the Failed 
> state. Because of the marginal nature of the network, the next health-check 
> command sent from the multipath layer might succeed, which moves the ITL 
> path back to the Active state. You end up seeing the DM path state oscillate 
> through Active, Failed, Active transitions. This results in an overall 
> reduction in application I/O throughput and sometimes in application I/O 
> failures (because of timing constraints). All of this can happen because of 
> I/O retries and I/O requests moving across multiple paths of the DM device. 
> Note that on the host, both I/O retries on a single path and I/O movement 
> across multiple paths slow the forward progress of new application I/O, 
> because these re-queue actions are given higher priority than newer I/O 
> requests coming from the application.
> 
> The above condition of the ITL path is hence called “marginal”.
> 
> What we desire is for DM to deterministically categorize an ITL path as 
> “marginal” and move all pending I/Os from the marginal path to an active 
> path. This will help meet application I/O timing constraints. We also want 
> the capability to automatically reinstate a marginal path as Active once the 
> marginal condition in the network is fixed.
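
The detect-and-reinstate policy described above could be sketched roughly as 
follows. This is a toy model only, not the actual multipath-tools 
implementation; the class name and the window/threshold parameters are 
placeholders:

```python
# Toy sketch of a marginal-path policy (NOT the real dm-multipath code):
# flag a path "marginal" when its failures within a sampling window cross
# a threshold, and reinstate it after a quiet recheck gap.

class PathMonitor:
    def __init__(self, sample_time, err_threshold, recheck_gap):
        self.sample_time = sample_time      # seconds per sampling window
        self.err_threshold = err_threshold  # failures tolerated per window
        self.recheck_gap = recheck_gap      # quiet seconds before reinstating
        self.failures = []                  # timestamps of recent failures
        self.marginal = False

    def record_failure(self, now):
        # Keep only failures inside the current sampling window.
        self.failures = [t for t in self.failures
                         if now - t < self.sample_time]
        self.failures.append(now)
        if len(self.failures) > self.err_threshold:
            self.marginal = True            # stop using path for normal I/O

    def recheck(self, now):
        # Reinstate once no failure has been seen for recheck_gap seconds.
        if self.marginal and (not self.failures or
                              now - self.failures[-1] >= self.recheck_gap):
            self.marginal = False
            self.failures = []
```

For example, three failures within a 60-second window (threshold 2) would 
mark the path marginal, and a recheck 120 quiet seconds later would 
reinstate it.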
> 
> 
> Based on the above explanation, I want to rename these options to 
> marginal_path_XXXX, and this is irrespective of the underlying storage 
> network.
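
For illustration only, a multipath.conf fragment following that naming scheme 
might look like the fragment below. The option names and values here are 
placeholders sketching the marginal_path_XXXX pattern; the authoritative 
names and semantics are whatever the patch series defines:

```
defaults {
        # Placeholder names following the proposed marginal_path_XXXX scheme:
        marginal_path_err_sample_time       60
        marginal_path_err_rate_threshold    10
        marginal_path_err_recheck_gap_time  120
}
```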
> 
> Regards,
> Muneendra.

--
dm-devel mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/dm-devel
