Hi Guan,
Thanks for the info.
Changes look fine.
Instead of marginal_path_err_recheck_gap_time, marginal_path_recovery_time
looks more reasonable.
This is only my input.
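For reference, a minimal multipath.conf sketch of how the marginal_path_* tunables from the patch series could be set (parameter names as posted in the patch; the values below are purely illustrative, not tested recommendations):

```
defaults {
    # Intermittent IO error accounting for marginal path detection
    # (names from the marginal_path_* patch; values are illustrative only)
    marginal_path_double_failed_time     60
    marginal_path_err_sample_time        120
    marginal_path_err_rate_threshold     10
    marginal_path_err_recheck_gap_time   600
}
```

The last parameter is the one whose renaming (to something like marginal_path_recovery_time) is being discussed above.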

Regards,
Muneendra.

-----Original Message-----
From: Guan Junxiong [mailto:[email protected]] 
Sent: Monday, October 09, 2017 6:13 AM
To: Muneendra Kumar M <[email protected]>
Cc: Shenhong (C) <[email protected]>; niuhaoxin <[email protected]>; 
Martin Wilck <[email protected]>; Christophe Varoqui 
<[email protected]>; [email protected]
Subject: Re: [PATCH V4 1/2] multipath-tools: intermittent IO error accounting 
to improve reliability

Hi Muneendra,
Sorry for late reply because of National Holiday.

On 2017/10/6 13:54, Muneendra Kumar M wrote:
> Hi Guan,
> Did you push the patch to mainline.
> If so can you just provide me those details.
> If not can you just let me know the status.
> 

Yes, I pushed Version 6 of the patch to the mail list but it hasn't been merged 
yet.
It is still waiting for review.
You can find it at this link:
https://www.redhat.com/archives/dm-devel/2017-September/msg00296.html

> As couple of our clients are already using the previous patch(san_path_XX).
> If your patch is pushed then I can give them the updated patch and test the 
> same.
> 

If the patch is OK for you, can I add your Reviewed-by tag to this patch?

Regards,
Guan

> Regards,
> Muneendra.
> 
> 
> -----Original Message-----
> From: Muneendra Kumar M 
> Sent: Thursday, September 21, 2017 3:41 PM
> To: 'Guan Junxiong' <[email protected]>; Martin Wilck 
> <[email protected]>; [email protected]; [email protected]
> Cc: [email protected]; [email protected]; [email protected]
> Subject: RE: [PATCH V4 1/2] multipath-tools: intermittent IO error accounting 
> to improve reliability
> 
> Hi Guan,
> Thanks for adopting the naming convention. 
> Instead of marginal_path_err_recheck_gap_time, marginal_path_recovery_time 
> looks more reasonable. Could you please take another look at it.
> 
> I will review the code within a day.
> 
> Regards,
> Muneendra.
> 
> -----Original Message-----
> From: Guan Junxiong [mailto:[email protected]] 
> Sent: Thursday, September 21, 2017 3:35 PM
> To: Muneendra Kumar M <[email protected]>; Martin Wilck <[email protected]>; 
> [email protected]; [email protected]
> Cc: [email protected]; [email protected]; [email protected]
> Subject: Re: [PATCH V4 1/2] multipath-tools: intermittent IO error accounting 
> to improve reliability
> 
> Hi, Muneendra
> 
>   Thanks for your clarification. I will adopt this renaming. If it is 
> convenient for you, please review the V5 patch that I sent out 2 hours ago.
> 
> Regards,
> Guan
> 
> On 2017/9/20 20:58, Muneendra Kumar M wrote:
>> Hi Guan,
>>>>> Shall we use existing PATH_SHAKY ?
>> As path_shaky indicates a path not available for "normal" operations, we 
>> can use this state. That's a good idea.
>>
>> Regarding the marginal paths, below is my explanation. And Brocade is 
>> publishing a couple of white papers regarding the same to educate SAN 
>> administrators and the SAN community.
>>
>> Marginal path:
>>
>> A host, target, LUN (ITL path) flow goes through a SAN. It is to be noted 
>> that each I/O request that goes to the SCSI layer transforms into a single 
>> SCSI exchange. In a single SAN, there are typically multiple SAN network 
>> paths for an ITL flow/path. Each SCSI exchange can take one of the various 
>> network paths that are available for the ITL path. A SAN can be based on 
>> Ethernet, FC, or InfiniBand physical networks to carry block storage 
>> traffic (SCSI, NVMe, etc.)
>>
>> There are typically two types of SAN network problems that are categorized 
>> as marginal issues. These issues are by nature not permanent; they come and 
>> go over time.
>> 1) Switches in the SAN can have intermittent frame drops or intermittent 
>> frame corruption due to a bad optics cable (SFP) or similar wear-and-tear 
>> port issues. This causes ITL flows that go through the faulty switch/port 
>> to intermittently experience frame drops.
>> 2) There exist SAN topologies where switch ports in the fabric become the 
>> only conduit for many different ITL flows across multiple hosts. These 
>> single network paths are essentially shared across multiple ITL flows. 
>> Under these conditions, if the port link bandwidth cannot handle the net 
>> sum of the shared ITL flow bandwidth going through the single path, then we 
>> could see intermittent network congestion problems. This condition is 
>> called network oversubscription. The intermittent congestion can delay SCSI 
>> exchange completion time (an increase in I/O latency is observed).
>>
>> To overcome the above network issues and many more such target issues, 
>> frame-level retries are done in HBA device firmware and I/O retries in the 
>> SCSI layer. These retries might succeed for several reasons:
>> 1) The intermittent switch/port issue is not observed.
>> 2) The retry I/O is a new SCSI exchange. This SCSI exchange can take an 
>> alternate SAN path for the ITL flow, if such a SAN path exists.
>> 3) Network congestion disappears momentarily because the net I/O bandwidth 
>> coming from multiple ITL flows on the single shared network path is 
>> something the path can handle.
>>
>> However, in some cases we have seen that I/O retries don't succeed because 
>> the retry I/Os hit a SAN network path that has an intermittent switch/port 
>> issue and/or network congestion.
>>
>> Thus on the host we see configurations with two or more ITL paths sharing 
>> the same target/LUN, going through two or more HBA ports. These HBA ports 
>> are connected through two or more SANs to the same target/LUN.
>> If the I/O fails at the multipath layer, the ITL path is put into the 
>> Failed state. Because of the marginal nature of the network, the next 
>> health-check command sent from the multipath layer might succeed, which 
>> puts the ITL path back into the Active state. You end up seeing the DM path 
>> state going through Active, Failed, Active transitions. This results in an 
>> overall reduction in application I/O throughput and sometimes application 
>> I/O failures (because of timing constraints). All this can happen because 
>> of I/O retries and I/O requests moving across multiple paths of the DM 
>> device. It is to be noted that on the host, all I/O retries on a single 
>> path and I/O movement across multiple paths slow down the forward progress 
>> of new application I/O. The reason is that the above I/O re-queue actions 
>> are given higher priority than newer I/O requests coming from the 
>> application.
>>
>> The above condition of the ITL path is hence called “marginal”.
>>
>> What we desire is for DM to deterministically categorize an ITL path as 
>> “marginal” and move all the pending I/Os from the marginal path to an 
>> active path. This will help in meeting application I/O timing constraints. 
>> We also desire a capability to automatically reinstate the marginal path 
>> as Active once the marginal condition in the network is fixed.
>>
>>
>> Based on the above explanation, I want to rename the parameters as 
>> marginal_path_XXXX, and this is irrespective of the storage network.
>>
>> Regards,
>> Muneendra.
> 


--
dm-devel mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/dm-devel
