On 15.10.2018 06:03, Zhao1, Wei wrote: > Hi, Ilya Maximets > >> -----Original Message----- >> From: Ilya Maximets [mailto:i.maxim...@samsung.com] >> Sent: Friday, October 12, 2018 6:15 PM >> To: Zhao1, Wei <wei.zh...@intel.com>; Zhang, Qi Z <qi.z.zh...@intel.com>; >> Laurent Hardy <laurent.ha...@6wind.com> >> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin >> <konstantin.anan...@intel.com>; sta...@dpdk.org; dev@dpdk.org >> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while fiber link >> update >> >> On 12.10.2018 12:19, Zhao1, Wei wrote: >>> Hi, >>> >>>> -----Original Message----- >>>> From: Ilya Maximets [mailto:i.maxim...@samsung.com] >>>> Sent: Thursday, October 11, 2018 6:27 PM >>>> To: Zhao1, Wei <wei.zh...@intel.com>; Zhang, Qi Z >>>> <qi.z.zh...@intel.com>; Laurent Hardy <laurent.ha...@6wind.com> >>>> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin >>>> <konstantin.anan...@intel.com>; sta...@dpdk.org; dev@dpdk.org >>>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while >>>> fiber link update >>>> >>>> On 11.10.2018 12:56, Zhao1, Wei wrote: >>>>> Hi, Ilya Maximets AND laurent.hardy >>>> >>>> Hi, thanks for sharing your thoughts. >>>> Comments inline. >>>> >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ilya Maximets >>>>>> Sent: Wednesday, September 12, 2018 4:05 PM >>>>>> To: Zhang, Qi Z <qi.z.zh...@intel.com>; dev@dpdk.org >>>>>> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin >>>>>> <konstantin.anan...@intel.com>; Laurent Hardy >>>>>> <laurent.ha...@6wind.com>; Dai, Wei <wei....@intel.com>; >>>>>> sta...@dpdk.org >>>>>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while >>>>>> fiber link update >>>>>> >>>>>> On 12.09.2018 09:49, Zhang, Qi Z wrote: >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Ilya Maximets [mailto:i.maxim...@samsung.com] >>>>>>>> Sent: Monday, September 10, 2018 11:09 PM >>>>>>>> To: Zhang, Qi Z <qi.z.zh...@intel.com>; dev@dpdk.org >>>>>>>> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin >>>>>>>> <konstantin.anan...@intel.com>; Laurent Hardy >>>>>>>> <laurent.ha...@6wind.com>; Dai, Wei <wei....@intel.com>; >>>>>>>> sta...@dpdk.org >>>>>>>> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while >>>>>>>> fiber link update >>>>>>>> >>>>>>>> On 04.09.2018 09:08, Zhang, Qi Z wrote: >>>>>>>>> Hi Ilya: >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ilya >>>>>>>>>> Maximets >>>>>>>>>> Sent: Friday, August 31, 2018 8:40 PM >>>>>>>>>> To: dev@dpdk.org >>>>>>>>>> Cc: Lu, Wenzhuo <wenzhuo...@intel.com>; Ananyev, Konstantin >>>>>>>>>> <konstantin.anan...@intel.com>; Laurent Hardy >>>>>>>>>> <laurent.ha...@6wind.com>; Dai, Wei <wei....@intel.com>; Ilya >>>>>>>>>> Maximets <i.maxim...@samsung.com>; sta...@dpdk.org >>>>>>>>>> Subject: [dpdk-dev] [PATCH] net/ixgbe: fix busy polling while >>>>>>>>>> fiber link update >>>>>>>>>> >>>>>>>>>> If the multispeed fiber link is in DOWN state, ixgbe_setup_link >>>>>>>>>> could take around a second of busy polling. This is highly >>>>>>>>>> inconvenient for the case where single thread periodically >>>>>>>>>> checks the >>>>>> link statuses. >>>>>>>>>> For example, OVS main thread periodically updates the link >>>>>>>>>> statuses and hangs for a really long time busy waiting on >>>>>>>>>> ixgbe_setup_link() for a DOWN fiber ports. For case with 3 down >>>>>>>>>> ports it hangs for a 3 seconds and unable to do anything >>>>>>>>>> including >>>> packet processing. >>>>>>>>>> Fix that by shifting that workaround to a separate thread by >>>>>>>>>> alarm handler that will try to set up link if it is DOWN. >>>>>>>>> >>>>>>>>> Does that mean we will block the interrupt thread for 3 seconds? >>>>>>>> >>>>>>>> Three times for one second. Other work could be scheduled >> between. >>>>>>>> IMHO, it's much better than blocking usual caller for 3 seconds. >>>>>>>> >>>>>>>>> Also, can we guarantee there will not be any race condition if >>>>>>>>> we call >>>>>>>> ixgbe_setup_link at another thread, the base code API is not >>>>>>>> assumed to be thread-safe as I know. >>>>>>>> >>>>>>>> The only user of 'ixgbe_setup_link' is 'ixgbe_dev_start', but it >>>>>>>> could be called only if device stopped. 'ixgbe_dev_stop' cancels >>>>>>>> the >>>> alarm. >>>>>>>> Race with 'link_update' avoided by >> 'IXGBE_FLAG_NEED_LINK_CONFIG' >>>>>> flag. >>>>>>> >>>>>>> I guess, it' not only about when ixgb_setup_link race with itself, >>>>>>> but also >>>>>> when it race with other APIs. >>>>>>> Also the concern is, even in current version, we can prove there >>>>>>> is no issue, >>>>>> how can we guarantee we are safe for future base code update? It's >>>>>> not designed as thread-safe. >>>>>>> For my option, the change is risky. >>>>>> >>>>>> In current implementation interrupt handler already calls the >>>>>> 'ixgbe_dev_link_update' which subsequently calls 'ixgbe_setup_link' >>>>>> in our case if LSC interrupts enabled. So, my change makes the >>>>>> driver even safer by moving 'ixgbe_setup_link' to the same interrupt >> thread. >>>>>> Otherwise two threads (interrupts handler and the link status >>>>>> checking >>>>>> thread) could call 'ixgbe_setup_link' simultaneously. >>>>>> >>>>>>> >>>>>>> Btw, since ixgbe support LSC, it is not necessary for "single >>>>>>> thread >>>>>> periodically checks the link statuses", right? >>>>>> >>>>>> In current implementation it will take at least 5 seconds (4 + 1) >>>>>> for the interrupt handler to detect DOWN link state for ixgbe >>>>>> multispeed fiber. This is too much for many real world cases. >>>>> >>>>> I have reviewed this patch, now I agree with you of the point that >>>>> when port is down, " main thread periodically updates the link >>>>> statuses and >>>> hangs for a really long time busy waiting on ixgbe_setup_link() for a >>>> DOWN fiber ports ". >>>>> This is introduced by a patch in the following: >>>>> SHA-1: c12d22f65b132c56db7b4fdbfd5ddce27d1e9572 >>>>> * net/ixgbe: ensure link status is updated >>>>> >>>>> Because in this patch, ixgbe_setup_link() is called with input >>>>> parameter >>>> autoneg_wait_to_complete=1, this will cause loop check and sleep delay. >>>>> At least 82599 seems has this delay.(BTW, whivh type of NIC are you >>>>> use? X550 or 82599) >>>> >>>> I have 82599. >>>> >>>>> Your solution is add a eal_alarm_set for ixgbe_setup_link in the >>>>> thread of >>>> PMD driver, and do the set up work in that thread, is that right? >>>>> And main thread avoid hang by the flag of >>>> IXGBE_FLAG_NEED_LINK_CONFIG. >>>>> I think this is a good idea for this problem, but it may cause >>>>> problem for other legacy user of ixgbe pmd, because their legacy >>>>> code, which use >>>> main thread to check link state and setup_link when port is down, >>>> and they are not aware of it is done by other thread if add your patch. >>>> >>>> What are these applications? I see no public API for setup_link function. >>>> It's internal to driver and should not be used externally. >>>> Am I missing something? >>> >>> rte_eth_link_get() , it will also call the function of ixgbe_setup_link(). >>> >> >> rte_eth_link_get() does not call ixgbe_setup_link(). >> It only calls dev_ops->link_update(). > > No, dev_ops->link_update call function ixgbe_dev_link_update() > -> ixgbe_dev_link_update_share() -> ixgbe_setup_link()
But with my patch, calling of ixgbe_setup_link() happens in a separate (interrupt) thread. There is no direct call from ixgbe_dev_link_update_share(). All the calls from the interrupt thread are sequenced by implementation. > > >> >>> >>>> >>>>> >>>>> And is that ok if we change code in ixgbe_dev_link_update_share() to >>>>> >>>>> ixgbe_dev_link_update_share() >>>>> { >>>>> >>>>> /* check if it needs to wait to complete, if lsc interrupt is enabled */ >>>>> if (wait_to_complete == 0 || dev->data->dev_conf.intr_conf.lsc != 0) >>>>> wait = 0; >>>>> >>>>> if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) && >>>>> ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) { >>>>> speed = hw->phy.autoneg_advertised; >>>>> if (!speed) >>>>> ixgbe_get_link_capabilities(hw, &speed, &autoneg); >>>>> ixgbe_setup_link(hw, speed, wait); >>>>> } >>>>> } >>>>> >>>>> Then, your application can call rte_eth_link_get_nowait () to make >>>>> wait_to_complete=0 when doing periodic link state check, Which will >>>>> not >>>> cause loop check and sleep delay. Legacy code of other user call >>>> rte_eth_link_get() will not be affected also. >>>>> But, I am NOT confident ,whether this will introduce new problem >>>>> when >>>> set up link without wait! >>>>> So, this is just a discussion topic. >>>> >>>> Unfortunately this will not help. Take a look to the function >>>> 'ixgbe_setup_mac_link_multispeed_fiber()', which is the main >>>> problematic function here. 'wait_to_complete' here used only as >>>> argument for ixgbe_setup_mac_link(), and the busy waiting loops are >> outside of it. >>>> Regardless of the 'wait_to_complete' value, this function will busy >>>> poll the link for 1040 ms trying to setup 10GB speed and 140ms more >>>> trying to setup 1GB speed. After that, it will call itself >>>> recursively and wait again... Looks like I miscalculated last time. >>>> Right now it'll be more than 2 seconds for each down port since following >> patch merged: >>>> 8fc1f32fa615 ("net/ixgbe: wait longer for link after fiber MAC setup"). >>> >>> Yes, you are right, link state check loop in function >>> ixgbe_setup_mac_link_multispeed_fiber() are not blocked by bool >> autoneg_wait_to_complete, It will cause about 1s wait when port is down. >>> And also, can we update code in function >> ixgbe_setup_mac_link_multispeed_fiber() to block link state check loop >> using autoneg_wait_to_complete? >>> I am not sure. Because there is a comment for this loop check: >>> /* >>> * Wait for the controller to acquire link. Per IEEE 802.3ap, >>> * Section 73.10.2, we may have to wait up to 500ms if KR is >>> * attempted. 82599 uses the same timing for 10g SFI. >>> */ >>> It seems we have to wait for at least 500ms for spec requirement before >> we check link after configuration. >>> If that is true, we can not do any change to these loop check. >>> But why not main thread take some action to stop periodic link sate check >> when it find it has been hang or link is down? >> >> To find that device is DOWN, thread will have to call this function at least >> once for each port and wait a few seconds. >> And how in that case we'll know that device is UP again? >> As I already wrote in discussion for this patch, LSC is not an option, >> because it >> takes at least 5 seconds to detect link state change, which is way too much >> for many real world apps. >> >>> >>> >>> >>>> >>>>> >>>>> Hi, laurent.hardy >>>>> You are the author for the patch (* net/ixgbe: ensure link status >>>>> is >>>> updated), why do you implement code that way? >>>>> Is that must that set up link with wait? >>>>> >>>>> ixgbe_setup_link(hw, speed, true); >>>>> >>>>> >>>>> >>>>>> >>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Regards >>>>>>>>> Qi >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Fixes: c12d22f65b13 ("net/ixgbe: ensure link status is >>>>>>>>>> updated") >>>>>>>>>> CC: sta...@dpdk.org >>>>>>>>>> >>>>>>>>>> Signed-off-by: Ilya Maximets <i.maxim...@samsung.com> >>>>>>>>>> --- >>>>>>>>>> drivers/net/ixgbe/ixgbe_ethdev.c | 43 >>>>>>>>>> ++++++++++++++++++++++++-------- >>>>>>>>>> 1 file changed, 32 insertions(+), 11 deletions(-) >>>>>>>>>> >>>>>>>>>> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c >>>>>>>>>> b/drivers/net/ixgbe/ixgbe_ethdev.c >>>>>>>>>> index 26b192737..a33b9a6e8 100644 >>>>>>>>>> --- a/drivers/net/ixgbe/ixgbe_ethdev.c >>>>>>>>>> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c >>>>>>>>>> @@ -221,6 +221,8 @@ static int >>>>>>>>>> ixgbe_dev_interrupt_action(struct rte_eth_dev *dev, >>>>>>>>>> struct rte_intr_handle *handle); >>>> static >>>>>> void >>>>>>>>>> ixgbe_dev_interrupt_handler(void *param); static void >>>>>>>>>> ixgbe_dev_interrupt_delayed_handler(void *param); >>>>>>>>>> +static void ixgbe_dev_setup_link_alarm_handler(void *param); >>>>>>>>>> + >>>>>>>>>> static int ixgbe_add_rar(struct rte_eth_dev *dev, struct >>>>>>>>>> ether_addr *mac_addr, >>>>>>>>>> uint32_t index, uint32_t pool); static void >>>>>>>>>> ixgbe_remove_rar(struct rte_eth_dev *dev, uint32_t index); @@ >>>>>>>>>> -2791,6 +2793,8 @@ ixgbe_dev_stop(struct rte_eth_dev *dev) >>>>>>>>>> >>>>>>>>>> PMD_INIT_FUNC_TRACE(); >>>>>>>>>> >>>>>>>>>> + rte_eal_alarm_cancel(ixgbe_dev_setup_link_alarm_handler, >>>> dev); >>>>>>>>>> + >>>>>>>>>> /* disable interrupts */ >>>>>>>>>> ixgbe_disable_intr(hw); >>>>>>>>>> >>>>>>>>>> @@ -3969,6 +3973,25 @@ ixgbevf_check_link(struct ixgbe_hw >> *hw, >>>>>>>>>> ixgbe_link_speed *speed, >>>>>>>>>> return ret_val; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> +static void >>>>>>>>>> +ixgbe_dev_setup_link_alarm_handler(void *param) { >>>>>>>>>> + struct rte_eth_dev *dev = (struct rte_eth_dev *)param; >>>>>>>>>> + struct ixgbe_hw *hw = >>>>>>>>>> IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); >>>>>>>>>> + struct ixgbe_interrupt *intr = >>>>>>>>>> + IXGBE_DEV_PRIVATE_TO_INTR(dev->data- >>>>> dev_private); >>>>>>>>>> + u32 speed; >>>>>>>>>> + bool autoneg = false; >>>>>>>>>> + >>>>>>>>>> + speed = hw->phy.autoneg_advertised; >>>>>>>>>> + if (!speed) >>>>>>>>>> + ixgbe_get_link_capabilities(hw, &speed, &autoneg); >>>>>>>>>> + >>>>>>>>>> + ixgbe_setup_link(hw, speed, true); >>>>>>>>>> + >>>>>>>>>> + intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG; } >>>>>>>>>> + >>>>>>>>>> /* return 0 means link status changed, -1 means not changed */ >>>>>>>>>> int ixgbe_dev_link_update_share(struct rte_eth_dev *dev, @@ - >>>>>> 3981,9 >>>>>>>>>> +4004,7 @@ ixgbe_dev_link_update_share(struct rte_eth_dev >>>> *dev, >>>>>>>>>> IXGBE_DEV_PRIVATE_TO_INTR(dev->data- >>>>> dev_private); >>>>>>>>>> int link_up; >>>>>>>>>> int diag; >>>>>>>>>> - u32 speed = 0; >>>>>>>>>> int wait = 1; >>>>>>>>>> - bool autoneg = false; >>>>>>>>>> >>>>>>>>>> memset(&link, 0, sizeof(link)); >>>>>>>>>> link.link_status = ETH_LINK_DOWN; @@ -3993,13 +4014,8 >>>> @@ >>>>>>>>>> ixgbe_dev_link_update_share(struct >>>>>>>> rte_eth_dev >>>>>>>>>> *dev, >>>>>>>>>> >>>>>>>>>> hw->mac.get_link_status = true; >>>>>>>>>> >>>>>>>>>> - if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) && >>>>>>>>>> - ixgbe_get_media_type(hw) == >>>> ixgbe_media_type_fiber) { >>>>>>>>>> - speed = hw->phy.autoneg_advertised; >>>>>>>>>> - if (!speed) >>>>>>>>>> - ixgbe_get_link_capabilities(hw, &speed, >>>> &autoneg); >>>>>>>>>> - ixgbe_setup_link(hw, speed, true); >>>>>>>>>> - } >>>>>>>>>> + if (intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) >>>>>>>>>> + return rte_eth_linkstatus_set(dev, &link); >>>>>>>>>> >>>>>>>>>> /* check if it needs to wait to complete, if lsc interrupt is >>>> enabled */ >>>>>>>>>> if (wait_to_complete == 0 || dev->data- >>>>> dev_conf.intr_conf.lsc >>>>>>>>>> != >>>>>>>>>> 0) @@ >>>>>>>>>> -4017,11 +4033,14 @@ ixgbe_dev_link_update_share(struct >>>>>> rte_eth_dev >>>>>>>> *dev, >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> if (link_up == 0) { >>>>>>>>>> - intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG; >>>>>>>>>> + if (ixgbe_get_media_type(hw) == >>>> ixgbe_media_type_fiber) >>>>>> { >>>>>>>>>> + intr->flags |= >>>> IXGBE_FLAG_NEED_LINK_CONFIG; >>>>>>>>>> + rte_eal_alarm_set(10, >>>>>>>>>> + >>>> ixgbe_dev_setup_link_alarm_handler, dev); >>>>>>>>>> + } >>>>>>>>>> return rte_eth_linkstatus_set(dev, &link); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> - intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG; >>>>>>>>>> link.link_status = ETH_LINK_UP; >>>>>>>>>> link.link_duplex = ETH_LINK_FULL_DUPLEX; >>>>>>>>>> >>>>>>>>>> @@ -5128,6 +5147,8 @@ ixgbevf_dev_stop(struct rte_eth_dev >>>> *dev) >>>>>>>>>> >>>>>>>>>> PMD_INIT_FUNC_TRACE(); >>>>>>>>>> >>>>>>>>>> + rte_eal_alarm_cancel(ixgbe_dev_setup_link_alarm_handler, >>>> dev); >>>>>>>>>> + >>>>>>>>>> ixgbevf_intr_disable(dev); >>>>>>>>>> >>>>>>>>>> hw->adapter_stopped = 1; >>>>>>>>>> -- >>>>>>>>>> 2.17.1 >>>>>>>>>