Thanks, Steve. Is this something that will be fixed in the next iterations of the NIC/FW, or is it an architectural limitation?
On Thu, 21 Jul 2022, 2:47 pm Yang, SteveX, <stevex.y...@intel.com> wrote:

> Hi Nobin,
>
> It seems be limitation of the firmware/NIC implementation for reset VF.
> Firmware should spend some time to respond reset operation,
> if app/user reset VF with higher frequency, the HW/FW perhaps would be
> hung. (e.g.: 0xDEADBEEF code from register).
> This patch (
> https://github.com/DPDK/dpdk/commit/be7226980c9ad4963b92b489c8afb17f08899953)
> just is the workaround to delay reset checking.
> It cannot resolve this pressure testing.
>
> Thanks & Regards,
> Steve Yang.
>
> > -----Original Message-----
> > From: Nobin Mathew <nobin.mat...@gmail.com>
> > Sent: Thursday, July 21, 2022 9:45 AM
> > To: Xing, Beilei <beilei.x...@intel.com>
> > Cc: users@dpdk.org; Yang, SteveX <stevex.y...@intel.com>; Yang, Qiming
> > <qiming.y...@intel.com>
> > Subject: Re: VF is still resetting
> >
> > Any pointers?
> > Is this a firmware problem?
> >
> > I am not seeing
> > " dev_err(&pf->pdev->dev, "VF reset check timeout on VF %d\n", " from
> > i40e driver anywhere in syslog.
> >
> > -Nobin
> >
> > On Wed, Jul 20, 2022 at 11:56 AM Xing, Beilei <beilei.x...@intel.com>
> > wrote:
> > >
> > > Hi Steve,
> > >
> > > Could you please help on this? Thanks.
> > >
> > > BR,
> > > Beilei
> > >
> > > > -----Original Message-----
> > > > From: Nobin Mathew <nobin.mat...@gmail.com>
> > > > Sent: Wednesday, July 20, 2022 12:18 AM
> > > > To: users@dpdk.org
> > > > Subject: VF is still resetting
> > > >
> > > > Hi,
> > > >
> > > > We are running a dpdk app inside a pod, and orchestrating the app
> > > > very frequently(test app).
> > > >
> > > > 1/100 or so we are getting an error:
> > > >
> > > > 2022-07-17T22:34:24.620291289+03:00 iavf_check_vf_reset_done():
> > > > reset VFR value: 3
> > > > 2022-07-17T22:34:24.620310455+03:00 iavf_init_vf(): VF is still
> > > > resetting
> > > > 2022-07-17T22:34:24.620339697+03:00 iavf_dev_init(): Init vf failed
> > > > 2022-07-17T22:34:24.620390802+03:00 EAL: Releasing PCI mapped
> > > > resource for 0000:3b:0f.5
> > > > 2022-07-17T22:34:24.620397381+03:00 EAL: Calling pci_unmap_resource
> > > > for
> > > > 0000:3b:0f.5 at 0x2101000000
> > > > 2022-07-17T22:34:24.620442514+03:00 EAL: Calling pci_unmap_resource
> > > > for
> > > > 0000:3b:0f.5 at 0x2101010000
> > > > 2022-07-17T22:34:24.729012277+03:00 EAL: Requested device
> > > > 0000:3b:0f.5 cannot be used
> > > > 2022-07-17T22:34:24.729028758+03:00 EAL: Bus (pci) probe failed.
> > > >
> > > > we added one log in dpdk lib to print the VFGEN_RSTAT register of
> > > > the VF. In problematic cases, we are seeing the value 3 which maps
> > > > to 0xDEADBEEF
> > > >
> > > > /* VF reset states - these are written into the RSTAT register:
> > > >  * VFGEN_RSTAT on the VF
> > > >  * When the PF initiates a reset, it writes 0
> > > >  * When the reset is complete, it writes 1
> > > >  * When the PF detects that the VF has recovered, it writes 2
> > > >  * VF checks this register periodically to determine if a reset has
> > > >  * occurred,
> > > >  * then polls it to know when the reset is complete.
> > > >  * If either the PF or VF reads the register while the hardware
> > > >  * is in a reset state, it will return DEADBEEF, which, when masked
> > > >  * will result in 3.
> > > >  */
> > > > enum virtchnl_vfr_states {
> > > >     VIRTCHNL_VFR_INPROGRESS = 0,
> > > >     VIRTCHNL_VFR_COMPLETED,
> > > >     VIRTCHNL_VFR_VFACTIVE,
> > > > };
> > > >
> > > > We tried this patch also, increasing the poll time, no help.
> > > >
> > > > https://github.com/DPDK/dpdk/commit/be7226980c9ad4963b92b489c8afb17f08899953
> > > >
> > > > Details of the setup:
> > > >
> > > > DPDK library version
> > > > 21.11
> > > > VF Driver:-
> > > > intel-iavf version 4.0.1-3.2
> > > > PF driver:-
> > > > sudo ethtool -i enp94s0f1
> > > > driver: i40e
> > > > version: 2.14.13
> > > > firmware-version: 8.15 0x800096ca 20.0.17
> > > >
> > > > Since we are seeing 0xDEADBEEF, I am assuming VF-PF reset mailbox
> > > > msg is received by PF, and PF initiated the RESET sequence by
> > > > writing VFSWR to VPGEN_VFRTRIG register.
> > > >
> > > > I am not seeing
> > > > " dev_err(&pf->pdev->dev, "VF reset check timeout on VF %d\n", "
> > > > anywhere in syslog.
> > > >
> > > > Any pointers?, why does this happen(why VF reset is not complete)?...
> > > >
> > > > One more question, what is the sequence of calls in the reset path?
> > > > i40e_vc_process_vf_msg() -> VIRTCHNL_OP_RESET_VF -> i40e_vc_reset_vf()
> > > > ->
> > > > i40e_reset_vf() -> i40e_trigger_vf_reset() & i40e_cleanup_reset_vf()
> > > >
> > > > this one?
> > > >
> > > > -Nobin
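
For anyone following the thread, here is a minimal sketch of how the quoted VFGEN_RSTAT comment plays out: a raw read of 0xDEADBEEF masks down to 3, which is neither COMPLETED (1) nor VFACTIVE (2), so a bounded poll gives up with "VF is still resetting". This is not the driver's actual code. The names `VFR_STATE_MASK`, `vfr_decode()`, `vfr_reset_done()`, `vfr_wait_reset_done()` and `VFR_HW_IN_RESET` are hypothetical, and the two-bit mask is an assumption inferred from "DEADBEEF, which, when masked will result in 3".

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the VFR state lives in the low two bits of VFGEN_RSTAT.
 * This matches the quoted comment (0xDEADBEEF masked gives 3). */
#define VFR_STATE_MASK 0x3u

/* Values 0..2 mirror enum virtchnl_vfr_states from the thread.
 * VFR_HW_IN_RESET is a hypothetical name for the masked 0xDEADBEEF case. */
enum vfr_state {
    VFR_INPROGRESS  = 0, /* PF initiated a reset */
    VFR_COMPLETED   = 1, /* reset is complete */
    VFR_VFACTIVE    = 2, /* PF detected that the VF recovered */
    VFR_HW_IN_RESET = 3  /* register read while hardware is in reset */
};

/* Extract the reset state from a raw VFGEN_RSTAT value. */
static inline enum vfr_state vfr_decode(uint32_t rstat)
{
    return (enum vfr_state)(rstat & VFR_STATE_MASK);
}

/* Only COMPLETED and VFACTIVE count as "reset done". */
static inline bool vfr_reset_done(enum vfr_state s)
{
    return s == VFR_COMPLETED || s == VFR_VFACTIVE;
}

/* Poll the register at most max_polls times. Returns true once the reset
 * is done, and stores the last observed state in *last when last is not
 * NULL. A real driver would sleep between polls; that is omitted here. */
static bool vfr_wait_reset_done(const volatile uint32_t *rstat,
                                unsigned max_polls, enum vfr_state *last)
{
    enum vfr_state s = VFR_HW_IN_RESET;
    bool done = false;

    for (unsigned i = 0; i < max_polls && !done; i++) {
        s = vfr_decode(*rstat);
        done = vfr_reset_done(s);
    }
    if (last)
        *last = s;
    return done;
}
```

If the register keeps returning 0xDEADBEEF, `vfr_wait_reset_done()` fails no matter how many polls are allowed, which would be consistent with the longer poll time from the patch not helping in this test.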