> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-boun...@osuosl.org> On Behalf Of
> Aleksandr Loktionov
> Sent: Tuesday, June 25, 2024 11:50 AM
> To: intel-wired-...@lists.osuosl.org; Nguyen, Anthony L
> <anthony.l.ngu...@intel.com>; Loktionov, Aleksandr
> <aleksandr.loktio...@intel.com>
> Cc: net...@vger.kernel.org; Kang, Kelvin <kelvin.k...@intel.com>;
> Kubalewski, Arkadiusz <arkadiusz.kubalew...@intel.com>
> Subject: [Intel-wired-lan] [PATCH iwl-net v5] i40e: fix: remove needless
> retries of NVM update
> 
> Remove wrong EIO to EAGAIN conversion and pass all errors as is.
> 
> After commit 230f3d53a547 ("i40e: remove i40e_status"), which should only
> have replaced F/W-specific error codes with generic Linux kernel ones, all
> EIO errors started to be converted into EAGAIN. This leads nvmupdate to
> retry until it times out, and it sometimes fails after more than 20 minutes
> in the middle of an NVM update, so the NVM becomes corrupted.
> 
> The bug affects users only when they try to update NVM, and only with
> F/W versions that generate errors during nvmupdate. For example, X710DA2
> with 0x8000ECB7 F/W is affected, but there are probably more...
> 
> Command for reproduction is just NVM update:
>  ./nvmupdate64
> 
> In the log instead of:
>  i40e_nvmupd_exec_aq err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_ENOMEM)
> appears:
>  i40e_nvmupd_exec_aq err -EIO aq_err I40E_AQ_RC_ENOMEM
>  i40e: eeprom check failed (-5), Tx/Rx traffic disabled
> 
> The problematic code silently converted EIO into EAGAIN, which forced
> nvmupdate to ignore the error and retry the same operation until timeout.
> That's why the NVM update takes 20+ minutes to finish and fails in the end.
> 
> Fixes: 230f3d53a547 ("i40e: remove i40e_status")
> Co-developed-by: Kelvin Kang <kelvin.k...@intel.com>
> Signed-off-by: Kelvin Kang <kelvin.k...@intel.com>
> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalew...@intel.com>
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktio...@intel.com>
> ---
> v4->v5 commit message update
> https://lore.kernel.org/netdev/20240618132111.3193963-1-aleksandr.loktio...@intel.com/T/#u
> v3->v4 commit message update
> v2->v3 commit message typos
> v1->v2 commit message update
> ---
>  drivers/net/ethernet/intel/i40e/i40e_adminq.h | 4 ----
>  1 file changed, 4 deletions(-)

Tested-by: Tony Brelinski <tony.brelin...@intel.com>
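
For readers following the thread, the conversion described in the commit
message can be sketched roughly as below. This is only an illustration, not
the driver code: every sketch_* name, the enum values and the three-entry
table are hypothetical stand-ins, and the "fixed" variant assumes the fix
amounts to dropping the EIO special case so that the firmware return code
is translated directly.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few admin queue return codes. */
enum sketch_aq_rc {
	SKETCH_AQ_RC_OK = 0,
	SKETCH_AQ_RC_EPERM,
	SKETCH_AQ_RC_ENOMEM,
};

/* Illustrative mapping from the stand-in codes to negative errno values. */
static const int sketch_aq_to_posix[] = {
	0,		/* SKETCH_AQ_RC_OK */
	-EPERM,		/* SKETCH_AQ_RC_EPERM */
	-ENOMEM,	/* SKETCH_AQ_RC_ENOMEM */
};

#define SKETCH_TABLE_LEN \
	(sizeof(sketch_aq_to_posix) / sizeof(sketch_aq_to_posix[0]))

/*
 * Buggy shape: every -EIO from the admin queue call is rewritten to
 * -EAGAIN, so a caller that retries on -EAGAIN keeps retrying even when
 * the firmware reported a hard error such as ENOMEM.
 */
static int sketch_rc_to_posix_buggy(int aq_ret, int aq_rc)
{
	if (aq_ret == -EIO)
		return -EAGAIN;

	if (aq_rc < 0 || (size_t)aq_rc >= SKETCH_TABLE_LEN)
		return -ERANGE;

	return sketch_aq_to_posix[aq_rc];
}

/*
 * Assumed fixed shape: no EIO special case, so the firmware code alone
 * decides the result and the caller sees the real error.
 */
static int sketch_rc_to_posix_fixed(int aq_ret, int aq_rc)
{
	(void)aq_ret;	/* no longer used to override the result */

	if (aq_rc < 0 || (size_t)aq_rc >= SKETCH_TABLE_LEN)
		return -ERANGE;

	return sketch_aq_to_posix[aq_rc];
}
```

With aq_ret == -EIO and the ENOMEM stand-in, the first helper yields -EAGAIN
(a retry signal) while the second yields -ENOMEM, which a caller can treat as
a hard failure and stop.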
