On 6/24/2025 10:13 AM, Brett Creeley wrote:


On 6/24/2025 7:26 AM, Emil Tantilov wrote:
Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.


Issuing a reset when the driver is loaded without RDMA support, will
results in a crash as it attempts to remove RDMA's non-existent auxbus
device:
echo 1 > /sys/class/net/<if>/device/reset

BUG: kernel NULL pointer dereference, address: 0000000000000008
...
RIP: 0010:ice_unplug_aux_dev+0x29/0x70 [ice]
...
Call Trace:
<TASK>
ice_prepare_for_reset+0x77/0x260 [ice]
pci_dev_save_and_disable+0x2c/0x70
pci_reset_function+0x88/0x130
reset_store+0x5a/0xa0
kernfs_fop_write_iter+0x15e/0x210
vfs_write+0x273/0x520
ksys_write+0x6b/0xe0
do_syscall_64+0x79/0x3b0
entry_SYSCALL_64_after_hwframe+0x76/0x7e

ice_unplug_aux_dev() checks pf->cdev_info->adev for NULL pointer, but
pf->cdev_info will also be NULL, leading to the deref in the trace above.

What about in ice_deinit_rdma(), can the cdev_info also be NULL there? If so kfree(pf->cdev_info->iddc_priv) will result in a similar trace on driver unload.

This bug is seen on a NIC with no RDMA support, so the RDMA init/deinit
flow will not happen. It is protected by the !ice_is_rdma_ena(pf)
check.



Introduce a flag to be set when the creation of the auxbus device is
successful, to avoid multiple NULL pointer checks in ice_unplug_aux_dev().

IMHO adding a state flag to prevent NULL pointer checks in the control path isn't enough justification unless there's something I'm missing here.


Avoid defensive programming. The use of the flag makes this path
deterministic (only attempt to remove the auxbus device if we know for a
fact that it was created). The NULL pointer checks provide protection for this crash, but only by association. Say there was another bug in that path that cleared them prematurely, with the NULL pointer checks in place you will avoid crashing, but fail to remove the auxbus device, or if such a patch was introduced, you'd want to crash, which gives validation a chance to catch it.


Fixes: c24a65b6a27c7 ("iidc/ice/irdma: Update IDC to support multiple consumers")
Signed-off-by: Emil Tantilov <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
---
  drivers/net/ethernet/intel/ice/ice.h     |  1 +
  drivers/net/ethernet/intel/ice/ice_idc.c | 10 ++++++----
  2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ ethernet/intel/ice/ice.h
index ddd0ad68185b..0ef11b7ab477 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -509,6 +509,7 @@ enum ice_pf_flags {
         ICE_FLAG_LINK_LENIENT_MODE_ENA,
         ICE_FLAG_PLUG_AUX_DEV,
         ICE_FLAG_UNPLUG_AUX_DEV,
+       ICE_FLAG_AUX_DEV_CREATED,
         ICE_FLAG_MTU_CHANGED,
         ICE_FLAG_GNSS,                  /* GNSS successfully initialized */          ICE_FLAG_DPLL,                  /* SyncE/PTP dplls initialized */ diff --git a/drivers/net/ethernet/intel/ice/ice_idc.c b/drivers/net/ ethernet/intel/ice/ice_idc.c
index 6ab53e430f91..420d45c2558b 100644
--- a/drivers/net/ethernet/intel/ice/ice_idc.c
+++ b/drivers/net/ethernet/intel/ice/ice_idc.c
@@ -336,6 +336,7 @@ int ice_plug_aux_dev(struct ice_pf *pf)
         mutex_lock(&pf->adev_mutex);
         cdev->adev = adev;
         mutex_unlock(&pf->adev_mutex);
+       set_bit(ICE_FLAG_AUX_DEV_CREATED, pf->flags);

What if this bit is set already, should ice_plug_aux_dev() be executed?

The purpose of this bit is to mark the successful creation of the auxbus device, I am not aware of any action that needs to be taken should ice_plug_aux_dev() be called with the bit set and there was no check like this previously.



         return 0;
  }
@@ -347,15 +348,16 @@ void ice_unplug_aux_dev(struct ice_pf *pf)
  {
         struct auxiliary_device *adev;

+       if (!test_and_clear_bit(ICE_FLAG_AUX_DEV_CREATED, pf->flags))
+               return;
+

To re-iterate my comment above, I think the driver should just check if pf->cdev_info is valid before de-referencing it. Also, the local adev variable will have to be set to NULL to handle this case.

Understood. I am not opposed to changing the patch to add another NULL check if that is the consensus. Another option is to add the same check for !ice_is_rdma_ena(pf) like in the other functions. This should work at present, because the driver fails to load if RDMA init errors out.

Thanks,
Emil


Brett

         mutex_lock(&pf->adev_mutex);
         adev = pf->cdev_info->adev;
         pf->cdev_info->adev = NULL;


         mutex_unlock(&pf->adev_mutex);

-       if (adev) {
-               auxiliary_device_delete(adev);
-               auxiliary_device_uninit(adev);
-       }
+       auxiliary_device_delete(adev);
+       auxiliary_device_uninit(adev);
  }

  /**
--
2.37.3



Reply via email to