On 6/24/2025 10:13 AM, Brett Creeley wrote:
On 6/24/2025 7:26 AM, Emil Tantilov wrote:
Caution: This message originated from an External Source. Use proper
caution when opening attachments, clicking links, or responding.
Issuing a reset when the driver is loaded without RDMA support, will
results in a crash as it attempts to remove RDMA's non-existent auxbus
device:
echo 1 > /sys/class/net/<if>/device/reset
BUG: kernel NULL pointer dereference, address: 0000000000000008
...
RIP: 0010:ice_unplug_aux_dev+0x29/0x70 [ice]
...
Call Trace:
<TASK>
ice_prepare_for_reset+0x77/0x260 [ice]
pci_dev_save_and_disable+0x2c/0x70
pci_reset_function+0x88/0x130
reset_store+0x5a/0xa0
kernfs_fop_write_iter+0x15e/0x210
vfs_write+0x273/0x520
ksys_write+0x6b/0xe0
do_syscall_64+0x79/0x3b0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
ice_unplug_aux_dev() checks pf->cdev_info->adev for NULL pointer, but
pf->cdev_info will also be NULL, leading to the deref in the trace above.
What about in ice_deinit_rdma(), can the cdev_info also be NULL there?
If so kfree(pf->cdev_info->iddc_priv) will result in a similar trace on
driver unload.
This bug is seen on a NIC with no RDMA support, so the RDMA init/deinit
flow will not happen. It is protected by the !ice_is_rdma_ena(pf)
check.
Introduce a flag to be set when the creation of the auxbus device is
successful, to avoid multiple NULL pointer checks in
ice_unplug_aux_dev().
IMHO adding a state flag to prevent NULL pointer checks in the control
path isn't enough justification unless there's something I'm missing here.
Avoid defensive programming. The use of the flag makes this path
deterministic (only attempt to remove the auxbus device if we know for a
fact that it was created). The NULL pointer checks provide protection
for this crash, but only by association. Say there was another bug in
that path that cleared them prematurely, with the NULL pointer checks in
place you will avoid crashing, but fail to remove the auxbus device, or
if such a patch was introduced, you'd want to crash, which gives
validation a chance to catch it.
Fixes: c24a65b6a27c7 ("iidc/ice/irdma: Update IDC to support multiple
consumers")
Signed-off-by: Emil Tantilov <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
---
drivers/net/ethernet/intel/ice/ice.h | 1 +
drivers/net/ethernet/intel/ice/ice_idc.c | 10 ++++++----
2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/
ethernet/intel/ice/ice.h
index ddd0ad68185b..0ef11b7ab477 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -509,6 +509,7 @@ enum ice_pf_flags {
ICE_FLAG_LINK_LENIENT_MODE_ENA,
ICE_FLAG_PLUG_AUX_DEV,
ICE_FLAG_UNPLUG_AUX_DEV,
+ ICE_FLAG_AUX_DEV_CREATED,
ICE_FLAG_MTU_CHANGED,
ICE_FLAG_GNSS, /* GNSS successfully
initialized */
ICE_FLAG_DPLL, /* SyncE/PTP dplls
initialized */
diff --git a/drivers/net/ethernet/intel/ice/ice_idc.c b/drivers/net/
ethernet/intel/ice/ice_idc.c
index 6ab53e430f91..420d45c2558b 100644
--- a/drivers/net/ethernet/intel/ice/ice_idc.c
+++ b/drivers/net/ethernet/intel/ice/ice_idc.c
@@ -336,6 +336,7 @@ int ice_plug_aux_dev(struct ice_pf *pf)
mutex_lock(&pf->adev_mutex);
cdev->adev = adev;
mutex_unlock(&pf->adev_mutex);
+ set_bit(ICE_FLAG_AUX_DEV_CREATED, pf->flags);
What if this bit is set already, should ice_plug_aux_dev() be executed?
The purpose of this bit is to mark the successful creation of the auxbus
device, I am not aware of any action that needs to be taken should
ice_plug_aux_dev() be called with the bit set and there was no check
like this previously.
return 0;
}
@@ -347,15 +348,16 @@ void ice_unplug_aux_dev(struct ice_pf *pf)
{
struct auxiliary_device *adev;
+ if (!test_and_clear_bit(ICE_FLAG_AUX_DEV_CREATED, pf->flags))
+ return;
+
To re-iterate my comment above, I think the driver should just check if
pf->cdev_info is valid before de-referencing it. Also, the local adev
variable will have to be set to NULL to handle this case.
Understood. I am not opposed to changing the patch to add another NULL
check if that is the consensus. Another option is to add the same check
for !ice_is_rdma_ena(pf) like in the other functions. This should work
at present, because the driver fails to load if RDMA init errors out.
Thanks,
Emil
Brett
mutex_lock(&pf->adev_mutex);
adev = pf->cdev_info->adev;
pf->cdev_info->adev = NULL;
mutex_unlock(&pf->adev_mutex);
- if (adev) {
- auxiliary_device_delete(adev);
- auxiliary_device_uninit(adev);
- }
+ auxiliary_device_delete(adev);
+ auxiliary_device_uninit(adev);
}
/**
--
2.37.3