Hi Bartosz,

On 12/11/2025 13:55, Bartosz Golaszewski wrote:
From: Bartosz Golaszewski <[email protected]>

This module scans the device tree (for now only OF nodes are supported
but care is taken to make other fwnode implementations easy to
integrate) and determines which GPIO lines are shared by multiple users.
It stores that information in memory. When the GPIO chip exposing shared
lines is registered, the shared GPIO descriptors it exposes are marked
as shared and virtual "proxy" devices that mediate access to the shared
lines are created. When a consumer of a shared GPIO looks it up, its
fwnode lookup is redirected to a just-in-time machine lookup that points
to this proxy device.

This code can be compiled out on platforms which don't use shared GPIOs.

Reviewed-by: Linus Walleij <[email protected]>
Acked-by: Linus Walleij <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>


I have observed a crash on one of our boards with Linux v6.19 and I was
able to reproduce the same crash on a recent -next. The crash log I see
is ...

 Unable to handle kernel paging request at virtual address f0f21322a6ad56c5
 Mem abort info:
   ESR = 0x0000000096000004
   EC = 0x25: DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
   FSC = 0x04: level 0 translation fault
 Data abort info:
   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
 [f0f21322a6ad56c5] address between user and kernel address ranges
 Internal error: Oops: 0000000096000004 [#1]  SMP
 Modules linked in:
 CPU: 9 UID: 0 PID: 95 Comm: kworker/u51:4 Not tainted 7.0.0-rc3-next-20260309-00004-g34a79c0d58ea-dirty #13 PREEMPT
 Hardware name: NVIDIA NVIDIA Jetson AGX Orin Developer Kit/Jetson, BIOS buildbrain-gcid-42974706 11/20/2025
 Workqueue: events_unbound deferred_probe_work_func
 pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : __srcu_read_lock+0x18/0x84
 lr : gpiod_request_commit+0x30/0x174
 sp : ffff8000843fb8d0
 x29: ffff8000843fb8d0 x28: ffff800081e12000 x27: 0000000000000200
 x26: 00000000000000b9 x25: ffff000080ad92d8 x24: ffff000085cdd940
 x23: ffff800081e12cc0 x22: f0f21322a6ad56c5 x21: ffff800082df0528
 x20: f0f21322a6ad5295 x19: f0f21322a6ad56c5 x18: 00000000ffffffff
 x17: ffff000080c04d80 x16: 1fffe000101809a1 x15: ffff8000843fb530
 x14: ffff000080b17192 x13: ffff000080b1718e x12: ffff0007a1e468b8
 x11: ffff80008199ccf0 x10: 0000000000000000 x9 : 0000000000000000
 x8 : 1fffe0001014ec41 x7 : 0000000000000fff x6 : 0000000000000fff
 x5 : ffff800082df0538 x4 : ffff000081011410 x3 : ffff0000825b82b0
 x2 : ffff0000816daf40 x1 : ffff800081e12cc0 x0 : f0f21322a6ad56c5
 Call trace:
  __srcu_read_lock+0x18/0x84 (P)
  gpiod_request_commit+0x30/0x174
  gpio_device_setup_shared+0x144/0x254
  gpiochip_add_data_with_key+0xc38/0xeec
  devm_gpiochip_add_data_with_key+0x30/0x7c
  tegra186_gpio_probe+0x5cc/0x844
  platform_probe+0x5c/0x98
  really_probe+0xbc/0x2a8
  __driver_probe_device+0x78/0x12c
  driver_probe_device+0x3c/0x15c
  __device_attach_driver+0xb8/0x134
  bus_for_each_drv+0x84/0xe0
  __device_attach+0x9c/0x188
  device_initial_probe+0x50/0x54
  bus_probe_device+0x38/0xa4
  deferred_probe_work_func+0x88/0xc0
  process_one_work+0x154/0x294
  worker_thread+0x184/0x304
  kthread+0x118/0x124
  ret_from_fork+0x10/0x20
 Code: d5384102 910003fd a90153f3 aa0003f3 (f9400014)
 ---[ end trace 0000000000000000 ]---


On Tegra234, the main GPIO controller has a total of 164 GPIOs (see
tegra234_main_ports in drivers/gpio/gpio-tegra186.c). The kernel
assigns these GPIOs an index from 0-163, but the GPIOs are not
contiguous with respect to the device-tree specifier.

For example, in device-tree, if I have a shared-gpio with the
following specifier ...

 gpios = <&gpio TEGRA234_MAIN_GPIO(AF, 1) GPIO_ACTIVE_LOW>;

The macro TEGRA234_MAIN_GPIO(AF, 1) evaluates to (23 * 8) + 1 = 185.
This is greater than 164, which causes the above crash: 'entry->offset'
in gpio_device_setup_shared() is greater than 'gdev->ngpio', so we end
up accessing invalid memory.
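
To make the arithmetic concrete, here is a minimal standalone sketch
(plain userspace C, not kernel code). The SKETCH_* constants and the
sketch_* helpers are hypothetical names; the values 23, 8 and 164 are
simply the figures quoted above. It only illustrates the mismatch
between the specifier value and the number of lines; it is not a
proposed fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Figures taken from the report above; all names here are hypothetical. */
#define SKETCH_PORT_AF        23u  /* port index used for AF in the report */
#define SKETCH_PINS_PER_PORT   8u  /* multiplier used in the report */
#define SKETCH_NGPIO         164u  /* lines on the Tegra234 main controller */

/* Mirrors the arithmetic quoted for TEGRA234_MAIN_GPIO(port, pin). */
static unsigned int sketch_dt_specifier(unsigned int port, unsigned int pin)
{
	return port * SKETCH_PINS_PER_PORT + pin;
}

/* True only if 'offset' can safely index an array of 'ngpio' entries. */
static bool sketch_offset_in_range(unsigned int offset, unsigned int ngpio)
{
	return offset < ngpio;
}

/* Checks the specific case from the report: AF.1 versus 164 lines. */
static void sketch_selfcheck(void)
{
	unsigned int spec = sketch_dt_specifier(SKETCH_PORT_AF, 1);

	assert(spec == 185);
	assert(!sketch_offset_in_range(spec, SKETCH_NGPIO));
}
```

Calling sketch_selfcheck() passes silently, which matches the
observation that the raw specifier value cannot be used directly as an
index into an array sized by the number of lines.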

This is what I have been able to determine so far, and I wanted to get
your input.

Thanks
Jon

--
nvpublic

