Re: [Intel-gfx] Issue with cec_register_adapter calling request_module() from an async context when called from intel_dp_detect
Hi Hans, On Thu, Feb 18, 2021 at 04:33:38PM +0100, Hans de Goede wrote: > On 2/17/21 5:29 PM, Hans Verkuil wrote: > > On 17/02/2021 16:11, Sean Young wrote: > >> Hi, > >> > >> On Wed, Feb 17, 2021 at 04:04:11PM +0100, Hans de Goede wrote: > >>> On 2/17/21 3:32 PM, Sean Young wrote: > On Wed, Feb 17, 2021 at 01:41:46PM +0100, Hans Verkuil wrote: > > Hi Hans, > > > > On 17/02/2021 13:24, Hans de Goede wrote: > >> > >> > >> Hi Hans, > >> > >> Fedora has a (opt-in) system to automatically collect backtraces from > >> software > >> crashing on users systems. > >> > >> This includes collecting kernel backtraces (including once triggered by > >> WARN macros) while looking a the top 10 of the most reported backtrace > >> during the > >> last 2 weeks report from ABRT: > >> https://retrace.fedoraproject.org/faf/problems/ > >> > >> I noticed the following backtrace: > >> https://retrace.fedoraproject.org/faf/problems/8150/ > >> which has been reported 17 times by Fedora users who have opted-in > >> during the > >> last 14 days. > >> > >> The issue here is that cec_register_adapter ends up calling > >> request_module() > >> from an async context, triggering this warn in kernel/kmod.c > >> __request_module(): > >> > >> /* > >> * We don't allow synchronous module loading from async. > >> Module > >> * init may invoke async_synchronize_full() which will end up > >> * waiting for this task which already is waiting for the > >> module > >> * loading to complete, leading to a deadlock. > >> */ > >> WARN_ON_ONCE(wait && current_is_async()); > >> > >> The call-path leading to this goes like this: > >> > >> ? kvasprintf+0x6d/0xa0 > >> ? kobject_set_name_vargs+0x6f/0x90 > >> rc_map_get+0x30/0x60 > > > > It's not CEC, it is rc_map_get that calls request_module() for > > rc-cec.ko. > > > > I've added Sean Young to the CC list. > > > > Sean, is it possible to treat rc-cec as a built-in if MEDIA_CEC_RC is > > set? > > > > I think this issue is very specific to CEC. I would not expect to see > > this > > with any other rc keymap. > > So CEC creates an RC device with a keymap (cec keymap, of course) and > then > the keymap needs to be loaded. We certainly don't want all keymaps as > builtins, that would be a waste. > > The cec keymap is scanned once to build a map from cec codes to linux > keycodes; making it builtin is not ideal, and makes the build system a > bit messy. > > I don't think we can load the keymap later, user space may start > remapping > the keymap from udev. > > Possibly we could create the cec or rc device later but this could be a > bit > messy. > > Could CEC specify: > > #if IS_ENABLED(CONFIG_MEDIA_CEC_RC) > MODULE_SOFTDEP("rc-cec") > #endif > >>> > >>> That would need to be: > >>> > >>> MODULE_SOFTDEP("pre: rc-cec") > >>> > >>> I see that the drm_kms_helper and i915 drivers both depend on the cec > >>> module already, > >>> so yes if that module will request for rc-cec to be loaded before it is > >>> loaded > >>> (and thus before i915 is loaded) then that should work around this. > >>> > >>> Assuming the user is using a module-loader which honors the softdep... > >>> > >>> Also this assumes that rc_map_get is smart enough to not call > >>> request_module() > >>> if the module is already loaded, is that the case ? > >> > >> Yes, see rc_map_get(). > > > > I tried this. It works if CONFIG_RC_CORE is set to m, but setting it to > > y resulted in the same problem. It looks like MODULE_SOFTDEP only works if > > rc_main > > is a module as well. > > Yeah that is a known limit of module softdeps, they only work inside modules > ... Yes, I assume this is the problem. > Still, assuming there is no easy other fix, we could still use this somehow. > > I do see that at least Fedora actually has CONFIG_RC_CORE=y for some reason. This is to make BPF IR decoding possible. > I guess we could maybe add the softdep to the CONFIG_RC_MAP module or > maybe to the module which contains the code enabled by CONFIG_DRM_DP_CEC ? > > At least Fedora has all drm stuff as modules and also has CONFIG_RC_MAP=m, > > I know this is not a real fix but a workaround to get rid of 170,000 > backtraces / 14 days being reported by (opted-in) systems running the > Fedora generic kernel config would be welcome regardless of it being the > "perfect" fix. Of course, I totally agree that a solution is needed. How about: 1) Use MODULE_SOFTDEP("rc-cec"); 2) If it's compiled as a module, rc-cec should be builtin Sean ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org htt
Re: [Intel-gfx] Issue with cec_register_adapter calling request_module() from an async context when called from intel_dp_detect
Hi, On 2/18/21 5:38 PM, Sean Young wrote: > Hi Hans, > > On Thu, Feb 18, 2021 at 04:33:38PM +0100, Hans de Goede wrote: >> On 2/17/21 5:29 PM, Hans Verkuil wrote: >>> On 17/02/2021 16:11, Sean Young wrote: Hi, On Wed, Feb 17, 2021 at 04:04:11PM +0100, Hans de Goede wrote: > On 2/17/21 3:32 PM, Sean Young wrote: >> On Wed, Feb 17, 2021 at 01:41:46PM +0100, Hans Verkuil wrote: >>> Hi Hans, >>> >>> On 17/02/2021 13:24, Hans de Goede wrote: Hi Hans, Fedora has a (opt-in) system to automatically collect backtraces from software crashing on users systems. This includes collecting kernel backtraces (including once triggered by WARN macros) while looking a the top 10 of the most reported backtrace during the last 2 weeks report from ABRT: https://retrace.fedoraproject.org/faf/problems/ I noticed the following backtrace: https://retrace.fedoraproject.org/faf/problems/8150/ which has been reported 17 times by Fedora users who have opted-in during the last 14 days. The issue here is that cec_register_adapter ends up calling request_module() from an async context, triggering this warn in kernel/kmod.c __request_module(): /* * We don't allow synchronous module loading from async. Module * init may invoke async_synchronize_full() which will end up * waiting for this task which already is waiting for the module * loading to complete, leading to a deadlock. */ WARN_ON_ONCE(wait && current_is_async()); The call-path leading to this goes like this: ? kvasprintf+0x6d/0xa0 ? kobject_set_name_vargs+0x6f/0x90 rc_map_get+0x30/0x60 >>> >>> It's not CEC, it is rc_map_get that calls request_module() for >>> rc-cec.ko. >>> >>> I've added Sean Young to the CC list. >>> >>> Sean, is it possible to treat rc-cec as a built-in if MEDIA_CEC_RC is >>> set? >>> >>> I think this issue is very specific to CEC. I would not expect to see >>> this >>> with any other rc keymap. >> >> So CEC creates an RC device with a keymap (cec keymap, of course) and >> then >> the keymap needs to be loaded. We certainly don't want all keymaps as >> builtins, that would be a waste. >> >> The cec keymap is scanned once to build a map from cec codes to linux >> keycodes; making it builtin is not ideal, and makes the build system a >> bit messy. >> >> I don't think we can load the keymap later, user space may start >> remapping >> the keymap from udev. >> >> Possibly we could create the cec or rc device later but this could be a >> bit >> messy. >> >> Could CEC specify: >> >> #if IS_ENABLED(CONFIG_MEDIA_CEC_RC) >> MODULE_SOFTDEP("rc-cec") >> #endif > > That would need to be: > > MODULE_SOFTDEP("pre: rc-cec") > > I see that the drm_kms_helper and i915 drivers both depend on the cec > module already, > so yes if that module will request for rc-cec to be loaded before it is > loaded > (and thus before i915 is loaded) then that should work around this. > > Assuming the user is using a module-loader which honors the softdep... > > Also this assumes that rc_map_get is smart enough to not call > request_module() > if the module is already loaded, is that the case ? Yes, see rc_map_get(). >>> >>> I tried this. It works if CONFIG_RC_CORE is set to m, but setting it to >>> y resulted in the same problem. It looks like MODULE_SOFTDEP only works if >>> rc_main >>> is a module as well. >> >> Yeah that is a known limit of module softdeps, they only work inside modules >> ... > > Yes, I assume this is the problem. > >> Still, assuming there is no easy other fix, we could still use this somehow. >> >> I do see that at least Fedora actually has CONFIG_RC_CORE=y for some reason. > > This is to make BPF IR decoding possible. > >> I guess we could maybe add the softdep to the CONFIG_RC_MAP module or >> maybe to the module which contains the code enabled by CONFIG_DRM_DP_CEC ? >> >> At least Fedora has all drm stuff as modules and also has CONFIG_RC_MAP=m, >> >> I know this is not a real fix but a workaround to get rid of 170,000 >> backtraces / 14 days being reported by (opted-in) systems running the >> Fedora generic kernel config would be welcome regardless of it being the >> "perfect" fix. > > Of course, I totally agree that a solution is needed. > > How about: > > 1) Use MODULE_SOFTDEP("rc-cec"); > > 2) If it's compiled as a module, rc-cec should be builtin I assume with 2. you mea
Re: [Intel-gfx] Issue with cec_register_adapter calling request_module() from an async context when called from intel_dp_detect
Hi, On 2/17/21 5:29 PM, Hans Verkuil wrote: > On 17/02/2021 16:11, Sean Young wrote: >> Hi, >> >> On Wed, Feb 17, 2021 at 04:04:11PM +0100, Hans de Goede wrote: >>> On 2/17/21 3:32 PM, Sean Young wrote: On Wed, Feb 17, 2021 at 01:41:46PM +0100, Hans Verkuil wrote: > Hi Hans, > > On 17/02/2021 13:24, Hans de Goede wrote: >> >> >> Hi Hans, >> >> Fedora has a (opt-in) system to automatically collect backtraces from >> software >> crashing on users systems. >> >> This includes collecting kernel backtraces (including once triggered by >> WARN macros) while looking a the top 10 of the most reported backtrace >> during the >> last 2 weeks report from ABRT: >> https://retrace.fedoraproject.org/faf/problems/ >> >> I noticed the following backtrace: >> https://retrace.fedoraproject.org/faf/problems/8150/ >> which has been reported 17 times by Fedora users who have opted-in >> during the >> last 14 days. >> >> The issue here is that cec_register_adapter ends up calling >> request_module() >> from an async context, triggering this warn in kernel/kmod.c >> __request_module(): >> >> /* >> * We don't allow synchronous module loading from async. Module >> * init may invoke async_synchronize_full() which will end up >> * waiting for this task which already is waiting for the module >> * loading to complete, leading to a deadlock. >> */ >> WARN_ON_ONCE(wait && current_is_async()); >> >> The call-path leading to this goes like this: >> >> ? kvasprintf+0x6d/0xa0 >> ? kobject_set_name_vargs+0x6f/0x90 >> rc_map_get+0x30/0x60 > > It's not CEC, it is rc_map_get that calls request_module() for rc-cec.ko. > > I've added Sean Young to the CC list. > > Sean, is it possible to treat rc-cec as a built-in if MEDIA_CEC_RC is set? > > I think this issue is very specific to CEC. I would not expect to see this > with any other rc keymap. So CEC creates an RC device with a keymap (cec keymap, of course) and then the keymap needs to be loaded. We certainly don't want all keymaps as builtins, that would be a waste. The cec keymap is scanned once to build a map from cec codes to linux keycodes; making it builtin is not ideal, and makes the build system a bit messy. I don't think we can load the keymap later, user space may start remapping the keymap from udev. Possibly we could create the cec or rc device later but this could be a bit messy. Could CEC specify: #if IS_ENABLED(CONFIG_MEDIA_CEC_RC) MODULE_SOFTDEP("rc-cec") #endif >>> >>> That would need to be: >>> >>> MODULE_SOFTDEP("pre: rc-cec") >>> >>> I see that the drm_kms_helper and i915 drivers both depend on the cec >>> module already, >>> so yes if that module will request for rc-cec to be loaded before it is >>> loaded >>> (and thus before i915 is loaded) then that should work around this. >>> >>> Assuming the user is using a module-loader which honors the softdep... >>> >>> Also this assumes that rc_map_get is smart enough to not call >>> request_module() >>> if the module is already loaded, is that the case ? >> >> Yes, see rc_map_get(). > > I tried this. It works if CONFIG_RC_CORE is set to m, but setting it to > y resulted in the same problem. It looks like MODULE_SOFTDEP only works if > rc_main > is a module as well. Yeah that is a known limit of module softdeps, they only work inside modules ... Still, assuming there is no easy other fix, we could still use this somehow. I do see that at least Fedora actually has CONFIG_RC_CORE=y for some reason. I guess we could maybe add the softdep to the CONFIG_RC_MAP module or maybe to the module which contains the code enabled by CONFIG_DRM_DP_CEC ? At least Fedora has all drm stuff as modules and also has CONFIG_RC_MAP=m, I know this is not a real fix but a workaround to get rid of 170,000 backtraces / 14 days being reported by (opted-in) systems running the Fedora generic kernel config would be welcome regardless of it being the "perfect" fix. Regards, Hans ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Issue with cec_register_adapter calling request_module() from an async context when called from intel_dp_detect
Hi, On Wed, Feb 17, 2021 at 05:29:46PM +0100, Hans Verkuil wrote: > On 17/02/2021 16:11, Sean Young wrote: > > On Wed, Feb 17, 2021 at 04:04:11PM +0100, Hans de Goede wrote: > >> On 2/17/21 3:32 PM, Sean Young wrote: > >>> On Wed, Feb 17, 2021 at 01:41:46PM +0100, Hans Verkuil wrote: > Hi Hans, > > On 17/02/2021 13:24, Hans de Goede wrote: > > > > > > Hi Hans, > > > > Fedora has a (opt-in) system to automatically collect backtraces from > > software > > crashing on users systems. > > > > This includes collecting kernel backtraces (including once triggered by > > WARN macros) while looking a the top 10 of the most reported backtrace > > during the > > last 2 weeks report from ABRT: > > https://retrace.fedoraproject.org/faf/problems/ > > > > I noticed the following backtrace: > > https://retrace.fedoraproject.org/faf/problems/8150/ > > which has been reported 17 times by Fedora users who have opted-in > > during the > > last 14 days. > > > > The issue here is that cec_register_adapter ends up calling > > request_module() > > from an async context, triggering this warn in kernel/kmod.c > > __request_module(): > > > > /* > > * We don't allow synchronous module loading from async. Module > > * init may invoke async_synchronize_full() which will end up > > * waiting for this task which already is waiting for the module > > * loading to complete, leading to a deadlock. > > */ > > WARN_ON_ONCE(wait && current_is_async()); > > > > The call-path leading to this goes like this: > > > > ? kvasprintf+0x6d/0xa0 > > ? kobject_set_name_vargs+0x6f/0x90 > > rc_map_get+0x30/0x60 > > It's not CEC, it is rc_map_get that calls request_module() for rc-cec.ko. > > I've added Sean Young to the CC list. > > Sean, is it possible to treat rc-cec as a built-in if MEDIA_CEC_RC is > set? > > I think this issue is very specific to CEC. I would not expect to see > this > with any other rc keymap. > >>> > >>> So CEC creates an RC device with a keymap (cec keymap, of course) and then > >>> the keymap needs to be loaded. We certainly don't want all keymaps as > >>> builtins, that would be a waste. > >>> > >>> The cec keymap is scanned once to build a map from cec codes to linux > >>> keycodes; making it builtin is not ideal, and makes the build system a > >>> bit messy. > >>> > >>> I don't think we can load the keymap later, user space may start remapping > >>> the keymap from udev. > >>> > >>> Possibly we could create the cec or rc device later but this could be a > >>> bit > >>> messy. > >>> > >>> Could CEC specify: > >>> > >>> #if IS_ENABLED(CONFIG_MEDIA_CEC_RC) > >>> MODULE_SOFTDEP("rc-cec") > >>> #endif > >> > >> That would need to be: > >> > >> MODULE_SOFTDEP("pre: rc-cec") > >> > >> I see that the drm_kms_helper and i915 drivers both depend on the cec > >> module already, > >> so yes if that module will request for rc-cec to be loaded before it is > >> loaded > >> (and thus before i915 is loaded) then that should work around this. > >> > >> Assuming the user is using a module-loader which honors the softdep... > >> > >> Also this assumes that rc_map_get is smart enough to not call > >> request_module() > >> if the module is already loaded, is that the case ? > > > > Yes, see rc_map_get(). > > I tried this. It works if CONFIG_RC_CORE is set to m, but setting it to > y resulted in the same problem. It looks like MODULE_SOFTDEP only works if > rc_main > is a module as well. Hmm, I'm not quite sure what is happening here. How can I reproduce this issue locally? Sean ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Issue with cec_register_adapter calling request_module() from an async context when called from intel_dp_detect
Hi, On Wed, Feb 17, 2021 at 04:04:11PM +0100, Hans de Goede wrote: > On 2/17/21 3:32 PM, Sean Young wrote: > > On Wed, Feb 17, 2021 at 01:41:46PM +0100, Hans Verkuil wrote: > >> Hi Hans, > >> > >> On 17/02/2021 13:24, Hans de Goede wrote: > >>> > >>> > >>> Hi Hans, > >>> > >>> Fedora has a (opt-in) system to automatically collect backtraces from > >>> software > >>> crashing on users systems. > >>> > >>> This includes collecting kernel backtraces (including once triggered by > >>> WARN macros) while looking a the top 10 of the most reported backtrace > >>> during the > >>> last 2 weeks report from ABRT: > >>> https://retrace.fedoraproject.org/faf/problems/ > >>> > >>> I noticed the following backtrace: > >>> https://retrace.fedoraproject.org/faf/problems/8150/ > >>> which has been reported 17 times by Fedora users who have opted-in > >>> during the > >>> last 14 days. > >>> > >>> The issue here is that cec_register_adapter ends up calling > >>> request_module() > >>> from an async context, triggering this warn in kernel/kmod.c > >>> __request_module(): > >>> > >>> /* > >>> * We don't allow synchronous module loading from async. Module > >>> * init may invoke async_synchronize_full() which will end up > >>> * waiting for this task which already is waiting for the module > >>> * loading to complete, leading to a deadlock. > >>> */ > >>> WARN_ON_ONCE(wait && current_is_async()); > >>> > >>> The call-path leading to this goes like this: > >>> > >>> ? kvasprintf+0x6d/0xa0 > >>> ? kobject_set_name_vargs+0x6f/0x90 > >>> rc_map_get+0x30/0x60 > >> > >> It's not CEC, it is rc_map_get that calls request_module() for rc-cec.ko. > >> > >> I've added Sean Young to the CC list. > >> > >> Sean, is it possible to treat rc-cec as a built-in if MEDIA_CEC_RC is set? > >> > >> I think this issue is very specific to CEC. I would not expect to see this > >> with any other rc keymap. > > > > So CEC creates an RC device with a keymap (cec keymap, of course) and then > > the keymap needs to be loaded. We certainly don't want all keymaps as > > builtins, that would be a waste. > > > > The cec keymap is scanned once to build a map from cec codes to linux > > keycodes; making it builtin is not ideal, and makes the build system a > > bit messy. > > > > I don't think we can load the keymap later, user space may start remapping > > the keymap from udev. > > > > Possibly we could create the cec or rc device later but this could be a bit > > messy. > > > > Could CEC specify: > > > > #if IS_ENABLED(CONFIG_MEDIA_CEC_RC) > > MODULE_SOFTDEP("rc-cec") > > #endif > > That would need to be: > > MODULE_SOFTDEP("pre: rc-cec") > > I see that the drm_kms_helper and i915 drivers both depend on the cec module > already, > so yes if that module will request for rc-cec to be loaded before it is loaded > (and thus before i915 is loaded) then that should work around this. > > Assuming the user is using a module-loader which honors the softdep... > > Also this assumes that rc_map_get is smart enough to not call request_module() > if the module is already loaded, is that the case ? Yes, see rc_map_get(). Thanks, Sean ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Issue with cec_register_adapter calling request_module() from an async context when called from intel_dp_detect
On Wed, Feb 17, 2021 at 01:41:46PM +0100, Hans Verkuil wrote: > Hi Hans, > > On 17/02/2021 13:24, Hans de Goede wrote: > > > > > > Hi Hans, > > > > Fedora has a (opt-in) system to automatically collect backtraces from > > software > > crashing on users systems. > > > > This includes collecting kernel backtraces (including once triggered by > > WARN macros) while looking a the top 10 of the most reported backtrace > > during the > > last 2 weeks report from ABRT: > > https://retrace.fedoraproject.org/faf/problems/ > > > > I noticed the following backtrace: > > https://retrace.fedoraproject.org/faf/problems/8150/ > > which has been reported 17 times by Fedora users who have opted-in > > during the > > last 14 days. > > > > The issue here is that cec_register_adapter ends up calling request_module() > > from an async context, triggering this warn in kernel/kmod.c > > __request_module(): > > > > /* > > * We don't allow synchronous module loading from async. Module > > * init may invoke async_synchronize_full() which will end up > > * waiting for this task which already is waiting for the module > > * loading to complete, leading to a deadlock. > > */ > > WARN_ON_ONCE(wait && current_is_async()); > > > > The call-path leading to this goes like this: > > > > ? kvasprintf+0x6d/0xa0 > > ? kobject_set_name_vargs+0x6f/0x90 > > rc_map_get+0x30/0x60 > > It's not CEC, it is rc_map_get that calls request_module() for rc-cec.ko. > > I've added Sean Young to the CC list. > > Sean, is it possible to treat rc-cec as a built-in if MEDIA_CEC_RC is set? > > I think this issue is very specific to CEC. I would not expect to see this > with any other rc keymap. So CEC creates an RC device with a keymap (cec keymap, of course) and then the keymap needs to be loaded. We certainly don't want all keymaps as builtins, that would be a waste. The cec keymap is scanned once to build a map from cec codes to linux keycodes; making it builtin is not ideal, and makes the build system a bit messy. I don't think we can load the keymap later, user space may start remapping the keymap from udev. Possibly we could create the cec or rc device later but this could be a bit messy. Could CEC specify: #if IS_ENABLED(CONFIG_MEDIA_CEC_RC) MODULE_SOFTDEP("rc-cec") #endif ? Sean > > Regards, > > Hans > > > rc_register_device+0x108/0x510 > > cec_register_adapter+0x5c/0x280 [cec] > > drm_dp_cec_set_edid+0x11e/0x178 [drm_kms_helper] > > intel_dp_set_edid+0x8d/0xc0 [i915] > > intel_dp_detect+0x188/0x5c0 [i915] > > drm_helper_probe_single_connector_modes+0xc2/0x6d0 [drm_kms_helper] > > ? krealloc+0x7b/0xb0 > > drm_client_modeset_probe+0x25b/0x1320 [drm] > > ? kfree+0x1ea/0x200 > > ? sched_clock+0x5/0x10 > > ? sched_clock_cpu+0xc/0xa0 > > __drm_fb_helper_initial_config_and_unlock+0x37/0x470 [drm_kms_helper] > > ? _cond_resched+0x16/0x40 > > intel_fbdev_initial_config+0x14/0x30 [i915] > > async_run_entry_fn+0x39/0x160 > > > > So 2 questions: > > > > 1. Can we get this fixed please ? > >Related to this, what happens if we make this an async modprobe > >(when running from async context) is that a problem, or is it fine > >if the rc_map module gets loaded later ? > > > > 2. If the answer to 1. is "tricky", "maybe" or some such then can we > > look into a workaround here ? E.g. do we know in advance which module > > is going to be requested (1), or does that depend on the EDID data ? > > > > Regards, > > > > Hans > > > > > > 1) And can we thus do tricks with a softdep on it ? > > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Issue with cec_register_adapter calling request_module() from an async context when called from intel_dp_detect
On 18/02/2021 09:52, Sean Young wrote: > Hi, > > On Wed, Feb 17, 2021 at 05:29:46PM +0100, Hans Verkuil wrote: >> On 17/02/2021 16:11, Sean Young wrote: >>> On Wed, Feb 17, 2021 at 04:04:11PM +0100, Hans de Goede wrote: On 2/17/21 3:32 PM, Sean Young wrote: > On Wed, Feb 17, 2021 at 01:41:46PM +0100, Hans Verkuil wrote: >> Hi Hans, >> >> On 17/02/2021 13:24, Hans de Goede wrote: >>> >>> >>> Hi Hans, >>> >>> Fedora has a (opt-in) system to automatically collect backtraces from >>> software >>> crashing on users systems. >>> >>> This includes collecting kernel backtraces (including once triggered by >>> WARN macros) while looking a the top 10 of the most reported backtrace >>> during the >>> last 2 weeks report from ABRT: >>> https://retrace.fedoraproject.org/faf/problems/ >>> >>> I noticed the following backtrace: >>> https://retrace.fedoraproject.org/faf/problems/8150/ >>> which has been reported 17 times by Fedora users who have opted-in >>> during the >>> last 14 days. >>> >>> The issue here is that cec_register_adapter ends up calling >>> request_module() >>> from an async context, triggering this warn in kernel/kmod.c >>> __request_module(): >>> >>> /* >>> * We don't allow synchronous module loading from async. Module >>> * init may invoke async_synchronize_full() which will end up >>> * waiting for this task which already is waiting for the module >>> * loading to complete, leading to a deadlock. >>> */ >>> WARN_ON_ONCE(wait && current_is_async()); >>> >>> The call-path leading to this goes like this: >>> >>> ? kvasprintf+0x6d/0xa0 >>> ? kobject_set_name_vargs+0x6f/0x90 >>> rc_map_get+0x30/0x60 >> >> It's not CEC, it is rc_map_get that calls request_module() for rc-cec.ko. >> >> I've added Sean Young to the CC list. >> >> Sean, is it possible to treat rc-cec as a built-in if MEDIA_CEC_RC is >> set? >> >> I think this issue is very specific to CEC. I would not expect to see >> this >> with any other rc keymap. > > So CEC creates an RC device with a keymap (cec keymap, of course) and then > the keymap needs to be loaded. We certainly don't want all keymaps as > builtins, that would be a waste. > > The cec keymap is scanned once to build a map from cec codes to linux > keycodes; making it builtin is not ideal, and makes the build system a > bit messy. > > I don't think we can load the keymap later, user space may start remapping > the keymap from udev. > > Possibly we could create the cec or rc device later but this could be a > bit > messy. > > Could CEC specify: > > #if IS_ENABLED(CONFIG_MEDIA_CEC_RC) > MODULE_SOFTDEP("rc-cec") > #endif That would need to be: MODULE_SOFTDEP("pre: rc-cec") I see that the drm_kms_helper and i915 drivers both depend on the cec module already, so yes if that module will request for rc-cec to be loaded before it is loaded (and thus before i915 is loaded) then that should work around this. Assuming the user is using a module-loader which honors the softdep... Also this assumes that rc_map_get is smart enough to not call request_module() if the module is already loaded, is that the case ? >>> >>> Yes, see rc_map_get(). >> >> I tried this. It works if CONFIG_RC_CORE is set to m, but setting it to >> y resulted in the same problem. It looks like MODULE_SOFTDEP only works if >> rc_main >> is a module as well. > > Hmm, I'm not quite sure what is happening here. How can I reproduce this > issue locally? You need the right hardware for this, I'm afraid: this issue happens if you have a DisplayPort-to-HDMI adapter that supports CEC and CONFIG_DRM_DP_CEC is set to y. In my case I have an Intel NUC with a USB-C to HDMI adapter from Club 3D (CAC-2504). I can easily test patches for you. Regards, Hans ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Issue with cec_register_adapter calling request_module() from an async context when called from intel_dp_detect
On 17/02/2021 16:11, Sean Young wrote: > Hi, > > On Wed, Feb 17, 2021 at 04:04:11PM +0100, Hans de Goede wrote: >> On 2/17/21 3:32 PM, Sean Young wrote: >>> On Wed, Feb 17, 2021 at 01:41:46PM +0100, Hans Verkuil wrote: Hi Hans, On 17/02/2021 13:24, Hans de Goede wrote: > > > Hi Hans, > > Fedora has a (opt-in) system to automatically collect backtraces from > software > crashing on users systems. > > This includes collecting kernel backtraces (including once triggered by > WARN macros) while looking a the top 10 of the most reported backtrace > during the > last 2 weeks report from ABRT: > https://retrace.fedoraproject.org/faf/problems/ > > I noticed the following backtrace: > https://retrace.fedoraproject.org/faf/problems/8150/ > which has been reported 17 times by Fedora users who have opted-in > during the > last 14 days. > > The issue here is that cec_register_adapter ends up calling > request_module() > from an async context, triggering this warn in kernel/kmod.c > __request_module(): > > /* > * We don't allow synchronous module loading from async. Module > * init may invoke async_synchronize_full() which will end up > * waiting for this task which already is waiting for the module > * loading to complete, leading to a deadlock. > */ > WARN_ON_ONCE(wait && current_is_async()); > > The call-path leading to this goes like this: > > ? kvasprintf+0x6d/0xa0 > ? kobject_set_name_vargs+0x6f/0x90 > rc_map_get+0x30/0x60 It's not CEC, it is rc_map_get that calls request_module() for rc-cec.ko. I've added Sean Young to the CC list. Sean, is it possible to treat rc-cec as a built-in if MEDIA_CEC_RC is set? I think this issue is very specific to CEC. I would not expect to see this with any other rc keymap. >>> >>> So CEC creates an RC device with a keymap (cec keymap, of course) and then >>> the keymap needs to be loaded. We certainly don't want all keymaps as >>> builtins, that would be a waste. >>> >>> The cec keymap is scanned once to build a map from cec codes to linux >>> keycodes; making it builtin is not ideal, and makes the build system a >>> bit messy. >>> >>> I don't think we can load the keymap later, user space may start remapping >>> the keymap from udev. >>> >>> Possibly we could create the cec or rc device later but this could be a bit >>> messy. >>> >>> Could CEC specify: >>> >>> #if IS_ENABLED(CONFIG_MEDIA_CEC_RC) >>> MODULE_SOFTDEP("rc-cec") >>> #endif >> >> That would need to be: >> >> MODULE_SOFTDEP("pre: rc-cec") >> >> I see that the drm_kms_helper and i915 drivers both depend on the cec module >> already, >> so yes if that module will request for rc-cec to be loaded before it is >> loaded >> (and thus before i915 is loaded) then that should work around this. >> >> Assuming the user is using a module-loader which honors the softdep... >> >> Also this assumes that rc_map_get is smart enough to not call >> request_module() >> if the module is already loaded, is that the case ? > > Yes, see rc_map_get(). I tried this. It works if CONFIG_RC_CORE is set to m, but setting it to y resulted in the same problem. It looks like MODULE_SOFTDEP only works if rc_main is a module as well. It was a good idea, though :-) Regards, Hans ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Issue with cec_register_adapter calling request_module() from an async context when called from intel_dp_detect
Hi, On 2/17/21 3:32 PM, Sean Young wrote: > On Wed, Feb 17, 2021 at 01:41:46PM +0100, Hans Verkuil wrote: >> Hi Hans, >> >> On 17/02/2021 13:24, Hans de Goede wrote: >>> >>> >>> Hi Hans, >>> >>> Fedora has a (opt-in) system to automatically collect backtraces from >>> software >>> crashing on users systems. >>> >>> This includes collecting kernel backtraces (including once triggered by >>> WARN macros) while looking a the top 10 of the most reported backtrace >>> during the >>> last 2 weeks report from ABRT: >>> https://retrace.fedoraproject.org/faf/problems/ >>> >>> I noticed the following backtrace: >>> https://retrace.fedoraproject.org/faf/problems/8150/ >>> which has been reported 17 times by Fedora users who have opted-in >>> during the >>> last 14 days. >>> >>> The issue here is that cec_register_adapter ends up calling request_module() >>> from an async context, triggering this warn in kernel/kmod.c >>> __request_module(): >>> >>> /* >>> * We don't allow synchronous module loading from async. Module >>> * init may invoke async_synchronize_full() which will end up >>> * waiting for this task which already is waiting for the module >>> * loading to complete, leading to a deadlock. >>> */ >>> WARN_ON_ONCE(wait && current_is_async()); >>> >>> The call-path leading to this goes like this: >>> >>> ? kvasprintf+0x6d/0xa0 >>> ? kobject_set_name_vargs+0x6f/0x90 >>> rc_map_get+0x30/0x60 >> >> It's not CEC, it is rc_map_get that calls request_module() for rc-cec.ko. >> >> I've added Sean Young to the CC list. >> >> Sean, is it possible to treat rc-cec as a built-in if MEDIA_CEC_RC is set? >> >> I think this issue is very specific to CEC. I would not expect to see this >> with any other rc keymap. > > So CEC creates an RC device with a keymap (cec keymap, of course) and then > the keymap needs to be loaded. We certainly don't want all keymaps as > builtins, that would be a waste. > > The cec keymap is scanned once to build a map from cec codes to linux > keycodes; making it builtin is not ideal, and makes the build system a > bit messy. > > I don't think we can load the keymap later, user space may start remapping > the keymap from udev. > > Possibly we could create the cec or rc device later but this could be a bit > messy. > > Could CEC specify: > > #if IS_ENABLED(CONFIG_MEDIA_CEC_RC) > MODULE_SOFTDEP("rc-cec") > #endif That would need to be: MODULE_SOFTDEP("pre: rc-cec") I see that the drm_kms_helper and i915 drivers both depend on the cec module already, so yes if that module will request for rc-cec to be loaded before it is loaded (and thus before i915 is loaded) then that should work around this. Assuming the user is using a module-loader which honors the softdep... Also this assumes that rc_map_get is smart enough to not call request_module() if the module is already loaded, is that the case ? Regards, Hans >>> rc_register_device+0x108/0x510 >>> cec_register_adapter+0x5c/0x280 [cec] >>> drm_dp_cec_set_edid+0x11e/0x178 [drm_kms_helper] >>> intel_dp_set_edid+0x8d/0xc0 [i915] >>> intel_dp_detect+0x188/0x5c0 [i915] >>> drm_helper_probe_single_connector_modes+0xc2/0x6d0 [drm_kms_helper] >>> ? krealloc+0x7b/0xb0 >>> drm_client_modeset_probe+0x25b/0x1320 [drm] >>> ? kfree+0x1ea/0x200 >>> ? sched_clock+0x5/0x10 >>> ? sched_clock_cpu+0xc/0xa0 >>> __drm_fb_helper_initial_config_and_unlock+0x37/0x470 [drm_kms_helper] >>> ? _cond_resched+0x16/0x40 >>> intel_fbdev_initial_config+0x14/0x30 [i915] >>> async_run_entry_fn+0x39/0x160 >>> >>> So 2 questions: >>> >>> 1. Can we get this fixed please ? >>>Related to this, what happens if we make this an async modprobe >>>(when running from async context) is that a problem, or is it fine >>>if the rc_map module gets loaded later ? >>> >>> 2. If the answer to 1. is "tricky", "maybe" or some such then can we >>> look into a workaround here ? E.g. do we know in advance which module >>> is going to be requested (1), or does that depend on the EDID data ? >>> >>> Regards, >>> >>> Hans >>> >>> >>> 1) And can we thus do tricks with a softdep on it ? >>> > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] Issue with cec_register_adapter calling request_module() from an async context when called from intel_dp_detect
Hi Hans, On 17/02/2021 13:24, Hans de Goede wrote: > > > Hi Hans, > > Fedora has a (opt-in) system to automatically collect backtraces from software > crashing on users systems. > > This includes collecting kernel backtraces (including once triggered by > WARN macros) while looking a the top 10 of the most reported backtrace during > the > last 2 weeks report from ABRT: https://retrace.fedoraproject.org/faf/problems/ > > I noticed the following backtrace: > https://retrace.fedoraproject.org/faf/problems/8150/ > which has been reported 17 times by Fedora users who have opted-in during > the > last 14 days. > > The issue here is that cec_register_adapter ends up calling request_module() > from an async context, triggering this warn in kernel/kmod.c > __request_module(): > > /* > * We don't allow synchronous module loading from async. Module > * init may invoke async_synchronize_full() which will end up > * waiting for this task which already is waiting for the module > * loading to complete, leading to a deadlock. > */ > WARN_ON_ONCE(wait && current_is_async()); > > The call-path leading to this goes like this: > > ? kvasprintf+0x6d/0xa0 > ? kobject_set_name_vargs+0x6f/0x90 > rc_map_get+0x30/0x60 It's not CEC, it is rc_map_get that calls request_module() for rc-cec.ko. I've added Sean Young to the CC list. Sean, is it possible to treat rc-cec as a built-in if MEDIA_CEC_RC is set? I think this issue is very specific to CEC. I would not expect to see this with any other rc keymap. Regards, Hans > rc_register_device+0x108/0x510 > cec_register_adapter+0x5c/0x280 [cec] > drm_dp_cec_set_edid+0x11e/0x178 [drm_kms_helper] > intel_dp_set_edid+0x8d/0xc0 [i915] > intel_dp_detect+0x188/0x5c0 [i915] > drm_helper_probe_single_connector_modes+0xc2/0x6d0 [drm_kms_helper] > ? krealloc+0x7b/0xb0 > drm_client_modeset_probe+0x25b/0x1320 [drm] > ? kfree+0x1ea/0x200 > ? sched_clock+0x5/0x10 > ? sched_clock_cpu+0xc/0xa0 > __drm_fb_helper_initial_config_and_unlock+0x37/0x470 [drm_kms_helper] > ? _cond_resched+0x16/0x40 > intel_fbdev_initial_config+0x14/0x30 [i915] > async_run_entry_fn+0x39/0x160 > > So 2 questions: > > 1. Can we get this fixed please ? >Related to this, what happens if we make this an async modprobe >(when running from async context) is that a problem, or is it fine >if the rc_map module gets loaded later ? > > 2. If the answer to 1. is "tricky", "maybe" or some such then can we > look into a workaround here ? E.g. do we know in advance which module > is going to be requested (1), or does that depend on the EDID data ? > > Regards, > > Hans > > > 1) And can we thus do tricks with a softdep on it ? > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] Issue with cec_register_adapter calling request_module() from an async context when called from intel_dp_detect
Hi Hans, Fedora has a (opt-in) system to automatically collect backtraces from software crashing on users systems. This includes collecting kernel backtraces (including once triggered by WARN macros) while looking a the top 10 of the most reported backtrace during the last 2 weeks report from ABRT: https://retrace.fedoraproject.org/faf/problems/ I noticed the following backtrace: https://retrace.fedoraproject.org/faf/problems/8150/ which has been reported 17 times by Fedora users who have opted-in during the last 14 days. The issue here is that cec_register_adapter ends up calling request_module() from an async context, triggering this warn in kernel/kmod.c __request_module(): /* * We don't allow synchronous module loading from async. Module * init may invoke async_synchronize_full() which will end up * waiting for this task which already is waiting for the module * loading to complete, leading to a deadlock. */ WARN_ON_ONCE(wait && current_is_async()); The call-path leading to this goes like this: ? kvasprintf+0x6d/0xa0 ? kobject_set_name_vargs+0x6f/0x90 rc_map_get+0x30/0x60 rc_register_device+0x108/0x510 cec_register_adapter+0x5c/0x280 [cec] drm_dp_cec_set_edid+0x11e/0x178 [drm_kms_helper] intel_dp_set_edid+0x8d/0xc0 [i915] intel_dp_detect+0x188/0x5c0 [i915] drm_helper_probe_single_connector_modes+0xc2/0x6d0 [drm_kms_helper] ? krealloc+0x7b/0xb0 drm_client_modeset_probe+0x25b/0x1320 [drm] ? kfree+0x1ea/0x200 ? sched_clock+0x5/0x10 ? sched_clock_cpu+0xc/0xa0 __drm_fb_helper_initial_config_and_unlock+0x37/0x470 [drm_kms_helper] ? _cond_resched+0x16/0x40 intel_fbdev_initial_config+0x14/0x30 [i915] async_run_entry_fn+0x39/0x160 So 2 questions: 1. Can we get this fixed please ? Related to this, what happens if we make this an async modprobe (when running from async context) is that a problem, or is it fine if the rc_map module gets loaded later ? 2. If the answer to 1. is "tricky", "maybe" or some such then can we look into a workaround here ? E.g. do we know in advance which module is going to be requested (1), or does that depend on the EDID data ? Regards, Hans 1) And can we thus do tricks with a softdep on it ? ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] Issue with cec_register_adapter calling request_module() from an async context when called from intel_dp_detect
Hi Hans, Fedora has a (opt-in) system to automatically collect backtraces from software crashing on users systems. This includes collecting kernel backtraces (including once triggered by WARN macros) while looking a the top 10 of the most reported backtrace during the last 2 weeks report from ABRT: https://retrace.fedoraproject.org/faf/problems/ I noticed the following backtrace: https://retrace.fedoraproject.org/faf/problems/8150/ which has been reported 17 times by Fedora users who have opted-in during the last 14 days. The issue here is that cec_register_adapter ends up calling request_module() from an async context, triggering this warn in kernel/kmod.c __request_module(): /* * We don't allow synchronous module loading from async. Module * init may invoke async_synchronize_full() which will end up * waiting for this task which already is waiting for the module * loading to complete, leading to a deadlock. */ WARN_ON_ONCE(wait && current_is_async()); The call-path leading to this goes like this: ? kvasprintf+0x6d/0xa0 ? kobject_set_name_vargs+0x6f/0x90 rc_map_get+0x30/0x60 rc_register_device+0x108/0x510 cec_register_adapter+0x5c/0x280 [cec] drm_dp_cec_set_edid+0x11e/0x178 [drm_kms_helper] intel_dp_set_edid+0x8d/0xc0 [i915] intel_dp_detect+0x188/0x5c0 [i915] drm_helper_probe_single_connector_modes+0xc2/0x6d0 [drm_kms_helper] ? krealloc+0x7b/0xb0 drm_client_modeset_probe+0x25b/0x1320 [drm] ? kfree+0x1ea/0x200 ? sched_clock+0x5/0x10 ? sched_clock_cpu+0xc/0xa0 __drm_fb_helper_initial_config_and_unlock+0x37/0x470 [drm_kms_helper] ? _cond_resched+0x16/0x40 intel_fbdev_initial_config+0x14/0x30 [i915] async_run_entry_fn+0x39/0x160 So 2 questions: 1. Can we get this fixed please ? Related to this, what happens if we make this an async modprobe (when running from async context) is that a problem, or is it fine if the rc_map module gets loaded later ? 2. If the answer to 1. is "tricky", "maybe" or some such then can we look into a workaround here ? E.g. do we know in advance which module is going to be requested (1), or does that depend on the EDID data ? Regards, Hans 1) And can we thus do tricks with a softdep on it ? ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx