Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration
On Mon, Mar 25, 2019 at 03:05:33PM +, Mark Brown wrote: > On Mon, Mar 25, 2019 at 07:21:00AM -0700, Guenter Roeck wrote: > > > It is actually a bit more complicated than that. The stored pointer > > (drv->soc_card) > > isn't released. The problem is that dev_get_drvdata(drv->soc_card->dev) is > > NULL, > > which causes the crash. I don't think there is a UAF involved - I built the > > test image with KASAN enabled and it did not barf at me. > > What is a "UAF"? > use-after-free. Sorry, I saw that term used so often recently that I somehow thought it was common and started using it myself. Guenter > > Overall the implementation does seem a bit suspicious to me. I don't really > > understand why the platform driver handles suspend/resume for the cards. > > But that may just be my lack of understanding. However, either case, I > > think the > > Haswell driver (sst-haswell-pcm.c) has a similar problem. I am also not > > sure if > > It's certainly a bit unusual, usually the platform driver would just > deal with suspending itself and the card driver would handle overall > card suspension together with the core.
Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration
On Mon, Mar 25, 2019 at 07:21:00AM -0700, Guenter Roeck wrote: > It is actually a bit more complicated than that. The stored pointer > (drv->soc_card) > isn't released. The problem is that dev_get_drvdata(drv->soc_card->dev) is > NULL, > which causes the crash. I don't think there is a UAF involved - I built the > test image with KASAN enabled and it did not barf at me. What is a "UAF"? > Overall the implementation does seem a bit suspicious to me. I don't really > understand why the platform driver handles suspend/resume for the cards. > But that may just be my lack of understanding. However, either case, I think > the > Haswell driver (sst-haswell-pcm.c) has a similar problem. I am also not sure > if It's certainly a bit unusual, usually the platform driver would just deal with suspending itself and the card driver would handle overall card suspension together with the core. signature.asc Description: PGP signature
Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration
On Mon, Mar 25, 2019 at 09:18:04AM -0400, Pierre-Louis Bossart wrote: > On 3/25/19 8:12 AM, Mark Brown wrote: > > These are driver specific issues not device model issues as far as I can > > see? The issue fixed by this as is that you're storing a pointer in the > > ASoC level (not device model level) probe that you don't free when the > > component is unbound, causing you to dereference it later during > > suspend. There is absolutely no problem with the machine driver not > > being guaranteed to bind at the time it's initially registered, that's > > perfectly normal and should cause no problems. > Agree, what I was referring is that if the machine probe and card > registration fails (not just deferred), the parent acpi/pci driver isn't > notified - there is just no means to provide that information and that leads > to all kinds of configuration issues. If there are issues here they could happen via means other than a probe failing so there's a problem whatever is going on - someone manually unbinding a device for example. signature.asc Description: PGP signature
Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration
On 3/25/19 5:12 AM, Mark Brown wrote: On Sat, Mar 23, 2019 at 09:55:46AM -0400, Pierre-Louis Bossart wrote: I'd like to highlight that there is a fundamental flaw in the way the machine drivers are handled. Since we don't have a hook for the machine driver in the BIOS, the DSP driver creates a platform_device which will instantiate the machine driver. When errors happen in the machine driver probe, they are suppressed due to a 'feature' of the device model, so you can end-up with a broken configuration that is still reported as a successful strobe. These are driver specific issues not device model issues as far as I can see? The issue fixed by this as is that you're storing a pointer in the ASoC level (not device model level) probe that you don't free when the component is unbound, causing you to dereference it later during suspend. There is absolutely no problem with the machine driver not being guaranteed to bind at the time it's initially registered, that's perfectly normal and should cause no problems. It is actually a bit more complicated than that. The stored pointer (drv->soc_card) isn't released. The problem is that dev_get_drvdata(drv->soc_card->dev) is NULL, which causes the crash. I don't think there is a UAF involved - I built the test image with KASAN enabled and it did not barf at me. It may of course well be that there _should_ be a UAF but it doesn't happen because some pointer that should be released isn't released due to some memory or reference count leak. But that would be a different problem. Overall the implementation does seem a bit suspicious to me. I don't really understand why the platform driver handles suspend/resume for the cards. But that may just be my lack of understanding. However, either case, I think the Haswell driver (sst-haswell-pcm.c) has a similar problem. I am also not sure if there are more problems lurking - I see a similar but different crash in v4.4.y but have not been able to track it down. Actually, I found the problem fixed here while trying to reproduce that crash with the latest kernel. Guenter
Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration
On 3/25/19 8:12 AM, Mark Brown wrote: On Sat, Mar 23, 2019 at 09:55:46AM -0400, Pierre-Louis Bossart wrote: I'd like to highlight that there is a fundamental flaw in the way the machine drivers are handled. Since we don't have a hook for the machine driver in the BIOS, the DSP driver creates a platform_device which will instantiate the machine driver. When errors happen in the machine driver probe, they are suppressed due to a 'feature' of the device model, so you can end-up with a broken configuration that is still reported as a successful strobe. These are driver specific issues not device model issues as far as I can see? The issue fixed by this as is that you're storing a pointer in the ASoC level (not device model level) probe that you don't free when the component is unbound, causing you to dereference it later during suspend. There is absolutely no problem with the machine driver not being guaranteed to bind at the time it's initially registered, that's perfectly normal and should cause no problems. Agree, what I was referring is that if the machine probe and card registration fails (not just deferred), the parent acpi/pci driver isn't notified - there is just no means to provide that information and that leads to all kinds of configuration issues.
Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration
On Sat, Mar 23, 2019 at 09:55:46AM -0400, Pierre-Louis Bossart wrote: > I'd like to highlight that there is a fundamental flaw in the way the > machine drivers are handled. Since we don't have a hook for the machine > driver in the BIOS, the DSP driver creates a platform_device which will > instantiate the machine driver. When errors happen in the machine driver > probe, they are suppressed due to a 'feature' of the device model, so you > can end-up with a broken configuration that is still reported as a > successful strobe. These are driver specific issues not device model issues as far as I can see? The issue fixed by this as is that you're storing a pointer in the ASoC level (not device model level) probe that you don't free when the component is unbound, causing you to dereference it later during suspend. There is absolutely no problem with the machine driver not being guaranteed to bind at the time it's initially registered, that's perfectly normal and should cause no problems. signature.asc Description: PGP signature
Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration
On 3/22/19 6:39 PM, Guenter Roeck wrote: If codec registration fails after the ASoC Intel SST driver has been probed, the kernel will Oops and crash at suspend/resume. general protection fault: [#1] PREEMPT SMP KASAN PTI CPU: 1 PID: 2811 Comm: cat Tainted: GW 4.19.30 #15 Hardware name: GOOGLE Clapper, BIOS Google_Clapper.5216.199.7 08/22/2014 RIP: 0010:snd_soc_suspend+0x5a/0xd21 Code: 03 80 3c 10 00 49 89 d7 74 0b 48 89 df e8 71 72 c4 fe 4c 89 fa 48 8b 03 48 89 45 d0 48 8d 98 a0 01 00 00 48 89 d8 48 c1 e8 03 <8a> 04 10 84 c0 0f 85 85 0c 00 00 80 3b 00 0f 84 6b 0c 00 00 48 8b RSP: 0018:888035407750 EFLAGS: 00010202 RAX: 0034 RBX: 01a0 RCX: RDX: dc00 RSI: 0008 RDI: 88805c417098 RBP: 8880354077b0 R08: dc00 R09: ed100b975718 R10: 0001 R11: 949ea4a3 R12: 11100b975746 R13: dc00 R14: 88805cba4588 R15: dc00 FS: 794a78e91b80() GS:888068d0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7bd5283ccf58 CR3: 4b7aa000 CR4: 001006e0 Call Trace: ? dpm_complete+0x67b/0x67b ? i915_gem_suspend+0x14d/0x1ad sst_soc_prepare+0x91/0x1dd ? sst_be_hw_params+0x7e/0x7e dpm_prepare+0x39a/0x88b dpm_suspend_start+0x13/0x9d suspend_devices_and_enter+0x18f/0xbd7 ? arch_suspend_enable_irqs+0x11/0x11 ? printk+0xd9/0x12d ? lock_release+0x95f/0x95f ? log_buf_vmcoreinfo_setup+0x131/0x131 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? __pm_pr_dbg+0x186/0x190 ? pm_notifier_call_chain+0x39/0x39 ? suspend_test+0x9d/0x9d pm_suspend+0x2f4/0x728 ? trace_suspend_resume+0x3da/0x3da ? lock_release+0x95f/0x95f ? kernfs_fop_write+0x19f/0x32d state_store+0xd8/0x147 ? sysfs_kf_read+0x155/0x155 kernfs_fop_write+0x23e/0x32d __vfs_write+0x108/0x608 ? vfs_read+0x2e9/0x2e9 ? rcu_read_lock_sched_held+0x140/0x22a ? __bpf_trace_rcu_utilization+0xa/0xa ? debug_smp_processor_id+0x10/0x10 ? selinux_file_permission+0x1c5/0x3c8 ? rcu_sync_lockdep_assert+0x6a/0xad ? __sb_start_write+0x129/0x2ac vfs_write+0x1aa/0x434 ksys_write+0xfe/0x1be ? __ia32_sys_read+0x82/0x82 do_syscall_64+0xcd/0x120 entry_SYSCALL_64_after_hwframe+0x49/0xbe In the observed situation, the problem is seen because the codec driver failed to probe due to a hardware problem. max98090 i2c-193C9890:00: Failed to read device revision: -1 max98090 i2c-193C9890:00: ASoC: failed to probe component -1 cht-bsw-max98090 cht-bsw-max98090: ASoC: failed to instantiate card -1 cht-bsw-max98090 cht-bsw-max98090: snd_soc_register_card failed -1 cht-bsw-max98090: probe of cht-bsw-max98090 failed with error -1 The problem is similar to the problem solved with commit 2fc995a87f2e ("ASoC: intel: Fix crash at suspend/resume without card registration"), but codec registration fails at a later point. At that time, the pointer checked with the above mentioned commit is already set, but it is not cleared if the device is subsequently removed. Adding a remove function to clear the pointer fixes the problem. Makes sense Acked-by: Pierre-Louis Bossart I'd like to highlight that there is a fundamental flaw in the way the machine drivers are handled. Since we don't have a hook for the machine driver in the BIOS, the DSP driver creates a platform_device which will instantiate the machine driver. When errors happen in the machine driver probe, they are suppressed due to a 'feature' of the device model, so you can end-up with a broken configuration that is still reported as a successful strobe. Cc: sta...@vger.kernel.org Cc: Jarkko Nikula Cc: Curtis Malainey Signed-off-by: Guenter Roeck --- sound/soc/intel/atom/sst-mfld-platform-pcm.c | 8 1 file changed, 8 insertions(+) diff --git a/sound/soc/intel/atom/sst-mfld-platform-pcm.c b/sound/soc/intel/atom/sst-mfld-platform-pcm.c index 08cea5b5cda9..0e8b1c5eec88 100644 --- a/sound/soc/intel/atom/sst-mfld-platform-pcm.c +++ b/sound/soc/intel/atom/sst-mfld-platform-pcm.c @@ -706,9 +706,17 @@ static int sst_soc_probe(struct snd_soc_component *component) return sst_dsp_init_v2_dpcm(component); } +static void sst_soc_remove(struct snd_soc_component *component) +{ + struct sst_data *drv = dev_get_drvdata(component->dev); + + drv->soc_card = NULL; +} + static const struct snd_soc_component_driver sst_soc_platform_drv = { .name = DRV_NAME, .probe = sst_soc_probe, + .remove = sst_soc_remove, .ops= _platform_ops, .compr_ops = _platform_compr_ops, .pcm_new= sst_pcm_new,