Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration

2019-03-25 Thread Guenter Roeck
On Mon, Mar 25, 2019 at 03:05:33PM +, Mark Brown wrote:
> On Mon, Mar 25, 2019 at 07:21:00AM -0700, Guenter Roeck wrote:
> 
> > It is actually a bit more complicated than that. The stored pointer 
> > (drv->soc_card)
> > isn't released. The problem is that dev_get_drvdata(drv->soc_card->dev) is 
> > NULL,
> > which causes the crash. I don't think there is a UAF involved - I built the
> > test image with KASAN enabled and it did not barf at me.
> 
> What is a "UAF"?
> 

use-after-free. Sorry, I saw that term used so often recently that I somehow
thought it was common and started using it myself.

Guenter

> > Overall the implementation does seem a bit suspicious to me. I don't really
> > understand why the platform driver handles suspend/resume for the cards.
> > But that may just be my lack of understanding. However, either case, I 
> > think the
> > Haswell driver (sst-haswell-pcm.c) has a similar problem. I am also not 
> > sure if
> 
> It's certainly a bit unusual, usually the platform driver would just
> deal with suspending itself and the card driver would handle overall
> card suspension together with the core.




Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration

2019-03-25 Thread Mark Brown
On Mon, Mar 25, 2019 at 07:21:00AM -0700, Guenter Roeck wrote:

> It is actually a bit more complicated than that. The stored pointer 
> (drv->soc_card)
> isn't released. The problem is that dev_get_drvdata(drv->soc_card->dev) is 
> NULL,
> which causes the crash. I don't think there is a UAF involved - I built the
> test image with KASAN enabled and it did not barf at me.

What is a "UAF"?

> Overall the implementation does seem a bit suspicious to me. I don't really
> understand why the platform driver handles suspend/resume for the cards.
> But that may just be my lack of understanding. However, either case, I think 
> the
> Haswell driver (sst-haswell-pcm.c) has a similar problem. I am also not sure 
> if

It's certainly a bit unusual, usually the platform driver would just
deal with suspending itself and the card driver would handle overall
card suspension together with the core.


signature.asc
Description: PGP signature


Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration

2019-03-25 Thread Mark Brown
On Mon, Mar 25, 2019 at 09:18:04AM -0400, Pierre-Louis Bossart wrote:
> On 3/25/19 8:12 AM, Mark Brown wrote:

> > These are driver specific issues not device model issues as far as I can
> > see?  The issue fixed by this as is that you're storing a pointer in the
> > ASoC level (not device model level) probe that you don't free when the
> > component is unbound, causing you to dereference it later during
> > suspend.  There is absolutely no problem with the machine driver not
> > being guaranteed to bind at the time it's initially registered, that's
> > perfectly normal and should cause no problems.

> Agree, what I was referring is that if the machine probe and card
> registration fails (not just deferred), the parent acpi/pci driver isn't
> notified - there is just no means to provide that information and that leads
> to all kinds of configuration issues.

If there are issues here they could happen via means other than a probe
failing so there's a problem whatever is going on - someone manually
unbinding a device for example.


signature.asc
Description: PGP signature


Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration

2019-03-25 Thread Guenter Roeck

On 3/25/19 5:12 AM, Mark Brown wrote:

On Sat, Mar 23, 2019 at 09:55:46AM -0400, Pierre-Louis Bossart wrote:


I'd like to highlight that there is a fundamental flaw in the way the
machine drivers are handled. Since we don't have a hook for the machine
driver in the BIOS, the DSP driver creates a platform_device which will
instantiate the machine driver. When errors happen in the machine driver
probe, they are suppressed due to a 'feature' of the device model, so you
can end-up with a broken configuration that is still reported as a
successful strobe.


These are driver specific issues not device model issues as far as I can
see?  The issue fixed by this as is that you're storing a pointer in the
ASoC level (not device model level) probe that you don't free when the
component is unbound, causing you to dereference it later during
suspend.  There is absolutely no problem with the machine driver not
being guaranteed to bind at the time it's initially registered, that's
perfectly normal and should cause no problems.


It is actually a bit more complicated than that. The stored pointer 
(drv->soc_card)
isn't released. The problem is that dev_get_drvdata(drv->soc_card->dev) is NULL,
which causes the crash. I don't think there is a UAF involved - I built the
test image with KASAN enabled and it did not barf at me.

It may of course well be that there _should_ be a UAF but it doesn't happen
because some pointer that should be released isn't released due to some memory
or reference count leak. But that would be a different problem.

Overall the implementation does seem a bit suspicious to me. I don't really
understand why the platform driver handles suspend/resume for the cards.
But that may just be my lack of understanding. However, either case, I think the
Haswell driver (sst-haswell-pcm.c) has a similar problem. I am also not sure if
there are more problems lurking - I see a similar but different crash in v4.4.y
but have not been able to track it down. Actually, I found the problem fixed 
here
while trying to reproduce that crash with the latest kernel.

Guenter


Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration

2019-03-25 Thread Pierre-Louis Bossart

On 3/25/19 8:12 AM, Mark Brown wrote:

On Sat, Mar 23, 2019 at 09:55:46AM -0400, Pierre-Louis Bossart wrote:


I'd like to highlight that there is a fundamental flaw in the way the
machine drivers are handled. Since we don't have a hook for the machine
driver in the BIOS, the DSP driver creates a platform_device which will
instantiate the machine driver. When errors happen in the machine driver
probe, they are suppressed due to a 'feature' of the device model, so you
can end-up with a broken configuration that is still reported as a
successful strobe.


These are driver specific issues not device model issues as far as I can
see?  The issue fixed by this as is that you're storing a pointer in the
ASoC level (not device model level) probe that you don't free when the
component is unbound, causing you to dereference it later during
suspend.  There is absolutely no problem with the machine driver not
being guaranteed to bind at the time it's initially registered, that's
perfectly normal and should cause no problems.


Agree, what I was referring is that if the machine probe and card 
registration fails (not just deferred), the parent acpi/pci driver isn't 
notified - there is just no means to provide that information and that 
leads to all kinds of configuration issues.


Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration

2019-03-25 Thread Mark Brown
On Sat, Mar 23, 2019 at 09:55:46AM -0400, Pierre-Louis Bossart wrote:

> I'd like to highlight that there is a fundamental flaw in the way the
> machine drivers are handled. Since we don't have a hook for the machine
> driver in the BIOS, the DSP driver creates a platform_device which will
> instantiate the machine driver. When errors happen in the machine driver
> probe, they are suppressed due to a 'feature' of the device model, so you
> can end-up with a broken configuration that is still reported as a
> successful strobe.

These are driver specific issues not device model issues as far as I can
see?  The issue fixed by this as is that you're storing a pointer in the
ASoC level (not device model level) probe that you don't free when the
component is unbound, causing you to dereference it later during
suspend.  There is absolutely no problem with the machine driver not
being guaranteed to bind at the time it's initially registered, that's
perfectly normal and should cause no problems.


signature.asc
Description: PGP signature


Re: [alsa-devel] [PATCH] ASoC: intel: Fix crash at suspend/resume after failed codec registration

2019-03-23 Thread Pierre-Louis Bossart

On 3/22/19 6:39 PM, Guenter Roeck wrote:

If codec registration fails after the ASoC Intel SST driver has been probed,
the kernel will Oops and crash at suspend/resume.

general protection fault:  [#1] PREEMPT SMP KASAN PTI
CPU: 1 PID: 2811 Comm: cat Tainted: GW 4.19.30 #15
Hardware name: GOOGLE Clapper, BIOS Google_Clapper.5216.199.7 08/22/2014
RIP: 0010:snd_soc_suspend+0x5a/0xd21
Code: 03 80 3c 10 00 49 89 d7 74 0b 48 89 df e8 71 72 c4 fe 4c 89
fa 48 8b 03 48 89 45 d0 48 8d 98 a0 01 00 00 48 89 d8 48 c1 e8 03
<8a> 04 10 84 c0 0f 85 85 0c 00 00 80 3b 00 0f 84 6b 0c 00 00 48 8b
RSP: 0018:888035407750 EFLAGS: 00010202
RAX: 0034 RBX: 01a0 RCX: 
RDX: dc00 RSI: 0008 RDI: 88805c417098
RBP: 8880354077b0 R08: dc00 R09: ed100b975718
R10: 0001 R11: 949ea4a3 R12: 11100b975746
R13: dc00 R14: 88805cba4588 R15: dc00
FS:  794a78e91b80() GS:888068d0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7bd5283ccf58 CR3: 4b7aa000 CR4: 001006e0
Call Trace:
? dpm_complete+0x67b/0x67b
? i915_gem_suspend+0x14d/0x1ad
sst_soc_prepare+0x91/0x1dd
? sst_be_hw_params+0x7e/0x7e
dpm_prepare+0x39a/0x88b
dpm_suspend_start+0x13/0x9d
suspend_devices_and_enter+0x18f/0xbd7
? arch_suspend_enable_irqs+0x11/0x11
? printk+0xd9/0x12d
? lock_release+0x95f/0x95f
? log_buf_vmcoreinfo_setup+0x131/0x131
? rcu_read_lock_sched_held+0x140/0x22a
? __bpf_trace_rcu_utilization+0xa/0xa
? __pm_pr_dbg+0x186/0x190
? pm_notifier_call_chain+0x39/0x39
? suspend_test+0x9d/0x9d
pm_suspend+0x2f4/0x728
? trace_suspend_resume+0x3da/0x3da
? lock_release+0x95f/0x95f
? kernfs_fop_write+0x19f/0x32d
state_store+0xd8/0x147
? sysfs_kf_read+0x155/0x155
kernfs_fop_write+0x23e/0x32d
__vfs_write+0x108/0x608
? vfs_read+0x2e9/0x2e9
? rcu_read_lock_sched_held+0x140/0x22a
? __bpf_trace_rcu_utilization+0xa/0xa
? debug_smp_processor_id+0x10/0x10
? selinux_file_permission+0x1c5/0x3c8
? rcu_sync_lockdep_assert+0x6a/0xad
? __sb_start_write+0x129/0x2ac
vfs_write+0x1aa/0x434
ksys_write+0xfe/0x1be
? __ia32_sys_read+0x82/0x82
do_syscall_64+0xcd/0x120
entry_SYSCALL_64_after_hwframe+0x49/0xbe

In the observed situation, the problem is seen because the codec driver
failed to probe due to a hardware problem.

max98090 i2c-193C9890:00: Failed to read device revision: -1
max98090 i2c-193C9890:00: ASoC: failed to probe component -1
cht-bsw-max98090 cht-bsw-max98090: ASoC: failed to instantiate card -1
cht-bsw-max98090 cht-bsw-max98090: snd_soc_register_card failed -1
cht-bsw-max98090: probe of cht-bsw-max98090 failed with error -1

The problem is similar to the problem solved with commit 2fc995a87f2e
("ASoC: intel: Fix crash at suspend/resume without card registration"),
but codec registration fails at a later point. At that time, the pointer
checked with the above mentioned commit is already set, but it is not
cleared if the device is subsequently removed. Adding a remove function
to clear the pointer fixes the problem.


Makes sense

Acked-by: Pierre-Louis Bossart 

I'd like to highlight that there is a fundamental flaw in the way the 
machine drivers are handled. Since we don't have a hook for the machine 
driver in the BIOS, the DSP driver creates a platform_device which will 
instantiate the machine driver. When errors happen in the machine driver 
probe, they are suppressed due to a 'feature' of the device model, so 
you can end-up with a broken configuration that is still reported as a 
successful strobe.




Cc: sta...@vger.kernel.org
Cc: Jarkko Nikula 
Cc: Curtis Malainey 
Signed-off-by: Guenter Roeck 
---
  sound/soc/intel/atom/sst-mfld-platform-pcm.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/sound/soc/intel/atom/sst-mfld-platform-pcm.c 
b/sound/soc/intel/atom/sst-mfld-platform-pcm.c
index 08cea5b5cda9..0e8b1c5eec88 100644
--- a/sound/soc/intel/atom/sst-mfld-platform-pcm.c
+++ b/sound/soc/intel/atom/sst-mfld-platform-pcm.c
@@ -706,9 +706,17 @@ static int sst_soc_probe(struct snd_soc_component 
*component)
return sst_dsp_init_v2_dpcm(component);
  }
  
+static void sst_soc_remove(struct snd_soc_component *component)

+{
+   struct sst_data *drv = dev_get_drvdata(component->dev);
+
+   drv->soc_card = NULL;
+}
+
  static const struct snd_soc_component_driver sst_soc_platform_drv  = {
.name   = DRV_NAME,
.probe  = sst_soc_probe,
+   .remove = sst_soc_remove,
.ops= _platform_ops,
.compr_ops  = _platform_compr_ops,
.pcm_new= sst_pcm_new,