On Mon, Jan 16, 2017 at 10:29 AM, Luis R. Rodriguez <[email protected]> wrote:
> On Mon, Jan 16, 2017 at 02:57:06PM +0000, Jakub Kicinski wrote:
>> Commit 5d47ec02c37e ("firmware: Correct handling of fw_state_wait()
>> return value") made the assumption that any error returned from
>> fw_state_wait_timeout() means FW load has to be aborted.  This is
>> incorrect FW load only has to be aborted when load timed out or
>
> You want a comma before FW -- but also:

Thanks!

>> has been interrupted,
>
> __fw_state_wait_common() returns -ENOENT when:
>
> if (ret != 0 && fw_st->status == FW_STATUS_ABORTED)
>         return -ENOENT;
>
> Why not for when -ENOENT is returned ?

I'm just restoring the pre-5d47ec02c37e behavior; I don't understand
all the details of this code.  My understanding is that before
5d47ec02c37e we only aborted on ret == 0 (i.e. timeout) or -ERESTARTSYS.
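
To make the difference concrete, here is a minimal userspace sketch of
the two checks from the diff below.  It is not the kernel code: the
helper names are hypothetical, and ERESTARTSYS is defined locally
because userspace <errno.h> does not expose it (512 is the kernel's
value).

```c
#include <errno.h>
#include <stdbool.h>

/* ERESTARTSYS is kernel-internal; defined here only so the sketch builds. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical helper: the check before the patch, any error aborts. */
static bool needs_abort_before_patch(long retval)
{
	return retval < 0;
}

/* Hypothetical helper: the check proposed by the patch, where only a
 * timeout or an interrupted wait makes the waiter abort the load. */
static bool needs_abort_after_patch(long retval)
{
	return retval == -ETIMEDOUT || retval == -ERESTARTSYS;
}
```

The two helpers disagree only for errors such as -ENOENT, which is the
value quoted above for the already-aborted case.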

>> otherwise the waking thread had already
>> cleaned up for us.
>
> What code in what waking thread would have done precisely what cleanup?

That is not clear to me.  The waking is done in
firmware_loading_store().  I don't follow why firmware_loading_store()
uses fw_load_abort() in the -1 case and fw_state_aborted() on the error
path of the 0 case (it's pre-git era stuff).  I assume fw_load_abort()
unlinks the buffer so that subsequent calls to store will
error out in the check on line 716.  I was initially going to change
that fw_load_abort() to *_aborted() but I'm afraid of the slight
change in user-visible behavior.

> And why can't fw_load_abort() handle being called twice and why not just
> instead allow for that?

Personal preference, I guess: I'd rather make sure the code is correct
than merely make it tolerate errors.

>> Fixes: 5d47ec02c37e ("firmware: Correct handling of fw_state_wait() return value")
>
> What does this fix exactly? A fix should describe the impact, what
> issues are in place without the fix. What also happens after the fix
> and why. In this commit log none of this is clear.

Sorry :S  The bug report was here:
http://www.mail-archive.com/[email protected]/msg1310204.html
I should've done a better job.  The tl;dr is that calling *_abort()
again when the user helper wrote -1 (FW not found) causes a
NULL-deref.
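
A toy model of that failure mode, assuming (as guessed above) that the
first abort unlinks the buffer.  The struct and function names are
hypothetical stand-ins, not the kernel's definitions:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT the kernel's structures. */
struct toy_buf {
	bool aborted;
};

struct toy_priv {
	struct toy_buf *buf;	/* NULL once an abort has unlinked it */
};

/* Models the first abort: mark the buffer aborted, then unlink it.
 * The caller must ensure priv->buf is non-NULL. */
static void toy_load_abort(struct toy_priv *priv)
{
	priv->buf->aborted = true;
	priv->buf = NULL;
}

/* True when calling toy_load_abort() again would dereference NULL. */
static bool toy_second_abort_would_crash(const struct toy_priv *priv)
{
	return priv->buf == NULL;
}
```

In this model the waker's abort leaves priv->buf NULL, so a second
abort from the waiter would dereference it; skipping that second call
is what the one-line change is meant to achieve.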

>> Signed-off-by: Jakub Kicinski <[email protected]>
>> ---
>>  drivers/base/firmware_class.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
>> index 4497d263209f..ce142e6b2c72 100644
>> --- a/drivers/base/firmware_class.c
>> +++ b/drivers/base/firmware_class.c
>> @@ -1020,7 +1020,7 @@ static int _request_firmware_load(struct firmware_priv *fw_priv,
>>       }
>>
>>       retval = fw_state_wait_timeout(&buf->fw_st, timeout);
>> -     if (retval < 0) {
>> +     if (retval == -ETIMEDOUT || retval == -ERESTARTSYS) {
>
> Also, if your change is correct I will also note fw_state_wait_timeout()
> is just a wrapper for __fw_state_wait_common(), but we also have
> another wrapper for __fw_state_wait_common() now:
>
> #define fw_state_wait(fw_st)                                    \
>         __fw_state_wait_common(fw_st, MAX_SCHEDULE_TIMEOUT)
>
> Do we need to fix anything for fw_state_wait() ?

I looked at it and I think it's fine.

> Clarifying all this would help review your proposed changes. If you
> consider them a fix please be very clear as to the exact issue and
> what is fixed with your patch.

Sorry again, I hope things are clearer now.
