בתאריך יום ב׳, 4 באוק׳ 2021, 20:00, מאת Eric Christian ‏<ercli...@gmail.com
>:

> I am not sure that only we can recreate the KNI request overwrite.  We may
> be the only ones with a current use case that exposes the vulnerability.
>  It is possible for any KNI operation to encounter this issue with the new
> async mechanism.  As long as the call to kni_net_process_request() is a
> separate thread from rte_kni_handle_request() this has the potential to
> occur with the use of async requests.
>
> All you need is one async KNI request followed closely by a second KNI
> request before the rte_kni_handle_request() has had a chance to process the
> first request.
>
> The kernel dev driver simply returns the error value back to the caller if
> it is less than zero.
>
>
> Eric
>

What I did in order to attempt to recreate this issue was:

Create a qemu VM with four cores.
Run the KNI sample application.

Then:

1. Run ifconfig vEth0_0 down followed immediately by ifconfig vEth0_0 up

2. Same as above but in a sample userspace C application which calls ioctl
twice in a row to achieve the same.


Both did not recreate the problem.

The only way to recreate it IMHO is to either:
 run the ioctl from two parallel threads or perhaps assign RT fifo priority
to the application calling ioctl in a row.

Elad


>
> On Mon, Oct 4, 2021 at 12:19 PM Elad Nachman <ela...@gmail.com> wrote:
>
>>
>>
>> On Mon, Oct 4, 2021 at 7:05 PM Ferruh Yigit <ferruh.yi...@intel.com>
>> wrote:
>>
>>> On 10/4/2021 3:58 PM, Elad Nachman wrote:
>>> > בתאריך יום ב׳, 4 באוק׳ 2021, 17:51, מאת Ferruh Yigit ‏<
>>> > ferruh.yi...@intel.com>:
>>> >
>>> >> On 10/4/2021 3:25 PM, Elad Nachman wrote:
>>> >>
>>> >> Can you please try to not top post, it will make impossible to follow
>>> this
>>> >> discussion later from the mail archives.
>>> >>
>>> >>> 1. Userspace will get an error
>>> >>
>>> >> So there is nothing special with returning '-EAGAIN', user will only
>>> >> observe an
>>> >> error.
>>> >> Wasn't initial intention to use '-EAGAIN' to try request again?
>>> >>
>>> > To signal user-space to retry the operation.
>>> >
>>>
>>> Not sure if it will reach to the end user. If user is calling "ifconfig
>>> <iface>
>>> down", it will just fail right, it won't recognize the error type.
>>>
>>> Unless this is common usage by the Linux network drivers, having this
>>> usage in
>>> KNI won't help much. I am for handling this in the kernel side if we can.
>>>
>>>
>> If user calls ifconfig <iface> down it will not happen. It requires some
>> multi-core race condition only Eric can recreate.
>>
>>
>>> >>
>>> >>> 2. Waiting with rtnl locked causes a deadlock; waiting with rtnl
>>> unlocked
>>> >>> for interface down command causes a crash because of a race
>>> condition in
>>> >>> the device delete/unregister list in the kernel.
>>> >>>
>>> >>
>>> >> Why waiting with rthnl lock causes a deadlock? As said below we are
>>> already
>>> >> doing it, why it is different with retry logic?
>>> >>
>>> > Because it can be interface down request.
>>> >
>>>
>>> (sure you like short answers)
>>>
>>> Please help me to see why "interface down" is special. Isn't it point of
>>> your
>>> patch to wait the request execution in the userspace even it is an async
>>> request?
>>>
>>> And yet again, number of retry can be limited.
>>>
>>>
>> No, it is not. Please look again:
>> https://patches.dpdk.org/project/dpdk/patch/20210924105409.21711-1-ela...@gmail.com/
>>
>>
>>
>>>
>>> >
>>> >> I agree to not wait with rtnl unlocked.
>>> >>
>>> >>> FYI,
>>> >>>
>>> >>> Elad.
>>> >>>
>>> >>> בתאריך יום ב׳, 4 באוק׳ 2021, 17:13, מאת Ferruh Yigit ‏<
>>> >>> ferruh.yi...@intel.com>:
>>> >>>
>>> >>>> On 10/4/2021 2:09 PM, Elad Nachman wrote:
>>> >>>>> Hi,
>>> >>>>>
>>> >>>>> EAGAIN is propogated back to the kernel and to the caller.
>>> >>>>>
>>> >>>>
>>> >>>> So will the user get an error, or it will be handled by the kernel
>>> and
>>> >>>> retried?
>>> >>>>
>>> >>>>> We cannot retry from the kni kernel module since we hold the rtnl
>>> lock.
>>> >>>>>
>>> >>>>
>>> >>>> Why not? We are already waiting until a command time out, like
>>> >>>> 'kni_net_open()'
>>> >>>> can retry if 'kni_net_process_request()' returns '-EAGAIN'. And we
>>> can
>>> >>>> limit the
>>> >>>> number of retry for safety.
>>> >>>>
>>> >>>>> FYI,
>>> >>>>>
>>> >>>>> Elad
>>> >>>>>
>>> >>>>> בתאריך יום ב׳, 4 באוק׳ 2021, 16:05, מאת Ferruh Yigit ‏<
>>> >>>>> ferruh.yi...@intel.com>:
>>> >>>>>
>>> >>>>>> On 9/24/2021 11:54 AM, Elad Nachman wrote:
>>> >>>>>>> Fix lack of multiple KNI requests handling support by
>>> introducing a
>>> >>>>>>> request in progress flag which will fail additional requests with
>>> >>>>>>> EAGAIN return code if the original request has not been processed
>>> >>>>>>> by user-space.
>>> >>>>>>>
>>> >>>>>>> Bugzilla ID: 809
>>> >>>>>>
>>> >>>>>> Hi Eric,
>>> >>>>>>
>>> >>>>>> Can you please test this patch, if it solves the issue you
>>> reported?
>>> >>>>>>
>>> >>>>>>>
>>> >>>>>>> Signed-off-by: Elad Nachman <ela...@gmail.com>
>>> >>>>>>> ---
>>> >>>>>>>  kernel/linux/kni/kni_net.c | 9 +++++++++
>>> >>>>>>>  lib/kni/rte_kni.c          | 2 ++
>>> >>>>>>>  lib/kni/rte_kni_common.h   | 1 +
>>> >>>>>>>  3 files changed, 12 insertions(+)
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>> <...>
>>> >>>>>>
>>> >>>>>>> @@ -123,7 +124,15 @@ kni_net_process_request(struct net_device
>>> *dev,
>>> >>>>>> struct rte_kni_request *req)
>>> >>>>>>>
>>> >>>>>>>       mutex_lock(&kni->sync_lock);
>>> >>>>>>>
>>> >>>>>>> +     /* Check that existing request has been processed: */
>>> >>>>>>> +     cur_req = (struct rte_kni_request *)kni->sync_kva;
>>> >>>>>>> +     if (cur_req->req_in_progress) {
>>> >>>>>>> +             ret = -EAGAIN;
>>> >>>>>>
>>> >>>>>> Overall logic in the KNI looks good to me, this helps to
>>> serialize the
>>> >>>>>> requests
>>> >>>>>> even for async ones.
>>> >>>>>>
>>> >>>>>> But can you please clarify how it behaves in the kernel side with
>>> >>>> '-EAGAIN'
>>> >>>>>> return type? Will linux call the ndo again, or will it just fail.
>>> >>>>>>
>>> >>>>>> If it just fails should we handle the re-try on '-EAGAIN' within
>>> the
>>> >> kni
>>> >>>>>> module?
>>> >>>>>>
>>> >>>>>>
>>> >>>>
>>> >>>>
>>> >>
>>> >> Elad.
>>>
>>>

Reply via email to