The DeviceIOControl() error should correspond to a transport error in Linux  or 
to a system error.
We need to transfer all transaction failures through the reply buffer 
(nlmsghdr+nlmsgerr) and return success for the DeviceIOControl().
I am aware that we didn't follow the exact  semantics in the driver and we 
sometimes fail the I/O request in the case where the transaction fails. It 
should be fixed in the driver.
Thanks,
Eitan

 

-----Original Message-----
From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Ben Pfaff
Sent: Tuesday, April 14, 2015 2:31 PM
To: Alin Serdean
Cc: dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH] netlink-socket: Exit NL transaction loop when 
EINVAL is returned

Does every error returned by DeviceIoControl() correspond to a transport error 
(with the meaning that I explained earlier)?
nl_sock_transact_multiple__() should report that are not transport errors by 
constructing a nlmsghdr reply with an mlmsgerr as its payload.

Here is the high level situation.  With a Linux kernel, it is more efficient to 
send multiple Netlink requests to the kernel in a batch using a single system 
call.  The results are equivalent to sending the requests one-by-one.  As an 
optimization, if there is a severe error that means that no requests are going 
to succeed at all, then the code may skip sending the remaining requests.  That 
is the only case that this is meant to handle:

        } else if (error) {
            VLOG_ERR_RL(&rl, "transaction error (%s)", ovs_strerror(error));
            nl_sock_record_errors__(transactions, n, error);
        }

This code will *not* handle ordinary errors short of this kind of severe error 
properly, and should not be invoked in that case.  (It is also buggy in that 
case, and I see that the patch fixes that bug, but that is not my point here.)  
If in fact on Windows it is getting invoked in cases other than a severe error, 
then
nl_sock_record_errors__() should be modified to report ordinary errors 
differently (probably by constructing an nlmsghdr+nlmsgerr).

Thanks,

Ben.

On Tue, Apr 14, 2015 at 09:19:10PM +0000, Alin Serdean wrote:
> In nl_sock_transact_multiple__ we do the following:
> 
> if (!DeviceIoControl(sock->handle, OVS_IOCTL_TRANSACT,
>                              txn->request->data,
>                              txn->request->size,
>                              reply_buf, sizeof reply_buf,
>                              &reply_len, NULL)) {
>             /* XXX: Map to a more appropriate error. */
>             error = EINVAL;
>             break;
>         }
> 
> We map every failure to EINVAL that is why it did not pop out into Linux.
> 
> We should definitely log the error using ovs_lasterror_to_string before 
> setting it to EINVAL.
> 
> Maybe we should just increase the number of transactions in some 
> situations(i.e. STATUS_INVALID_PARAMETER) as an idea to allow the rest of the 
> transactions to be processed.
> 
> Alin.
> -----Mesaj original-----
> De la: dev [mailto:dev-boun...@openvswitch.org] ??n numele Ben Pfaff
> Trimis: Tuesday, April 14, 2015 11:42 PM
> C??tre: Sorin Vinturis
> Cc: dev@openvswitch.org
> Subiect: Re: [ovs-dev] [PATCH] netlink-socket: Exit NL transaction 
> loop when EINVAL is returned
> 
> On Tue, Apr 14, 2015 at 08:25:59PM +0000, Sorin Vinturis wrote:
> > The nl_sock_transact_multiple function enters in an infinite loop, 
> > when invalid error, EINVAL, is returned by nl_sock_transact_multiple__.
> > EINVAL is the error returned by the latter function when a driver 
> > request fails.
> > 
> > Signed-off-by: Sorin Vinturis <svintu...@cloudbasesolutions.com>
> > Reported-by: Alin Gabriel Serdean <aserd...@cloudbasesolutions.com>
> > Reported-at: 
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_open
> > vswitch_ovs-2Dissues_issues_57&d=AwIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXV
> > eAw-YihVMNtXt-uEs&r=CWsgHUxi6ExLXY798tmo3LJ4e3geGYp56lkcH-5cLCY&m=7b
> > qSnC4PRttbbFBZrlSXsx-0THTQOfNE_D4bKpkkChw&s=2zCC8GZSSTlR-cjRseO-dFWa
> > CIQRAb8jm-uXX-zOa5E&e=
> 
> I see that this fixes a bug, even on Linux.  Thank you.  It's actually a 
> pretty serious bug (given the infinite loop), but I guess that it must not 
> occur in any normal circumstances, otherwise we would have heard about it 
> over the years.
> 
> However, I want to make sure of something before I commit it.
> nl_sock_transact_multiple__() should only return an error in the case of a 
> "transport" error, that is, of some problem communicating with the datapath 
> (e.g. the kernel module has been removed or something similarly fatal).  It 
> should not return an error in cases where some message asks the datapath to 
> do something erroneous (e.g. to add a flow that the datapath doesn't 
> understand, to delete a vport that doesn't exist, ...).  This is because only 
> in the former case should all of the transactions be aborted; in the latter 
> case, any remaining transactions should still be processed.
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://urldefense.proofpoint.com/v2/url?u=http-3A__openvswitch.org_ma
> ilman_listinfo_dev&d=AwIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-
> uEs&r=CWsgHUxi6ExLXY798tmo3LJ4e3geGYp56lkcH-5cLCY&m=7bqSnC4PRttbbFBZrl
> SXsx-0THTQOfNE_D4bKpkkChw&s=VAEy6koop3mzzQnhv07O332b_domfmicbOiZ1HF2rf
> M&e=
_______________________________________________
dev mailing list
dev@openvswitch.org
https://urldefense.proofpoint.com/v2/url?u=http-3A__openvswitch.org_mailman_listinfo_dev&d=AwIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=CWsgHUxi6ExLXY798tmo3LJ4e3geGYp56lkcH-5cLCY&m=7bqSnC4PRttbbFBZrlSXsx-0THTQOfNE_D4bKpkkChw&s=VAEy6koop3mzzQnhv07O332b_domfmicbOiZ1HF2rfM&e=
 
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to