On 7/28/15 10:01 AM, Eric Dumazet wrote:
On Tue, 2015-07-28 at 14:19 +0200, Hannes Frederic Sowa wrote:
Hello Eric,

On Mon, 2015-07-27 at 15:33 -0500, Eric W. Biederman wrote:
David Ahern <d...@cumulusnetworks.com> writes:

Allow tasks to have a default device index for binding sockets. If
set
the value is passed to all AF_INET/AF_INET6 sockets when they are
created.

The task setting is passed parent to child on fork, but can be set
or
changed after task creation using prctl (if task has CAP_NET_ADMIN
permissions). The setting for a socket can be retrieved using
prctl().
This option allows an administrator to restrict a task to only
send/receive
packets through the specified device. In the case of VRF devices
this
option restricts tasks to a specific VRF.

Correlation of the device index to a specific VRF, ie.,
    ifindex --> VRF device --> VRF id
is left to userspace.

Nacked-by: "Eric W. Biederman" <ebied...@xmission.com>

Because it is broken by design.  Your routing device is only safe for
programs that know it's limitations it is not appropriate for general
applications.

Since you don't even seen to know it's limitations I think this is a
bad path to walk down.

Can you please elaborate about the broken by design?

Different operating systems are already using this approach with good
success. I read your other mail regarding isolation of different VRFs
and I agree that all code which persists state depending solely on the
IP address is affected by this and this must be dealt with and fixed
(actually, there aren't too many).

But I wouldn't call that broken by design. This stuff will get fixed
like e.g. cross-talk between fragmentation queues, icmp rate limiters
etc, which could already happen in the past.

What is your opinion on the fundamental approach only from a user
perspective? Do you think that is broken, too?

I agree with Eric here.

This sk_bind_dev_if on task_struct is quite a hack.

What will be added next ? An array of dev_if ? netfilter support ?
af_packet support ? What about /proc files and netlink dumps ?

It could just as easily be a pointer to a struct (e.g., struct net_ctx) such that the intrusion to task_struct is simply 8 bytes -- very similar to the nsproxy used for the assorted namespaces. The struct can then contain whatever network config is imposed on the task.


We already have network namespaces. Extend this if needed, instead of
bypassing them.

Problems with using network namespaces for VRFs has been discussed in the past. e.g.,
    http://www.spinics.net/lists/netdev/msg298368.html

David


No need to add something else (with lack of proper reporting for various
tools)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to