Le 21/10/2015 19:12, Hannes Frederic Sowa a écrit :
Hello,

On Wed, Oct 21, 2015, at 17:56, David Miller wrote:
From: Jiri Benc <jb...@redhat.com>
Date: Wed, 21 Oct 2015 17:25:02 +0200

On Wed, 21 Oct 2015 08:32:14 -0700 (PDT), David Miller wrote:
As you say the apps are broken, so file a bug and have them fixed.

The assumption is clearly invalid, so apps cannot make such an
assumption.

Does it mean you would be okay with a patch that always allocates and
assigns a new ifindex in the target netns when interface is moved
between name spaces?

I think you're misunderstanding me if you're still recommending
kernel changes.

I'm plainly saying to remove the assumption in the apps.

If you don't show me exactly how some kernel change can lead to
the apps implementing things properly, without the invalid
assumptions, then I can only assume you didn't hear what I
said.

I think the reason why ifindexes exists as ints is that we want to have
lightweight way to refer to interfaces without taking references or
timestamps or generation ids which completely remove the possibility for
races. But the racy nature in ifindexes is something we actually want,
otherwise a user space program acquiring an ifindex would need to get a
reference on the device and either during socket close or program
termination release it, that would be very costly.

This patch minimizes the race quite a lot, from something we could
actually see in everydays container creation to probably something only
some users will expire with depleting the ifindex pool or playing around
with CRIU.

We could come up with more heavy machinery to close the race further for
CRIU by keeping track of "poisoned" ifindexes, which would need a
hashmap which could become pretty big and we could recycle when ifindex
wraps around, but this seems too heavy weight to me.

I am in favor of a solution to minimize this race in the kernel even
though we cannot ever close it completely.
I probably miss something, but if the app listens netlink, I don't see how such
app may have a race window.

With the proposed scenario:
1. create netns 'new_netns'
2. in root netns, move the interface with ifindex 2 to new_netns
3. in new_netns, delete the interface with ifindex 2
4. in new_netns, create an interface - it will get ifindex 2

Operation 2 and 4 are done by dev_change_net_namespace() under rtnl_lock().
RTM_DELLINK(root netns) and RTM_NEWLINK(new_netns) are sent by this function.
It means that operation 3 has been done before and that RTM_DELLINK(new_netns)
has been sent before.

Regards,
Nicolas
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to