On Wed, May 27, 2015 at 1:35 AM, Nikolay Aleksandrov
<niko...@cumulusnetworks.com> wrote:
> On Wed, May 27, 2015 at 9:59 AM, Scott Feldman <sfel...@gmail.com> wrote:
>> On Wed, May 27, 2015 at 12:05 AM, Nikolay Aleksandrov
>> <niko...@cumulusnetworks.com> wrote:
>>> On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger
>>> <step...@networkplumber.org> wrote:
>>>> On Thu, 21 May 2015 03:42:57 -0700
>>>> Nikolay Aleksandrov <niko...@cumulusnetworks.com> wrote:
>>>>
>>>>> From: Wilson Kok <w...@cumulusnetworks.com>
>>>>>
>>>>> Check in fdb_add_entry() if the source port should learn, similar
>>>>> check is used in br_fdb_update.
>>>>> Note that new fdb entries which are added manually or
>>>>> as local ones are still permitted.
>>>>> This patch has been tested by running traffic via a bridge port and
>>>>> switching the port's state, also by manually adding/removing entries
>>>>> from the bridge's fdb.
>>>>>
>>>>> Signed-off-by: Wilson Kok <w...@cumulusnetworks.com>
>>>>> Signed-off-by: Nikolay Aleksandrov <niko...@cumulusnetworks.com>
>>>>
>>>> What is the problem this is trying to solve?
>>>>
>>>> I think user should be allowed to manually add any entry
>>>> even if learning.
>>>
>>> Hi Stephen,
>>> I have been thinking about the use case and have discussed it
>>> internally with colleagues and the patch
>>> author, the main problem is when there's an external software that
>>> adds dynamic entries (learning) and
>>> it could experience a race condition, here's a possible situation:
>>> * external software learns a mac from hw, sends an add to kernel
>>>  * right before that, port goes blocking (or down) and kernel flushes
>>>    mac, sends notification about the state change and mac flush
>>>  * right after that, kernel gets the previous add from external software, 
>>> it's
>>>    allowed to add, and then sends an add notification
>>>  * mean while, external software processes the link block/down and mac 
>>> flush,
>>>    followed by the mac add from kernel.  At this point, external software 
>>> can't
>>>    really know whether it's a user adding the mac intentionally or it's
>>>    a race.
>>>
>>> This issue can't really be avoided in user-space.
>>> As I've noted local and static entries are still allowed, and iproute2
>>> bridge utility always
>>> marks the entries as static (NUD_NOARP), this only affects external
>>> dynamic entries which
>>> are usually sent by something that does the learning externally.
>>> I'll keep digging to see if there's another way to go about this since
>>> I'd like to give the user
>>> full freedom. Personally I don't have strong feeling for this patch
>>> and if it's not preferred then
>>> I'll post a revert.
>>
>> So there is already a switchdev API to add/del an externally learned
>> FDB entry which holds rtnl_lock and avoids these races.  I would
>> suggest using that and revert this patch.
>>
>> See call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) and
>> the handler in br.c:br_switchdev_event().
>>
>> -scott
>
> Hmm, I'm new to the switchdev API and am possibly missing something,
> but how do you suggest to use it here ?

You need to call
call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) when the
device learns a new mac/vlan on the port interface.

> How can we differentiate between user-added entry and an externally
> learned one ?

Externally added ones will be marked with NTF_EXT_LEARNED set in
ndm->ndm_flags in the netlink echo.  Manually added ones from the user
will have ndm->ndm_state set to NUD_NOARP in the netlink echo.

> Do you mean to use (for example) the NTF_EXT_LEARNED flag when adding
> an entry from user-space so
> the API can get called in br_fdb_add ?

No.  br_fdb_add is the bridge's .ndo_fdb_add handler called when user
manually adds an FDB entry using netlink RTM_NEWNEIGH.  For externally
learned entries, use the internal
call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL).

-scott
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to