OK. Makes sense.
It might be OK from an sFlow perspective if the user-space lookup occasionally
fails. Just as long as it never retrieves the wrong action-list. That would
be a problem.
If you can look up the actions in user-space, do you still need to generate
that 64-bit packed-field? It seems like it would be better if you can avoid
doing that.
Neil
On Aug 11, 2011, at 11:39 AM, Ben Pfaff wrote:
> Neil, don't worry! This really is a step forward and not a step back.
> The actions passed to the kernel and back to userspace are not really
> as helpful as you think, because they discard a lot of semantic
> information. These actions do say *what* happened but they do not say
> *why*. One off-the-cuff example of the difference is that, if packets
> in some stream or streams are being mirrored ("spanned") to a
> monitoring port separate from sFlow, then the kernel actions will just
> show two outputs and we'll just report to sFlow that there were two,
> whereas it might be a better idea to ignore the mirror output and
> report the "primary" output port since that will presumably be more
> useful for analysis at the sflow collector.
>
> All that we were calculating from the actions before were the egress
> ports, VLAN IDs, and egress priority. That's only 48 bits total. The
> rest we can (and already do) recover from the flow data or the packet,
> which will still be available.
>
> When (if?) we do need more than 64 bits, the way forward is more work,
> but perfectly doable:
>
> 1. Instead of using the 64-bit data to store the sFlow
> information itself, switch to using it to store a unique
> ID. (The unique ID could even be kernel-generated.) Track
> this along with the userspace data associated with each
> flow.
>
> 2. When an sflow message arrives from the kernel, look up the
> userspace data for it based on the unique ID and use it to
> formulate the sflow message.
>
> 3. When a flow changes or gets deleted, keep a copy of the old
> version around in userspace as long as there might still be
> a sflow message coming along for it from the kernel.
>
> Only step #3 is any trouble at all. We might need to do this kind of
> thing anyway, for other reasons, if we want to have perfectly accurate
> flow stats in userspace. That's because, currently, there is a (very
> brief) window of time, after the kernel deletes a flow from its flow
> table and sends the final statistics to userspace, in which packets
> can still arrive and be processed according to the flow's previous set
> of actions. The same construct as #3 above is one possible solution.
>
> On Thu, Aug 11, 2011 at 11:09:36AM -0700, Neil McKee wrote:
>> I think the current behavior where the whole actions-list is serialized and
>> passed up with the sample is actually really elegant and future-proof. New
>> actions are being added all the time, and having access to a sampled-stream
>> of "exactly-what-is-happening-in-there" is always going to be good to have.
>> Not just for sFlow reporting, but for any analysis or optimization that you
>> might want to apply. There is a symmetry to this solution: what you push
>> down to the kernel is a list of actions, and what you get back on the
>> sampling feed is a packet and the list of actions that were applied to it.
>> What's not to like about that?
>>
>> Furthermore, whereas your current sFlow implementation may only be
>> reporting the ingress and egress ports, VLAN Ids, and L2 priorities, plus
>> the number of outbound interfaces (can you get all that into 64-bits?),
>> that doesn't mean you will never want to report more than that. Perhaps
>> there is already something else that should be exported with the sFlow feed?
>> If so, a new sFlow structure can be designed and it can go out with the
>> sampled header as another annotation. It's open-ended, and adding new
>> structures here will never require any further change to the kernel module.
>>
>> In particular, if an open-flow controller is receiving this sFlow feed so
>> it can make network-wide optimization decisions in real time, then it seems
>> likely that you would want to export details about all the actions.
>> Restricting the output might result in awkward blind-spots, which could make
>> these optimization algorithms exponentially harder to write. The HPLabs
>> article referenced in this blog post is a good example of the sort of
>> network-wide optimization I am referring to:
>> http://blog.sflow.com/2011/05/openflow-and-sflow.html
>>
>> And a first draft to define new structures for reporting OpenFlow-related
>> data over sFlow is here:
>> http://www.sflow.org/draft-sflow-openflow.txt
>> but given the rapid pace of OpenFlow development, this probably needs to be
>> reworked.
>>
>> Neil
>>
>>
>>
>>
>>
>>
>>
>> On Aug 9, 2011, at 9:42 PM, Ben Pfaff wrote:
>>
>>> On Tue, Aug 09, 2011 at 10:46:38AM -0700, pravin shelar wrote:
>>>> This patch is first step toward generic/simple sampling
>>>> action which will be completely independent of sflow.
>>>> Following patch cleanup sampling upcall by striping it down to
>>>> minimum essential data that userspace can not figure out for
>>>> itself.
>>>>
>>>> Signed-off by: Pravin Shelar <[email protected]>
>>>
>>> (That's "-by", not " by".)
>>>
>>> I like the general approach here--the sflow support is a big wart--but I
>>> think that the userspace interface changes represent a regression.
>>> Before, by passing the actions down to userspace, there was always a
>>> consistent view of those actions. With this change (as comments in the
>>> code make clear) the actions might have changed or the facet might even
>>> have been deleted.
>>>
>>> A really good fix would probably require being able to somehow keep
>>> old copies of flows that have changed around for a while in userspace,
>>> link them to the sflow messages, and free them when they are known not
>>> to be needed anymore. I think that's probably feasible, but in fact I
>>> think that we can avoid regression with something simpler.
>>>
>>> Suppose we keep the "sample pool" netlink option that this patch
>>> removes, and add a way for userspace to set a small amount of user data
>>> on a flow, and pass that along as the "user data" netlink option. 64
>>> bits of user data should be plenty of space to store what the userspace
>>> sflow implementation actually needs (16 bits of final VLAN TCI plus 32
>>> bits for sFlow's idea of output port). Plus, those same 64 bits would
>>> be more than enough for the sequence number that the "really good fix"
>>> mentioned in the previous paragraph would probably need.
>>>
>>> With the current patch, the obvious place to put this user data would be
>>> in the flow itself. Another approach (that you have probably already
>>> considered) is to make sFlow sampling a flow action, in which case the
>>> user data could be that action's argument.
>>>
>>> Thanks,
>>>
>>> Ben.
>>> _______________________________________________
>>> dev mailing list
>>> [email protected]
>>> http://openvswitch.org/mailman/listinfo/dev
>>
_______________________________________________
dev mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/dev